/**
@mainpage
Apache MADlib is an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine learning methods for structured and unstructured data.

The MADlib mission: to foster widespread development of scalable analytic skills, by harnessing efforts from commercial practice, academic research, and open-source development.

Useful links:

Please refer to the ReadMe file for information about incorporated third-party material. License information regarding MADlib and included third-party libraries can be found inside the License directory.

@defgroup grp_datatrans Data Types and Transformations
@{Data types and transformation operations @}

@defgroup grp_arraysmatrix Arrays and Matrices
@ingroup grp_datatrans
@brief Mathematical operations for arrays and matrices
@details
These modules provide basic mathematical operations to be run on arrays and matrices.

For a distributed system, a matrix cannot simply be represented as a 2D array of numbers in memory. We provide two forms of distributed representation of a matrix:

- Dense: The matrix is represented as a distributed collection of 1-D arrays. An example 3x10 matrix is shown in the table below:
 row_id |         row_vec
--------+-------------------------
   1    | {9,6,5,8,5,6,6,3,10,8}
   2    | {8,2,2,6,6,10,2,1,9,9}
   3    | {3,9,9,9,8,6,3,9,5,6}
- Sparse: The matrix is represented using the row and column indices for each non-zero entry of the matrix. Example:
 row_id | col_id | value
--------+--------+-------
      1 |      1 |     9
      1 |      5 |     6
      1 |      6 |     6
      2 |      1 |     8
      3 |      1 |     3
      3 |      2 |     9
      4 |      7 |     0
(7 rows)
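
The relationship between the two layouts can be sketched in plain Python. This is an illustration only, not MADlib code: the helper names `dense_to_sparse` and `sparse_to_dense` are hypothetical, and the 4x7 dense matrix is an assumption chosen so that its non-zero entries match the non-zero rows of the sparse example above. The zero-valued entry at (4, 7) in that example is not reproduced by the sketch; the dimensions are passed explicitly instead.

```python
# Illustrative only: plain-Python stand-ins for the two table layouts.
# dense_to_sparse and sparse_to_dense are hypothetical helper names,
# not MADlib functions.

def dense_to_sparse(dense_rows):
    """Turn (row_id, row_vec) pairs into (row_id, col_id, value) triples.

    Indices are 1-based, as in the example tables; zero entries are dropped.
    """
    triples = []
    for row_id, row_vec in dense_rows:
        for col_id, value in enumerate(row_vec, start=1):
            if value != 0:
                triples.append((row_id, col_id, value))
    return triples


def sparse_to_dense(triples, n_rows, n_cols):
    """Rebuild (row_id, row_vec) pairs from triples, filling gaps with 0."""
    rows = {r: [0] * n_cols for r in range(1, n_rows + 1)}
    for row_id, col_id, value in triples:
        rows[row_id][col_id - 1] = value
    return [(r, rows[r]) for r in range(1, n_rows + 1)]


# Assumed 4x7 matrix in the dense layout: one 1-D array per row.
dense = [
    (1, [9, 0, 0, 0, 6, 6, 0]),
    (2, [8, 0, 0, 0, 0, 0, 0]),
    (3, [3, 9, 0, 0, 0, 0, 0]),
    (4, [0, 0, 0, 0, 0, 0, 0]),
]

sparse = dense_to_sparse(dense)
for triple in sparse:
    print(triple)
```

Converting back with `sparse_to_dense(sparse, 4, 7)` recovers the same dense rows, which is why the dimensions have to be known separately once all-zero rows and columns are dropped.
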
All matrix operations work with either form of representation.

In many cases, a matrix function can be decomposed into vector operations applied independently to each row of a matrix (or to corresponding rows of two matrices). We also provide access to these internal vector operations (\ref grp_array) for greater flexibility. Matrix operations like matrix_add use the corresponding vector operation (array_add) and also include additional validation and formatting. Other functions like matrix_mult are more complex and use a combination of such vector operations and other SQL operations.

Note that these array functions are only available for the dense representation of a matrix. In general, the scope of a single array function invocation is limited to one array (1-dimensional or 2-dimensional) that fits in memory. When such a function is executed on a table of arrays, it is called multiple times, once for each array (or pair of arrays). In contrast, the scope of a single matrix function invocation is the complete matrix stored as a distributed table.

@{
@defgroup grp_array Array Operations

@defgroup grp_matrix Matrix Operations

@defgroup grp_matrix_factorization Matrix Factorization
@brief Matrix Factorization methods including Singular Value Decomposition and Low-rank Matrix Factorization
@{
@defgroup grp_lmf Low-rank Matrix Factorization
@defgroup grp_svd Singular Value Decomposition
@}

@defgroup grp_linalg Norms and Distance functions

@defgroup grp_svec Sparse Vectors
@}

@defgroup grp_pca Dimensionality Reduction
@ingroup grp_datatrans
@brief A collection of methods for dimensionality reduction.
@details A collection of methods for dimensionality reduction.
@{
@defgroup grp_pca_train Principal Component Analysis
@defgroup grp_pca_project Principal Component Projection
@}

@defgroup grp_encode_categorical Encoding Categorical Variables
@ingroup grp_datatrans

@defgroup grp_pivot Pivot
@ingroup grp_datatrans

@defgroup grp_stemmer Stemming
@ingroup grp_datatrans

@defgroup grp_graph Graph
Contains graph algorithms.
@{
@defgroup grp_apsp All Pairs Shortest Path
@defgroup grp_bfs Breadth-First Search
@defgroup grp_graph_measures Measures
Graph Measures
@{
@defgroup grp_graph_closeness Closeness
@defgroup grp_graph_diameter Graph Diameter
@defgroup grp_graph_avg_path_length Average Path Length
@defgroup grp_graph_vertex_degrees In-Out degree
@}
@defgroup grp_pagerank PageRank
@defgroup grp_sssp Single Source Shortest Path
@defgroup grp_wcc Weakly Connected Components
@}

@defgroup grp_mdl Model Selection
@{Contains functions for model selection and model evaluation. @}

@defgroup grp_validation Cross Validation
@ingroup grp_mdl

@defgroup grp_pred Prediction Metrics
@ingroup grp_mdl

@defgroup grp_train_test_split Train-Test Split
@ingroup grp_mdl

@defgroup grp_stats Statistics
@{Contains statistics modules @}

@defgroup grp_desc_stats Descriptive Statistics
@ingroup grp_stats
@{A collection of methods to compute descriptive statistics of the dataset @}

@defgroup grp_sketches Cardinality Estimators
@ingroup grp_desc_stats
@{
@defgroup grp_countmin CountMin (Cormode-Muthukrishnan)
@defgroup grp_fmsketch FM (Flajolet-Martin)
@defgroup grp_mfvsketch MFV (Most Frequent Values)
@}

@defgroup grp_correlation Pearson's Correlation
@ingroup grp_desc_stats

@defgroup grp_summary Summary
@ingroup grp_desc_stats

@defgroup grp_inf_stats Inferential Statistics
@ingroup grp_stats
@{A collection of methods to compute inferential statistics on a dataset.@}

@defgroup grp_stats_tests Hypothesis Tests
@ingroup grp_inf_stats

@defgroup grp_prob Probability Functions
@ingroup grp_stats

@defgroup grp_super Supervised Learning
@{Contains methods which
perform supervised learning tasks @}

@defgroup grp_crf Conditional Random Field
@ingroup grp_super

@defgroup grp_nn Neural Network
@ingroup grp_super

@defgroup grp_regml Regression Models
@ingroup grp_super
A collection of methods for modeling conditional expectation of a response variable.
@{
@defgroup grp_clustered_errors Clustered Variance
@defgroup grp_cox_prop_hazards Cox-Proportional Hazards Regression
@defgroup grp_elasticnet Elastic Net Regularization
@defgroup grp_glm Generalized Linear Models
@defgroup grp_linreg Linear Regression
@defgroup grp_logreg Logistic Regression
@defgroup grp_marginal Marginal Effects
@defgroup grp_multinom Multinomial Regression
@defgroup grp_ordinal Ordinal Regression
@defgroup grp_robust Robust Variance
@}

@defgroup grp_svm Support Vector Machines
@ingroup grp_super

@defgroup grp_tree Tree Methods
@ingroup grp_super
@{A collection of recursive partitioning (tree) methods. @}

@defgroup grp_decision_tree Decision Tree
@ingroup grp_tree

@defgroup grp_random_forest Random Forest
@ingroup grp_tree

@defgroup grp_tsa Time Series Analysis
@{A collection of methods to analyze time series data. @}

@defgroup grp_arima ARIMA
@ingroup grp_tsa

@defgroup grp_unsupervised Unsupervised Learning
@{A collection of methods for unsupervised learning tasks @}

@defgroup grp_association_rules Association Rules
@ingroup grp_unsupervised
@{A collection of methods used to uncover interesting patterns in transactional datasets.
@}

@defgroup grp_assoc_rules Apriori Algorithm
@ingroup grp_association_rules

@defgroup grp_clustering Clustering
@ingroup grp_unsupervised
@{A collection of methods for clustering data@}

@defgroup grp_kmeans k-Means Clustering
@ingroup grp_clustering

@defgroup grp_topic_modelling Topic Modelling
@ingroup grp_unsupervised
@{A collection of methods to uncover abstract topics in a document corpus @}

@defgroup grp_lda Latent Dirichlet Allocation
@ingroup grp_topic_modelling

@defgroup grp_utility_functions Utility Functions

@defgroup grp_utilities Developer Database Functions
@ingroup grp_utility_functions

@defgroup grp_linear_solver Linear Solvers
@ingroup grp_utility_functions
@{A collection of methods that implement solutions for systems of consistent linear equations. @}

@defgroup grp_dense_linear_solver Dense Linear Systems
@ingroup grp_linear_solver

@defgroup grp_sparse_linear_solver Sparse Linear Systems
@ingroup grp_linear_solver

@defgroup grp_path Path
@ingroup grp_utility_functions

@defgroup grp_pmml PMML Export
@ingroup grp_utility_functions

@defgroup grp_sampling Sampling
@ingroup grp_utility_functions
@{A collection of methods for sampling from a population. @}

@defgroup grp_strs Stratified Sampling
@ingroup grp_sampling

@defgroup grp_sessionize Sessionize
@ingroup grp_utility_functions

@defgroup grp_text_analysis Text Analysis
@ingroup grp_utility_functions
@{A collection of methods to find patterns in textual data. @}

@defgroup grp_text_utilities Term Frequency
@ingroup grp_text_analysis

@defgroup grp_early_stage Early Stage Development
@brief A collection of implementations which are in an early stage of development. There may be some issues that will be addressed in a future version. Interface and implementation are subject to change.
@{
@defgroup grp_cg Conjugate Gradient
@defgroup grp_bayes Naive Bayes Classification
@defgroup grp_sample Random Sampling
@defgroup grp_nene Nearest Neighbors
@{A collection of methods to create nearest neighbor based models.
@defgroup grp_knn k-Nearest Neighbors
@}
@}

@defgroup grp_deprecated Deprecated Modules
@brief A collection of deprecated modules. These functions will be removed in the next major version (2.0).
@{
@defgroup grp_indicator Create Indicator Variables
@defgroup grp_mlogreg Multinomial Logistic Regression
@}
*/