Aggregation Queries

Aggregations in Elasticsearch let you summarize data, compute statistics, and analyze trends across your dataset. The Laravel-Elasticsearch integration exposes them through the same aggregate methods that Eloquent uses, so complex data analysis reads like ordinary Eloquent code.

Basic Aggregations

You can use standard aggregate functions such as count(), max(), min(), avg(), and sum() directly on your Eloquent models, just like you would with a SQL database. These functions provide quick insights into your dataset.

$totalSales = Sale::count(); // Total number of sales
$highestPrice = Sale::max('price'); // Maximum sale price
$lowestPrice = Sale::min('price'); // Minimum sale price
$averagePricePerSale = Sale::avg('price'); // Average sale price
$totalEarnings = Sale::sum('price'); // Sum of all sale prices

// Multiple fields at once
$highestPriceAndDiscountValue = Sale::max(['price', 'discount_amount']);

These aggregation functions are straightforward and mirror the typical usage in Laravel’s Eloquent ORM, providing a seamless experience for developers.

Aggregations with Conditions

As the aggregation functions are part of the Eloquent ORM, you can also use them with conditions to filter the data you want to analyze.

$averagePrice = Product::whereNotIn('color', ['red', 'green'])->avg('price');

This returns the average price of products whose color is neither ‘red’ nor ‘green’.

Grouped Aggregations

agg() is an optimization method that allows you to call multiple aggregation functions on a single field in one call.

This call saves you from making multiple queries to get different statistics for the same field.

Product::where('is_active', true)->agg(['count', 'avg', 'min', 'max', 'sum'], 'sales');

This returns the count, average, minimum, maximum, and sum of the sales field for active products.

Available aggregation functions: count, avg, min, max, sum, matrix.

Elasticsearch Aggregations

Elasticsearch offers advanced aggregation capabilities, including matrix stats aggregations, which provide comprehensive statistics about multiple fields. The Laravel-Elasticsearch integration simplifies the usage of these advanced features.

Matrix Stats Aggregations

`matrix(string|array $fields,$options = [])`

// Matrix stats for the 'price' field, excluding 'red' and 'green' colored products
Product::whereNotIn('color', ['red', 'green'])->matrix('price');

// Matrix stats for both 'price' and 'orders' fields
Product::whereNotIn('color', ['red', 'green'])->matrix(['price', 'orders']);

Example result for matrix(['price', 'orders']):

{
    "matrix_stats_price": {
        "name": "price",
        "count": 80,
        "mean": 971.7610051512718,
        "variance": 319573.9412606584,
        "skewness": 0.09455383230430257,
        "kurtosis": 1.8069082807157395,
        "covariance": {
            "price": 319573.9412606584,
            "orders": 6096.144923266881
        },
        "correlation": {
            "price": 1,
            "orders": 0.14887562780257566
        }
    },
    "matrix_stats_orders": {
        "name": "orders",
        "count": 80,
        "mean": 120.7,
        "variance": 5246.769620253164,
        "skewness": -0.06873472971174674,
        "kurtosis": 1.8427347935190153,
        "covariance": {
            "price": 6096.144923266881,
            "orders": 5246.769620253164
        },
        "correlation": {
            "price": 0.14887562780257566,
            "orders": 1
        }
    }
}

Matrix Results Explained

The result of a matrix aggregation is a detailed statistical summary of the selected fields. Here’s a breakdown of what each statistic represents:

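doc_count: The total number of documents that matched the aggregation query. The example result above does not show this value.
fields: The statistical data for each field included in the matrix aggregation. In the example result above, each field's entry is keyed as matrix_stats_<field> and contains: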
- name: The name of the field.
- count: The number of values analyzed for this field.
- mean: The average value.
- variance: The variance indicating the data’s spread.
- skewness: A measure of the asymmetry of the data distribution.
- kurtosis: A measure of the ‘tailedness’ of the data distribution.
- covariance: The covariance between the current field and other fields in the matrix, indicating how the fields vary together.
- correlation: The correlation between the current field and other fields, showing the strength and direction of a linear relationship.
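
The relationship between covariance, variance, and correlation can be checked against the example result. The sketch below is plain PHP, not part of the package API; it copies the 'price' and 'orders' figures from the example and assumes the reported correlation is the standard Pearson coefficient (covariance divided by the product of the two standard deviations).

```php
<?php
// Values copied from the example matrix(['price', 'orders']) result.
$price = [
    'variance'   => 319573.9412606584,
    'covariance' => ['price' => 319573.9412606584, 'orders' => 6096.144923266881],
];
$orders = [
    'variance' => 5246.769620253164,
];

// Assumption: Pearson correlation = cov(x, y) / sqrt(var(x) * var(y)).
$correlation = $price['covariance']['orders']
    / sqrt($price['variance'] * $orders['variance']);

printf("%.6f\n", $correlation); // prints 0.148876
```

The computed value agrees with the "correlation" entry in the example, and a field's covariance with itself is simply its variance, which is why "price" under price's covariance repeats the variance figure.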

Boxplot Aggregations NEW

Boxplots help visualize distribution spread, quartiles, and outliers.

`boxplot(string|array $fields,$options = [])`

// Boxplot for 'price' and 'orders' fields, excluding 'red' and 'green' colored products
Product::whereNotIn('color', ['red', 'green'])->boxplot(['price', 'orders']);

Example result for boxplot(['price', 'orders']):

{
    "boxplot_price": {
        "min": 5.409999847412109,
        "max": 1997.1500244140625,
        "q1": 490.53751373291016,
        "q2": 904.2349853515625,
        "q3": 1437.7875366210938,
        "lower": 5.409999847412109,
        "upper": 1997.1500244140625
    },
    "boxplot_orders": {
        "min": 3,
        "max": 246,
        "q1": 48.25,
        "q2": 126,
        "q3": 172,
        "lower": 3,
        "upper": 246
    }
}
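
One way to read a boxplot result is to derive the interquartile range (IQR) and outlier fences from it. The following is a minimal sketch in plain PHP; tukeyFences() is a hypothetical helper, not a package method, and it assumes the conventional 1.5 × IQR rule for flagging outliers.

```php
<?php
// Values copied from the "boxplot_price" entry in the example result.
$boxplot = [
    'min' => 5.409999847412109,
    'max' => 1997.1500244140625,
    'q1'  => 490.53751373291016,
    'q2'  => 904.2349853515625,
    'q3'  => 1437.7875366210938,
];

/**
 * Hypothetical helper: IQR and outlier fences for one boxplot entry.
 * Assumes the conventional k = 1.5 multiplier.
 */
function tukeyFences(array $boxplot, float $k = 1.5): array
{
    $iqr = $boxplot['q3'] - $boxplot['q1'];

    return [
        'iqr'         => $iqr,
        'lower_fence' => $boxplot['q1'] - $k * $iqr,
        'upper_fence' => $boxplot['q3'] + $k * $iqr,
    ];
}

$fences = tukeyFences($boxplot);

// True when at least one value lies outside the fences.
$hasOutliers = $boxplot['min'] < $fences['lower_fence']
    || $boxplot['max'] > $fences['upper_fence'];

var_dump($hasOutliers); // bool(false)
```

For the example data both min and max fall inside the fences, which is consistent with "lower" and "upper" matching "min" and "max" in the result.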

Stats Aggregations NEW

`stats(string|array $fields,$options = [])`

// Stats for 'price' and 'orders' fields, excluding 'red' and 'green' colored products
Product::whereNotIn('color', ['red', 'green'])->stats(['price', 'orders']);

Example result for stats(['price', 'orders']):

{
    "stats_price": {
        "count": 80,
        "min": 5.409999847412109,
        "max": 1997.1500244140625,
        "avg": 971.7610051512718,
        "sum": 77740.88041210175
    },
    "stats_orders": {
        "count": 80,
        "min": 3,
        "max": 246,
        "avg": 120.7,
        "sum": 9656
    }
}

Extended Stats Aggregations NEW

`extendedStats(string|array $fields,$options = [])`

// Extended stats for 'price' and 'orders' fields, excluding 'red' and 'green' colored products
Product::whereNotIn('color', ['red', 'green'])->extendedStats(['price', 'orders']);

Example result for extendedStats(['price', 'orders']):

{
    "extended_stats_price": {
        "count": 80,
        "min": 5.409999847412109,
        "max": 1997.1500244140625,
        "avg": 971.7610051512718,
        "sum": 77740.88041210175,
        "sum_of_squares": 100791897.45020083,
        "variance": 315579.26699490024,
        "variance_population": 315579.26699490024,
        "variance_sampling": 319573.9412606585,
        "std_deviation": 561.7644230412783,
        "std_deviation_population": 561.7644230412783,
        "std_deviation_sampling": 565.308713236103,
        "std_deviation_bounds": {
            "upper": 2095.2898512338284,
            "lower": -151.7678409312848,
            "upper_population": 2095.2898512338284,
            "lower_population": -151.7678409312848,
            "upper_sampling": 2102.378431623478,
            "lower_sampling": -158.85642132093426
        }
    },
    "extended_stats_orders": {
        "count": 80,
        "min": 3,
        "max": 246,
        "avg": 120.7,
        "sum": 9656,
        "sum_of_squares": 1579974,
        "variance": 5181.185,
        "variance_population": 5181.185,
        "variance_sampling": 5246.769620253165,
        "std_deviation": 71.98044873436119,
        "std_deviation_population": 71.98044873436119,
        "std_deviation_sampling": 72.43458856273821,
        "std_deviation_bounds": {
            "upper": 264.6608974687224,
            "lower": -23.260897468722376,
            "upper_population": 264.6608974687224,
            "lower_population": -23.260897468722376,
            "upper_sampling": 265.5691771254764,
            "lower_sampling": -24.169177125476423
        }
    }
}
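
The population and sampling figures in this result are related by simple formulas, which the sketch below reproduces in plain PHP from the "extended_stats_price" values. It assumes the sampling variance applies Bessel's correction (n / (n - 1)) and that the bounds use Elasticsearch's default of two standard deviations; the variable names are illustrative only.

```php
<?php
// Values copied from the "extended_stats_price" entry in the example result.
$stats = [
    'count'               => 80,
    'avg'                 => 971.7610051512718,
    'variance_population' => 315579.26699490024,
    'std_deviation'       => 561.7644230412783,
];

$n = $stats['count'];

// Assumption: sampling variance = population variance * n / (n - 1).
$varianceSampling = $stats['variance_population'] * $n / ($n - 1);

// Assumption: bounds are avg +/- sigma * std_deviation with sigma = 2.
$sigma = 2;
$upper = $stats['avg'] + $sigma * $stats['std_deviation'];
$lower = $stats['avg'] - $sigma * $stats['std_deviation'];

printf("%.2f %.2f %.2f\n", $varianceSampling, $upper, $lower);
// prints 319573.94 2095.29 -151.77
```

These match "variance_sampling" and the "upper" and "lower" entries under "std_deviation_bounds" in the example, and show that the unqualified "variance" and "std_deviation" keys carry the population values.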

Cardinality Aggregations NEW

`cardinality(string|array $fields,$options = [])`

// Cardinality for 'price' and 'orders' fields, excluding 'red' and 'green' colored products
Product::whereNotIn('color', ['red', 'green'])->cardinality(['price', 'orders']);

Example result for cardinality(['price', 'orders']):

{
    "cardinality_price": 80,
    "cardinality_orders": 66
}

Median Absolute Deviation Aggregations NEW

`medianAbsoluteDeviation(string|array $fields,$options = [])`

// Median absolute deviation for 'price' and 'orders' fields, excluding 'red' and 'green' colored products
Product::whereNotIn('color', ['red', 'green'])->medianAbsoluteDeviation(['price', 'orders']);

Example result for medianAbsoluteDeviation(['price', 'orders']):

{
    "median_absolute_deviation_price": 452.6300048828125,
    "median_absolute_deviation_orders": 63.5
}
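
For reference, median absolute deviation is the median of each value's absolute distance from the dataset's median. The sketch below computes it exactly in plain PHP on a small made-up array; the two function names are hypothetical helpers, and Elasticsearch may approximate the result on large datasets, so its figures can differ slightly from an exact calculation.

```php
<?php
/** Hypothetical helper: exact median of a non-empty list of numbers. */
function exactMedian(array $values): float
{
    if ($values === []) {
        throw new InvalidArgumentException('At least one value is required.');
    }

    sort($values); // sorts the local copy only
    $n   = count($values);
    $mid = intdiv($n, 2);

    return $n % 2 === 1
        ? (float) $values[$mid]
        : ($values[$mid - 1] + $values[$mid]) / 2;
}

/** Hypothetical helper: median of absolute deviations from the median. */
function exactMedianAbsoluteDeviation(array $values): float
{
    $median     = exactMedian($values);
    $deviations = array_map(fn ($value) => abs($value - $median), $values);

    return exactMedian($deviations);
}

// Illustrative data: the single extreme value (100) barely affects the result.
$mad = exactMedianAbsoluteDeviation([1, 2, 3, 4, 100]);

var_dump($mad); // float(1)
```

Because it is built on medians, this measure is far less sensitive to outliers than the standard deviation.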

Percentiles Aggregations NEW

`percentiles(string|array $fields,$options = [])`

// Percentiles for 'price' and 'orders' fields, excluding 'red' and 'green' colored products
Product::whereNotIn('color', ['red', 'green'])->percentiles(['price', 'orders']);

Example result for percentiles(['price', 'orders']):

{
    "percentiles_price": {
        "1.0": 13.144099817276,
        "5.0": 124.55599937438966,
        "25.0": 490.53751373291016,
        "50.0": 904.2349853515625,
        "75.0": 1437.7875366210938,
        "95.0": 1827.5749877929686,
        "99.0": 1904.7594665527336
    },
    "percentiles_orders": {
        "1.0": 5.37,
        "5.0": 11,
        "25.0": 48.25,
        "50.0": 126,
        "75.0": 172,
        "95.0": 237.1,
        "99.0": 242.04999999999995
    }
}
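
A percentiles result can serve as a lookup table for placing an individual value within the distribution. The sketch below is plain PHP using the "percentiles_price" figures from the example; percentileBand() is a hypothetical helper, and it assumes the percentile keys arrive in ascending order as shown.

```php
<?php
// Values copied from the "percentiles_price" entry in the example result.
$percentiles = [
    '1.0'  => 13.144099817276,
    '5.0'  => 124.55599937438966,
    '25.0' => 490.53751373291016,
    '50.0' => 904.2349853515625,
    '75.0' => 1437.7875366210938,
    '95.0' => 1827.5749877929686,
    '99.0' => 1904.7594665527336,
];

/**
 * Hypothetical helper: the highest percentile whose threshold the value
 * reaches, or null if the value is below the lowest threshold.
 * Assumes the percentiles are ordered from lowest to highest.
 */
function percentileBand(array $percentiles, float $value): ?string
{
    $band = null;

    foreach ($percentiles as $percentile => $threshold) {
        if ($value >= $threshold) {
            $band = (string) $percentile;
        }
    }

    return $band;
}

var_dump(percentileBand($percentiles, 1000.0)); // string(4) "50.0"
var_dump(percentileBand($percentiles, 1.0));    // NULL
```

A price of 1000 sits above the median (904.23) but below the 75th percentile (1437.79), so it lands in the "50.0" band.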

String Stats Aggregations NEW

`stringStats(string|array $fields,$options = [])`

// String stats for the 'name' and 'description' fields, excluding 'red' and 'green' colored products
Product::whereNotIn('color', ['red', 'green'])->stringStats(['name.keyword', 'description.keyword']);

Example result for stringStats(['name.keyword', 'description.keyword']):

{
    "string_stats_name.keyword": {
        "count": 80,
        "min_length": 8,
        "max_length": 25,
        "avg_length": 15.1875,
        "entropy": 4.881605005180552
    },
    "string_stats_description.keyword": {
        "count": 80,
        "min_length": 190,
        "max_length": 205,
        "avg_length": 197.3625,
        "entropy": 4.530787852109844
    }
}
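
The "entropy" figure is a Shannon entropy, which measures how evenly characters are distributed. To make the number easier to interpret, here is a minimal plain-PHP sketch for a single string; shannonEntropy() is a hypothetical helper that counts bytes rather than multibyte characters, and Elasticsearch computes its value across all collected terms, so this illustrates the formula rather than reproducing the figures above.

```php
<?php
/**
 * Hypothetical helper: Shannon entropy of a string in bits per byte.
 * Assumption: each byte is treated as one symbol.
 */
function shannonEntropy(string $text): float
{
    $length = strlen($text);
    if ($length === 0) {
        return 0.0;
    }

    $entropy = 0.0;
    // count_chars(..., 1) returns only the bytes that occur, with their counts.
    foreach (count_chars($text, 1) as $count) {
        $probability = $count / $length;
        $entropy -= $probability * log($probability, 2);
    }

    return $entropy;
}

printf("%.2f\n", shannonEntropy('aabb')); // prints 1.00
printf("%.2f\n", shannonEntropy('abcd')); // prints 2.00
```

A higher value means the characters are more varied and evenly spread; a string made of one repeated character has an entropy of zero.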