larray.Array.describe_by
- Array.describe_by(*args, percentiles=None) -> Array
Descriptive summary statistics, excluding NaN values, along axes or for groups.
By default, the result includes the number of non-NaN values, the mean, the standard deviation, the minimum, the maximum, and the 25th, 50th and 75th percentiles.
- Parameters
- *args : int or str or Axis or Group or any combination of those, optional
Axes or groups to keep in the result after aggregating. By default, the statistics are aggregated over the whole array.
- percentiles : array-like, optional
List of integer percentiles to include. Defaults to [25, 50, 75].
- Returns
- Array
See also
Examples
>>> data = [[0, 6, 3, 5, 4, 2, 1, 3], [7, 5, 3, 2, 8, 5, 6, 4]]
>>> arr = Array(data, 'gender=Male,Female;year=2013..2020').astype(float)
>>> arr
gender\year  2013  2014  2015  2016  2017  2018  2019  2020
       Male   0.0   6.0   3.0   5.0   4.0   2.0   1.0   3.0
     Female   7.0   5.0   3.0   2.0   8.0   5.0   6.0   4.0
>>> arr.describe_by('gender')
gender\statistic  count  mean  std  min   25%  50%   75%  max
            Male    8.0   3.0  2.0  0.0  1.75  3.0  4.25  6.0
          Female    8.0   5.0  2.0  2.0  3.75  5.0  6.25  8.0
>>> arr.describe_by('gender', (X.year[:2015], X.year[2018:]))
gender  year\statistic  count  mean  std  min  25%  50%  75%  max
  Male           :2015    3.0   3.0  3.0  0.0  1.5  3.0  4.5  6.0
  Male           2018:    3.0   2.0  1.0  1.0  1.5  2.0  2.5  3.0
Female           :2015    3.0   5.0  2.0  3.0  4.0  5.0  6.0  7.0
Female           2018:    3.0   5.0  1.0  4.0  4.5  5.0  5.5  6.0
>>> arr.describe_by('gender', percentiles=[50, 90])
gender\statistic  count  mean  std  min  50%  90%  max
            Male    8.0   3.0  2.0  0.0  3.0  5.3  6.0
          Female    8.0   5.0  2.0  2.0  5.0  7.3  8.0
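To make the statistics in the 'Male' row easier to check by hand, here is a minimal pure-Python sketch that reproduces them. It is not larray's implementation. The helpers `percentile_linear` and `describe` are hypothetical names introduced for illustration, and the choice of the sample standard deviation (divisor n - 1) and of linear interpolation between closest ranks is an assumption inferred from the example output (std 2.0, 25% at 1.75, 90% at 5.3), not something the text above states.

```python
import math
import statistics


def percentile_linear(values, q):
    """Percentile q (0-100), interpolating linearly between closest ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100   # fractional index into the sorted data
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def describe(values, percentiles=(25, 50, 75)):
    """Hypothetical helper: summary statistics of a flat sequence, skipping NaN.

    Needs at least two non-NaN values, because the sample standard
    deviation is undefined otherwise.
    """
    clean = [v for v in values if not math.isnan(v)]
    stats = {
        "count": float(len(clean)),
        "mean": statistics.fmean(clean),
        "std": statistics.stdev(clean),  # assumption: sample std, divisor n - 1
        "min": min(clean),
    }
    for q in percentiles:
        stats[f"{q}%"] = percentile_linear(clean, q)
    stats["max"] = max(clean)
    return stats


# The 'Male' row of the example array, years 2013 to 2020.
male = [0.0, 6.0, 3.0, 5.0, 4.0, 2.0, 1.0, 3.0]
summary = describe(male)
for name, value in summary.items():
    print(f"{name:>5}: {value:g}")
```

Calling `describe(male, percentiles=(50, 90))` mirrors the `percentiles=[50, 90]` call in the examples. Grouping by an axis then amounts to applying the same summary to each label's slice of the data.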