larray.Array.mean_by

Array.mean_by(*axes_and_groups, dtype=None, out=None, skipna=None, keepaxes=False, **explicit_axes)

Compute the arithmetic mean.

Parameters
*axes_and_groups : None or int or str or Axis or Group or any combination of those

The mean is computed over all axes except the given one(s). For groups, the mean is computed over each group and over the axes not associated with any group. With no axis or group (the default), the mean is computed over all dimensions of the input array.

An axis can be referred to by:

  • its index (integer). Index can be a negative integer, in which case it counts from the last to the first axis.

  • its name (str or AxisReference). You can use either a simple string ('axis_name') or the special variable X (X.axis_name).

  • a variable (Axis). If the axis has been defined previously and assigned to a variable, you can pass it as argument.
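
To make the "all axes except the given one(s)" rule concrete, here is a minimal pure-Python sketch for a two-dimensional case. It does not use larray: `resolve_axis` and `mean_by_2d` are hypothetical helpers written only for illustration, and the real method handles any number of dimensions as well as Axis objects.

```python
from statistics import fmean

AXIS_NAMES = ["a", "b"]          # axis names of the example 4x4 array
DATA = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]        # rows follow axis 'a', columns follow axis 'b'


def resolve_axis(ref):
    """Map an axis reference (int index or name) to a position (hypothetical helper)."""
    if isinstance(ref, int):
        n = len(AXIS_NAMES)
        if not -n <= ref < n:
            raise IndexError(f"axis index {ref} is out of range")
        # a negative index counts from the last axis
        return ref % n
    return AXIS_NAMES.index(ref)


def mean_by_2d(data, ref):
    """Keep the referenced axis and average over the other one."""
    kept = resolve_axis(ref)
    if kept == 0:
        # keep 'a': average each row over 'b'
        return [fmean(row) for row in data]
    # keep 'b': average each column over 'a'
    return [fmean(col) for col in zip(*data)]


by_a = mean_by_2d(DATA, "a")     # one value per label of 'a'
by_b = mean_by_2d(DATA, -1)      # -1 is the last axis, i.e. 'b'
```

Here `by_a` holds one mean per label of 'a' and `by_b` one mean per label of 'b', matching the values shown in the Examples section.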

You may not want to perform the mean over a whole axis but over a selection of specific labels. To do so, you have several possibilities:

  • (['a1', 'a3', 'a5'], 'b1, b3, b5') : labels separated by commas in a list or a string

  • ('a1:a5:2') : select labels using a slice (general syntax is 'start:end:step', where 'step' is optional and defaults to 1).

  • (a='a1, a2, a3', X.b['b1, b2, b3']) : in case of possible ambiguity, i.e. if labels can belong to more than one axis, you must specify the axis.

  • ('a1:a3; a5:a7', b='b0,b2; b1,b3') : create several groups with semicolons. Names are simply given by the concatenation of labels (here: 'a1,a2,a3', 'a5,a6,a7', 'b0,b2' and 'b1,b3')

  • ('a1:a3 >> a123', 'b[b0,b2] >> b12') : the ' >> ' operator renames groups.
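
The group syntaxes listed above can be mimicked with a small pure-Python sketch. It is a deliberately simplified stand-in, not larray's parser: `expand_group` and `mean_by_groups` are hypothetical helpers that only understand comma lists, 'start:end:step' slices with both bounds given, ';' separators and '>>' renaming, on a single axis.

```python
from statistics import fmean

A_LABELS = ["a0", "a1", "a2", "a3"]
ROWS = dict(zip(A_LABELS, [[0, 1, 2, 3],
                           [4, 5, 6, 7],
                           [8, 9, 10, 11],
                           [12, 13, 14, 15]]))


def expand_group(spec, labels):
    """Expand 'x,y,z' or 'start:end[:step]' into a list of labels (simplified)."""
    if ":" in spec:
        parts = [p.strip() for p in spec.split(":")]
        start = labels.index(parts[0])
        stop = labels.index(parts[1])
        step = int(parts[2]) if len(parts) > 2 else 1
        # unlike positional slices, label slices include the end label
        return labels[start:stop + 1:step]
    return [label.strip() for label in spec.split(",")]


def mean_by_groups(spec, labels, rows):
    """Return {group name: mean over the group's rows and all columns}."""
    result = {}
    for group_spec in spec.split(";"):
        # 'labels >> name' renames the group; otherwise labels are concatenated
        selection, _, name = group_spec.partition(">>")
        members = expand_group(selection.strip(), labels)
        name = name.strip() or ",".join(members)
        values = [v for label in members for v in rows[label]]
        result[name] = fmean(values)
    return result


unnamed = mean_by_groups("a0,a1;a2,a3", A_LABELS, ROWS)
renamed = mean_by_groups("a0:a1 >> a01; a2:a3 >> a23", A_LABELS, ROWS)
```

Both calls produce the same two means, keyed by the concatenated labels in the first case and by the names given after '>>' in the second.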

dtype : dtype, optional

The data type of the returned array. Defaults to None (the dtype of the input array).

out : Array, optional

Alternate output array in which to place the result. It must have the same shape as the expected output and its type is preserved (e.g., if dtype(out) is float, the result will consist of 0.0’s and 1.0’s). Axes and labels can be different, only the shape matters. Defaults to None (create a new array).

skipna : bool, optional

Whether to skip NaN (null) values. If False, resulting cells will be NaN if any of the aggregated cells is NaN. Defaults to True.
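
The effect of skipna on a single aggregated cell can be sketched in plain Python. The helper `mean_skipna` is hypothetical and only mirrors the rule described above. Returning NaN when every value is NaN is an assumption of this sketch, not documented behaviour.

```python
import math
from statistics import fmean


def mean_skipna(values, skipna=True):
    """Mean of a list of floats with optional NaN skipping (illustrative helper)."""
    has_nan = any(math.isnan(v) for v in values)
    if has_nan and not skipna:
        # any NaN among the aggregated cells makes the result NaN
        return math.nan
    kept = [v for v in values if not math.isnan(v)]
    # assumption: with nothing left to average, the result is NaN
    return fmean(kept) if kept else math.nan


cells = [1.0, math.nan, 3.0]
skipped = mean_skipna(cells)                    # NaN ignored: mean of 1.0 and 3.0
propagated = mean_skipna(cells, skipna=False)   # NaN propagates to the result
```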

keepaxes : bool or label-like, optional

Whether reduced axes are left in the result as dimensions with size one. If True, reduced axes will contain a unique label representing the applied aggregation (e.g. 'sum', 'prod', …). It is possible to override this label by passing a specific value (e.g. keepaxes='summation'). Defaults to False.
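
As a rough illustration of how keepaxes changes the shape of the result, the sketch below averages a 4x4 table over its row axis 'a' in plain Python. `mean_over_rows` is a hypothetical helper, and the default label 'mean' is an assumption made by analogy with the 'sum' and 'prod' labels mentioned above.

```python
from statistics import fmean

ROWS = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]


def mean_over_rows(rows, keepaxes=False):
    """Average over the row axis 'a', optionally keeping it with length one.

    Returns (labels of axis 'a' or None, data).
    """
    col_means = [fmean(col) for col in zip(*rows)]
    if keepaxes is False:
        # axis 'a' disappears: the result is one-dimensional (4 values)
        return None, col_means
    # assumption: 'mean' is the default label; any other value overrides it
    label = "mean" if keepaxes is True else keepaxes
    # axis 'a' survives with a single label: the result is 1 x 4
    return [label], [col_means]


dropped = mean_over_rows(ROWS)
kept = mean_over_rows(ROWS, keepaxes=True)
custom = mean_over_rows(ROWS, keepaxes="average")
```

The data values are identical in all three cases. Only the presence and the label of the length-one axis differ.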

Returns
Array or scalar

Examples

>>> from larray import ndtest, X
>>> arr = ndtest((4, 4))
>>> arr
a\b  b0  b1  b2  b3
 a0   0   1   2   3
 a1   4   5   6   7
 a2   8   9  10  11
 a3  12  13  14  15
>>> arr.mean()
7.5
>>> # keep axis 'a' (mean over all other axes)
>>> arr.mean_by('a')
a   a0   a1   a2    a3
   1.5  5.5  9.5  13.5
>>> # keep axis 'b' (mean over all other axes)
>>> arr.mean_by('b')
b   b0   b1   b2   b3
   6.0  7.0  8.0  9.0

Select some rows only

>>> arr.mean_by(['a0', 'a1'])
3.5
>>> # or equivalently
>>> # arr.mean_by('a0,a1')

Split an axis into several parts

>>> arr.mean_by((['a0', 'a1'], ['a2', 'a3']))
a  a0,a1  a2,a3
     3.5   11.5
>>> # or equivalently
>>> # arr.mean_by('a0,a1;a2,a3')

Same with renaming

>>> arr.mean_by((X.a['a0', 'a1'] >> 'a01', X.a['a2', 'a3'] >> 'a23'))
a  a01   a23
   3.5  11.5
>>> # or equivalently
>>> # arr.mean_by('a0,a1>>a01;a2,a3>>a23')