larray.LArray.percentile_by

LArray.percentile_by(q, *axes_and_groups, out=None, interpolation='linear', skipna=None, keepaxes=False, **explicit_axes)

Computes the qth percentile of the data, aggregating over all axes except the specified one(s).

Parameters
q : int in range of [0,100] (or sequence of floats)

Percentile to compute, which must be between 0 and 100 inclusive.

*axes_and_groups : None or int or str or Axis or Group or any combination of those

The qth percentile is computed along all axes except the given one(s). For groups, the qth percentile is computed along the groups and the non-associated axes. The default (no axis or group) is to compute the qth percentile over all the dimensions of the input array.
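
To make the "all axes except the given one(s)" rule concrete, here is a minimal pure-Python sketch of the arithmetic. The `percentile` helper is hypothetical and is not larray's implementation; it assumes linear interpolation and uses the same values as `ndtest((4, 4))` from the Examples section.

```python
import math

def percentile(values, q):
    """Hypothetical helper: linear-interpolation percentile, q in [0, 100]."""
    data = sorted(values)
    pos = q / 100 * (len(data) - 1)      # fractional index into the sorted data
    lo, hi = math.floor(pos), math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)

# Same values as ndtest((4, 4)): rows follow axis 'a', columns follow axis 'b'.
rows = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]

# Keeping 'a' means reducing every other axis (here only 'b'), and vice versa.
by_a = [percentile(row, 25) for row in rows]
by_b = [percentile(col, 25) for col in zip(*rows)]
# With no axis given, every dimension is reduced to a single scalar.
overall = percentile([v for row in rows for v in row], 25)

print(by_a, by_b, overall)
```

The three results correspond to `arr.percentile_by(25, 'a')`, `arr.percentile_by(25, 'b')` and `arr.percentile_by(25)` in the Examples section.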

An axis can be referred to by:

  • its index (integer). Index can be a negative integer, in which case it counts from the last to the first axis.

  • its name (str or AxisReference). You can use either a simple string ('axis_name') or the special variable X (X.axis_name).

  • a variable (Axis). If the axis has been defined previously and assigned to a variable, you can pass it as argument.

You may not want to compute the qth percentile over a whole axis but over a selection of specific labels. To do so, you have several possibilities:

  • (['a1', 'a3', 'a5'], 'b1, b3, b5') : labels separated by commas in a list or a string

  • ('a1:a5:2') : select labels using a slice (general syntax is 'start:end:step', where 'step' is optional and 1 by default).

  • (X.b['b1, b2, b3'], a='a1, a2, a3') : in case of possible ambiguity, i.e. if labels can belong to more than one axis, you must specify the axis.

  • ('a1:a3; a5:a7', b='b0,b2; b1,b3') : create several groups with semicolons. Names are simply given by the concatenation of labels (here: 'a1,a2,a3', 'a5,a6,a7', 'b0,b2' and 'b1,b3')

  • ('a1:a3 >> a123', 'b[b0,b2] >> b12') : the ' >> ' operator renames groups.
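
As a rough illustration of how semicolon-separated groups map to results, the sketch below splits a group string by hand and computes one percentile per group. The string splitting and the `percentile` helper are assumptions made for illustration; they are not larray's parser or implementation, and slices and ' >> ' renaming are not handled.

```python
import math

def percentile(values, q):
    """Hypothetical helper: linear-interpolation percentile, q in [0, 100]."""
    data = sorted(values)
    pos = q / 100 * (len(data) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)

# Same values as ndtest((4, 4)), keyed by the labels of axis 'a'.
rows = {
    'a0': [0, 1, 2, 3],
    'a1': [4, 5, 6, 7],
    'a2': [8, 9, 10, 11],
    'a3': [12, 13, 14, 15],
}

# Simplified parsing: ';' separates groups, ',' separates labels in a group.
# Each group is named by the concatenation of its labels.
spec = 'a0,a1;a2,a3'
groups = {part: part.split(',') for part in spec.split(';')}

# One percentile per group, pooling all cells that belong to the group.
result = {
    name: percentile([v for label in labels for v in rows[label]], 25)
    for name, labels in groups.items()
}
print(result)
```

The two values match the "Split an axis in several parts" example in the Examples section.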

out : LArray, optional

Alternate output array in which to place the result. It must have the same shape as the expected output and its type is preserved (e.g., if dtype(out) is float, the result will consist of 0.0’s and 1.0’s). Axes and labels can be different, only the shape matters. Defaults to None (create a new array).

interpolation : {'linear', 'lower', 'higher', 'midpoint', 'nearest'}, optional

Interpolation method to use when the desired quantile lies between two data points i < j:

  • linear: i + (j - i) * fraction, where fraction is the fractional part of the index surrounded by i and j.

  • lower: i.

  • higher: j.

  • nearest: i or j, whichever is nearest.

  • midpoint: (i + j) / 2.

Defaults to 'linear'.
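
The five methods can be compared side by side with a small pure-Python sketch. The `percentile` function below is a hypothetical helper written for illustration, not larray's implementation, and its tie-breaking for 'nearest' (when the fraction is exactly 0.5) is an assumption.

```python
import math

def percentile(values, q, interpolation='linear'):
    """Hypothetical helper illustrating the interpolation options."""
    data = sorted(values)
    pos = q / 100 * (len(data) - 1)       # fractional index of the quantile
    lo, hi = math.floor(pos), math.ceil(pos)
    i, j = data[lo], data[hi]             # the two surrounding data points
    fraction = pos - lo
    if interpolation == 'linear':
        return i + (j - i) * fraction
    if interpolation == 'lower':
        return i
    if interpolation == 'higher':
        return j
    if interpolation == 'midpoint':
        return (i + j) / 2
    if interpolation == 'nearest':
        # Assumption of this sketch: an exact tie (fraction == 0.5) picks j.
        return i if fraction < 0.5 else j
    raise ValueError(f"unknown interpolation: {interpolation!r}")

data = range(16)  # same values as ndtest((4, 4)), flattened
methods = ('linear', 'lower', 'higher', 'midpoint', 'nearest')
results = {m: percentile(data, 25, m) for m in methods}
print(results)
```

For the 25th percentile of 0..15, the quantile sits at index 3.75, between the data points 3 and 4, so the methods return different values.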

skipna : bool, optional

Whether or not to skip NaN (null) values. If False, resulting cells will be NaN if any of the aggregated cells is NaN. Defaults to True.
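
The effect of skipna on a single aggregated cell can be sketched in pure Python. The `percentile` helper is hypothetical and only mirrors the behaviour described above; it is not larray's implementation.

```python
import math

def percentile(values, q, skipna=True):
    """Hypothetical helper: linear percentile with optional NaN skipping."""
    if skipna:
        # Drop NaN values before aggregating.
        values = [v for v in values if not math.isnan(v)]
    elif any(math.isnan(v) for v in values):
        # Any NaN among the aggregated cells makes the result NaN.
        return math.nan
    if not values:
        return math.nan                   # nothing left to aggregate
    data = sorted(values)
    pos = q / 100 * (len(data) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)

cells = [0.0, math.nan, 2.0, 4.0]
skipped = percentile(cells, 50, skipna=True)    # median of [0.0, 2.0, 4.0]
kept = percentile(cells, 50, skipna=False)      # NaN propagates
print(skipped, kept)
```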

keepaxes : bool or label-like, optional

Whether or not reduced axes are left in the result as dimensions with size one. If True, reduced axes will contain a unique label representing the applied aggregation (e.g. 'sum', 'prod', …). It is possible to override this label by passing a specific value (e.g. keepaxes='summation'). Defaults to False.

Returns
LArray or scalar

Examples

>>> from larray import ndtest, X
>>> arr = ndtest((4, 4))
>>> arr
a\b  b0  b1  b2  b3
 a0   0   1   2   3
 a1   4   5   6   7
 a2   8   9  10  11
 a3  12  13  14  15
>>> arr.percentile_by(25)
3.75
>>> # one value per label of axis 'a' (percentile taken over the other axes)
>>> arr.percentile_by(25, 'a')
a    a0    a1    a2     a3
   0.75  4.75  8.75  12.75
>>> # one value per label of axis 'b' (percentile taken over the other axes)
>>> arr.percentile_by(25, 'b')
b   b0   b1   b2   b3
   3.0  4.0  5.0  6.0
>>> # several percentile values
>>> arr.percentile_by([25, 50, 75], 'b')
percentile\b   b0    b1    b2    b3
          25  3.0   4.0   5.0   6.0
          50  6.0   7.0   8.0   9.0
          75  9.0  10.0  11.0  12.0

Select some rows only

>>> arr.percentile_by(25, ['a0', 'a1'])
1.75
>>> # or equivalently
>>> # arr.percentile_by(25, 'a0,a1')

Split an axis in several parts

>>> arr.percentile_by(25, (['a0', 'a1'], ['a2', 'a3']))
a  a0,a1  a2,a3
    1.75   9.75
>>> # or equivalently
>>> # arr.percentile_by(25, 'a0,a1;a2,a3')

Same with renaming

>>> arr.percentile_by(25, (X.a['a0', 'a1'] >> 'a01', X.a['a2', 'a3'] >> 'a23'))
a   a01   a23
   1.75  9.75
>>> # or equivalently
>>> # arr.percentile_by(25, 'a0,a1>>a01;a2,a3>>a23')