Some Useful Functions

Import the LArray library:

[1]:
from larray import *
[2]:
# load 'demography_eurostat' dataset
demo_eurostat = load_example_data('demography_eurostat')

# extract the 'pop' array from the dataset
pop = demo_eurostat.pop
pop

with_total

Add totals to one or several axes:

[3]:
pop.with_total('gender', label='Total')

See with_total for more details and examples.
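
To make the effect concrete, here is a minimal plain-Python sketch of what adding a total along one axis amounts to. It is not LArray's implementation; the dictionary `pop_by_gender` and its figures are hypothetical and only serve as an illustration.

```python
# Hypothetical population counts by gender (illustrative numbers only)
pop_by_gender = {'Male': 5_500_000, 'Female': 5_700_000}

# Adding a total appends one extra label holding the sum over the axis
with_total = dict(pop_by_gender)
with_total['Total'] = sum(pop_by_gender.values())

print(with_total)  # {'Male': 5500000, 'Female': 5700000, 'Total': 11200000}
```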

where

The where function selects, element by element, one of two values depending on a condition:

[4]:
# where(condition, value if true, value if false)
where(pop < pop.mean('time'), -pop, pop)

See where for more details and examples.
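
The element-wise selection can be sketched in plain Python as follows. This is only an illustration of the logic on a flat list, assuming made-up values; LArray applies the same idea across labelled axes.

```python
# Hypothetical yearly values for one series (illustrative numbers only)
values = [10, 12, 14]
mean = sum(values) / len(values)  # 12.0

# where(condition, value if true, value if false), applied element by element:
# values below the mean are negated, the others are kept unchanged
result = [-v if v < mean else v for v in values]

print(result)  # [-10, 12, 14]
```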

clip

Limit (clip) all values to a given range:

[5]:
# values below 10 million are set to 10 million
pop.clip(minval=10**7)
[6]:
# values above 40 million are set to 40 million
pop.clip(maxval=4*10**7)
[7]:
# values below 10 million are set to 10 million and
# values above 40 million are set to 40 million
pop.clip(10**7, 4*10**7)

See clip for more details and examples.
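
As a rough illustration of the three calls above, the following plain-Python sketch clips a list of made-up values. The helper `clip` defined here is hypothetical and only mirrors the `minval`/`maxval` arguments used in this section; it is not LArray's code.

```python
def clip(values, minval=None, maxval=None):
    """Limit each value to [minval, maxval]; a bound left as None is ignored."""
    out = []
    for v in values:
        if minval is not None and v < minval:
            v = minval
        if maxval is not None and v > maxval:
            v = maxval
        out.append(v)
    return out

# Hypothetical population values (illustrative numbers only)
values = [5_000_000, 20_000_000, 80_000_000]

print(clip(values, minval=10**7))      # [10000000, 20000000, 80000000]
print(clip(values, maxval=4 * 10**7))  # [5000000, 20000000, 40000000]
print(clip(values, 10**7, 4 * 10**7))  # [10000000, 20000000, 40000000]
```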

divnot0

Divide one array by another, returning 0 wherever the divisor is 0:

[8]:
divisor = ones(pop.axes, dtype=int)
divisor['Male'] = 0
divisor
[9]:
pop / divisor
[10]:
# we use astype(int) since the divnot0 method
# returns a float array in this case while
# we want an integer array
pop.divnot0(divisor).astype(int)

See divnot0 for more details and examples.
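
The behaviour can be sketched in plain Python as below. The function is a hypothetical stand-in written for this illustration (it merely reuses the name `divnot0`), and the figures are made up; the final conversion plays the role of `astype(int)` in the cell above.

```python
def divnot0(numerators, divisors):
    """Divide element by element, returning 0.0 wherever the divisor is 0."""
    return [n / d if d != 0 else 0.0 for n, d in zip(numerators, divisors)]

pop_values = [5_500_000, 5_700_000]  # hypothetical 'Male' and 'Female' values
divisor = [0, 1]                     # the divisor is 0 for 'Male'

as_float = divnot0(pop_values, divisor)
as_int = [int(x) for x in as_float]  # convert back to integers

print(as_float)  # [0.0, 5700000.0]
print(as_int)    # [0, 5700000]
```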

ratio

The ratio method returns an array with all values divided by the sum of values along the given axes. Its variant rationot0 does the same but returns 0 where that sum is 0:

[11]:
pop.ratio('gender')

# which is equivalent to
pop / pop.sum('gender')

See ratio and rationot0 for more details and examples.
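
A minimal plain-Python sketch of the computation along a single axis is given below. The helpers `ratio` and `ratio_not0` are hypothetical names used for this illustration only, and the counts are made up.

```python
def ratio(values_by_label):
    """Divide each value by the sum of all values."""
    total = sum(values_by_label.values())
    return {label: value / total for label, value in values_by_label.items()}

def ratio_not0(values_by_label):
    """Same as ratio, but return 0.0 for every label when the sum is 0."""
    total = sum(values_by_label.values())
    return {label: (value / total if total != 0 else 0.0)
            for label, value in values_by_label.items()}

counts = {'Male': 25, 'Female': 75}  # illustrative numbers only

print(ratio(counts))                         # {'Male': 0.25, 'Female': 0.75}
print(ratio_not0({'Male': 0, 'Female': 0}))  # {'Male': 0.0, 'Female': 0.0}
```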

percents

[12]:
# the ratios along 'gender', expressed as percentages
pop.percent('gender')

See percent for more details and examples.
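
In plain Python, and assuming the same kind of made-up counts, the percentage of each label in the total along one axis could be sketched as:

```python
counts = {'Male': 25, 'Female': 75}  # illustrative numbers only
total = sum(counts.values())

# Each value as a percentage of the sum over the axis
percents = {label: 100 * value / total for label, value in counts.items()}

print(percents)  # {'Male': 25.0, 'Female': 75.0}
```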

diff

The diff method calculates the n-th order discrete difference along a given axis.

The first order difference is given by out[n+1] = in[n+1] - in[n] along the given axis.

[13]:
# calculates 'diff[year+1] = pop[year+1] - pop[year]'
pop.diff('time')
[14]:
# calculates 'diff[year+2] = pop[year+2] - pop[year]'
pop.diff('time', d=2)
[15]:
# calculates 'diff[year] = pop[year+1] - pop[year]'
pop.diff('time', label='lower')

See diff for more details and examples.
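
The three variants above differ only in the lag `d` and in which label each difference is attached to. The following plain-Python sketch illustrates this on a single made-up series; the helper `diff` is a hypothetical stand-in that mimics the `d` and `label` arguments shown in this section, not LArray's implementation.

```python
def diff(labels, values, d=1, label='upper'):
    """First-order difference with lag d, keyed by the upper or lower label."""
    deltas = [values[i + d] - values[i] for i in range(len(values) - d)]
    kept = labels[d:] if label == 'upper' else labels[:-d]
    return dict(zip(kept, deltas))

years = [2013, 2014, 2015, 2016]
pop_by_year = [100, 103, 107, 112]  # illustrative numbers only

print(diff(years, pop_by_year))                 # {2014: 3, 2015: 4, 2016: 5}
print(diff(years, pop_by_year, d=2))            # {2015: 7, 2016: 9}
print(diff(years, pop_by_year, label='lower'))  # {2013: 3, 2014: 4, 2015: 5}
```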

growth_rate

The growth_rate method calculates the growth rate (the relative change between successive values) along a given axis.

It is roughly equivalent to a.diff(axis, d, label) / a[axis.i[:-d]]:

[16]:
pop.growth_rate('time')

See growth_rate for more details and examples.
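
The formula above can be checked on a small made-up series with a plain-Python sketch. The helper `growth_rate` below is hypothetical and assumes each rate is attached to the later of the two labels.

```python
def growth_rate(labels, values, d=1):
    """Relative change between values[i + d] and values[i], keyed by the later label."""
    rates = [(values[i + d] - values[i]) / values[i]
             for i in range(len(values) - d)]
    return dict(zip(labels[d:], rates))

years = [2013, 2014, 2015]
pop_by_year = [100, 110, 121]  # illustrative numbers: 10% growth each year

rates = growth_rate(years, pop_by_year)
print(rates)  # rates for 2014 and 2015, both 0.1 up to floating-point rounding
```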

shift

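The shift method drops the first label of an axis and moves the data one position forward along it, so that each remaining label holds the value previously attached to the preceding label: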

[17]:
pop.shift('time')
[18]:
# when shift is applied on an (increasing) time axis,
# it effectively brings "past" data into the future
pop_shifted = pop.shift('time')
stack({'pop_shifted_2014': pop_shifted[2014], 'pop_2013': pop[2013]}, 'array')

See shift for more details and examples.
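
A minimal plain-Python sketch of a one-step shift along a time axis, using made-up figures, shows why the shifted value for 2014 equals the original value for 2013. It only illustrates the idea and is not LArray's implementation.

```python
years = [2013, 2014, 2015]
pop_by_year = {2013: 100, 2014: 103, 2015: 107}  # illustrative numbers only

# Drop the first label; every remaining label receives the previous label's value
shifted = {years[i]: pop_by_year[years[i - 1]] for i in range(1, len(years))}

print(shifted)  # {2014: 100, 2015: 103}
```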

Other interesting functions

Many more useful functions are described in the API reference, in the sections Aggregation Functions, Miscellaneous and Utility Functions.