
Working With Sessions

Import the LArray library:

[1]:
from larray import *

Before Continuing

If you are not yet comfortable with creating, saving and loading sessions, please read the Creating Sessions and Loading and Dumping Sessions sections of the tutorial before going further.

Exploring Content

To get the list of item names of a session, use the names shortcut (note that the list is sorted alphabetically and does not follow the internal order!):

[2]:
# load a session representing the results of a demographic model
filepath_hdf = get_example_filepath('demography_eurostat.h5')
s_pop = Session(filepath_hdf)

# print the content of the session
print(s_pop.names)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-6d556d84191c> in <module>
      1 # load a session representing the results of a demographic model
      2 filepath_hdf = get_example_filepath('demography_eurostat.h5')
----> 3 s_pop = Session(filepath_hdf)
      4
      5 # print the content of the session

~/checkouts/readthedocs.org/user_builds/larray/conda/0.32/lib/python3.6/site-packages/larray-0.32-py3.6.egg/larray/core/session.py in __init__(self, *args, **kwargs)
     94             if isinstance(a0, str):
     95                 # assume a0 is a filename
---> 96                 self.load(a0)
     97             else:
     98                 # iterable of tuple or dict-like

~/checkouts/readthedocs.org/user_builds/larray/conda/0.32/lib/python3.6/site-packages/larray-0.32-py3.6.egg/larray/core/session.py in load(self, fname, names, engine, display, **kwargs)
    426         else:
    427             handler = handler_cls(fname)
--> 428         metadata, objects = handler.read(names, display=display, **kwargs)
    429         for k, v in objects.items():
    430             self[k] = v

~/checkouts/readthedocs.org/user_builds/larray/conda/0.32/lib/python3.6/site-packages/larray-0.32-py3.6.egg/larray/inout/common.py in read(self, keys, *args, **kwargs)
    128                 print("loading", type, "object", key, "...", end=' ')
    129             try:
--> 130                 res[key] = self._read_item(key, type, *args, **kwargs)
    131             except Exception:
    132                 if not ignore_exceptions:

~/checkouts/readthedocs.org/user_builds/larray/conda/0.32/lib/python3.6/site-packages/larray-0.32-py3.6.egg/larray/inout/hdf.py in _read_item(self, key, type, *args, **kwargs)
    137         else:
    138             raise TypeError()
--> 139         return read_hdf(self.handle, hdf_key, *args, **kwargs)
    140
    141     def _dump_item(self, key, value, *args, **kwargs):

~/checkouts/readthedocs.org/user_builds/larray/conda/0.32/lib/python3.6/site-packages/larray-0.32-py3.6.egg/larray/inout/hdf.py in read_hdf(filepath_or_buffer, key, fill_value, na, sort_rows, sort_columns, name, **kwargs)
     81             cartesian_prod = writer != 'LArray'
     82             res = df_asarray(pd_obj, sort_rows=sort_rows, sort_columns=sort_columns, fill_value=fill_value,
---> 83                              parse_header=False, cartesian_prod=cartesian_prod)
     84             if _meta is not None:
     85                 res.meta = _meta

~/checkouts/readthedocs.org/user_builds/larray/conda/0.32/lib/python3.6/site-packages/larray-0.32-py3.6.egg/larray/inout/pandas.py in df_asarray(df, sort_rows, sort_columns, raw, parse_header, wide, cartesian_prod, **kwargs)
    338         unfold_last_axis_name = isinstance(axes_names[-1], basestring) and '\\' in axes_names[-1]
    339         res = from_frame(df, sort_rows=sort_rows, sort_columns=sort_columns, parse_header=parse_header,
--> 340                          unfold_last_axis_name=unfold_last_axis_name, cartesian_prod=cartesian_prod, **kwargs)
    341
    342     # ugly hack to avoid anonymous axes converted as axes with name 'Unnamed: x' by pandas

~/checkouts/readthedocs.org/user_builds/larray/conda/0.32/lib/python3.6/site-packages/larray-0.32-py3.6.egg/larray/inout/pandas.py in from_frame(df, sort_rows, sort_columns, parse_header, unfold_last_axis_name, fill_value, meta, cartesian_prod, **kwargs)
    241             raise ValueError('sort_rows and sort_columns cannot not be used when cartesian_prod is set to False. '
    242                              'Please call the method sort_axes on the returned array to sort rows or columns')
--> 243         axes_labels = index_to_labels(df.index, sort=False)
    244
    245     # Pandas treats column labels as column names (strings) so we need to convert them to values

~/checkouts/readthedocs.org/user_builds/larray/conda/0.32/lib/python3.6/site-packages/larray-0.32-py3.6.egg/larray/inout/pandas.py in index_to_labels(idx, sort)
     41     Returns unique labels for each dimension.
     42     """
---> 43     if isinstance(idx, pd.core.index.MultiIndex):
     44         if sort:
     45             return list(idx.levels)

AttributeError: module 'pandas.core' has no attribute 'index'
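
The difference between an alphabetical listing and the internal order can be pictured with a plain Python dict standing in for a session. This is only an analogy that needs no data file: the dict items and its contents are made up for illustration and are not LArray objects.

```python
# Plain-dict analogy (not LArray code): item names and values are made up.
items = {}
items['pop'] = 'population array'
items['births'] = 'births array'
items['deaths'] = 'deaths array'

internal_order = list(items.keys())   # order in which the items were added
alphabetical_order = sorted(items)    # alphabetical listing of the same names

print(internal_order)
print(alphabetical_order)
```

The two lists contain the same names, so code that relies on position should not assume the alphabetical listing reflects the order in which items were stored.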

To get more information about the items of a session, use the summary method. It provides not only the names of the items but also the list of labels for axes and groups, and the list of axes, the shape and the dtype for arrays:

[3]:
# print the content of the session
print(s_pop.summary())

Selecting And Filtering Items

Session objects work like ordinary Python dict objects. To select an item, use the usual syntax <session_var>['<item_name>']:

[4]:
s_pop['pop']

A simpler way is to use the syntax <session_var>.<item_name>:

[5]:
s_pop.pop

Warning: The syntax session_var.item_name will work as long as you don’t use any special character like , ; : in the item’s name.
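
The reason behind this warning can be sketched with a small stand-in class. MiniSession below is hypothetical and is not the LArray Session class; it only mimics the two access syntaxes to show that a name containing a special character remains a valid key but cannot be written after a dot.

```python
# Hypothetical stand-in for a session (not the LArray class): it only mimics
# the bracket syntax and the dotted syntax described above.
class MiniSession:
    def __init__(self, **items):
        self._items = dict(items)

    def __getitem__(self, name):
        return self._items[name]

    def __setitem__(self, name, value):
        self._items[name] = value

    def __getattr__(self, name):
        # called only when the normal attribute lookup fails
        try:
            return self.__dict__['_items'][name]
        except KeyError:
            raise AttributeError(name)


s = MiniSession(pop=100)
s['pop:2020'] = 95                 # a name containing ':' is a valid key

same_item = s['pop'] == s.pop      # both syntaxes reach the same item
via_brackets = s['pop:2020']       # the dotted syntax has no way to write this name

print(same_item, via_brackets)
```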

To return a new session with selected items, use the syntax <session_var>[list, of, item, names]:

[6]:
s_pop_new = s_pop['pop', 'births', 'deaths']

s_pop_new.names

The filter method allows you to select all items of the same kind (i.e. all axes, all groups or all arrays) or all items whose names match a given pattern:

[7]:
# select only arrays of a session
s_pop.filter(kind=Array)
[8]:
# select all items with a name starting with a letter between a and k
s_pop.filter(pattern='[a-k]*')
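
To see which names a pattern such as '[a-k]*' keeps, here is a minimal sketch using Python's standard fnmatch module. It assumes the pattern follows shell-style wildcard rules, which the text above does not state explicitly, and the list of names is made up for illustration.

```python
from fnmatch import fnmatchcase

# Made-up item names; the real content of the example session is not assumed.
names = ['births', 'country', 'deaths', 'gender', 'pop', 'time']

# '[a-k]' matches one character between a and k, '*' matches whatever follows
selected = [name for name in names if fnmatchcase(name, '[a-k]*')]

print(selected)
```

Under that assumption, 'pop' and 'time' are dropped because their first letter falls outside the a to k range.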

Iterating over Items

Like the built-in Python dict objects, Session objects provide methods to iterate over items:

[9]:
# iterate over item names
for key in s_pop.keys():
    print(key)
[10]:
# iterate over items
for value in s_pop.values():
    if isinstance(value, Array):
        print(value.info)
    else:
        print(repr(value))
    print()
[11]:
# iterate over names and items
for key, value in s_pop.items():
    if isinstance(value, Array):
        print(key, ':')
        print(value.info)
    else:
        print(key, ':', repr(value))
    print()

Arithmetic Operations On Sessions

Session objects accept binary operations with a scalar:

[12]:
# get population, births and deaths in millions
s_pop_div = s_pop / 1e6

s_pop_div.pop

with an array (please read the documentation of the random.choice function first if you don’t know it):

[13]:
from larray import random
random_increment = random.choice([-1, 0, 1], p=[0.3, 0.4, 0.3], axes=s_pop.pop.axes) * 1000
random_increment
[14]:
# add a common array to some variables of a session
s_pop_rand = s_pop['pop', 'births', 'deaths'] + random_increment

s_pop_rand.pop

with another session:

[15]:
# compute the difference between each array of the two sessions
s_diff = s_pop - s_pop_rand

s_diff.births
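
The idea common to these three cases is that the operation is carried out item by item. The sketch below illustrates it with plain dicts of numbers standing in for sessions of arrays. The helper combine is hypothetical, is not part of LArray, and assumes both containers hold the same item names.

```python
from operator import sub, truediv


def combine(session, other, op):
    """Apply op item by item; other is a scalar or a dict with the same names."""
    if isinstance(other, dict):
        return {name: op(value, other[name]) for name, value in session.items()}
    return {name: op(value, other) for name, value in session.items()}


# Plain numbers stand in for arrays (illustration only, made-up values).
s = {'pop': 2_000_000, 'births': 50_000, 'deaths': 40_000}

in_millions = combine(s, 1e6, truediv)                   # container with a scalar
shifted = {name: value + 1000 for name, value in s.items()}
diff = combine(s, shifted, sub)                          # container with a container

print(in_millions['pop'])
print(diff)
```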

Applying Functions On All Arrays

In addition to the classical arithmetic operations, the apply method can be used to apply the same function to all arrays. This function should take a single item as its argument and return a single value:

[16]:
# add the next year to all arrays
def add_next_year(array):
    if 'time' in array.axes.names:
        last_year = array.time.i[-1]
        return array.append('time', 0, last_year + 1)
    else:
        return array

s_pop_with_next_year = s_pop.apply(add_next_year)

print('pop array before calling apply:')
print(s_pop.pop)
print()
print('pop array after calling apply:')
print(s_pop_with_next_year.pop)

It is possible to pass a function with additional arguments:

[17]:
# add the next year to all arrays.
# Use the 'copy_values_from_last_year' flag to indicate
# whether or not to copy values from the last year
def add_next_year(array, copy_values_from_last_year):
    if 'time' in array.axes.names:
        last_year = array.time.i[-1]
        value = array[last_year] if copy_values_from_last_year else 0
        return array.append('time', value, last_year + 1)
    else:
        return array

s_pop_with_next_year = s_pop.apply(add_next_year, True)

print('pop array before calling apply:')
print(s_pop.pop)
print()
print('pop array after calling apply:')
print(s_pop_with_next_year.pop)
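
How the additional arguments reach the function can be sketched in plain Python. The helper apply_to_all is hypothetical and only illustrates argument forwarding; it is not the LArray apply method, and plain lists stand in for arrays with a time axis.

```python
def apply_to_all(session, func, *args, **kwargs):
    """Return a new dict holding func(value, *args, **kwargs) for every item."""
    return {name: func(value, *args, **kwargs) for name, value in session.items()}


def add_next_value(series, copy_last):
    # series is a plain list standing in for an array with a time axis
    return series + [series[-1] if copy_last else 0]


s = {'pop': [10, 11, 12], 'births': [1, 2, 3]}

# extra arguments can be given by position or by keyword
with_copy = apply_to_all(s, add_next_value, True)
with_zero = apply_to_all(s, add_next_value, copy_last=False)

print(with_copy['pop'])
print(with_zero['births'])
```

In this sketch the original container is left untouched because each call builds new lists.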

It is also possible to apply a function to the non-Array objects of a session. Please refer to the documentation of the apply method.

Comparing Sessions

Being able to compare two sessions is useful when you want to compare two different models expected to give the same results, or when you have updated your model and want to see the consequences of the recent changes.

Session objects provide two methods to compare two sessions: equals and element_equals:

  • The equals method will return True if all items from both sessions are identical, False otherwise.

  • The element_equals method will compare items of two sessions one by one and return an array of boolean values.
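
The relationship between the two methods can be sketched with plain dicts. The functions below are hypothetical stand-ins, not the LArray methods: they return a dict of booleans instead of an array, and treating an item that is missing from one side as unequal is an assumption of this sketch.

```python
def element_equals(s1, s2):
    """One boolean per item name found in either container."""
    names = list(s1) + [name for name in s2 if name not in s1]
    return {name: name in s1 and name in s2 and s1[name] == s2[name]
            for name in names}


def equals(s1, s2):
    """True only if every item-by-item comparison is True."""
    return all(element_equals(s1, s2).values())


# Made-up data: plain lists stand in for arrays.
original = {'pop': [1, 2, 3], 'births': [4, 5]}
same = {'pop': [1, 2, 3], 'births': [4, 5]}
modified = {'pop': [1, 2, 99], 'births': [4, 5]}
extended = dict(original, gender_ratio=[0.5])

print(element_equals(original, modified))
print(element_equals(original, extended))
print(equals(original, same), equals(original, modified))
```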

[18]:
# load a session representing the results of a demographic model
filepath_hdf = get_example_filepath('demography_eurostat.h5')
s_pop = Session(filepath_hdf)

# create a copy of the original session
s_pop_copy = s_pop.copy()
[19]:
# 'element_equals' compares arrays one by one
s_pop.element_equals(s_pop_copy)
[20]:
# 'equals' returns True if the two sessions have exactly the same items
s_pop.equals(s_pop_copy)
[21]:
# slightly modify the 'pop' array for some label combinations
s_pop_copy.pop += random_increment
[22]:
# the 'pop' array is different between the two sessions
s_pop.element_equals(s_pop_copy)
[23]:
# 'equals' returns False if at least one item of the two sessions is different in values or axes
s_pop.equals(s_pop_copy)
[24]:
# reset the 'copy' session as a copy of the original session
s_pop_copy = s_pop.copy()

# add an array to the 'copy' session
s_pop_copy.gender_ratio = s_pop_copy.pop.ratio('gender')
[25]:
# the 'gender_ratio' array is not present in the original session
s_pop.element_equals(s_pop_copy)
[26]:
# 'equals' returns False if at least one item is not present in both sessions
s_pop.equals(s_pop_copy)

The == operator returns a new session of boolean arrays, with items compared element-wise:

[27]:
# reset the 'copy' session as a copy of the original session
s_pop_copy = s_pop.copy()

# slightly modify the 'pop' array for some label combinations
s_pop_copy.pop += random_increment
[28]:
s_check_same_values = s_pop == s_pop_copy

s_check_same_values.pop

This also works for axes and groups:

[29]:
s_check_same_values.time

The != operator does the opposite of the == operator:

[30]:
s_check_different_values = s_pop != s_pop_copy

s_check_different_values.pop

A more visual way is to use the compare function, which opens the Editor.

compare(s_pop, s_pop_alternative, names=['baseline', 'lower_birth_rate'])

[Image: compare two sessions]

Session API

Please go to the Session section of the API Reference to get the list of all methods of Session objects.