Working With Sessions

Import the LArray library:

[2]:
from larray import *

Check the version of LArray:

[3]:
from larray import __version__
__version__
[3]:
'0.30'

Before Continuing

If you are not yet comfortable with creating, saving and loading sessions, please read the Creating Sessions and Loading and Dumping Sessions sections of the tutorial before going further.

Exploring Content

To get the list of item names of a session, use the names shortcut (note that the list is sorted alphabetically and does not follow the internal order!):

[4]:
# load a session representing the results of a demographic model
filepath_hdf = get_example_filepath('population_session.h5')
s_pop = Session(filepath_hdf)

# print the content of the session
print(s_pop.names)
['births', 'country', 'deaths', 'even_years', 'gender', 'odd_years', 'pop', 'time']
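
The difference between the two orderings can be pictured with plain Python. This is a minimal sketch that does not use LArray: the `internal_order` list is copied by hand from the order in which the items of this example session are listed in the summary output further down, so it is an assumption about this example file only.

```python
# Item names in the session's internal order (copied by hand from the
# summary output of the example session, not read from the file).
internal_order = ['country', 'gender', 'time', 'even_years', 'odd_years',
                  'births', 'deaths', 'pop']

# Alphabetical order, which is how the names shortcut lists the items.
alphabetical = sorted(internal_order)

print(alphabetical)
# ['births', 'country', 'deaths', 'even_years', 'gender', 'odd_years', 'pop', 'time']
```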

To get more information about the items of a session, use the summary method. It provides not only the names of the items but also the list of labels for axes and groups, and the list of axes, the shape and the dtype for arrays:

[5]:
# print a summary of the content of the session
print(s_pop.summary())
country: country ['Belgium' 'France' 'Germany'] (3)
gender: gender ['Male' 'Female'] (2)
time: time [2013 2014 2015] (3)
even_years: time['2014'] >> even_years (1)
odd_years: time[2013 2015] >> odd_years (2)
births: country, gender, time (3 x 2 x 3) [int32]
deaths: country, gender, time (3 x 2 x 3) [int32]
pop: country, gender, time (3 x 2 x 3) [int32]

Selecting And Filtering Items

To select an item, simply use the syntax <session_var>.<item_name>:

[6]:
s_pop.pop
[6]:
country  gender\time      2013      2014      2015
Belgium         Male   5472856   5493792   5524068
Belgium       Female   5665118   5687048   5713206
 France         Male  31772665  31936596  32175328
 France       Female  33827685  34005671  34280951
Germany         Male  39380976  39556923  39835457
Germany       Female  41142770  41210540  41362080

To return a new session with selected items, use the syntax <session_var>[list, of, item, names]:

[7]:
s_pop_new = s_pop['pop', 'births', 'deaths']

s_pop_new.names
[7]:
['births', 'deaths', 'pop']

The filter method allows you to select all items of the same kind (i.e. all axes, all groups or all arrays) or all items whose names match a given pattern:

[8]:
# select only arrays of a session
s_pop.filter(kind=LArray)
[8]:
Session(births, deaths, pop)
[9]:
# select all items with a name starting with a letter between a and k
s_pop.filter(pattern='[a-k]*')
[9]:
Session(country, gender, even_years, births, deaths)
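
The pattern syntax resembles Unix shell wildcards. As a rough sketch, and assuming that the pattern follows the same rules as Python's standard `fnmatch` module (an assumption, not something stated in this tutorial), the selection above can be reproduced on a plain list of names without LArray:

```python
from fnmatch import fnmatchcase

# Item names of the example session, in their internal order
# (copied by hand from the summary output shown earlier).
names = ['country', 'gender', 'time', 'even_years', 'odd_years',
         'births', 'deaths', 'pop']

# '[a-k]' matches one character between a and k, '*' matches anything after it.
pattern = '[a-k]*'
selected = [name for name in names if fnmatchcase(name, pattern)]

print(selected)
# ['country', 'gender', 'even_years', 'births', 'deaths']
```

The names time, odd_years and pop are left out because their first letter falls after k.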

Arithmetic Operations On Sessions

Session objects accept binary operations with a scalar:

[10]:
# get population, births and deaths in millions
s_pop_div = s_pop / 1e6

s_pop_div.pop
[10]:
country  gender\time       2013       2014       2015
Belgium         Male   5.472856   5.493792   5.524068
Belgium       Female   5.665118   5.687048   5.713206
 France         Male  31.772665  31.936596  32.175328
 France       Female  33.827685  34.005671  34.280951
Germany         Male  39.380976  39.556923  39.835457
Germany       Female   41.14277   41.21054   41.36208

with an array (please read the documentation of the random.choice function first if you are not familiar with it):

[11]:
from larray import random
random_multiplicator = random.choice([0.98, 1.0, 1.02], p=[0.15, 0.7, 0.15], axes=s_pop.pop.axes)
random_multiplicator
[11]:
country  gender\time  2013  2014  2015
Belgium         Male   1.0   1.0  0.98
Belgium       Female   1.0   1.0  0.98
 France         Male   1.0   1.0   1.0
 France       Female   1.0   1.0   1.0
Germany         Male   1.0   1.0   1.0
Germany       Female  0.98   1.0   1.0
[12]:
# multiply all variables of a session by a common array
s_pop_rand = s_pop * random_multiplicator

s_pop_rand.pop
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-12-5f82b3cbbdf9> in <module>
      1 # multiply all variables of a session by a common array
----> 2 s_pop_rand = s_pop * random_multiplicator
      3
      4 s_pop_rand.pop

~/checkouts/readthedocs.org/user_builds/larray/conda/0.30/lib/python3.6/site-packages/larray-0.30-py3.6.egg/larray/core/session.py in opmethod(self, other)
    941                 res = []
    942                 for name in all_keys:
--> 943                     self_item = self.get(name, nan)
    944                     other_operand = other.get(name, nan) if hasattr(other, 'get') else other
    945                     if arrays_only and not isinstance(self_item, LArray):

~/checkouts/readthedocs.org/user_builds/larray/conda/0.30/lib/python3.6/site-packages/larray-0.30-py3.6.egg/larray/core/session.py in get(self, key, default)
    299         """
    300         try:
--> 301             return self[key]
    302         except KeyError:
    303             return default

~/checkouts/readthedocs.org/user_builds/larray/conda/0.30/lib/python3.6/site-packages/larray-0.30-py3.6.egg/larray/core/session.py in __getitem__(self, key)
    255             return Session([(name, self[name]) for name in truenames])
    256         elif isinstance(key, (tuple, list)):
--> 257             assert all(isinstance(k, str) for k in key)
    258             return Session([(k, self[k]) for k in key])
    259         else:

AssertionError:

with another session:

[13]:
# compute the difference between each array of the two sessions
s_diff = s_pop - s_pop_rand

s_diff.births
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-13-db5241167ae2> in <module>
      1 # compute the difference between each array of the two sessions
----> 2 s_diff = s_pop - s_pop_rand
      3
      4 s_diff.births

NameError: name 's_pop_rand' is not defined

Applying Functions On All Arrays

In addition to the classical arithmetic operations, the apply method can be used to apply the same function to all arrays of a session. The function should take a single array as argument and return a single value:

[14]:
# force conversion to type int
def as_type_int(array):
    return array.astype(int)

s_pop_rand_int = s_pop_rand.apply(as_type_int)

print('pop array before calling apply:')
print(s_pop_rand.pop)
print()
print('pop array after calling apply:')
print(s_pop_rand_int.pop)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-14-5ba7352689a5> in <module>
      3     return array.astype(int)
      4
----> 5 s_pop_rand_int = s_pop_rand.apply(as_type_int)
      6
      7 print('pop array before calling apply:')

NameError: name 's_pop_rand' is not defined

It is possible to pass a function with additional arguments:

[15]:
# passing the LArray.astype method directly with argument
# dtype defined as int
s_pop_rand_int = s_pop_rand.apply(LArray.astype, dtype=int)

print('pop array before calling apply:')
print(s_pop_rand.pop)
print()
print('pop array after calling apply:')
print(s_pop_rand_int.pop)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-15-526833a6ec98> in <module>
      1 # passing the LArray.astype method directly with argument
      2 # dtype defined as int
----> 3 s_pop_rand_int = s_pop_rand.apply(LArray.astype, dtype=int)
      4
      5 print('pop array before calling apply:')

NameError: name 's_pop_rand' is not defined

It is also possible to apply a function to non-LArray objects of a session. Please refer to the documentation of the apply method.
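
The way extra arguments are forwarded to the applied function can be pictured with a toy model in plain Python. This is only a sketch: `apply_to_all`, `convert` and `toy_session` are hypothetical names, a dict of lists stands in for a session of arrays, and the values are made up. It is not LArray's implementation.

```python
def apply_to_all(items, func, *args, **kwargs):
    """Return a new dict holding func(value, *args, **kwargs) for each item."""
    return {name: func(value, *args, **kwargs) for name, value in items.items()}


def convert(values, dtype):
    """Convert every element of a list to the given type."""
    return [dtype(v) for v in values]


# Hypothetical stand-in for a session: names mapped to lists of made-up floats.
toy_session = {'a': [1.9, 2.1], 'b': [3.5]}

# The keyword argument dtype=int is passed through to convert for each item.
toy_int = apply_to_all(toy_session, convert, dtype=int)

print(toy_int)
# {'a': [1, 2], 'b': [3]}
```

As in the cells above, the original container is left untouched and a new one is returned.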

Comparing Sessions

Being able to compare two sessions is useful when you want to compare two different models that are expected to give the same results, or when you have updated your model and want to see the consequences of the recent changes.

Session objects provide two methods to compare two sessions: equals and element_equals.

The equals method will return True if all items from both sessions are identical, False otherwise:

[16]:
# load a session representing the results of a demographic model
filepath_hdf = get_example_filepath('population_session.h5')
s_pop = Session(filepath_hdf)

# create a copy of the original session
s_pop_copy = Session(filepath_hdf)

# 'equals' returns True if the two sessions have exactly the same items
s_pop.equals(s_pop_copy)
[16]:
True
[17]:
# create a copy of the original session but with the array
# 'births' slightly modified for some labels combination
s_pop_alternative = Session(filepath_hdf)
s_pop_alternative.births *= random_multiplicator

# 'equals' returns False if at least one item differs in values or axes between the two sessions
s_pop.equals(s_pop_alternative)
[17]:
False
[18]:
# add an array to the session
s_pop_new_output = Session(filepath_hdf)
s_pop_new_output.gender_ratio = s_pop_new_output.pop.ratio('gender')

# 'equals' returns False if at least one item is missing from one of the two sessions
s_pop.equals(s_pop_new_output)
[18]:
False

The element_equals method will compare items of two sessions one by one and return an array of boolean values:

[19]:
# 'element_equals' compares items one by one
s_pop.element_equals(s_pop_copy)
[19]:
name  country  gender  time  even_years  odd_years  births  deaths   pop
         True    True  True        True       True    True    True  True
[20]:
# array 'births' is different between the two sessions
s_pop.element_equals(s_pop_alternative)
[20]:
name  country  gender  time  even_years  odd_years  births  deaths   pop
         True    True  True        True       True   False    True  True
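
The relationship between the two methods can be pictured with a toy model in plain Python. This is a sketch under stated assumptions: dicts of lists stand in for sessions, the helper functions only borrow the method names for readability, and the values are made up. It is not LArray's implementation.

```python
def element_equals(first, second):
    """Compare two dicts item by item; a name missing on one side gives False."""
    missing = object()  # sentinel that never compares equal to a real value
    all_names = list(first) + [name for name in second if name not in first]
    return {name: first.get(name, missing) == second.get(name, missing)
            for name in all_names}


def equals(first, second):
    """True only if every item-by-item comparison is True."""
    return all(element_equals(first, second).values())


# Hypothetical stand-ins for sessions, with made-up values.
baseline = {'births': [10, 20], 'deaths': [5, 7]}
variant = {'births': [10, 21], 'deaths': [5, 7]}   # one value changed
extended = dict(baseline, ratio=[0.5, 0.5])        # one extra item

print(element_equals(baseline, variant))   # {'births': False, 'deaths': True}
print(equals(baseline, dict(baseline)))    # True
print(equals(baseline, variant))           # False
print(equals(baseline, extended))          # False
```

In this model a single differing or missing item is enough to make equals return False, while element_equals shows which item is responsible.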

The == operator returns a new session of boolean arrays, in which the items of the two sessions are compared element-wise:

[21]:
s_same_values = s_pop == s_pop_alternative

s_same_values.births
[21]:
country  gender\time   2013  2014   2015
Belgium         Male   True  True  False
Belgium       Female   True  True  False
 France         Male   True  True   True
 France       Female   True  True   True
Germany         Male   True  True   True
Germany       Female  False  True   True

This also works for axes and groups:

[22]:
s_same_values.country
[22]:
country  Belgium  France  Germany
            True    True     True

The != operator does the opposite of the == operator:

[23]:
s_different_values = s_pop != s_pop_alternative

s_different_values.births
[23]:
country  gender\time   2013   2014   2015
Belgium         Male  False  False   True
Belgium       Female  False  False   True
 France         Male  False  False  False
 France       Female  False  False  False
Germany         Male  False  False  False
Germany       Female   True  False  False

A more visual way to compare sessions is to use the compare function, which opens the Editor:

compare(s_pop, s_pop_alternative, names=['baseline', 'lower_birth_rate'])

[Image: compare two sessions]

Session API

Please go to the Session section of the API Reference to get the list of all methods of Session objects.