Indexing, Selecting and Assigning
Import the LArray library:
[1]:
from larray import *
Import the test array pop:
[2]:
# load the 'pop' array from the 'demography_eurostat' example dataset
pop = load_example_data('demography_eurostat').pop
pop
Selecting (Subsets)
The Array class lets you select a subset either by labels or by indices (positions).
Selecting by Labels
To take a subset of an array using labels, use brackets [ ].
Let’s start by selecting a single element:
[3]:
pop['Belgium', 'Female', 2017]
As long as there is no ambiguity (i.e. no two axes share a label), the order of the labels in a selection does not matter. You therefore usually do not need to remember axis positions during computation. Axis order only matters for output.
[4]:
# order of index doesn't matter
pop['Female', 2017, 'Belgium']
Selecting a subset is done by using slices or lists of labels:
[5]:
pop[['Belgium', 'Germany'], 2014:2016]
Slice bounds are optional: if omitted, start defaults to the first label and stop to the last one.
[6]:
# select all years starting from 2015
pop[2015:]
[7]:
# select all years up to and including 2015
pop[:2015]
Slices can also have a step (defaults to 1), to take every Nth label:
[8]:
# select all even years starting from 2014
pop[2014::2]
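To make the slice rules above concrete, here is a plain-Python sketch of how a label slice could be translated into positions. It is not LArray's implementation: the helper `label_slice` and the `years` list are hypothetical names, and the year labels are assumed to match the time axis of the example array.

```python
def label_slice(labels, start=None, stop=None, step=1):
    """Return the labels selected by a label-based slice start:stop:step.

    Hypothetical helper: a missing start means the first label, a missing
    stop means the last label, and the stop label itself is included.
    """
    first = 0 if start is None else labels.index(start)
    last = len(labels) - 1 if stop is None else labels.index(stop)
    return labels[first:last + 1:step]


years = [2013, 2014, 2015, 2016, 2017]  # assumed labels of a 'time' axis

from_2015 = label_slice(years, start=2015)           # like pop[2015:]
until_2015 = label_slice(years, stop=2015)           # like pop[:2015]
even_years = label_slice(years, start=2014, step=2)  # like pop[2014::2]
```

The only difference from an ordinary Python slice is the `+ 1` on the stop position, which is what makes the end label part of the result.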
Warning: Selecting by labels as in the examples above works well as long as there is no ambiguity. When two or more axes share labels, such a selection raises an error. The solution is to specify which axis each label belongs to.
[9]:
immigration = load_example_data('demography_eurostat').immigration
# the 'immigration' array has two axes (country and citizenship) which share the same labels
immigration
[10]:
# LArray doesn't use the position of the labels inside the brackets
# to determine the corresponding axes. Instead, it tries to guess the
# axis of each label, whatever its position.
# If a label is shared by two or more axes, LArray cannot choose
# between the possible axes and raises an error.
try:
    immigration['Belgium', 'Netherlands']
except Exception as e:
    print(type(e).__name__, ':', e)
[11]:
# the solution is simple: specify the axes on which you make a selection
immigration[immigration.country['Belgium'], immigration.citizenship['Netherlands']]
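The behaviour described above (order-independent labels, an error on shared labels, and axis-qualified keys as the fix) can be illustrated with a small plain-Python sketch. This is a toy model, not LArray's code: the function `resolve`, the dict-of-lists representation of axes, and the axis labels used below are all assumptions made for the example.

```python
def resolve(axes, key):
    """Map each item of `key` to the single axis that contains it.

    `axes` is a dict {axis_name: list_of_labels}. An item of `key` is either
    a bare label or an (axis_name, label) pair naming its axis explicitly.
    """
    selection = {}
    for item in key:
        if isinstance(item, tuple):
            # explicit form: the caller says which axis the label belongs to
            axis_name, label = item
            if label not in axes[axis_name]:
                raise ValueError(f"{label!r} is not a label of axis {axis_name!r}")
        else:
            # bare label: look for the axes containing it
            label = item
            candidates = [name for name, labels in axes.items() if label in labels]
            if len(candidates) != 1:
                raise ValueError(f"label {label!r} matches {len(candidates)} axes: {candidates}")
            axis_name = candidates[0]
        selection[axis_name] = label
    return selection


# assumed axes, for illustration only
pop_axes = {
    'country': ['Belgium', 'France', 'Germany'],
    'gender': ['Male', 'Female'],
    'time': [2013, 2014, 2015, 2016, 2017],
}
first = resolve(pop_axes, ('Belgium', 'Female', 2017))
second = resolve(pop_axes, ('Female', 2017, 'Belgium'))  # same selection

immigration_axes = {
    'country': ['Belgium', 'Luxembourg', 'Netherlands'],
    'citizenship': ['Belgium', 'Luxembourg', 'Netherlands'],
}
try:
    resolve(immigration_axes, ('Belgium', 'Netherlands'))
    ambiguity_error = None
except ValueError as exc:
    ambiguity_error = str(exc)

explicit = resolve(immigration_axes,
                   (('country', 'Belgium'), ('citizenship', 'Netherlands')))
```

Because each bare label is looked up in every axis, a label present in two axes cannot be attributed to either one, which is why the explicit form is needed for the immigration array.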
Ambiguous Cases - Specifying Axes Using The Special Variable X
When selecting, assigning or using aggregate functions, an axis can be referred to via the special variable X:
pop[X.time[2015:]]
pop.sum(X.time)
This gives you access to the axes of the array you are manipulating. The main drawback of using X is that you lose the autocompletion available in many editors. It also only works with non-anonymous axes whose names do not contain whitespace or special characters.
[12]:
# the previous example can also be written as
immigration[X.country['Belgium'], X.citizenship['Netherlands']]
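One way to picture why X has these limitations is the toy stand-in below. It is only a sketch under assumptions: `AxisRef`, `AxisRefFactory` and `X_demo` are hypothetical names, and LArray's real X may be built differently. The point is that an object which creates axis references on attribute access can only be used with names that are valid Python identifiers, and that editors cannot list attributes that do not exist until they are requested.

```python
class AxisRef:
    """Toy reference to an axis that is known only by its name."""

    def __init__(self, name):
        self.name = name

    def __getitem__(self, key):
        # remember which axis the key belongs to; the array resolves it later
        return (self.name, key)


class AxisRefFactory:
    """Toy stand-in for X: any attribute access creates an AxisRef."""

    def __getattr__(self, name):
        return AxisRef(name)


X_demo = AxisRefFactory()

country_key = X_demo.country['Belgium']
time_key = X_demo.time[2015:]

# attribute syntax requires a valid identifier, so a name with whitespace
# could never be written as X_demo.<name>
usable = 'time'.isidentifier()
not_usable = 'birth year'.isidentifier()
```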
Selecting by Indices
Sometimes it is more practical to use indices (positions) along the axis instead of labels. To do so, add .i before the brackets: .i[indices]. As for selection with labels, you can use a single index, a slice or a list of indices. Indices can also be negative (-1 represents the last element of an axis).
Note: Remember that indices (positions) are always 0-based in Python. So the first element is at index 0, the second is at index 1, etc.
[13]:
# select the last year
pop[X.time.i[-1]]
[14]:
# same but for the last 3 years
pop[X.time.i[-3:]]
[15]:
# using a list of indices
pop[X.time.i[0, 2, 4]]
Warning: The end index (position) is EXCLUSIVE while the end label is INCLUSIVE.
[16]:
year = 2015
# with labels
pop[X.time[:year]]
[17]:
# with indices (i.e. using the .i[indices] syntax)
index_year = pop.time.index(year)
pop[X.time.i[:index_year]]
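The difference between the two selections above can be checked on a plain Python list standing in for the time axis. The `years` labels are an assumption about the example data, and the list only mimics what each kind of slice covers.

```python
years = [2013, 2014, 2015, 2016, 2017]  # assumed labels of the 'time' axis
year = 2015
index_year = years.index(year)  # position of the label 2015

covered_by_label_slice = years[:index_year + 1]  # label slice [:2015] includes 2015
covered_by_index_slice = years[:index_year]      # positional slice stops before it
```

So replacing a label bound by its position silently drops one element unless the stop position is increased by one.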
You can use .i[] selection directly on an array instead of on an axis. In this context, if you want to select a subset of the first and third axes for example, you must use a full slice : for the second one.
[18]:
# select first country and last three years
pop.i[0, :, -3:]
Using Groups In Selections
[19]:
even_years = pop.time[2014::2]
pop[even_years]
Boolean Filtering
Boolean filtering can be used to extract subsets. Filtering can be done on axes:
[20]:
# select even years
pop[X.time % 2 == 0]
or data:
[21]:
# select population for the year 2017
pop_2017 = pop[2017]
# select all data with a value greater than 30 million
pop_2017[pop_2017 > 30e6]
Note: Be aware that after boolean filtering, several axes may have merged.
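The following plain-Python sketch illustrates both kinds of filter and why filtering on data can merge axes. It is a toy model, not LArray code: the values are arbitrary numbers (not real population figures), and joining the labels with an underscore is only an assumption made for the illustration.

```python
# filtering on an axis: the condition is evaluated on the labels
years = [2013, 2014, 2015, 2016, 2017]
even_years = [year for year in years if year % 2 == 0]

# filtering on data: the condition is evaluated on each cell
values = {
    ('Belgium', 'Male'): 10, ('Belgium', 'Female'): 20,
    ('France', 'Male'): 30, ('France', 'Female'): 40,
    ('Germany', 'Male'): 50, ('Germany', 'Female'): 60,
}
# the surviving cells no longer form a full country x gender grid, so they
# are listed along a single combined axis instead
large = {
    f"{country}_{gender}": value
    for (country, gender), value in values.items()
    if value > 30
}
```

An axis filter keeps whole slices and therefore preserves the axes, whereas a data filter keeps individual cells, which in general cannot be arranged as a rectangular array.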
Arrays can also be used to create boolean filters:
[22]:
start_year = Array([2015, 2016, 2017], axes=pop.country)
start_year
[23]:
pop[X.time >= start_year]
Assigning subsets
Assigning A Value
Assigning a value to a subset is simple:
[24]:
pop[2017] = 0
pop
Now, let’s store a subset in a new variable and modify it:
[25]:
# store the data associated with the year 2016 in a new variable
pop_2016 = pop[2016]
pop_2016
[26]:
# now, we modify the new variable
pop_2016['Belgium'] = 0
# and we can see that the original array has also been modified
pop
One very important gotcha though…
Warning: Storing a subset of an array in a new variable and modifying it afterwards may also impact the original array. The reason is that selecting a contiguous subset of the data does not return a copy of the selected subset, but rather a view on a subset of the array. To avoid such behavior, use the .copy() method.
Remember:
taking a contiguous subset of an array is extremely fast (no data is copied)
if one modifies that subset, one also modifies the original array
.copy() returns a copy of the subset (which costs time and memory) but allows you to change the subset without modifying the original array at the same time
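The view-versus-copy distinction is not specific to LArray. As a rough analogy using only the standard library (an assumption chosen for illustration, since LArray itself relies on NumPy for its data), a sliced `memoryview` shares memory with its source while a sliced `array` is an independent copy:

```python
from array import array

data = array('d', [1.0, 2.0, 3.0, 4.0])

# a sliced memoryview is a view: no data is copied
view = memoryview(data)[1:3]
view[0] = 0.0                       # writes through to 'data'
data_after_view_write = data.tolist()

# slicing the array itself returns a copy
subset_copy = data[1:3]
subset_copy[1] = 99.0               # 'data' is left untouched
data_after_copy_write = data.tolist()
```

The write through the view changes the second element of `data`, whereas the write to the copy leaves `data` as it was.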
The same warning applies to entire arrays:
[27]:
# reload the 'pop' array
pop = load_example_data('demography_eurostat').pop
# create a second 'pop2' variable
pop2 = pop
pop2
[28]:
# set all data corresponding to the year 2017 to 0
pop2[2017] = 0
pop2
[29]:
# and now take a look at what happened to the original array 'pop'
# after modifying the 'pop2' array
pop
Warning: The syntax new_array = old_array does not create a new array but rather an ‘alias’ variable. To actually create a new array as a copy of a previous one, the .copy() method must be called.
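This is ordinary Python assignment semantics rather than something particular to arrays. A minimal sketch with a plain list (used here only as a stand-in for an array) shows the difference between an alias and a copy:

```python
original = [1, 2, 3]
alias = original               # a second name for the same object
alias[0] = 0
alias_is_same_object = alias is original
original_after_alias_write = list(original)

original = [1, 2, 3]
duplicate = original.copy()    # a new, independent object
duplicate[0] = 0
copy_is_same_object = duplicate is original
original_after_copy_write = list(original)
```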
[30]:
# reload the 'pop' array
pop = load_example_data('demography_eurostat').pop
# copy the 'pop' array and store the copy in a new variable
pop2 = pop.copy()
# modify the copy
pop2[2017] = 0
pop2
[31]:
# the data from the original array have not been modified
pop
Assigning Arrays And Broadcasting
Instead of a value, we can also assign an array to a subset. In that case, the array can have fewer axes than the target, but those which are present must be compatible with the subset being targeted.
[32]:
# select population for the year 2015
pop_2015 = pop[2015]
# propagate population for the year 2015 to all following years
pop[2016:] = pop_2015
pop
Warning: The array being assigned must have axes compatible with the target subset (i.e. the same axis names and the same labels).
[33]:
# replace 'Male' and 'Female' labels by 'M' and 'F'
pop_2015 = pop_2015.set_labels('gender', 'M,F')
pop_2015
[34]:
# now let's try to repeat the assignment operation above with the new labels.
# An error is raised because of incompatible axes
try:
    pop[2016:] = pop_2015
except Exception as e:
    print(type(e).__name__, ':', e)
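The two situations above (a successful broadcast, then a failure after relabelling) can be mimicked with dictionaries keyed by labels. This is a simplified sketch, not LArray's broadcasting logic: `assign_broadcast` is a hypothetical helper, the cell values are arbitrary counters, and only two countries are used to keep it short.

```python
from itertools import product


def assign_broadcast(target, years, source):
    """Copy `source`, keyed by (country, gender), into each year of `years`.

    `target` is keyed by (country, gender, year). The labels of `source` must
    match those of the target exactly, otherwise a ValueError is raised.
    """
    expected = {(country, gender) for (country, gender, _) in target}
    if set(source) != expected:
        mismatch = sorted(set(source) ^ expected)
        raise ValueError(f"incompatible labels: {mismatch}")
    for (country, gender), value in source.items():
        for year in years:
            target[country, gender, year] = value


countries = ['Belgium', 'France']
genders = ['Male', 'Female']
years = [2015, 2016, 2017]

# arbitrary cell values, one per (country, gender, year) combination
pop_demo = {key: i for i, key in enumerate(product(countries, genders, years))}

# the source lacks the time axis, so it is repeated along it
pop_2015_demo = {(c, g): pop_demo[c, g, 2015] for c in countries for g in genders}
assign_broadcast(pop_demo, [2016, 2017], pop_2015_demo)

# after relabelling 'Male'/'Female' to 'M'/'F' the labels no longer match
relabelled = {(c, g[0]): v for (c, g), v in pop_2015_demo.items()}
try:
    assign_broadcast(pop_demo, [2016, 2017], relabelled)
    broadcast_error = None
except ValueError as exc:
    broadcast_error = str(exc)
```

Missing axes can be filled in by repetition, but an axis that is present with different labels gives no way to decide which source cell goes where, hence the error.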