Transforming Arrays (Relabeling, Renaming, Reordering, Combining, Extending, Sorting, …)

Import the LArray library:

[2]:
from larray import *

Check the version of LArray:

[3]:
from larray import __version__
__version__
[3]:
'0.31'

Manipulating axes

[4]:
# start from a small subset of the example population data (2016, 'BruCap', ages 90 to 95)
pop = load_example_data('demography').pop[2016, 'BruCap', 90:95]
pop
[4]:
age  sex\nat    BE   FO
 90        M   539   74
 90        F  1477  136
 91        M   499   49
 91        F  1298  105
 92        M   332   35
 92        F  1141   78
 93        M   287   27
 93        F   906   74
 94        M   237   23
 94        F   739   65
 95        M   154   19
 95        F   566   53
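
The output above shows that the slice 90:95 selects by label rather than by position and includes both bounds, unlike a plain Python slice. The following is a minimal pure-Python sketch of that behaviour; the helper `label_slice` and the full age range are hypothetical, used for illustration only, and are not part of LArray.

```python
def label_slice(labels, start, stop):
    """Return the labels from start to stop, both bounds included."""
    first = labels.index(start)
    last = labels.index(stop)
    return labels[first:last + 1]

# hypothetical age axis; the real range in the dataset is an assumption
ages = list(range(0, 121))

selected = label_slice(ages, 90, 95)
print(selected)  # [90, 91, 92, 93, 94, 95]
```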

Relabeling

Replace all labels of one axis

[5]:
# returns a copy by default
pop_new_labels = pop.set_labels('sex', ['Men', 'Women'])
pop_new_labels
[5]:
age  sex\nat    BE   FO
 90      Men   539   74
 90    Women  1477  136
 91      Men   499   49
 91    Women  1298  105
 92      Men   332   35
 92    Women  1141   78
 93      Men   287   27
 93    Women   906   74
 94      Men   237   23
 94    Women   739   65
 95      Men   154   19
 95    Women   566   53
[6]:
# the inplace flag avoids creating a copy
pop.set_labels('sex', ['M', 'F'], inplace=True)
[6]:
age  sex\nat    BE   FO
 90        M   539   74
 90        F  1477  136
 91        M   499   49
 91        F  1298  105
 92        M   332   35
 92        F  1141   78
 93        M   287   27
 93        F   906   74
 94        M   237   23
 94        F   739   65
 95        M   154   19
 95        F   566   53

Renaming axes

Rename one axis

[7]:
pop.info
[7]:
6 x 2 x 2
 age [6]: 90 91 92 93 94 95
 sex [2]: 'M' 'F'
 nat [2]: 'BE' 'FO'
dtype: int64
memory used: 192 bytes
[8]:
# 'rename' returns a copy of the array
pop2 = pop.rename('sex', 'gender')
pop2
[8]:
age  gender\nat    BE   FO
 90           M   539   74
 90           F  1477  136
 91           M   499   49
 91           F  1298  105
 92           M   332   35
 92           F  1141   78
 93           M   287   27
 93           F   906   74
 94           M   237   23
 94           F   739   65
 95           M   154   19
 95           F   566   53

Rename several axes at once

[9]:
# no 'x.' prefix here because sex and nat are keyword arguments, not axis references
pop2 = pop.rename(sex='gender', nat='nationality')
pop2
[9]:
age  gender\nationality    BE   FO
 90                   M   539   74
 90                   F  1477  136
 91                   M   499   49
 91                   F  1298  105
 92                   M   332   35
 92                   F  1141   78
 93                   M   287   27
 93                   F   906   74
 94                   M   237   23
 94                   F   739   65
 95                   M   154   19
 95                   F   566   53

Reordering axes

Axes can be reordered using the transpose method. By default, transpose reverses the order of the axes; otherwise, it permutes the axes according to the list given as argument. Axes not mentioned come after those which are mentioned (and keep their relative order). Finally, transpose returns a copy of the array.

[10]:
# starting order : age, sex, nat
pop
[10]:
age  sex\nat    BE   FO
 90        M   539   74
 90        F  1477  136
 91        M   499   49
 91        F  1298  105
 92        M   332   35
 92        F  1141   78
 93        M   287   27
 93        F   906   74
 94        M   237   23
 94        F   739   65
 95        M   154   19
 95        F   566   53
[11]:
# no argument --> reverse axes
pop.transpose()

# .T is a shortcut for .transpose()
pop.T
[11]:
nat  sex\age    90    91    92   93   94   95
 BE        M   539   499   332  287  237  154
 BE        F  1477  1298  1141  906  739  566
 FO        M    74    49    35   27   23   19
 FO        F   136   105    78   74   65   53
[12]:
# reorder according to list
pop.transpose('age', 'nat', 'sex')
[12]:
age  nat\sex    M     F
 90       BE  539  1477
 90       FO   74   136
 91       BE  499  1298
 91       FO   49   105
 92       BE  332  1141
 92       FO   35    78
 93       BE  287   906
 93       FO   27    74
 94       BE  237   739
 94       FO   23    65
 95       BE  154   566
 95       FO   19    53
[13]:
# axes not mentioned come after those which are mentioned (and keep their relative order)
pop.transpose('sex')
[13]:
sex  age\nat    BE   FO
  M       90   539   74
  M       91   499   49
  M       92   332   35
  M       93   287   27
  M       94   237   23
  M       95   154   19
  F       90  1477  136
  F       91  1298  105
  F       92  1141   78
  F       93   906   74
  F       94   739   65
  F       95   566   53
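
The ordering rule illustrated by the three calls above can be sketched in plain Python. This is only an illustration of the rule as described in this section; the function `transposed_order` is hypothetical and is not how LArray implements transpose.

```python
def transposed_order(axes, *mentioned):
    """Return the axis order that results from a transpose call."""
    if not mentioned:
        # no argument: reverse the axes
        return list(reversed(axes))
    # mentioned axes first, then the others in their original relative order
    rest = [axis for axis in axes if axis not in mentioned]
    return list(mentioned) + rest

axes = ['age', 'sex', 'nat']
print(transposed_order(axes))                       # ['nat', 'sex', 'age']
print(transposed_order(axes, 'age', 'nat', 'sex'))  # ['age', 'nat', 'sex']
print(transposed_order(axes, 'sex'))                # ['sex', 'age', 'nat']
```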

Combining arrays

Append/Prepend

Append/prepend one element to an axis of an array

[14]:
pop = load_example_data('demography').pop[2016, 'BruCap', 90:95]

# imagine that you now have access to the number of non-EU foreigners
data = [[25, 54], [15, 33], [12, 28], [11, 37], [5, 21], [7, 19]]
pop_non_eu = LArray(data, pop['FO'].axes)

# you can do something like this
pop = pop.append('nat', pop_non_eu, 'NEU')
pop
[14]:
age  sex\nat    BE   FO  NEU
 90        M   539   74   25
 90        F  1477  136   54
 91        M   499   49   15
 91        F  1298  105   33
 92        M   332   35   12
 92        F  1141   78   28
 93        M   287   27   11
 93        F   906   74   37
 94        M   237   23    5
 94        F   739   65   21
 95        M   154   19    7
 95        F   566   53   19
[15]:
# you can also add something at the start of an axis
pop = pop.prepend('sex', pop.sum('sex'), 'B')
pop
[15]:
age  sex\nat    BE   FO  NEU
 90        B  2016  210   79
 90        M   539   74   25
 90        F  1477  136   54
 91        B  1797  154   48
 91        M   499   49   15
 91        F  1298  105   33
 92        B  1473  113   40
 92        M   332   35   12
 92        F  1141   78   28
 93        B  1193  101   48
 93        M   287   27   11
 93        F   906   74   37
 94        B   976   88   26
 94        M   237   23    5
 94        F   739   65   21
 95        B   720   72   26
 95        M   154   19    7
 95        F   566   53   19
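
To make the result above easier to check, here is a small pure-Python sketch of the same idea for age 90 only: the total over the sex axis is computed column by column and placed before the existing labels. The variable names are illustrative and plain dictionaries stand in for the array.

```python
# values for age 90, copied from the output above; columns are BE, FO, NEU
rows = {'M': [539, 74, 25], 'F': [1477, 136, 54]}

# sum over the 'sex' axis, one total per nationality column
total = [sum(column) for column in zip(*rows.values())]

# "prepend": the new label 'B' comes first, existing labels keep their order
with_total = {'B': total, **rows}

print(list(with_total))  # ['B', 'M', 'F']
print(with_total['B'])   # [2016, 210, 79]
```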

The value being appended/prepended can have missing (or even extra) axes as long as common axes are compatible.

[16]:
aliens = zeros(pop.axes['sex'])
aliens
[16]:
sex    B    M    F
     0.0  0.0  0.0
[17]:
pop = pop.append('nat', aliens, 'AL')
pop
[17]:
age  sex\nat      BE     FO   NEU   AL
 90        B  2016.0  210.0  79.0  0.0
 90        M   539.0   74.0  25.0  0.0
 90        F  1477.0  136.0  54.0  0.0
 91        B  1797.0  154.0  48.0  0.0
 91        M   499.0   49.0  15.0  0.0
 91        F  1298.0  105.0  33.0  0.0
 92        B  1473.0  113.0  40.0  0.0
 92        M   332.0   35.0  12.0  0.0
 92        F  1141.0   78.0  28.0  0.0
 93        B  1193.0  101.0  48.0  0.0
 93        M   287.0   27.0  11.0  0.0
 93        F   906.0   74.0  37.0  0.0
 94        B   976.0   88.0  26.0  0.0
 94        M   237.0   23.0   5.0  0.0
 94        F   739.0   65.0  21.0  0.0
 95        B   720.0   72.0  26.0  0.0
 95        M   154.0   19.0   7.0  0.0
 95        F   566.0   53.0  19.0  0.0
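
In the output above, the appended value has no age axis, so it is repeated for every age, and the integer counts are displayed as floats once the float zeros are appended. The following pure-Python sketch mimics both effects on a simplified subset of the data (two ages, two nationalities, no 'B' label); the dictionaries and names are illustrative only and do not reflect how LArray stores data.

```python
# a few rows of the array, keyed by (age, sex); columns are BE, FO
pop_rows = {
    (90, 'M'): [539, 74],
    (90, 'F'): [1477, 136],
    (91, 'M'): [499, 49],
    (91, 'F'): [1298, 105],
}

# the appended value only has a 'sex' axis (no 'age' axis)
aliens = {'M': 0.0, 'F': 0.0}

# the value for a given sex is reused for every age; existing values are
# converted to float to mimic the common float type seen in the output
extended = {
    (age, sex): [float(value) for value in row] + [aliens[sex]]
    for (age, sex), row in pop_rows.items()
}

print(extended[(91, 'F')])  # [1298.0, 105.0, 0.0]
```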

Extend

Extend an array along an axis using another array that has the same axis but different labels

[18]:
_pop = load_example_data('demography').pop
pop = _pop[2016, 'BruCap', 90:95]
pop_next = _pop[2016, 'BruCap', 96:100]

# concatenate along age axis
pop.extend('age', pop_next)
[18]:
age  sex\nat    BE   FO
 90        M   539   74
 90        F  1477  136
 91        M   499   49
 91        F  1298  105
 92        M   332   35
 92        F  1141   78
 93        M   287   27
 93        F   906   74
 94        M   237   23
 94        F   739   65
 95        M   154   19
 95        F   566   53
 96        M    80    9
 96        F   327   25
 97        M    43    9
 97        F   171   21
 98        M    23    4
 98        F   135    9
 99        M    20    2
 99        F    92    8
100        M    12    0
100        F    60    3

Stack

Stack several arrays together to create an entirely new dimension

[19]:
# imagine you have loaded data for each nationality in different arrays (e.g. loaded from different Excel sheets)
pop_be, pop_fo = pop['BE'], pop['FO']

# first way to stack them
nat = Axis('nat=BE,FO,NEU')
pop = stack([pop_be, pop_fo, pop_non_eu], nat)

# second way
pop = stack([('BE', pop_be), ('FO', pop_fo), ('NEU', pop_non_eu)], 'nat')

pop
[19]:
age  sex\nat    BE   FO  NEU
 90        M   539   74   25
 90        F  1477  136   54
 91        M   499   49   15
 91        F  1298  105   33
 92        M   332   35   12
 92        F  1141   78   28
 93        M   287   27   11
 93        F   906   74   37
 94        M   237   23    5
 94        F   739   65   21
 95        M   154   19    7
 95        F   566   53   19
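
As a rough mental model of the second form used above, stacking takes (label, array) pairs, uses the labels for the new axis and lines the values up along it. The pure-Python sketch below does this for age 90 only, with lists standing in for the arrays; it illustrates the idea and is not LArray's implementation.

```python
# values for age 90 (sex M, then F), copied from the output above
pop_be = [539, 1477]
pop_fo = [74, 136]
pop_non_eu = [25, 54]

pairs = [('BE', pop_be), ('FO', pop_fo), ('NEU', pop_non_eu)]

# labels of the new 'nat' axis, in the order the pairs were given
nat_labels = [label for label, _ in pairs]

# stacked[i][j] is the value for sex i and nationality j
stacked = [list(row) for row in zip(*(values for _, values in pairs))]

print(nat_labels)  # ['BE', 'FO', 'NEU']
print(stacked)     # [[539, 74, 25], [1477, 136, 54]]
```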

Sorting

Sort an axis (alphabetically if labels are strings)

[20]:
pop_sorted = pop.sort_axes('nat')
pop_sorted
[20]:
age  sex\nat    BE   FO  NEU
 90        M   539   74   25
 90        F  1477  136   54
 91        M   499   49   15
 91        F  1298  105   33
 92        M   332   35   12
 92        F  1141   78   28
 93        M   287   27   11
 93        F   906   74   37
 94        M   237   23    5
 94        F   739   65   21
 95        M   154   19    7
 95        F   566   53   19

Get the labels that would sort the values along an axis

[21]:
pop_sorted.labelsofsorted('sex')
[21]:
age  sex\nat  BE  FO  NEU
 90        0   M   M    M
 90        1   F   F    F
 91        0   M   M    M
 91        1   F   F    F
 92        0   M   M    M
 92        1   F   F    F
 93        0   M   M    M
 93        1   F   F    F
 94        0   M   M    M
 94        1   F   F    F
 95        0   M   M    M
 95        1   F   F    F
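
In the output above, 'M' comes before 'F' everywhere because the count for men is lower than the count for women in every cell of the data. The pure-Python sketch below illustrates the idea for a single (age, nat) combination, assuming ascending order; the second dictionary is hypothetical data added only to show a case where the order flips.

```python
sex_labels = ['M', 'F']

# age 90, nat 'BE', copied from the data above
cell = {'M': 539, 'F': 1477}
sorted_labels = sorted(sex_labels, key=cell.get)
print(sorted_labels)  # ['M', 'F']

# hypothetical values (not from the dataset) where F is the smaller one
made_up = {'M': 10, 'F': 3}
flipped_labels = sorted(sex_labels, key=made_up.get)
print(flipped_labels)  # ['F', 'M']
```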

Sort according to values

[22]:
pop_sorted.sort_values((90, 'F'))
[22]:
age  sex\nat  NEU   FO    BE
 90        M   25   74   539
 90        F   54  136  1477
 91        M   15   49   499
 91        F   33  105  1298
 92        M   12   35   332
 92        F   28   78  1141
 93        M   11   27   287
 93        F   37   74   906
 94        M    5   23   237
 94        F   21   65   739
 95        M    7   19   154
 95        F   19   53   566
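
The result above can be read as follows: the values found at (90, 'F') determine an ordering of the nat labels, and that ordering is then applied to every row. A minimal pure-Python sketch of this reading, using values copied from the output (the variable names are illustrative):

```python
nat_labels = ['BE', 'FO', 'NEU']

# values at (90, 'F'), used as the sort key
key_row = [1477, 136, 54]

# positions that sort the key row in ascending order
order = sorted(range(len(nat_labels)), key=key_row.__getitem__)

# the same reordering is applied to the labels and to any other row
sorted_labels = [nat_labels[i] for i in order]
row_90_m = [539, 74, 25]
reordered_row = [row_90_m[i] for i in order]

print(sorted_labels)  # ['NEU', 'FO', 'BE']
print(reordered_row)  # [25, 74, 539]
```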