Transforming Arrays (Relabeling, Renaming, Reordering, Combining, Extending, Sorting, …)¶
Import the LArray library:
[2]:
from larray import *
Check the version of LArray:
[3]:
from larray import __version__
__version__
[3]:
'0.31'
Manipulating axes¶
[4]:
# let's start with a small subset of the example population data
pop = load_example_data('demography').pop[2016, 'BruCap', 90:95]
pop
[4]:
age sex\nat BE FO
90 M 539 74
90 F 1477 136
91 M 499 49
91 F 1298 105
92 M 332 35
92 F 1141 78
93 M 287 27
93 F 906 74
94 M 237 23
94 F 739 65
95 M 154 19
95 F 566 53
Relabeling¶
Replace all labels of one axis
[5]:
# returns a copy by default
pop_new_labels = pop.set_labels('sex', ['Men', 'Women'])
pop_new_labels
[5]:
age sex\nat BE FO
90 Men 539 74
90 Women 1477 136
91 Men 499 49
91 Women 1298 105
92 Men 332 35
92 Women 1141 78
93 Men 287 27
93 Women 906 74
94 Men 237 23
94 Women 739 65
95 Men 154 19
95 Women 566 53
[6]:
# the inplace flag avoids creating a copy
pop.set_labels('sex', ['M', 'F'], inplace=True)
[6]:
age sex\nat BE FO
90 M 539 74
90 F 1477 136
91 M 499 49
91 F 1298 105
92 M 332 35
92 F 1141 78
93 M 287 27
93 F 906 74
94 M 237 23
94 F 739 65
95 M 154 19
95 F 566 53
Renaming axes¶
Rename one axis
[7]:
pop.info
[7]:
6 x 2 x 2
age [6]: 90 91 92 93 94 95
sex [2]: 'M' 'F'
nat [2]: 'BE' 'FO'
dtype: int64
memory used: 192 bytes
[8]:
# 'rename' returns a copy of the array
pop2 = pop.rename('sex', 'gender')
pop2
[8]:
age gender\nat BE FO
90 M 539 74
90 F 1477 136
91 M 499 49
91 F 1298 105
92 M 332 35
92 F 1141 78
93 M 287 27
93 F 906 74
94 M 237 23
94 F 739 65
95 M 154 19
95 F 566 53
Rename several axes at once
[9]:
# no 'x.' prefix here because sex and nat are keyword arguments, not actual axes
pop2 = pop.rename(sex='gender', nat='nationality')
pop2
[9]:
age gender\nationality BE FO
90 M 539 74
90 F 1477 136
91 M 499 49
91 F 1298 105
92 M 332 35
92 F 1141 78
93 M 287 27
93 F 906 74
94 M 237 23
94 F 739 65
95 M 154 19
95 F 566 53
Reordering axes¶
Axes can be reordered using the transpose method. By default, transpose reverses the axes; otherwise it permutes the axes according to the list given as argument. Axes not mentioned come after those which are mentioned (and keep their relative order). Finally, transpose returns a copy of the array.
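The ordering rule described above can be sketched in plain Python, independently of LArray. The helper `transposed_order` is a hypothetical name introduced only for this illustration; it works on a list of axis names and says nothing about how LArray implements transpose internally.

```python
def transposed_order(axes, *mentioned):
    """Return the axis order produced by the rule described above.

    Illustrative helper only (not part of LArray).
    """
    if not mentioned:
        # no argument: reverse the axes
        return list(reversed(axes))
    unknown = [name for name in mentioned if name not in axes]
    if unknown:
        raise ValueError(f"unknown axes: {unknown}")
    # mentioned axes first, then the others in their original relative order
    rest = [name for name in axes if name not in mentioned]
    return list(mentioned) + rest


axes = ['age', 'sex', 'nat']
print(transposed_order(axes))                       # ['nat', 'sex', 'age']
print(transposed_order(axes, 'age', 'nat', 'sex'))  # ['age', 'nat', 'sex']
print(transposed_order(axes, 'sex'))                # ['sex', 'age', 'nat']
```

The three results correspond to the axis orders obtained with `pop.transpose()`, `pop.transpose('age', 'nat', 'sex')` and `pop.transpose('sex')` in the cells that follow.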
[10]:
# starting order : age, sex, nat
pop
[10]:
age sex\nat BE FO
90 M 539 74
90 F 1477 136
91 M 499 49
91 F 1298 105
92 M 332 35
92 F 1141 78
93 M 287 27
93 F 906 74
94 M 237 23
94 F 739 65
95 M 154 19
95 F 566 53
[11]:
# no argument --> reverse axes
pop.transpose()
# .T is a shortcut for .transpose()
pop.T
[11]:
nat sex\age 90 91 92 93 94 95
BE M 539 499 332 287 237 154
BE F 1477 1298 1141 906 739 566
FO M 74 49 35 27 23 19
FO F 136 105 78 74 65 53
[12]:
# reorder according to list
pop.transpose('age', 'nat', 'sex')
[12]:
age nat\sex M F
90 BE 539 1477
90 FO 74 136
91 BE 499 1298
91 FO 49 105
92 BE 332 1141
92 FO 35 78
93 BE 287 906
93 FO 27 74
94 BE 237 739
94 FO 23 65
95 BE 154 566
95 FO 19 53
[13]:
# axes not mentioned come after those which are mentioned (and keep their relative order)
pop.transpose('sex')
[13]:
sex age\nat BE FO
M 90 539 74
M 91 499 49
M 92 332 35
M 93 287 27
M 94 237 23
M 95 154 19
F 90 1477 136
F 91 1298 105
F 92 1141 78
F 93 906 74
F 94 739 65
F 95 566 53
Combining arrays¶
Append/Prepend¶
Append/prepend one element to an axis of an array
[14]:
pop = load_example_data('demography').pop[2016, 'BruCap', 90:95]
# imagine that you now have access to the number of non-EU foreigners
data = [[25, 54], [15, 33], [12, 28], [11, 37], [5, 21], [7, 19]]
pop_non_eu = LArray(data, pop['FO'].axes)
# you can do something like this
pop = pop.append('nat', pop_non_eu, 'NEU')
pop
[14]:
age sex\nat BE FO NEU
90 M 539 74 25
90 F 1477 136 54
91 M 499 49 15
91 F 1298 105 33
92 M 332 35 12
92 F 1141 78 28
93 M 287 27 11
93 F 906 74 37
94 M 237 23 5
94 F 739 65 21
95 M 154 19 7
95 F 566 53 19
[15]:
# you can also add something at the start of an axis
pop = pop.prepend('sex', pop.sum('sex'), 'B')
pop
[15]:
age sex\nat BE FO NEU
90 B 2016 210 79
90 M 539 74 25
90 F 1477 136 54
91 B 1797 154 48
91 M 499 49 15
91 F 1298 105 33
92 B 1473 113 40
92 M 332 35 12
92 F 1141 78 28
93 B 1193 101 48
93 M 287 27 11
93 F 906 74 37
94 B 976 88 26
94 M 237 23 5
94 F 739 65 21
95 B 720 72 26
95 M 154 19 7
95 F 566 53 19
The value being appended/prepended can have missing (or even extra) axes as long as common axes are compatible
[16]:
aliens = zeros(pop.axes['sex'])
aliens
[16]:
sex B M F
0.0 0.0 0.0
[17]:
pop = pop.append('nat', aliens, 'AL')
pop
[17]:
age sex\nat BE FO NEU AL
90 B 2016.0 210.0 79.0 0.0
90 M 539.0 74.0 25.0 0.0
90 F 1477.0 136.0 54.0 0.0
91 B 1797.0 154.0 48.0 0.0
91 M 499.0 49.0 15.0 0.0
91 F 1298.0 105.0 33.0 0.0
92 B 1473.0 113.0 40.0 0.0
92 M 332.0 35.0 12.0 0.0
92 F 1141.0 78.0 28.0 0.0
93 B 1193.0 101.0 48.0 0.0
93 M 287.0 27.0 11.0 0.0
93 F 906.0 74.0 37.0 0.0
94 B 976.0 88.0 26.0 0.0
94 M 237.0 23.0 5.0 0.0
94 F 739.0 65.0 21.0 0.0
95 B 720.0 72.0 26.0 0.0
95 M 154.0 19.0 7.0 0.0
95 F 566.0 53.0 19.0 0.0
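To make the "missing axes" case more concrete, here is a minimal plain-Python sketch of what happens when the appended value only has a sex axis: the same value is reused for every age. The nested-dict layout and the helper `append_nat` are assumptions made for this illustration, not LArray's data model. Unlike the LArray output above, where the whole array is displayed as floats after the float-valued `aliens` is appended, the dict sketch keeps each value's own type.

```python
def append_nat(array, value_by_sex, new_label):
    """Append `new_label` to the innermost ('nat') level of a nested dict.

    `array` maps (age, sex) -> {nat: value}; `value_by_sex` maps sex -> value
    and has no 'age' axis, so the same value is reused for every age.
    Returns a new dict; the input is left untouched.
    """
    return {
        (age, sex): {**by_nat, new_label: value_by_sex[sex]}
        for (age, sex), by_nat in array.items()
    }


# a few values taken from the example array
pop = {
    (90, 'M'): {'BE': 539, 'FO': 74},
    (90, 'F'): {'BE': 1477, 'FO': 136},
    (91, 'M'): {'BE': 499, 'FO': 49},
    (91, 'F'): {'BE': 1298, 'FO': 105},
}
aliens = {'M': 0.0, 'F': 0.0}

extended = append_nat(pop, aliens, 'AL')
print(extended[(91, 'F')])  # {'BE': 1298, 'FO': 105, 'AL': 0.0}
```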
Extend¶
Extend an array along an axis with another array that has the same axis (but with other labels)
[18]:
_pop = load_example_data('demography').pop
pop = _pop[2016, 'BruCap', 90:95]
pop_next = _pop[2016, 'BruCap', 96:100]
# concatenate along age axis
pop.extend('age', pop_next)
[18]:
age sex\nat BE FO
90 M 539 74
90 F 1477 136
91 M 499 49
91 F 1298 105
92 M 332 35
92 F 1141 78
93 M 287 27
93 F 906 74
94 M 237 23
94 F 739 65
95 M 154 19
95 F 566 53
96 M 80 9
96 F 327 25
97 M 43 9
97 F 171 21
98 M 23 4
98 F 135 9
99 M 20 2
99 F 92 8
100 M 12 0
100 F 60 3
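The difference from append is that the second array already carries the axis being extended, with its own labels. A rough plain-Python sketch of that idea follows; the `{age: {sex: value}}` layout and the helper `extend_along_age` are hypothetical, and rejecting duplicated labels is an assumption of this sketch rather than documented LArray behaviour.

```python
def extend_along_age(first, second):
    """Concatenate two {age: {sex: value}} mappings along the age axis."""
    duplicated = sorted(first.keys() & second.keys())
    if duplicated:
        # assumption of this sketch: both pieces must carry different age labels
        raise ValueError(f"age labels present in both pieces: {duplicated}")
    # dicts keep insertion order, so the labels of `second` come after those of `first`
    return {**first, **second}


# 'BE' values taken from the example output above
pop_be = {94: {'M': 237, 'F': 739}, 95: {'M': 154, 'F': 566}}
pop_be_next = {96: {'M': 80, 'F': 327}, 97: {'M': 43, 'F': 171}}

combined = extend_along_age(pop_be, pop_be_next)
print(list(combined))  # [94, 95, 96, 97]
```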
Stack¶
Stack several arrays together to create an entirely new dimension
[19]:
# imagine you have loaded data for each nationality in different arrays (e.g. loaded from different Excel sheets)
pop_be, pop_fo = pop['BE'], pop['FO']
# first way to stack them
nat = Axis('nat=BE,FO,NEU')
pop = stack([pop_be, pop_fo, pop_non_eu], nat)
# second way
pop = stack([('BE', pop_be), ('FO', pop_fo), ('NEU', pop_non_eu)], 'nat')
pop
[19]:
age sex\nat BE FO NEU
90 M 539 74 25
90 F 1477 136 54
91 M 499 49 15
91 F 1298 105 33
92 M 332 35 12
92 F 1141 78 28
93 M 287 27 11
93 F 906 74 37
94 M 237 23 5
94 F 739 65 21
95 M 154 19 7
95 F 566 53 19
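Conceptually, stacking takes several arrays that share the same axes and adds one new dimension whose labels identify the original arrays. The following plain-Python sketch mirrors the "second way" above with (label, array) pairs; the dict layout and the helper `stack_along_new_axis` are assumptions made for illustration and do not describe LArray's implementation.

```python
def stack_along_new_axis(named_arrays):
    """Stack (label, mapping) pairs into {key: {label: value}}.

    Every mapping must share the same keys; the labels form the new axis.
    """
    named_arrays = list(named_arrays)
    first_label, first = named_arrays[0]
    for label, array in named_arrays[1:]:
        if array.keys() != first.keys():
            raise ValueError(f"{label!r} does not share the labels of {first_label!r}")
    return {
        key: {label: array[key] for label, array in named_arrays}
        for key in first
    }


# values for age 90 taken from the example output above
pop_be = {(90, 'M'): 539, (90, 'F'): 1477}
pop_fo = {(90, 'M'): 74, (90, 'F'): 136}
pop_non_eu = {(90, 'M'): 25, (90, 'F'): 54}

stacked = stack_along_new_axis([('BE', pop_be), ('FO', pop_fo), ('NEU', pop_non_eu)])
print(stacked[(90, 'F')])  # {'BE': 1477, 'FO': 136, 'NEU': 54}
```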
Sorting¶
Sort an axis (alphabetically if labels are strings). Here the nat labels are already in alphabetical order, so the result looks unchanged
[20]:
pop_sorted = pop.sort_axes('nat')
pop_sorted
[20]:
age sex\nat BE FO NEU
90 M 539 74 25
90 F 1477 136 54
91 M 499 49 15
91 F 1298 105 33
92 M 332 35 12
92 F 1141 78 28
93 M 287 27 11
93 F 906 74 37
94 M 237 23 5
94 F 739 65 21
95 M 154 19 7
95 F 566 53 19
Return the labels that would sort the values along an axis
[21]:
pop_sorted.labelsofsorted('sex')
[21]:
age sex\nat BE FO NEU
90 0 M M M
90 1 F F F
91 0 M M M
91 1 F F F
92 0 M M M
92 1 F F F
93 0 M M M
93 1 F F F
94 0 M M M
94 1 F F F
95 0 M M M
95 1 F F F
Sort according to values
[22]:
pop_sorted.sort_values((90, 'F'))
[22]:
age sex\nat NEU FO BE
90 M 25 74 539
90 F 54 136 1477
91 M 15 49 499
91 F 33 105 1298
92 M 12 35 332
92 F 28 78 1141
93 M 11 27 287
93 F 37 74 906
94 M 5 23 237
94 F 21 65 739
95 M 7 19 154
95 F 19 53 566
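The reordering above can be read as a two-step process: rank the nat labels by the values found in the row (90, 'F'), then apply that label order to every other row. A minimal plain-Python sketch of that reading is given below; the helper `labels_sorted_by_row` is a hypothetical name and the dict rows are a simplification, not LArray's internals.

```python
def labels_sorted_by_row(row):
    """Return the labels of `row` ordered by increasing value."""
    return sorted(row, key=row.get)


# values taken from the example array
row_90_f = {'BE': 1477, 'FO': 136, 'NEU': 54}
row_95_m = {'BE': 154, 'FO': 19, 'NEU': 7}

# step 1: the reference row decides the label order
order = labels_sorted_by_row(row_90_f)
print(order)  # ['NEU', 'FO', 'BE']

# step 2: every other row is reordered with the same label order
print([row_95_m[label] for label in order])  # [7, 19, 154]
```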