{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Transforming Arrays (Relabeling, Renaming, Reordering, Sorting, ...)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import the LArray library:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from larray import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import the `population` array from the `demography_eurostat` dataset:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "demography_eurostat = load_example_data('demography_eurostat')\n", "population = demography_eurostat.population\n", "\n", "# display the 'population' array\n", "population" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Manipulating axes\n", "\n", "\n", "The ``Array`` class offers several methods to manipulate the axes and labels of an array:\n", "\n", "- [set_labels](../_generated/larray.Array.set_labels.rst#larray.Array.set_labels): to replace all or some labels of one or several axes.\n", "- [rename](../_generated/larray.Array.rename.rst#larray.Array.rename): to replace one or several axis names.\n", "- [set_axes](../_generated/larray.Array.set_axes.rst#larray.Array.set_axes): to replace one or several axes.\n", "- [transpose](../_generated/larray.Array.transpose.rst#larray.Array.transpose): to modify the order of axes.\n", "- [drop](../_generated/larray.Array.drop.rst#larray.Array.drop): to remove one or several labels.\n", "- [combine_axes](../_generated/larray.Array.combine_axes.rst#larray.Array.combine_axes): to combine axes.\n", "- [split_axes](../_generated/larray.Array.split_axes.rst#larray.Array.split_axes): to split one or several axes by splitting their labels and names.\n", "- [reindex](../_generated/larray.Array.reindex.rst#larray.Array.reindex): to reorder, add and remove labels of one or several axes.\n", "- [insert](../_generated/larray.Array.insert.rst#larray.Array.insert): to insert a label at a given position.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Relabeling\n", "\n", "Replace some labels of an axis:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# replace only one label of the 'gender' axis by passing a dict\n", "population_new_labels = population.set_labels('gender', {'Male': 'Men'})\n", "population_new_labels" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# set all labels of the 'country' axis to uppercase by passing the function str.upper()\n", "population_new_labels = population.set_labels('country', str.upper)\n", "population_new_labels" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [set_labels](../_generated/larray.Array.set_labels.rst#larray.Array.set_labels) for more details and examples." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Renaming axes\n", "\n", "Rename one axis:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# 'rename' returns a copy of the array\n", "population_new_names = population.rename('time', 'year')\n", "population_new_names" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Rename several axes at once:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population_new_names = population.rename({'gender': 'sex', 'time': 'year'})\n", "population_new_names" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [rename](../_generated/larray.Array.rename.rst#larray.Array.rename) for more details and examples." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Replacing Axes\n", "\n", "Replace one axis:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_gender = Axis('sex=Men,Women')\n", "population_new_axis = population.set_axes('gender', new_gender)\n", "population_new_axis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Replace several axes at once:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "new_country = Axis('country_codes=BE,FR,DE') \n", "population_new_axes = population.set_axes({'country': new_country, 'gender': new_gender})\n", "population_new_axes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reordering axes\n", "\n", "Axes can be reordered using ``transpose`` method.\n", "By default, *transpose* reverse axes, otherwise it permutes the axes according to the list given as argument.\n", "Axes not mentioned come after those which are mentioned(and keep their relative order).\n", "Finally, *transpose* returns a copy of the array.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# starting order : country, gender, time\n", "population" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# no argument --> reverse all axes\n", "population_transposed = population.transpose()\n", "\n", "# .T is a shortcut for .transpose()\n", "population_transposed = population.T\n", "\n", "population_transposed" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# reorder according to list\n", "population_transposed = population.transpose('gender', 'country', 'time')\n", "population_transposed" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# move 'time' axis at first place\n", "# not mentioned axes come after those which are mentioned (and keep their relative order)\n", "population_transposed = population.transpose('time')\n", "population_transposed" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# move 'gender' axis at last place\n", "# not mentioned axes come before those which are mentioned (and keep their relative order)\n", "population_transposed = population.transpose(..., 'gender')\n", "population_transposed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [transpose](../_generated/larray.Array.transpose.rst#larray.Array.transpose) for more details and examples." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Dropping Labels" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population_labels_dropped = population.drop([2014, 2016])\n", "population_labels_dropped" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [drop](../_generated/larray.Array.drop.rst#larray.Array.drop) for more details and examples." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Combine And Split Axes\n", "\n", "Combine two axes:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population_combined_axes = population.combine_axes(('country', 'gender'))\n", "population_combined_axes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Split an axis:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population_split_axes = population_combined_axes.split_axes('country_gender')\n", "population_split_axes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [combine_axes](../_generated/larray.Array.combine_axes.rst#larray.Array.combine_axes) and [split_axes](../_generated/larray.Array.split_axes.rst#larray.Array.split_axes) for more details and examples." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reordering, adding and removing labels\n", "\n", "The ``reindex`` method allows to reorder, add and remove labels along one axis:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# reverse years + remove 2013 + add 2018 + copy data for 2017 to 2018\n", "population_new_time = population.reindex('time', '2018..2014', fill_value=population[2017])\n", "population_new_time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or several axes:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population_new = population.reindex({'country': 'country=Luxembourg,Belgium,France,Germany', \n", " 'time': 'time=2018..2014'}, fill_value=0)\n", "population_new" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [reindex](../_generated/larray.Array.reindex.rst#larray.Array.reindex) for more details and examples." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another way to insert new labels is to use the ``insert`` method:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# insert a new country before 'France' with all values set to 0\n", "population_new_country = population.insert(0, before='France', label='Luxembourg')\n", "# or equivalently\n", "population_new_country = population.insert(0, after='Belgium', label='Luxembourg')\n", "\n", "population_new_country" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [insert](../_generated/larray.Array.insert.rst#larray.Array.insert) for more details and examples." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sorting\n", "\n", "\n", "- [sort_labels](../_generated/larray.Array.sort_labels.rst#larray.Array.sort_labels): sort the labels of an axis.\n", "- [labelsofsorted](../_generated/larray.Array.labelsofsorted.rst#larray.Array.labelsofsorted): give labels which would sort an axis. \n", "- [sort_values](../_generated/larray.Array.sort_values.rst#larray.Array.sort_values): sort axes according to values" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# get a copy of the 'population_benelux' array\n", "population_benelux = demography_eurostat.population_benelux.copy()\n", "population_benelux" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sort an axis (alphabetically if labels are strings)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population_sorted = population_benelux.sort_labels('gender')\n", "population_sorted" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Give labels which would sort the axis\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population_benelux.labelsofsorted('country')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sort according to values\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population_sorted = population_benelux.sort_values(('Male', 2017))\n", "population_sorted" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Aligning Arrays\n", "\n", "The ``align`` method align two arrays on their axes with a specified join method.\n", "In other words, it ensure all common axes are compatible." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# get a copy of the 'births' array\n", "births = demography_eurostat.births.copy()\n", "\n", "# align the two arrays with the 'inner' join method\n", "population_aligned, births_aligned = population_benelux.align(births, join='inner')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('population_benelux before align:')\n", "print(population_benelux)\n", "print()\n", "print('population_benelux after align:')\n", "print(population_aligned)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('births before align:')\n", "print(births)\n", "print()\n", "print('births after align:')\n", "print(births_aligned)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Aligned arrays can then be used in arithmetic operations:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population_aligned - births_aligned" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [align](../_generated/larray.Array.align.rst#larray.Array.align) for more details and examples." ] } ], "metadata": { "celltoolbar": "Edit Metadata", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" }, "livereveal": { "autolaunch": false, "scroll": true } }, "nbformat": 4, "nbformat_minor": 2 }