{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Import the LArray library:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from larray import *" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# load 'demography_eurostat' dataset\n", "demography_eurostat = load_example_data('demography_eurostat')\n", "\n", "# extract the 'population' array from the dataset\n", "population = demography_eurostat.population\n", "population" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Inspecting Array objects\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get array summary: metadata + dimensions + description of axes + dtype + size in memory" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Array summary: metadata + dimensions + description of axes\n", "population.info" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get axes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population.axes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get axis names" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population.axes.names" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get number of dimensions" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population.ndim" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get length of each dimension" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get total number of elements of the array" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population.size" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "Get type of internal data (int, float, ...)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population.dtype" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get size in memory" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population.memory_used" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Some Useful Functions\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### with total\n", "\n", "Add totals to one or several axes:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population.with_total('gender', label='Total')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [with_total](../_generated/larray.Array.with_total.rst#larray.Array.with_total) for more details and examples.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### where\n", "\n", "The ``where`` function can be used to apply some computation depending on a condition:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# where(condition, value if true, value if false)\n", "where(population < population.mean('time'), -population, population)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [where](../_generated/larray.where.rst#larray.where) for more details and examples.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### clip\n", "\n", "Set all data between a certain range:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# values below 10 millions are set to 10 millions\n", "population.clip(minval=10**7)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# values above 40 millions are set to 40 millions\n", "population.clip(maxval=4*10**7)" ] }, { "cell_type": 
"code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# values below 10 millions are set to 10 millions and \n", "# values above 40 millions are set to 40 millions\n", "population.clip(10**7, 4*10**7)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Using vectors to define the lower and upper bounds\n", "lower_bound = sequence(population.time, initial=5_500_000, inc=50_000)\n", "upper_bound = sequence(population.time, 41_000_000, inc=100_000)\n", "\n", "print(lower_bound, '\\n')\n", "print(upper_bound, '\\n')\n", "\n", "population.clip(lower_bound, upper_bound)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [clip](../_generated/larray.Array.clip.rst#larray.Array.clip) for more details and examples.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### divnot0\n", "\n", "Replace division by 0 by 0:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "divisor = ones(population.axes, dtype=int)\n", "divisor['Male'] = 0\n", "divisor" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population / divisor" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# we use astype(int) since the divnot0 method \n", "# returns a float array in this case while \n", "# we want an integer array\n", "population.divnot0(divisor).astype(int)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [divnot0](../_generated/larray.Array.divnot0.rst#larray.Array.divnot0) for more details and examples.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ratio\n", "\n", "The ``ratio`` (``rationot0``) method returns an array with all values divided by the sum of values along given axes:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population.ratio('gender')\n", "\n", "# 
which is equivalent to\n", "population / population.sum('gender')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [ratio](../_generated/larray.Array.ratio.rst#larray.Array.ratio) and [rationot0](../_generated/larray.Array.rationot0.rst#larray.Array.rationot0) for more details and examples.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### percent\n", "\n", "The ``percent`` method returns the same ratios expressed as percentages:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# the previous ratios, expressed in percent\n", "population.percent('gender')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [percent](../_generated/larray.Array.percent.rst#larray.Array.percent) for more details and examples.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### diff\n", "\n", "The ``diff`` method calculates the n-th order discrete difference along a given axis.\n", "\n", "The first order difference is given by ``out[n+1] = in[n+1] - in[n]`` along the given axis.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# calculates 'diff[year+1] = population[year+1] - population[year]'\n", "population.diff('time')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# calculates 'diff[year+2] = population[year+2] - population[year]'\n", "population.diff('time', d=2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# calculates 'diff[year] = population[year+1] - population[year]'\n", "population.diff('time', label='lower')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [diff](../_generated/larray.Array.diff.rst#larray.Array.diff) for more details and examples.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### growth\\_rate\n", "\n", "The ``growth_rate`` method calculates the growth along a given axis.\n", "\n", "It is roughly equivalent to ``a.diff(axis, d, 
label) / a[axis.i[:-d]]``:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population.growth_rate('time')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [growth_rate](../_generated/larray.Array.growth_rate.rst#larray.Array.growth_rate) for more details and examples.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### shift\n", "\n", "The ``shift`` method drops the first label of an axis and shifts the data, so that each remaining label holds the value of the label before it:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "population.shift('time')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# when shift is applied on an (increasing) time axis,\n", "# it effectively brings \"past\" data into the future\n", "population_shifted = population.shift('time')\n", "stack({'population_shifted_2014': population_shifted[2014], 'population_2013': population[2013]}, 'array')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [shift](../_generated/larray.Array.shift.rst#larray.Array.shift) for more details and examples.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Other interesting functions\n", "\n", "Many more useful functions are described in the API reference, in the sections [Aggregation Functions](../api.rst#aggregation-functions), [Miscellaneous](../api.rst#miscellaneous) and [Utility Functions](../api.rst#utility-functions).\n" ] } ], "metadata": { "celltoolbar": "Edit Metadata", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" }, "livereveal": { "autolaunch": false, "scroll": true } }, "nbformat": 4, "nbformat_minor": 
2 }