{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Working With Sessions\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import the LArray library:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbsphinx": "hidden" }, "outputs": [], "source": [ "%xmode Minimal" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from larray import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Three Kinds Of Sessions " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are three ways to group objects in LArray:\n", "\n", " * [Session](../api.rst#session): an ordered dict-like container with special I/O methods. Although the *autocomplete*\\* feature on the objects stored in the session is available in the larray-editor, it is not available in development tools like PyCharm, which makes it cumbersome to use.\n", " * [CheckedSession](../api.rst#checkedsession): provides the same methods as Session objects but is defined in a completely different way (see example below). The *autocomplete*\\* feature is available both in the larray-editor and in development tools (PyCharm). In addition, the type of each stored object is protected. Optionally, it is possible to constrain the axes and dtype of arrays using ``CheckedArray``.\n", " * [CheckedParameters](../api.rst#checkedparameters): a special version of CheckedSession in which the values of all stored objects (parameters) are frozen after initialization.\n", " \n", " \\* *Autocomplete* is the feature in which development tools try to predict the variable or function a user intends to enter after only a few characters have been typed (like word completion in cell phones)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating Sessions " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Session\n", "\n", "Create a session:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# define some scalars, axes and arrays\n", "variant = 'baseline'\n", "\n", "country = Axis('country=Belgium,France,Germany')\n", "gender = Axis('gender=Male,Female')\n", "time = Axis('time=2013..2017')\n", "\n", "population = zeros([country, gender, time])\n", "births = zeros([country, gender, time])\n", "deaths = zeros([country, gender, time])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# create an empty session, then add objects one by one\n", "s = Session()\n", "s.variant = variant\n", "s.country = country\n", "s.gender = gender\n", "s.time = time\n", "s.population = population\n", "s.births = births\n", "s.deaths = deaths\n", "\n", "print(s.summary())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# or create a session in one step by passing all objects to the constructor\n", "s = Session(variant=variant, country=country, gender=gender, time=time, \n", " population=population, births=births, deaths=deaths)\n", "\n", "print(s.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### CheckedSession\n", "\n", "The syntax for defining a CheckedSession is a bit specific:\n", "\n", "```python\n", "class MySession(CheckedSession):\n", " # Variables can be declared in two ways:\n", " # a) by specifying only the type of the variable (to be initialized later)\n", " var1: Type\n", " # b) by giving an initialization value.\n", " # In that case, the type is deduced from the initialization value\n", " var2 = initialization value\n", " # Additionally, axes and dtype of Array variables can be constrained \n", " # using the special type CheckedArray\n", " arr1: 
CheckedArray([list, of, axes], dtype) = initialization value\n", "```\n", "\n", "Check the example below:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class Demography(CheckedSession):\n", " # (by convention, parameters (read-only objects) are declared in capital letters)\n", " # Declare the 'VARIANT' parameter as being of type string.\n", " # 'VARIANT' will be initialized when a 'Demography' session is created\n", " VARIANT: str\n", " # declare variables with an initialization value.\n", " # Their type is deduced from their initialization value. \n", " COUNTRY = Axis('country=Belgium,France,Germany')\n", " GENDER = Axis('gender=Male,Female')\n", " TIME = Axis('time=2013..2017')\n", " population = zeros([COUNTRY, GENDER, TIME], dtype=int)\n", " births = zeros([COUNTRY, GENDER, TIME], dtype=int)\n", " # declare 'deaths' with constrained axes and dtype.\n", " # Its type (Array), axes and dtype are not modifiable.\n", " # It will be initialized with 0\n", " deaths: CheckedArray([COUNTRY, GENDER, TIME], int) = 0\n", "\n", "d = Demography(VARIANT='baseline')\n", "\n", "print(d.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading and Dumping Sessions\n", "\n", "One of the main advantages of grouping arrays, axes and groups in session objects is that you can load and save all of them in one shot. As with arrays, metadata can be associated with a session. The metadata can be saved and loaded in all file formats. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Loading Sessions (CSV, Excel, HDF5)\n", "\n", "To load the items of a session, you have two options:\n", "\n", "1) Instantiate a new session and pass the path to the Excel/HDF5 file or to the directory containing CSV files to the Session constructor:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# create a new Session object and load all arrays, axes, groups and metadata \n", "# from all CSV files located in the passed directory\n", "csv_dir = get_example_filepath('demography_eurostat')\n", "s = Session(csv_dir)\n", "\n", "# create a new Session object and load all arrays, axes, groups and metadata\n", "# stored in the passed Excel file\n", "filepath_excel = get_example_filepath('demography_eurostat.xlsx')\n", "s = Session(filepath_excel)\n", "\n", "# create a new Session object and load all arrays, axes, groups and metadata\n", "# stored in the passed HDF5 file\n", "filepath_hdf = get_example_filepath('demography_eurostat.h5')\n", "s = Session(filepath_hdf)\n", "\n", "print(s.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2) Call the ``load`` method on an existing session and pass the path to the Excel/HDF5 file or to the directory containing CSV files as first argument:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# create a session containing 3 axes, 2 groups and one array 'population'\n", "filepath = get_example_filepath('population_only.xlsx')\n", "s = Session(filepath)\n", "\n", "print(s.summary())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# call the load method on the previous session and add the 'births' and 'deaths' arrays to it\n", "filepath = get_example_filepath('births_and_deaths.xlsx')\n", "s.load(filepath)\n", "\n", "print(s.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The 
``load`` method offers some options:\n", "\n", "1) Using the ``names`` argument, you can specify which items to load:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "births_and_deaths_session = Session()\n", "\n", "# use the names argument to only load births and deaths arrays\n", "births_and_deaths_session.load(filepath_hdf, names=['births', 'deaths'])\n", "\n", "print(births_and_deaths_session.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2) When the ``display`` argument is set to True, the ``load`` method prints a message each time a new item is loaded: " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s = Session()\n", "\n", "# with display=True, the load method will print a message\n", "# each time a new item is loaded\n", "s.load(filepath_hdf, display=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Dumping Sessions (CSV, Excel, HDF5)\n", "\n", "To save a session, call the ``save`` method. The first argument is the path to an Excel/HDF5 file, or to a directory if the items are saved to CSV files:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# save items of a session in CSV files.\n", "# Here, the save method will create a 'demography' directory in which CSV files will be written \n", "s.save('demography')\n", "\n", "# save the session to an HDF5 file\n", "s.save('demography.h5')\n", "\n", "# save the session to an Excel file\n", "s.save('demography.xlsx')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "