{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# How data is stored\n\nGeneral information about eddies storage.\n\nAll files have the same structure, with more or less fields and possible different order.\n\nThere are 3 class of files:\n\n- **Eddies collections** : contain a list of eddies without link between them\n- **Track eddies collections** :\n  manage eddies associated in trajectories, the ```track``` field allows to separate each trajectory\n- **Network eddies collections** :\n  manage eddies associated in networks, the ```track``` and ```segment``` fields allow to separate observations\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "from matplotlib import pyplot as plt\nfrom numpy import arange, outer\nimport py_eddy_tracker_sample\n\nfrom py_eddy_tracker.data import get_demo_path\nfrom py_eddy_tracker.observations.network import NetworkObservations\nfrom py_eddy_tracker.observations.observation import EddiesObservations, Table\nfrom py_eddy_tracker.observations.tracking import TrackEddiesObservations"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Eddies can be stored in 2 formats with the same structure:\n\n- zarr (https://zarr.readthedocs.io/en/stable/), which allow efficiency in IO,...\n- NetCDF4 (https://unidata.github.io/netcdf4-python/), well-known format\n\nEach field are stored in column, each row corresponds at 1 observation,\narray field like contour/profile are 2D column.\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Eddies files (zarr or netcdf) can be loaded with ```load_file``` method:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "eddies_collections = EddiesObservations.load_file(get_demo_path(\"Cyclonic_20160515.nc\"))\neddies_collections.field_table()\n# offset and scale_factor are used only when data is stored in zarr or netCDF4"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Field access\nTo access the total field, here ```amplitude```\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "eddies_collections.amplitude\n\n# To access only a specific part of the field\neddies_collections.amplitude[4:15]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Data matrix is a numpy ndarray\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "eddies_collections.obs"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "eddies_collections.obs.dtype"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Contour storage\nAll contours are stored on the same number of points, and are resampled if needed with an algorithm to be stored as objects\n\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Speed profile storage\nSpeed profile is an interpolation of speed mean along each contour.\nFor each contour included in eddy, we compute mean of speed along the contour,\nand after we interpolate speed mean array on a fixed size array.\n\nSeveral field are available to understand \"uavg_profile\" :\n 0. - num_contours : Number of contour in eddies, must be equal to amplitude divide by isoline step\n 1. - height_inner_contour : height of inner contour used\n 2. - height_max_speed_contour : height of max speed contour used\n 3. - height_external_contour : height of outter contour used\n\nLast value of \"uavg_profile\" is for inner contour and first value for outter contour.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Observations selection of \"uavg_profile\" with high number of contour(Eddy with high amplitude)\ne = eddies_collections.extract_with_mask(eddies_collections.num_contours > 15)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Raw display of profiles with more than 15 contours\nax = plt.subplot(111)\n_ = ax.plot(e.uavg_profile.T, lw=0.5)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Profile from inner to outter\nax = plt.subplot(111)\nax.plot(e.uavg_profile[:, ::-1].T, lw=0.5)\n_ = ax.set_xlabel(\"From inner to outter contour\"), ax.set_ylabel(\"Speed (m/s)\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# If we normalize indice of contour to set speed contour to 1 and inner contour to 0\nax = plt.subplot(111)\nh_in = e.height_inner_contour\nh_s = e.height_max_speed_contour\nh_e = e.height_external_contour\nr = (h_e - h_in) / (h_s - h_in)\nnb_pt = e.uavg_profile.shape[1]\n# Create an x array for each profile\nx = outer(arange(nb_pt) / nb_pt, r)\n\nax.plot(x, e.uavg_profile[:, ::-1].T, lw=0.5)\n_ = ax.set_xlabel(\"From inner to outter contour\"), ax.set_ylabel(\"Speed (m/s)\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Trajectories\nTracks eddies collections add several fields :\n\n- **track** : Trajectory number\n- **observation_flag** : Flag indicating if the value is interpolated between two observations or not\n  (0: observed eddy, 1: interpolated eddy)\"\n- **observation_number** : Eddy temporal index in a trajectory, days starting at the eddy first detection\n- **cost_association** : result of the cost function to associate the eddy with the next observation\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "eddies_tracks = TrackEddiesObservations.load_file(\n    py_eddy_tracker_sample.get_demo_path(\"eddies_med_adt_allsat_dt2018/Cyclonic.zarr\")\n)\n# In this example some fields are removed (effective_contour_longitude,...) in order to save time for doc building\neddies_tracks.field_table()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Networks\nNetwork files use some specific fields :\n\n- track :  ID of network (ID 0 correspond to lonely eddies)\n- segment :  ID of a segment within a network (from 1 to N)\n- previous_obs : Index of the previous observation in the full dataset,\n  if -1 there are no previous observation (the segment starts)\n- next_obs : Index of the next observation in the full dataset, if -1 there are no next observation (the segment ends)\n- previous_cost : Result of the cost function (1 is a good association, 0 is bad) with previous observation\n- next_cost : Result of the cost function (1 is a good association, 0 is bad) with next observation\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "eddies_network = NetworkObservations.load_file(get_demo_path(\"network_med.nc\"))\neddies_network.field_table()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "sl = slice(70, 100)\nTable(\n    eddies_network.network(651).obs[sl][\n        [\n            \"time\",\n            \"track\",\n            \"segment\",\n            \"previous_obs\",\n            \"previous_cost\",\n            \"next_obs\",\n            \"next_cost\",\n        ]\n    ]\n)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Networks are ordered by increasing network number (`track`), then increasing segment number, then increasing time\n\n"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.10.6"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}