{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "d1cd6238",
   "metadata": {},
   "source": [
    "# Custom derived quantities\n",
    "\n",
    "When you fit an SED with `galapy-fit`, the sampler explores the posterior of the\n",
    "*free model parameters* (ages, star-formation parameters, ISM parameters, ...).\n",
    "On their own those numbers are rarely what you want to report. What you usually\n",
    "report are **derived quantities**; i.e. physical numbers computed *from* the model at\n",
    "each posterior sample: a stellar mass, a star-formation rate, a dust temperature,\n",
    "a model SED.\n",
    "\n",
    "`galapy` computes a set of these for you automatically and stores their full\n",
    "posterior (one value per sample) inside the `Results` object. This guide shows\n",
    "you how to\n",
    "\n",
    "1. choose *which* built-in quantities are stored when you run a fit,\n",
    "2. add *your own* quantities to an existing `Results` object, and\n",
    "3. summarise any of them (mean, median, credible intervals).\n",
    "\n",
    "It builds up in four levels — feel free to stop at the one that covers your use\n",
    "case. The cookbook at the end collects ready-to-paste recipes."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "89334d89",
   "metadata": {},
   "source": [
    "## The built-in quantities\n",
    "\n",
    "Every `Results` object stores the posterior of these nine quantities by default:\n",
    "\n",
    "| key     | quantity                       | units                              | shape       |\n",
    "|---------|--------------------------------|------------------------------------|-------------|\n",
    "| `SED`   | spectral energy distribution   | flux, mJy                          | array       |\n",
    "| `Mstar` | stellar mass                   | $M_\\odot$                          | scalar      |\n",
    "| `Mdust` | dust mass                      | $M_\\odot$                          | scalar      |\n",
    "| `Mgas`  | gas mass                       | $M_\\odot$                          | scalar      |\n",
    "| `Zstar` | stellar metallicity            | absolute (metal mass fraction)     | scalar      |\n",
    "| `Zgas`  | gas metallicity                | absolute (metal mass fraction)     | scalar      |\n",
    "| `SFR`   | star-formation rate            | $M_\\odot$ / yr                     | scalar      |\n",
    "| `TMC`   | molecular-cloud temperature    | K                                  | scalar      |\n",
    "| `TDD`   | diffuse-dust temperature       | K                                  | scalar      |\n",
    "\n",
    "Each is exposed as an attribute holding the posterior: a `(Nsamples,)` array for\n",
    "the scalar quantities and a `(Nsamples, Nwavelength)` array for the `SED`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "131ecef7",
   "metadata": {},
   "outputs": [],
   "source": [
    "from galapy.sampling.Results import load_results\n",
    "\n",
    "# load a results file written by galapy-fit (HDF5 or pickle); the format is\n",
    "# inferred from the extension\n",
    "res = load_results( 'data/sampling_4D+noise_dynesty_results_light.galapy.hdf5' )\n",
    "\n",
    "res.Mstar.shape   # -> (Nsamples,)\n",
    "res.SED.shape     # -> (Nsamples, Nwl)\n",
    "res.get_stored_quantities()   # everything currently stored on this object"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2df85f08",
   "metadata": {},
   "source": [
    "## Level 1 — choosing which built-ins to store (no code)\n",
    "\n",
    "Computing and storing every quantity for every posterior sample takes time and\n",
    "disk space (the `SED` in particular is a full array per sample). If you only care\n",
    "about a few numbers, tell `galapy-fit` to store just those, through the\n",
    "`store_quantities` hyperparameter in your parameter file:\n",
    "\n",
    "```python\n",
    "# parameter file (galapy-genparams)\n",
    "store_quantities = ['Mstar', 'SFR', 'Mdust']\n",
    "```\n",
    "\n",
    "- `None` (the default) stores **all** nine quantities.\n",
    "- A shorter list makes the output file smaller and post-processing faster.\n",
    "- `'SED'` is **always** stored, whether or not you list it — listing it just\n",
    "  prints a harmless warning and is ignored. (The SED is needed internally to\n",
    "  validate each sample, so you always get it for free.)\n",
    "\n",
    "That is all most users ever need. The remaining levels are for when you want a\n",
    "quantity that is **not** in the built-in list."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9a101008",
   "metadata": {},
   "source": [
    "## Level 2 — adding one custom quantity\n",
    "\n",
    "Anything you can compute from a fully-configured `galapy` model, you can store as\n",
    "a derived quantity, *after* the run, with `Results.add_property`.\n",
    "\n",
    "You pass a function that takes **one argument, the model**, with the parameters of\n",
    "the current sample already set, and returns the value you want for that sample:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "42fa1835",
   "metadata": {},
   "outputs": [],
   "source": [
    "# bolometric-ish luminosity: integrate the model SED\n",
    "res.add_property( lambda model : model.get_SED().sum(), name = 'Lsed' )\n",
    "\n",
    "res.Lsed.shape   # -> (Nsamples,)  one value per posterior sample"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d3af06bb",
   "metadata": {},
   "source": [
    "`add_property` evaluates your function on every posterior sample exactly the way\n",
    "the built-ins are evaluated, stores the result as a new attribute (`res.Lsed`\n",
    "here), and returns the list of names it added:\n",
    "\n",
    "```python\n",
    "res.add_property( lambda model : model.age, name = 'age' )\n",
    "# -> ['age']\n",
    "```\n",
    "\n",
    "What can the `model` give you? Anything the model exposes, for example:\n",
    "\n",
    "```python\n",
    "model.age                       # galaxy age [yr]\n",
    "model.sfh.Mstar( model.age )    # stellar mass at that age\n",
    "model.sfh( model.age )          # SFR at that age\n",
    "model.ism.mc.T                  # molecular-cloud temperature\n",
    "model.get_SED()                 # the model SED [mJy]\n",
    "model.get_emission()            # rest-frame luminosity [Lsun]\n",
    "```\n",
    "\n",
    "If you omit `name` for a single function, it is auto-named `custom0`, `custom1`,\n",
    "... — handy for quick exploration, but pass an explicit `name` for anything you\n",
    "intend to keep."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "08f6587d",
   "metadata": {},
   "source": [
    "> **⚠️ The operation of adding a new property to the list can take up to several 10s of seconds.**\n",
    "> This is a consequence of the necessity to re-build the model for each sample in the parameter space.\n",
    "> Since this additional cost is payed every time you call `add_property`, in the case you need\n",
    "> to store more than 1 additional quantity, it is desirable to list all of them in a single call of\n",
    "> `add_property`, as explained in the following section."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "84a64f23",
   "metadata": {},
   "source": [
    "## Level 3 — several quantities at once, and summarising them\n",
    "\n",
    "To add more than one quantity in a single pass over the posterior, hand\n",
    "`add_property` a dictionary of `{ name : function }`. The posterior is walked\n",
    "once and all the functions are evaluated per sample. **Way cheaper than calling\n",
    "`add_property` repeatedly:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6b5616b1",
   "metadata": {},
   "outputs": [],
   "source": [
    "res.add_property( {\n",
    "    'Lsed'   : lambda model : model.get_SED().sum(),\n",
    "    'fdust'   : lambda model : model.sfh.Mdust( model.age ) / model.sfh.Mstar( model.age ),\n",
    "} )\n",
    "# -> ['Lsed', 'fdust']"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "213173c0",
   "metadata": {},
   "source": [
    "Once stored, a custom quantity behaves *exactly* like a built-in: every summary\n",
    "helper accepts its name."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "12b445a9",
   "metadata": {},
   "outputs": [],
   "source": [
    "res.get_mean( 'fdust' )                          # posterior weighted mean"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5a8c78d1",
   "metadata": {},
   "outputs": [],
   "source": [
    "res.get_median( 'fdust' )                         # posterior median"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8f020cc6",
   "metadata": {},
   "outputs": [],
   "source": [
    "res.get_quantile( 'fdust', quantile = 0.84 )      # any weighted quantile"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "603c5334",
   "metadata": {},
   "outputs": [],
   "source": [
    "res.get_std( 'fdust' )                            # weighted standard deviation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5f3b5499",
   "metadata": {},
   "outputs": [],
   "source": [
    "res.get_bestfit( 'fdust' )                        # value at the max-likelihood sample"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "57565fba",
   "metadata": {},
   "outputs": [],
   "source": [
    "res.get_credible_interval( 'fdust', percent = 0.68 )   # central 68% interval"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dc190ff2",
   "metadata": {},
   "source": [
    "These helpers are *posterior-weighted* (they use the sampling weights) and they\n",
    "automatically **exclude invalid samples** (see the sentinel note below), so you\n",
    "do not have to filter anything by hand."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "385abf00",
   "metadata": {},
   "source": [
    "## Level 4 — array-valued quantities, and a numerical pitfall\n",
    "\n",
    "A derived quantity can be an array, not just a scalar — the `SED` is the built-in\n",
    "example. The classic user case is the **average attenuation curve** as a function\n",
    "of wavelength.\n",
    "\n",
    "It is tempting to store the attenuation in magnitudes via the model's\n",
    "`get_avgAtt()`. **Do not** — and this is the single most common trap, so it gets\n",
    "its own box:\n",
    "\n",
    "> **⚠️ Average in linear space, not in magnitudes / logs.**\n",
    "> Magnitude attenuation is `-2.5 * log10( attenuated / unattenuated )`. When a\n",
    "> sample attenuates a band to (near) zero flux that is `+inf`; round-off can even\n",
    "> make it `nan`. Those non-finite values are *correct* per sample, but they\n",
    "> poison any posterior mean. **Store the linear quantity and convert at the very\n",
    "> end.** For attenuation, `galapy` already keeps the linear average attenuation\n",
    "> factor (in `[0, 1]`) on the model as `model.Aavg` after every parameter update:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aba5651f",
   "metadata": {},
   "outputs": [],
   "source": [
    "res.add_property( lambda model : model.Aavg, name = 'Aavg' )\n",
    "\n",
    "res.Aavg.shape                  # -> (Nsamples, Nwl), finite by construction\n",
    "mean_Aavg = res.get_mean( 'Aavg' )      # safe: linear, finite\n",
    "\n",
    "# convert the *summary* to magnitudes only at the end, if you want them:\n",
    "import numpy as np\n",
    "mean_att_mag = -2.5 * np.log10( mean_Aavg )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f6ed67d",
   "metadata": {},
   "source": [
    "The same principle applies to any quantity you would normally view in log space\n",
    "(luminosities spanning orders of magnitudes, metallicities, ...): **store it linear, average,\n",
    "then take the log of the summary.**"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0375892a",
   "metadata": {},
   "source": [
    "### Saving the newly computed derived properties\n",
    "\n",
    "Custom quantities are first-class: they are serialised together with the\n",
    "built-ins, so you only pay the per-sample evaluation cost once. After adding\n",
    "properties you can re-save the `Results` object and reload it later.\n",
    "\n",
    "The lowest-friction option is to pickle the whole object:\n",
    "\n",
    "```python\n",
    "import pickle\n",
    "res.add_property( lambda model : model.Aavg, name = 'Aavg' )\n",
    "\n",
    "with open( 'myfit_with_Aavg.pickle', 'wb' ) as f :\n",
    "    pickle.dump( res, f )\n",
    "\n",
    "from galapy.sampling.Results import load_results\n",
    "later = load_results( 'myfit_with_Aavg.pickle' )   # format inferred from extension\n",
    "later.get_mean( 'Aavg' )        # the custom quantity is right there\n",
    "```\n",
    "\n",
    "To re-save as HDF5 instead (portable, the format `galapy-fit` uses), dump the\n",
    "object to a dictionary and write it with the galapy IO helper:\n",
    "\n",
    "```python\n",
    "import galapy\n",
    "from galapy.io.hdf5 import write_to_hdf5\n",
    "\n",
    "write_to_hdf5(\n",
    "    'myfit_with_Aavg.galapy.hdf5',\n",
    "    metadata = dict( storage_method = 'hard',\n",
    "                     galapy_version = galapy.__version__ ),\n",
    "    hard     = True,\n",
    "    results  = res.dump(),\n",
    ")\n",
    "later = load_results( 'myfit_with_Aavg.galapy.hdf5' )\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e30a1373",
   "metadata": {},
   "source": [
    "(Under the hood the in-memory round-trip is `Results.load( res.dump() )`:\n",
    "`dump()` returns a plain dictionary and `Results.load()` rebuilds the object from\n",
    "it — `load_results` is just the file-level wrapper around that.)\n",
    "\n",
    "> **ℹ️ Invalid samples are flagged, not dropped.**\n",
    "> If a sample's parameters are unphysical (e.g. its SED is non-finite), every\n",
    "> derived quantity for that sample is stored as `-inf` rather than silently\n",
    "> skipped, so all the per-sample arrays keep the same length and stay aligned\n",
    "> with the chain. The summary helpers exclude these sentinels automatically;\n",
    "> if you slice the raw arrays yourself, mask them with `numpy.isfinite` first."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d30db420",
   "metadata": {},
   "source": [
    "## Cookbook\n",
    "\n",
    "Short, paste-ready recipes. Each assumes a loaded `Results` object `res`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df26f236",
   "metadata": {},
   "source": [
    "**Specific star-formation rate (sSFR):**\n",
    "```python\n",
    "res.add_property(\n",
    "    lambda model : model.sfh( model.age ) - model.sfh.Mstar( model.age ),\n",
    "    name = 'sSFR' )\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "346711a4",
   "metadata": {},
   "source": [
    "**Dust-to-stellar mass ratio:**\n",
    "```python\n",
    "res.add_property(\n",
    "    lambda model : model.sfh.Mdust( model.age ) - model.sfh.Mstar( model.age ),\n",
    "    name = 'fdust' )\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "71bb3985",
   "metadata": {},
   "source": [
    "**Monochromatic luminosity at a chosen rest-frame wavelength** (e.g. 1500 Å):\n",
    "```python\n",
    "import numpy as np\n",
    "def L1500 ( model ) :\n",
    "    wl = model.wl()                       # rest-frame wavelength grid [Angstrom]\n",
    "    return np.interp( 1500., wl, model.get_emission() )\n",
    "res.add_property( L1500, name = 'L1500' )\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b76cc6c0",
   "metadata": {},
   "source": [
    "**A rest-frame colour from the model SED** (flux ratio in two windows):\n",
    "```python\n",
    "import numpy as np\n",
    "def colour ( model ) :\n",
    "    wl, sed = model.wl(), model.get_SED()\n",
    "    blue = sed[ ( wl > 4000. ) & ( wl < 5000. ) ].mean()\n",
    "    red  = sed[ ( wl > 6000. ) & ( wl < 7000. ) ].mean()\n",
    "    return -2.5 * np.log10( blue / red )      # scalar -> magnitudes is fine here\n",
    "res.add_property( colour, name = 'blue_red' )\n",
    "```\n",
    "(Here the per-sample value is finite, so magnitudes are safe — contrast with the\n",
    "average-attenuation case, where it is the *averaging* across samples that breaks.)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "60c4b1b7",
   "metadata": {},
   "source": [
    "**Linear average attenuation curve** (then convert the summary):\n",
    "```python\n",
    "res.add_property( lambda model : model.Aavg, name = 'Aavg' )\n",
    "att_mag_median = -2.5 * np.log10( res.get_median( 'Aavg' ) )\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "591d780d",
   "metadata": {},
   "source": [
    "**Several quantities in one pass:**\n",
    "```python\n",
    "res.add_property( {\n",
    "    'sSFR'  : lambda model : model.sfh( model.age ) / model.sfh.Mstar( model.age ),\n",
    "    'fdust' : lambda model : model.sfh.Mdust( model.age ) / model.sfh.Mstar( model.age ),\n",
    "    'Lsed'  : lambda model : model.get_SED().sum(),\n",
    "} )\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "de1343bb",
   "metadata": {},
   "source": [
    "## Plotting\n",
    "\n",
    "The posteriors of derived properties added via `add_property` can be plotted through the `corner_derived` helper function along with the default derived properties:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "288d7c8e",
   "metadata": {},
   "outputs": [],
   "source": [
    "from galapy.analysis.plot import corner_derived\n",
    "\n",
    "fig, axes = corner_derived(\n",
    "    res, which_keys = [\n",
    "        'SFR', 'TMC', 'TDD', # <-- default\n",
    "        'fdust', 'Lsed'      # <-- custom\n",
    "    ],\n",
    "    getdist_settings={\n",
    "        'contours' : [0.68, 0.95],\n",
    "        'smooth_scale_2D' : 0.5,\n",
    "        'fine_bins': 1024,\n",
    "        'fine_bins_2D': 1024,\n",
    "    },\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6a7eba83",
   "metadata": {},
   "source": [
    "Since a user-defined property does not have built-in metadata, `corner_derived` falls back to showing the quantity's *name* as the axis label (rendered as, e.g., `Lsed` or `fdust`, above).\n",
    "Built-in quantities, by contrast, get a nice LaTeX symbol ($\\psi$, $T_\\mathrm{MC}$, $T_\\mathrm{DD}$ above).\n",
    "\n",
    "Label and log-axis choices for the derived quantities live entirely on the **analysis-side**: they are *not* stored. \n",
    "You can supply them at plotting time through keyword arguments:\n",
    "\n",
    "- `labels`: a `{key: latex}` dictionary giving a LaTeX symbol for a quantity (raw LaTeX, no surrounding `$` are required);\n",
    "- `log_scale`: a list of keys to draw on a `log10` axis.\n",
    "\n",
    "For example, having added `Lsed` and `fdust`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a02f1306",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, axes = corner_derived(\n",
    "    res, which_keys = [\n",
    "        'SFR', 'TMC', 'TDD', # <-- default\n",
    "        'fdust', 'Lsed'      # <-- custom\n",
    "    ],\n",
    "    labels = {\n",
    "        'Lsed' : 'L_\\mathrm{SED}',  # proper symbol instead of Lsed\n",
    "        'fdust' : 'f_\\mathrm{dust}' # proper symbol instead of fdust\n",
    "    },\n",
    "    log_scale = ['SFR', 'fdust'],    # fdust on log-axis too\n",
    "    getdist_settings={\n",
    "        'contours' : [0.68, 0.95],\n",
    "        'smooth_scale_2D' : 0.5,\n",
    "        'fine_bins': 1024,\n",
    "        'fine_bins_2D': 1024,\n",
    "    },\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16e6a008",
   "metadata": {},
   "source": [
    "The `labels` mapping also overrides the default label of a built-in quantity if you list it, and the `\\log_{10}(...)` wrapping is applied automatically on top of a label whenever the corresponding key also appears in the `log_scale` list.\n",
    "\n",
    "Since these are purely cosmetics the same `Results` object can be re-plotted with different labels without changing anything stored in it."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14bdf058",
   "metadata": {},
   "source": [
    "## Recap\n",
    "\n",
    "- **Storing built-ins:** set `store_quantities` in the parameter file; `SED` is\n",
    "  always kept.\n",
    "- **Adding your own:** `res.add_property( f, name = ... )`, where `f(model)`\n",
    "  returns the per-sample value; pass a dict to add several at once.\n",
    "- **Summarising:** `get_mean`, `get_median`, `get_quantile`, `get_std`,\n",
    "  `get_bestfit`, `get_credible_interval` — all accept custom names and are\n",
    "  posterior-weighted.\n",
    "- **Two rules that save you:** return the *same shape* for every sample, and\n",
    "  *average linear quantities in linear space* (convert to log/mag only at the\n",
    "  end)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}