Build and modify the parameter file
The parameter file can be generated in whatever position in the filesystem by calling from a terminal
$ galapy-genparams [--name/-n NAME|--SFH_model/-sfh SFH_MODEL]
Where
the argument
NAME
can include any path in the filesystem (default isgalapy_hyper_parameters
). Since the generated parameter file has python syntax, we automatically append a.py
extension to the custom file name. If no argument is passed the parameter file will be generated in the current directory.the argument
SFH_MODEL
allows the user to choose a SFH model, if no choice is made, the value should be set by modifying the content of the generated parameter file.the
galapy-genparams
command can be called with the argument--help/-h
to show an help message.
The generated file should be self-explanatory and has to be modified according to the fit the user has to perform. In what follows we provide a thorough explanation of all the entries that are present in the generated file.
Import the observational data
This first block of parameters defines the photometric observation that we are going to fit the model to.
First, the observational dataset has to be defined, this is done by setting the following python iterables of values:
bands = None
fluxes = None
errors = None
uplims = None
The None
values in bands
, fluxes
and errors
are placeholders that should be modified by the user.
In particular, the sequences of values can be a numpy array
, a python list
or a tuple
, it is nevertheless necessary that they all have all the same dimensions.
bands
: a sequence of strings listing the bandpass transmission names associated to the observation, e.g.bands = ( 'GOODS.i', 'GOODS.v', 'GOODS.z', 'GOODS.b' )
fluxes
: the observed flux within the bandpass filters listed in the hyperparameterbands
(floating point values)errors
: errors associated to the transmissions listed in the hyperparameterfluxes
(floating point values)uplims
: a list of booleans (True
orFalse
) identifying which of the fluxes listed in hyperparameterfluxes
are non-detections. (note that, in case all the measurements are detections, this value can be left to its default value, i.e.None
)
Then, we define the photometric system through the two hyperparameters
filters = list(bands)
filters_custom = None
The default value of the filters
assumes that all the transmissions listed in hyperparameter bands
are present in the database.
If this is true it can be left as is.
The second parameter is optional and can be left to None
if all the filters are present in the database.
In case this is not true there are 2 possibilities: either none of the filters is present in the database, or just a subset is present.
In the first case set filters = ()
and follow the instructions in point 2 of the following enumerated list.
In the second case you will have to set both hyperparameters filters
and filters_custom
manually:
List the transmission filters already available in the database. In this case the
filters
hyperparameter should be set to a list or a tuple listing the sub-set of filters already present:filters = ( 'GOODS.i', 'GOODS.v', 'GOODS.z', 'GOODS.b' )
List the custom set of filters through the
filters_custom
hyperparameter. It has to receive a nested dictionary properly formatted. This means each element in the dictionary should have a key that defines the chosen name for the filter and an associated value that should be itself a dictionary with two keys:'wavelengths'
and'photons'
; the first key should provide an array-like listing a set of wavelengths while the second the transmissions in units of photons corresponding to each of the wavelengths in the first array:filters = { 'custom1' : { 'wavelengths' : [ 999., 1000., 1001., 1499., 1500., 1501. ], 'photons' : [ 0., 0., 1., 1., 0., 0. ] }, 'custom2' : { 'wavelengths' : [ 1999., 2000., 2001., 2499., 2500., 2501. ], 'photons' : [ 0., 0., 1., 1., 0., 0. ] } }
in the example above the photometric system will contain two filters,
'custom1'
and'custom2'
, the first will be a top-hat function in the interval \(\lambda \in (1000, 1500)\,\mathring{A}\), while the second a top-hat function in the interval \(\lambda \in (2000, 2500)\,\mathring{A}\)
These parameters will be used to build an object of type galapy.PhotometricSystem.PMS
.
Tip
To check what are the filters available in the galapy database:
from galapy.PhotometricSystem import print_filters
print_filters()
Note that the function also accepts arguments, for filtering by experiment
(e.g. print_filters('Herschel')
will only print on screen filters used
in the Herschel experiment)
The last hyperparameter to set in this block defines how to treat eventual upper-limits present in the dataset
method_uplims = 'chi2'
If all the measurements can be treated as detections, this parameter will be ignored. Otherwise the user can choose between 3 possible behaviours, which will translate in an additional term to the gaussian likelihood used to sample the parameter space. The three possible methods are:
'chi2'
(default): non-detections are treated exactly as detections with a large error;'simple'
: a simple step-wise function setting the log-likelihood to \(-\infty\) (i.e., zero probability) when the model predicts a flux larger than observed and to zero (i.e., probability equal to one) when the predicted flux is lower than the limit:\[\begin{split}f\left[\overline{S}_j, \overline{S}_j(\theta),\sigma_j\right] = \left\{ \begin{aligned} &\ \text{-}\infty & \overline{S}_j(\theta) > \overline{S}_j\\ &\\ &\ 0& \text{otherwise} \end{aligned} \right.\end{split}\]'S12'
: Sawicki (2012) proposes a modification of the \(\chi^2\) that consists of the integral of the probability of some observation up to the given proposed model. If the errors on data are Gaussian, this integral provides the following analytical expression for the corresponding log-likelihood:\[f\left[\overline{S}_j, \overline{S}_j(\theta),\sigma_j\right] = -2 \ln \left\{\sqrt{\frac{\pi}{2}} \sigma_j \left[1 + \text{erf}\left(\frac{\overline{S}_j - \overline{S}_j(\theta)}{\sqrt{2}\sigma_j}\right)\right]\right\}~.\]Even though it can be argued that using the expression above is the most formally correct way of accounting for upper limits when errors are Gaussian, the combination of logarithm and error function is particularly risky in computational terms. Specifically, it tends to hit the numerical limit of floating point numbers representation accuracy really fast, leading to undefined behaviour.
Define the physics of the galaxy model
In this block of hyperparameters the user can define the modelling strategies to adopt, both in terms of galaxy model and noise model. All the parameters in this section can be mantained to their default value, this will not affect the possibility to start sampling. Nonetheless, the user should set these values according to the needs of the examined source, as the scientific result of the sampling depends on this set of parameters.
cosmo
(default= 'Planck18'
): this parameter defines what cosmological model to adopt in computing distances and ages. It is used to convert the emitted energy into flux and to check the age of the source against the age of the Universe at the given redshift. There are several pre-computed models, including'Planck15'
,'Planck18'
,'WMAP7'
,'WMAP9'
. Nonetheless it is possible to use user-defined models by passing a dictionary of pre-computed values of luminosity distance and age as a function of redshift for the chosen cosmology:cosmo = { 'redshift' = [ ... ], # the redshift grid of chosen size N 'luminosity_distance' = [ ... ], # luminosity distance values defined on the N-size redshift grid 'age' = [ ... ] # age of the Universe values defined on the N-size redshift grid }
The values provided will be linearly interpolated within the provided redshift grid and linearly extrapolated from the extremes outside of the redshift grid.
sfh_model
(default= 'insitu'
) the star-formation history model of choice. The available parameterised models are listed below:'constant'
:\[\psi(t) = \psi_0\]where \(\psi_0\) is a constant floating-point value expressed in units of \(M_\odot/\text{yr}\) (parameter
sfh.psi
in the Free Parameters of the Galaxy model table).'delayedexp'
, a generalised version of the delayed exponential SFR:\[\psi(t)\propto \tau^{\kappa}\, \exp{(-\tau/\tau_\star)}\]where \(\tau_\star\) is the characteristic star-formation timescale (parameter
sfh.tau_star
) and \(\kappa\) (parametersfh.k_shape
) is a shape parameter for the early evolution; \(\kappa=0\) corresponds to a pure exponential, while \(\kappa=1\) to the standard delayed exponential, the model above is normalized by a factor defined in the free-parametersfh.psi_norm
.'lognormal'
:\[\psi(t)\propto \dfrac{1}{\tau}\, \dfrac{1}{\sqrt{2\pi\sigma_\star^2}}\, \exp\left[-\dfrac{\ln^2(\tau/\tau_\star)}{2\,\sigma_\star^2}\right]\]where \(\tau_\star\) (parameter
sfh.tau_star
) and \(\sigma_\star\) (parametersfh.sigma_star
) control the peak age and width. Also in this case, the above model is multiplied by a free-parametersfh.psi_norm
.'insitu'
, is our default, physically motivated model. It provides a SFR with shape:\[\psi(t)\propto e^{-x}-e^{-s\gamma\, x}\]
where \(x\equiv\tau/s\,\tau_\star\) with \(s \approx 3\) a parameter related to gas condensation, while \(\gamma\) is a parameter including gas dilution, recycling and the strength of stellar feedback. Its value is given by \(\gamma \equiv 1 + \mathcal{R} + \epsilon_\text{out}\), where \(\mathcal{R}\) is the recycled gas fraction and \(\epsilon_\text{out}\approx 3[\psi_\text{max}/M_\odot\text{yr}^{-1}]^{-0.3}\) is the mass loading factor of the outflows from stellar feedback. Therefore, the parameter \(\gamma\) is completely determined in terms of the free parameter \(\psi_\text{max}\) and, eventually, by the age of the galaxy \(\tau\). The free parameters in this model are
sfh.psi_max
, entering the relation for \(\epsilon_\text{out}\) and normalising the model, andsfh.tau_star
.'interpolated'
: the user can provide a grid of values for the age and SFR at each time, the SFH will then be interpolated over said grid (not for sampling, see the API documentation).
ssp_lib
(default= 'parsec22.NT'
) defines which Simple Stellar Population table to use. There are several available a complete list can be printed on screen by callingfrom galapy.CompositeStellarPopulation import print_ssp_libs print_ssp_libs()
A thorough description of the differences between the several SSP libraries can be found in the presentation paper or in Choose the Simple Stellar Population library.
do_Xray
(default= False
) by setting this boolean toTrue
the SED will be extended down to wavelength \(\lambda = 1\,\mathring{A}\). The choice affects both the stellar continuum (by adding the X-Ray emission due to stars in binary systems) and the eventual AGN emission (ifdo_AGN = True
).do_Radio
(default= False
) this parameter controls the eventual addition of Radio emission due to the stellar continuum. In particularSuper-Nova Synchrotron
Nebular Free-Free emission
The system authomatically accounts for the SSP library chosen, as libraries of the
parsec22.NT
family already include synchrotron and libraries of theparsec22.NTL
family include both synchrotron and nebular emission (free-free and lines, see Choose the Simple Stellar Population library for further details).Note
If a library from the
parsec22.NTL
family is chosen, settingdo_Radio = True
does not have any effect as all the sources of stellar Radio continuum are already included on top of the SSPs. This also means that the radio contribute will be accounted for in the energy balance computation (see presentation paper for details).do_AGN
(default= False
), if this parameter is set toTrue
, templated emission from the AGN will be added to the final SED model. The templates we use are those produced by Fritz et al., 2006 . Further description on the templated models and of the possible combinations of parameters can be found in the documentation page of thegalapy.ActiveGalacticNucleus
module. Note that, while the choice of template is defined on a discrete distribution of parameters, its contribute to the overall emission is given in terms of some reference emission, through the free parameterfAGN
. In particular, by default we use the total InfraRed luminosity due to the non-AGN component of the emission as a reference.lstep
(default= None
) this parameter regulates the eventual sub-sampling to be applied to the wavelength grid, it acceptsan integer number: consider a wavelength grid point every
lstep
points;a sequence of integers, listing the indices of the elements of the wavelength grid to take into account;
a boolean mask with the same size of the wavelength grid, where
False
values are masked elements.
Warning
Since the time required for computing models is tightly related to the resolution of the wavelength grid, this hyper-parameter allows (by sub-sampling the default grid) to speed up the computation. The wavelength grid is defined by the SSP library chosen, sub-sampling it means to have a lower resolution on all the quantities derived from the fit. In particular, it will directly result in an under-estimation of the energy balanced between attenuated and re-radiated flux. The safest choice is to avoid any sub-sampling.
noise_model
(default= None
), choose a model for treatment of noise. Currently available models are:'calibration_error'
: adds a systematic unknown source of error to the measurement depending on a single additional nuissance parameterf_sys
, modifying the measured uncertainties as \(\tilde{\sigma}_i^2(\theta, f_\text{sys}) = \sigma_i^2 + f_\text{sys}^2 \bar{S}_i^2(\theta)\), consistently, the log-likelihood takes on the form\[\ln\mathcal{L}(\bar{S}| \theta, f_\text{sys} ) \equiv - \dfrac{1}{2}\sum_i\biggl\{\dfrac{[\bar{S}_i - \bar{S}_i(\theta)]^2}{\tilde{\sigma}_i^2(\theta, f_\text{sys})} + \ln\bigl[2\pi \tilde{\sigma}_i^2(\theta, f_\text{sys})\bigr]\biggr\}\]where \(\bar{S}_i\) is the i-th measured flux and \(\bar{S}_i(\theta)\) the model prediction for the i-th measured flux (see presentation paper for further details, some information on this can also be found in the emcee online documentation).
None
: no-particular action is performed to model eventual systematics in the error measurements, it is assumed that all the sources of uncertainty have been already accounted for when estimating errors on the observed fluxes. (note that this might not be the safest choice, especially when etherogeneous datasets are being considered).
noise_kwargs
(default= {}
) eventual additional arguments to be passed to the noise constructor class (see documentation of thegalapy.Noise
module.
Choose the fixed and free parameters
Define the chosen parameterisation of the galaxy and noise models in this block.
The user has to define the content of two python dictionaries: galaxy_parameters
and noise_parameters
.
A complete list of the available tunable parameters is provided in Free Parameters of the Galaxy model.
The logic used is the following:
if a parameter does not appear in the dictionary as a
key : value
pair, it will be set to its default value (see last column of the table in Free Parameters of the Galaxy model).if a parameter appears in the dictionary there are two possibilities:
fix the parameter to a value
float_value
by passing akey : float_value
pair, where thefloat_value
is some floating point number. This choice will not add a dimension to the sampled parameter space but will change the default value of the target parameter.Set a free parameter to vary within some uninformative uniform prior whose limits are user defined. This is achieve by passing a tuple as value of the
key : value
pair. In particular the tuple will be something like( [lower_limit, upper_limit], boolean )
, wherelower_limit
andupper_limit
are the lower and upper limits of the uniform uninformative prior, respectively, while theboolean
states whether the given free-parameter has to be considered logarithmic or linear.a logarithmic free-parameter, e.g. the age of the galaxy, will look something like:
'age' : ( [6., 11.], True )
, meaning that samples will be drawn from the interval \(10^6\ \text{years} < \text{age} < 10^{11}\ \text{years}\).a linear free-parameter, e.g. the redshift of the galaxy, will look something like:
'redshift' : ( [0., 10.], False )
, meaning that samples will be drawn from the interval \(0 < z < 10\).
Each free-parameter adds a dimension to the parameter space.
Note
The default parameter file is built with all the available tunable parameters already inserted in the dictionary, all of them set as free parameters with a large prior.
We also made a priori some of these parameters logarithmic, as sampling in the logarithmic space for some of these of parameters is most of the times (but not always!)
recommended, when the prior volume span different orders of magnitude. Nevertheless we strongly recommend the user to tune the size of the prior and the number of free
and fixed parameters to taylor the specific use-case.
Furthermore, when reading the parameter file, the system will authomatically ignore the free parameters that have been included in the two dictionaries but are not used
in the chose model (i.e. if do_AGN = False
all the AGN-parameters will be ignored, even though included in the galaxy_parameters
dictionary).
Nevertheless, leaving them in the dictionary will trigger a warning that will be print on screen when the fitting command is called (i.e. galapy-fit parameter_file.py
).
In order not to pollute the STDERR, we recommend to erase from the dictionary all the parameters that are not used in the chosen galaxy model.
The reason why we have chosen this strategy over a silently ignoring the redundant parameters, is to raise attention in the user in eventual human-driven-mistakes in the
choice of parameters.
Another relevant case in which more than the necessary parameters are included by default when the parameter file is generated concerns the SFH model.
Different SFH models provide different parameterisations and when the command galapy-genparams
is called without the --SFH_model/-sfh SFH_MODEL
option, all
the possible parameterisations from all the available models are included in the galaxy_parameters
dictionary. If the unnecessary parameters are not removed from the dictionary,
all the parameters that are not used in the SFH model chosen will raise a Warning message.
By calling, e.g.,
$ galapy-genparams -sfh delayedexp
the parameter file will be generated with sfh_model = 'delayedexp'
(see point 2 of Define the physics of the galaxy model) and only the parameters relevant to the Delayed Exponential SFH model
will be present in the galaxy_parameters
dictionary.
Sampling and output format choices
This last section allows to choose a sampler, regulates its behaviour and the choices for the output format. The parameters to set are the following:
sampler
(default= 'emcee'
) choose betweenEmcee sampler provides an implementation of the Markov-Chain Monte Carlo (MCMC) technique. Specifically, it implements an ensemble sampler with affine invariance Goodman & Weare, 2010 that, by instantiating many test particles (walkers) in the parameter space, builds first order Markov sequences of proposals that are tested against the likelihood. The dynamics of this system of particles is regulated by the requirement that, at each new step, a better estimate of the parameters is drawn.
Dynesty sampler implements Dynamic Nested Sampling (Higson et al., 2019), a generalised version of nested sampling (Skilling, 2004) where the number of test particles (here live points) is dynamically increased in regions of the posterior where a higher accuracy is required. The parameter space is modelled as a nested set of iso-likelihood regions that are sampled until the overall evidence reaches a stopping criterion set by the user. In our default hyper-parameters set-up we provide an 80%/20% posterior/evidence split and we model the posterior space with multiple ellipsoids, as we expect to have multiple peaks and correlations when sampling high dimensional parameter spaces. We use the default stopping function
\[\mathcal{S}(f_p, s_p, s_{\mathcal{Z}}, n) \equiv f_p \times \frac{\mathcal{S}_p(n)}{s_p} + (1 - f_p) \times \frac{\mathcal{S}_\mathcal{Z}(n)}{s_{\mathcal{Z}}} < 1\]where \(f_p\) is the fractional importance we place on posterior estimation (20%, as mentioned above), \(\mathcal{S}_p\) is the posterior stopping function, \(\mathcal{S}_\mathcal{Z}\) is the evidence stopping function, \(s_p\) is the posterior “error threshold”, \(s_\mathcal{Z}\) is the evidence error threshold, and \(n\) is the total number of Monte Carlo realisations, used to generate the posterior/evidence stopping values.
When sampling high-dimensional large volumes the degeneracy between parameters can easily generate a complex posterior topology, such as multiple peaks on some parameters or non-linear correlations. Our suggestion for an optimal usage is to rely on dynamic nested sampling in this case. As an empirical rule of thumb, we can recommend to rely on nested sampling when the number of free parameters is larger than \(~5\) and when it is not necessary to include extremely complex priors (as this, even though feasible, is not trivial in nested sampling).
On the other hand, MCMC provides a more straightforward interface to the inclusion of sophisticated priors and proves to be efficient and to possibly converge faster when working on smaller and well-behaved volumes, i.e. when multiple peaks and complex correlations among parameters are not to be expected.
nwalkers
(default= 64
) andnsamples
(default= 4096
). These two hyperparameters are used if and only if the emcee sampler is chosen (see previous point). They regulate the number of walkers that will be used and the total number of samples that will be extracted per each walker.Note
The total number of samples extracted with the default values will be
nwalkers
\(\times\)nsamples
\(= 2^{18}\), corresponding to an expected total execution time of about \(\approx 15\div20\) minutes. Even though, in MCMC sampling with emcee using our default hyperparameters, the convergence is not guaranteed, this number of samples should be enough for most of well-behaved posteriors. More sophisticated convergence criteria to stop the sampler can be included by modifying thesampler_kw
andsampling_kw
hyperparameters.Tip
The number of walkers (
nwalkers
) should always be \(\ge 4\times N_\text{dim}\), where \(N_\text{dim}\) is the number of free-parameters.sampler_kw
(default= {}
) additional keyword arguments to be passed to the constructor of the chosen sampler.for emcee: see documentation of the Ensamble Sampler. The default values we have set internally (see
galapy.sampling.Sampler
) areemcee_default_sampler_kw = { 'moves' : None, 'args' : None, 'kwargs' : None }
for dynesty: see documentation of the Dynamic Nested Sampler. The default values we have set internally (see
galapy.sampling.Sampler
) aredynesty_default_sampler_kw = { 'bound' : 'multi', 'sample' : 'rwalk', 'update_interval' : 0.6, 'walks' : 25, 'bootstrap' : 0 }
sampling_kw
(default= {}
) additional keyword arguments to be passed to the function running the sampling with the chosen sampler.for emcee: see documentation of the
run_mcmc
method of the Ensamble Sampler. The default values we have set internally (seegalapy.sampling.Sampler
) areemcee_default_sampling_kw = { 'log_prob0' : None, 'rstate0' : None, 'blobs0' : None, 'tune' : False, 'skip_initial_state_check' : False, 'thin_by' : 1, 'thin' : None, 'store' : True, 'progress' : True, 'progress_kwargs' : None }
for dynesty: see documentation of the
run_nested
method of the Dynamic Nested Sampler. The default values we have set internally (seegalapy.sampling.Sampler
) aredynesty_default_sampling_kw = { 'nlive_init' : 1024, 'maxiter_init' : None, 'maxcall_init' : None, 'dlogz_init' : 0.02, 'nlive_batch' : 1024, 'maxiter_batch' : None, 'maxcall_batch' : None, 'maxiter' : sys.maxsize, 'maxcall' : sys.maxsize, 'maxbatch' : sys.maxsize, 'n_effective' : None, 'print_progress' : True, 'stop_kwargs' : { 'target_n_effective' : 10000 } }
output_directory
(default= ''
) choose the position in the filesystem where to store the results. The default empty string (''
) means “store in current location”, results will be saved in the directory from which thegalapy-fit
command has been called.run_id
(default= ''
) choose a common name to associate to all the output files. The empty string will trigger the usage of a date-time string with format'YYYYMMDDhhmm'
whereYYYY
= four digits for the year (we are scheptical that our tool will still be used after year 9999 A.D.),MM
two digits for the month,DD
two digits for the day,hh
two digits for the hour (in 24h-format),mm
two digits for the minutes.store_method
(default= 'hdf5'
) two possible output formats are currently available in GalaPy:'pickle'
the standard Python serialisation protocol. Results objects (see documentation of thegalapy.sampling.Results
module) are computed at the end of a sampling run then serialised and stored in non-volatile memory. The typical size of the output file can reach up to \(\sim 1\) GB.'hdf5'
the Hierarchical Data Format (see, e.g., Folk et al., 2010), a widespread method for storing heterogeneous data.
Tip
For almost all use-cases, the HDF5 format is a better choice than pickle as it is safer to distribute and backward/forward compatibility is guaranteed. Pickle should be used only for internal usage.
store_lightweight
(default= False
) boolean available only if the HDF5 output format is chosen.True
: store only samples coordinates, likelihood values and weights along with minimal additional information to re-build the models used in the sampling (typical size \(\sim 10\) MB);False
: along with the information available also stored with this option set asTrue
option, all the additional derived quantities computed when building theResults
object are stored (typical size up to \(\sim 1\) GB, see documentation of thegalapy.sampling.Results
module).
Note
The amount of information stored by choosing
store_method = 'pickle'
is equivalent to the combinationstore_method = 'hdf5'
andstore_lightweight = False
)pickle_raw
(default= False
) whether to pickle the sampler raw results, no analysis on the outputs is done, these data are not sufficient to reproduce the models used. (might be useful for analyzing the run statistics).pickle_sampler
(default= False
) whether to pickle the sampler at the end-of-run state. It is necessary to set this hyperparameter toTrue
if the user wants to restart the run.