.. _myxclassfit-short:
myXCLASSFit
===========
**Fitting single spectra**
The **myXCLASSFit** function fits multiple frequency
ranges in multiple files from multiple telescopes simultaneously. It can
be combined with different optimization algorithms to optimize the input
parameters defined in a molfit file until the model gives a good description of
the observational data. Details of the **myXCLASSFit** function are described in
Sect. ":ref:`api-myxclassfit`".
The **myXCLASSFit** function requires at least the following three input files:
- molfit file, see Sect. ":ref:`myxclass-molfit`", to define which molecules
are taken into account and the corresponding components,
- observational xml file, see Sect. ":ref:`myxclassfit-obs-xml-file`", to control
the import of the observational data,
- algorithm xml file, see Sect. ":ref:`myxclassfit-alg-xml-file`", to specify
the optimization algorithm.
In addition to these three input files, the *iso ratio file* for describing
the relationships between molecules and their isotopologues, see
Sect. ":ref:`myxclass-iso`", may also be required.
Example call of the **myXCLASSFit** function:
::
>>> from xclass import task_myXCLASSFit
>>> import os
# get path of current directory
>>> LocalPath = os.getcwd() + "/"
# set path and name of molfit file
>>> MolfitsFileName = LocalPath + "files/my_molecules.molfit"
# set path and name of observational (obs.) xml file
>>> ObsXMLFileName = LocalPath + "files/my_observation.xml"
# set path and name of algorithm (alg.) xml file
>>> AlgorithmXMLFileName = LocalPath + "files/my_algorithm__trf.xml"
# set optimization package
>>> Optimizer = "scipy"
# define that results of myXCLASSFit function are returned as a dictionary
>>> DictOutFlag = True
# call myXCLASSFit function
>>> OutDict = task_myXCLASSFit.myXCLASSFitCore( \
MolfitsFileName = MolfitsFileName, \
ObsXMLFileName = ObsXMLFileName, \
AlgorithmXMLFileName = AlgorithmXMLFileName, \
Optimizer = Optimizer, \
DictOutFlag = DictOutFlag)
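
Before starting a fit it can help to confirm that the three required input
files actually exist. The following is a minimal sketch using only the Python
standard library; the helper name ``check_input_files`` is hypothetical and not
part of *XCLASS*, and the file names simply mirror the example call above.

```python
import os


def check_input_files(paths):
    """Return the subset of the given paths that do not point to an existing file.

    Hypothetical helper for illustration only; it is not part of XCLASS.
    """
    return [path for path in paths if not os.path.isfile(path)]


# file names mirror the example call above
local_path = os.getcwd()
required_files = [
    os.path.join(local_path, "files", "my_molecules.molfit"),
    os.path.join(local_path, "files", "my_observation.xml"),
    os.path.join(local_path, "files", "my_algorithm__trf.xml"),
]

# report every missing file at once instead of failing on the first one
missing = check_input_files(required_files)
if missing:
    print("Missing input files:")
    for path in missing:
        print("  " + path)
else:
    print("All input files found.")
```

Collecting all missing paths in one list lets the user correct every path in a
single pass before the (potentially long) fit is started.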
.. _myxclassfit-obs-xml-file:
Observational xml file
----------------------
The *observational xml file* (obs. xml file) is used to describe the import
of the observational data within an xml file. Here, each parameter is
defined by so-called tags, i.e. ``TagValue``.
- The obs. xml file has to start with
::
- All tags in the obs. xml file have to be enclosed by the ```` tag,
i.e.
::
..
- The tag ```` defines the number of observational data files
and has to be :math:`>=1`. All parameters for each observational data file
have to be enclosed by the ```` tag, e.g.
::
2
..
..
- The tag ```` defines the path and name of the corresponding
observational data file. |br| |br|
- For each observational data file the tag ```` describes the
type of the observational data file. For ASCII files the tag has to be set to
``xclassASCII``, for FITS files to ``xclassFITS``, respectively. |br| |br|
- The number of ranges ```` must be defined for each file and
must be :math:`>=1`. The tag must also be set if the whole file is used.
|br| |br|
- For each frequency range the tags ```` and ````
describe the lowest and highest frequency, respectively. |br| |br|
- The step frequency of the simulated spectrum has to be given by
the tag ````. |br| |br|
- For each frequency range, the user can define a simple phenomenological
description of the background continuum
.. math::
:label: obsxml:UserTbg
I_{\rm bg} (\nu) = T_{\rm bg} \cdot \left(\frac{\nu}{\nu_{\rm min}} \right)^{T_{\rm slope}}
using the tags ```` (for the background temperature
in K) and ```` (for the temperature slope, dimensionless).
The tag ```` indicates whether the user-defined background continuum
(described by ```` and ````)
describes the continuum contribution completely (``True``) or whether
continuum contributions defined in the molfit file are taken into account
as well (``False``). |br| |br|
- (optional) The user can specify the path and name of an ASCII file describing
the background intensity between the cosmic microwave background and the
components at the largest distance as a function of frequency (in MHz)
by using the tag ````. |br| |br|
- (optional) In order to specify global values for the parameters describing
the dust contribution for each frequency range, see :eq:`myXCLASS:tau`, the
tags ```` (for the hydrogen column density in
cm :math:`^{-2}`), ```` (for the dust spectral index,
dimensionless), and ```` (in cm :math:`^{2}` g :math:`^{-1}`) can be
used. Additionally, the user can specify the path and name of an ASCII file
describing the dust optical depth as a function of frequency (in MHz) by using
the tag ````. |br| |br|
- (optional) In order to define global values for the advanced phenomenological
description of the continuum for each frequency range, the user has to define
the tags ```` (for selecting the continuum function,
dimensionless), ```` (for the first continuum parameter,
dimensionless), ```` (for the second continuum parameter,
dimensionless), ```` (for the third continuum parameter,
dimensionless), ```` (for the fourth continuum
parameter, dimensionless), and ```` (for the fifth
continuum parameter, dimensionless). |br| |br|
- (optional, for the *myXCLASSMapFit* and *myXCLASSMapRedoFit* functions only) The
tag ```` defines a threshold (in K) for each frequency range for
a pixel. If the maximum intensity of a pixel's spectrum is lower than the
value defined by this parameter, the pixel is ignored (not fitted). |br| |br|
- (optional, for the *LineID* function only) The tag ```` describes the
noise level (in K) for each frequency range. All parts of the spectrum with
intensities lower than the noise level are ignored. |br| |br|
- For each observation file the user has to specify the size of the telescope
( :math:`>0`) by defining the tag ```` (for circle-shaped
beams) or by tags ````, ```` and ```` for elliptical rotated
beams. Additionally, the tag ```` indicates if single dish or
interferometric observations are described. If the sub-beam description is
deactivated and an elliptical beam is still used, XCLASS assumes a circular
beam and calculates an average beam size from BMAJ and BMIN. |br| |br|
- Using the tag ```` the user can define different local standard
of rest velocities (:math:`v_{\rm LSR}`) for each obs. data file. In that case, the
value defined by the input parameter ``vLSR`` is ignored. |br| |br|
- (optional) The tag ```` indicates the red-shift for the corresponding
obs. data file. |br| |br|
- (optional) The tags ````, ````, and
```` control the import of the ASCII file containing the
observational data. Please note that these tags are read only if the
corresponding obs. data file is an ASCII file, i.e. that the tag
```` is set to ``xclassASCII``. If tag ```` is set to
``True``, the ASCII file has to contain an additional third column describing
the errors (or weights) for each frequency channel. These weights are used
in the computation of the :math:`\chi^2` function
.. math::
:label: obsxml:chi2
\chi^2 = \sum_{i=1}^N \left[ \left(y_i^\mathsf{obs} -
y_i^\mathsf{fit} \right)^2
\cdot \frac{1}{\left(\sigma_i^\mathsf{error}\right)^2}
\right],
where :math:`\sigma_i^\mathsf{error}` represents the error of the :math:`i`-th data point.
|br| |br|
- In order to use isotopologues, the user has to set the tag
```` to ``True`` and to specify path and name of an iso ratio
file using tag ````. If no isotopologues are used, tag
```` has to be set to ``False``. |br| |br|
- (optional) Local-overlap is taken into account by setting tag
```` to ``True`` otherwise to ``False``, see Sect.
":ref:`myxclass-localoverlap`". |br| |br|
- (optional) The number of model pixels along x- and y-direction required for
the sub-beam description are defined by tags ```` and
````, respectively. |br| |br|
- (optional) The sub-beam description, see Sect.
":ref:`myxclass-molfit-subbeam`",
can be deactivated by setting the tag ```` to ``True`` (otherwise ``False``).
|br| |br|
- (optional) In order to use pre-computed emission and absorption functions
used in :eq:`myxclass-sourceFunction` for different layer distances
the user can define the path of a directory, using tag ````,
containing ASCII files describing both functions for different frequencies
and distances. |br| |br|
- (optional) By default *XCLASS* uses the SQLite3 database file ``cdms_sqlite.db``
located in the directory defined in the ``xclass/init.dat`` file. In order to
use a different database file, the user has to define the path and name of
another database file using the tag ````.
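
The phenomenological background continuum in :eq:`obsxml:UserTbg` can be
evaluated directly. The sketch below is not *XCLASS* code; the function name
``background_continuum`` and its arguments are chosen for illustration only, and
the numerical values are arbitrary.

```python
def background_continuum(freq, freq_min, t_bg, t_slope):
    """Evaluate I_bg(nu) = T_bg * (nu / nu_min)**T_slope.

    freq and freq_min must use the same frequency unit (e.g. MHz);
    t_bg is the background temperature in K, t_slope is dimensionless.
    """
    if freq_min <= 0.0:
        raise ValueError("freq_min must be positive")
    return t_bg * (freq / freq_min) ** t_slope


# illustrative values only: a frequency range starting at 580102.0 MHz
nu_min = 580102.0
for nu in (580102.0, 580324.0, 580546.5):
    print(nu, background_continuum(nu, nu_min, t_bg=3.0, t_slope=0.5))
```

At the lowest frequency of the range the ratio is one, so the function returns
the background temperature itself; a slope of zero gives a flat continuum.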
Example **observational xml file** used by many *XCLASS* functions:
::
1
demo/myXCLASSFit/band1b.dat
xclassASCII
1
580102.0
580546.5
0.5
False
0.88
3.0
background.dat
3.0e+24
2.0
0.42
jena_thin_no__MHz.dat
0.0
1.1
2.098
2.432
20.0
False
no
1
True
demo/myXCLASS/iso_names.txt
True
True
500
500
Database/cdms_sqlite__2016-06-15.db
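
To make the role of the third ASCII column concrete, the sketch below evaluates
:eq:`obsxml:chi2` for a handful of channels. It is a plain-Python illustration,
not the *XCLASS* implementation; the function name ``weighted_chi2`` and the
sample numbers are assumptions, and the third column is interpreted here as a
1-sigma error per channel.

```python
def weighted_chi2(y_obs, y_fit, sigma):
    """Sum of squared residuals, each divided by the squared error."""
    if not (len(y_obs) == len(y_fit) == len(sigma)):
        raise ValueError("all inputs must have the same length")
    total = 0.0
    for obs, fit, err in zip(y_obs, y_fit, sigma):
        if err <= 0.0:
            raise ValueError("errors must be positive")
        total += (obs - fit) ** 2 / err ** 2
    return total


# made-up sample data: observed intensity, model intensity, error per channel
y_obs = [1.0, 2.0, 4.0]
y_fit = [1.0, 1.0, 2.0]
sigma = [1.0, 0.5, 2.0]

print(weighted_chi2(y_obs, y_fit, sigma))  # 0/1 + 1/0.25 + 4/4 = 5.0
```

Channels with small errors dominate the sum, which is why the second channel
contributes four times as much as the third despite its smaller residual.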
.. _myxclassfit-alg-xml-file:
Algorithm xml file
------------------
The *algorithm xml file* (alg. xml file) is used to describe the
optimization algorithms, which are used to fit the model parameters
defined in the molfit and other input files (iso-ratio) to the
observational data. Here, each parameter is defined by so-called tags, i.e.
``TagValue``.
- The alg. xml file has to start with
::
- All tags in the alg. xml file have to be enclosed by the ```` tag,
i.e.
::
..
- The tag ```` defines the number of optimization
algorithms and has to be :math:`>=1`. All parameters for each algorithm
have to be enclosed by the ```` tag, e.g.
::
2
..
..
- The tag ```` defines the name of the optimization algorithm.
The following algorithms are available:
- **"trf"**: (Trust Region Reflective algorithm, only used with
optimizer ``SCIPY`` or MapFit function, *parallelized*)
a fast local optimization
algorithm that takes into account the parameter limits, i.e. the
algorithm guarantees that the fitting parameters remain within the
user-defined parameter limits.
Additional tags required by this algorithm:
- tag ````
sets the value of variation (in percent) for the calculation
of the gradient of the :math:`\chi^2` function.
|br| |br|
- **"dogbox"**: (dogleg algorithm with rectangular trust region, only used with
optimizer ``SCIPY`` or MapFit function, *parallelized*), very similar
to the aforementioned "trf" algorithm.
Additional tags required by this algorithm:
- tag ````
sets the value of variation (in percent) for the calculation
of the gradient of the :math:`\chi^2` function.
|br| |br|
- **"lm"**: the Levenberg-Marquardt algorithm as implemented in MINPACK but
without taking parameter limits into account! Please use "trf" algorithm
instead.
Additional tags required by this algorithm:
- tag ````
sets the value of variation (in percent) for the calculation
of the gradient of the :math:`\chi^2` function.
|br| |br|
- **"nelder-mead"**: (Nelder-Mead algorithm, only used with
optimizer ``SCIPY`` or MapFit function, *not parallelized*)
a local optimization algorithm
|br| |br|
- **"powell"**: (modified Powell algorithm, only used with
optimizer ``SCIPY`` or MapFit function, *not parallelized*)
a local optimization algorithm
|br| |br|
- **"cobyla"**: (Constrained Optimization BY Linear Approximation (COBYLA)
algorithm, only used with optimizer ``SCIPY`` or MapFit
function, *not parallelized*) a local optimization algorithm. |br| |br|
- **"cg"**: (conjugate gradient algorithm, only used with
optimizer ``SCIPY`` or MapFit function, *parallelized*)
a local optimization algorithm
Additional tags required by this algorithm:
- tag ````
sets the value of variation (in percent) for the calculation
of the gradient of the :math:`\chi^2` function.
|br| |br|
- **"bfgs"**: (BFGS algorithm, only used with
optimizer ``SCIPY`` or MapFit function, *parallelized*)
a local optimization algorithm
Additional tags required by this algorithm:
- tag ````
sets the value of variation (in percent) for the calculation
of the gradient of the :math:`\chi^2` function.
|br| |br|
- **"l-bfgs-b"**: (L-BFGS-B algorithm, only used with
optimizer ``SCIPY`` or MapFit function, *parallelized*)
a local optimization algorithm
Additional tags required by this algorithm:
- tag ````
sets the value of variation (in percent) for the calculation
of the gradient of the :math:`\chi^2` function.
|br| |br|
- **"slsqp"**: (Sequential Least Squares Programming (SLSQP) Algorithm, only used with
optimizer ``SCIPY`` or MapFit function, *parallelized*)
a local optimization algorithm
Additional tags required by this algorithm:
- tag ````
sets the value of variation (in percent) for the calculation
of the gradient of the :math:`\chi^2` function.
|br| |br|
- **"basinhopping"**: (Basin-Hopping Algorithm, only used with
optimizer ``SCIPY`` or MapFit function, *parallelized*)
a global optimization algorithm
|br| |br|
- **"brute", "bruteforce", "brute-force"**: (brute force, only used with
optimizer ``SCIPY`` or MapFit function, *parallelized*)
a global optimization algorithm.
Additional tags required by this algorithm:
- tag ```` can be used to define the number of grid points
along each parameter axis as python list. (Each element indicates the
number of grid points for the corresponding fit parameter).
|br| |br|
- **"differential_evolution"**: (Differential Evolution, only used with
optimizer ``SCIPY`` or MapFit function, *parallelized*)
a global optimization algorithm. |br| |br|
- **"dual_annealing"**: (Dual Annealing, only used with optimizer ``SCIPY``
or MapFit function, *parallelized*) a global optimization algorithm.
|br| |br|
- **"error-estimation", "errorestim_ins"**: estimate error of model parameters
using "mcmc" or "ultranest", *parallelized*.
Additional tags required by this algorithm:
- tag ```` sets the number of samplers,
- tag ```` sets multiplicity of standard deviation
(1-sigma, 2-sigma, 3-sigma, etc.),
- tag ```` sets the type of error range: ["Gaussian",
"Percentile", "HPD", "UltraNest"].
|br| |br|
- **"mcmc", "emcee"**: (*parallelized*) The *emcee* [1]_ package
[ForemanMackey_2013]_ implements the affine-invariant ensemble sampler
of [Goodman_2010]_ to perform a fully parallelized MCMC algorithm.
Additional tags required by this algorithm:
- tag ```` sets the used algorithm to compute errors, i.e.
``mcmc``, ``ultranest``, or ``ins``.
- tag ```` sets the number of samplers
- tag ```` describes the path and name of the backend file,
which is used to store and resume an interrupted MCMC run. If the given file
does not exist before the MCMC run starts, *XCLASS* stores all MCMC parameters
in an HDF5 file. In order to resume an interrupted MCMC run, the path and
name of the corresponding HDF5 file have to be defined by this tag.
- tag ```` sets the initial sampling method. If the tag is set
to ``local``, the initial starting values of the MCMC samplers are drawn
in a small sphere around the initial parameter values. If the tag is set
to ``global``, the initial starting values are randomly distributed within
the given parameter limits.
- tag ```` sets the number of iterations for burn-in phase
|br| |br|
- **"ultranest"**: (*parallelized*) UltraNest [2]_ is a general-purpose Bayesian
inference package [Buchner_2016]_, [Buchner_2019]_ for parameter estimation
and model comparison. It is especially useful for multi-modal or
non-Gaussian parameter spaces, computationally expensive models, and robust
pipelines. Furthermore, UltraNest is intended for fitting complex physical
models with slow likelihood evaluations, with one to hundreds of
parameters. In addition, it intends to replace heuristic methods like
multi-ellipsoid nested sampling and dynamic nested sampling with more
rigorous methods.
Additional tags required by this algorithm:
- tag ````
sets the number of samplers
- tag ```` describes the path and name of the backend file,
which is used to store and resume an interrupted UltraNest run.
- tag ```` describes additional parameters for the UltraNest
package as python dictionary
- tag ```` sets the number of iterations for burn-in phase
|br| |br|
- **"genetic"**: (Genetic algorithm, only used with optimizer ``MAGIX``,
*parallelized*) a global optimization algorithm
- tag ```` sets the number of best sites.
|br| |br|
- **"pso"**: (Particle swarm optimization, only used with optimizer ``MAGIX``,
*parallelized*) a global optimization algorithm
- tag ```` sets the number of best sites.
|br| |br|
- **"ins"**: (Interval nested sampling algorithm, only used with
optimizer ``MAGIX``, *parallelized*) a global optimization algorithm
- tag ```` defines the critical bound for the volume,
- tag ```` defines the difference between
maximal and minimal value of inclusion function,
|br| |br|
- **"ns"**: (Nested sampling algorithm, only used with optimizer ``MAGIX``,
*parallelized*) a global optimization algorithm.
|br| |br|
- **"bees"**: (Bees algorithm, only used with optimizer ``MAGIX``,
*parallelized*) a global optimization algorithm
- tag ```` sets the number of best sites,
- tag ```` sets the number of bees.
|br| |br|
- **"blank"**: (only used with optimizer ``SCIPY`` or MapFit or
CubeFit function)
using this method the fit parameters are not optimized,
but the synthetic spectra based on the given parameter values are
computed.
|br| |br|
- The tag ```` (or ````) defines the
max. number of iterations for the corresponding optimization algorithm and
has to be :math:`>=1`.
|br| |br|
- The tag ```` defines the max. number of cores used for the
corresponding optimization algorithm and has to be :math:`>=1`.
Note that a value :math:`>1` is considered only for parallelized algorithms.
|br| |br|
- The tag ```` (or ````) defines the stopping criterion
for the :math:`\chi^2` function. If the :math:`\chi^2` function drops below
this threshold, the algorithm is stopped.
|br| |br|
- The tag ```` defines if a renormalized :math:`\chi^2`
function is used.
.. math::
\left(\chi^2_{\rm limit} \right)_\mathsf{renorm} =
\left(\sum_{i=1}^{N_\mathsf{exp}} N_\mathsf{points}(i) - N_\mathsf{par}\right) \cdot \left(\chi^2_{\rm limit} \right)_\mathsf{orig},
where :math:`N_\mathsf{exp}` is the number of observation files;
:math:`N_\mathsf{points}(i)` indicates the number of observational data points in
observation file :math:`i`;
:math:`N_\mathsf{par}` is the total number of fit parameters; and
:math:`\left(\chi^2_{\rm limit} \right)_\mathsf{orig}` is the original,
unmodified value of the :math:`\chi^2` limit.
|br| |br|
- The tag ```` indicates the storage of the :math:`\chi^2` function.
|br| |br|
- With the optimizer ``MAGIX``, see Sect. ":ref:`api-myxclassfit`", the following
optimization algorithms are available as MPI-parallelized implementations:
**"Levenberg-Marquardt"**, **"genetic"**, **"pso"**, **"ins"**, **"ns"**, and **"bees"**.
- With optimizer ``MAGIX`` the tag ```` defines path and name
of a host file used for MPI parallelized algorithms.
|br| |br|
- With optimizer ``SCIPY`` the tag ```` indicates the usage of a
log file containing all :math:`\chi^2` values with the corresponding parameter
vectors computed during the application of each optimization algorithm.
|br| |br|
- With optimizer ``SCIPY`` the tag ```` indicates the creation of
plots describing the obs. data together with the synthetic spectra and
:math:`\chi^2` function for each obs. data file.
|br| |br|
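
The renormalization of the :math:`\chi^2` limit described above amounts to
multiplying the original limit by the number of data points minus the number of
fit parameters. A minimal sketch, assuming the formula as stated in this
section; the function name ``renormalized_chi2_limit`` is hypothetical and the
numbers are arbitrary.

```python
def renormalized_chi2_limit(points_per_file, n_par, chi2_limit):
    """Scale the chi^2 limit by (total number of data points - number of fit parameters)."""
    dof = sum(points_per_file) - n_par
    if dof <= 0:
        raise ValueError("need more data points than fit parameters")
    return dof * chi2_limit


# two observation files with 890 and 1200 channels, 10 fit parameters
print(renormalized_chi2_limit([890, 1200], n_par=10, chi2_limit=0.001))
```

With renormalization the stopping threshold grows with the amount of data, so
the same limit value can be reused for data sets of different size.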
Example of an **algorithm xml file**:
::
2
bees
30
8
hostfile.txt
0.001
yes
default
yes
Frequency [Hz]
Intensity
yes
Levenberg-Marquardt
20
8
hostfile.txt
0.0008
yes
default
yes
Frequency [Hz]
Intensity
yes
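
The two initial sampling modes described for the "mcmc" algorithm (``local``
and ``global``) can be pictured with a short standard-library sketch. This is
not how *XCLASS* or *emcee* generate starting points; the function name
``initial_positions``, the relative ``radius``, the clipping to the limits, and
the use of a small box in place of a sphere are all assumptions made for
illustration.

```python
import random


def initial_positions(x0, limits, n_samplers, mode="local", radius=1.0e-3, seed=0):
    """Draw starting points for n_samplers samplers.

    mode="local":  small random offsets around x0 (clipped to the limits),
    mode="global": uniform draws within the parameter limits.
    The relative radius and the clipping are assumptions of this sketch.
    """
    rng = random.Random(seed)
    positions = []
    for _ in range(n_samplers):
        point = []
        for value, (low, high) in zip(x0, limits):
            if mode == "local":
                # offset scaled by the width of the allowed parameter range
                trial = value + radius * (high - low) * rng.uniform(-1.0, 1.0)
                trial = min(max(trial, low), high)
            elif mode == "global":
                trial = rng.uniform(low, high)
            else:
                raise ValueError("mode must be 'local' or 'global'")
            point.append(trial)
        positions.append(point)
    return positions


# arbitrary example: two fit parameters with their lower and upper limits
x0 = [150.0, 1.0e15]
limits = [(50.0, 300.0), (1.0e13, 1.0e17)]
local_start = initial_positions(x0, limits, n_samplers=4, mode="local")
global_start = initial_positions(x0, limits, n_samplers=4, mode="global")
```

A ``local`` start suits a refinement around an already reasonable fit, whereas a
``global`` start explores the full range allowed by the parameter limits.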
.. ----------------------------------------------------------------------------------------
Footnotes
---------
.. Footnotes
.. [1] https://emcee.readthedocs.io/en/stable/
.. [2] https://johannesbuchner.github.io/UltraNest/index.html
.. ----------------------------------------------------------------------------------------
References
----------
.. citation reference
.. [ForemanMackey_2013] Foreman-Mackey, Daniel, David W. Hogg, Dustin Lang, and Jonathan
Goodman. 2013. “Emcee: The MCMC Hammer.” *Publications of the
Astronomical Society of the Pacific* 125 (925): 306.
https://doi.org/10.1086/670067.
.. [Goodman_2010] Goodman, Jonathan, and Jonathan Weare. 2010. “Ensemble Samplers
with Affine Invariance.” *Communications in Applied Mathematics
and Computational Science* 5 (1): 65–80.
.. [Buchner_2016] Buchner, Johannes. 2016. “A statistical test for Nested Sampling
algorithms.” *Statistics and Computing* 26 (1-2): 383–92.
https://doi.org/10.1007/s11222-014-9512-y.
.. [Buchner_2019] Buchner, Johannes. 2019. “Collaborative Nested Sampling: Big Data versus Complex
Physical Models.” *Publications of the Astronomical Society of the Pacific* 131 (1004): 108005.
https://doi.org/10.1088/1538-3873/aae7fc.
.. ----------------------------------------------------------------------------------------
.. hack to get extra blank line
.. |br| raw:: html