refnx.analysis
- class refnx.analysis.BaseObjective(p, logl, logp=None, fcn_args=(), fcn_kwds=None, name=None, weighted=True)[source]
Bases:
object
Don’t necessarily have to use Parameters, could use np.array
- covar(target='nll')[source]
Estimates a covariance matrix based on numerical differentiation of either the negative log-likelihood or negative log-posterior probabilities.
- Parameters:
target (str, {"nll", "nlpost"})
- Returns:
covar – The covariance matrix for the fitting system
- Return type:
np.ndarray
Notes
Estimation of a covariance matrix can be susceptible to numeric instabilities. Critically evaluate the matrix before further use.
- logl(pvals=None)[source]
Log-likelihood probability function
- Parameters:
pvals (np.ndarray) – Array containing the values to be tested.
- Returns:
logl – log-likelihood probability.
- Return type:
- logp(pvals=None)[source]
Log-prior probability function
- Parameters:
pvals (np.ndarray) – Array containing the values to be tested.
- Returns:
logp – log-prior probability
- Return type:
- logpost(pvals=None)[source]
Log-posterior probability function
- Parameters:
pvals (np.ndarray) – Array containing the values to be tested.
- Returns:
logpost – log-probability.
- Return type:
Notes
The log probability is the sum is the sum of the log-prior and log-likelihood probabilities. Does not set the parameter attribute.
- nll(pvals=None)[source]
Negative log-likelihood function
- Parameters:
pvals (np.ndarray) – Array containing the values to be tested.
- Returns:
nll – negative log-likelihood
- Return type:
- nlpost(pvals=None)[source]
Negative log-posterior function
- Parameters:
pvals (np.ndarray) – Array containing the values to be tested.
- Returns:
nlpost – negative log-posterior
- Return type:
- class refnx.analysis.Bounds[source]
Bases:
object
A base class that describes the probability distribution for a parameter
- invcdf(q)[source]
Calculate the inverse of the cumulative distribution function for the prior.
This is also known as the percent point function, or ppf.
- Parameters:
q (array-like) – lower tail probability
- Returns:
x – quantile corresponding to the lower tail probability q.
- Return type:
array-like
- logp(value)[source]
Calculate the log-prior probability of a value with the probability distribution.
- rvs(size=1, random_state=None)[source]
Generate random variates from the probability distribution.
- Parameters:
size (int or tuple) – Specifies the number, or array shape, of random variates to return.
random_state ({None, int, float or
numpy.random.RandomState
}) – For reproducible sampling
- Returns:
arr – Random variates from within the probability distribution.
- Return type:
array-like
- class refnx.analysis.CurveFitter(objective, nwalkers=200, ntemps=-1, **mcmc_kws)[source]
Bases:
object
Analyse a curvefitting system (with MCMC sampling)
- Parameters:
objective (refnx.analysis.Objective) – The
refnx.analysis.Objective
to be analysed.nwalkers (int, optional) – How many walkers you would like the sampler to have. Must be an even number. The more walkers the better.
ntemps (int or None, optional) – If ntemps == -1, then an
emcee.EnsembleSampler
is used during the sample method. Otherwise, or if ntemps is None then parallel tempering is used with aptemcee.sampler.Sampler
object during the sample method, with ntemps specifing the number of temperatures. Can be None, in which case the Tmax keyword argument sets the maximum temperature. Parallel Tempering is useful if you expect your posterior distribution to be multi-modal.mcmc_kws (dict) – Keywords used to create the
emcee.EnsembleSampler
orptemcee.sampler.Sampler
objects.
Notes
See the documentation at http://dan.iel.fm/emcee/current/api/ for further details on what keywords are permitted, and for further information on Parallel Tempering. The pool and threads keywords are ignored here. Specification of parallel threading is done with the pool argument in the sample method.
- acf(nburn=0, nthin=1)[source]
Calculate the autocorrelation function
- Returns:
acfs – The autocorrelation function, acfs.shape=(lags, nvary)
- Return type:
np.ndarray
- property chain
MCMC chain belonging to CurveFitter.sampler
- Returns:
chain – The MCMC chain with shape (steps, nwalkers, ndim) or (steps, ntemps, nwalkers, ndim).
- Return type:
array
- fit(method='L-BFGS-B', target='nll', verbose=True, **kws)[source]
Obtain the maximum log-likelihood, or log-posterior, estimate (mode) of the objective. Maximising the log-likelihood is equivalent to minimising chi2 in a least squares fit.
- Parameters:
method (str) –
which method to use for the optimisation. One of:
’least_squares’:
scipy.optimize.least_squares
.’L-BFGS-B’: L-BFGS-B.
’differential_evolution’:
scipy.optimize.differential_evolution
’dual_annealing’:
scipy.optimize.dual_annealing
(SciPy >= 1.2.0)’shgo’:
scipy.optimize.shgo
(SciPy >= 1.2.0)
You can also choose many of the minimizers from
scipy.optimize.minimize
.target ({'nll', 'nlpost'}, optional) – Minimize the negative log-likelihood (‘nll’) or the negative log-posterior (‘nlpost’). This is equivalent to maximising the likelihood or posterior probabilities respectively. Maximising the likelihood is equivalent to minimising chi^2 in a least-squares fit. This option only applies to the differential_evolution, shgo, dual_annealing or L-BFGS-B methods. These optimisers require lower and upper (box) bounds for each parameter. If the Bounds on a parameter are not an Interval, but a PDF specifying a statistical distribution, then the lower and upper bounds are approximated as
PDF.rv.ppf([0.005, 0.995])
, covering 99 % of the statistical distribution.verbose (bool, optional) – Gives fitting progress. To see a progress bar tqdm has to be installed.
kws (dict) – Additional arguments are passed to the underlying minimization method.
- Returns:
result, covar – result.x contains the best fit parameters result.covar is the covariance matrix for the fit. result.stderr is the uncertainties on each of the fit parameters.
- Return type:
scipy.optimize.OptimizeResult
, np.ndarray
Notes
If the objective supplies a residuals method then least_squares can be used. Otherwise the nll method of the objective is minimised. Use this method just before a sampling run. If self.objective.parameters is a Parameters instance, then each of the varying parameters has its value updated by the fit, and each Parameter has a stderr attribute which represents the uncertainty on the fit parameter.
The use of dual annealing and shgo requires that scipy >= 1.2.0 be installed.
- property index_max_prob
The index of the highest log-probability for the samples
- initialise(pos='covar', random_state=None)[source]
Initialise the emcee walkers.
- Parameters:
pos (str or np.ndarray) –
Method for initialising the emcee walkers. One of:
’covar’, use the estimated covariance of the system.
’jitter’, add a small amount of gaussian noise to each parameter
- ’prior’, sample random locations from the prior using Latin
Hyper Cube.
- pos, an array that specifies a snapshot of the walkers. Has shape
- (nwalkers, ndim), or (ntemps, nwalkers, ndim) if parallel
tempering is employed. You can also provide a previously created chain.
random_state ({None, int, np.random.RandomState, np.random.Generator}) –
If performing MCMC with ntemps == -1:
If random_state is not specified the ~np.random.RandomState singleton is used.
If random_state is an int, a new
RandomState
instance is used, seeded with random_state.If random_state is already a
RandomState
instance, then that object is used.
If using parallel tempering then random number generation is controlled by
np.random.default_rng(random_state)
Specify random_state for repeatable minimizations.
- initialise_with_chain(chain)[source]
Initialise sampler with a pre-existing chain
- Parameters:
chain (array) – Array of size (steps, ntemps, nwalkers, ndim) or (steps, nwalkers, ndim), containing a chain from a previous sampling run.
- property logpost
Log-probability for each of the entries in self.chain
- make_sampler()[source]
Make the samplers for the Objective.
Use this method if the number of varying parameters changes.
- property nvary
- sample(steps, nthin=1, random_state=None, f=None, callback=None, verbose=True, pool=-1)[source]
Performs sampling from the objective.
- Parameters:
steps (int) – Collect steps samples into the chain. The sampler will run a total of steps * nthin moves.
nthin (int, optional) – Each chain sample is separated by nthin iterations.
random_state ({None, int, np.random.RandomState, np.random.Generator}) –
If performing MCMC with ntemps == -1:
If random_state is not specified the ~np.random.RandomState singleton is used.
If random_state is an int, a new
RandomState
instance is used, seeded with random_state.If random_state is already a
RandomState
instance, then that object is used.
If using parallel tempering then random number generation is controlled by
np.random.default_rng(random_state)
Specify random_state for repeatable minimizations.
f (file-like or str) – File to incrementally save chain progress to. Each row in the file is a flattened array of size (nwalkers, ndim) or (ntemps, nwalkers, ndim). There are steps rows in the file.
callback (callable) – callback function to be called at each iteration step. Has the signature callback(coords, logprob).
verbose (bool, optional) – Gives updates on the sampling progress
pool (int or map-like object, optional) – If pool is an int then it specifies the number of threads to use for parallelization. If pool == -1, then all CPU’s are used. If pool is a map-like callable that follows the same calling sequence as the built-in map function, then this pool is used for parallelisation.
Notes
Please see
emcee.EnsembleSampler
for its detailed behaviour.>>> # we'll burn the first 500 steps >>> fitter.sample(500) >>> # after you've run those, then discard them by resetting the >>> # sampler. >>> fitter.sampler.reset() >>> # Now collect 40 steps, each step separated by 50 sampler >>> # generations. >>> fitter.sample(40, nthin=50)
One can also burn and thin in Curvefitter.process_chain.
- class refnx.analysis.GlobalObjective(objectives, lambdas=None, alpha=1.0)[source]
Bases:
Objective
Global Objective function for simultaneous fitting with refnx.analysis.CurveFitter
- Parameters:
objectives (list) – list of
refnx.analysis.Objective
objectslambdas (array-like) – Lagrangian multipliers for each of the objective terms that contribute to the log-likelihood. Broadcast against the list of objectives. This array-like may become a list of Parameters in the future.
alpha ({float, Parameter, None}, optional) – Multiplies the overall log-prior term for the parameters. Used to balance log-prior and log-likelihood terms in the log-posterior.
- generative(pvals=None)[source]
Concatenated generative curves for the
refnx.analysis.Objective.generative
.- Parameters:
pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
- Returns:
generative – Concatenated
refnx.analysis.Objective.generative
- Return type:
np.ndarray
- logl(pvals=None)[source]
Calculate the combined log-likelhood of the system.
- Parameters:
pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
- Returns:
logl – log-likelihood probability
- Return type:
Notes
The log-likelihood of each of the objectives is multiplied by the respective Lagrangian multiplier in
GlobalObjective.lambdas
. The purpose of this multiplier is to allow the user to weight certain datasets more heavily (e.g. to make each of the contributions equal).
- logp(pvals=None)[source]
Calculate the log-prior of the system
- Parameters:
pvals (array-like or refnx.analysis.Parameters, optional) – values for the varying or entire set of parameters
- Returns:
logp – log-prior probability
- Return type:
- property npoints
int number of data points in all the objectives.
- property parameters
refnx.analysis.Parameters
associated with all the objectives.
- plot(pvals=None, samples=0, parameter=None, fig=None)[source]
Plot the data/model for all the objectives in the GlobalObjective.
Matplotlib must be installed to use this method.
- Parameters:
pvals (np.ndarray, optional) – Numeric values for the Parameter’s that are varying
samples (number, optional) – If the objective has been sampled, how many samples you wish to plot on the graph.
parameter (refnx.analysis.Parameter, optional) – Creates an interactive plot for the Parameter in Jupyter. Requires ipywidgets be installed. Use with %matplotlib notebook/qt.
fig (Figure instance, optional) – If fig is not supplied then a new figure is created. Otherwise the graph is created on the current axes on the supplied figure.
- Returns:
fig, ax – matplotlib figure and axes objects.
- Return type:
matplotlib.Figure
,matplotlib.Axes
- residuals(pvals=None)[source]
Concatenated residuals for each of the
refnx.analysis.Objective.residuals
.- Parameters:
pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
- Returns:
residuals – Concatenated
refnx.analysis.Objective.residuals
- Return type:
np.ndarray
Notes
The Lagrangian multipliers contained in the
lambdas
attribute are also included in the calculation of these residual arrays, to permit least squares analyses. If you would like to view un-modified residuals you should calculate them from the individual objectives.
- property weighted
bool do all the datasets have y_err, and are all the objectives wanting to use weights?
- class refnx.analysis.Interval(lb=-inf, ub=inf)[source]
Bases:
Bounds
Describes a uniform probability distribution. May be open, semi-open, or closed.
Examples
>>> from refnx.analysis import Parameter, Interval >>> p = Parameter(1) >>> # closed interval >>> p.bounds = Interval(0, 10) >>> p.logp([5, -1]) array([-2.30258509, -inf])
A semi-closed interval will still prevent the fitter from accessing impossible locations.
>>> p.bounds = Interval(lb=-10) >>> p.logp([5, -1]) array([0., 0.])
- invcdf(q)[source]
Calculate the inverse of the cumulative distribution function for the uniform prior.
This is also known as the percent point function, or ppf.
- Parameters:
q (array-like) – lower tail probability
- Returns:
x – quantile corresponding to the lower tail probability q.
- Return type:
array-like
- property lb
Lower bound of uniform distribution
- property rv
- rvs(size=1, random_state=None)[source]
Generate random variates from the probability distribution.
- Parameters:
size (int or tuple) – Specifies the number, or array shape, of random variates to return.
random_state ({None, int, float or
numpy.random.RandomState
}) – For reproducible sampling
- Returns:
arr – Random variates from within the probability distribution.
- Return type:
array-like
- property ub
Upper bound of uniform distribution
- valid(val)[source]
Checks whether a parameter value is within the support of the Interval. If it isn’t then it returns a value that is within the support. If the Interval is closed (i.e. lower and upper bounds are both specified) then invalid values will be corrected by random samples taken between the lower and upper bounds. If the interval is semi-open, only one of the bounds being specified, then invalid values will be corrected by the value being reflected by the same distance from the relevant limit.
- Parameters:
val (array-like) – values to examine
- Returns:
valid – values within the support
- Return type:
array-like
Examples
>>> b = Interval(0, 10) >>> b.valid(11.5) 8.5
- class refnx.analysis.MCMCResult(name, param, stderr, chain, median)
Bases:
tuple
- chain
Alias for field number 3
- median
Alias for field number 4
- name
Alias for field number 0
- param
Alias for field number 1
- stderr
Alias for field number 2
- class refnx.analysis.Model(parameters, fitfunc=None, fcn_args=(), fcn_kwds=None)[source]
Bases:
object
Calculates a generative model (dependent variable), given parameters and independent variables.
- Parameters:
parameters (array or refnx.analysis.Parameters) – Parameters to calculate the model with
fitfunc (callable, optional) – A function that calculates the generative model. Should have the signature
fitfunc(x, parameters, *fcn_args, **fcn__kwds)
where x is an array-like specifying the independent variables, and parameters are the parameters required to calculate the model. fcn_args and fcn_kwds can be used to supply additional arguments to to fitfunc.fcn_args (sequence, optional) – Supplies extra arguments to fitfunc
fcn_kwds (dict, optional) – Supplies keyword arguments to fitfunc
Notes
It is not necessary to supply fitfunc to create a Model iff you are inheriting Model and are also overriding Model.model.
- property fitfunc
The fit-function associated with the model
- logp()[source]
The model can add additional terms to it’s log-probability. However, it should _not_ include logp from any of the parameters. That is calculated by Objective.logp.
- model(x, p=None, x_err=None)[source]
Calculates a generative model(dependent variable), given parameters and independent variables.
- Parameters:
x (array-like) – Independent variable.
p (array-like or refnx.analysis.Parameters) – Parameters to supply to the generative function.
x_err (optional) – Uncertainty in x.
- Returns:
generative
- Return type:
array-like or float
Notes
The interpretation of x, p, and x_err is up to the fitfunc supplied during construction of this object (or the overridden model method of this object).
- property parameters
The refnx.analysis.Parameters set associated with the model.
- class refnx.analysis.Objective(model, data, lnsigma=None, use_weights=True, transform=None, logp_extra=None, auxiliary_params=(), name=None, alpha=None)[source]
Bases:
BaseObjective
Objective function for using with curvefitters such as refnx.analysis.curvefitter.CurveFitter.
- Parameters:
model (refnx.analysis.Model) – the generative model function. One can also provide an object that inherits refnx.analysis.Model.
data (refnx.dataset.Data1D) – data to be analysed.
lnsigma (float or refnx.analysis.Parameter, optional) –
Used if the experimental uncertainty (data.y_err) underestimated by a constant fractional amount. The experimental uncertainty is modified as:
s_n**2 = y_err**2 + exp(lnsigma * 2) * model**2
See Objective.logl for more details.
use_weights (bool) – use experimental uncertainty in calculation of residuals and logl, if available. If this is set to False, then you should also set self.lnsigma.vary = False, it will have no effect on the fit.
transform (callable, optional) – the model, data and data uncertainty are transformed by this function before calculating the likelihood/residuals. Has the signature transform(data.x, y, y_err=None), returning the tuple (transformed_y, transformed_y_err).
logp_extra (callable, optional) – user specifiable log-probability term. This contribution is in addition to the log-prior term of the model parameters, and model.logp, as well as the log-likelihood of the data. Has signature: logp_extra(model, data). The model will already possess updated parameters. Beware of including the same log-probability terms more than once.
auxiliary_params ({sequence, Parameters}, optional) – Extra Parameter objects that are involved with curvefitting, but aren’t directly included as part of the model. See notes for more details.
name (str) – Name for the objective.
alpha ({float, Parameter, None}, optional) – Multiplies the overall log-prior term for the parameters. Used to balance log-prior and log-likelihood terms in the log-posterior.
Notes
For parallelisation logp_extra needs to be picklable.
auxiliary_params are included in calculating the Objective.logp term, are present in Objective.varying_parameters(), and are modified by Curvefitter during an analysis. Their main purpose is to aid in making constraints in models.
- chisqr(pvals=None)[source]
Calculates the chi-squared value for a given fitting system.
- Parameters:
pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
- Returns:
chisqr – Chi-squared value, np.sum(residuals**2).
- Return type:
np.ndarray
- corner(**kwds)[source]
Corner plot of the chains belonging to the Parameters. Requires the corner and matplotlib packages.
- Parameters:
kwds (dict) – passed directly to the corner.corner function
- Returns:
fig
- Return type:
matplotlib.Figure
object.
- covar(target='residuals')[source]
Estimates the covariance matrix of the Objective by numerical differentiation.
- Parameters:
target ({"residuals", "nll", "nlpost"}) – Specifies what approach should be used to estimate covariance.
- Returns:
covar – Covariance matrix
- Return type:
np.ndarray
Notes
This method numerically differentiates either the ‘residuals’, the negative log-likelihood, or the negative log-posterior, to estimate the Hessian matrix. The Hessian matrix is then inverted to obtain the covariance matrix. If ‘residuals’ is selected, then the Hessian matrix is estimated as as J.T x J. If an Objective cannot calculate residuals then the covariance matrix can be estimated by inverting a Hessian matrix created from either the ‘nll’ or ‘nlpost’ methods. The default ‘residuals’ approach falls back to ‘nll’ if a problem is experienced. The default ‘residuals’ setting is preferred as the other settings can sometimes experience instabilities during Hessian estimation with numerical differentiation.
- generative(pvals=None)[source]
Calculate the generative (dependent variable) function associated with the model.
- Parameters:
pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
- Returns:
model
- Return type:
np.ndarray
- logl(pvals=None)[source]
Calculate the log-likelhood of the system
The major component of the log-likelhood probability is from the data. Extra potential terms are added on from the Model, self.model.logp, and the user specifiable logp_extra function.
- Parameters:
pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
- Returns:
logl – log-likelihood probability
- Return type:
Notes
The log-likelihood is calculated as:
logl = -0.5 * np.sum(((y - model) / s_n)**2 + np.log(2 * pi * s_n**2)) logp += self.model.logp() logp += self.logp_extra(self.model, self.data)
where
s_n**2 = y_err**2 + exp(2 * lnsigma) * model**2
- logp(pvals=None)[source]
Calculate the log-prior of the system
- Parameters:
pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
- Returns:
logp – log-prior probability
- Return type:
Notes
The log-prior is calculated as:
logp = np.sum(param.logp() for param in self.varying_parameters())
- logpost(pvals=None)[source]
Calculate the log-probability of the curvefitting system
- Parameters:
pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
- Returns:
logpost – log-probability
- Return type:
Notes
The overall log-probability is the sum of the log-prior and log-likelihood. The log-likelihood is not calculated if the log-prior is impossible (logp == -np.inf).
- nll(pvals=None)[source]
Negative log-likelihood function
- Parameters:
pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
- Returns:
nll – negative log-likelihood
- Return type:
- property npoints
int the number of points in the dataset.
- property parameters
refnx.analysis.Parameters
, all the Parameters contained in the fitting system.
- pgen(ngen=1000, nburn=0, nthin=1, random_state=None)[source]
Yield random parameter vectors from the MCMC samples. The objective state is not altered.
- Parameters:
ngen (int, optional) – the number of samples to yield. The actual number of samples yielded is min(ngen, chain.size)
nburn (int, optional) – discard this many steps from the start of the chain
nthin (int, optional) – only accept every nthin samples from the chain
random_state ({int, np.random.Generator, None}) – random number generator that picks the samples
- Yields:
pvec (np.ndarray) – A randomly chosen parameter vector
- plot(pvals=None, samples=0, parameter=None, fig=None)[source]
Plot the data/model.
Requires matplotlib be installed.
- Parameters:
pvals (np.ndarray, optional) – Numeric values for the Parameter’s that are varying
samples (number) – If the objective has been sampled, how many samples you wish to plot on the graph.
parameter (refnx.analysis.Parameter) – Creates an interactive plot for the Parameter in Jupyter. Requires ipywidgets be installed. Use with %matplotlib notebook/qt.
fig (Figure instance, optional) – If fig is not supplied then a new figure is created. Otherwise the graph is created on the current axes on the supplied figure.
- Returns:
fig, ax – matplotlib figure and axes objects.
- Return type:
matplotlib.Figure
,matplotlib.Axes
- prior_transform(u)[source]
Calculate the prior transform of the system.
Transforms uniform random variates in the unit hypercube, u ~ uniform[0.0, 1.0), to the parameter space of interest, according to the priors on the varying parameters.
- Parameters:
u (array-like) – Size of the varying parameters
- Returns:
pvals – Scaled parameter values
- Return type:
array-like
Notes
If a parameter has bounds, x ~ Unif[-10, 10) then the scaling from u to x is done as follows:
x = 2. * u - 1. # scale and shift to [-1., 1.) x *= 10. # scale to [-10., 10.)
- residuals(pvals=None)[source]
Calculates the residuals for a given fitting system.
- Parameters:
pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
- Returns:
residuals – Residuals, (data.y - model) / y_err.
- Return type:
np.ndarray
- setp(pvals)[source]
Set the parameters from pvals.
- Parameters:
pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
- varying_parameters()[source]
- Returns:
varying_parameters – The varying Parameter objects allowed to vary during the fit.
- Return type:
- property weighted
bool Does the data have weights (data.y_err), and is the objective using them?
- class refnx.analysis.PDF(rv)[source]
Bases:
Bounds
A class that describes the probability distribution for a parameter.
- Parameters:
rv (
scipy.stats.rv_continuous
or Object) – A continuous probability distribution. If rv is not an rv_continuous, then it must implement the logpdf and rvs methods.
Examples
>>> import scipy.stats as stats >>> from refnx.analysis import Parameter, PDF >>> p = Parameter(0.5) >>> # use a normal distribution for prior, mean=5 and sd=1. >>> p.bounds = PDF(stats.norm(5, 1)) >>> p.logp(), stats.norm.logpdf(0.5, 5, 1) (-11.043938533204672, -11.043938533204672)
- invcdf(q)[source]
Calculate the inverse of the cumulative distribution function for the uniform prior.
This is also known as the percent point function, or ppf.
- Parameters:
q (array-like) – lower tail probability
- Returns:
x – quantile corresponding to the lower tail probability q.
- Return type:
array-like
- logp(val)[source]
Calculate the log-prior probability of a value with the probability distribution.
- class refnx.analysis.Parameter(value=0.0, name=None, bounds=None, vary=False, constraint=None, units=None)[source]
Bases:
BaseParameter
Class for specifying a variable.
- Parameters:
value (float, optional) – Numerical Parameter value.
name (str, optional) – Name of the parameter.
bounds (refnx.analysis.Bounds, tuple, optional) – Sets the bounds for the parameter. Either supply a refnx.analysis.Bounds object (or one of its subclasses), or a (lower_bound, upper_bound) tuple.
vary (bool, optional) – Whether the Parameter is fixed during a fit.
constraint (expression, optional) – Python expression used to constrain the value during the fit.
- property constraint
- corner()[source]
Plots a histogram of the Parameter posterior.
Requires matplotlib be installed.
- Returns:
fig, ax – matplotlib figure and axes objects.
- Return type:
matplotlib.Figure
,matplotlib.Axes
- logp(pval=None)[source]
Calculate the log probability of the parameter
- Returns:
prob – log probability of the parameter
- Return type:
- set_constraint(constraint, args=())[source]
Constrains the Parameter.
- Parameters:
constraint ({None, expression, callable}) –
One of:
None, remove all constraints on this Parameter.
- expression, an algebraic Python expression used to constrain the
Parameter value.
- callable, a Python function,
constraint(*args)
that returns a float value for the Parameter value.
- callable, a Python function,
The callable should not use this Parameter in any of its calculations; nor should the callable use any Parameter in its calculations that possesses a constraint that would eventually lead back to this Parameter. If these conditions aren’t met then circular dependencies with undefined side effects will be created. A Parameter cannot ultimately depend on itself.
args (tuple) – a sequence of arguments given to constraint if it is a callable. This sequence can contain other Parameters, numbers, arrays, objects, etc. It is strongly recommended that this sequence is not nested. This is because
args
is searched for other Parameter objects, which are stored internally within this object as dependencies. Any constraints that these dependencies may have are evaluated before they are provided to the callable. If the callable uses Parameters that are not immediately retrievable fromargs
(e.g. stored as attributes in an object), and those Parameters have constraints themselves, then those Parameters will likely have stale values, resulting in undefined behaviour.
Examples
>>> from refnx.analysis import Parameter >>> a = Parameter(1) >>> b = Parameter(2) >>> a.set_constraint(np.sin(2*b)) >>> print(a.value) -0.7568024953079282
>>> def c(*args): ... return np.sin(args[0] * args[1])
>>> a.set_constraint(c, args=(2, b)) >>> print(a.value) -0.7568024953079282
- setp(value=None, vary=None, bounds=None, constraint=None)[source]
Set several attributes of the parameter at once.
- Parameters:
value (float, optional) – Numerical Parameter value.
vary (bool, optional) – Whether the Parameter is fixed during a fit.
bounds (refnx.analysis.Bounds, tuple, optional) – Sets the bounds for the parameter. Either supply a refnx.analysis.Bounds object (or one of its subclasses), or a (lower_bound, upper_bound) tuple.
constraint (expression, optional) – Python expression used to constrain the value during the fit.
- class refnx.analysis.Parameters(data=(), name=None)[source]
Bases:
UserList
A sequence of Parameters
- Parameters:
data (sequence) – A sequence of
Parameter
orParameters
name (str) – Name of this
Parameters
instance
- flattened(unique=False)[source]
A list of all the
Parameter
contained in this object, including those contained withinParameters
at any depth.
- logp()[source]
Calculates logp for all the parameters
- Returns:
logp – Log probability for all the parameters
- Return type:
- property nparam
- property parameters
- pgen(ngen=1000, nburn=0, nthin=1, random_state=None)[source]
Yield random parameter vectors from MCMC samples.
- Parameters:
ngen (int, optional) – the number of samples to yield. The actual number of samples yielded is min(ngen, chain.size)
nburn (int, optional) – discard this many steps from the start of the chain
nthin (int, optional) – only accept every nthin samples from the chain
random_state ({int, np.random.Generator, None}) – random number generator that picks the samples
- Yields:
pvec (np.ndarray) – A randomly chosen parameter vector
Notes
The entire parameter vector is yielded, not only the varying parameters. The reason for this is that some parameters may possess a chain if they are not varying, because they are controlled by a constraint.
- class refnx.analysis.Transform(form)[source]
Bases:
object
Mathematical transforms of numeric data.
- Parameters:
form (None or str) –
One of:
- ’lin’
No transform is made
- ’logY’
log10 transform
- ’YX4’
YX**4 transform
- ’YX2’
YX**2 transform
- None
No transform is made
Notes
You ask for a transform to be carried out by calling the Transform object directly.
>>> x = np.linspace(0.01, 0.1, 11) >>> y = np.linspace(100, 1000, 11) >>> y_err = np.sqrt(y) >>> t = Transform('logY') >>> ty, te = t(x, y, y_err) >>> ty array([2. , 2.2787536 , 2.44715803, 2.56820172, 2.66275783, 2.74036269, 2.80617997, 2.86332286, 2.91381385, 2.95904139, 3. ])
- refnx.analysis.autocorrelation_chain(chain, nburn=0, nthin=1)[source]
Calculate the autocorrelation function
- Parameters:
chain (np.ndarray) – The MCMC chain - (nsteps, nwalkers, ndim) or (nsteps, ntemps, nwalkers, ndim)
- Returns:
acfs – The autocorrelation function, acfs.shape=(lags, nvary)
- Return type:
np.ndarray
- refnx.analysis.integrated_time(x, c=5, tol=50, quiet=False, has_walkers=True)[source]
Estimate the integrated autocorrelation time of a time series.
This estimate uses the iterative procedure described on page 16 of Sokal’s notes to determine a reasonable window size.
- Parameters:
x (numpy.ndarray) – The time series. If 2-dimensional, the array dimesions are interpreted as
(n_step, n_walker)
unlesshas_walkers==False
, in which case they are interpreted as(n_step, n_param)
. If 3-dimensional, the dimensions are interperted as(n_step, n_walker, n_param)
.c (Optional[float]) – The step size for the window search. (default:
5
)tol (Optional[float]) – The minimum number of autocorrelation times needed to trust the estimate. (default:
50
)quiet (Optional[bool]) – This argument controls the behavior when the chain is too short. If
True
, give a warning instead of raising anAutocorrError
. (default:False
)has_walkers (Optional[bool]) – Whether the last axis should be interpreted as walkers or parameters if
x
has 2 dimensions. (default:True
)
- Returns:
- An estimate of the integrated autocorrelation time of
the time series
x
.
- Return type:
float or array
- Raises
- AutocorrError: If the autocorrelation time can’t be reliably estimated
from the chain and
quiet
isFalse
. This normally means that the chain is too short.
- refnx.analysis.load_chain(f)[source]
Loads a chain from disk. Does not change the state of a CurveFitter object.
- Parameters:
f (str or file-like) – File containing the chain.
- Returns:
chain – The loaded chain - (nsteps, nwalkers, ndim) or (nsteps, ntemps, nwalkers, ndim)
- Return type:
array
- refnx.analysis.possibly_create_parameter(value, name='', bounds=None, vary=False, constraint=None, units=None)[source]
If supplied with a Parameter return it. If supplied with float, wrap it in a Parameter instance.
- Parameters:
value (float, optional) – Numerical Parameter value.
name (str, optional) – Name of the parameter.
bounds (refnx.analysis.Bounds, tuple, optional) – Sets the bounds for the parameter. Either supply a refnx.analysis.Bounds object (or one of its subclasses), or a (lower_bound, upper_bound) tuple.
vary (bool, optional) – Whether the Parameter is fixed during a fit.
constraint (expression, optional) – Python expression used to constrain the value during the fit.
units (str) – Units for the Parameter.
- Returns:
parameter
- Return type:
- refnx.analysis.process_chain(objective, chain, nburn=0, nthin=1, flatchain=False)[source]
Process the chain produced by a sampler for a given Objective
- Parameters:
objective (refnx.analysis.Objective) – The Objective function that the Posterior was sampled for
chain (array) – The MCMC chain
nburn (int, optional) – discard this many steps from the start of the chain
nthin (int, optional) – only accept every nthin samples from the chain
flatchain (bool, optional) – collapse the walkers down into a single dimension.
- Returns:
[(param, stderr, chain)] – List of (param, stderr, chain) tuples. If isinstance(objective.parameters, Parameters) then param is a Parameter instance. param.value, param.stderr and param.chain will contain the median, stderr and chain samples, respectively. Otherwise param will be a float representing the median of the chain samples. stderr is the half width of the [15.87, 84.13] spread (similar to standard deviation) and chain is an array containing the MCMC samples for that parameter.
- Return type:
Notes
The chain should have the shape (iterations, nwalkers, nvary) or (iterations, ntemps, nwalkers, nvary) if parallel tempering was employed. The burned and thinned chain is created via: chain[nburn::nthin]. Note, if parallel tempering is employed, then only the lowest temperature of the parallel tempering chain is processed and returned as it corresponds to the (lowest energy) target distribution. If flatten is True then the burned/thinned chain is reshaped and arr.reshape(-1, nvary) is returned. This function has the effect of setting the parameter stderr’s.
- refnx.analysis.pymc_model(objective)[source]
Creates a pymc model from an Objective.
Requires aesara and pymc be installed. This is an experimental feature.
- Parameters:
objective (refnx.analysis.Objective)
- Returns:
model
- Return type:
pymc.Model
Notes
The varying parameters are renamed ‘p0’, ‘p1’, etc, as it’s vital in pymc that all parameters have their own unique name.