refnx.analysis

class refnx.analysis.BaseObjective(p, logl, logp=None, fcn_args=(), fcn_kwds=None, name=None, weighted=True)[source]

Bases: object

Don’t necessarily have to use Parameters, could use np.array

covar(target='nll')[source]

Estimates a covariance matrix based on numerical differentiation of either the negative log-likelihood or negative log-posterior probabilities.

Parameters: target (str, {"nll", "nlpost"}) –
Returns: covar – The covariance matrix for the fitting system
Return type: np.ndarray

Notes

Estimation of a covariance matrix can be susceptible to numeric instabilities. Critically evaluate the matrix before further use.

logl(pvals=None)[source]

Log-likelihood probability function

Parameters: pvals (np.ndarray) – Array containing the values to be tested.
Returns: logl – log-likelihood probability.
Return type: float

logp(pvals=None)[source]

Log-prior probability function

Parameters: pvals (np.ndarray) – Array containing the values to be tested.
Returns: logp – log-prior probability
Return type: float

logpost(pvals=None)[source]

Log-posterior probability function

Parameters: pvals (np.ndarray) – Array containing the values to be tested.
Returns: logpost – log-probability.
Return type: float

Notes

The log probability is the sum is the sum of the log-prior and log-likelihood probabilities. Does not set the parameter attribute.

nll(pvals=None)[source]

Negative log-likelihood function

Parameters: pvals (np.ndarray) – Array containing the values to be tested.
Returns: nll – negative log-likelihood
Return type: float

nlpost(pvals=None)[source]

Negative log-posterior function

Parameters: pvals (np.ndarray) – Array containing the values to be tested.
Returns: nlpost – negative log-posterior
Return type: float

setp(pvals)[source]

Set the parameters from pvals

Parameters: pvals (np.ndarray) – Array containing the values to be tested.

varying_parameters()[source]

Returns: varying_parameters – The parameters varying in this objective function.
Return type: np.ndarray

class refnx.analysis.Bounds[source]

Bases: object

A base class that describes the probability distribution for a parameter

invcdf(q)[source]

Calculate the inverse of the cumulative distribution function for the prior.

This is also known as the percent point function, or ppf.

Parameters: q (array-like) – lower tail probability
Returns: x – quantile corresponding to the lower tail probability q.
Return type: array-like

logp(value)[source]

Calculate the log-prior probability of a value with the probability distribution.

Parameters: val (float or array-like) – variates to calculate the log-probability for
Returns: arr – The log-probabilty corresponding to each of the variates
Return type: float or np.ndarray

rvs(size=1, random_state=None)[source]

Generate random variates from the probability distribution.

Parameters

size (int or tuple) – Specifies the number, or array shape, of random variates to return.
random_state ({None, int, float or numpy.random.RandomState}) – For reproducible sampling

Returns

arr – Random variates from within the probability distribution.

Return type

array-like

valid(val)[source]

Checks whether a parameter value is within the support of the distribution. If it isn’t then it returns a value that is within the support.

Parameters: val (array-like) – values to examine
Returns: valid – valid values within the support
Return type: array-like

class refnx.analysis.CurveFitter(objective, nwalkers=200, ntemps=-1, **mcmc_kws)[source]

Bases: object

Analyse a curvefitting system (with MCMC sampling)

Parameters

objective (refnx.analysis.Objective) – The refnx.analysis.Objective to be analysed.
nwalkers (int, optional) – How many walkers you would like the sampler to have. Must be an even number. The more walkers the better.
ntemps (int or None, optional) – If ntemps == -1, then an emcee.EnsembleSampler is used during the sample method. Otherwise, or if ntemps is None then parallel tempering is used with a ptemcee.sampler.Sampler object during the sample method, with ntemps specifing the number of temperatures. Can be None, in which case the Tmax keyword argument sets the maximum temperature. Parallel Tempering is useful if you expect your posterior distribution to be multi-modal.
mcmc_kws (dict) – Keywords used to create the emcee.EnsembleSampler or ptemcee.sampler.Sampler objects.

Notes

See the documentation at http://dan.iel.fm/emcee/current/api/ for further details on what keywords are permitted, and for further information on Parallel Tempering. The pool and threads keywords are ignored here. Specification of parallel threading is done with the pool argument in the sample method.

acf(nburn=0, nthin=1)[source]

Calculate the autocorrelation function

Returns: acfs – The autocorrelation function, acfs.shape=(lags, nvary)
Return type: np.ndarray

property chain

MCMC chain belonging to CurveFitter.sampler

Returns: chain – The MCMC chain with shape (steps, nwalkers, ndim) or (steps, ntemps, nwalkers, ndim).
Return type: array

fit(method='L-BFGS-B', target='nll', verbose=True, **kws)[source]

Obtain the maximum log-likelihood, or log-posterior, estimate (mode) of the objective. Maximising the log-likelihood is equivalent to minimising chi2 in a least squares fit.

Parameters

method (str) –
which method to use for the optimisation. One of:
- ’least_squares’: scipy.optimize.least_squares.
- ’L-BFGS-B’: L-BFGS-B.
- ’differential_evolution’: scipy.optimize.differential_evolution
- ’dual_annealing’: scipy.optimize.dual_annealing (SciPy >= 1.2.0)
- ’shgo’: scipy.optimize.shgo (SciPy >= 1.2.0)
You can also choose many of the minimizers from scipy.optimize.minimize.
target ({'nll', 'nlpost'}, optional) – Minimize the negative log-likelihood (‘nll’) or the negative log-posterior (‘nlpost’). This is equivalent to maximising the likelihood or posterior probabilities respectively. Maximising the likelihood is equivalent to minimising chi^2 in a least-squares fit. This option only applies to the differential_evolution, shgo, dual_annealing or L-BFGS-B methods. These optimisers require lower and upper (box) bounds for each parameter. If the Bounds on a parameter are not an Interval, but a PDF specifying a statistical distribution, then the lower and upper bounds are approximated as PDF.rv.ppf([0.005, 0.995]), covering 99 % of the statistical distribution.
verbose (bool, optional) – Gives fitting progress. To see a progress bar tqdm has to be installed.
kws (dict) – Additional arguments are passed to the underlying minimization method.

Returns

result, covar – result.x contains the best fit parameters result.covar is the covariance matrix for the fit. result.stderr is the uncertainties on each of the fit parameters.

Return type

scipy.optimize.OptimizeResult, np.ndarray

Notes

If the objective supplies a residuals method then least_squares can be used. Otherwise the nll method of the objective is minimised. Use this method just before a sampling run. If self.objective.parameters is a Parameters instance, then each of the varying parameters has its value updated by the fit, and each Parameter has a stderr attribute which represents the uncertainty on the fit parameter.

The use of dual annealing and shgo requires that scipy >= 1.2.0 be installed.

property index_max_prob: The index of the highest log-probability for the samples

initialise(pos='covar', random_state=None)[source]

Initialise the emcee walkers.

Parameters

pos (str or np.ndarray) –
Method for initialising the emcee walkers. One of:
- ’covar’, use the estimated covariance of the system.
- ’jitter’, add a small amount of gaussian noise to each parameter
- ’prior’, sample random locations from the prior using Latin
  Hyper Cube.
- pos, an array that specifies a snapshot of the walkers. Has shape
  
  (nwalkers, ndim), or (ntemps, nwalkers, ndim) if parallel
  tempering is employed. You can also provide a previously created chain.
random_state ({int, np.random.RandomState, np.random.Generator}) – If random_state is not specified the ~np.random.RandomState singleton is used. If random_state is an int, a new RandomState instance is used, seeded with random_state. If random_state is already a RandomState or a Generator instance, then that object is used. Specify random_state for repeatable initialisations.

initialise_with_chain(chain)[source]

Initialise sampler with a pre-existing chain

Parameters: chain (array) – Array of size (steps, ntemps, nwalkers, ndim) or (steps, nwalkers, ndim), containing a chain from a previous sampling run.

property logpost: Log-probability for each of the entries in self.chain

make_sampler()[source]

Make the samplers for the Objective.

Use this method if the number of varying parameters changes.

property nvary

reset()[source]

Reset the sampled chain.

Typically used on a sampler after a burn-in period.

sample(steps, nthin=1, random_state=None, f=None, callback=None, verbose=True, pool=-1)[source]

Performs sampling from the objective.

Parameters

steps (int) – Collect steps samples into the chain. The sampler will run a total of steps * nthin moves.
nthin (int, optional) – Each chain sample is separated by nthin iterations.
random_state ({int, np.random.RandomState, np.random.Generator}) – If random_state is not specified the ~np.random.RandomState singleton is used. If random_state is an int, a new RandomState instance is used, seeded with random_state. If random_state is already a RandomState or a Generator instance, then that object is used. Specify random_state for repeatable minimizations.
f (file-like or str) – File to incrementally save chain progress to. Each row in the file is a flattened array of size (nwalkers, ndim) or (ntemps, nwalkers, ndim). There are steps rows in the file.
callback (callable) – callback function to be called at each iteration step. Has the signature callback(coords, logprob).
verbose (bool, optional) – Gives updates on the sampling progress
pool (int or map-like object, optional) – If pool is an int then it specifies the number of threads to use for parallelization. If pool == -1, then all CPU’s are used. If pool is a map-like callable that follows the same calling sequence as the built-in map function, then this pool is used for parallelisation.

Notes

Please see emcee.EnsembleSampler for its detailed behaviour.

>>> # we'll burn the first 500 steps
>>> fitter.sample(500)
>>> # after you've run those, then discard them by resetting the
>>> # sampler.
>>> fitter.sampler.reset()
>>> # Now collect 40 steps, each step separated by 50 sampler
>>> # generations.
>>> fitter.sample(40, nthin=50)

One can also burn and thin in Curvefitter.process_chain.

class refnx.analysis.GlobalObjective(objectives, lambdas=None)[source]

Bases: Objective

Global Objective function for simultaneous fitting with refnx.analysis.CurveFitter

Parameters

objectives (list) – list of refnx.analysis.Objective objects
lambdas (array-like) – Lagrangian multipliers for each of the objective terms that contribute to the log-likelihood. Broadcast against the list of objectives. This array-like may become a list of Parameters in the future.

generative(pvals=None)[source]

Concatenated generative curves for the refnx.analysis.Objective.generative.

Parameters: pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
Returns: generative – Concatenated refnx.analysis.Objective.generative
Return type: np.ndarray

logl(pvals=None)[source]

Calculate the combined log-likelhood of the system.

Parameters: pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
Returns: logl – log-likelihood probability
Return type: float

Notes

The log-likelihood of each of the objectives is multiplied by the respective Lagrangian multiplier in GlobalObjective.lambdas. The purpose of this multiplier is to allow the user to weight certain datasets more heavily (e.g. to make each of the contributions equal).

logp(pvals=None)[source]

Calculate the log-prior of the system

Parameters: pvals (array-like or refnx.analysis.Parameters, optional) – values for the varying or entire set of parameters
Returns: logp – log-prior probability
Return type: float

property npoints: int number of data points in all the objectives.

property parameters: refnx.analysis.Parameters associated with all the objectives.

plot(pvals=None, samples=0, parameter=None, fig=None)[source]

Plot the data/model for all the objectives in the GlobalObjective.

Matplotlib must be installed to use this method.

Parameters

pvals (np.ndarray, optional) – Numeric values for the Parameter’s that are varying
samples (number, optional) – If the objective has been sampled, how many samples you wish to plot on the graph.
parameter (refnx.analysis.Parameter, optional) – Creates an interactive plot for the Parameter in Jupyter. Requires ipywidgets be installed. Use with %matplotlib notebook/qt.
fig (Figure instance, optional) – If fig is not supplied then a new figure is created. Otherwise the graph is created on the current axes on the supplied figure.

Returns

fig, ax – matplotlib figure and axes objects.

Return type

matplotlib.Figure, matplotlib.Axes

residuals(pvals=None)[source]

Concatenated residuals for each of the refnx.analysis.Objective.residuals.

Parameters: pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
Returns: residuals – Concatenated refnx.analysis.Objective.residuals
Return type: np.ndarray

Notes

The Lagrangian multipliers contained in the lambdas attribute are also included in the calculation of these residual arrays, to permit least squares analyses. If you would like to view un-modified residuals you should calculate them from the individual objectives.

property weighted: bool do all the datasets have y_err, and are all the objectives wanting to use weights?

class refnx.analysis.Interval(lb=-inf, ub=inf)[source]

Bases: Bounds

Describes a uniform probability distribution. May be open, semi-open, or closed.

Parameters

lb (float) – The lower bound
ub (float) – The upper bound

Examples

>>> from refnx.analysis import Parameter, Interval
>>> p = Parameter(1)
>>> # closed interval
>>> p.bounds = Interval(0, 10)
>>> p.logp([5, -1])
array([-2.30258509,        -inf])

A semi-closed interval will still prevent the fitter from accessing impossible locations.

>>> p.bounds = Interval(lb=-10)
>>> p.logp([5, -1])
array([0., 0.])

invcdf(q)[source]

Calculate the inverse of the cumulative distribution function for the uniform prior.

This is also known as the percent point function, or ppf.

Parameters: q (array-like) – lower tail probability
Returns: x – quantile corresponding to the lower tail probability q.
Return type: array-like

property lb: Lower bound of uniform distribution

logp(val)[source]

Calculate the log-prior probability of a value with the Interval.

Parameters: val (float or array-like) – variates to calculate the log-probability for
Returns: arr – The log-probabilty corresponding to each of the variates
Return type: float or np.ndarray

property rv

rvs(size=1, random_state=None)[source]

Generate random variates from the probability distribution.

Parameters

size (int or tuple) – Specifies the number, or array shape, of random variates to return.
random_state ({None, int, float or numpy.random.RandomState}) – For reproducible sampling

Returns

arr – Random variates from within the probability distribution.

Return type

array-like

property ub: Upper bound of uniform distribution

valid(val)[source]

Checks whether a parameter value is within the support of the Interval. If it isn’t then it returns a value that is within the support. If the Interval is closed (i.e. lower and upper bounds are both specified) then invalid values will be corrected by random samples taken between the lower and upper bounds. If the interval is semi-open, only one of the bounds being specified, then invalid values will be corrected by the value being reflected by the same distance from the relevant limit.

Parameters: val (array-like) – values to examine
Returns: valid – values within the support
Return type: array-like

Examples

>>> b = Interval(0, 10)
>>> b.valid(11.5)
8.5

class refnx.analysis.MCMCResult(name, param, stderr, chain, median)

Bases: tuple

chain: Alias for field number 3

median: Alias for field number 4

name: Alias for field number 0

param: Alias for field number 1

stderr: Alias for field number 2

class refnx.analysis.Model(parameters, fitfunc=None, fcn_args=(), fcn_kwds=None)[source]

Bases: object

Calculates a generative model (dependent variable), given parameters and independent variables.

Parameters

parameters (array or refnx.analysis.Parameters) – Parameters to calculate the model with
fitfunc (callable, optional) – A function that calculates the generative model. Should have the signature fitfunc(x, parameters, *fcn_args, **fcn__kwds) where x is an array-like specifying the independent variables, and parameters are the parameters required to calculate the model. fcn_args and fcn_kwds can be used to supply additional arguments to to fitfunc.
fcn_args (sequence, optional) – Supplies extra arguments to fitfunc
fcn_kwds (dict, optional) – Supplies keyword arguments to fitfunc

Notes

It is not necessary to supply fitfunc to create a Model iff you are inheriting Model and are also overriding Model.model.

property fitfunc: The fit-function associated with the model

logp()[source]: The model can add additional terms to it’s log-probability. However, it should _not_ include logp from any of the parameters. That is calculated by Objective.logp.

model(x, p=None, x_err=None)[source]

Calculates a generative model(dependent variable), given parameters and independent variables.

Parameters

x (array-like) – Independent variable.
p (array-like or refnx.analysis.Parameters) – Parameters to supply to the generative function.
x_err (optional) – Uncertainty in x.

Returns

generative

Return type

array-like or float

Notes

The interpretation of x, p, and x_err is up to the fitfunc supplied during construction of this object (or the overridden model method of this object).

property parameters: The refnx.analysis.Parameters set associated with the model.

class refnx.analysis.Objective(model, data, lnsigma=None, use_weights=True, transform=None, logp_extra=None, auxiliary_params=(), name=None)[source]

Bases: BaseObjective

Objective function for using with curvefitters such as refnx.analysis.curvefitter.CurveFitter.

Parameters

model (refnx.analysis.Model) – the generative model function. One can also provide an object that inherits refnx.analysis.Model.
data (refnx.dataset.Data1D) – data to be analysed.
lnsigma (float or refnx.analysis.Parameter, optional) –
Used if the experimental uncertainty (data.y_err) underestimated by a constant fractional amount. The experimental uncertainty is modified as:

s_n**2 = y_err**2 + exp(lnsigma * 2) * model**2

See Objective.logl for more details.
use_weights (bool) – use experimental uncertainty in calculation of residuals and logl, if available. If this is set to False, then you should also set self.lnsigma.vary = False, it will have no effect on the fit.
transform (callable, optional) – the model, data and data uncertainty are transformed by this function before calculating the likelihood/residuals. Has the signature transform(data.x, y, y_err=None), returning the tuple (transformed_y, transformed_y_err).
logp_extra (callable, optional) – user specifiable log-probability term. This contribution is in addition to the log-prior term of the model parameters, and model.logp, as well as the log-likelihood of the data. Has signature: logp_extra(model, data). The model will already possess updated parameters. Beware of including the same log-probability terms more than once.
auxiliary_params ({sequence, Parameters}, optional) – Extra Parameter objects that are involved with curvefitting, but aren’t directly included as part of the model. See notes for more details.
name (str) – Name for the objective.

Notes

For parallelisation logp_extra needs to be picklable.

auxiliary_params are included in calculating the Objective.logp term, are present in Objective.varying_parameters(), and are modified by Curvefitter during an analysis. Their main purpose is to aid in making constraints in models.

chisqr(pvals=None)[source]

Calculates the chi-squared value for a given fitting system.

Parameters: pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
Returns: chisqr – Chi-squared value, np.sum(residuals**2).
Return type: np.ndarray

corner(**kwds)[source]

Corner plot of the chains belonging to the Parameters. Requires the corner and matplotlib packages.

Parameters: kwds (dict) – passed directly to the corner.corner function
Returns: fig
Return type: matplotlib.Figure object.

covar(target='residuals')[source]

Estimates the covariance matrix of the Objective.

Parameters: target ({"residuals", "nll", "nlpost"}) – Specifies what approach should be used to estimate covariance.
Returns: covar – Covariance matrix
Return type: np.ndarray

Notes

For most purposes the Jacobian of the ‘residuals’ should be used to calculate the covariance matrix, estimated as J.T x J. If an Objective cannot calculate residuals then the covariance matrix can be estimated by inverting a Hessian matrix created from either the ‘nll’ or ‘nlpost’ methods. The default ‘residuals’ approach falls back to ‘nll’ if a problem is experienced. The default ‘residuals’ setting is preferred as the other settings can sometimes experience instabilities during Hessian estimation with numerical differentiation.

generative(pvals=None)[source]

Calculate the generative (dependent variable) function associated with the model.

Parameters: pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
Returns: model
Return type: np.ndarray

logl(pvals=None)[source]

Calculate the log-likelhood of the system

The major component of the log-likelhood probability is from the data. Extra potential terms are added on from the Model, self.model.logp, and the user specifiable logp_extra function.

Parameters: pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
Returns: logl – log-likelihood probability
Return type: float

Notes

The log-likelihood is calculated as:

logl = -0.5 * np.sum(((y - model) / s_n)**2
                     + np.log(2 * pi * s_n**2))
logp += self.model.logp()
logp += self.logp_extra(self.model, self.data)

where

s_n**2 = y_err**2 + exp(2 * lnsigma) * model**2

logp(pvals=None)[source]

Calculate the log-prior of the system

Parameters: pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
Returns: logp – log-prior probability
Return type: float

Notes

The log-prior is calculated as:

logp = np.sum(param.logp() for param in
                 self.varying_parameters())

logpost(pvals=None)[source]

Calculate the log-probability of the curvefitting system

Parameters: pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
Returns: logpost – log-probability
Return type: float

Notes

The overall log-probability is the sum of the log-prior and log-likelihood. The log-likelihood is not calculated if the log-prior is impossible (logp == -np.inf).

nll(pvals=None)[source]

Negative log-likelihood function

Parameters: pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
Returns: nll – negative log-likelihood
Return type: float

property npoints: int the number of points in the dataset.

property parameters: refnx.analysis.Parameters, all the Parameters contained in the fitting system.

pgen(ngen=1000, nburn=0, nthin=1, random_state=None)[source]

Yield random parameter vectors from the MCMC samples. The objective state is not altered.

Parameters

ngen (int, optional) – the number of samples to yield. The actual number of samples yielded is min(ngen, chain.size)
nburn (int, optional) – discard this many steps from the start of the chain
nthin (int, optional) – only accept every nthin samples from the chain
random_state ({int, np.random.Generator, None}) – random number generator that picks the samples

Yields

pvec (np.ndarray) – A randomly chosen parameter vector

plot(pvals=None, samples=0, parameter=None, fig=None)[source]

Plot the data/model.

Requires matplotlib be installed.

Parameters

pvals (np.ndarray, optional) – Numeric values for the Parameter’s that are varying
samples (number) – If the objective has been sampled, how many samples you wish to plot on the graph.
parameter (refnx.analysis.Parameter) – Creates an interactive plot for the Parameter in Jupyter. Requires ipywidgets be installed. Use with %matplotlib notebook/qt.
fig (Figure instance, optional) – If fig is not supplied then a new figure is created. Otherwise the graph is created on the current axes on the supplied figure.

Returns

fig, ax – matplotlib figure and axes objects.

Return type

matplotlib.Figure, matplotlib.Axes

prior_transform(u)[source]

Calculate the prior transform of the system.

Transforms uniform random variates in the unit hypercube, u ~ uniform[0.0, 1.0), to the parameter space of interest, according to the priors on the varying parameters.

Parameters: u (array-like) – Size of the varying parameters
Returns: pvals – Scaled parameter values
Return type: array-like

Notes

If a parameter has bounds, x ~ Unif[-10, 10) then the scaling from u to x is done as follows:

x = 2. * u - 1.  # scale and shift to [-1., 1.)
x *= 10.  # scale to [-10., 10.)

residuals(pvals=None)[source]

Calculates the residuals for a given fitting system.

Parameters: pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters
Returns: residuals – Residuals, (data.y - model) / y_err.
Return type: np.ndarray

setp(pvals)[source]

Set the parameters from pvals.

Parameters: pvals (array-like or refnx.analysis.Parameters) – values for the varying or entire set of parameters

varying_parameters()[source]

Returns: varying_parameters – The varying Parameter objects allowed to vary during the fit.
Return type: refnx.analysis.Parameters

property weighted: bool Does the data have weights (data.y_err), and is the objective using them?

class refnx.analysis.PDF(rv)[source]

Bases: Bounds

A class that describes the probability distribution for a parameter.

Parameters: rv (scipy.stats.rv_continuous or Object) – A continuous probability distribution. If rv is not an rv_continuous, then it must implement the logpdf and rvs methods.

Examples

>>> import scipy.stats as stats
>>> from refnx.analysis import Parameter, PDF
>>> p = Parameter(0.5)
>>> # use a normal distribution for prior, mean=5 and sd=1.
>>> p.bounds = PDF(stats.norm(5, 1))
>>> p.logp(), stats.norm.logpdf(0.5, 5, 1)
(-11.043938533204672, -11.043938533204672)

invcdf(q)[source]

Calculate the inverse of the cumulative distribution function for the uniform prior.

This is also known as the percent point function, or ppf.

Parameters: q (array-like) – lower tail probability
Returns: x – quantile corresponding to the lower tail probability q.
Return type: array-like

logp(val)[source]

Calculate the log-prior probability of a value with the probability distribution.

Parameters: val (float or array-like) – variate to calculate the log-probability for
Returns: arr – The log-probabilty corresponding to each of the variates
Return type: float or np.ndarray

rvs(size=1, random_state=None)[source]

Generate random variates from the probability distribution.

Parameters

size (int or tuple) – Specifies the number, or array shape, of random variates to return.
random_state (None, int, float or np.random.RandomState) – For reproducible sampling

Returns

arr – Random variates from within the probability distribution.

Return type

array-like

valid(val)[source]

Checks whether a parameter value is within the support of the distribution. If it isn’t then it returns a random value that is within the support.

Parameters: val (array-like) – values to examine
Returns: valid – valid values within the support
Return type: array-like

class refnx.analysis.Parameter(value=0.0, name=None, bounds=None, vary=False, constraint=None, units=None)[source]

Bases: BaseParameter

Class for specifying a variable.

Parameters

value (float, optional) – Numerical Parameter value.
name (str, optional) – Name of the parameter.
bounds (refnx.analysis.Bounds, tuple, optional) – Sets the bounds for the parameter. Either supply a refnx.analysis.Bounds object (or one of its subclasses), or a (lower_bound, upper_bound) tuple.
vary (bool, optional) – Whether the Parameter is fixed during a fit.
constraint (expression, optional) – Python expression used to constrain the value during the fit.

property bounds: The bounds placed on this Parameter.

property constraint

corner()[source]

Plots a histogram of the Parameter posterior.

Requires matplotlib be installed.

Returns: fig, ax – matplotlib figure and axes objects.
Return type: matplotlib.Figure, matplotlib.Axes

logp(pval=None)[source]

Calculate the log probability of the parameter

Returns: prob – log probability of the parameter
Return type: float

range(lower, upper)[source]

Sets the lower and upper limits on the Parameter

Parameters

lower (float) – lower bound
upper (float) – upper bound

Return type

None

set_constraint(constraint, args=())[source]

Constrains the Parameter.

Parameters

constraint ({None, expression, callable}) –
One of:
- None, remove all constraints on this Parameter.
- expression, an algebraic Python expression used to constrain the
  Parameter value.
- callable, a Python function, constraint(*args) that returns
  a float value for the Parameter value.
The callable should not use this Parameter in any of its calculations; nor should the callable use any Parameter in its calculations that possesses a constraint that would eventually lead back to this Parameter. If these conditions aren’t met then circular dependencies with undefined side effects will be created. A Parameter cannot ultimately depend on itself.
args (tuple) – a sequence of arguments given to constraint if it is a callable. This sequence can contain other Parameters, numbers, arrays, objects, etc. It is strongly recommended that this sequence is not nested. This is because args is searched for other Parameter objects, which are stored internally within this object as dependencies. Any constraints that these dependencies may have are evaluated before they are provided to the callable. If the callable uses Parameters that are not immediately retrievable from args (e.g. stored as attributes in an object), and those Parameters have constraints themselves, then those Parameters will likely have stale values, resulting in undefined behaviour.

Examples

>>> from refnx.analysis import Parameter
>>> a = Parameter(1)
>>> b = Parameter(2)
>>> a.set_constraint(np.sin(2*b))
>>> print(a.value)
-0.7568024953079282

>>> def c(*args):
...     return np.sin(args[0] * args[1])

>>> a.set_constraint(c, args=(2, b))
>>> print(a.value)
-0.7568024953079282

setp(value=None, vary=None, bounds=None, constraint=None)[source]

Set several attributes of the parameter at once.

Parameters

value (float, optional) – Numerical Parameter value.
vary (bool, optional) – Whether the Parameter is fixed during a fit.
bounds (refnx.analysis.Bounds, tuple, optional) – Sets the bounds for the parameter. Either supply a refnx.analysis.Bounds object (or one of its subclasses), or a (lower_bound, upper_bound) tuple.
constraint (expression, optional) – Python expression used to constrain the value during the fit.

valid(val)[source]

The truth of whether a value would satisfy the bounds for this parameter.

Parameters: val (float) – A proposed value
Returns: valid – np.isfinite(Parameter.logp(val))
Return type: bool

property value: The numeric value of the Parameter

property vary: Whether this Parameter is allowed to vary

class refnx.analysis.Parameters(data=(), name=None)[source]

Bases: UserList

A sequence of Parameters

Parameters

data (sequence) – A sequence of Parameter or Parameters
name (str) – Name of this Parameters instance

constrained_parameters()[source]

Returns: constrained_parameters – A list of unique Parameter contained in this object that have constraints.
Return type: list

flattened(unique=False)[source]

A list of all the Parameter contained in this object, including those contained within Parameters at any depth.

Parameters: unique (bool) – The list will only contain unique objects.
Returns: params – A list of Parameter contained in this object.
Return type: list

logp()[source]

Calculates logp for all the parameters

Returns: logp – Log probability for all the parameters
Return type: float

names()[source]

Returns: names – A list of all the names of all the Parameter contained in this object.
Return type: list

property nparam

nvary()[source]

Returns: nvary – The number of Parameter contained in this object that are allowed to vary.
Return type: int

property parameters

pgen(ngen=1000, nburn=0, nthin=1, random_state=None)[source]

Yield random parameter vectors from MCMC samples.

Parameters

ngen (int, optional) – the number of samples to yield. The actual number of samples yielded is min(ngen, chain.size)
nburn (int, optional) – discard this many steps from the start of the chain
nthin (int, optional) – only accept every nthin samples from the chain
random_state ({int, np.random.Generator, None}) – random number generator that picks the samples

Yields

pvec (np.ndarray) – A randomly chosen parameter vector

Notes

The entire parameter vector is yielded, not only the varying parameters. The reason for this is that some parameters may possess a chain if they are not varying, because they are controlled by a constraint.

property pvals: An array containing the values of all the Parameter in this object.

varying_parameters()[source]

Unique list of varying parameters

Returns: p – Unique list of varying parameters
Return type: list

class refnx.analysis.Transform(form)[source]

Bases: object

Mathematical transforms of numeric data.

Parameters

form (None or str) –

One of:

’lin’
No transform is made

’logY’
log10 transform

’YX4’
YX**4 transform

’YX2’
YX**2 transform

None
No transform is made

Notes

You ask for a transform to be carried out by calling the Transform object directly.

>>> x = np.linspace(0.01, 0.1, 11)
>>> y = np.linspace(100, 1000, 11)
>>> y_err = np.sqrt(y)
>>> t = Transform('logY')
>>> ty, te = t(x, y, y_err)
>>> ty
array([2.        , 2.2787536 , 2.44715803, 2.56820172, 2.66275783,
       2.74036269, 2.80617997, 2.86332286, 2.91381385, 2.95904139,
       3.        ])

refnx.analysis.autocorrelation_chain(chain, nburn=0, nthin=1)[source]

Calculate the autocorrelation function

Parameters: chain (np.ndarray) – The MCMC chain - (nsteps, nwalkers, ndim) or (nsteps, ntemps, nwalkers, ndim)
Returns: acfs – The autocorrelation function, acfs.shape=(lags, nvary)
Return type: np.ndarray

refnx.analysis.fitfunc(f)[source]: A decorator that can be used to say if something is a fitfunc.

refnx.analysis.integrated_time(x, c=5, tol=50, quiet=False, has_walkers=True)[source]

Estimate the integrated autocorrelation time of a time series.

This estimate uses the iterative procedure described on page 16 of Sokal’s notes to determine a reasonable window size.

Parameters

x (numpy.ndarray) – The time series. If 2-dimensional, the array dimesions are interpreted as (n_step, n_walker) unless has_walkers==False, in which case they are interpreted as (n_step, n_param). If 3-dimensional, the dimensions are interperted as (n_step, n_walker, n_param).
c (Optional[float]) – The step size for the window search. (default: 5)
tol (Optional[float]) – The minimum number of autocorrelation times needed to trust the estimate. (default: 50)
quiet (Optional[bool]) – This argument controls the behavior when the chain is too short. If True, give a warning instead of raising an AutocorrError. (default: False)
has_walkers (Optional[bool]) – Whether the last axis should be interpreted as walkers or parameters if x has 2 dimensions. (default: True)

Returns

An estimate of the integrated autocorrelation time of: the time series x.

Return type

float or array

Raises

AutocorrError: If the autocorrelation time can’t be reliably estimated: from the chain and quiet is False. This normally means that the chain is too short.

refnx.analysis.is_parameter(x)[source]: Test for Parameter-ness.

refnx.analysis.is_parameters(x)[source]: Test for Parameter-ness.

refnx.analysis.load_chain(f)[source]

Loads a chain from disk. Does not change the state of a CurveFitter object.

Parameters: f (str or file-like) – File containing the chain.
Returns: chain – The loaded chain - (nsteps, nwalkers, ndim) or (nsteps, ntemps, nwalkers, ndim)
Return type: array

refnx.analysis.possibly_create_parameter(value, name='', bounds=None, vary=False, constraint=None, units=None)[source]

If supplied with a Parameter return it. If supplied with float, wrap it in a Parameter instance.

Parameters

value (float, optional) – Numerical Parameter value.
name (str, optional) – Name of the parameter.
bounds (refnx.analysis.Bounds, tuple, optional) – Sets the bounds for the parameter. Either supply a refnx.analysis.Bounds object (or one of its subclasses), or a (lower_bound, upper_bound) tuple.
vary (bool, optional) – Whether the Parameter is fixed during a fit.
constraint (expression, optional) – Python expression used to constrain the value during the fit.
units (str) – Units for the Parameter.

Returns

parameter

Return type

refnx.analysis.Parameter

refnx.analysis.process_chain(objective, chain, nburn=0, nthin=1, flatchain=False)[source]

Process the chain produced by a sampler for a given Objective

Parameters

objective (refnx.analysis.Objective) – The Objective function that the Posterior was sampled for
chain (array) – The MCMC chain
nburn (int, optional) – discard this many steps from the start of the chain
nthin (int, optional) – only accept every nthin samples from the chain
flatchain (bool, optional) – collapse the walkers down into a single dimension.

Returns

[(param, stderr, chain)] – List of (param, stderr, chain) tuples. If isinstance(objective.parameters, Parameters) then param is a Parameter instance. param.value, param.stderr and param.chain will contain the median, stderr and chain samples, respectively. Otherwise param will be a float representing the median of the chain samples. stderr is the half width of the [15.87, 84.13] spread (similar to standard deviation) and chain is an array containing the MCMC samples for that parameter.

Return type

list

Notes

The chain should have the shape (iterations, nwalkers, nvary) or (iterations, ntemps, nwalkers, nvary) if parallel tempering was employed. The burned and thinned chain is created via: chain[nburn::nthin]. Note, if parallel tempering is employed, then only the lowest temperature of the parallel tempering chain is processed and returned as it corresponds to the (lowest energy) target distribution. If flatten is True then the burned/thinned chain is reshaped and arr.reshape(-1, nvary) is returned. This function has the effect of setting the parameter stderr’s.

refnx.analysis.pymc_model(objective)[source]

Creates a pymc model from an Objective.

Requires aesara and pymc be installed. This is an experimental feature.

Parameters: objective (refnx.analysis.Objective) –
Returns: model
Return type: pymc.Model

Notes

The varying parameters are renamed ‘p0’, ‘p1’, etc, as it’s vital in pymc that all parameters have their own unique name.

refnx.analysis.sequence_to_parameters(seq)[source]

Flattens and converts sequence of float/Parameter to a Parameters instance.

Parameters: seq (sequence) – Sequence (possibly nested) of float/Parameter
Returns: params
Return type: refnx.analysis.Parameters