Analysing lipid membrane data

This Jupyter notebook demonstrates the utility of refnx for:

  • the co-refinement of three contrast variation datasets of a DMPC (1,2-dimyristoyl-sn-glycero-3-phosphocholine) bilayer measured at the solid-liquid interface with a common model

  • the use of the LipidLeaflet component to parameterise the model in terms of physically relevant parameters

  • the use of Bayesian Markov Chain Monte Carlo (MCMC) to investigate the posterior distribution of the curvefitting system

  • the intrinsic usefulness of Jupyter notebooks to facilitate reproducible research in scientific data analysis

The first step in most Python scripts is to import the modules and functions that are going to be used.

# use matplotlib for plotting
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import os.path

import refnx, scipy

# the analysis module contains the curvefitting engine
from refnx.analysis import CurveFitter, Objective, Parameter, GlobalObjective, process_chain

# the reflect module contains functionality relevant to reflectometry
from refnx.reflect import SLD, ReflectModel, Structure, LipidLeaflet

# the ReflectDataset object will contain the data
from refnx.dataset import ReflectDataset

For the analysis to be exactly reproducible, the same package versions must be used. The conda package manager and pip can both be used to ensure this.

# version numbers used in this analysis
refnx.version.version, scipy.version.version
('0.1.21.dev0+7e2809e', '1.6.3')

The ReflectDataset class is used to represent a dataset. A ReflectDataset can be constructed by supplying a filename.

pth = os.path.join(os.path.dirname(refnx.__file__), 'analysis', 'test')

data_d2o = ReflectDataset(os.path.join(pth, 'c_PLP0016596.dat'))
data_d2o.name = "d2o"

data_hdmix = ReflectDataset(os.path.join(pth, 'c_PLP0016601.dat'))
data_hdmix.name = "hdmix"

data_h2o = ReflectDataset(os.path.join(pth, 'c_PLP0016607.dat'))
data_h2o.name = "h2o"

An SLD object is used to represent the Scattering Length Density of a material. It has real and imag attributes because the SLD is a complex number, with the imaginary part accounting for absorption. The units of SLD are \(10^{-6} \mathring{A}^{-2}\).

The real and imag attributes are Parameter objects. Each Parameter object contains the parameter value, whether it is allowed to vary, any interparameter constraints, and the bounds applied to the parameter. The bounds applied to a parameter are probability distributions which encode the log-prior probability of the parameter having a certain value.

si = SLD(2.07 + 0j)
sio2 = SLD(3.47 + 0j)

# the following represent the solvent contrasts used in the experiment
d2o = SLD(6.36 + 0j)
h2o = SLD(-0.56 + 0j)
hdmix = SLD(2.07 + 0j)

# We want the `real` attribute parameter to vary in the analysis, and we want to apply
# uniform bounds. The `setp` method of a Parameter is a way of changing many aspects of
# Parameter behaviour at once.
d2o.real.setp(vary=True, bounds=(6.1, 6.36))
d2o.real.name = 'd2o SLD'
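
To make the link between bounds and prior probability concrete, here is a minimal sketch of the log-prior implied by uniform bounds. The helper `uniform_logp` is hypothetical and written only for illustration; it is not a refnx function, and refnx supplies its own bounds classes.

```python
import math


def uniform_logp(value, lower, upper):
    """Log-prior of `value` under a uniform distribution on [lower, upper].

    Hypothetical helper for illustration; not part of refnx.
    """
    if lower <= value <= upper:
        # a normalised uniform density is 1 / (upper - lower)
        return -math.log(upper - lower)
    # values outside the bounds have zero prior probability
    return -math.inf


# the bounds applied to the d2o SLD above
inside = uniform_logp(6.2, 6.1, 6.36)
outside = uniform_logp(6.5, 6.1, 6.36)
print(inside, outside)
```

Every value inside the bounds is equally probable, and any value outside them is rejected outright, which is why a sampler never strays beyond uniform bounds.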

The LipidLeaflet class is used to describe a single lipid leaflet in our interfacial model. A leaflet consists of a head group region and a tail group region. Since we are studying a bilayer, inner and outer LipidLeaflet objects are required.

# Parameter for the area per molecule each DMPC molecule occupies at the surface. We
# use the same area per molecule for the inner and outer leaflets.
apm = Parameter(56, 'area per molecule', vary=True, bounds=(52, 65))

# the sum of scattering lengths for the lipid head and tail in Angstrom.
b_heads = Parameter(6.01e-4, 'b_heads')
b_tails = Parameter(-2.92e-4, 'b_tails')

# the volume occupied by the head and tail groups in cubic Angstrom.
v_heads = Parameter(319, 'v_heads')
v_tails = Parameter(782, 'v_tails')

# the head and tail group thicknesses.
inner_head_thickness = Parameter(9, 'inner_head_thickness', vary=True, bounds=(4, 11))
outer_head_thickness = Parameter(9, 'outer_head_thickness', vary=True, bounds=(4, 11))
tail_thickness = Parameter(14, 'tail_thickness', vary=True, bounds=(10, 17))

# finally construct a `LipidLeaflet` object for the inner and outer leaflets.
# Note that here the inner and outer leaflets use the same area per molecule,
# same tail thickness, etc, but this is not necessary if the inner and outer
# leaflets are different.
inner_leaflet = LipidLeaflet(apm,
                             b_heads, v_heads, inner_head_thickness,
                             b_tails, v_tails, tail_thickness,
                             3, 3)

# we reverse the monolayer for the outer leaflet because the tail groups face upwards
outer_leaflet = LipidLeaflet(apm,
                             b_heads, v_heads, outer_head_thickness,
                             b_tails, v_tails, tail_thickness,
                             3, 0, reverse_monolayer=True)
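
The physical parameters above can be turned into the quantities a slab model needs with some simple arithmetic. The sketch below assumes the usual relations, SLD = b / V and volume fraction = V / (area per molecule × thickness), with the remainder of each region taken up by solvent. The function names are hypothetical, and this is an illustration of the arithmetic rather than the LipidLeaflet implementation.

```python
# starting values of the Parameters defined above
apm = 56.0                            # area per molecule, Angstrom**2
b_heads, v_heads = 6.01e-4, 319.0     # Angstrom, Angstrom**3
b_tails, v_tails = -2.92e-4, 782.0    # Angstrom, Angstrom**3
head_thickness, tail_thickness = 9.0, 14.0  # Angstrom


def region_sld(b, v):
    """SLD in units of 1e-6 Angstrom**-2: scattering length over volume."""
    return b / v * 1e6


def volume_fraction(v, area, thickness):
    """Fraction of the region volume (area * thickness) filled by lipid."""
    return v / (area * thickness)


sld_heads = region_sld(b_heads, v_heads)
sld_tails = region_sld(b_tails, v_tails)
vf_heads = volume_fraction(v_heads, apm, head_thickness)
vf_tails = volume_fraction(v_tails, apm, tail_thickness)

print(f"heads: SLD={sld_heads:.3f}, volume fraction={vf_heads:.3f}")
print(f"tails: SLD={sld_tails:.3f}, volume fraction={vf_tails:.3f}")
```

With these starting values the head region is roughly two-thirds lipid, while the tail region is almost completely filled. A volume fraction above 1 would be unphysical, which is one reason the area per molecule and thickness bounds need to be chosen together.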

The Slab Component represents a layer of uniform scattering length density of a given thickness in our interfacial model. Here we make Slabs from SLD objects, but other approaches are possible.

# Slab constructed from SLD object.
sio2_slab = sio2(15, 3)
sio2_slab.thick.setp(vary=True, bounds=(2, 30))
sio2_slab.thick.name = 'sio2 thickness'
sio2_slab.rough.setp(vary=True, bounds=(0, 7))
sio2_slab.rough.name = 'sio2 roughness'
sio2_slab.vfsolv.setp(0.1, vary=True, bounds=(0., 0.5))
sio2_slab.vfsolv.name = 'sio2 solvation'

solv_roughness = Parameter(3, 'bilayer/solvent roughness')
solv_roughness.setp(vary=True, bounds=(0, 5))
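
The solvation (vfsolv) of a Slab changes the SLD that the layer effectively presents in each contrast. As a rough sketch, assuming the material and solvent mix linearly by volume fraction, the effective SLD can be computed as follows. The function `overall_sld` is a hypothetical helper written for this example.

```python
def overall_sld(sld_material, sld_solvent, vfsolv):
    """Volume-fraction-weighted SLD of a solvated layer.

    Hypothetical helper; assumes linear mixing of material and solvent.
    """
    return (1.0 - vfsolv) * sld_material + vfsolv * sld_solvent


# SiO2 (SLD 3.47) with 10% solvent, in two of the contrasts used here
sio2_in_d2o = overall_sld(3.47, 6.36, 0.1)
sio2_in_h2o = overall_sld(3.47, -0.56, 0.1)
print(sio2_in_d2o, sio2_in_h2o)
```

The same oxide layer therefore looks different in each contrast, which is part of what makes contrast variation informative about solvation.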

Once all the Components have been constructed we can chain them together to compose a Structure object, which represents the interfacial structure of our system. We create a different Structure for each contrast. It is important to note that the Structures share many components, such as the LipidLeaflet objects. This means that the parameters used to construct those components are shared between all the Structures, which enables co-refinement of multiple datasets. An alternative way to carry this out would be to apply constraints to the underlying parameters, but this way is clearer. Note that the final component of each Structure is a Slab created from the solvent SLD; we give those slabs zero thickness.

s_d2o = si | sio2_slab | inner_leaflet | outer_leaflet | d2o(0, solv_roughness)
s_hdmix = si | sio2_slab | inner_leaflet | outer_leaflet | hdmix(0, solv_roughness)
s_h2o = si | sio2_slab | inner_leaflet | outer_leaflet | h2o(0, solv_roughness)
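
The reason sharing components is enough for co-refinement is ordinary Python object identity: each Structure holds a reference to the same parameter object, so there is only one value to update. The toy sketch below uses a hypothetical `Param` class and plain lists as stand-ins; it does not use the refnx classes.

```python
class Param:
    """Hypothetical stand-in for a fit parameter; not the refnx class."""

    def __init__(self, value, name):
        self.value = value
        self.name = name


apm = Param(56.0, "area per molecule")
solvents = {"d2o": Param(6.36, "d2o SLD"), "h2o": Param(-0.56, "h2o SLD")}

# each toy "structure" is just the list of parameter objects it depends on
structures = {name: [apm, solvent] for name, solvent in solvents.items()}

# a fitter only has to update the single shared object ...
apm.value = 58.0

# ... and every structure sees the new value, with no explicit constraint
seen = [params[0].value for params in structures.values()]
shared = structures["d2o"][0] is structures["h2o"][0]
print(seen, shared)
```

Only the solvent parameters differ between the toy structures, mirroring the way the three Structures above differ only in their final solvent Slab.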

The Structures created in the previous step describe the interfacial structure. These Structures are used to create ReflectModel objects that know how to apply resolution smearing, scaling factors and background.

model_d2o = ReflectModel(s_d2o)
model_hdmix = ReflectModel(s_hdmix)
model_h2o = ReflectModel(s_h2o)

model_d2o.scale.setp(vary=True, bounds=(0.9, 1.1))

model_d2o.bkg.setp(vary=True, bounds=(-1e-6, 1e-6))
model_hdmix.bkg.setp(vary=True, bounds=(-1e-6, 1e-6))
model_h2o.bkg.setp(vary=True, bounds=(-1e-6, 1e-6))
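
To see what the scale and background parameters do, consider the following sketch. It assumes the modelled reflectivity is the theoretical reflectivity multiplied by the scale factor, with a constant background added; resolution smearing is left out. The function name and the reflectivity values are made up for illustration.

```python
def apply_scale_and_bkg(r_theory, scale=1.0, bkg=0.0):
    """Scale a list of theoretical reflectivities and add a flat background.

    Hypothetical helper; resolution smearing is deliberately omitted.
    """
    return [scale * r + bkg for r in r_theory]


# made-up reflectivities spanning the range seen in a typical measurement
r_theory = [1.0, 1e-3, 1e-6]
r_model = apply_scale_and_bkg(r_theory, scale=1.02, bkg=5e-7)
print(r_model)
```

The background is negligible near the critical edge but becomes a large part of the signal where the reflectivity is low, so the two parameters are determined by different regions of the curve.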

An Objective is constructed from a ReflectDataset and a ReflectModel. Amongst other things, Objectives can calculate chi-squared, the log-likelihood probability and the log-prior probability. We then combine all the individual Objectives into a GlobalObjective.

objective_d2o = Objective(model_d2o, data_d2o)
objective_hdmix = Objective(model_hdmix, data_hdmix)
objective_h2o = Objective(model_h2o, data_h2o)

global_objective = GlobalObjective([objective_d2o, objective_hdmix, objective_h2o])
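
As an illustration of the statistics involved, the sketch below computes chi-squared and a Gaussian log-likelihood for two toy datasets and sums them, which is the idea behind combining Objectives. It assumes independent data points with known uncertainties. The functions and numbers are made up for this example and are not the refnx implementation.

```python
import math


def chi_squared(y, y_err, y_model):
    """Sum of squared, uncertainty-weighted residuals."""
    return sum(((yi - mi) / ei) ** 2 for yi, ei, mi in zip(y, y_err, y_model))


def log_likelihood(y, y_err, y_model):
    """Gaussian log-likelihood for independent points with known errors."""
    return -0.5 * sum(
        ((yi - mi) / ei) ** 2 + math.log(2 * math.pi * ei ** 2)
        for yi, ei, mi in zip(y, y_err, y_model)
    )


# two toy "datasets": (data, uncertainties, model values); made-up numbers
set_a = ([1.0, 0.5], [0.1, 0.1], [1.1, 0.5])
set_b = ([0.2], [0.05], [0.25])

# a global fit simply adds the contributions from each dataset
global_chi2 = chi_squared(*set_a) + chi_squared(*set_b)
global_logl = log_likelihood(*set_a) + log_likelihood(*set_b)
print(global_chi2, global_logl)
```

Because the contributions add, a parameter shared between datasets is pulled towards the value that best satisfies all of them at once.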

A CurveFitter object can perform least squares fitting, or MCMC sampling on the Objective used to construct it.

fitter = CurveFitter(global_objective)

We’ll just do a normal least squares fit here, using the fit method of the CurveFitter. MCMC sampling is left as an exercise for the reader.

55it [00:55,  1.01it/s]
/home/docs/checkouts/ RuntimeWarning: invalid value encountered in subtract
  df = fun(x) - f0
55it [00:55,  1.02s/it]
   covar: array([[ 1.40881107e-05,  3.44092042e-11,  3.35574641e-06,
        -7.49697214e-05, -3.95379150e-07,  1.39433128e-04,
        -3.32571010e-07,  1.43904190e-05, -1.58181480e-05,
        -1.02770602e-05,  6.03562595e-04,  1.47415440e-11, ...],
       [ 3.44092042e-11,  1.08613865e-14,  4.19838273e-09,
        -1.04932138e-08,  3.25856401e-10, -1.68158147e-09,
        -1.20914863e-09, -3.10729239e-09,  3.61228515e-08,
        -1.05656948e-11, -7.59284248e-09, -2.36714341e-15, ...],
       [ 3.35574641e-06,  4.19838273e-09,  5.08187196e-02,
        -2.15192293e-02,  1.00317243e-03,  4.85095710e-03,
        -2.78702700e-02,  4.22680803e-03, -1.30682478e-01,
         5.89740810e-05,  4.27172791e-01, -3.18466223e-09, ...],
       [-7.49697214e-05, -1.04932138e-08, -2.15192293e-02,
         4.88326835e-01,  1.18794801e-03,  1.31594659e-01,
         1.72838687e-02, -8.63460360e-02,  1.05566159e+00,
         1.17654503e-04, -3.51546654e+00,  8.70261141e-09, ...],
       [-3.95379150e-07,  3.25856401e-10,  1.00317243e-03,
         1.18794801e-03,  7.16804007e-05,  6.50292024e-04,
        -7.76943888e-04, -4.39580494e-04,  3.72717666e-03,
         2.01757459e-06, -8.38426958e-03, -1.16912162e-10, ...],
       [ 1.39433128e-04, -1.68158147e-09,  4.85095710e-03,
         1.31594659e-01,  6.50292024e-04,  5.55909501e-02,
        -2.52781316e-03, -2.94917981e-02,  3.47173397e-01,
        -6.11665174e-06, -1.13829996e+00,  8.12076104e-12, ...],
       [-3.32571010e-07, -1.20914863e-09, -2.78702700e-02,
         1.72838687e-02, -7.76943888e-04, -2.52781316e-03,
         2.47592593e-02, -3.43305019e-03,  9.47032944e-02,
         1.24547720e-05, -2.90336949e-01,  2.58312883e-09, ...],
       [ 1.43904190e-05, -3.10729239e-09,  4.22680803e-03,
        -8.63460360e-02, -4.39580494e-04, -2.94917981e-02,
        -3.43305019e-03,  2.33789439e-02, -2.94016708e-01,
        -1.92507838e-05,  8.87999390e-01,  1.70961714e-09, ...],
       [-1.58181480e-05,  3.61228515e-08, -1.30682478e-01,
         1.05566159e+00,  3.72717666e-03,  3.47173397e-01,
         9.47032944e-02, -2.94016708e-01,  3.97039254e+00,
         1.05257513e-04, -1.20298283e+01, -1.81242236e-08, ...],
       [-1.02770602e-05, -1.05656948e-11,  5.89740810e-05,
         1.17654503e-04,  2.01757459e-06, -6.11665174e-06,
         1.24547720e-05, -1.92507838e-05,  1.05257513e-04,
         2.02096553e-05, -5.64235158e-04,  4.72879209e-13, ...],
       [ 6.03562595e-04, -7.59284248e-09,  4.27172791e-01,
        -3.51546654e+00, -8.38426958e-03, -1.13829996e+00,
        -2.90336949e-01,  8.87999390e-01, -1.20298283e+01,
        -5.64235158e-04,  3.78878494e+01,  2.46082955e-08, ...],
       [ 1.47415440e-11, -2.36714341e-15, -3.18466223e-09,
         8.70261141e-09, -1.16912162e-10,  8.12076104e-12,
         2.58312883e-09,  1.70961714e-09, -1.81242236e-08,
         4.72879209e-13,  2.46082955e-08,  9.86665062e-15, ...],
       [ 1.45740278e-11,  2.44632369e-15,  4.06224036e-09,
        -2.11970994e-08,  2.50024064e-10, -8.04834801e-09,
        -2.88320051e-09,  5.24637767e-09, -7.93026181e-08,
         1.67781808e-12,  2.82992186e-07,  2.14604432e-16, ...]])
     fun: -2735.6638936471954
 message: 'Optimization terminated successfully.'
    nfev: 10962
     nit: 55
  stderr: array([3.75341321e-03, 1.04217976e-07, 2.25430077e-01, 6.98803860e-01,
       8.46642786e-03, 2.35777332e-01, 1.57350752e-01, 1.52901746e-01,
       1.99258439e+00, 4.49551502e-03, 6.15531067e+00, 9.93310154e-08, ...])
 success: True
       x: array([ 1.02310556e+00, -2.45241725e-07,  1.15976019e+01,  4.48604483e+00,
        1.21558140e-01,  5.66791999e+01,  1.03151633e+01,  1.37995997e+01,
        6.10597296e+00,  6.18304232e+00,  9.59289748e-01,  5.16705262e-07, ...])
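
The covar and stderr entries of the fit result are related: each standard error is the square root of the corresponding diagonal element of the covariance matrix. The short check below uses the first four diagonal covar values and the first four stderr values from the fit output; the variable names are our own.

```python
import math

# first four diagonal entries of the covar array from the fit output
covar_diag = [1.40881107e-05, 1.08613865e-14, 5.08187196e-02, 4.88326835e-01]
# first four stderr entries from the fit output
stderr_reported = [3.75341321e-03, 1.04217976e-07, 2.25430077e-01, 6.98803860e-01]

# standard error = square root of the variance on the diagonal
stderr_computed = [math.sqrt(v) for v in covar_diag]

matches = all(
    math.isclose(c, r, rel_tol=1e-6)
    for c, r in zip(stderr_computed, stderr_reported)
)
print(stderr_computed, matches)
```

The off-diagonal covariances indicate how strongly pairs of parameters are correlated, which the standard errors alone do not show.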