This module implements Markov Chain Monte Carlo (MCMC) algorithms which are used to sample from a target density. All samplers operates on log-densities instead of densities both for efficiency and numerical stability reasons.

Implements the No-U-Turn Sampler (NUTS) of Hoffman&Gelman[13]. This is an adaptive MCMC algorithm which belongs to the class of Hamiltonian Monte Carlo (HMC) samplers. As such the gradient of the log-density is required to carry out the simulation. The sci.diff module allows for the gradient to be automatically computed without approximations with minimal user effort.

The algorithm is implemented according to the paper mentioned above but additionally a diagonal (or optionally dense) mass matrix is estimated during the adaptation phase to improve the performance of the sampler. Similar improvements have been implemented in the Stan Project implementation of NUTS.

The argument `rng`

is a prng object, the function `f(x, grad)`

sets `grad`

to the gradient of the
log-density at `x`

and returns the value of the log-density at `x`

. Both `x`

and `grad`

must have the same length of
`theta0`

which is the starting value of the Markov Chain (adaptation phase). The log-density must evaluate to a finite value at `theta0`

. The
argument `options`

is a table and the algorithm makes use of the following string keys (defaults in square brackets):

Key | Description |
---|---|

stopadapt | [1024] number of adaptation samples, 0 or power of 2 >= 64 |

stop | number of samples or function(theta) |

olstat | on-line statistical estimators |

mass | ["diagonal"] can be "diagonal" or "dense" |

eps | [rough guess] initial value for eps |

cov | [nil] initial covariance: densemass^-1 (matrix) |

var | [nil] initial variance: diagonalmass^-1 (vector) |

delta | [0.8] for adaptation algorithm, see NUTS paper |

gamma | [0.05] for adaptation algorithm, see NUTS paper |

t0 | [10] for adaptation algorithm, see NUTS paper |

k | [0.75] for adaptation algorithm, see NUTS paper |

dlmax | [1000] to cap tree growth, see deltamax in NUTS paper |

The Markov Chain is started at `theta0`

and the adaptation takes place for the first `stopadapt`

samples (`stopadadapt`

may be
`0`

). Then the parameters determining the evolution of the Markov Chain are fixed and the sampling phase follows. If `stop`

is a number then
`stop`

samples are drawn from the chain. If `stop`

is a function, each drawn sample is passed to `stop`

and the algorithm terminates
when `stop`

evaluates to `true`

. In any case `olstat:push(theta)`

is called for every drawn sample `theta`

(samples from the
adaptation phase are excluded). Using the on-line statistical accumulators of `sci.stat`

allows to estimate statistics of the target distribution. When the
algorithm terminates it returns: the last drawn sample, the adapted eps and the adapted inverse mass (variance vector if the mass is diagonal or covariance matrix if
the mass is dense). Notice that this information allows to re-start the chain if so desired by passing these returned values in the `options`

argument and
setting `stopadapt`

to `0`

.

The `mass`

option determines whether the algorithm uses a diagonal or dense mass matrix for the Hamiltonian dynamics. The `eps`

option allows to
provide an initial guess for the discretization step (otherwise a reasonable value is automatically determined). The options `cov`

and
`var`

allows to specify an initial estimates for the mass matrix which is otherwise set equal to the identity matrix. We refer to the NUTS paper for the
remaining options, the default values are perfectly adequate in most cases.

This example illustrates an use of `mcmc.nuts()`

to sample from a simple bivariate density:

```
jit.opt.start("loopunroll=60") -- Set optimization flags:
-- Load required modules:
local math = require "sci.math".generic
local diff = require "sci.diff"
local alg = require "sci.alg"
local prng = require "sci.prng"
local stat = require "sci.stat"
local mcmc = require "sci.mcmc"
local rng = prng.std()
local N = 50000 -- 50000 samples (after adaptation).
local theta0 = alg.tovec{ 0.5,0.5 } -- Starting value of chain.
local mystat = stat.olcov(2) -- Compute mean and covariance.
-- Logarithm of target density:
local function logpdf(x)
return 2*math.log(x[1]) - x[1]*x[2]^2 - x[2]^2 + 2*x[2] - 4*x[1]
end
-- Automatic gradient computation (vector of 2 elements):
local dlogpdf = diff.gradientf(logpdf, 2)
local start = os.clock() -- To time the simulation.
mcmc.nuts(rng, dlogpdf, theta0, {
stop = N,
olstat = mystat,
})
local mu, sig = alg.vec(2), alg.mat(2, 2)
mystat:mean(mu) -- Compute mean of target density.
mystat:cov(sig) -- Compute covariance of target density.
local time = os.clock() - start
print("mean:")
print(mu:width())
print("covariance:")
print(sig:width())
--> mean:
--> +0.650281,+0.637480
--> covariance:
--> +0.159994,-0.052265
--> -0.052265,+0.338427
print("CPU secs:", time) --> 1/4 of a second here with LuaJIT 2.1.
```