This module contains statistical functions and on-line statistical accumulators. All the algorithms are implemented with a focus on the numerical stability of the computations. It is thus preferable to use these estimation functions instead of re-creating the functionality via sci.alg operations.
Returns the sample mean of the vector x, which is based on the mathematical estimator: $$ \mu_x = \frac{1}{N}\sum_{i = 1}^{N} x_i $$
Returns the 'unbiased' sample variance of the vector x, which is based on the mathematical estimator: $$ \sigma^2_x = \frac{1}{N-1}\sum_{i = 1}^{N} \left(x_i - \mu_x\right)^2 $$ The vector x must have a length of at least 2.
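The two estimators above can be written out directly on plain Lua arrays. The following is only a minimal sketch: sample_mean and sample_var are hypothetical helper names, not functions of this module, and the two-pass form shown here is not necessarily the algorithm the module implements.

```lua
-- Hypothetical helpers operating on plain Lua arrays (not the module's API).

-- Sample mean: (1/N) * sum of x[i].
local function sample_mean(x)
  local n = #x
  assert(n >= 1, "need at least one observation")
  local sum = 0
  for i = 1, n do
    sum = sum + x[i]
  end
  return sum / n
end

-- 'Unbiased' sample variance: two passes, first the mean,
-- then the sum of squared deviations divided by N - 1.
local function sample_var(x)
  local n = #x
  assert(n >= 2, "need at least two observations")
  local mu = sample_mean(x)
  local ss = 0
  for i = 1, n do
    local d = x[i] - mu
    ss = ss + d * d
  end
  return ss / (n - 1)
end

local x = { 2, 4, 4, 4, 5, 5, 7, 9 }
print(sample_mean(x), sample_var(x))
```

Subtracting the mean before squaring avoids the cancellation that the textbook shortcut (sum of squares minus squared sum) can suffer from when the values are large relative to their spread.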
Returns the sample covariance between the vector x and the vector y, which is based on the mathematical estimator: $$ \text{cov}(x,y) = \frac{1}{N-1}\sum_{i = 1}^{N} \left(x_i - \mu_x\right)\left(y_i - \mu_y\right) $$ Both vectors must have the same length, which must be at least 2.
Returns the sample correlation between the vector x and the vector y, which is based on the mathematical estimator: $$ \rho(x,y) = \frac{\text{cov}(x,y)}{\sigma_x\sigma_y} $$ Both vectors must have the same length, which must be at least 2.
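As an illustration of the covariance and correlation estimators, here is a small sketch on plain Lua arrays. The names sample_cov and sample_cor are hypothetical and chosen for this example only; they are not the module's functions.

```lua
-- Hypothetical helpers operating on plain Lua arrays (not the module's API).

local function sample_mean(x)
  local sum = 0
  for i = 1, #x do
    sum = sum + x[i]
  end
  return sum / #x
end

-- Sample covariance with the 1/(N-1) normalization.
local function sample_cov(x, y)
  local n = #x
  assert(n == #y, "x and y must have the same length")
  assert(n >= 2, "need at least two observations")
  local mx, my = sample_mean(x), sample_mean(y)
  local s = 0
  for i = 1, n do
    s = s + (x[i] - mx) * (y[i] - my)
  end
  return s / (n - 1)
end

-- Sample correlation: cov(x, y) / (sigma_x * sigma_y),
-- where sigma^2 is the sample variance, i.e. cov of a vector with itself.
local function sample_cor(x, y)
  return sample_cov(x, y) / math.sqrt(sample_cov(x, x) * sample_cov(y, y))
end

local x = { 1, 2, 3, 4 }
local y = { 3, 5, 7, 9 }   -- y = 2 * x + 1, a perfectly linear relation
print(sample_cov(x, y), sample_cor(x, y))
```

Because y is an exact increasing linear function of x in this example, the correlation comes out as 1 up to rounding.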
The on-line accumulators listed below offer the same statistical estimators introduced above. They are called on-line because the running estimates can be updated to take new observations into account using the ol:push() function with constant complexity. Each on-line accumulator has an associated dimension which is set at creation time and cannot be changed afterwards.
In the following, ol refers to a generic on-line accumulator. All ol objects support the following 4 methods:
Takes x into account. If ol has been initialized with a dimension of 0 then x must be a Lua number. Otherwise x must be a vector of length equal to the dimension of ol.
Resets ol to its initial state (no observations taken into account yet).
Returns the dimension of ol.
Returns the number of observations taken into account in ol.
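To make the four common methods concrete, the sketch below builds a scalar (dimension 0) running-mean accumulator in plain Lua. Everything in it is an assumption made for illustration: new_running_mean is a hypothetical constructor, and the method names reset, dimension and count are placeholders for the reset, dimension and observation-count methods described above. Only push and mean follow names used in this text.

```lua
-- Hypothetical scalar accumulator; illustrates the interface only.
local RunningMean = {}
RunningMean.__index = RunningMean

local function new_running_mean()
  return setmetatable({ _n = 0, _mean = 0 }, RunningMean)
end

-- Constant-time update: no past observations are stored.
function RunningMean:push(x)
  self._n = self._n + 1
  self._mean = self._mean + (x - self._mean) / self._n
end

-- Back to the initial state (placeholder name).
function RunningMean:reset()
  self._n, self._mean = 0, 0
end

-- Dimension fixed at creation; 0 means scalar observations (placeholder name).
function RunningMean:dimension()
  return 0
end

-- Number of observations seen so far (placeholder name).
function RunningMean:count()
  return self._n
end

function RunningMean:mean()
  return self._mean
end

local ol = new_running_mean()
for _, v in ipairs({ 1, 2, 3, 4 }) do
  ol:push(v)
end
print(ol:count(), ol:mean())
```

The update in push touches only two stored numbers, which is what keeps the cost per observation constant regardless of how many observations came before.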
Returns an on-line accumulator of dimension dim which supports the following method:
If ol has been initialized with a dimension of 0 then the first variant is used and the running sample mean is returned. Otherwise y must be a vector of length equal to the dimension of ol; it is overwritten with the running sample mean.
Returns an on-line accumulator of dimension dim which supports ol:mean() and the following method:
If ol has been initialized with a dimension of 0 then the first variant is used and the running sample variance is returned. Otherwise y must be a vector of length equal to the dimension of ol; it is overwritten with the running sample variance.
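One well-known way to maintain a running variance in constant time per observation is Welford's algorithm. The sketch below applies it to the scalar (dimension 0) case. It is an illustration under stated assumptions, not a description of how this module is implemented, and new_running_var is a hypothetical constructor name.

```lua
-- Hypothetical scalar accumulator using Welford's update.
local function new_running_var()
  local n, mean, m2 = 0, 0, 0   -- m2 = running sum of squared deviations
  local acc = {}

  function acc:push(x)
    n = n + 1
    local delta = x - mean        -- deviation from the old mean
    mean = mean + delta / n
    m2 = m2 + delta * (x - mean)  -- uses both the old and the new mean
  end

  function acc:mean()
    return mean
  end

  function acc:var()
    assert(n >= 2, "need at least two observations")
    return m2 / (n - 1)
  end

  return acc
end

-- Observations with a large common offset, where a naive
-- sum-of-squares formula would lose precision.
local ol = new_running_var()
for _, v in ipairs({ 1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16 }) do
  ol:push(v)
end
print(ol:mean(), ol:var())
```

The update works with deviations from the current mean rather than with raw squares, which is why the large offset in the example does not degrade the result.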
Returns an on-line accumulator of dimension dim which supports ol:mean(), ol:var() and the following two methods:
C must be a square matrix with dimensions equal to the dimension of ol; it is overwritten with the running sample covariance.
R must be a square matrix with dimensions equal to the dimension of ol; it is overwritten with the running sample correlation.
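The same running-update idea extends to covariance and correlation matrices. The sketch below keeps a matrix of co-moments and fills caller-supplied matrices on request. It makes several assumptions for the sake of a self-contained example: matrices and vectors are plain nested Lua tables rather than the module's own vector and matrix types, new_running_cov is a hypothetical constructor, and the update rule is a standard multivariate generalization of Welford's algorithm, not necessarily the one the module uses.

```lua
-- Hypothetical accumulator of dimension dim on plain Lua tables.
local function new_running_cov(dim)
  local n = 0
  local mean, delta, m2 = {}, {}, {}
  for i = 1, dim do
    mean[i], delta[i], m2[i] = 0, 0, {}
    for j = 1, dim do
      m2[i][j] = 0   -- running co-moment of components i and j
    end
  end
  local acc = {}

  function acc:push(x)
    n = n + 1
    for i = 1, dim do
      delta[i] = x[i] - mean[i]           -- deviation from the old mean
      mean[i] = mean[i] + delta[i] / n
    end
    for i = 1, dim do
      for j = 1, dim do
        m2[i][j] = m2[i][j] + delta[i] * (x[j] - mean[j])
      end
    end
  end

  -- Overwrites the dim x dim table C with the running sample covariance.
  function acc:cov(C)
    assert(n >= 2, "need at least two observations")
    for i = 1, dim do
      for j = 1, dim do
        C[i][j] = m2[i][j] / (n - 1)
      end
    end
  end

  -- Overwrites the dim x dim table R with the running sample correlation.
  function acc:cor(R)
    assert(n >= 2, "need at least two observations")
    for i = 1, dim do
      for j = 1, dim do
        R[i][j] = m2[i][j] / math.sqrt(m2[i][i] * m2[j][j])
      end
    end
  end

  return acc
end

local ol = new_running_cov(2)
for _, obs in ipairs({ { 1, 3 }, { 2, 5 }, { 3, 7 }, { 4, 9 } }) do
  ol:push(obs)
end
local C = { { 0, 0 }, { 0, 0 } }
local R = { { 0, 0 }, { 0, 0 } }
ol:cov(C)
ol:cor(R)
print(C[1][2], R[1][2])
```

Writing the result into a matrix supplied by the caller, as the two methods described above do, lets the same storage be reused across repeated queries instead of allocating a new matrix each time.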