Pixel intensity features

Denote

\(f(x,y)\) – real value of a continuous intensity function \(f\) at real-valued (continuous) Cartesian image location \((x,y)\);
\(P\) – 2-dimensional point set of greyscale image intensity values at discrete locations;
\(p(x,y)\) – greyscale value of \(f(x,y)\) at discrete 2-dimensional image pixel location \((x,y)\), or, simply, pixel \((x,y)\) intensity;
\(G = \{p(x,y) \mid p(x,y)>0\}\) – 1-dimensional set of image pixels of non-zero intensity, the so-called region of interest (ROI);
\(g_i\) – element of \(G\);
\(n = card(G)\) – number of ROI elements;
\(\mathbb{E}\) – the expectation operator;
\(\min \: X\) and \(\max \: X\) – the minimum and maximum of random variable \(X\);
\(\mu_k\) – central moment of order \(k\) of a real-valued random variable \(X\), \(\mu_k = \mathbb{E}[(X - \mathbb{E}[X])^k]\).
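To make the notation concrete, the following minimal sketch builds \(G\) and \(n\) from a small image. It assumes the image is given as a nested list of greyscale values; the helper name `roi_intensities` is hypothetical and not part of Nyxus.

```python
def roi_intensities(image):
    """Return G: the flat list of non-zero pixel intensities, in row-major order."""
    return [p for row in image for p in row if p > 0]


# A toy 3x3 greyscale image P; zero pixels lie outside the ROI.
image = [
    [0, 3, 5],
    [0, 7, 0],
    [2, 0, 9],
]

G = roi_intensities(image)
n = len(G)  # n = card(G)
```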

Nyxus pixel intensity features are calculated as:

INTEGRATED_INTENSITY = \(\sum _i^n g_i\)

MEAN \(\gets \mu = \frac{1}{n} \sum_i^n g_i\)

MIN = \(\min \: G\),

MAX = \(\max \: G\),

RANGE = \(\max \: G - \min \: G\),

COVERED_IMAGE_INTENSITY_RANGE = \(\frac {\max \: G - \min \: G} {\max \: P - \min \: P}\)

STANDARD_DEVIATION \(\gets \sigma = (\mathbb{E}[(G-\mu)^2]) ^{\frac {1}{2}} = \left[ \frac{1}{n-1} \sum_i^n (g_i-\mu)^2 \right ] ^{\frac {1}{2}}\)

STANDARD_DEVIATION_BIASED \(\gets \sigma_b = (\mathbb{E}[(G-\mu)^2]) ^{\frac {1}{2}} = \left[ \frac{1}{n} \sum_i^n (g_i-\mu)^2 \right ] ^{\frac {1}{2}}\)

COV = \(\frac{\sigma}{\mu}\),

STANDARD_ERROR = \(\frac{\sigma}{\sqrt{n}}\),
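The basic features up to this point can be sketched in a few lines of plain Python. This is an illustration of the formulas, not the Nyxus implementation; the function name `basic_intensity_stats` is hypothetical, `G` is assumed to be a flat list of ROI intensities with at least two elements, and `P` a flat list of all image pixel values.

```python
import math


def basic_intensity_stats(G, P):
    """Compute the basic intensity features of ROI values G within image values P."""
    n = len(G)
    total = sum(G)
    mean = total / n
    sum_sq_dev = sum((g - mean) ** 2 for g in G)
    std = math.sqrt(sum_sq_dev / (n - 1))   # unbiased estimate, divides by n - 1
    std_biased = math.sqrt(sum_sq_dev / n)  # biased estimate, divides by n
    g_range = max(G) - min(G)
    return {
        "INTEGRATED_INTENSITY": total,
        "MEAN": mean,
        "MIN": min(G),
        "MAX": max(G),
        "RANGE": g_range,
        "COVERED_IMAGE_INTENSITY_RANGE": g_range / (max(P) - min(P)),
        "STANDARD_DEVIATION": std,
        "STANDARD_DEVIATION_BIASED": std_biased,
        "COV": std / mean,
        "STANDARD_ERROR": std / math.sqrt(n),
    }


P = [0, 3, 5, 0, 7, 0, 2, 0, 9]   # all pixels of a toy image
G = [p for p in P if p > 0]       # ROI: non-zero pixels only
stats = basic_intensity_stats(G, P)
```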

SKEWNESS = \(\frac {\sqrt n \mu_3}{\mu_2^{1.5}}\) if \(n>3\) and \(\mu_2 \neq 0\), otherwise \(=0\).

KURTOSIS = \(\frac{n \mu_4} {\sigma^4}\) if \(n>4\) and \(\mu_2 \neq 0\), otherwise \(=0\).

EXCESS_KURTOSIS = \(\frac{n \mu_4} {\sigma^4} - 3\) if \(n>4\) and \(\mu_2 \neq 0\), otherwise \(=0\).

HYPERSKEWNESS = \(\frac{n \mu_5} {\mu_2^{5/2}}\) if \(n>5\) and \(\mu_2 \neq 0\), otherwise \(=0\)

HYPERFLATNESS = \(\frac {n \mu_6} {\mu_2^3}\) if \(n>6\) and \(\mu_2 \neq 0\), otherwise \(=0\)
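As a rough illustration of the higher-order features, the sketch below computes textbook standardized central moments \(\mu_k / \mu_2^{k/2}\) together with the zero fallback for small or constant ROIs. It deliberately leaves out the \(n\)-dependent factors that appear in the formulas above, so its values need not match Nyxus output; the function name `standardized_moment` and its `min_n` parameter are hypothetical.

```python
def standardized_moment(G, k, min_n):
    """Return mu_k / mu_2**(k/2), or 0.0 if len(G) <= min_n or the variance is zero."""
    n = len(G)
    mean = sum(G) / n
    mu2 = sum((g - mean) ** 2 for g in G) / n   # biased second central moment
    if n <= min_n or mu2 == 0:
        return 0.0
    mu_k = sum((g - mean) ** k for g in G) / n  # biased k-th central moment
    return mu_k / mu2 ** (k / 2)


G = [1, 2, 3, 4, 5]
skewness = standardized_moment(G, 3, 3)   # 3rd standardized moment, requires n > 3
kurtosis = standardized_moment(G, 4, 4)   # 4th standardized moment, requires n > 4
excess_kurtosis = kurtosis - 3
```

Hyperskewness and hyperflatness follow the same pattern with \(k=5\) and \(k=6\).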

MEAN_ABSOLUTE_DEVIATION = \(\frac{1}{n} \sum_i^n \left| g_i-\mu \right|\)

ENERGY \(\gets E = \sum _i^n g_i^2\)

ROOT_MEAN_SQUARED \(= \sqrt {\frac {1} {n} \sum_i^n g_i^2 }\)
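A minimal sketch of the mean absolute deviation, energy, and root mean squared features, assuming `G` is a non-empty flat list of ROI intensities (the function name `energy_features` is hypothetical):

```python
import math


def energy_features(G):
    """Compute mean absolute deviation, energy, and root mean squared of G."""
    n = len(G)
    mu = sum(G) / n
    energy = sum(g * g for g in G)
    return {
        "MEAN_ABSOLUTE_DEVIATION": sum(abs(g - mu) for g in G) / n,
        "ENERGY": energy,
        "ROOT_MEAN_SQUARED": math.sqrt(energy / n),
    }


features = energy_features([3, 5, 7, 2, 9])
```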

ENTROPY \(= \sum_i^k (- b_{i} \: \log \: b_{i})\) where \(b_i\) is a non-zero value of the image histogram of size \(k = 1 + \log_2 \: n\),
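The entropy definition can be sketched as follows. Several details are assumptions not stated above: the bin count \(k = 1 + \log_2 n\) is truncated to an integer, bins are equal-width between the minimum and maximum of \(G\), \(b_i\) is the bin count normalized by \(n\), and the logarithm is base 2. The helper names `histogram` and `entropy` are hypothetical.

```python
import math


def histogram(values, k):
    """Count values into k equal-width bins spanning [min(values), max(values)]."""
    lo, hi = min(values), max(values)
    counts = [0] * k
    if hi == lo:                      # constant data: everything falls in one bin
        counts[0] = len(values)
        return counts
    width = (hi - lo) / k
    for v in values:
        i = min(int((v - lo) / width), k - 1)   # the maximum goes in the last bin
        counts[i] += 1
    return counts


def entropy(G):
    """Shannon entropy (base 2, assumed) of the normalized histogram of G."""
    n = len(G)
    k = int(1 + math.log2(n))         # assumed truncation of 1 + log2(n)
    probs = [c / n for c in histogram(G, k) if c > 0]   # skip empty bins
    return -sum(b * math.log2(b) for b in probs)


h = entropy([1, 1, 2, 2])
```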

MODE \(= x_{uk} + w \frac{f_k - f_{k-1}}{2 f_k - f_{k-1} - f_{k+1}}\) where \(k\) is the index of the histogram bin containing the greatest count, \(x_{uk}\) is the lower bound of that bin, \(w\) is the bin width, \(f_k\) is the greatest bin count, and \(f_{k-1}\) and \(f_{k+1}\) are the counts of the bins neighboring it (informally, the histogram bin value having the highest count)

VARIANCE \(\gets \sigma^2 = \mathbb{E}[(G-\mu)^2] = \frac{1}{n-1} \sum_i^n (g_i-\mu)^2\)

VARIANCE_BIASED \(\gets \sigma_b^2 = \mathbb{E}[(G-\mu)^2] = \frac{1}{n} \sum_i^n (g_i-\mu)^2\)

UNIFORMITY = \(\sum_i^k b_{i}^2\) where \(b_i\) is a value of the image histogram of size \(k = 256\)

UNIFORMITY_PIU = \((1 - \frac{\max \: G - \min \: G}{\max \: G + \min \: G}) \times 100\)
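Both uniformity features can be sketched briefly. The sketch assumes that \(b_i\) is a bin count normalized by \(n\) and that the 256 bins are equal-width between the minimum and maximum of \(G\); neither detail is stated above, and the function names are hypothetical.

```python
def uniformity(G, k=256):
    """Sum of squared normalized histogram values over k equal-width bins."""
    n = len(G)
    lo, hi = min(G), max(G)
    counts = [0] * k
    if hi == lo:                      # constant data: a single occupied bin
        counts[0] = n
    else:
        width = (hi - lo) / k
        for g in G:
            counts[min(int((g - lo) / width), k - 1)] += 1
    return sum((c / n) ** 2 for c in counts)


def uniformity_piu(G):
    """Percent image uniformity: (1 - (max - min) / (max + min)) * 100."""
    g_min, g_max = min(G), max(G)
    return (1 - (g_max - g_min) / (g_max + g_min)) * 100


u = uniformity([1, 1, 2, 2])
piu = uniformity_piu([3, 5, 7, 2, 9])
```

Because every element of \(G\) is positive, the denominator \(\max G + \min G\) is never zero.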

The quantile \(q_p\) of a random variable (or equivalently of its distribution) is defined as the smallest number \(q\) such that the cumulative distribution function is greater than or equal to some \(p\), where \(0<p<1\). This can be calculated for a continuous random variable with density function \(f(x)\) by solving

\[p = \int_{-\infty}^{q_p} f(x)dx\]

for \(q_p\), or by using the inverse of the cumulative distribution function,

\[q_p = F^{-1}(p).\]

The \(p\)-th quantile of a random variable \(X\) is the value \(q_p\) such that

\[F(q_p) = P(X \leqslant q_p) = p\]

P01, P10, P25, P75, P90, P99 - the 1st, 10th, 25th, 75th, 90th, and 99th percentiles. A percentile \(q_p\) is a solution of the equation \(p = \int _{-\infty} ^{q_p} f(x)dx\) where \(p=0.01, 0.1, 0.25\), etc.; for example, P25 is the value \(q_{0.25}\) satisfying \(0.25 = \int _{-\infty} ^{q_{0.25}} f(x)dx\).

QCOD = \(\frac {P75 - P25} {P75 + P25}\)

MEDIAN – the 50th percentile \(q_{0.5}\), defined by \(0.5 = \int _{-\infty} ^{q_{0.5}} f(x)dx\); for a sample, the value such that an equal number of samples are less than and greater than it (for an odd sample size), or the average of the two central values (for an even sample size).

MEDIAN_ABSOLUTE_DEVIATION = \(\frac{1}{n} \sum_i^n \left| g_i - MEDIAN \right|\)

INTERQUARTILE_RANGE = \(q_{0.75} - q_{0.25}\) - the difference between the 3rd and 1st sample quartiles.
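The percentile-based features can be sketched on a sample as follows. The text above defines quantiles through a continuous distribution and does not say how they are estimated from pixels, so this sketch assumes linear interpolation between order statistics; the helper names `percentile` and `quantile_features` are hypothetical.

```python
import math


def percentile(sorted_vals, p):
    """Sample quantile q_p (0 <= p <= 1) by linear interpolation (an assumed estimator)."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def quantile_features(G):
    """Median, quartile-based spread, and deviation about the median for G."""
    s = sorted(G)
    n = len(s)
    p25, p75 = percentile(s, 0.25), percentile(s, 0.75)
    median = percentile(s, 0.5)
    return {
        "P25": p25,
        "P75": p75,
        "MEDIAN": median,
        "QCOD": (p75 - p25) / (p75 + p25),
        "INTERQUARTILE_RANGE": p75 - p25,
        # As defined above: mean absolute deviation taken about the median.
        "MEDIAN_ABSOLUTE_DEVIATION": sum(abs(g - median) for g in s) / n,
    }


q = quantile_features([9, 1, 8, 2, 7, 3, 6, 4, 5])
```

For an even sample size, interpolating at \(p = 0.5\) gives the average of the two central values, in line with the median definition above.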

ROBUST_MEAN_ABSOLUTE_DEVIATION (RMAD)

\[RMAD = \frac{1}{k} \sum_{i:\, q_{0.1} \leqslant g_i \leqslant q_{0.9}} |g_i - \mu_R|\]

where \(k\) is the number of ROI elements \(g_i\) in the interval \([q_{0.1},q_{0.9}]\) and

\[\mu_R = \frac{1}{k} \sum_{i:\, q_{0.1} \leqslant g_i \leqslant q_{0.9}} g_i\]

In other words, RMAD is the mean absolute deviation calculated over the subset of \(G=\{g_i\}\) whose elements lie in the \([q_{0.1},q_{0.9}]\) value interval.
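A minimal sketch of RMAD, assuming the 10th and 90th percentiles are estimated by linear interpolation between order statistics (the estimator is not specified above, and the function names are hypothetical):

```python
import math


def percentile(sorted_vals, p):
    """Sample quantile q_p (0 <= p <= 1) by linear interpolation (an assumed estimator)."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def robust_mean_absolute_deviation(G):
    """Mean absolute deviation over the elements of G lying in [q_0.1, q_0.9]."""
    s = sorted(G)
    q10, q90 = percentile(s, 0.1), percentile(s, 0.9)
    subset = [g for g in s if q10 <= g <= q90]
    k = len(subset)                     # number of elements kept
    mu_r = sum(subset) / k              # mean of the kept elements
    return sum(abs(g - mu_r) for g in subset) / k


rmad = robust_mean_absolute_deviation([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```

In this example \(q_{0.1} = 1.9\) and \(q_{0.9} = 9.1\), so the extremes 1 and 10 are discarded and the deviation is taken over the remaining eight values.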

References

Zwillinger, D. (Ed.). CRC Standard Mathematical Tables and Formulae. Boca Raton, FL: CRC Press, p. 602, 1995.