MANOVA

class hyppo.ksample.MANOVA

Multivariate analysis of variance (MANOVA) test statistic and p-value.

MANOVA is the current standard for multivariate k-sample testing. The test statistic is formulated as below [1]:

In MANOVA, we test whether the mean vectors of the k samples are the same. Define $\{x_{1i} \overset{iid}{\sim} F_{X_1},\ i = 1, \ldots, n_1\}$, $\{x_{2j} \overset{iid}{\sim} F_{X_2},\ j = 1, \ldots, n_2\}$, ... as k groups of samples drawn from (possibly different) multivariate Gaussian distributions with the same dimensionality and the same covariance matrix. The null and alternative hypotheses are

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k, \qquad H_A: \exists\, j \neq j' \text{ s.t. } \mu_j \neq \mu_{j'}$$

Let $\bar{x}_i$ refer to the columnwise mean of $x_i$; that is, $\bar{x}_i = (1/n_i) \sum_{j=1}^{n_i} x_{ij}$. The pooled within-group scatter matrix (the unnormalized pooled sample covariance of the groups), $W$, is

$$W = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)(x_{ij} - \bar{x}_i)^T$$

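Next, define $B$ as the between-group scatter matrix (the unnormalized sample covariance matrix of the group means). If $n = \sum_{i=1}^{k} n_i$ and the grand mean is $\bar{x} = (1/n) \sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij}$,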

$$B = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})^T$$
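As a rough illustration of the two matrices defined above, the following NumPy sketch builds W and B from simulated groups and checks the standard identity that they add up to the total scatter matrix. The helper name `scatter_matrices` and the simulated data are assumptions made for this example; they are not part of hyppo.

```python
import numpy as np


def scatter_matrices(*groups):
    """Return (W, B) for k groups, each an (n_i, p) array (hypothetical helper)."""
    grand_mean = np.vstack(groups).mean(axis=0)  # mean over all n samples
    p = grand_mean.shape[0]
    W = np.zeros((p, p))
    B = np.zeros((p, p))
    for g in groups:
        group_mean = g.mean(axis=0)              # columnwise mean of group i
        centered = g - group_mean
        W += centered.T @ centered               # within-group scatter
        diff = group_mean - grand_mean
        B += g.shape[0] * np.outer(diff, diff)   # between-group scatter
    return W, B


# Simulated data: three groups with 3 dimensions and unequal sample sizes.
rng = np.random.default_rng(0)
groups = [rng.normal(size=(n_i, 3)) for n_i in (20, 25, 30)]
W, B = scatter_matrices(*groups)

# Total scatter around the grand mean; it should decompose as W + B.
stacked = np.vstack(groups)
centered_all = stacked - stacked.mean(axis=0)
T = centered_all.T @ centered_all
decomposition_holds = np.allclose(W + B, T)
```

Both matrices are symmetric with shape (p, p), and B has rank at most k - 1, which is why only a limited number of eigenvalues of $W^{-1} B$ can be nonzero.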

Some of the most common statistics used when performing MANOVA include Wilks' Lambda, the Lawley-Hotelling trace, Roy's greatest root, and the Pillai-Bartlett trace (PBT) [3] [4]. PBT was chosen as the best of these because it is the most conservative [5] [6], and [7] has shown that there are minimal differences in statistical power among these statistics. Let $\lambda_1, \lambda_2, \ldots, \lambda_s$ refer to the eigenvalues of $W^{-1} B$. Here $s = \min(\nu_B, p)$ is the minimum of $\nu_B$, the degrees of freedom of $B$, and $p$, the number of dimensions. The PBT MANOVA test statistic can then be written as [8],

$$\mathrm{MANOVA}_{n_1, \ldots, n_k}(x, y) = \sum_{i=1}^{s} \frac{\lambda_i}{1 + \lambda_i} = \mathrm{tr}\left(B (B + W)^{-1}\right)$$
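The two forms of the statistic above can be compared numerically. The sketch below is an illustration on simulated data, not hyppo's implementation; it assumes $\nu_B = k - 1$, the usual between-group degrees of freedom, which this page does not state explicitly.

```python
import numpy as np

# Simulated data: k = 3 groups of 30 samples in p = 4 dimensions,
# with group means shifted so the statistic is clearly nonzero.
rng = np.random.default_rng(1)
groups = [rng.normal(loc=shift, size=(30, 4)) for shift in (0.0, 0.5, 1.0)]

means = [g.mean(axis=0) for g in groups]
grand = np.vstack(groups).mean(axis=0)
W = sum((g - m).T @ (g - m) for g, m in zip(groups, means))
B = sum(len(g) * np.outer(m - grand, m - grand) for g, m in zip(groups, means))

k, p = len(groups), groups[0].shape[1]
nu_b = k - 1                    # assumed degrees of freedom of B
s = min(nu_b, p)

# Eigenvalues of W^{-1} B are real in exact arithmetic; .real drops any
# tiny imaginary parts from round-off. Keep the s largest.
eigvals = np.linalg.eigvals(np.linalg.solve(W, B)).real
lam = np.sort(eigvals)[::-1][:s]

stat_from_eigs = np.sum(lam / (1 + lam))
stat_from_trace = np.trace(B @ np.linalg.inv(B + W))
forms_agree = np.isclose(stat_from_eigs, stat_from_trace)
```

The remaining p - s eigenvalues are zero up to floating-point error, so dropping them does not change the sum, and the statistic always lies between 0 and s.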

The p-value is calculated analytically using the F statistic. In the case of PBT, given $m = (|p - \nu_B| - 1)/2$ and $r = (\nu_W - p - 1)/2$, this is [2]:

$$F_{s(2m+s+1),\, s(2r+s+1)} = \frac{(2r+s+1)\, \mathrm{MANOVA}_{n_1, \ldots, n_k}(x, y)}{(2m+s+1)\left(s - \mathrm{MANOVA}_{n_1, \ldots, n_k}(x, y)\right)}$$
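A minimal sketch of this conversion is shown below, using the upper tail of SciPy's F distribution for the p-value. The function name `pillai_f_pvalue` is hypothetical, and the example values of $\nu_B = k - 1$ and $\nu_W = n - k$ are the conventional degrees of freedom, assumed here because this page does not define them.

```python
from scipy.stats import f as f_dist


def pillai_f_pvalue(stat, p, nu_b, nu_w):
    """Approximate F statistic and p-value for a Pillai-Bartlett trace.

    Hypothetical helper: stat is the PBT statistic, p the number of
    dimensions, nu_b and nu_w the degrees of freedom of B and W.
    """
    s = min(nu_b, p)
    m = (abs(p - nu_b) - 1) / 2
    r = (nu_w - p - 1) / 2
    df_num = s * (2 * m + s + 1)   # numerator degrees of freedom
    df_den = s * (2 * r + s + 1)   # denominator degrees of freedom
    f_stat = (2 * r + s + 1) * stat / ((2 * m + s + 1) * (s - stat))
    pvalue = f_dist.sf(f_stat, df_num, df_den)  # upper-tail probability
    return f_stat, pvalue


# Example: k = 3 groups, n = 90 samples in total, p = 4 dimensions.
k, n, p = 3, 90, 4
f_stat, pvalue = pillai_f_pvalue(0.3, p, nu_b=k - 1, nu_w=n - k)

# A statistic of exactly 0 (identical group means) gives F = 0 and p = 1.
f_zero, p_zero = pillai_f_pvalue(0.0, p, nu_b=k - 1, nu_w=n - k)
```

With these inputs, s = 2, m = 0.5 and r = 41, so the F statistic is 85 * 0.3 / (4 * 1.7) = 3.75 on (8, 170) degrees of freedom.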

Methods Summary

MANOVA.statistic(*args)

Calculates the MANOVA test statistic.

MANOVA.test(*args)

Calculates the MANOVA test statistic and p-value.


MANOVA.statistic(*args)

Calculates the MANOVA test statistic.

Parameters

*args (ndarray) -- Variable length input data matrices. All inputs must have the same number of dimensions (columns). That is, the shapes must be (n, p), (m, p), ..., where n, m, ... are the numbers of samples and p is the number of dimensions.

Returns

stat (float) -- The computed MANOVA statistic.

MANOVA.test(*args)

Calculates the MANOVA test statistic and p-value.

Parameters

*args (ndarray) -- Variable length input data matrices. All inputs must have the same number of dimensions (columns). That is, the shapes must be (n, p), (m, p), ..., where n, m, ... are the numbers of samples and p is the number of dimensions.

Returns

  • stat (float) -- The computed MANOVA statistic.

  • pvalue (float) -- The computed MANOVA p-value.

Examples

>>> import numpy as np
>>> from hyppo.ksample import MANOVA
>>> x = np.arange(7)
>>> y = x
>>> stat, pvalue = MANOVA().test(x, y)
>>> '%.3f, %.1f' % (stat, pvalue)
'0.000, 1.0'

Examples using hyppo.ksample.MANOVA