Latent Class Analysis (LCA):
Latent class analysis is concerned with deriving information about categorical latent variables from observed values of categorical manifest variables. In other words, LCA deals with fitting latent class models, a subclass of latent variable models, to the observed data.
LCA is used for analysis of categorical data in biomedical, social science and market research.
Basic latent class models postulate the following relationship between the distribution of the manifest variables and the values of a categorical latent variable:
P_Y(y) = \sum_{x} P_X(x) \, P_{Y|X}(y \mid x)
where y=(y1,…,yL) is the response – the vector of values of L manifest categorical variables;
x is a value of the latent categorical variable;
PY(y) is the observed distribution of y;
PX(x) is the distribution of x;
PYX(y|x) is the conditional distribution of y for a given value of x.
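Latent class models usually postulate local independence of the manifest variables (y1,…,yL): within each latent class x the manifest variables are mutually independent, so PYX(y|x) factors into the product of the conditional distributions of the individual variables.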
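The relationship between the marginal and conditional distributions, combined with local independence, can be illustrated with a minimal Python sketch. All numbers and names below (class_probs, item_probs, the choice of two classes and three binary manifest variables) are hypothetical values chosen for illustration; they do not come from any data set or particular software package.

```python
from itertools import product
from math import prod

# Hypothetical parameters: 2 latent classes, L = 3 binary manifest variables.
class_probs = [0.6, 0.4]  # PX(x) for x = 0, 1

# item_probs[x][l][k] = P(y_l = k | x), assumed values for illustration only.
item_probs = [
    [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]],  # latent class 0
    [[0.2, 0.8], [0.3, 0.7], [0.1, 0.9]],  # latent class 1
]


def conditional_prob(y, x):
    """PYX(y|x) under local independence: a product over the manifest variables."""
    return prod(item_probs[x][l][y_l] for l, y_l in enumerate(y))


def marginal_prob(y):
    """PY(y): sum over latent classes of PX(x) * PYX(y|x)."""
    return sum(
        class_probs[x] * conditional_prob(y, x)
        for x in range(len(class_probs))
    )


# Sanity check: the probabilities of all 2**3 response patterns sum to 1.
total = sum(marginal_prob(y) for y in product([0, 1], repeat=3))
```

With these assumed parameters, marginal_prob((0, 0, 0)) is 0.6 × 0.504 + 0.4 × 0.006, or about 0.3048. In practice the parameters are unknown and must be estimated from the observed responses, which this sketch does not attempt.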
Outside social research, latent class models are often called “finite mixture models”, because the model described above represents the distribution of all responses as a mixture of t conditional distributions of y: PYX(y|x), x=1,…,t, weighted by the class probabilities PX(x).
If X is a single categorical latent variable taking on t values, then ascribing particular values of X to observed responses y is equivalent to partitioning all responses into t classes. The corresponding branch of LCA is therefore named “latent class cluster analysis”. In contrast, in “latent class factor analysis”, x is treated as a vector of several categorical (usually dichotomous) variables x=(x1,…,xN), or “factors”.
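One common way to ascribe a class to each response is to compute the posterior probability of every class by Bayes' rule and pick the most probable one (modal assignment). The sketch below assumes this rule and uses hypothetical, already-known parameters (class_probs, item_probs) for two classes and three binary manifest variables; the text above does not prescribe a particular assignment rule.

```python
from math import prod

# Hypothetical, already-estimated parameters (illustrative values only).
class_probs = [0.6, 0.4]  # PX(x) for x = 0, 1
item_probs = [            # item_probs[x][l][k] = P(y_l = k | x)
    [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]],  # latent class 0
    [[0.2, 0.8], [0.3, 0.7], [0.1, 0.9]],  # latent class 1
]


def posterior(y):
    """Posterior P(x | y) for every class x, obtained by Bayes' rule."""
    joint = [
        class_probs[x] * prod(item_probs[x][l][y_l] for l, y_l in enumerate(y))
        for x in range(len(class_probs))
    ]
    norm = sum(joint)  # this is the marginal PY(y)
    return [j / norm for j in joint]


def assign_class(y):
    """Modal assignment: index of the class with the largest posterior."""
    post = posterior(y)
    return max(range(len(post)), key=post.__getitem__)


# Partition a few example responses into classes.
responses = [(0, 0, 0), (1, 1, 1), (0, 1, 1)]
labels = [assign_class(y) for y in responses]
```

Here every response receives exactly one label, so the labels partition the responses into classes as described above. The posterior probabilities themselves can be kept when a soft, probabilistic assignment is preferred.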