Suppose X is a random vector with probability distribution (or density) P(X | V), where V is a vector of parameters, and Xo is a realization of X. A statistic T(X) is called a sufficient statistic if the conditional probability (or density) of X, given that T(X) = T(Xo), does not depend on V for any possible Xo.
In other words, the observed value T(Xo) of a sufficient statistic T carries all the information about the parameter vector V that the data Xo contain.
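As a worked illustration (not drawn from the entry itself), the calculation below sketches why the sum of n independent Bernoulli(p) observations is sufficient for p: once we condition on the sum, the distribution of the data no longer involves p.

```latex
% For x_1,\dots,x_n \in \{0,1\} with \sum_i x_i = t:
P\!\left(X_1 = x_1, \dots, X_n = x_n \,\middle|\, \sum_{i=1}^{n} X_i = t,\; p\right)
  = \frac{p^{t}(1-p)^{n-t}}{\binom{n}{t}\,p^{t}(1-p)^{n-t}}
  = \binom{n}{t}^{-1},
```

which is free of p, so T(X) = X_1 + ... + X_n is a sufficient statistic for p.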
A trivial example of a sufficient statistic is T(X) = X, but such a statistic is useless. The practically interesting cases are those in which T has (much) smaller dimension than the vector X itself. Such statistics allow you to reduce all the observed data values to a smaller set of values while preserving all the information about the parameter V; a short simulation illustrating this is sketched below.
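To make the dimension-reduction point concrete, here is a minimal simulation sketch in Python with NumPy (an illustration of my own, not part of the original entry). For i.i.d. Bernoulli data it tabulates the empirical distribution of the full data pattern given a fixed value of the sum; that conditional distribution comes out essentially the same for two different values of p, as sufficiency predicts.

```python
import numpy as np

# Illustration: for i.i.d. Bernoulli(p) data, T(X) = sum(X) is sufficient.
# Conditioning on T, the arrangement of 0s and 1s is uniform over all
# sequences with that sum, whatever the value of p.
rng = np.random.default_rng(0)

def conditional_pattern_freqs(p, n=3, t=2, reps=200_000):
    """Empirical distribution of the data pattern given sum(X) == t."""
    x = rng.binomial(1, p, size=(reps, n))          # reps samples of size n
    kept = x[x.sum(axis=1) == t]                    # keep samples with sum t
    patterns, counts = np.unique(kept, axis=0, return_counts=True)
    return {tuple(int(v) for v in row): c / counts.sum()
            for row, c in zip(patterns, counts)}

# The two dictionaries should be nearly identical (each pattern ~ 1/3),
# even though the parameter p differs:
print(conditional_pattern_freqs(p=0.3))
print(conditional_pattern_freqs(p=0.8))
```

The point of the sketch: the three-dimensional data vector can be replaced by the one-dimensional sum without losing any information about p, since everything left over in the data is distributed the same way regardless of p.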