The art of statistics and data science lies, in part, in taking a real-world problem and converting it into a well-defined quantitative problem amenable to a useful solution. At the technical end of that process lies regularization. In data science, regularization covers various methods of simplifying models in order to reduce overfitting and better reveal the underlying phenomena. Some examples include:
- reducing dimensionality
- smoothing
- adding penalty terms to discourage complexity in models
- constraining predictor coefficients
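
To make the last two items concrete, here is a minimal sketch of one common form of regularization, ridge regression, which adds a squared-norm penalty on the coefficients to the least-squares objective. The data are synthetic, and the names `ridge_fit`, `lam`, and `true_w` are illustrative assumptions rather than anything taken from the text above.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: minimize ||y - Xw||^2 + lam * ||w||^2."""
    n_features = X.shape[1]
    # Solve (X^T X + lam * I) w = X^T y.
    # With lam = 0 this reduces to ordinary least squares.
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Synthetic data (an assumption for illustration): only the first two
# of ten predictors actually influence the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 10))
true_w = np.zeros(10)
true_w[:2] = [3.0, -2.0]
y = X @ true_w + rng.normal(scale=1.0, size=30)

w_ols = ridge_fit(X, y, lam=0.0)     # no penalty
w_ridge = ridge_fit(X, y, lam=10.0)  # penalized fit

# The penalty shrinks the coefficient vector toward zero.
print("OLS coefficient norm:  ", np.linalg.norm(w_ols))
print("Ridge coefficient norm:", np.linalg.norm(w_ridge))
```

The penalized fit always has a smaller coefficient norm than the unpenalized one, which is the sense in which the penalty constrains the predictor coefficients. Whether that shrinkage improves predictions depends on the data and on the choice of `lam`, which in practice is usually tuned by cross-validation.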