Cross Validating in R

Excerpt from the article "Cross Validation techniques in R: A brief overview of some methods, packages, and functions for assessing prediction models" by Dr. Jon Starkweather.

Cross validation is useful for overcoming the problem of over-fitting. Over-fitting is one aspect of the larger issue of what statisticians refer to as shrinkage (Harrell, Lee, & Mark, 1996). The term refers to a situation in which the model requires more information than the data can provide. For example, over-fitting can occur when a model is assessed with the same data that was initially used to fit it. Just as exploratory and confirmatory analysis should not be done on the same sample of data, fitting a model and then assessing how well that model performs on the same data should be avoided.
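To make the point concrete, here is a minimal sketch in base R that compares the error measured on the fitting data with a 5-fold cross-validated error. It is not taken from the article: the simulated data, the 10th-degree polynomial model, the seed, and all variable names are assumptions chosen only for illustration.

```r
# Minimal sketch: training error vs. 5-fold cross-validated error (base R only).
set.seed(42)  # arbitrary seed, used only for reproducibility

# Simulated data (an assumption for illustration): y depends linearly on x.
n   <- 100
dat <- data.frame(x = runif(n, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rnorm(n, sd = 1)

# A deliberately flexible model: a 10th-degree polynomial in x.
model_formula <- y ~ poly(x, 10)

# Error measured on the same data that was used to fit the model.
full_fit  <- lm(model_formula, data = dat)
train_mse <- mean(residuals(full_fit)^2)

# Randomly assign each row to one of k folds of equal size.
k     <- 5
folds <- sample(rep(seq_len(k), length.out = n))

# For each fold: fit on the other folds, then predict the held-out fold.
fold_mse <- sapply(seq_len(k), function(i) {
  train <- dat[folds != i, ]
  test  <- dat[folds == i, ]
  fit   <- lm(model_formula, data = train)
  mean((test$y - predict(fit, newdata = test))^2)
})
cv_mse <- mean(fold_mse)

cat("Training MSE:       ", train_mse, "\n")
cat("Cross-validated MSE:", cv_mse, "\n")
```

With an overly flexible model like this one, the cross-validated error will typically come out larger than the training error, because each held-out fold plays the role of data the model has not seen. The exact numbers depend on the seed and the simulated data.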

For more info, refer to the actual article.
