Data Science

Train-Test Split

Dividing a dataset into separate subsets: one for training a model and one for evaluating its performance.

In-depth explanation

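The training set is used to fit the model; the test set is held out and used only to estimate final performance on unseen data. A validation set, carved out of the training data, is used for hyperparameter tuning so that the test set stays untouched until the end. Typical splits are 80/20 (train/test) or 70/15/15 (train/validation/test). Proper splitting prevents data leakage, where information from the test set influences training, and so gives honest performance estimates. Time-series data requires temporal splits: train on earlier observations and test on later ones, rather than shuffling.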
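A temporal split can be sketched in a few lines. The version below is only an illustration using the Python standard library: the function name temporal_split, the (date, value) record layout, and the sample series are assumptions made for this example, not part of any particular library.

```python
from datetime import date, timedelta


def temporal_split(records, test_fraction=0.2, key=lambda r: r[0]):
    """Sort records by time and hold out the most recent ones for testing.

    No shuffling: every training record is at least as old as every test record.
    """
    ordered = sorted(records, key=key)
    n_test = round(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Hypothetical daily observations as (date, value) pairs.
first_day = date(2024, 1, 1)
series = [(first_day + timedelta(days=i), float(i)) for i in range(10)]

past, future = temporal_split(series, test_fraction=0.2)
print(len(past), len(future))
print(past[-1][0], "<", future[0][0])
```

Cutting at a point in time, instead of sampling rows at random, keeps future observations out of the training set, which is the leakage a shuffled split would introduce for time-ordered data.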

Examples

80% train, 20% test
70% train, 15% validation, 15% test
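Both ratios can be produced by the same small routine. The sketch below uses only the Python standard library; split_indices and its parameters are hypothetical names chosen for this illustration, and it assumes the rows are independent, so a random shuffle is appropriate.

```python
import random


def split_indices(n_samples, fractions, seed=0):
    """Shuffle range(n_samples) and cut it into consecutive parts sized by `fractions`."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    parts, start, cumulative = [], 0, 0.0
    for fraction in fractions[:-1]:
        cumulative += fraction
        stop = round(cumulative * n_samples)
        parts.append(indices[start:stop])
        start = stop
    parts.append(indices[start:])  # the last part takes whatever remains
    return parts


# 80% train, 20% test
train_idx, test_idx = split_indices(100, (0.8, 0.2))
print(len(train_idx), len(test_idx))

# 70% train, 15% validation, 15% test
train3_idx, val3_idx, test3_idx = split_indices(100, (0.7, 0.15, 0.15))
print(len(train3_idx), len(val3_idx), len(test3_idx))
```

The function returns index lists, not copies of the data, so the same split can be applied to features and labels alike. Libraries such as scikit-learn offer a ready-made train_test_split for the two-way case.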
