An overview of 29 cross-validation methods in four categories: I.I.D. (9), Temporal (8), Grouped (8), and Spatial (4). Use this guide to select the most appropriate validation strategy for your medical machine learning project.
| Method | Category | Best For | Medical Applications | Data Requirements | Key Benefits | Limitations |
|---|---|---|---|---|---|---|
| **🔄 I.I.D. Cross-Validation (9 Methods)** | | | | | | |
| HoldOut | I.I.D. | Quick prototyping | Initial model screening, large datasets | Large dataset (>5,000 samples) | Fast, simple baseline | High variance, wastes data |
| KFold | I.I.D. | Independent samples | Basic diagnostic models, single-visit data | Independent observations | Simple, well-understood | Assumes independence |
| StratifiedKFold | I.I.D. | Imbalanced outcomes | Rare diseases, cancer detection | Class labels available | Preserves class distribution | Still assumes independence |
| RepeatedKFold | I.I.D. | Stable estimates | Clinical trials, biomarker validation | Sufficient samples | Reduces variance | Computationally expensive |
| LOOCV | I.I.D. | Very small datasets | Rare diseases, pilot studies | <100 samples | Maximum data usage | High variance, computationally expensive |
| LPOCV | I.I.D. | Exhaustive validation | Small critical datasets | <50 samples | Tests all p-combinations | Combinatorial explosion C(n,p) |
| BootstrapValidation | I.I.D. | Small medical datasets | Rare diseases, clinical trials | <500 samples typically | Stable estimates (.632/.632+ correction) | May be optimistic without correction |
| MonteCarloCV | I.I.D. | Flexible repeated sampling | Complex medical datasets | Custom sampling needs | Highly customizable | Requires careful design |
| NestedCV | I.I.D. | Model selection + evaluation | Biomarker discovery, precision medicine | Hyperparameter tuning needs | Unbiased performance estimates | Computationally expensive |
| **⏰ Temporal Cross-Validation (8 Methods)** | | | | | | |
| Time Series Split | Temporal | Time-ordered data | ICU monitoring, disease progression | Temporal ordering | Respects temporal order | Limited by time dependencies |
| Rolling Window CV | Temporal | Real-time predictions | Clinical decision support, alerts | Fixed window size | Mimics deployment | Requires sufficient history |
| Expanding Window CV | Temporal | Growing datasets | Population health monitoring | Accumulating data | Uses all available history | May include outdated patterns |
| Purged K-Fold | Temporal | Financial/trading-like medical data | Drug response, biomarker evolution | Overlapping observations | Prevents temporal leakage | Reduces available data |
| Blocked Time Series | Temporal | Seasonal/periodic patterns | Flu seasons, allergy patterns | Temporal blocks | Preserves temporal patterns | Requires known periodicity |
| Combinatorial Purged CV | Temporal | Complex temporal patterns | Financial health data, trading strategies | Multiple test periods | Comprehensive temporal testing | Very computationally expensive |
| Purged Group Time Series | Temporal | Grouped + temporal data | Multi-patient longitudinal studies | Groups + time constraints | Handles both dependencies | Complex implementation |
| Nested Temporal CV | Temporal | Temporal model selection | Forecasting with hyperparameter tuning | Temporal + nested loops | Unbiased temporal tuning | Extremely expensive |
| **👥 Grouped Cross-Validation (8 Methods)** | | | | | | |
| Group K-Fold | Grouped | Multi-visit patients | Longitudinal studies, chronic disease | Group identifiers | Prevents patient leakage | Requires sufficient groups |
| Leave-One-Group-Out | Grouped | Multi-site studies | Clinical trials, hospital networks | Site/group IDs | Tests cross-site generalization | High variance with few sites |
| Stratified Group K-Fold | Grouped | Grouped + imbalanced | Multi-site rare disease studies | Groups + class labels | Keeps groups intact while approximately preserving class distribution across folds | Complex to implement |
| Leave-p-Groups-Out | Grouped | Multiple groups testing | Multi-center validation | Groups + p selection | Tests on p groups at once | Many iterations: C(g,p) for g groups |
| Repeated Group K-Fold | Grouped | Robust grouped validation | Stable estimates for grouped data | Groups + multiple runs | Reduces variance in estimates | Computationally expensive |
| Hierarchical Group K-Fold | Grouped | Nested group structures | Multi-site with sub-groups | Hierarchical groups | Preserves nested structure | Requires clear hierarchy |
| Multi-level CV | Grouped | Hierarchical medical data | Hospital→Department→Patient | Multi-level hierarchy | Respects all hierarchy levels | Complex implementation |
| Nested Grouped CV | Grouped | Patient-grouped model selection | Personalized medicine, longitudinal ML | Groups + hyperparameter tuning | Prevents leakage in model selection | Very computationally intensive |
| **🌍 Spatial Cross-Validation (4 Methods)** | | | | | | |
| Spatial Block CV | Spatial | Geographic health data | Epidemiology, environmental health | Geographic coordinates | Handles spatial autocorrelation | Requires spatial structure |
| Buffered Spatial CV | Spatial | Disease mapping with buffer zones | Outbreak prediction, environmental exposure | Spatial coordinates + buffer | Prevents spatial leakage | Reduces available data |
| Spatiotemporal Block CV | Spatial | Space + time dependencies | Pandemic spread, seasonal epidemics | Spatial + temporal data | Handles both dimensions | Complex implementation |
| Environmental Health CV | Spatial | Environmental health studies | Pollution exposure, climate health | Geographic + environmental | Considers environmental factors | Requires environmental data |
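
The "combinatorial explosion" noted for LPOCV and Leave-p-Groups-Out in the table above can be made concrete with a few lines of standard-library Python. This is only an illustrative sketch: the helper name `n_splits_leave_p_out` is hypothetical and is not part of TrustCV.

```python
from math import comb


def n_splits_leave_p_out(n: int, p: int) -> int:
    """Number of train/test splits when every size-p subset of n items
    (samples or groups) is held out exactly once, i.e. C(n, p)."""
    return comb(n, p)


# Show how quickly the number of model fits grows with n and p.
for n, p in [(50, 1), (50, 2), (50, 5), (100, 5)]:
    print(f"n={n:>3}, p={p}: {n_splits_leave_p_out(n, p):,} splits")
```

Each split requires a full model fit, which is why these exhaustive methods are usually reserved for very small datasets or a small number of groups.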
Quick code snippets for the most common medical CV methods
```python
from trustcv import TrustCVValidator
from trustcv.splitters import GroupKFold, StratifiedGroupKFold

# model, X, y, and patient_ids are assumed to be defined already

# Basic patient-grouped cross-validation
validator = TrustCVValidator(method='group_kfold', n_splits=5, check_leakage=True)
results = validator.validate(model=model, X=X, y=y, groups=patient_ids)

# Stratified patient grouping (preserves class balance)
validator = TrustCVValidator(method='stratified_group_kfold', n_splits=5)
results = validator.validate(model=model, X=X, y=y, groups=patient_ids)
```
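
To show what patient-grouped splitting does conceptually, here is a minimal standard-library sketch. It is not TrustCV's implementation: the function `group_kfold_indices` and its greedy balancing rule are assumptions made for illustration. The key property is that every patient's samples land in exactly one test fold, so no patient appears on both sides of a split.

```python
from collections import defaultdict


def group_kfold_indices(groups, n_splits):
    """Yield (train_idx, test_idx); each group is tested in exactly one fold."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    if n_splits > len(by_group):
        raise ValueError("n_splits cannot exceed the number of distinct groups")

    # Greedy balancing: place the largest groups first into the smallest fold.
    folds = [[] for _ in range(n_splits)]
    for g in sorted(by_group, key=lambda g: (-len(by_group[g]), str(g))):
        smallest = min(range(n_splits), key=lambda k: len(folds[k]))
        folds[smallest].extend(by_group[g])

    for k in range(n_splits):
        test = sorted(folds[k])
        test_set = set(test)
        train = [i for i in range(len(groups)) if i not in test_set]
        yield train, test


# Hypothetical data: ten visits from six patients.
example_patient_ids = ["p1", "p1", "p2", "p3", "p3", "p3", "p4", "p5", "p5", "p6"]
for train, test in group_kfold_indices(example_patient_ids, n_splits=3):
    train_patients = {example_patient_ids[i] for i in train}
    test_patients = {example_patient_ids[i] for i in test}
    assert train_patients.isdisjoint(test_patients)  # no patient on both sides
    print("test patients:", sorted(test_patients))
```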
```python
from trustcv.splitters import TimeSeriesSplit, RollingWindowCV
from trustcv.splitters import PurgedKFoldCV

# Basic temporal split
cv = TimeSeriesSplit(n_splits=5)

# Rolling window for real-time prediction
cv = RollingWindowCV(window_size=30, gap=2)  # 30 days, 2-day gap

# Purged K-Fold to prevent temporal leakage
cv = PurgedKFoldCV(n_splits=5, purge_gap=5)
```
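
The idea behind a rolling window with a gap can be sketched in plain Python. This is an illustration under stated assumptions, not the `RollingWindowCV` implementation: the function name `rolling_window_splits` and the `test_size` parameter are hypothetical. Samples are assumed to be indexed in time order; the gap drops the samples immediately after the training window so that observations close to the test period cannot leak into training.

```python
def rolling_window_splits(n_samples, window_size, test_size, gap=0):
    """Yield (train_idx, test_idx) for time-ordered samples 0..n_samples-1.

    Train on a fixed-length window, skip `gap` samples, then test on the
    next `test_size` samples. The window then slides forward by `test_size`.
    """
    start = 0
    while start + window_size + gap + test_size <= n_samples:
        train_end = start + window_size
        test_start = train_end + gap
        train = list(range(start, train_end))
        test = list(range(test_start, test_start + test_size))
        yield train, test
        start += test_size


# Hypothetical setup: 60 daily records, 30-day window, 2-day gap, 7-day test.
for train, test in rolling_window_splits(60, window_size=30, test_size=7, gap=2):
    print(f"train days {train[0]}-{train[-1]}, test days {test[0]}-{test[-1]}")
```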
```python
from trustcv.splitters import LeaveOneGroupOut, LeavePGroupsOut
from trustcv.splitters import HierarchicalGroupKFold

# Leave-one-site-out validation
cv = LeaveOneGroupOut()

# Leave-p-sites-out for multi-center studies
cv = LeavePGroupsOut(n_groups=2)

# Hierarchical: Hospital → Department → Patient
cv = HierarchicalGroupKFold(n_splits=5)
```
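
For intuition, leave-p-sites-out can be sketched with the standard library alone. The function `leave_p_groups_out` below is a hypothetical illustration, not the TrustCV splitter: it holds out every combination of `p` sites in turn and trains on the rest.

```python
from itertools import combinations


def leave_p_groups_out(groups, p):
    """Yield (held_out_groups, train_idx, test_idx) for every combination
    of p distinct groups used as the test set."""
    unique = sorted(set(groups))
    for held_out in combinations(unique, p):
        held = set(held_out)
        test = [i for i, g in enumerate(groups) if g in held]
        train = [i for i, g in enumerate(groups) if g not in held]
        yield held_out, train, test


# Hypothetical data: eight samples collected at four sites.
example_site_ids = ["A", "A", "B", "B", "B", "C", "D", "D"]
for held_out, train, test in leave_p_groups_out(example_site_ids, p=2):
    print(f"held-out sites {held_out}: {len(train)} train, {len(test)} test")
```

With `p=1` this reduces to leave-one-site-out, which produces one split per site.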
Comparing built-in cross-validation support across popular ML/DL toolboxes
| Toolbox | Built-in CV Splitters | Medical CV Support | Notes |
|---|---|---|---|
| TrustCV (ours) | 29 | ⭐⭐⭐ Excellent | Native support for patient grouping, temporal purging, spatial blocking, and leakage detection. |
| Scikit-learn | 15 | ⭐⭐ Good | Solid for basic IID and simple Group/Time splits, but lacks medical-specific temporal/spatial logic. |
| PyCaret | 5 | ⭐⭐⭐ Strong | Good high-level strategies for simple group/time splits, but limited configuration. |
| CatBoost / XGBoost / LightGBM | 2-3 | ⭐⭐ Moderate | Native support for basic folds and simple grouping/stratification. |
| MONAI | 1 | ⭐⭐ Moderate | Specialized for medical imaging datasets but lacks broader clinical CV methods. |
| TensorFlow / PyTorch / JAX | 0 | ⭐ Basic | No dedicated CV module; requires manual loops or external splitters (like sklearn or TrustCV). |
TrustCV provides 14+ specialized methods not found in scikit-learn, including PurgedKFoldCV, SpatialBlockCV, HierarchicalCV, and EnvironmentalHealthCV.
Unlike standard toolkits, TrustCV includes six types of automated data-leakage checkers: patient, temporal, spatial, preprocessing, duplicate, and feature-target.
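
To make three of these leakage categories concrete, the sketch below shows what patient, duplicate, and temporal checks might look like on a single train/test split. These helpers are hypothetical illustrations written with the standard library; they are not TrustCV's checkers and do not reflect its API.

```python
def patient_overlap(groups, train_idx, test_idx):
    """Return the patient IDs that appear in both the train and test sets."""
    return {groups[i] for i in train_idx} & {groups[i] for i in test_idx}


def duplicate_rows_across_split(X, train_idx, test_idx):
    """Return test indices whose feature row also appears verbatim in training."""
    train_rows = {tuple(X[i]) for i in train_idx}
    return [i for i in test_idx if tuple(X[i]) in train_rows]


def temporal_overlap(timestamps, train_idx, test_idx):
    """True if any training sample is not strictly earlier than the first test sample."""
    earliest_test = min(timestamps[i] for i in test_idx)
    return any(timestamps[i] >= earliest_test for i in train_idx)


# Hypothetical toy split that is leaky in all three ways.
toy_X = [[1, 2], [3, 4], [1, 2], [5, 6]]
toy_patients = ["p1", "p2", "p1", "p3"]
toy_times = [1, 5, 2, 6]
toy_train, toy_test = [0, 1], [2, 3]

print("shared patients:", patient_overlap(toy_patients, toy_train, toy_test))
print("duplicated test rows:", duplicate_rows_across_split(toy_X, toy_train, toy_test))
print("temporal overlap:", temporal_overlap(toy_times, toy_train, toy_test))
```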
Our splitters are designed to map directly to FDA 510(k) and CE MDR documentation requirements for clinical validation.