Cross-validation for Longitudinal Datasets with Unstable Correlations
This demo is based on the KDD 2025 paper "Cross-Validation for Longitudinal Datasets with Unstable Correlations".
It simulates an outcome that is a linear combination of two features, one whose correlation with the outcome is stable over time and one whose correlation is unstable. It then compares the expected MSE of two linear models, a stable model and an unstable model, as estimated by different CV strategies: random CV, block CV, and our proposed approach (|block CV output - random CV output|).
Random and block CV often estimate the unstable model as having a lower MSE than the stable model, which leads to selecting models that fail over time. Our method, by contrast, avoids this pitfall and provides more reliable model selection.
To play with this demo, drag the sliders for the following parameters:
a:
b:
Au:
Vu:
[Figure: Strength of correlation of stable vs. unstable feature and outcome over time. x-axis: Time period (t). Series: Stable (a), Unstable (b·p_u(t)).]
Note: Stable - Random CV and Stable - Block CV overlap entirely because they always produce the same output.
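The comparison described above can be sketched offline in a few lines of Python. This is only an illustration under stated assumptions, not the demo's or the paper's actual code: the form of p_u(t) (here 1 + amp * sin(freq * t)), the parameter names `amp` and `freq`, the noise level, and the choice that each model uses a single feature are all hypothetical and need not match the demo's sliders.

```python
import math
import random

def simulate(T=20, n=50, a=1.0, b=2.0, amp=1.0, freq=0.5, seed=0):
    """Return rows (t, x_s, x_u, y), ordered by time period t.

    Assumed data-generating process: y = a*x_s + b*p_u(t)*x_u + noise.
    """
    rng = random.Random(seed)
    rows = []
    for t in range(T):
        p_u = 1.0 + amp * math.sin(freq * t)  # assumed form of p_u(t)
        for _ in range(n):
            x_s, x_u = rng.gauss(0, 1), rng.gauss(0, 1)
            y = a * x_s + b * p_u * x_u + rng.gauss(0, 0.1)
            rows.append((t, x_s, x_u, y))
    return rows

def fit_slope(rows, col):
    """Least-squares slope of y on a single feature (no intercept)."""
    sxy = sum(r[col] * r[3] for r in rows)
    sxx = sum(r[col] ** 2 for r in rows)
    return sxy / sxx

def cv_mse(rows, col, folds):
    """Average held-out MSE; `folds` is a list of test-index lists."""
    fold_errors = []
    for test_idx in folds:
        held_out = set(test_idx)
        train = [r for i, r in enumerate(rows) if i not in held_out]
        beta = fit_slope(train, col)
        sq = [(rows[i][3] - beta * rows[i][col]) ** 2 for i in test_idx]
        fold_errors.append(sum(sq) / len(sq))
    return sum(fold_errors) / len(fold_errors)

def random_folds(n_rows, k, seed=0):
    """Random CV: shuffle row indices, then deal them into k folds."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def block_folds(n_rows, k):
    """Block CV: k contiguous blocks of rows (rows are ordered by time)."""
    bounds = [i * n_rows // k for i in range(k + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(k)]

rows = simulate()
k = 5
results = {}
# Column 1 is the stable feature x_s, column 2 the unstable feature x_u.
for name, col in (("stable", 1), ("unstable", 2)):
    rand = cv_mse(rows, col, random_folds(len(rows), k))
    block = cv_mse(rows, col, block_folds(len(rows), k))
    results[name] = {"random": rand, "block": block, "gap": abs(block - rand)}
    print(f"{name:9s} random={rand:.3f} block={block:.3f} "
          f"gap={results[name]['gap']:.3f}")
```

In this sketch the stable model's random and block estimates should nearly coincide, since its slope barely depends on which time periods are held out, while the unstable model's two estimates diverge because its fitted slope shifts with the training periods. The `gap` entry corresponds to |block CV output - random CV output|.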