Momentum Crashes Are Optional (If You Let a Machine Drive)
Momentum strategies print money until they don't. A new paper shows that a simple LightGBM model can keep the upside and cut max drawdown from 30% to 13%. The trick is knowing when to stop listening to price and start listening to fundamentals.
Momentum is the oldest trick in the quant playbook. Buy what went up, sell what went down, collect your premium, repeat. It works across markets, across asset classes, across decades. Until it doesn't. And when it doesn't, it tends to fail in spectacular fashion.
The 2020 COVID crash is a textbook case. A standard 12-1 cross-sectional momentum strategy on S&P 500 stocks lost 28% in two months. Years of accumulated gains, gone in weeks. This is not a bug in momentum. It is the feature you pay for with that fat long-run premium.
A recent paper by Wang (2025), presented at the ICEMGD symposium, asks a straightforward question: can you keep the upside of momentum while cutting the crash risk, without resorting to market timing calls or discretionary overrides?
Three strategies, one answer
The study compares three approaches on S&P 500 stocks from 2010 to 2024:
Pure momentum. Classic 12-1 long-short, top and bottom decile, rebalanced monthly. Delivered 20.3% annualized with a 1.94 Sharpe. Solid numbers. But the max drawdown hit -30.4%, almost entirely during the COVID crash.
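For readers who want the baseline spelled out, here is a minimal sketch of a 12-1 signal and a decile long-short book. It assumes month-end prices, oldest first, and equal weights within each leg. The function names and the equal-weighting are illustrative choices, not details taken from the paper.

```python
def momentum_12_1(prices):
    """12-1 momentum from month-end prices (oldest first).

    Uses the return from 13 month-ends ago to 2 month-ends ago, so the most
    recent month is skipped. Returns None if there is too little history.
    """
    if len(prices) < 13:
        return None
    return prices[-2] / prices[-13] - 1.0


def long_short_deciles(signals, fraction=0.10):
    """Equal-weight long the top `fraction` of names, short the bottom.

    `signals` maps ticker -> signal value. Equal weighting is an assumption.
    """
    ranked = sorted(signals, key=signals.get)  # ascending by signal
    n = max(1, int(len(ranked) * fraction))
    weights = {name: 0.0 for name in ranked}
    for name in ranked[-n:]:   # highest signals: long leg
        weights[name] = 1.0 / n
    for name in ranked[:n]:    # lowest signals: short leg
        weights[name] = -1.0 / n
    return weights


# Toy universe: 20 hypothetical tickers with steadily different 12-1 returns.
history = {
    f"T{i:02d}": [100.0] * 11 + [100.0 + i, 50.0]  # last month is ignored
    for i in range(20)
}
signals = {name: momentum_12_1(p) for name, p in history.items()}
weights = long_short_deciles(signals)
```

The last price in each toy series is deliberately extreme to show that the skipped month has no effect on the ranking.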
Momentum with a static ROE filter. Same setup, but the universe is restricted to stocks with above-median Return on Equity. The idea: screen out financially weak names that tend to blow up in reversals. It worked for drawdown reduction (max drawdown fell to -16.3%) but the cure was almost worse than the disease. Annual returns dropped to 7.7%, Sharpe fell to 0.89. The filter was too blunt. It kept you out of trouble, but also out of the best trades.
Dynamic ML momentum with LightGBM. A gradient-boosted tree model trained on momentum, ROE, and price-to-book, re-estimated each month with an expanding window. The model decides how much weight to put on each signal depending on current conditions. Result: 22% annualized return, 2.53 Sharpe, max drawdown of just -13%.
Read those numbers again. The ML strategy earned slightly more than pure momentum, with less than half the drawdown. Its Sharpe was about 30% higher than the baseline's (2.53 vs. 1.94) and nearly three times that of the static filter approach (2.53 vs. 0.89).
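If you want to run the same comparison on your own return series, the three headline statistics are easy to compute. The sketch below assumes monthly returns, a zero risk-free rate by default, and sample standard deviation. The paper's exact conventions are not stated here, so treat these as common defaults rather than a replication.

```python
import math
import statistics


def annualized_return(monthly_returns):
    """Geometric annualized return from a list of monthly returns."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    return growth ** (12.0 / len(monthly_returns)) - 1.0


def annualized_sharpe(monthly_returns, monthly_rf=0.0):
    """Mean excess return over sample stdev, scaled by sqrt(12)."""
    excess = [r - monthly_rf for r in monthly_returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(12)


def max_drawdown(monthly_returns):
    """Worst peak-to-trough loss of the compounded equity curve (<= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in monthly_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst
```

Max drawdown is path-dependent, which is why a two-month crash can dominate a fifteen-year statistic.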
What the model actually does
The feature importance breakdown is telling. In normal times, price momentum dominates. The model essentially runs a momentum strategy with minor adjustments. But during the turbulent February-March 2020 window, it shifted weight toward ROE and valuation, effectively rotating from "buy winners" to "buy quality." It sidestepped the worst of the crash not by predicting the crash, but by recognizing that the character of winning stocks had changed.
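The mechanism that lets the weights shift is the monthly re-estimation on an expanding window. Here is a minimal sketch of that walk-forward loop with the model left pluggable. The `panel` layout, the `fit`/`predict` callables and the `min_train` default are all hypothetical. In practice a gradient-boosted regressor would be slotted in where the stand-in sits.

```python
def expanding_window_splits(months, min_train=24):
    """Yield (train_months, test_month) pairs with a growing training set."""
    for i in range(min_train, len(months)):
        yield months[:i], months[i]


def walk_forward(months, panel, fit, predict, min_train=24):
    """Re-fit on all earlier months, then score the current month's stocks.

    panel[month] is a list of (features, next_month_return) rows
    (a hypothetical layout). Only months strictly before the scored month
    are used for training, so every training label has already been
    realized by the time the scores are produced.
    """
    scores = {}
    for train_months, test_month in expanding_window_splits(months, min_train):
        train_rows = [row for m in train_months for row in panel[m]]
        model = fit(train_rows)
        scores[test_month] = [predict(model, feats) for feats, _ in panel[test_month]]
    return scores


# Stand-in "model": predicts the average label seen so far.
def fit_mean(rows):
    return sum(label for _, label in rows) / len(rows)


def predict_mean(model, features):
    return model
```

The stand-in ignores its features entirely. Its only job is to show which data is visible at each refit.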
During expansions, the baseline and ML strategy produced nearly identical returns (23.2% annualized, Sharpe of 2.85). The difference showed up entirely in the recession months: baseline momentum lost 28%, the ML strategy lost 10%.
This is the core insight. You do not need the model to be a better stock picker in normal markets. You need it to know when normal is over.
The fine print
A few caveats are worth noting. The study ignores transaction costs. For a monthly long-short strategy in the top and bottom deciles of the S&P 500, costs are nontrivial and would reduce the returns of all three strategies. The ROE data is measured as of end-2024 and applied retroactively as a static characteristic, which introduces look-ahead bias in the quality filter (less of a concern for the ML model, which is re-trained on an expanding window). The recession sample is exactly two months, which makes regime-specific statistics noisy. And the "dynamic" strategy in practice uses a simplified rule: it mirrors baseline momentum in non-recession months and switches to quality-filtered momentum during the labeled recession. This is closer to a regime-switching backtest than a true adaptive model.
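To make the simplified rule concrete, here is a rough sketch of what such a regime switch looks like in code. It assumes the recession flag is handed in by the caller and that "quality-filtered" means above-median ROE, as in the static filter. The names and the equal weighting are illustrative, not the paper's implementation.

```python
import statistics


def decile_weights(momentum, fraction=0.10):
    """Equal-weight long the top `fraction` by momentum, short the bottom."""
    ranked = sorted(momentum, key=momentum.get)
    n = max(1, int(len(ranked) * fraction))
    weights = {name: 0.0 for name in ranked}
    for name in ranked[-n:]:
        weights[name] = 1.0 / n
    for name in ranked[:n]:
        weights[name] = -1.0 / n
    return weights


def regime_switched_weights(momentum, roe, in_recession, fraction=0.10):
    """Baseline momentum in normal months, ROE-filtered momentum otherwise.

    `in_recession` is supplied by the caller (an assumption). If it comes
    from after-the-fact recession dating, it is itself a look-ahead input.
    """
    if in_recession:
        cutoff = statistics.median(roe.values())
        eligible = {k: v for k, v in momentum.items() if roe[k] > cutoff}
    else:
        eligible = momentum
    return decile_weights(eligible, fraction)


# Toy data: 20 hypothetical names; odd-numbered ones have weak ROE.
momentum = {f"S{i:02d}": i / 100.0 for i in range(20)}
roe = {f"S{i:02d}": (0.20 if i % 2 == 0 else 0.05) for i in range(20)}
calm = regime_switched_weights(momentum, roe, in_recession=False)
stress = regime_switched_weights(momentum, roe, in_recession=True)
```

Written this way, it is clear that all of the "adaptivity" lives in a single boolean, which is why the quality of the regime label matters so much.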
The paper also acknowledges the overfitting risk inherent in any ML approach and the limited interpretability of gradient-boosted models.
Why it matters anyway
Despite these caveats, the paper illustrates a principle that holds up well beyond this specific backtest: momentum and quality are not competing strategies. They are complementary signals with different regime sensitivities. The value of a model is not that it finds a new alpha source, but that it learns the conditional relationship between existing signals and adjusts the mix in real time.
For anyone running momentum exposure in a systematic portfolio, the practical takeaway is clear. A static quality overlay is too expensive in opportunity cost. A dynamic overlay that increases quality weight only when conditions deteriorate preserves most of the momentum premium while meaningfully reducing tail risk. Whether that overlay comes from LightGBM, a simpler regime indicator, or volatility scaling (as Barroso and Santa-Clara showed in 2015), the logic is the same: let momentum run when it wants to, and pull the handbrake when fundamentals start flashing.
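Volatility scaling is the simplest of those overlays to prototype. The sketch below sizes next month's momentum exposure as target volatility over recent realized volatility. It uses monthly returns for brevity, and the 12% target, six-month lookback and 2x leverage cap are illustrative assumptions rather than a replication of Barroso and Santa-Clara's setup.

```python
import math
import statistics


def vol_scaled_exposure(past_returns, target_annual_vol=0.12,
                        lookback=6, max_leverage=2.0):
    """Exposure multiplier for next month, from past strategy returns only.

    past_returns: monthly returns of the unscaled strategy, oldest first,
    ending with the last fully realized month (no peeking at the month
    being sized).
    """
    window = past_returns[-lookback:]
    if len(window) < 2:
        return 1.0  # not enough history to estimate volatility
    realized = statistics.stdev(window) * math.sqrt(12)  # annualized
    if realized == 0.0:
        return max_leverage
    return min(max_leverage, target_annual_vol / realized)


quiet = vol_scaled_exposure([0.01, 0.03] * 3)     # low realized vol
stormy = vol_scaled_exposure([0.10, -0.10] * 3)   # high realized vol
```

Exposure is cut when recent returns have been violent and capped when they have been calm. That is the same handbrake logic, driven by volatility instead of fundamentals.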
The hard part was never knowing this. The hard part is having a systematic process that does it without human intervention. That is where the model earns its keep.
Reference: Wang, S. (2025). Market Investment Strategy: Cross-Sectional Momentum with Dynamic Filtering. Proceedings of ICEMGD 2025 Symposium, 39-46. DOI: 10.54254/2754-1169/2025.LH25120