In 2025, I ran a prediction game for finance professionals. This year, to keep pace with current trends, I decided instead to pose ten questions about markets in 2026 to several large language models (LLMs). Will they do any better than humans, and will they agree with each other?
I used three LLMs for this task: ChatGPT (5.2), Gemini (3), and Claude (Sonnet 4.5). Here are the questions and responses:
1) Where will the S&P 500 finish 2026?
ChatGPT: 7,700
Gemini: 7,555
Claude: 6,800
2) Where will the FTSE 100 finish 2026?
ChatGPT: 10,600
Gemini: 10,750
Claude: 9,200
3) What will the ten-year Treasury yield be at the end of 2026?
ChatGPT: 4.50%
Gemini: 3.75%
Claude: 4.50%
4) Where will the GBP/USD spot rate be at the end of 2026?
ChatGPT: 1.28
Gemini: 1.38
Claude: 1.28
5) What will the dollar price of Brent crude oil be at the end of 2026?
ChatGPT: $57
Gemini: $55
Claude: $66
6) Will there be a 20% decline in the S&P 500 in 2026?
ChatGPT: No
Gemini: No
Claude: No
7) What will the dollar price of Bitcoin be at the end of 2026?
ChatGPT: $130,000
Gemini: $135,000
Claude: $78,000
8) Will the Russell 2000 outperform the S&P 500 in 2026?
ChatGPT: No
Gemini: No
Claude: No
9) What will the Federal Funds Target Rate (upper bound) be at the end of 2026?
ChatGPT: 3.25%
Gemini: 3.25%
Claude: 3.50%
10) What will the dollar price of gold be at the end of 2026?
ChatGPT: $4,500
Gemini: $4,900
Claude: $3,100
It was reassuring that when I initially asked the AI models to make these predictions, they were reluctant and would only offer heavily caveated ranges – suggesting they may be better calibrated than humans. I had to persuade them to produce point forecasts!
Of course, the models were right to be reticent. No LLM will be capable of making accurate and specific predictions about a system as complex as financial markets. Still, there were some other aspects I wanted to explore.
Were the predictions consistent across models?
Absolutely not. While there was some consistency between ChatGPT and Gemini, Claude made several very bold calls (gold at $3,100, for example). Interestingly, Claude’s rationale was often the most “convincing.” It behaved like a particular type of market strategist – frequently wrong, but articulate and provocative, and therefore the type that attracts the most attention.
Were the predictions internally consistent?
This is an important and difficult question to answer. The models could certainly sound internally consistent when asked to justify their views, but some combinations looked odd. For example, Claude was the most bullish on oil while also being the most bearish on UK equities – a scenario that could happen, but feels like an unlikely pairing, given the resource-heavy nature of the UK market.
Are they better than human predictions?
I was asking the models to predict outcomes that, in my view, are impossible for either humans or LLMs to forecast with any real accuracy. Can the models do this as well as a human? Almost certainly. Does that make it useful? Outside of marketing copy, probably not.
There are definitely more important things to spend tokens on!
—
* Thank you to my colleague Duncan for suggesting this post.
—
My first book has been published. The Intelligent Fund Investor explores the beliefs and behaviours that lead investors astray, and shows how we can make better decisions. You can get a copy here (UK) or here (US).
All opinions are my own, not that of my employer or anybody else. I am often wrong, and my future self will disagree with my present self at some point. Not investment advice.