Generating model documentation, sourcing, and comments

Investment Banking: Ranking the Best AI Tools for Financial Modeling (2026) 🤯

We extensively tested OpenAI's ChatGPT, Anthropic's Claude, Microsoft Copilot Agent Mode, and Shortcut on a real three-statement Excel model using investment banking standards.

Here’s what actually works, and what doesn’t.

Quick Answer: The Rankings 👇

Best Overall: Shortcut

Close Second: Claude (Opus 4.6)

Third Place: Microsoft Copilot (GPT-5)

Distant Fourth: ChatGPT (GPT-5.2)

The Testing:

Criterion 1: How It Feels Working With The Tool

Understanding the Assignment:

Winners: Claude and Shortcut

Claude and Shortcut asked thoughtful clarifying questions after receiving the prompt, about:

• Forecast preferences
• Revenue segmentation
• Share repurchases
• Layout decisions
• Schedule structure

That behavior closely resembles what you’d want from a good junior analyst.

Copilot and ChatGPT asked none.

Speed:

Winner: Shortcut

Shortcut and Claude completed the setup in ~15 minutes vs ~25 minutes for Copilot.

ChatGPT took close to an hour.

A decent analyst would have taken 1-2 hours to complete the assignment.

So all were faster than a human analyst at initial setup.

Criterion 2: Data Extraction, Formatting and Best Practices

Formatting:

Winner: Shortcut

Shortcut and Claude produced the most “investment-bank-like” outputs.

Shortcut was more consistent with input coloring and structure.

Claude missed several formatting conventions.

Copilot ignored IB formatting entirely.
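The post does not spell out the formatting rules it graded against. As a rough sketch, assume the widely used banking convention of blue for hardcoded numeric inputs, black for formulas and labels, and green for formulas that link to another sheet. The function name and the sample cell contents below are hypothetical, and a real check would read font colors from the workbook itself.

```python
def expected_font_color(cell_content):
    """Return the font color a cell should have under the assumed convention."""
    text = cell_content.strip()
    if text.startswith("="):
        # A "!" in a formula is treated here as a cross-sheet reference.
        return "green" if "!" in text else "black"
    try:
        float(text.replace(",", ""))
    except ValueError:
        return "black"  # text label, not a numeric input
    return "blue"  # hardcoded numeric input


# Hypothetical cell contents
samples = ["1,250.0", "=B4*(1+B5)", "=Assumptions!B2", "Revenue"]
colors = [expected_font_color(s) for s in samples]
```

Comparing these expected colors with the colors a tool actually applied is one way to make "consistent with input coloring" measurable.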

Accuracy:

Winner: Copilot

Copilot won, but accuracy was disappointing across the board.

Shortcut and Claude hallucinated significant portions of historical data.

In both cases, the errors were subtle enough to be dangerous, with slightly incorrect line items all adding up to correct subtotals.

Shortcut’s second attempt returned almost no mistakes.

Claude continued to generate bad data.

Fixing this would require careful cell-by-cell auditing that takes longer than just inputting the numbers yourself.
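The failure mode above, wrong line items that still sum to the correct subtotal, is why a subtotal check alone proves little. Here is a minimal sketch of a line-by-line tie-out, assuming the model values and the source values are both available as dictionaries. The function name, tolerance, and all figures are hypothetical and invented for illustration.

```python
def tie_out(model, source, tolerance=0.5):
    """Return line items whose model value differs from the source value."""
    mismatches = {}
    for item, source_value in source.items():
        model_value = model.get(item)
        if model_value is None or abs(model_value - source_value) > tolerance:
            mismatches[item] = (model_value, source_value)
    return mismatches


# Hypothetical balance sheet figures, in $ millions
source = {"Cash": 120.0, "Receivables": 80.0, "Inventory": 50.0}
model = {"Cash": 125.0, "Receivables": 75.0, "Inventory": 50.0}

# The subtotals agree even though two line items are wrong.
subtotal_ok = sum(model.values()) == sum(source.values())
errors = tie_out(model, source)
```

Here the subtotal check passes while the tie-out flags Cash and Receivables, which is the pattern the post describes.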

As a rule, analysts should not rely on these agents to find data and should instead upload PDFs and spreadsheets for the agents to work with.

Copilot and ChatGPT were more accurate.

ChatGPT’s presentation was the least polished, but its historical balance sheet was easiest to audit.

Had Shortcut and Claude used correct data, they would have won, as they attempted a more analytically rigorous presentation.

Shortcut also went into the footnotes to break out certain items where appropriate.

Sourcing and Commenting:

Winner: Claude

Claude provided the best explanations of where data came from and why certain modeling decisions were made.

Copilot added no comments.

ChatGPT added too many.

Shortcut added some, but less consistently.

Claude was also the only tool to backsolve EBITDA correctly.
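The post does not say how the backsolve was done. One common approach, assumed here, applies when a company does not report EBITDA directly: add depreciation and amortization (typically taken from the cash flow statement) back to operating income. The function name and all figures below are hypothetical.

```python
def backsolve_ebitda(operating_income, depreciation_and_amortization):
    """EBITDA = operating income (EBIT) + D&A added back."""
    return operating_income + depreciation_and_amortization


# Hypothetical figures, in $ millions
operating_income = 400.0  # from the income statement
d_and_a = 90.0            # from the cash flow statement
revenue = 2000.0

ebitda = backsolve_ebitda(operating_income, d_and_a)
ebitda_margin = ebitda / revenue
```

A cell comment naming the source of each input is the kind of sourcing the section above rewards.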

The Bottom Line:

Shortcut and Claude significantly outperform Copilot and ChatGPT.

But right now, even the best tool still underperforms a junior analyst.

Matan Feldman, CEO & Founder, Wall Street Prep

Feb 23, 2026

Generating model documentation, sourcing, and comments

Why the human is still essential here

How people use this

Cell-level comments for key drivers

Model methodology and assumptions memo draft

Tie-out checklist and QA variance notes
