Recent advances in accounting-focused AI models are delivering measurable improvements in accuracy, but new benchmark data suggests that fully autonomous finance remains a long way off.

A newly released benchmark from DualEntry evaluating leading AI models places Claude Opus 4.7 at the top, achieving 79.2% overall accuracy, ahead of models such as GPT-5.4 and GPT-5.4-Nano. The evaluation covered ten models across major vendors including Anthropic, OpenAI, Google, MiniMax, and Zhipu AI.

While the results highlight clear progress, they also reveal a more nuanced reality: AI is performing well in structured accounting tasks, but continues to struggle with complex, multi-step financial processes.

Strong Performance in Structured Tasks

One of the most notable findings is the high level of accuracy achieved in well-defined, repeatable accounting functions.

AI models reached over 90% accuracy in transaction classification and journal entries, indicating that rule-based and pattern-driven tasks are increasingly within reach of automation. These areas, traditionally time-consuming for finance teams, are becoming prime candidates for AI augmentation.

This aligns with a broader shift across enterprise software, where AI is first proving its value in structured workflows before expanding into more complex operational domains.

The Month-End Close Challenge

In contrast, performance drops significantly when models are applied to more complex workflows.

Accuracy falls to around 50% in month-end close processes, a critical function that involves multiple steps, contextual judgment, and cross-functional data dependencies. Unlike transaction-level tasks, these workflows require not just pattern recognition, but an understanding of sequencing, exceptions, and business context.

This gap underscores a key limitation: while AI can assist with individual tasks, it still struggles to manage interconnected financial processes end-to-end.

Open Models Narrow the Gap

Another notable outcome is the performance of emerging open-weight models.

Models such as GLM-5 and MiniMax M2.7 outperformed several established alternatives, including higher-profile proprietary models. This suggests that innovation in accounting AI is not limited to a small group of vendors, and that competitive dynamics are broadening.

For enterprises, this could translate into greater choice, but also greater complexity in evaluating and integrating the right solutions.

A Persistent Accuracy Gap

Despite improvements across the board, no model exceeded 80% overall accuracy.

That ceiling matters: even the top-ranked model, at 79.2%, erred on roughly one in five benchmark items. In finance, where decisions carry regulatory, financial, and reputational consequences, even small error rates can have significant implications. As a result, full automation remains out of reach for most organizations.

Instead, AI is increasingly being positioned as a co-pilot that supports analysis, accelerates workflows, and surfaces insights, rather than as a replacement for human oversight.

What This Means for Finance Leaders

The findings point to a clear conclusion: AI in accounting is advancing, but not yet ready for full autonomy.

For CFOs and finance leaders, this creates a dual reality. On one hand, there are immediate opportunities to automate structured tasks and improve efficiency. On the other, core processes such as financial close, compliance, and reporting still require human judgment and control.

This reinforces a broader industry pattern. AI is moving from assistance toward execution, but finance may be one of the last domains where fully autonomous operation is accepted.

The Road Ahead

As AI models continue to improve, the focus is likely to shift from raw accuracy toward reliability, explainability, and workflow integration.

Closing the gap between task-level performance and end-to-end process execution will be the next major milestone. Until then, organizations will need to balance innovation with caution, adopting AI where it delivers clear value while maintaining oversight where it matters most.

For those interested in exploring the underlying data, the full benchmark and methodology are available via DualEntry’s accounting AI benchmark platform.

ERP News Editorial Team

The ERPNews Editorial Team covers global developments in ERP (Enterprise Resource Planning), enterprise software, cloud platforms, AI, automation, and digital transformation, providing independent news and editorial analysis for senior business and technology leaders. Our reporting focuses on market signals, strategic shifts, and enterprise impact across the ERP and enterprise technology ecosystem.


AI Still Falls Short of End-to-End Accounting, Despite Benchmark Gains
