Multi-agent system achieves 72.7% win rate against baseline models in generating publication-ready academic illustrations
February 12, 2026: Researchers from Google Cloud AI Research and Peking University have introduced PaperBanana, a framework that uses five specialized agents to transform raw technical text into publication-quality methodology diagrams and statistical plots, addressing what has long been considered one of the most time-consuming bottlenecks in academic publishing.
The system, detailed in a paper published on arXiv on January 30, 2026, represents a significant advance in automating the scientific workflow. While AI-powered tools have made strides in literature review and code generation, creating professional academic illustrations has remained predominantly manual work that requires specialized design skills many researchers lack.
Five Agents Working in Concert
PaperBanana divides tasks among specialized AI agents, with the first searching a reference database for similar diagrams to use as templates, the second translating paper descriptions into detailed image descriptions, and the third refining these using aesthetics guidelines extracted from NeurIPS publications.
The seven-member research team (Dawei Zhu, Rui Meng, Yale Song, Xiyu Wei, Sujian Li, Tomas Pfister, and Jinsung Yoon) designed the framework to operate through a two-phase process:
Linear Planning
🔹Retriever Agent identifies the 10 most relevant reference examples from a curated database
🔹Planner Agent converts technical methodology text into structured visual descriptions
🔹Stylist Agent applies academic aesthetic standards, including domain-specific color palettes and layouts
Iterative Refinement
🔹Visualizer Agent renders images using advanced models or generates executable Python code
🔹Critic Agent inspects outputs against source material through three rounds of refinement
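The two phases above can be sketched as a small control loop. This is only an illustrative outline, not the authors' implementation: every function here (retrieve, plan, style, visualize, critique) is a hypothetical stub, the word-overlap ranking is a stand-in for whatever retrieval the real system uses, and only the defaults of 10 references and three refinement rounds come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Draft:
    """State handed from one hypothetical agent stub to the next."""
    description: str
    references: list = field(default_factory=list)
    rendering: str = ""
    rounds: int = 0


def retrieve(source_text, database, k=10):
    """Retriever stub: rank references by shared words, keep the top k."""
    words = set(source_text.lower().split())
    ranked = sorted(
        database,
        key=lambda ref: len(words & set(ref.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def plan(source_text, references):
    """Planner stub: turn methodology text into a visual description."""
    return Draft(description=f"Diagram of: {source_text}", references=references)


def style(draft):
    """Stylist stub: attach placeholder aesthetic guidance."""
    draft.description += " [style: academic palette, clean layout]"
    return draft


def visualize(draft):
    """Visualizer stub: a real system would call an image model or emit code."""
    draft.rendering = f"<render round={draft.rounds} of '{draft.description}'>"
    return draft


def critique(draft, source_text):
    """Critic stub: a real critic would compare the rendering with the source."""
    return f"round {draft.rounds + 1}: re-check against source"


def run_pipeline(source_text, database, refinement_rounds=3):
    # Phase 1, linear planning: retrieve, then plan, then style.
    references = retrieve(source_text, database)
    draft = style(plan(source_text, references))

    # Phase 2, iterative refinement: render, critique, revise, re-render.
    draft = visualize(draft)
    for _ in range(refinement_rounds):
        feedback = critique(draft, source_text)
        draft.description += f" | {feedback}"
        draft.rounds += 1
        draft = visualize(draft)
    return draft
```

Calling run_pipeline("An encoder decoder model with attention", database) with any list of reference strings returns a Draft whose rounds field is 3 and whose references list holds at most 10 entries.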
Addressing the "Numerical Hallucination" Problem
One of PaperBanana's most innovative features is its dual-mode approach to statistical plots. For statistical data, the system switches from direct image generation to executable Python Matplotlib code, so plotted values come from the data itself, avoiding the numerical hallucinations common in standard AI image generators.
This design decision proves critical for academic publishing, where even minor numerical inaccuracies can undermine a paper's credibility. Traditional image generation models often produce visually appealing charts but suffer from distorted scales, repeated elements, or incorrect data representation.
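To make the code-generation mode concrete, here is a minimal sketch of emitting a Matplotlib script rather than an image. It is not PaperBanana's actual generator: build_plot_script and its parameters are hypothetical names, and the example values are the score improvements this article reports for the benchmark. The point it illustrates is that the numbers are copied into the script verbatim, so the rendered bars cannot drift from the data.

```python
def build_plot_script(labels, values, ylabel, out_path="plot.png"):
    """Return Matplotlib source code that draws the given values as bars.

    Hypothetical helper: values are embedded with repr(), so the emitted
    script carries exactly the numbers it was given.
    """
    if len(labels) != len(values):
        raise ValueError("labels and values must have the same length")
    lines = [
        "import matplotlib",
        'matplotlib.use("Agg")  # headless backend, no display needed',
        "import matplotlib.pyplot as plt",
        "",
        f"labels = {list(labels)!r}",
        f"values = {list(values)!r}",
        "",
        "fig, ax = plt.subplots(figsize=(5, 3))",
        "ax.bar(labels, values)",
        f"ax.set_ylabel({ylabel!r})",
        "fig.tight_layout()",
        f"fig.savefig({out_path!r}, dpi=300)",
    ]
    return "\n".join(lines) + "\n"


script = build_plot_script(
    ["Overall", "Conciseness", "Readability", "Aesthetics"],
    [17.0, 37.2, 12.9, 6.6],
    ylabel="Improvement over baseline (%)",
)
print(script)
```

Running the printed script in an environment with Matplotlib installed writes plot.png. Because the figure is produced by code, a reviewer can check the values line in the script against the source table instead of reading numbers off pixels.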
Benchmark Results Demonstrate Superior Performance
To evaluate the framework, the team introduced PaperBananaBench, comprising 292 test cases curated from actual NeurIPS 2025 publications across diverse research domains and illustration styles.
Using a vision-language model (VLM) as evaluator, PaperBanana demonstrated significant improvements over baseline approaches:
🔹Overall Score: +17.0%
🔹Conciseness: +37.2%
🔹Readability: +12.9%
🔹Aesthetics: +6.6%
The system achieved a 72.7% win rate against baseline AI models in blind human evaluation, with particularly strong performance in "Agent & Reasoning" diagram categories, where it scored 69.9%.
From Research Problem to Practical Tool
The framework addresses a well-documented pain point in scientific publishing. Earlier code-based approaches, such as TikZ or python-pptx, fall short on complex visual elements like specialized icons and custom shapes, which are now standard in modern AI publications.
Beyond generating diagrams from scratch, PaperBanana can polish existing hand-drawn sketches or rough drafts into professional-grade illustrations. This capability allows researchers to focus on conceptualizing their ideas while the system handles the technical execution of visual design.
Community Response and Open Source Implementation
While the original PaperBanana system relies on proprietary Google models (Gemini 3 Pro and Nano Banana Pro), the research has already sparked community interest. An unofficial open-source implementation has emerged on GitHub, complete with Model Context Protocol (MCP) server support for integration with development tools.
PaperBanana's release comes as AI providers increasingly invest in scientific workflow automation. The framework represents a strategic move toward making high-quality scientific communication more accessible and less dependent on manual design expertise.