
Bibby implementation · Based on Zhu et al., arXiv:2601.23265

Paper Banana on Bibby

Our tool, inspired by the PaperBanana research paper

Bibby's Paper Banana feature applies the multi-agent approach from Zhu et al. (2026) to turn your scientific content into publication-ready methodology diagrams and statistical plots — inside your Bibby workspace.

The overview below summarizes the original paper. To try Bibby's implementation, use Create Illustration.

Original paper authors

Dawei Zhu (PKU*), Rui Meng (Google), Yale Song (Google), Xiyu Wei (PKU), Sujian Li (PKU), Tomas Pfister (Google), Jinsung Yoon (Google)


292

Test Cases

from NeurIPS 2025

5

Agents

specialized AI agents

4/4

Evaluation Dimensions Won

faithfulness, conciseness, readability, aesthetics


From the original paper

Abstract reproduced from Zhu et al. (arXiv:2601.23265). Bibby's tool is an independent implementation inspired by this work.

Despite rapid advances in autonomous AI scientists powered by language models, generating publication-ready illustrations remains a labor-intensive bottleneck in the research workflow. To lift this burden, we introduce PaperBanana, an agentic framework for automated generation of publication-ready academic illustrations.

Powered by state-of-the-art VLMs and image generation models, PaperBanana orchestrates specialized agents to retrieve references, plan content and style, render images, and iteratively refine via self-critique. To rigorously evaluate our framework, we introduce PaperBananaBench, comprising 292 test cases for methodology diagrams curated from NeurIPS 2025 publications.

VLMs

State-of-the-art vision-language models

Multi-Agent

5 specialized collaborative agents

Self-Critique

Iterative quality refinement loop

NeurIPS 2025

Benchmark from top venue

Multi-Agent Architecture

Five Agents, One Mission

Architecture from the PaperBanana paper — as implemented in Bibby's Paper Banana tool.

🔍

Retriever

Identifies relevant reference examples from academic databases to guide style and content alignment.

→

📐

Planner

Translates scientific content into a detailed visual plan, decomposing structure and layout.

→

🎨

Stylist

Enforces academic aesthetic standards — color palettes, typography, line weights, and visual hierarchy.

→

🖼️

Visualizer

Renders the initial image or generates Python/Matplotlib plotting code from the visual plan.

→

🔬

Critic

Performs iterative self-critique, inspecting generated results against source content and triggering refinement.
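
The hand-off between the five agents above can be sketched as a chain of plain Python functions. This is a minimal illustration under stated assumptions, not the paper's or Bibby's implementation: every name here (`Draft`, `retrieve`, `plan`, `style`, `visualize`, `critique`, `run_pipeline`) is hypothetical, and each stage is a stub that only records what a real agent would produce.

```python
from dataclasses import dataclass, field


@dataclass
class Draft:
    """Hypothetical record passed from one agent to the next."""
    source_text: str
    references: list = field(default_factory=list)
    plan: str = ""
    style: dict = field(default_factory=dict)
    image: str = ""
    feedback: list = field(default_factory=list)


def retrieve(draft):
    # Stub: a real Retriever would search a reference collection.
    draft.references = ["reference-example-1", "reference-example-2"]
    return draft


def plan(draft):
    # Stub: a real Planner would decompose the content into a layout.
    draft.plan = f"layout for: {draft.source_text}"
    return draft


def style(draft):
    # Stub: a real Stylist would choose palette, fonts, and line weights.
    draft.style = {"palette": "muted", "font": "sans-serif"}
    return draft


def visualize(draft):
    # Stub: a real Visualizer would render an image or emit plotting code.
    draft.image = f"<rendered {draft.plan!r} with {draft.style['palette']} palette>"
    return draft


def critique(draft):
    # Stub: a real Critic would compare the render with the source content.
    if draft.source_text not in draft.image:
        draft.feedback.append("render does not mention the source content")
    return draft


def run_pipeline(source_text):
    # Each stage receives the draft, fills in its own field, and passes it on.
    draft = Draft(source_text=source_text)
    for stage in (retrieve, plan, style, visualize, critique):
        draft = stage(draft)
    return draft


result = run_pipeline("encoder-decoder with attention")
print(result.image)
print(result.feedback)
```

Passing one shared record through every stage keeps each agent's contribution inspectable, which is what lets a final critique step compare the render against the original source text.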

Generated Illustrations

✓ Illustrative preview of the paper's multi-agent pipeline

Capabilities

What the paper describes

Capabilities reported in Zhu et al.; Bibby's implementation may differ in scope and availability.

🗂️ NeurIPS-ready

Methodology Diagrams

Neural networks, flowcharts, multi-agent pipelines, and complex system architectures — all rendered to publication standards.

📊 Data-exact

Statistical Plots

Accurate data visualization via Matplotlib code generation. Bar charts, ablation studies, and accuracy comparisons grounded in your data.

✏️ Sketch input

Sketch-to-Pro

Transform rough hand-drawn sketches into clean, harmonious academic figures with consistent fonts and styling.

✨ Polish mode

Aesthetic Refinement

Upload existing diagrams to upgrade fonts, colors, and spacing without altering underlying content or structure.

📚 Context-aware

Reference-Driven

Retrieves relevant papers to align style with academic conventions, so your figures follow the target venue's aesthetic.

🔄 Auto-refine

Iterative Refinement

A self-critique loop targets publication-quality output: the Critic agent reviews each result and triggers regeneration until it passes review.
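
For the Statistical Plots capability, the page describes generating Matplotlib code rather than drawing pixels directly. A rough sketch of that idea follows, assuming a hypothetical helper `make_bar_chart_code` and made-up example data. It only builds and syntax-checks a script, so the sketch itself runs without Matplotlib installed; the emitted script needs Matplotlib to execute.

```python
def make_bar_chart_code(labels, values, ylabel, out_path):
    """Return the source of a small Matplotlib script (hypothetical helper)."""
    lines = [
        "import matplotlib",
        "matplotlib.use('Agg')  # render off-screen, no display needed",
        "import matplotlib.pyplot as plt",
        "",
        f"labels = {list(labels)!r}",
        f"values = {list(values)!r}",
        "",
        "fig, ax = plt.subplots(figsize=(4, 3))",
        "ax.bar(labels, values)",
        f"ax.set_ylabel({ylabel!r})",
        "fig.tight_layout()",
        f"fig.savefig({out_path!r}, dpi=300)",
    ]
    return "\n".join(lines) + "\n"


# Made-up example data, for illustration only.
script = make_bar_chart_code(
    labels=["model A", "model B"],
    values=[0.71, 0.78],
    ylabel="Accuracy",
    out_path="accuracy.png",
)

# Syntax-check the generated script without executing it.
compile(script, "<generated>", "exec")
print(script)
```

Writing the input numbers into the script verbatim is what would keep the plotted bars tied to the data, since no image model sits between the values and the figure.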

PaperBananaBench

Consistently Outperforms All Baselines

Benchmark results reported in the original paper (292 test cases from NeurIPS 2025). Figures are from Zhu et al., not Bibby-run evaluations.

Faithfulness

Baseline: 61% · PaperBanana: 87%

Conciseness

Baseline: 58% · PaperBanana: 82%

Readability

Baseline: 70% · PaperBanana: 91%

Aesthetics

Baseline: 63% · PaperBanana: 85%
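
The gap between the two columns can be read off as percentage-point differences. The short calculation below uses only the figures listed above; the variable names are illustrative.

```python
# Scores as listed above, in percent: (baseline, PaperBanana).
scores = {
    "Faithfulness": (61, 87),
    "Conciseness": (58, 82),
    "Readability": (70, 91),
    "Aesthetics": (63, 85),
}

# Percentage-point gain per dimension, then the unweighted mean.
gains = {name: ours - base for name, (base, ours) in scores.items()}
mean_gain = sum(gains.values()) / len(gains)

for name, gain in gains.items():
    print(f"{name}: +{gain} points")
print(f"Mean gain: +{mean_gain:.2f} points")
```

On these figures the gains are 26, 24, 21, and 22 points, for an unweighted mean of 23.25 points across the four dimensions.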

PaperBananaBench

292

Test Cases

NeurIPS

Source Venue

2025

Publications

4 Dimensions

Evaluation Axes

The benchmark covers diverse research domains and illustration styles, representing the breadth of modern AI research publications.

How It Works

From Text to Publication Figure in Seconds

Describe your methodology or paste your data. Bibby's Paper Banana applies the paper's agent workflow inside your project.

STEP 01

✍️

Describe

Input your methodology, data, or sketch. PaperBanana accepts text, PDFs, or images.

STEP 02

🔍

Retrieve

The Retriever agent scans reference databases to find style-aligned academic examples.

STEP 03

📐

Plan & Style

Planner and Stylist agents create a detailed visual plan with academic-grade aesthetics.

STEP 04

✨

Render & Refine

Visualizer renders the figure; Critic reviews and iterates until quality is publication-ready.
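
The render-and-refine step amounts to a loop that stops either when the Critic is satisfied or when a round limit is reached. The sketch below is an illustration under stated assumptions: `render`, `review`, `refine`, the pass mark, and the round limit are all hypothetical stand-ins, and the actual stopping rule is not given on this page.

```python
THRESHOLD = 80   # hypothetical pass mark on a 0-100 scale
MAX_ROUNDS = 5   # hypothetical cap so the loop always terminates


def render(plan, feedback):
    # Stub Visualizer: each piece of feedback yields a new revision.
    return f"{plan} (revision {len(feedback)})"


def review(image, revision):
    # Stub Critic: the score rises by a fixed step with every revision.
    score = 50 + 20 * revision
    note = None if score >= THRESHOLD else f"score {score} below {THRESHOLD}"
    return score, note


def refine(plan):
    feedback = []
    for round_no in range(1, MAX_ROUNDS + 1):
        image = render(plan, feedback)
        score, note = review(image, revision=len(feedback))
        if note is None:
            # The Critic has no objections, so stop early.
            return image, score, round_no
        feedback.append(note)
    # Round limit reached: return the last attempt as-is.
    return image, score, MAX_ROUNDS


image, score, rounds = refine("pipeline diagram")
print(image, score, rounds)
```

The round cap is a deliberate choice in this sketch: a critic that never passes a figure would otherwise keep the loop running indefinitely.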

Cite the original paper

If you use the PaperBanana method

Cite Zhu et al. when referring to the research framework. Bibby's web tool is a separate product implementation.

@article{zhu2026paperbanana,
  title={PaperBanana: Automating Academic
         Illustration for AI Scientists},
  author={Zhu, Dawei and Meng, Rui and
          Song, Yale and Wei, Xiyu and
          Li, Sujian and Pfister, Tomas and
          Yoon, Jinsung},
  journal={arXiv preprint arXiv:2601.23265},
  year={2026}
}

Original research affiliations (not Bibby)

🏛️ Peking University

☁️ Google Cloud AI Research

Original paper — corresponding authors (not Bibby support): [email protected] · [email protected] · [email protected]