We-Math 2.0

A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning

Beijing University of Posts and Telecommunications · WeChat Vision, Tencent Inc. · Tsinghua University

About We-Math Series

The We-Math series is dedicated to advancing multimodal mathematical reasoning through fundamental, long-term efforts.
Its vision is to enable Large Multimodal Models (LMMs) to master essential knowledge and develop genuine reasoning abilities, empowering LMMs to tackle complex and open-ended mathematical challenges with depth, adaptability, and creativity.

📊

We-Math (Focus on Evaluation)

  • Process-oriented decomposition: Breaks down visual mathematical problems into step-wise sub-problems to explore process-level mechanisms.
  • Hierarchical knowledge framework: 5 layers of knowledge granularity and 67 knowledge concepts, where each decomposition step is grounded in explicit mathematical knowledge.
  • Four-dimensional diagnostics: a four-dimensional metric suite that hierarchically diagnoses inherent reasoning issues.
🧠

We-Math 2.0 (Focus on Training)

  • MathBook knowledge system: Five-level hierarchy with 491 knowledge points and 1,819 fundamental principles.
  • Training data (Part 1): MathBook-Standard ensures broad conceptual coverage and flexibility via dual expansion (multi-images per question, multi-questions per image).
  • Training data (Part 2): MathBook-Pro defines a 3D difficulty space and provides 7 graded variants per problem for progressive learning.
  • RL training: Two-stage pipeline with Cold-Start fine-tuning for knowledge-oriented chain-of-thought alignment, followed by Progressive Alignment RL using average-reward learning and dynamic scheduling.

Note. All We-Math 2.0 datasets are manually constructed to ensure complete coverage of all knowledge points and principles, with carefully crafted images guaranteeing high quality and diverse complexity; all related GGB files and training data will be fully open-sourced to the AI and education communities.

We-Math 2.0

We-Math 2.0 is a unified system designed to comprehensively enhance the mathematical reasoning capabilities of Multimodal Large Language Models (MLLMs). It integrates a structured mathematical knowledge system, model-centric data space modeling, and a reinforcement learning (RL)-based training paradigm to achieve both broad conceptual coverage and robust reasoning performance across varying difficulty levels.

The key contributions of We-Math 2.0 are fourfold:

  • MathBook Knowledge System: a five-level hierarchical structure encompassing 491 knowledge points and 1,819 fundamental principles.
  • MathBook-Standard & MathBook-Pro: MathBook-Standard ensures wide conceptual coverage and flexibility via dual expansion, while MathBook-Pro defines a three-dimensional difficulty space and generates 7 progressive variants per problem for robust training.
  • MathBook-RL: a two-stage RL framework comprising Cold-Start Fine-tuning for knowledge-oriented chain-of-thought alignment, and Progressive Alignment RL with average-reward learning and dynamic data scheduling for gradual alignment across difficulty levels.
  • MathBookEval: a comprehensive benchmark covering all 491 knowledge points with diverse reasoning step distributions.

Extensive experiments show that MathBook-RL consistently outperforms existing baselines on four widely-used benchmarks and achieves strong results on MathBookEval, demonstrating superior generalization in mathematical reasoning.


    Overview diagram of We-Math 2.0, including examples of knowledge points, fundamental principles, and sample problems.

    MathBook Knowledge System

    MathBook Knowledge System is organized as a five-level hierarchy covering 491 knowledge points and 1,819 fundamental principles. It is systematically derived from trusted sources such as Wikipedia and open-source textbooks, refined through hierarchical clustering, and further revised by human experts to ensure accuracy and completeness. An overview of this structure is provided in the module display below.
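
    As an illustrative representation, the hierarchy can be viewed as a tree whose internal nodes are the first four levels and whose leaves are knowledge points carrying their fundamental principles. The minimal sketch below is a hypothetical rendering of that structure; the class and field names are assumptions, not the released data format.

        from dataclasses import dataclass, field
        from typing import List

        @dataclass
        class KnowledgePoint:
            name: str                                              # leaf-level knowledge point
            principles: List[str] = field(default_factory=list)    # fundamental principles attached to it

        @dataclass
        class KnowledgeNode:
            name: str                                              # category at levels 1-4
            children: List["KnowledgeNode"] = field(default_factory=list)
            points: List[KnowledgePoint] = field(default_factory=list)

        def count_points(node: KnowledgeNode) -> int:
            # Recursively count knowledge points under a node (491 across the full system).
            return len(node.points) + sum(count_points(c) for c in node.children)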

    MathBook-Standard

    Building on the MathBook Knowledge System, MathBook-Standard is a dataset featuring comprehensive principle-level knowledge annotations and carefully curated problems to ensure broad, balanced coverage across mathematical domains, with particular focus on underrepresented areas. To foster deeper conceptual understanding, MathBook-Standard employs a dual-expansion strategy: multi-images per question and multi-questions per image, enabling the creation of diverse problem sets that promote conceptual flexibility and adaptability.

    Below, we present the multi-images-per-question component of the dataset, which can be retrieved via its underlying knowledge principles.

    (All images are manually created using GeoGebra.)
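
    To make the dual expansion concrete, a record could be organized roughly as below; every field name and value here is a hypothetical placeholder rather than the released data schema.

        # Hypothetical record layouts for the dual-expansion strategy (illustrative only).

        # Multi-images per question: one question paired with several GeoGebra variants
        # that all exercise the same knowledge principle.
        multi_image_record = {
            "principle": "Pythagorean theorem",
            "question": "Find the length of the hypotenuse.",
            "images": ["variant_1.ggb", "variant_2.ggb", "variant_3.ggb"],
            "answers": ["5", "13", "25"],
        }

        # Multi-questions per image: one figure reused for several questions.
        multi_question_record = {
            "principle": "Pythagorean theorem",
            "image": "right_triangle.ggb",
            "questions": ["Find the hypotenuse.", "Find the area.", "Find the perimeter."],
            "answers": ["5", "6", "12"],
        }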


    MathBook-Pro

    Building on the MathBook Knowledge System, MathBook-Pro introduces a pivotal three-dimensional difficulty modeling framework that systematically characterizes mathematical problem complexity from a model-centric perspective. Each seed problem is positioned within a structured difficulty space defined by three orthogonal axes:

  • Step Complexity – Reasoning depth is quantified by the number of knowledge points involved. More complex variants incorporate additional intermediate conclusions, with the most advanced cases involving at least six knowledge points drawn from the MathBook Knowledge System.
  • Visual Complexity – Additional elements such as auxiliary lines or altered geometric configurations are introduced via GeoGebra, while preserving the original core structure.
  • Contextual Complexity – Concise mathematical statements are rephrased into richer real-world contexts or linguistically abstract scenarios, increasing the semantic and interpretive demands of the problem statement.

    By varying a single dimension at a time and progressively composing transformations across multiple dimensions, each seed problem is expanded into seven progressive difficulty levels (a minimal code sketch of this composition appears after this paragraph). This enables structured, gradual learning for MLLMs and creates a robust foundation for enhancing reasoning performance across varying levels of complexity. Below, we showcase the multi-level difficulty component of MathBook-Pro, illustrating its progressive design across the three complexity dimensions.
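
    The sketch below shows how such compositions could be enumerated for one seed problem. The transformation functions are hypothetical stand-ins that merely tag the problem so the composition order is visible; only the core compositions named in the difficulty trajectory are listed, while the full MathBook-Pro expansion yields seven graded variants.

        # Minimal sketch of composing difficulty transformations for one seed problem.
        # phi_s / phi_v / phi_c are hypothetical stand-ins for the step-, visual-, and
        # contextual-complexity transformations described above.

        def phi_s(p):  # add knowledge points / intermediate conclusions
            return {**p, "dims": p["dims"] + ["step"]}

        def phi_v(p):  # add auxiliary lines or altered configurations (via GeoGebra)
            return {**p, "dims": p["dims"] + ["visual"]}

        def phi_c(p):  # rephrase into a richer real-world or abstract context
            return {**p, "dims": p["dims"] + ["context"]}

        def expand_seed(x0):
            # Enumerate the core compositions named in the difficulty trajectory.
            return [
                x0,                        # x0: seed problem
                phi_s(x0),                 # phi_s(x0): + step complexity
                phi_s(phi_v(x0)),          # phi_s ∘ phi_v (x0): + visual complexity
                phi_s(phi_c(x0)),          # phi_s ∘ phi_c (x0): + contextual complexity
                phi_s(phi_v(phi_c(x0))),   # phi_s ∘ phi_v ∘ phi_c (x0): all three composed
            ]

        seed = {"question": "...", "image": "seed.ggb", "dims": []}
        print([v["dims"] for v in expand_seed(seed)])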


    The module below illustrates the use of MathBook-Pro in the Dynamic Scheduling RL strategy: the left panel presents a Case Demonstration, while the right panel visualizes the difficulty space, where colored points indicate the data currently scheduled for training (see the Methodology section below for details of the training process).

    Methodology

    MathBook-RL

    Cold-Start Fine-tuning. The model is first fine-tuned with supervision on MathBook-Standard (covering all 491 knowledge points), instilling awareness of the knowledge system and guiding knowledge-oriented chain-of-thought reasoning.
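
    As a purely hypothetical illustration of what a knowledge-oriented chain-of-thought training sample could look like (the field names, tags, and example problem below are placeholders, not the released data format):

        import json

        # Hypothetical cold-start SFT sample: each reasoning step is tagged with the
        # knowledge point it relies on, so the model learns to ground its chain of
        # thought in the MathBook Knowledge System. All fields are illustrative.
        sft_sample = {
            "image": "circle_inscribed_angle.png",
            "question": "In the figure, AB is a diameter of the circle and C lies on the circle. Find angle ACB.",
            "chain_of_thought": [
                {"knowledge_point": "Inscribed angle subtending a diameter",
                 "step": "An angle inscribed in a semicircle is a right angle."},
                {"knowledge_point": "Angle measurement",
                 "step": "Therefore angle ACB measures 90 degrees."},
            ],
            "answer": "90°",
        }

        print(json.dumps(sft_sample, ensure_ascii=False, indent=2))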


    Progressive Alignment RL. A curriculum-based RL procedure with two phases:

    1. Pre-aligned RL. Train on \( \mathcal{D}_{\text{ImgVar}} \), where each group contains multiple variants of the same knowledge principle \( \{(q_i, a_i^{(t)}, I_i^{(t)})\}_{t=1}^m \). A mean-based reward is computed over variants sharing the same knowledge principle, encouraging reasoning consistency and robustness grounded in knowledge mastery rather than in individual instances (a minimal reward sketch follows this list).
    2. Dynamic Scheduling RL. Using MathBook-Pro, each base problem \( x_0=(q_0, a_0, I_0) \) follows a progressive trajectory that increases difficulty along knowledge, visual, and contextual dimensions:
      \[ x_0 \;\rightarrow\; \phi_s(x_0) \;\rightarrow\; \phi_s\!\circ\!\phi_v(x_0) \;\rightarrow\; \phi_s\!\circ\!\phi_c(x_0) \;\rightarrow\; \phi_s\!\circ\!\phi_v\!\circ\!\phi_c(x_0) \]
      where \( \phi_s \) adds knowledge points, \( \phi_v \) increases visual complexity, and \( \phi_c \) increases contextual abstraction. If the model fails at \( \phi(x) \) after succeeding at \( x \), we trigger incremental learning via an auxiliary set \( \Delta(x,\phi) \) that isolates the newly introduced factor:
      • Knowledge Increment Scheduling: when failure is due to added knowledge in \( \phi_s \), sample auxiliary problems \( x'_0 \) from MathBook-Standard targeting the new knowledge point(s).
      • Modality Increment Scheduling: when failure stems from added modality complexity (\( \phi_v \) or \( \phi_c \)), guide the model through single-modality incremental problems that isolate the visual or contextual component.
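
    A minimal sketch of the mean-based reward, assuming a simple binary correctness signal per variant (the actual reward design may differ); the function names are illustrative:

        # Minimal sketch: average a per-variant correctness reward over the m variants
        # {(q_i, a_i^(t), I_i^(t))}_{t=1..m} that share one knowledge principle, so the
        # model is rewarded for mastering the principle rather than any single instance.
        # The binary reward below is an assumption, not the paper's exact design.

        def variant_reward(prediction: str, answer: str) -> float:
            return 1.0 if prediction.strip() == answer.strip() else 0.0

        def mean_principle_reward(predictions, answers) -> float:
            rewards = [variant_reward(p, a) for p, a in zip(predictions, answers)]
            return sum(rewards) / len(rewards)

        # Example: the model solves 2 of 3 image variants of the same principle.
        print(mean_principle_reward(["5", "13", "7"], ["5", "13", "10"]))  # -> 0.666...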

    This curriculum with explicit incremental adaptation enables smooth progression across difficulty levels and effective transfer across knowledge, visual, and contextual variations, improving stability and generalization.
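
    A rough sketch of this scheduling logic is shown below; the helper functions and problem representation are hypothetical assumptions standing in for the auxiliary set \( \Delta(x,\phi) \), not released code.

        import random

        # Rough sketch of Dynamic Scheduling RL. `solves`, `sample_knowledge_increment`,
        # and `sample_modality_increment` are hypothetical stubs for the auxiliary set
        # Delta(x, phi); they are not part of any released codebase.

        def solves(model, problem) -> bool:
            # Stub: whether `model` answers `problem` correctly (randomized placeholder).
            return random.random() < 0.7

        def sample_knowledge_increment(problem):
            # Stub: MathBook-Standard items targeting the newly introduced knowledge point(s).
            return [{"source": "MathBook-Standard", "isolates": "step"}]

        def sample_modality_increment(problem, new_dims):
            # Stub: single-modality incremental problems isolating the visual or contextual change.
            return [{"source": "single-modality", "isolates": sorted(new_dims)}]

        def schedule(model, curriculum_pairs):
            # curriculum_pairs: (easier, harder) steps along the difficulty trajectory,
            # e.g. (x0, phi_s(x0)); each problem carries a "dims" list as in the
            # MathBook-Pro sketch earlier on this page.
            batch = []
            for easier, harder in curriculum_pairs:
                batch.append(harder)
                if solves(model, easier) and not solves(model, harder):
                    new_dims = set(harder["dims"]) - set(easier["dims"])
                    if "step" in new_dims:                # Knowledge Increment Scheduling
                        batch.extend(sample_knowledge_increment(harder))
                    if new_dims & {"visual", "context"}:  # Modality Increment Scheduling
                        batch.extend(sample_modality_increment(harder, new_dims))
            return batch

        # Tiny usage example with a single (seed, step-augmented) pair.
        x0, x1 = {"dims": []}, {"dims": ["step"]}
        print(schedule(model=None, curriculum_pairs=[(x0, x1)]))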

    Overview of the MathBook datasets and the corresponding training phases.

    Experiment Results

    Performance comparison across four mathematical reasoning benchmarks. Each benchmark follows its standard evaluation metric: MathVista and MathVision use accuracy, We-Math reports the strict score, and MathVerse is evaluated on its vision-only subset with accuracy.

    Citation

    BibTeX

    @misc{qiao2025wemath20versatilemathbook,
          title={We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning},
          author={Runqi Qiao and Qiuna Tan and Peiqing Yang and Yanzi Wang and Xiaowan Wang and Enhui Wan and Sitong Zhou and Guanting Dong and Yuchen Zeng and Yida Xu and Jie Wang and Chong Sun and Chen Li and Honggang Zhang},
          year={2025},
          eprint={2508.10433},
          archivePrefix={arXiv},
          primaryClass={cs.AI},
          url={https://arxiv.org/abs/2508.10433},
    }

    @article{qiao2024we,
          title={We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?},
          author={Qiao, Runqi and Tan, Qiuna and Dong, Guanting and Wu, Minhui and Sun, Chong and Song, Xiaoshuai and GongQue, Zhuoma and Lei, Shanglin and Wei, Zhe and Zhang, Miaoxuan and others},
          journal={arXiv preprint arXiv:2407.01284},
          year={2024}
    }

    Contact

    For questions, suggestions, or collaborations, please contact qrq@bupt.edu.cn or qiunatan@bupt.edu.cn.