3 Arithmetic Reasoning

We begin by considering math word problems of the form in Figure 1, which measure the arithmetic reasoning ability of language models. Though simple for humans, arithmetic reasoning is a task where language models often struggle (Hendrycks et al., 2021; Patel et al., 2021, inter alia). Strikingly, chain-of-thought prompting used with a 540B-parameter language model performs comparably with task-specific finetuned models on several tasks, even achieving new state of the art on the challenging GSM8K benchmark (Cobbe et al., 2021).
3.1 Experimental Setup
We explore chain-of-thought prompting for various language models on multiple benchmarks.
Benchmarks. We consider the following five math word problem benchmarks: (1) the GSM8K benchmark of math word problems (Cobbe et al., 2021), (2) the SVAMP dataset of math word problems with varying structures (Patel et al., 2021), (3) the ASDiv dataset of diverse math word problems (Miao et al., 2020), (4) the AQuA dataset of algebraic word problems, and (5) the MAWPS benchmark (Koncel-Kedziorski et al., 2016). Example problems are given in Appendix Table 12.
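The prompting technique the excerpt describes can be sketched concretely. The snippet below is a minimal illustration, not the paper's actual harness: it prepends one worked exemplar (the tennis-ball problem shown in the paper's Figure 1) so the model is prompted to emit its reasoning chain before the final answer, and it parses the answer from the "The answer is" marker. The function names `build_prompt` and `extract_answer` are hypothetical.

```python
import re

# One few-shot exemplar with an explicit reasoning chain, as in Figure 1
# of the chain-of-thought paper. A real prompt would use several exemplars.
EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "6 tennis balls. 5 + 6 = 11. The answer is 11.\n"
)

def build_prompt(question: str) -> str:
    """Prepend the worked exemplar so the model imitates its reasoning chain."""
    return f"{EXEMPLAR}\nQ: {question}\nA:"

def extract_answer(completion: str) -> str:
    """Pull the final numeric answer that follows the 'The answer is' marker."""
    match = re.search(r"The answer is\s+(-?[\d.,]+)", completion)
    return match.group(1).rstrip(".") if match else ""
```

Evaluation on benchmarks such as GSM8K then reduces to comparing `extract_answer` on the model completion against the reference answer.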
Operating System Type
Linux
Operating System Version
Ubuntu 22.04
Python version
3.12
Software version (magic-pdf --version)
1.3.x
Device mode
cuda
🔎 Search before asking
Description of the bug
How to reproduce the bug