Releases: breezedeus/Pix2Text

feat: add new models MFD-1.5 & MFR-1.5

24 Jul 16:02
b70f717

Update 2025.07.25: V1.1.4 Released

Major Changes:

  • Upgraded the Mathematical Formula Detection (MFD) and Mathematical Formula Recognition (MFR) models to version 1.5. All default configurations, documentation, and examples now use mfd-1.5 and mfr-1.5 as the standard models.

feat: enhance image processing functions and add comprehensive tests for transparency handling

07 May 15:22
1c1aeca

Update 2025.05.06: V1.1.3.2 Released

Major Changes:

  • Fixed a potential error when processing transparent images, see #171 for details.
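
One common way to avoid errors with transparent images is to composite them onto an opaque background before recognition. The sketch below illustrates that idea on raw RGBA pixel tuples; the function name `flatten_rgba` and the white default background are assumptions made for illustration, not Pix2Text's actual fix (see #171 for that).

```python
def flatten_rgba(pixels, background=(255, 255, 255)):
    """Composite RGBA pixels over an opaque background and return RGB pixels.

    Hypothetical helper for illustration; not part of the Pix2Text API.
    """
    flattened = []
    for r, g, b, a in pixels:
        alpha = a / 255.0  # 0.0 = fully transparent, 1.0 = fully opaque
        flattened.append(tuple(
            round(channel * alpha + bg * (1.0 - alpha))
            for channel, bg in zip((r, g, b), background)
        ))
    return flattened


# A fully transparent pixel takes the background colour;
# a fully opaque pixel is left unchanged.
rgb_pixels = flatten_rgba([(0, 0, 0, 0), (10, 20, 30, 255)])
```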

fix: update version to 1.1.3.1 for VLM model import fix

27 Apr 11:43
c878756

Update 2025.04.27: V1.1.3.1 Released

Major Changes:

  • Bugfix: Fixed a VLM-related model import issue.

feat: add VLM support for text and table recognition

15 Apr 15:35
e40c153

Update 2025.04.15: V1.1.3 Released

Major Changes:

Bugfix: Fixed issues related to downloading models on Windows

17 Dec 14:32
c87047b

Update 2024.12.17: V1.1.2.3 Released

Major Changes:

  • Bugfix: Fixed issues related to downloading models on Windows.

bugfix

11 Dec 13:40
416ff1e

Update 2024.12.11: V1.1.2.2 Released

Major Changes:

  • Bugfix: Resolved serialization errors that occurred when handling ONNX Runtime session options, by ensuring that non-serializable configurations are managed appropriately.
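
As a rough illustration of this kind of fix, the sketch below separates configuration entries that can be serialized from those that cannot, so that only the former are written out. The helper `split_serializable`, the stand-in class `FakeSessionOptions`, and the config keys are all hypothetical; this is not the code used in Pix2Text.

```python
import json


def split_serializable(config):
    """Split a config dict into JSON-serializable entries and the rest.

    Hypothetical helper for illustration; not part of the Pix2Text API.
    """
    serializable, runtime_only = {}, {}
    for key, value in config.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            runtime_only[key] = value  # keep in memory only
        else:
            serializable[key] = value  # safe to save or log
    return serializable, runtime_only


class FakeSessionOptions:
    """Stand-in for an object that cannot be serialized."""


# The keys below are assumptions chosen for the example.
config = {"model_name": "mfd-1.5", "sess_options": FakeSessionOptions()}
saved, kept_in_memory = split_serializable(config)
```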

bugfix

02 Dec 15:32
acbac46

Update 2024.12.02: V1.1.2.1 Released

Major Changes:

  • Fixed an error in DocYoloLayoutParser.fetch_column_info(), thanks to Bin.

Integrated a better layout analysis model DocLayout-YOLO

16 Nov 15:05
d393a21

Update 2024.11.17: V1.1.2 Released

Major Changes:

  • A new layout analysis model DocLayout-YOLO has been integrated, improving the accuracy of layout analysis.
  • Bug fixes:
    • When the text language is set to English only, a dedicated English OCR model is used to avoid including Chinese in the output.
    • The processing logic for PNG images has been optimized, improving recognition quality.

Bugfixes

18 Jul 05:36
4438d9b

Update 2024.07.18: V1.1.1.2 Released

Major Changes:

Fix: some formats of models require fixed-size input images

24 Jun 15:09
c4271c7

Update 2024.06.24: V1.1.1.1 Released

Major Changes:

  • Added a new parameter static_resized_shape to the MathFormulaDetector initializer; it resizes the input image to a fixed size. Some model formats, such as CoreML, require fixed-size input images during inference.
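
When an image is resized to a fixed shape before detection, any boxes found on the resized image have to be scaled back to the original coordinates. The sketch below shows that arithmetic, assuming a plain stretch (no padding) and a (height, width) ordering for the fixed shape; the helper `scale_boxes_to_original` is hypothetical and does not reflect how MathFormulaDetector is implemented.

```python
def scale_boxes_to_original(boxes, original_size, resized_shape):
    """Map (x1, y1, x2, y2) boxes from a resized image back to the original.

    Hypothetical helper for illustration; not part of the Pix2Text API.
    original_size is (width, height); resized_shape is (height, width).
    """
    orig_w, orig_h = original_size
    resized_h, resized_w = resized_shape
    scale_x = orig_w / resized_w
    scale_y = orig_h / resized_h
    return [
        (x1 * scale_x, y1 * scale_y, x2 * scale_x, y2 * scale_y)
        for x1, y1, x2, y2 in boxes
    ]


# A box covering the top-left quarter of a 768x768 input, mapped back
# onto a 1536x1152 original image.
original_boxes = scale_boxes_to_original(
    [(0, 0, 384, 384)], original_size=(1536, 1152), resized_shape=(768, 768)
)
```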
