Releases: PaddlePaddle/PaddleX
v3.3.5
2025.10.23 v3.3.5 released
- Fixed the issue with weight data type mapping, supporting GPUs with compute capability between 7 and 8.
- Resolved the problem of model configuration parsing failure when the model configuration includes `quantization_config`.
- Fixed the issue where inference errors occurred when using paths containing Chinese characters in directories on Windows.
- Resolved the problem of being unable to use PaddleOCR-VL models hosted on the AI Studio platform.
- Added support for passing the `max_new_tokens` parameter during PaddleOCR-VL model inference.
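
The release note does not describe the weight data type mapping itself, so the following is only an illustrative sketch of one common reason such a mapping is needed: native bfloat16 support arrived with compute capability 8.0, so weights stored as bfloat16 usually have to be mapped to another type on 7.x devices. The helper name `pick_weight_dtype` and the float16 fallback are assumptions, not PaddleX's actual implementation.

```python
def pick_weight_dtype(stored_dtype: str, compute_capability: tuple) -> str:
    """Hypothetical helper: choose a runtime dtype for model weights."""
    # Assumption: bfloat16 is only used natively on compute capability >= 8.0,
    # so GPUs in the 7.x range fall back to float16.
    if stored_dtype == "bfloat16" and compute_capability < (8, 0):
        return "float16"
    return stored_dtype


print(pick_weight_dtype("bfloat16", (7, 5)))  # a 7.x GPU
print(pick_weight_dtype("bfloat16", (8, 6)))  # an 8.x GPU
```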
Full Changelog: v3.3.4...v3.3.5
v3.3.1
2025.10.16 v3.3.1 released
- Fixed issues such as the missing `concatenate_markdown_pages` method in the PaddleOCR-VL production pipeline.
Full Changelog: v3.3.0...v3.3.1
v3.3.0
2025.10.16 v3.3.0 released
- Added support for inference and deployment of PaddleOCR-VL and PP-OCRv5 multilingual models.
Full Changelog: v3.2.1...v3.3.0
v3.2.1
2025.8.29 v3.2.1 released
- Bug Fixes:
  - Fixed a potential array overflow issue in the formula recognition module of PP-StructureV3 during pipeline inference.
  - Optimized the model download logic: when the model weight file already exists locally, the system will no longer re-download it, ensuring usability in offline environments.
  - Updated the high-stability serving image dependencies to be compatible with PaddleX 3.2.0; fixed path errors in the upload and cleanup scripts for the high-stability serving image, as well as typos in the documentation.
  - Fixed issues of invalid escape sequences in the formula recognition model post-processing stage in Python 3.12, as well as the potential Regular Expression Denial of Service (ReDoS) vulnerability.
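
As background for the last fix: Python 3.12 reports invalid escape sequences in ordinary string literals (for example `"\d"`) as a `SyntaxWarning`, and the usual remedy for regular expressions is a raw string. The sketch below is a generic illustration, not the PaddleX post-processing code; the helper `compiles_without_warning` is made up for the example.

```python
import re
import warnings


def compiles_without_warning(source: str) -> bool:
    """Return True if `source` compiles without raising any warning."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")  # turn warnings into exceptions
        try:
            compile(source, "<example>", "exec")
        except (Warning, SyntaxError):
            return False
    return True


# "\d" in a normal string literal is an invalid escape sequence.
bad_source = r'pattern = "\d+"'
# The same pattern as a raw string literal is fine.
good_source = r'pattern = r"\d+"'

print(compiles_without_warning(bad_source))
print(compiles_without_warning(good_source))

# On ReDoS: nested quantifiers such as (\d+)+ can backtrack heavily on
# non-matching input; a single quantifier accepts the same strings.
DIGITS = re.compile(r"\d+")
print(DIGITS.fullmatch("2025") is not None)
```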
Full Changelog: v3.2.0...v3.2.1
v3.2.0
2025.8.20 v3.2.0 released
- Deployment Capability Upgrades:
  - Fully supports PaddlePaddle framework versions 3.1.0 and 3.1.1.
  - High-performance inference supports CUDA 12, with backend options including Paddle Inference and ONNX Runtime.
  - High-stability serving solution is fully open-sourced, enabling users to customize Docker images and SDKs as needed.
  - High-stability serving solution supports invocation via manually constructed HTTP requests, allowing client applications to be developed in any programming language.
- Key Model Additions:
  - Added training, inference, and deployment support for PP-OCRv5 English, Thai, and Greek recognition models. The PP-OCRv5 English model delivers an 11% improvement over the main PP-OCRv5 model in English scenarios, with the Thai model achieving an accuracy of 82.68% and the Greek model 89.28%.
- Benchmark Enhancements:
  - All pipelines support fine-grained benchmarking, enabling the measurement of end-to-end inference time as well as per-layer and per-module latency data to assist with performance analysis.
  - Added key metrics such as inference latency and memory usage for commonly used configurations on mainstream hardware to the documentation, providing deployment reference for users.
- Bug Fixes:
  - Fixed an issue where invalid input image file formats could cause recursive calls.
  - Resolved ineffective parameter settings for chart recognition, seal recognition, and document pre-processing in the configuration files for the PP-DocTranslation and PP-StructureV3 pipelines.
  - Fixed an issue where PDF files were not properly closed after inference.
- Other Updates:
  - Added support for Windows users with NVIDIA 50-series graphics cards; users can install the corresponding PaddlePaddle framework version as per the installation guide.
  - The PP-OCR model series now supports returning coordinates for individual characters.
  - The `model_name` parameter in `PaddlePredictorOption` has been moved to `PaddleInfer`, improving usability.
  - Refactored the official model download logic, with new support for multiple model hosting platforms such as AIStudio and ModelScope.
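
To illustrate what "manually constructed HTTP requests" can look like from a client, here is a minimal sketch that builds (but does not send) a JSON POST request using only the Python standard library. The `/ocr` path, the `file` and `fileType` fields, and the host and port are assumptions made for illustration; check the serving documentation for the actual endpoint and schema of each pipeline.

```python
import base64
import json
import urllib.request


def build_ocr_request(image_bytes: bytes, base_url: str) -> urllib.request.Request:
    """Build a JSON POST request for a hypothetical OCR serving endpoint."""
    payload = {
        # Assumed schema: Base64-encoded file content plus a file-type flag.
        "file": base64.b64encode(image_bytes).decode("ascii"),
        "fileType": 1,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ocr",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_ocr_request(b"fake image bytes", "http://localhost:8080")
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req), given a running server.
```

Because the request is plain HTTP with a JSON body, the same call can be reproduced from any language that has an HTTP client.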
Full Changelog: v3.1.4...v3.2.0
v3.1.4
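v3.1.4 fixes and optimizes several issues:
- Fixed distributed training issues and added distributed training documentation.
- Fixed an error in batched prediction in the table recognition v2 pipeline when the layout region detection model is not used.
- Improved some documentation.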
Full Changelog: v3.1.3...v3.1.4
v3.1.3
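v3.1.3 fixes and optimizes several issues:
- Bug Fixes:
  - Fixed a recently introduced inference visualization display issue for English models of PP-OCRv4 and earlier versions.
  - Fixed the incorrect dictionary path for the eslav_PP-OCRv5_mobile_rec model.
- Documentation Improvements:
  - Added the official image for PaddleX 3.1.2 to the installation documentation.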
Full Changelog: v3.1.2...v3.1.3
v3.1.2
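v3.1.2 fixes and optimizes several issues:
- Bug Fixes:
  - Changed the default number of CPU inference threads to 10, aligning with PaddleOCR.
  - Fixed a recursive-call error during inference caused by passing an image file path with an invalid file extension.
- Feature Improvements:
  - The PP-DocTranslation pipeline now accepts a user-supplied glossary, making the translation of technical terms more accurate.
  - The default download source for the multilingual text recognition models added in version 3.1 is now Hugging Face.
- Documentation Improvements:
  - Fixed incorrect parameter names in the PP-DocTranslation serving documentation.
  - Added an explanation of how to call the high-stability serving solution via manually constructed HTTP requests.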
v3.1.1
v3.1.1 fixes an issue that could be triggered in special scenarios when using local font files.
v3.1.0
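v3.1.0 adds the PP-OCRv5 multilingual text recognition models and a document translation pipeline, and improves the PP-Chart2Table model in PP-StructureV3:
- Key Models:
  - Added the PP-OCRv5 multilingual text recognition models, with training and inference workflows for text recognition models covering 37 languages, including French, Spanish, Portuguese, Russian, and Korean. Average accuracy improves by more than 30%.
  - Upgraded the PP-Chart2Table model in PP-StructureV3, further improving chart-to-table conversion. On an internal evaluation set, the RMS-F1 metric rises by 9.36 percentage points (71.24% -> 80.60%).
- Key Pipelines:
  - Added PP-DocTranslation, a document translation pipeline based on PP-StructureV3 and ERNIE 4.5 Turbo. It supports translating Markdown documents, PDF documents with complex layouts, and document images, and saves the results as Markdown documents.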
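
The PP-Chart2Table figure above is stated in percentage points, which is an absolute difference and not a relative change. A small arithmetic sketch using only the two RMS-F1 values quoted in the note makes the distinction explicit (the variable names are illustrative):

```python
before, after = 71.24, 80.60  # RMS-F1 in percent, as quoted above

# Absolute gain, in percentage points.
point_gain = round(after - before, 2)

# Relative gain, as a percentage of the starting value.
relative_gain = round((after - before) / before * 100, 2)

print(point_gain, relative_gain)
```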