Replace ujson by orjson #8655
dspy/predict/predict.py (outdated)
@@ -42,7 +42,10 @@ def dump_state(self):
# FIXME: Saving BaseModels as strings in examples doesn't matter because you never re-access as an object
demo[field] = serialize_object(demo[field])

state["demos"].append(demo)
if isinstance(demo, dict):
This is necessary because `orjson` doesn't automatically serialize dict-like instances.
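To illustrate the comment above, here is a minimal sketch of how a dict-like object that is not a `dict` subclass can be handed to `orjson.dumps()` through its `default` hook. The `DemoRecord` class and the `to_plain` helper are hypothetical names made up for this example and are not taken from the PR; the `json` fallback exists only so the sketch still runs where `orjson` is not installed.

```python
from collections.abc import Mapping

try:
    import orjson

    def dumps(obj, default=None):
        return orjson.dumps(obj, default=default)
except ImportError:  # fallback so the sketch runs without orjson installed
    import json

    def dumps(obj, default=None):
        return json.dumps(obj, default=default, separators=(",", ":")).encode("utf-8")


class DemoRecord(Mapping):
    """Hypothetical dict-like class (a Mapping, but not a dict subclass)."""

    def __init__(self, **fields):
        self._store = dict(fields)

    def __getitem__(self, key):
        return self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


def to_plain(obj):
    """`default` hook: convert dict-like objects into plain dicts."""
    if isinstance(obj, Mapping):
        return dict(obj)
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


record = DemoRecord(question="2+2?", answer="4")

# Without a hook, the serializer rejects the dict-like instance.
try:
    dumps(record)
    serialized_without_hook = True
except TypeError:
    serialized_without_hook = False

# With the hook, the instance is converted on the fly.
payload = dumps({"demos": [record]}, default=to_plain)
```

Converting the object to a plain `dict` before calling the serializer, as the `isinstance(demo, dict)` check in the diff suggests, is an alternative to passing a `default` hook at every call site.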
Pull Request Overview
This PR replaces the `ujson` library with `orjson` throughout the DSPy codebase. The change is motivated by performance improvements and better compatibility, particularly for JSON serialization and deserialization operations in model saving/loading workflows.
Key changes made:
- Updated dependency from `ujson>=5.8.0` to `orjson>=3.9.0` in pyproject.toml
- Replaced all `ujson` import statements with `orjson` across multiple modules
- Adapted JSON serialization calls to use `orjson.dumps()` with appropriate encoding/decoding
- Enhanced the `Example.toDict()` method to handle nested serializable objects recursively
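The last item above, recursive handling of nested serializable objects, can be sketched roughly as follows. This is not the PR's actual `Example.toDict()` implementation: the `Record` class and the `to_jsonable` helper are hypothetical stand-ins, and the standard-library `json` module is used only to confirm that the result is plain JSON-compatible data.

```python
import json


def to_jsonable(value):
    """Recursively convert nested objects into plain JSON-compatible data."""
    if hasattr(value, "toDict"):
        return value.toDict()
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value


class Record:
    """Hypothetical stand-in for an object exposing a toDict() method."""

    def __init__(self, **fields):
        self._fields = fields

    def toDict(self):
        # Convert every field, so nested Record objects are flattened too.
        return {key: to_jsonable(item) for key, item in self._fields.items()}


nested = Record(
    question="q",
    context=[Record(title="t"), {"inner": Record(score=1)}],
)
plain = nested.toDict()

# The result contains only dicts, lists, and scalars, so any JSON
# serializer can handle it without custom hooks.
encoded = json.dumps(plain)
```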
Reviewed Changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
pyproject.toml | Updated dependency specification from ujson to orjson |
dspy/utils/saving.py | Replaced ujson with orjson for metadata loading |
dspy/primitives/base_module.py | Updated JSON operations and added json_mode parameter to dump_state |
dspy/primitives/example.py | Enhanced toDict method with recursive serialization support |
dspy/predict/predict.py | Added json_mode parameter and updated demo serialization logic |
dspy/streaming/streamify.py | Updated streaming response JSON serialization |
dspy/clients/cache.py | Updated cache key generation to use orjson |
dspy/clients/utils_finetune.py | Changed file operations to binary mode for orjson compatibility |
dspy/clients/databricks.py | Updated data serialization for Databricks integration |
dspy/teleprompt/simba_utils.py | Updated JSON operations and error handling |
dspy/predict/refine.py | Updated JSON serialization in advice generation |
tests/primitives/test_base_module.py | Added nested example test case |
tests/predict/test_predict.py | Updated test assertions to use orjson |
  with open(path, encoding="utf-8") as f:
-     state = ujson.loads(f.read())
+     state = orjson.loads(f.read().encode("utf-8"))
Reading the entire file and then encoding it is inefficient. Since `orjson.loads` can work with bytes directly, consider opening the file in binary mode (`'rb'`) and calling `orjson.loads(f.read())` directly.
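A minimal sketch of the suggestion above, comparing the two ways of loading the same file. The file name and its contents are made up for illustration; the `json` fallback is there only so the sketch runs where `orjson` is not installed (`json.loads` also accepts bytes).

```python
import os
import tempfile

try:
    import orjson

    loads = orjson.loads
except ImportError:  # fallback so the sketch runs without orjson installed
    import json

    loads = json.loads

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "state.json")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write('{"demos": [], "lm": null, "note": "café"}')

    # Text mode: the file is decoded to str, then re-encoded to bytes.
    with open(path, encoding="utf-8") as f:
        via_text = loads(f.read().encode("utf-8"))

    # Binary mode: the raw bytes go straight to the parser.
    with open(path, "rb") as f:
        via_bytes = loads(f.read())
```

Both paths produce the same object; the binary-mode version simply skips the decode and re-encode round trip.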
This continues the work in #8584, taken over due to the inactivity of the original contributor.
Verified that JSON saved with the old `ujson` path is still loadable by the `orjson` code. Specifically, I optimized a `dspy.ReAct` and saved the state as:
Then ran the code to reload it:
The code above works well. I then ran `react.save()` again with the `orjson` path and verified that the output is almost the same as with the old path, with one minor diff: