Supporting RISC-V for PyTorch on CPU #77
Thanks for opening this!
This is indeed very interesting and something we would like to support (within the limits of our time and funding).
My general stance here is: we are ready to spend time to unblock you in getting this running, and to make sure we don't prevent any backend from growing. But, given limited bandwidth and a multitude of backends, we focus our own enablement and maintenance work on the ones most popular with our end users.
I have a few clarifying questions for you here:
- I see there are quite a few prerequisites in the pypa ecosystem and dependencies. Any estimate on the timelines there?
- For the RISE runners, any pointers on the process to access them? Any agreement we have to accept to use them?
- What is the RISE stance on general maintenance of these projects (keep compilation working, keep workers working, keep performance good, etc.)?
By the way, there is already an attempt to make sure that PyTorch is at least compilable for RISC-V, but it got stuck on the technicalities of setting up qemu: pytorch/pytorch#143979
@albanD I and other members of RISE do want to step up and provide the time and funding to support this effort. We fully realize that it's not something that is "just" going to happen and that it's going to take effort, especially on the machine infrastructure side of things.
How can I and others step up to help with that? We did explore having RISE join the PyTorch Foundation, but several PyTorch members indicated that it was not a requirement. We are of course still open to doing it if it is a blocker. On the technical side, who would be the right person to engage with to figure out how to provide the necessary machines and make the required code changes in the relevant places? I am currently assuming that RFCs are the process to document that? I'm also happy to join any meetings or calls where I could answer questions live.
PyPI just recently added official support for riscv64 with pypi/warehouse#18390. manylinux was made available for riscv64 at https://quay.io/repository/pypa/manylinux_2_39_riscv64. The last missing piece is cibuildwheel, but it should be a matter of days or weeks before it's made available. The next challenge will be the dependencies of PyTorch (numpy, MarkupSafe, pillow). We already have packages available at RISE (numpy, MarkupSafe and pillow), but of course we want the upstream projects to provide support for riscv64 themselves. We are working, or planning to work, with them on adding this support. I expect it to be a matter of weeks or months.
I'm the point of contact for that and I'm very happy to provide however many machines you need. There is a capacity planning exercise we'll need to go through to better understand how many machines that will be. But just to get started, I can easily provision a dozen machines or so.
We are actively investing in projects like PyTorch. We are here to stick around and contribute to the success of these projects. Our focus is of course on RISC-V, but we want RISC-V to be a first-class citizen and will do all the necessary work and maintenance to make it a reality. We would help keep the compilation going, make it easier for more people to build and test on RISC-V in general, help keep the workers going, and keep improving performance.
@malfet that is definitely the necessary first step to getting PyTorch available on RISC-V. That PR is being pushed by one of the members of RISE, and its status and blockers are a recurring talking point among RISE members. The push for this RFC is really to "officialize" RISC-V's status, and to explicitly document all the other steps we'll want to take to make RISC-V an officially supported platform. Of course, I would love to have your input on all of that; I've reached out offline as well.
And cibuildwheel actually merged support for riscv64 a few hours ago: pypa/cibuildwheel#2506
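For readers following the packaging side: a wheel declares its target platform in the last dash-separated field of its filename, so a riscv64 build is recognizable by a platform tag such as the `manylinux_2_39_riscv64` one mentioned above. The snippet below is only a minimal sketch of that check, using the standard library. The helper names and the example filenames are hypothetical and not taken from any project in this thread, and the assumption that `platform.machine()` reports `riscv64` on a RISC-V Linux host should be verified on real hardware.

```python
import platform


def wheel_platform_tags(filename: str) -> list[str]:
    """Return the platform tags encoded in a wheel filename.

    Wheel filenames follow the pattern
    name-version[-build]-pythontag-abitag-platformtag.whl,
    and the platform field may hold several tags joined by dots.
    """
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    # Five fields without a build tag, six with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    return parts[-1].split(".")


def targets_riscv64(filename: str) -> bool:
    """True if any platform tag of the wheel ends in '_riscv64'."""
    return any(tag.endswith("_riscv64") for tag in wheel_platform_tags(filename))


def host_is_riscv64() -> bool:
    """Assumption: a 64-bit RISC-V Linux host reports 'riscv64' here."""
    return platform.machine() == "riscv64"


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    for name in (
        "example_pkg-1.0.0-cp312-cp312-manylinux_2_39_riscv64.whl",
        "example_pkg-1.0.0-cp312-cp312-manylinux_2_17_x86_64.whl",
    ):
        print(name, targets_riscv64(name))
    print("host is riscv64:", host_is_riscv64())
```

A check like this only inspects the filename. It says nothing about whether the wheel actually installs or runs on a given board.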
@malfet Hi, the cross-compilation verification for RISC-V in pytorch/pytorch#143979 now passes. Do you have time to take a look?
Summary
We propose adding RISC-V as a supported PyTorch CPU platform.
The Problem
RISC-V is a fast-growing architecture, and many industry players are using it to target AI/ML workloads. However, PyTorch is currently not available on RISC-V. As a result, many hardware vendors build and distribute their own versions of PyTorch, which fragments the ecosystem.
The Solution
We propose adding RISC-V as an officially supported platform for PyTorch. The main question is hardware availability, which the RISE project wants to provide in support of PyTorch.
Why This Matters
Adding RISC-V support to PyTorch allows users of this architecture to run PyTorch directly, without relying on unofficial builds. This reduces fragmentation and makes it easier for developers and researchers to use PyTorch on RISC-V systems. Official support also helps ensure better reliability and consistency across platforms.