feat: Add pixi environment #534

matthewfeickert · 2025-02-17T00:55:07Z

Use a pixi environment to provide a unified install setup for users as well as a testing environment for CI.
- c.f. The 2025-02-16 Bluesky discussion on this for motivation

Note

I'm opening this in draft mode now just to get things going, but I'll need to iterate with @rabst on design (pixi.toml vs. pyproject.toml) and get all the other parts of the CI moved over later.

* Add pixi manifest pixi.toml for Linux x86, macOS arm64, Windows 64.

* Enable workflow dispatch. * Add concurrency limits. * Use pixi for CI workflow. * Unify to a single workflow for all OS tested

rasbt · 2025-02-17T01:53:38Z

Oh wow, thanks!

One question, do you actually need the .gitattributes and pixi.lock files here or are they optional? I am asking because I would try as hard as possible to reduce the number of files in the main folder (because I do sometimes get emails from people asking what these files are for.)

matthewfeickert · 2025-02-17T02:03:28Z

One question, do you actually need the .gitattributes and pixi.lock files here or are they optional? I am asking because I would try as hard as possible to reduce the number of files in the main folder (because I do sometimes get emails from people asking what these files are for.)

@rasbt This is personal choice. If you view this repo as yours for actually developing the llms-from-scratch library then I think I would remove the pixi.lock and just have people solve on install with pixi (still quite fast). If you view this repo as an educational resource (e.g. an application) for people to get up and running then I would for sure keep the lock file.

In that case I'd also keep the autogenerated .gitattributes for the reason it mentions:

$ cat .gitattributes 
# SCM syntax highlighting & preventing 3-way merges
pixi.lock merge=binary linguist-language=YAML linguist-generated=true

Though seems that I'll have to debug (later, sorry) why Windows doesn't pick up the existence of tensorflow-cpu

$ pixi run pytest --ruff setup\02_installing-python-libraries\tests.py
...
============================= test session starts =============================
platform win32 -- Python 3.10.16, pytest-8.3.4, pluggy-1.5.0
rootdir: D:\a\LLMs-from-scratch\LLMs-from-scratch
configfile: pyproject.toml
plugins: anyio-4.8.0, nbval-0.11.0, ruff-0.4.1
collected 2 items
setup\02_installing-python-libraries\tests.py .F                         [100%]
================================== FAILURES ===================================
__________________________________ test_main __________________________________
capsys = <_pytest.capture.CaptureFixture object at 0x000001A4037B5420>
    def test_main(capsys):
        main()
        captured = capsys.readouterr()
>       assert "FAIL" not in captured.out
E       AssertionError: assert 'FAIL' not in '[OK] torch ...util 6.1.1\n'
E         
E         'FAIL' is contained here:
E           b 3.10.0
E           [FAIL] tensorflow-cpu 0.0, please install a version matching >=2.18.0
E         ?  ++++
E           [OK] tqdm 4.67.1
E           [OK] numpy 2.0.2
E           [OK] pandas 2.2.3
E           [OK] psutil 6.1.1
setup\02_installing-python-libraries\tests.py:14: AssertionError
=========================== short test summary info ===========================
FAILED setup/02_installing-python-libraries/tests.py::test_main - AssertionError: assert 'FAIL' not in '[OK] torch ...util 6.1.1\n'
  
  'FAIL' is contained here:
    b 3.10.0
    [FAIL] tensorflow-cpu 0.0, please install a version matching >=2.18.0
  ?  ++++
    [OK] tqdm 4.67.1
    [OK] numpy 2.0.2
    [OK] pandas 2.2.3
    [OK] psutil 6.1.1
========================= 1 failed, 1 passed in 5.74s =========================

as it is installed

$ pixi list --environment tests
Environment: tests
Package                        Version                Build                       Size       Kind   Source
...
tensorflow_cpu                 2.18.0                                             17.4 KiB   pypi   tensorflow_cpu-2.18.0-cp310-cp310-win_amd64.whl
...

rasbt · 2025-02-17T02:11:51Z

Thanks!

And regarding

Though seems that I'll have to debug (later, sorry) why Windows doesn't pick up the existence of tensorflow-cpu

I think it was just a spelling issue that the package is called tensorflow-cpu but is imported as tensorflow_cpu.

matthewfeickert · 2025-02-17T02:14:11Z

I think it was just a spelling issue that the package is called tensorflow-cpu but is imported as tensorflow_cpu.

Yeah, that's what it looks like from my side too. I can put a sed equivalent form in Powershell in there to fix it.

edit: Ah, your fix is much nicer. :)

matthewfeickert · 2025-02-17T02:18:09Z

pixi.toml

+
+[dependencies]
+python = "3.10.*"
+pytorch-cpu = ">=2.6.0,<3"


Note: I just put in pytorch-cpu for quick debugging here, but obviously pytorch-gpu is the desired target. pytorch-cpu can be used in a tests-ci environment that would be designed for GitHub Actions where you aren't going to have GPUs.

I see. Can we just put torch so it handles it based on the OS similar to what pip does with the wheels?

@rasbt No, as conda/mamba/pixi don't define things by OS, they define things be environment. If pytorch is asked for and there's no CUDA support already defined in the existing environment or environment definition file the pytorch will resolve to pytorch-cpu. If there's something like cuda-version defined as additional dependencies or a cuda version in a pixi system-requirements table then the solver will resolve pytorch to CUDA enabled pytorch. And if you explicitly ask for pytorch-gpu it will get the additional CUDA dependencies it needs to ensure that the version of PyTorch installed will be GPU compatible.

Small example:

# mkdir pytorch-check # cd pytorch-check/ # pixi init ✔ Created /tmp/pytorch-check/pixi.toml # pixi add pytorch ✔ Added pytorch >=2.6.0,<3 # pixi list pytorch Package Version Build Size Kind Source pytorch 2.6.0 cpu_mkl_py313_he6a733d_100 26.9 MiB conda pytorch # rm pixi.toml pixi.lock # pixi init # pixi add pytorch-gpu # pixi list pytorch Package Version Build Size Kind Source pytorch 1.11.0 cuda111py310h385535d_1 628.1 MiB conda pytorch pytorch-gpu 1.11.0 cuda111py310h3d52e46_1 9.5 KiB conda pytorch-gpu # pixi list cuda Package Version Build Size Kind Source cuda-version 11.1 hdbd7af8_3 20.5 KiB conda cuda-version cudatoolkit 11.1.1 hb139c0e_13 929.6 MiB conda cudatoolkit # rm pixi.toml pixi.lock # pixi init # echo -e '\n[system-requirements]\ncuda = "12"' >> pixi.toml # pixi add pytorch # cat pixi.toml [project] authors = ["Matthew Feickert <[email protected]>"] channels = ["conda-forge"] name = "pytorch-check" platforms = ["linux-64"] version = "0.1.0" [tasks] [dependencies] pytorch = ">=2.6.0,<3" [system-requirements] cuda = "12" # pixi list pytorch Package Version Build Size Kind Source pytorch 2.6.0 cuda126_mkl_py313_h7373160_300 27.4 MiB conda pytorch # pixi list cuda Package Version Build Size Kind Source cuda-crt-tools 12.8.61 ha770c72_1 26.6 KiB conda cuda-crt-tools cuda-cudart 12.8.57 h5888daf_1 22.2 KiB conda cuda-cudart cuda-cudart_linux-64 12.8.57 h3f2d84a_1 188.4 KiB conda cuda-cudart_linux-64 cuda-cuobjdump 12.8.55 hbd13f7d_0 227.1 KiB conda cuda-cuobjdump cuda-cupti 12.8.57 hbd13f7d_0 1.8 MiB conda cuda-cupti cuda-nvcc-tools 12.8.61 he02047a_1 24.5 MiB conda cuda-nvcc-tools cuda-nvdisasm 12.8.55 hbd13f7d_0 4.9 MiB conda cuda-nvdisasm cuda-nvrtc 12.8.61 hbd13f7d_0 63.1 MiB conda cuda-nvrtc cuda-nvtx 12.8.55 hbd13f7d_0 31 KiB conda cuda-nvtx cuda-nvvm-tools 12.8.61 he02047a_1 23.5 MiB conda cuda-nvvm-tools cuda-version 12.8 h5d125a7_3 20.6 KiB conda cuda-version

With conda/mamba/pixi it is better to be more explicit about the environment you want, as they have the ability to actually resolve it correctly.

.github/workflows/basic-tests-uv.yml

matthewfeickert · 2025-02-17T16:36:51Z

.github/workflows/basic-tests-uv.yml

+          uv python install 3.10
+          uv add . --dev
+          uv pip install -r ch05/07_gpt_to_llama/tests/test-requirements-extra.txt
+          uv add pytest-ruff nbval


Is it important to use uv add here, or can you just do

uv pip install --system . uv pip install --system -r ch05/07_gpt_to_llama/tests/test-requirements-extra.txt uv pip install --system pytest-ruff nbval

and then skip manually activating the virtual environment for each step?

Given that you do this for pip as well I assume there's a reason(?).

Ah, this is for testing purposes, too.

In addition, uv add . --dev installs from the pyproject.toml and should be faster. I think uv pip install --system . may be slower as it uses the pip algo.

The reason why I am doing

uv pip install -r ch05/07_gpt_to_llama/tests/test-requirements-extra.txt

though is that this is bonus material 95% of the users don't care about, and so there is no pyproject.toml for this one.

Hm, I don't think it is any faster. If anything it should be slower as it needs to open and edit a file (in the case of uv add pytest-ruff nbval) and then run uv pip install <whatever your uv add target was>. AFAIK, uv doesn't have multiple installer algorithms, it just has uv pip install and then higher level APIs on top of that. Would be interested if I'm wrong though.

that's why I thought but someone told me yesterday that uv add is supposed to be faster as it doesn't try to mimic pip legacy behavior. Should check some official docs...

Hm. The docs (https://docs.astral.sh/uv/reference/cli/#uv-add) don't mention that, but maybe @charliermarsh can comment on if there's interesting under the hood behavior.

@matthewfeickert or maybe I misinterpreted the "Yes" here

rasbt · 2025-02-17T17:33:46Z

I think it looks good now! Thanks again for all your help here setting this up. Learned a lot!

matthewfeickert · 2025-02-17T20:20:16Z

I think it looks good now! Thanks again for all your help here setting this up. Learned a lot!

Sure thing and hope it helps! Thanks also for creating the amazing educational resources that you do, in addition to all the tools you build, as well as the great educational posts and blogs!

I might make some follow up PRs during breaks from my job talk 😛 that can streamline the process a bit more. Also the pyproject.toml is missing a build-system table.

rasbt · 2025-02-17T20:31:34Z

Thanks! Haha, and I definitely wouldn't mind that.

Regarding streamlining... it may seem inefficient that there are pip/uv/pixi tests now, which seems like a lot. But it's so that I know all 3 things work because users may choose whatever they choose :).

Also, there are the uv-tests that are in separate Windows/Linux/macOS files. That's mainly to get the badge buttons to work properly. I tried doing it differently, but if you have them with a build matrix in the same .yml, you can't have separate badge descriptions based on the things I tried.

matthewfeickert · 2025-02-17T22:16:14Z

Regarding streamlining... it may seem inefficient that there are pip/uv/pixi tests now, which seems like a lot. But it's so that I know all 3 things work because users may choose whatever they choose :).

Oh yeah that was all clear. By "streamlining" I mean "simplifying/speeding up individual processes" not "removing existing workflows".

feat: Add pixi environment

dcdcbc4

* Add pixi manifest pixi.toml for Linux x86, macOS arm64, Windows 64.

matthewfeickert force-pushed the feat/add-pixi-environment branch from 5d35081 to bd81543 Compare February 17, 2025 01:34

matthewfeickert added 2 commits February 16, 2025 18:40

ci: Update CI workflow and unify to one

d3bcf51

* Enable workflow dispatch. * Add concurrency limits. * Use pixi for CI workflow. * Unify to a single workflow for all OS tested

feat: Add pixi lock file

b6be9e3

matthewfeickert force-pushed the feat/add-pixi-environment branch from bd81543 to b6be9e3 Compare February 17, 2025 01:41

Ensure tensorflow-cpu installed on Windows

7add431

fix package check

60945d1

fix package check

2ba76a1

matthewfeickert commented Feb 17, 2025

View reviewed changes

rasbt added 7 commits February 16, 2025 20:28

simplification plus uv and pip runners

5f8544b

some fixes to pixi and pip

6c13d4a

create pixi.lock

15e880c

fix pixi.lock issue

4692e16

another attempt trying to fix get_packages

d10f1f3

another attempt trying to fix get_packages

b57b74f

clean up python_environment_check.py

49c5a70

rasbt force-pushed the feat/add-pixi-environment branch from e4da5d2 to 49c5a70 Compare February 17, 2025 15:14

rasbt marked this pull request as ready for review February 17, 2025 15:27

Merge branch 'main' into feat/add-pixi-environment

94024bf

matthewfeickert commented Feb 17, 2025

View reviewed changes

rasbt added 4 commits February 17, 2025 10:38

updated runner and docs

8f0a7f9

use bash

3c18825

proper env activiation

7e15352

proper env activiation

8fa9078

rasbt merged commit a8b8eb4 into rasbt:main Feb 17, 2025
13 checks passed

matthewfeickert deleted the feat/add-pixi-environment branch February 17, 2025 20:15

feat: Add pixi environment #534

feat: Add pixi environment #534

Uh oh!

Conversation

matthewfeickert commented Feb 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rasbt commented Feb 17, 2025

Uh oh!

matthewfeickert commented Feb 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rasbt commented Feb 17, 2025

Uh oh!

matthewfeickert commented Feb 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

matthewfeickert Feb 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rasbt commented Feb 17, 2025

Uh oh!

Uh oh!

matthewfeickert commented Feb 17, 2025

Uh oh!

rasbt commented Feb 17, 2025

Uh oh!

matthewfeickert commented Feb 17, 2025

Uh oh!

Uh oh!

matthewfeickert commented Feb 17, 2025 •

edited

Loading

matthewfeickert commented Feb 17, 2025 •

edited

Loading

matthewfeickert commented Feb 17, 2025 •

edited

Loading

matthewfeickert Feb 17, 2025 •

edited

Loading