TimothyEDawson

@TimothyEDawson TimothyEDawson commented Jul 14, 2025

Changes proposed in this pull request

  • Add Python stub (.pyi) files to the Cython interface which cover the public API for the Python code, including runtime-generated attributes.
  • Include the stub files in the installed Python package.
  • Add testing infrastructure to verify consistent typing and adequate coverage.

There was some discussion about the approach here, but in short I consider this to be a stepping stone toward adding static typing to the Cython interface. Some advantages of starting with stub files include:

  • The full public API is made immediately available for developers who utilize Cantera as a dependency in their Python projects (like me!).
  • Work on type hints is fully segregated from the actual Python code as many details are worked out and the testing infrastructure is set up.
  • It is substantially easier to add typing for the Python API without contending with Cython syntax.
  • It makes dynamically-generated parts of the API, such as pass-through attributes from a Solution to a Quantity or SolutionArray, trivial to type-hint, and may (?) be the correct way to do so.
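
To illustrate the last point, here is a minimal sketch of why a stub helps with runtime-generated attributes. The classes below are simplified stand-ins (not Cantera's implementation), and `_add_passthrough` and `STUB_TEXT` are hypothetical names introduced only for this example. The property is attached at runtime, so a static checker cannot see it in the source; the stub text declares it explicitly.

```python
class Solution:
    """Stand-in for a phase object; not the real Cantera class."""

    def __init__(self, T: float) -> None:
        self._T = T

    @property
    def T(self) -> float:
        return self._T


class Quantity:
    """Stand-in wrapper whose pass-through attributes are created at runtime."""

    def __init__(self, phase: Solution, mass: float) -> None:
        self._phase = phase
        self.mass = mass


def _add_passthrough(name: str) -> None:
    # Attach a property to Quantity that forwards to the wrapped phase.
    def getter(self):
        return getattr(self._phase, name)

    setattr(Quantity, name, property(getter))


_add_passthrough("T")

# What a matching stub (.pyi) could declare so that type checkers and IDEs
# see the dynamically generated attribute (assumed layout, for illustration).
STUB_TEXT = '''
class Quantity:
    mass: float
    def __init__(self, phase: Solution, mass: float) -> None: ...
    @property
    def T(self) -> float: ...
'''

q = Quantity(Solution(300.0), mass=1.0)
print(q.T)
```

The runtime class and the stub describe the same attribute from two sides: one creates it, the other tells the type checker it exists.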

There are some disadvantages, with the largest perhaps being the substantial increase in the amount of code to be maintained going forward. My hope is that the majority of the stub file content is gradually replaced with type hints within the source code, and I'm aware that I already have some redundant type hints in here.

If applicable, fill in the issue number this pull request is fixing

Potentially closes Cantera/enhancements#85.

Checklist

  • The pull request includes a clear description of this code change
  • Commit messages have short titles and reference relevant issues
  • Build passes (scons build & scons test) and unit tests address code coverage
  • Style & formatting of contributed code follows contributing guidelines
  • The pull request is ready for review

@TimothyEDawson
Author

Some notes:

I covered a good chunk of the public API already, but I am aware of some holes here and there which I plan to fill in as time permits. I have deliberately omitted the various file conversion scripts (e.g. ck2yaml.py), but am not opposed to including them here.

I still need to figure out a good way to add unit testing for the type hints. I expect that would look like liberal use of typing.assert_type, as used in scipy-stubs, but to get and maintain thorough coverage may require some thought on infrastructure.
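
A minimal sketch of what such a test might look like, assuming Python 3.11+ for `typing.assert_type` (with a trivial fallback otherwise). The function `mean_molecular_weight` is a hypothetical stand-in for a stubbed API call, not a Cantera function. At runtime `assert_type` simply returns its first argument; the actual check happens when a static type checker is run over the test file.

```python
import sys

if sys.version_info >= (3, 11):
    from typing import assert_type
else:
    def assert_type(val, typ):
        """Runtime fallback: like typing.assert_type, returns val unchanged."""
        return val


def mean_molecular_weight(weights: list[float]) -> float:
    """Hypothetical typed function standing in for a stubbed API call."""
    return sum(weights) / len(weights)


# A type checker reports an error here if the inferred type is not exactly
# float; at runtime this line just passes the value through.
result = assert_type(mean_molecular_weight([2.0, 32.0]), float)
print(result)
```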

Several decisions were a little impulsive and I'm sure there are many improvements and changes needed. Some examples:

  • I have defined numerous TypeAliases for the sake of convenience, many of which should probably start with an underscore so they are not included as part of the public API.
  • Any is used in several places where it may be possible to replace it with an Unpacked TypedDict or other option.
  • I started to add __all__ in some areas when that should probably be a separate pull request to add it to the source files.
  • __init__.pyi is quite cumbersome. I attempted to turn all of the implicit imports into explicit imports, then removed the exported symbols which originate from third-party libraries, but I'm sure there is a more elegant way to do this. I think that thorough use of __all__ might make the explicit imports fully redundant.
  • NumPy typing always feels like there are too many ways to do things, so I went with a very minimal approach of assuming all ndarrays are arbitrarily-sized np.typing.NDArray[np.float64] which I call Array (should probably be _ArrayFloat64 or something), and anything which is specifically going to be coerced into an ndarray is typed as np.typing.ArrayLike. I'm aware some packages like optype and NumType may provide more descriptive and flexible type hints, but I don't want to add any external dependencies unless absolutely necessary.
  • I did not use backwards-compatibility stuff like from typing_extensions import because I was under the impression those are largely intended for source code, but I need to look into it more to see if those imports should be modified.
  • The stubs simply assume you have the necessary optional dependencies to work, such as Pandas. I'm not certain whether that's the correct way to handle them.

And I'm sure there's more, I just wanted to start getting feedback now rather than keep postponing this pull request forever!

@ischoegl
Member

ischoegl commented Jul 15, 2025

Thanks for your efforts on this, @TimothyEDawson - this looks really promising! I am, however, somewhat concerned about maintaining parallel signatures in pyx and pyi files: for any change or addition, things need to be edited in two (usually large!) parallel files, which, imho, is not ideal.

Based on what you suggest here, I ran a couple of quick tests: specifically, I was interested in whether type hints are preserved if they are added directly in the pyx file. I ran some rudimentary tests on interfaces/cython/cantera/_utils.pyx (i.e., one of the shorter files), without adding AnyMap.

Based on successful tests, it appears that adding type hints directly to pyx may be more maintainable than keeping separate pyi files. However, this assumes that things can be combined, i.e., where static types are known, add it to pyx, whereas dynamic properties and methods may have to be added via pyi.

You can find my (very rudimentary) test here: https://github.com/ischoegl/cantera/tree/test-type-hints (just one commit) ... I simply copied over what you had in the pyi file.

@TimothyEDawson
Author

TimothyEDawson commented Jul 15, 2025

@ischoegl yes, we're on the same page. That was the same concern I highlighted in the pull request text.

There are no inherent issues with .pyx files, as they're generally treated the same as .py files as far as type hints go; that was not part of the motivation for this. There's an order of resolution to type hints (see https://typing.python.org/en/latest/spec/distributing.html#partial-stub-packages). Because stub files take precedence over inline type hints, the path forward would essentially be to delete type hints from the stub files as they are added to the source code files.

I am open to whichever approach the Cantera developers wish to take. The two options which seem best to me are:

  • Merge in a full set of stubs with this pull request, then gradually work toward moving type hints into the inline code via subsequent pull requests.
  • Work towards inlining the type hints as much as possible within this pull request, stopping when it's either complete (no more .pyi files) or it gets too tricky (arbitrary).

I lean toward the first option because it gives users a working set of type hints while we work out details for the inline version. It also means I don't even need to touch any .pyx files. Maintainability is a concern, but mostly just for API changes, and any new functionality which is properly type-hinted inline won't need to be added to the .pyi files at all.

At the same time, realistically it will be trivial to move the majority of these type hints inline right now. I expect there will be some edge cases which might take a while to figure out what to do, but if that's the route we want to go, I'd be happy to try it.

@TimothyEDawson
Author

One issue I encounter when moving things into the .pyx files is that the language server extensions I use in VS Code (e.g. Pylance, Ruff, Pyrefly) don't recognize Cython syntax, and thus can't help me catch errors like they can in type stubs. That's mostly a convenience, so I'm willing to live with it if I can come up with a robust testing procedure which could catch such errors.

I'm going to focus on finishing up the stub files, as I'm already quite close, and designing some testing infrastructure to ensure the type hints match the runtime code. In its current state I'm already using it extensively on my other projects, and finding it to be very helpful (and also finding areas which need reworking).

I think there are also many good discussions to be had in the nuances of typing. For one example, I haven't found a way to represent attributes which are only sometimes available, e.g. how a SolutionArray based on a Solution object doesn't have Q-related attributes, and one based on a PureFluid won't have the attributes associated with Kinetics. I did find that I can use generics in such a way that your IDE will flag that an attribute is unreachable, so I'm moving forward with implementing that now, but unreachable attributes will unfortunately still be visible in IntelliSense/tab-completion and at runtime when calling dir.

codecov bot commented Jul 31, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.92%. Comparing base (98ec8d1) to head (29fc21e).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1926   +/-   ##
=======================================
  Coverage   74.92%   74.92%           
=======================================
  Files         448      448           
  Lines       55763    55763           
  Branches     9205     9205           
=======================================
  Hits        41780    41780           
  Misses      10881    10881           
  Partials     3102     3102           

Member

@bryanwweber bryanwweber left a comment

Thanks for starting this! Aside from the line comments, two other suggestions:

  1. Is there any way to add tests for these types? Both correctness relative to implementation and completeness of the types. The former to catch changes in signatures, the latter to catch new functions
  2. Can the imports of Cantera functions be made relative instead of absolute? I'm a little worried about having another Cantera installation on the PYTHONPATH and mixing up the hints

@property
def boundary_emissivities(self) -> tuple[float, float]: ...
@boundary_emissivities.setter
def boundary_emissivities(self, epsilon: Sequence[float]) -> None: ...
Member

Is there a way to type that this has to be a sequence of two elements that are floats?

Author

I was wondering the same thing. In this case this should actually be tuple[float, float], thus specifying the length, as that matches the internal representation which specifically accepts a tuple and checks the length. In many other places in the code, however, the length is not checked/enforced and the function would work fine with other containers such as lists and ndarrays. In those cases Sequence would be the correct type (sometimes Iterable), neither of which seems to have a way to specify the number of elements.

Technically, boundary_emissivities would work fine with any number of elements if it didn't specifically raise a ValueError as it does, since it specifically pulls out the first two elements from the input tuple.

Member

I didn't look at the implementation, but does it really check for a tuple? That'd be surprising to me. Too bad there's not a way to specify sequence dimensions more generically.

Author

It specifies a tuple in the Cython style, though I don't know that it would reject a list. I'm guessing it would not.

Author

It does, in fact, only accept a tuple. Which is good! This way we can enforce the size.
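
The behavior discussed in this thread can be sketched as follows. This is a simplified pure-Python stand-in, not the Cython implementation: the class name `Boundary` is hypothetical, and the choice of `TypeError` for a non-tuple and `ValueError` for a wrong length is an assumption based on the comments above.

```python
from __future__ import annotations


class Boundary:
    """Hypothetical stand-in; not Cantera's implementation."""

    def __init__(self) -> None:
        self._eps: tuple[float, float] = (0.0, 0.0)

    @property
    def boundary_emissivities(self) -> tuple[float, float]:
        return self._eps

    @boundary_emissivities.setter
    def boundary_emissivities(self, epsilon: tuple[float, float]) -> None:
        # tuple[float, float] encodes the length in the type, which
        # Sequence[float] cannot do.
        if not isinstance(epsilon, tuple):
            raise TypeError("expected a tuple")
        if len(epsilon) != 2:
            raise ValueError("expected exactly two values")
        self._eps = (float(epsilon[0]), float(epsilon[1]))


b = Boundary()
b.boundary_emissivities = (0.1, 0.2)
print(b.boundary_emissivities)
```

With the setter annotated as `tuple[float, float]`, a type checker can reject both a list and a three-element tuple before the code ever runs.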

def linear_solver(self) -> SystemJacobian: ...
@linear_solver.setter
def linear_solver(self, precon: SystemJacobian) -> None: ...
def set_initial_guess(self, *args: Any, **kwargs: Any) -> None: ...
Member

This function might be implemented this way, but can we provide any concrete details here that maybe aren't captured by the actual signature?

Author

Certainly, and it's something I plan to come back to! Though I feel like maybe the actual signature should be rewritten to include such details, too.

*,
diluent: CompositionLike | None = None,
fraction: str
| dict[Literal["fuel", "oxidizer", "diluent"], float]
Member

I think it'd be nice to define this as a TypedDict since these are the only 3 allowed keys

@property
def thermo_model(self) -> ThermoType: ...
@property
def phase_of_matter(self) -> str: ...
Member

This returns one of a few literal string options that we could specify here.
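
A sketch of how a literal return type could be declared. The alias name `PhaseOfMatter` is hypothetical, and the three strings are an illustrative subset only; the complete list of values would need to be taken from the implementation.

```python
from typing import Literal, get_args

# Illustrative subset only (assumption); not the confirmed list of values.
PhaseOfMatter = Literal["gas", "liquid", "solid"]


class Phase:
    """Hypothetical stand-in showing the annotated property."""

    @property
    def phase_of_matter(self) -> PhaseOfMatter:
        return "gas"


print(get_args(PhaseOfMatter))
```

With this annotation, a comparison such as `phase.phase_of_matter == "gass"` can be flagged by a type checker as never true, which a plain `str` return type cannot catch.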

@@ -0,0 +1 @@
def no_op(t: float) -> float: ...
Member

We probably don't need to hint this function, it's not really meant for end use

Comment on lines 18 to 29
from cantera._utils import __git_commit__ as __git_commit__
from cantera._utils import __version__ as __version__
from cantera._utils import hdf_support as hdf_support
Member

These appear to be unused here?

Author

While true, they are exported symbols which exist in onedim's namespace. You can verify that by executing:
import cantera as ct
print(ct.onedim.__git_commit__)

That being said, it's probably fine to remove them, as their existence here should be visible from the onedim.py file, and the relevant type information should be discernible from _onedim.pyi. Some of the odd code like this originated from MyPy's stubgen, which I used as a rough starting point.

Member

I think we should focus on documenting the intended public interface (here and elsewhere), as opposed to the details that aren't relevant to end users. These constants are imported here, sure, but that's essentially an implementation detail.

Author

Absolutely, though it's not always clear what is relevant and what isn't. Even attributes with an underscore are sometimes important to me, e.g. SolutionArray._phase.

Member

@bryanwweber bryanwweber Aug 2, 2025

My personal preference is to elide as many implementation details as necessary, including defining types that represent a combination of objects if that's necessary. The reason I feel this way is that I think it will be much easier to get something merged than to chase complete correctness. I'm not sure if that changes anything you're already doing though 😁

# the extension, so there are no ``source`` files in the setup.py ``extension`` and
# we have to treat the module as package data.
- cantera = *.pxd, *.dll, *@py_module_ext@
+ cantera = *.pxd, *.dll, *@py_module_ext@, py.typed, *.pyi, **/*.pyi
Member

Can you make a similar change to the pyproject.toml in the sdist source?

Author

Sure thing! To be honest, I wasn't even aware of that file. I take it this folder handles packaging the library for conda-forge and PyPI. How should I go about verifying that it's capturing everything correctly?

I also only just noticed the sourcegen folder... I'd like to read up more on the function and reason for separating these folders. I assume there are some relevant discussions in some issues/pull requests, any chance you could point me to some relevant ones?

Author

Never mind, I should have just looked through the developer documentation before commenting. It looks like this page has everything I need, so I'll go ahead and verify that everything is being properly included.

Author

It seems like there wasn't actually any change needed. As far as I can tell, scikit-build-core copies py.typed and *.pyi files into the distribution by default, unlike setuptools. I did end up adding a "Typing :: Typed" to the Python classifiers there, though.

Member

Man, setuptools is so unintuitive. I wish there were another build backend that met our needs for the regular interface.

units: ApplicationRegistry
Q_: Quantity

def copy_doc(method: F) -> F: ...
Member

This is not meant to be a public function, only a convenience wrapper. If your checking works without including this, I think you can remove it

@TimothyEDawson
Author

Hey @bryanwweber , thank you for the review! I'll respond to all the comments soon.

Regarding testing, certainly, that's one of the things I'm working on. Any static type checker like Mypy will easily flag static functions which are missing types. Dynamic attributes might be tricky, though running mypy's stubtest tool on cantera was able to catch some that I missed.

Regarding absolute vs. relative imports, certainly we can swap it. I have a strong personal preference for absolute imports, though I don't know if there are very strong objective reasons to prefer one over the other. I would note that the way I generally avoid the issue you raised when working on a Python package is to perform an editable install (e.g. pip install -e .), though I am unsure if that is an option when using SCons, and it probably won't reflect Cython code changes.

There are a few high-level items I'd love to discuss, some of which may go beyond the scope of this pull request. One is the public/private member distinction: I've noticed many symbols which are exported by Cantera which probably should not be, such as external packages (e.g. NumPy) and the copy_doc decorator function you called out. (Although a properly-typed copy_doc might be useful, I actually have an updated version of it to push at some point.)

One of my goals was to type everything which is exported, even if it wasn't really intended to be publicly accessible, or at the very least everything which isn't prefixed by an underscore (and using judgement for things which are). However, it quickly became clear to me that Cantera should be utilizing __all__ to control the exports of each module, and be a lot more restrictive. This would also solve some issues which I see have workarounds, such as the function composite._make_functions which has a comment saying it exists simply to avoid polluting the module namespace.

There's also a lot that could be added to pyproject.toml to control which options we want enforced in a given type checker (e.g. MyPy). I have a setup on my end which I haven't pushed. That might be a can of worms given how much can be done within pyproject.toml and how many static type-checkers (and linters) are available, so I was planning to defer that until after I've found a testing procedure I'm happy with.

And one last general note, any appearance of Any is a placeholder, and there's certainly a lot of work left to do with every *args, **kwargs and TypedDict. For implementations using args and kwargs I considered whether we could just specify the expected signature instead (i.e. only the inputs which will actually be used), but we should probably either a) still retain some kind of indication that the function will accept any arbitrary arguments, or b) rewrite the actual implementation so it does not do that anymore, if that's not a desired feature. Any thoughts?

@ischoegl
Member

[…] I've noticed many symbols which are exported by Cantera which probably should not be, such as external packages (e.g. Numpy) and the copy_doc decorator function you called out. […]

One of my goals was to type everything which is exported, even if it wasn't really intended to be publicly accessible, or at the very least everything which isn't prefixed by an underscore (and using judgement for things which are). However, it quickly became clear to me that Cantera should be utilizing __all__ to control the exports of each module, and be a lot more restrictive.

There was some prior discussion on this a long time ago, see #616

@TimothyEDawson
Author

TimothyEDawson commented Aug 1, 2025

@ischoegl I appreciate that background information! I strongly believe that it should be revisited. Making an extra step necessary to make a new class or module-level attribute exported, and thus opt-in, would be a good thing in my opinion. It's worth noting that developers are still free to directly import things which are not in the list, they just need to know it's there.

And to the point about dir(ct), yes, that's not common; however, if anyone tries from cantera import *, they will end up with a lot of extra stuff which could cause issues. Of course, star imports are generally discouraged for that very reason, but Cantera itself uses them a lot. All that namespace pollution also shows up within intellisense and autocomplete suggestions in modern IDEs and the Python REPL.
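
A minimal sketch of the effect being described, using a throwaway in-memory module (`demo_mod`, a hypothetical name) rather than Cantera itself. Without `__all__`, a star import would also pull in `helper` and the imported `math` module; with it, only the listed name is exported, while explicit access to the rest still works.

```python
import sys
import types

source = '''
import math                      # leaks into star imports unless __all__ is set
__all__ = ["area"]               # opt-in list of public names

def area(r):
    return math.pi * r * r

def helper():
    return "internal"
'''

# Build the module in memory and register it so that it can be imported.
mod = types.ModuleType("demo_mod")
exec(source, mod.__dict__)
sys.modules["demo_mod"] = mod

# Perform the star import into a fresh namespace and see what arrived.
namespace = {}
exec("from demo_mod import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)

# Names outside __all__ remain reachable when requested explicitly.
from demo_mod import helper
print(helper())
```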

I was planning to make a new issue to this effect, but I could just as well comment on the original thread.

@ischoegl
Member

ischoegl commented Aug 1, 2025

@ischoegl I appreciate that background information! I strongly believe that it should be revisited. […]

A lot has happened since, so I tend to agree. Ad star imports, see #1791 … that was for build scripts, but the issues are somewhat related.

@ischoegl
Member

ischoegl commented Aug 1, 2025

I was planning to make a new issue to this effect, but I could just as well comment on the original thread.

@TimothyEDawson ... Feel free to create a new issue while referencing #616. Your angle is sufficiently different and likely more convincing; some pros and cons for (internal) star imports are discussed in #1791.

@bryanwweber
Member

though I am unsure if that is an option when using Scons and probably won't reflect Cython code changes.

Indeed, this doesn't work as you suspected. It's also somewhat common as a dev to have PYTHONPATH set to control where the interface is imported from. In that case, relative imports ensure that the correct source code is picked up.

@bryanwweber
Member

bryanwweber commented Aug 2, 2025

I've noticed many symbols which are exported by Cantera which probably should not be

I'm not sure about "should" here, but I agree we could simplify things. I'd guess the high-level interface is pretty static at this point, so I could see a case where __init__.py only imports the intended public interface rather than * imports from the submodules. I don't think using __all__ makes a ton of sense, though 😕

One of my goals was to type everything which is exported

As I've said elsewhere, I think we should only type the intended exported interface for now. For two reasons, first it makes this PR simpler, and second because it gives us flexibility to change the unintended interface with a little more freedom.

There's also a lot that could be added to pyproject.toml to control which options we want enforced in a given type checker

I'd rather add a separate config file. To the extent we can support multiple type checkers, that would be good I think

@TimothyEDawson
Author

I'm quickly closing in on being able to run a moderately strict mypy within the interfaces/cython folder and having it return no errors! I only have 15 errors remaining across 2 files - ctml2yaml.pyi and lxcat2yaml.pyi. I'm hoping that won't be too difficult to incorporate into the testing architecture. I'm looking into how typeshed does its testing for inspiration: https://github.com/python/typeshed/tree/main/tests.

There are some tools for automatically merging stub files into the source code files which might be worth trying, but I assume they won't work for the .pyx files. I'm also not positive that Cython syntax supports the full breadth of typing features I'm currently employing.

@TimothyEDawson
Author

I'm not sure about "should" here, but I agree we could simplify things. I'd guess the high-level interface is pretty static at this point, so I could see a case where __init__.py only imports the intended public interface rather than * imports from the submodules. I don't think using __all__ makes a ton of sense, though 😕

Whenever I get around to opening that new issue regarding use of star imports and __all__, I'd definitely like to hear why! I have personally found __all__ to be incredibly convenient and useful in my own projects.

As I've said elsewhere, I think we should only type the intended exported interface for now. For two reasons, first it makes this PR simpler, and second because it gives us flexibility to change the unintended interface with a little more freedom.

So if we don't have types for everything exported to begin with, it makes the testing infrastructure much more complex. Right now it's very close to "does everything have a type? Good!" whereas if I start removing things I've already annotated, the tests will now need to know what all of the exceptions are in order to pass (which is doable, e.g. by adding # type: ignore directives). It's quite easy to just type things that are optional, as I already have.

However, a third option of using __all__ to substantially reduce the exported symbols would be great, and I could whip that up in an afternoon. Then anything not contained in those exports can and should be removed from these stubs.

It's also worth noting that once there's a generalized testing infrastructure in place, it would be pretty straightforward to enforce that new functions must be a) added to __all__ if they're intended to be part of the public interface, and b) typed in order to be accepted, while typing remains optional for implementation details.
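
One possible shape for such a check, as a rough sketch only: it inspects pure-Python functions listed in a module's `__all__` and reports those with missing annotations. The helper name `untyped_public_functions` and the demo module are hypothetical, and this approach would not cover Cython-compiled functions or stub files, where a tool such as stubtest is the better fit.

```python
import inspect
import types


def untyped_public_functions(module: types.ModuleType) -> list[str]:
    """Return names in module.__all__ whose signatures lack annotations."""
    missing = []
    for name in getattr(module, "__all__", []):
        obj = getattr(module, name)
        if not inspect.isfunction(obj):
            continue  # classes, constants, etc. are out of scope for this sketch
        sig = inspect.signature(obj)
        params_ok = all(
            p.annotation is not inspect.Parameter.empty
            for p in sig.parameters.values()
        )
        if not params_ok or sig.return_annotation is inspect.Signature.empty:
            missing.append(name)
    return missing


# Throwaway module: one typed public function, one untyped public function,
# and one untyped private helper that is deliberately not checked.
demo = types.ModuleType("demo")
exec(
    '''
__all__ = ["typed", "untyped"]

def typed(x: float) -> float:
    return x

def untyped(x):
    return x

def _private(x):
    return x
''',
    demo.__dict__,
)

print(untyped_public_functions(demo))
```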

@TimothyEDawson
Author

TimothyEDawson commented Aug 2, 2025

I'd rather add a separate config file. To the extent we can support multiple type checkers, that would be good I think

Why would you do that? Anything that would go into mypy.ini can also go into a [tool.mypy] section within pyproject.toml, likewise for every modern static type checker and linter. What is the benefit to creating a bunch of config files instead of adding the lines to the one which is already present?

@ischoegl
Member

ischoegl commented Aug 2, 2025

[...] I'd guess the high-level interface is pretty static at this point, so I could see a case where __init__.py only imports the intended public interface rather than * imports from the submodules. I don't think using __all__ makes a ton of sense, though 😕

I knew that __all__ would be controversial! 😂

So if we don't have types for everything exported to begin with, it makes the testing infrastructure much more complex. Right now it's very close to "does everything have a type? Good!" whereas if I start removing things I've already annotated, the tests will need to now know what all of the exceptions are in order to pass (which is doable, e.g. by adding # type: ignore directives). It's quite easy to just type things that are optional, as I already have.

I think there's a case to be made to first tackle the public interface, and then finalize the typing? From my perspective, I am with @TimothyEDawson in his desire to create a clean public interface. In my own projects, I have used both __all__ and the __init__.py approach; both accomplish similar things while having their own pros and cons; I have long abandoned star imports as they are frowned upon for a reason. I'd suggest following @bryanwweber's __init__.py compromise, so we can avoid onerous exception handling for testing in this PR? Just my 2 cents, of course.

@ischoegl ischoegl mentioned this pull request Aug 16, 2025
5 tasks
@TimothyEDawson
Author

Wanted to highlight one of the new features I added a bit ago. By making SolutionArray a generic type, the phase object used as the input is visible in its type signature:

image

Which enables the language server to spot when you're trying to access a passthrough property which isn't available:

image

And to infer the type of the underlying phase object:

image
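
Since the screenshots are not reproduced here, the following is one way the idea could be expressed. It is a sketch under assumptions, not the PR's actual stubs: the classes are simplified stand-ins, and restricting a property by annotating `self` is just one technique a type checker may use to flag unavailable pass-through attributes.

```python
from __future__ import annotations

from typing import Generic, TypeVar


class Solution:
    """Stand-in phase with kinetics (hypothetical, not the Cantera class)."""

    def net_rates(self) -> list[float]:
        return [0.0]


class PureFluid:
    """Stand-in phase with a vapor fraction Q but no kinetics."""

    Q = 0.5


P = TypeVar("P", Solution, PureFluid)


class SolutionArray(Generic[P]):
    def __init__(self, phase: P) -> None:
        self._phase = phase

    @property
    def phase(self) -> P:
        # The checker infers Solution or PureFluid from the constructor call.
        return self._phase

    # Annotating `self` limits this property to SolutionArray[PureFluid];
    # the intent is that a checker flags access on SolutionArray[Solution].
    @property
    def Q(self: SolutionArray[PureFluid]) -> float:
        return self._phase.Q


fluid_states = SolutionArray(PureFluid())
gas_states = SolutionArray(Solution())
print(fluid_states.Q)
print(type(gas_states.phase).__name__)
```

As noted earlier in the thread, the restricted attribute still exists on the class at runtime, so it continues to show up in `dir` and tab completion.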

@TimothyEDawson
Author

TimothyEDawson commented Aug 17, 2025

I switched to relative imports and started merging stub files into the source code, and I'm not super happy with the results.

Relative imports cause stubtest to raise an error for any module which doesn't end up with a corresponding .py, .pyx, or .pyi file within the installed site-packages folder. E.g.

.venv/lib/python3.13/site-packages/cantera/__init__.pyi:26: error: Cannot find implementation or library stub for module named "cantera._cantera"  [import-not-found]
.venv/lib/python3.13/site-packages/cantera/__init__.pyi:26: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
.venv/lib/python3.13/site-packages/cantera/__init__.pyi:61: error: Cannot find implementation or library stub for module named "cantera.constants"  [import-not-found]
.venv/lib/python3.13/site-packages/cantera/__init__.pyi:76: error: Cannot find implementation or library stub for module named "cantera.delegator"  [import-not-found]
.venv/lib/python3.13/site-packages/cantera/__init__.pyi:142: error: Cannot find implementation or library stub for module named "cantera.reactionpath"  [import-not-found]

Putting the type annotations into the source code seems to also switch Mypy into a mode where, for .py files, it checks the internals and not just the function signatures, which obviously greatly expands the scope of this effort. Though there's probably an option to toggle somewhere to only check the signatures for correctness. E.g.

.venv/lib/python3.13/site-packages/cantera/data.py:19: error: Need type annotation for "data_files" (hint: "data_files: set[<type>] = ...")  [var-annotated]
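
For context, a var-annotated error like this one typically comes from an empty container whose element type mypy cannot infer. The sketch below shows the general fix; the contents of data.py are not shown in this thread, so the `register` helper and the file name used here are purely illustrative.

```python
# A bare `data_files = set()` gives mypy nothing to infer the element type
# from, which triggers [var-annotated]. An explicit annotation resolves it.
data_files: set[str] = set()


def register(name: str) -> None:
    """Hypothetical helper that records a data file name."""
    data_files.add(name)


register("example.yaml")  # illustrative file name
print(data_files)
```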

I'll keep these changes for now and keep working on other aspects, just wanted to leave a note about it here.

@TimothyEDawson
Author

Whoops, guess I messed something up. I can see the following error in Build docs:

Extension error:
Here is a summary of the problems encountered when running the examples:

Unexpected failing examples (1):

    ../samples/python/kinetics/custom_reactions.py failed leaving traceback:

    Traceback (most recent call last):
      File "/home/runner/work/cantera/cantera/build/doc/samples/python/kinetics/custom_reactions.py", line 74, in <module>
        @ct.extension(name="extensible-Arrhenius", data=ExtensibleArrheniusData)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    TypeError: Argument 'data' has incorrect type (expected cantera.reaction.ExtensibleRateData, got type)

So I see that my mistake was putting data: ExtensibleRateData | None instead of data: type[ExtensibleRateData] | None. I'll fix that really quick, and hopefully that's all.

@TimothyEDawson
Author

TimothyEDawson commented Aug 17, 2025

Alright, I think the error causing "CI / ubuntu-22.04 with Python 3.10, Numpy latest, Cython ==0.29.31" to fail:

[ RUN      ] Reaction.PythonExtensibleRate
Traceback (most recent call last):
  File "/home/runner/work/cantera/cantera/build/python/cantera/__init__.py", line 4, in <module>
    from ._cantera import *
  File "build/python/cantera/_cantera.pyx", line 26, in init cantera._cantera
TypeError: 'ABCMeta' object is not subscriptable

where line 26 in _cantera.pyx is `_path: Sequence[str] | None,`, is specific to Cython 0.29.31. I'm guessing it just doesn't support the generic type syntax. I'm not sure why the various Python 3.14 tests are failing. They all pass for me locally, though I use 3.14.0-rc.1 and I see these are using 3.14.0-rc.2.

@ischoegl
Member

ischoegl commented Aug 19, 2025

Hi @TimothyEDawson ... while #1947 is probably moot, I wanted to leave a note here.

Putting the type annotations into the source code seems to also switch Mypy into a mode where, for .py files, it checks the internals and not just the function signatures, which obviously greatly expands the scope of this effort. Though there's probably an option to toggle somewhere to only check the signatures for correctness.

I'd expect a Mypy option to prevent this also. If there isn't, I'd be 👍 with leaving things in separate .pyi files. PS: I actually don't think there is a straightforward option for Mypy, but there may be other tools that are less restrictive.

Regarding the post-merge tests: it appears to be a single issue with HDF. On your machine, you won't see it unless you have HDF support enabled, but there could be other reasons also.

Other than that, could you rebase this PR on the current main to avoid inclusion of unrelated PR changes that were recently merged?

@TimothyEDawson
Author

Hey @ischoegl , I will work on rebasing soon, and might squash some of my commits while I'm at it.

Regarding #1947 , in my opinion you closed it prematurely - it was the opening to a conversation, not the end of it. At the same time, I wonder if it may be moot for a different reason - if we do utilize stub files with explicit imports (and optionally __all__ lists), we may end up achieving everything required without actually modifying the source files (even if speth is not happy with the long lists of imports to be maintained). We'll need to revisit this once I've marked this branch ready for review.

I've made a lot of progress on adding types directly into the file conversion modules, so hopefully I can get those pushed in the next week or two. In parallel I have been digging into the established testing infrastructure and GitHub Actions to figure out where the typing tests should be inserted. They're independent of pytest, so I'm guessing they should be their own "Python type checking" step after "Run Python tests", within the Ubuntu, Clang, MacOS, and Windows jobs. Though since type checking should be largely independent of the underlying code, I'm not sure it really makes sense to add it to all of them. It also requires additional dependencies, so I need to make sure I'm handling those properly.

I had considered putting it in a standalone type-checking job, but since stubtest does require a built Cantera I'm hesitant to add more builds just for that. (Static type checks using Mypy, Pyright, etc. don't require the code to be compiled.) The Python version used to run the tests matters too, so I at least need the 3.10 - 3.13 test matrix.

@ischoegl
Member

Hey @ischoegl , I will work on rebasing soon, and might squash some of my commits while I'm at it.

Regarding #1947 , in my opinion you closed it prematurely - it was the opening to a conversation, not the end of it. [...]

You may be right, but I currently don't have the bandwidth to go to battle over this. Currently trying to put out several fires at my actual job, and there's no end in sight 😂

I had considered putting it in a standalone type-checking job, but since stubtest does require a built Cantera I'm hesitant to add more builds just for that.

We have a couple of jobs that depend on artifacts from a prior job (examples: .NET runners and Python example runners, i.e., the two matrix jobs that are currently skipped). So it may not be as computationally expensive as it appears. The Python version matrix essentially exists ...

@TimothyEDawson
Author

We have a couple of jobs that depend on artifacts from a prior job (examples: .NET runners and Python example runners, i.e., the two matrix jobs that are currently skipped). So it may not be as computationally expensive as it appears. The Python version matrix essentially exists ...

Good point! A setup like run-examples would cover the requirements for stubtest very nicely, I think. Static type checks would ideally be performed independently of code compilation, and perhaps would make more sense in linters.yaml?

@speth
Member

speth commented Sep 1, 2025

Hi @TimothyEDawson, now that #1948 has been merged, can you rebase this in light of that PR?

Also, can you eliminate all of the changes that are only to the formatting (for example, all the automatic conversion of single quotes to double quotes that's presumably being done by Black or some other auto-formatter)? This PR is enormous and those changes just make it that much more work to review.

@TimothyEDawson
Author

@speth For sure. It's a holiday weekend and the coming week is super busy for me so I can't promise it'll be done soon, but I should at least be able to do the rebase later today.

I would like to note that this is still in draft status to explicitly indicate the PR is not ready for review, as I know there are several items which are still WIP and I don't want to waste anyone's time reviewing things which I already know will change (or waste computational cost from the CI jobs). But I can also appreciate that y'all might want to change the direction of certain changes earlier rather than later, so I welcome comments as they come.

Adding stubs alongside the Python files to provide type hints for
external use. These are intended to cover the public interface,
including all dynamically-generated attributes.

This ensures the .pyi files are copied to the build directory by Scons,
and subsequently installed to the Python site-packages location. There's
probably a cleaner way to do this.

Discovered that `str` is a `Sequence[str]`, so the latter was replaced
with `list[str] | tuple[str, ...]` as a workaround. Also need to include
the default value to ensure empty calls are correctly type-narrowed,
e.g. `Solution.species() -> list[Species]`.

Add type aliases for state setters/getters which deal with array-value
inputs (i.e. SolutionArray) and quality (e.g. PureFluid).
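
The commit note above about `str` being a `Sequence[str]`, and about default values in overloads, can be sketched as follows. This is an illustrative sketch only: `count_names_loose`, `count_names_strict`, and the `Species` and `Solution` classes here are hypothetical stand-ins, not the signatures used in the actual stubs.

```python
from __future__ import annotations

from typing import Sequence, overload


def count_names_loose(names: Sequence[str]) -> int:
    # A bare string type-checks here, because str is itself a Sequence[str].
    return len(names)


def count_names_strict(names: list[str] | tuple[str, ...]) -> int:
    # A bare string is rejected by a type checker; lists and tuples pass.
    return len(names)


class Species:
    def __init__(self, name: str) -> None:
        self.name = name


class Solution:
    """Hypothetical stand-in, not the real cantera.Solution."""

    def __init__(self, names: list[str] | tuple[str, ...]) -> None:
        self._species = [Species(n) for n in names]

    # The `= None` default on the first overload lets a call with no
    # arguments match it, so `species()` narrows to list[Species].
    @overload
    def species(self, k: None = None) -> list[Species]: ...
    @overload
    def species(self, k: int) -> Species: ...

    def species(self, k: int | None = None) -> Species | list[Species]:
        if k is None:
            return list(self._species)
        return self._species[k]


gas = Solution(["H2", "O2"])
all_species = gas.species()         # checker infers list[Species]
first = gas.species(0)              # checker infers Species
n_chars = count_names_loose("CH4")  # counts characters, not names
```

Without the default on the first overload, a type checker would have no overload matching the zero-argument call and could not narrow the return type.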
@ischoegl
Member

Hi @TimothyEDawson ... I wanted to give you a heads-up that we're getting relatively close to a beta release for 3.2. Speaking for myself, I'd like to see the type stubs included but don't know whether you'll be able to push this over the finish line?

@TimothyEDawson
Author

@ischoegl I highly doubt it, as there's really quite a lot more to do. On top of that, I'm going on a week-long vacation soon.

I'd estimate this won't be ready for review for about a month, and that's assuming I get dedicated time to work on it which I have not had for the past month (the end of the fiscal year is always a particularly busy time).

Development

Successfully merging this pull request may close these issues.

Add type hinting to the Python module
4 participants