-
-
Notifications
You must be signed in to change notification settings - Fork 388
Add type stubs to the Python module #1926
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Some notes: I covered a good chunk of the public API already, but I am aware of some holes here and there which I plan to fill in as time permits. I have deliberately omitted the various file conversion scripts (e.g. I still need to figure out a good way to add unit testing for the type hints. I expect that would look like liberal use of Several decisions were a little impulsive and I'm sure there are many improvements and changes needed. Some examples:
And I'm sure there's more, I just wanted to start getting feedback now rather than keep postponing this pull request forever! |
Thanks for your efforts on this, @TimothyEDawson - this looks really promising! I am, however, somewhat concerned about maintaining parallel signatures in Based on what you suggest here, I ran a couple of quick tests: specifically, I was interested in whether type hints are preserved if they are added directly in the Based on successful tests, it appears that adding type hints directly to You can find my (very rudimentary) test here: https://github.com/ischoegl/cantera/tree/test-type-hints (just one commit) ... I simply copied over what you had in the |
@ischoegl yes, we're on the same page. That was the same concern I highlighted in the pull request text. There's no inherent issues with .pyx files as they're generally treated the same as .py files as far as type hints go, that was not part of the motivation for this. There's an order of resolution to type hints as defined here: https://typing.python.org/en/latest/spec/distributing.html#partial-stub-packages Because stub files take precedence over inline type hints, the path forward would essentially be to delete type hints from the stub files as they are added to the source code files. I am open to whichever approach the Cantera developers wish to take. The two options which seem best to me are:
I lean toward the first option because it gives users a working set of type hints while we work out details for the inline version. It also means I don't even need to touch any .pyx files. Maintainability is a concern, but mostly just for API changes, and any new functionality which is properly type-hinted inline won't need to be added to the .pyi files at all. At the same time, realistically it will be trivial to move the majority of these type hints inline right now. I expect there will be some edge cases which might take a while to figure out what to do, but if that's the route we want to go, I'd be happy to try it. |
One issue I encounter when moving things into the .pyx files is that the language server extensions I use in VS Code (e.g. Pylance, Ruff, Pyrefly) don't recognize Cython syntax, and thus can't help me catch errors like they can in type stubs. That's mostly a convenience, so I'm willing to live with it if I can come up with a robust testing procedure which could catch such errors. I'm going to focus on finishing up the stub files, as I'm already quite close, and designing some testing infrastructure to ensure the type hints match the runtime code. In its current state I'm already using it extensively on my other projects, and finding it to be very helpful (and also finding areas which need reworking). I think there are also many good discussions to be had in the nuances of typing. For one example, I haven't found a way to represent attributes which are only sometimes available, e.g. how a |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1926 +/- ##
=======================================
Coverage 74.92% 74.92%
=======================================
Files 448 448
Lines 55763 55763
Branches 9205 9205
=======================================
Hits 41780 41780
Misses 10881 10881
Partials 3102 3102 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for starting this! Aside from the line comments, two other suggestions:
- Is there any way to add tests for these types? Both correctness relative to implementation and completeness of the types. The former to catch changes in signatures, the latter to catch new functions
- Can the imports of Cantera functions be made relative instead of absolute? I'm a little worried about having another Cantera installation on the PYTHONPATH and mixing up the hints
@property | ||
def boundary_emissivities(self) -> tuple[float, float]: ... | ||
@boundary_emissivities.setter | ||
def boundary_emissivities(self, epsilon: Sequence[float]) -> None: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to type that this has to be a sequence of two elements that are floats?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering the same thing. In this case this should actually be tuple[float, float]
, thus specifying the length, as that matches the internal representation which specifically accepts a tuple and checks the length. In many other places in the code, however, the length is not checked/enforced and the function would work fine with other containers such as lists and ndarrays. In those cases Sequence
would be the correct type (sometimes Iterable
), neither of which seem to have a way to specify the number of elements.
Technically, boundary_emissivities
would work fine with any number of elements if it didn't specifically raise a ValueError
as it does, since it specifically pulls out the first two elements from the input tuple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't look at the implementation, but does it really check for a tuple? That'd be surprising to me. Too bad there's not a way to specify sequence dimensions more generically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It specifies a tuple in the Cython style, though I don't know that it would reject a list. I'm guessing it would not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does, in fact, only accept a tuple. Which is good! This way we can enforce the size.
def linear_solver(self) -> SystemJacobian: ... | ||
@linear_solver.setter | ||
def linear_solver(self, precon: SystemJacobian) -> None: ... | ||
def set_initial_guess(self, *args: Any, **kwargs: Any) -> None: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function might be implemented this way, but can we provide any concrete details here that maybe aren't captured by the actual signature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly, and it's something I plan to come back to! Though I feel like maybe the actual signature should be rewritten to include such details, too.
*, | ||
diluent: CompositionLike | None = None, | ||
fraction: str | ||
| dict[Literal["fuel", "oxidizer", "diluent"], float] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd be nice to define this as a TypedDict since these are the only 3 allowed keys
@property | ||
def thermo_model(self) -> ThermoType: ... | ||
@property | ||
def phase_of_matter(self) -> str: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This returns one of a few literal string options that we could specify here.
@@ -0,0 +1 @@ | |||
def no_op(t: float) -> float: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably don't need to hint this function, it's not really meant for end use
interfaces/cython/cantera/onedim.pyi
Outdated
from cantera._utils import __git_commit__ as __git_commit__ | ||
from cantera._utils import __version__ as __version__ | ||
from cantera._utils import hdf_support as hdf_support |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These appear to be unused here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While true, they are exported symbols which exist in onedim's namespace. You can verify that by executing:
import cantera as ct print(ct.onedim.__git_commit__)
That being said, it's probably fine to remove them as its existence here should be visible from the onedim.py
file, and the relevant type information should discernable from _onedim.pyi
. Some of the odd code like this originated from MyPy's stubgen, which I used as a rough starting point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should focus on documenting the intended public interface (here and elsewhere), as opposed to the details that aren't relevant to end users. These constants are imported here, sure, but that's essentially an implementation detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely, though it's not always clear what is relevant and what isn't. Even attributes with an underscore are sometimes important to me, e.g. SolutionArray._phase
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My personal preference is to elide as many implementation details as necessary, including defining types that represent a combination of objects if that's necessary. The reason I feel this way is that I think it will be much easier to get something merged than to chase complete correctness. I'm not sure if that changes anything you're already doing though 😁
# the extension, so there are no ``source`` files in the setup.py ``extension`` and | ||
# we have to treat the module as package data. | ||
cantera = *.pxd, *.dll, *@py_module_ext@ | ||
cantera = *.pxd, *.dll, *@py_module_ext@, py.typed, *.pyi, **/*.pyi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make a similar change to the pyproject.toml
in the sdist source?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing! To be honest, I wasn't even aware of that file. I take it this folder handles packaging the library for conda-forge and PyPI. How should I go about verifying that it's capturing everything correctly?
I also only just noticed the sourcegen folder... I'd like to read up more on the function and reason for separating these folders. I assume there are some relevant discussions in some issues/pull requests, any chance you could point me to some relevant ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind, I should have just looked through the developer documentation before commenting. It looks like this page has everything I need, so I'll go ahead and verify that everything is being properly included.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like there wasn't actually any change needed. As far as I can tell, scikit-build-core copies py.typed
and *.pyi
files into the distribution by default, unlike setuptools. I did end up adding a "Typing :: Typed" to the Python classifiers there, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Man, setuptools is so unintuitive. I wish there were another build backend that met our needs for the regular interface.
units: ApplicationRegistry | ||
Q_: Quantity | ||
|
||
def copy_doc(method: F) -> F: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not meant to be a public function, only a convenience wrapper. If your checking works without including this, I think you can remove it
Hey @bryanwweber , thank you for the review! I'll respond to all the comments soon. Regarding testing, certainly, that's one of the things I'm working on. Any static type checker like Mypy will easily flag static functions which are missing types. Dynamic attributes might be tricky. Regarding absolute vs. relative imports, certainly we can swap it. I have a strong personal preference for absolute imports, though I don't know if there are very strong objective reasons to prefer one over the other. I would note that the way I generally avoid the issue you raised when working on a Python package is to perform an editable install (e.g. There are a few high level items I'd love to discuss, some of which may go beyond the scope of this pull request. One is the public/private member distinction - I've noticed many symbols which are exported by Cantera which probably should not be, such as external packages (e.g. Numpy) and the copy_doc decorator function you called out. (Although a properly-typed copy_doc might be useful, I actually have an updated version of it to push at some point.) One of my goals was to type everything which is exported, even if it wasn't really intended to be publicly accessible, or at the very least everything which isn't prefixed by an underscore (and using judgement for things which are). However, it quickly became clear to me that Cantera should be utilizing There's also a lot that could be added to And one last general note, any appearance of |
There was some prior discussion on this a long time ago, see #616 |
@ischoegl I appreciate that background information! I strongly believe that it should be revisited. Making an extra step necessary to make a new class or module-level attribute exported, and thus opt-in, would be a good thing in my opinion. It's worth noting that developers are still free to directly import things which are not in the list, they just need to know it's there. And to the point about dir(ct), yes, that's not common; however, if anyone tries I was planning to make a new issue to this effect, but I could just as well comment on the original thread. |
@TimothyEDawson ... Feel free to create a new issue while referencing #616. Your angle is sufficiently different and likely more convincing; some pros and cons for (internal) star imports are discussed in #1791. |
a48f805
to
05a707f
Compare
Indeed, this doesn't work as you suspected. It's also somewhat common as a dev to have PYTHONPATH set to control where the interface is imported from. In that case, relative imports ensure that the correct source code is picked up. |
I'm not sure about "should" here, but I agree we could simplify things. I'd guess the high-level interface is pretty static at this point, so I could see a case where
As I've said elsewhere, I think we should only type the intended exported interface for now. For two reasons, first it makes this PR simpler, and second because it gives us flexibility to change the unintended interface with a little more freedom.
I'd rather add a separate config file. To the extent we can support multiple type checkers, that would be good I think |
I'm quickly closing in on being able to run a moderately strict There are some tools for automatically merging stub files into the source code files which might be worth trying, but I assume they won't work for the |
Whenever I get around to opening that new issue regarding use of star imports and
So if we don't have types for everything exported to begin with, it makes the testing infrastructure much more complex. Right now it's very close to "does everything have a type? Good!" whereas if I start removing things I've already annotated, the tests will need to now know what all of the exceptions are in order to pass (which is doable, e.g. by adding However, a third option of using It's also worth noting that once there's a generalized testing infrastructure in place, it would be pretty straightforward to enforce that new functions must be a.) Added to |
Why would you do that? Anything that would go into |
I knew that that
I think there's a case to be made to first tackle the public interface, and then finalize the typing? From my perspective, I am with @TimothyEDawson in his desire to create a clean public interface. In my own projects, I have used both |
I switched to relative imports and started merging stub files into the source code, and I'm not super happy with the results. Relative imports causes
Putting the type annotations into the source code seems to also switch Mypy into a mode where, for
I'll keep these changes for now and keep working on other aspects, just wanted to leave a note about it here. |
Woops, guess I messed something up. I can see the following error in Build docs:
So I see that my mistake was putting |
Alright, I think the error causing "CI / ubuntu-22.04 with Python 3.10, Numpy latest, Cython ==0.29.31" to fail:
where line 26 in _cantera.pyx is: |
Hi @TimothyEDawson ... while #1947 is probably moot, I wanted to leave a note here.
I'd expect a Mypy option to prevent this also. If there isn't, I'd be 👍 with leaving things in separate Regarding the post-merge tests: it appears to be a single issue with HDF. On your machine, you won't see it unless you have HDF support enabled, but there could be other reasons also. Other than that, could you rebase this PR on the current |
Hey @ischoegl , I will work on rebasing soon, and might squash some of my commits while I'm at it. Regarding #1947 , in my opinion you closed it prematurely - it was the opening to a conversation, not the end of it. At the same time, I wonder if it may be moot for a different reason - if we do utilize stub files with explicit imports (and optionally I've made a lot of progress on adding types directly into the file conversion modules, so hopefully I can get those pushed in the next week or two. In parallel I have been digging into the established testing infrastructure and Github Actions to figure out where the typing tests should be inserted. They're independent of pytest so I'm guessing it should be its own "Python type checking" step after the "Run Python tests", within the Ubuntu, Clang, MacOS, and Windows jobs. Though since type checking should be largely independent of the underlying code, I'm not sure if it really makes sense to add it to all of them. It also requires additional dependencies so I need to make sure I'm handling those properly. I had considered putting it in a standalone type-checking job, but since stubtest does require a built Cantera I'm hesitant to add more builds just for that. (Static type checks using Mypy, Pyright, etc. don't require the code to be compiled). It does matter what version of Python you're running the tests with, too, so I at least need the 3.10 - 3.13 test matrix. |
You may be right, but I currently don't have the bandwidth to go to battle over this. Currently trying to put out several fires at my actual job, and there's no end in sight 😂
We have a couple of jobs that depend on artifacts from a prior job (examples: .NET runners and Python example runners, i.e., the two matrix jobs that are currently skipped). So it may not be as computationally expensive as it appears. The Python version matrix essentially exists ... |
Good point! A setup like run-examples would cover the requirements for stubtest very nicely, I think. Static type checks would ideally be performed independently of code compilation, and perhaps would make more sense in |
ea08ae4
to
4e6c81b
Compare
Hi @TimothyEDawson, now that #1948 has been merged, can you rebase this in light of that PR? Also, can you eliminate all of the changes that are only to the formatting (for example all the automatic conversion of single quotes to double quotes that's presumably being done by Black or some other auto-formatter). This PR is enormous and those changes just make it that much more work to review. |
@speth For sure. It's a holiday weekend and the coming week is super busy for me so I can't promise it'll be done soon, but I should at least be able to do the rebase later today. I would like to note that this is still in draft status to explicitly indicate the PR is not ready for review, as I know there's several items which are still WIP and I don't want to waste anyone's time reviewing things which I already know will change (or waste computational cost from the CI jobs). But I can also appreciate that y'all might want to change the direction of certain changes earlier rather than later, so I welcome comments as they come. |
Adding stubs alongside the Python files to provide type hints for external use. These are intended to cover the public interface, including all dynamically-generated attributes.
This ensures the .pyi files are copied to the build directory by Scons, and subsequently installed to the Python site-packages location. There's probably a cleaner way to do this.
Discovered that `str` is a `Sequence[str]`, so the latter was replaced with `list[str] | tuple[str, ...]` as a workaround. Also need to include the default value to ensure empty calls are correctly type-narrowed, e.g. `Solution.species() -> list[Species]`.
Add type aliases for state setters/getters which deal with array-value inputs (i.e. SolutionArray) and quality (e.g. PureFluid).
4e6c81b
to
37d228e
Compare
Hi @TimothyEDawson ... I wanted to give you a heads-up that we're getting relatively close to a beta release for 3.2. Speaking for myself, I'd like to see the type stubs included but don't know whether you'll be able to push this over the finish line? |
@ischoegl I highly doubt it, as there's really quite a lot more to do. On top of that, I'm going on a week long vacation soon. I'd estimate this won't be ready for review for about a month, and that's assuming I get dedicated time to work on it which I have not had for the past month (the end of the fiscal year is always a particularly busy time). |
Changes proposed in this pull request
There was some discussion about the approach here, but in short I consider this to be a stepping stone toward adding static typing to the Cython interface. Some advantage of starting with stub files include:
Solution
to aQuantity
orSolutionArray
, trivial to type-hint, and may (?) be the correct way to do so.There are some disadvantages, with the largest perhaps being the substantial increase in the amount of code to be maintained going forward. My hope is that the majority of the stub file content is gradually replaced with type hints within the source code, and I'm aware that I already have some redundant type hints in here.
If applicable, fill in the issue number this pull request is fixing
Potentially closes Cantera/enhancements#85.
Checklist
scons build
&scons test
) and unit tests address code coverage