Add numpy array shapes for return types #1325

Merged
merged 12 commits into from
Aug 17, 2025

hamdanal
Contributor

I made these changes:

  • Added array shapes in several places where the shape is known statically
  • Updated existing tests and added missing tests
  • Modified the check function in the tests to validate the shape and dtype of numpy arrays
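
A minimal sketch of what a test helper that validates both shape and dtype could look like. The name `check_array` and its parameters are hypothetical and are not taken from this PR. It checks the number of dimensions rather than exact lengths, on the assumption that a static shape type mostly encodes the rank.

```python
from __future__ import annotations

from typing import Any

import numpy as np


def check_array(actual: Any, ndim: int, dtype: type[np.generic]) -> np.ndarray:
    """Return `actual` unchanged if it is an ndarray with the expected rank and dtype.

    Hypothetical helper for illustration; not the project's actual check function.
    """
    if not isinstance(actual, np.ndarray):
        raise RuntimeError(f"expected numpy.ndarray, got {type(actual).__name__}")
    if actual.ndim != ndim:
        raise RuntimeError(f"expected {ndim} dimension(s), got shape {actual.shape}")
    if not np.issubdtype(actual.dtype, dtype):
        raise RuntimeError(f"expected dtype compatible with {dtype.__name__}, got {actual.dtype}")
    return actual


# A comparison on a 1D integer array yields a 1D boolean array.
mask = check_array(np.array([1, 2, 3]) > 1, ndim=1, dtype=np.bool_)
```

Returning the array lets the helper wrap an expression inside a test while still failing loudly when the runtime value disagrees with the annotated type.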

@hamdanal hamdanal marked this pull request as draft August 15, 2025 18:46
@hamdanal
Contributor Author

The mypy test failures in CI (first couple of commits) are weird. They appear only on Python 3.10 and only in CI. I couldn't reproduce them locally, even with the same versions of Python, mypy, pandas and numpy as in CI. I skip these tests only for mypy and only on Python 3.10.

@hamdanal hamdanal marked this pull request as ready for review August 15, 2025 20:04
Collaborator

@Dr-Irv Dr-Irv left a comment

Thanks for this PR. Very interesting stuff.
I figured out your problem. If you look at my changes in 1db78e8 (and maybe 1db78e8, although I think the first commit I listed has everything), the changes I made will make it work for mypy with Python 3.10 and allow you to use ShapeT in Timestamp.__eq__() and Timestamp.__ne__().

I was able to replicate the python 3.10 problem locally, making it easier to debug.
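
For readers unfamiliar with shape type variables, here is a rough sketch of the idea behind using ShapeT in a comparison signature. The `ShapeT` definition and the `equals_scalar` function below are hypothetical stand-ins and are not the actual pandas-stubs definitions; the point is only that the same type variable appears in the input and output array types, so a type checker can carry the shape through.

```python
from __future__ import annotations

from typing import TypeVar

import numpy as np

# Hypothetical stand-in for the ShapeT type variable mentioned above.
ShapeT = TypeVar("ShapeT", bound=tuple[int, ...])


def equals_scalar(
    arr: np.ndarray[ShapeT, np.dtype[np.datetime64]],
    value: np.datetime64,
) -> np.ndarray[ShapeT, np.dtype[np.bool_]]:
    """Compare elementwise; the annotation says the result keeps the input's shape."""
    return arr == value


dates = np.array(["2025-08-15", "2025-08-17"], dtype="datetime64[D]")
flags = equals_scalar(dates, np.datetime64("2025-08-17"))
```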

Collaborator

@Dr-Irv Dr-Irv left a comment

forgot one comment

@hamdanal
Contributor Author

@Dr-Irv this is ready for another round of review

Collaborator

@Dr-Irv Dr-Irv left a comment

Thanks for the nice work on this and reorganizing the tests in test_scalars.py.

There is one set of tests to add: when we call to_numpy() on an Index or Series, we know that we have a 1D array, so we should test that comparing that 1D array yields a 1D boolean array.
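
A small runtime illustration of the property being asked for, assuming pandas and numpy are installed. This is not the test code from the PR; the variable names are made up for the example, and it only shows what the stubs are expected to describe statically.

```python
import numpy as np
import pandas as pd

# to_numpy() on a Series or an Index gives a one-dimensional array.
series_values = pd.Series([1, 2, 3]).to_numpy()
index_values = pd.Index([10, 20, 30]).to_numpy()

# Comparing a 1D array with a scalar gives a 1D boolean array of the same length.
series_mask = series_values == 2
index_mask = index_values > 10
```

In a stubs test, the same expressions would additionally be checked against their declared static types, so that the 1D shape and boolean dtype are verified by the type checker as well as at runtime.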

@hamdanal
Contributor Author

> So I think you should add a test for the actual result of to_numpy()

@Dr-Irv this is implemented. To get this working, I had to add the numpy dtype to the to_numpy methods of Index and Series subclasses. I couldn't make Index and Series themselves generic over the numpy type (even with a TypeVar default of Any) because doing so causes a lot of tests to fail and makes mypy mad about overload resolution in some methods. Instead, I created @type_check_only subclasses that are themselves generic over the numpy type and made the Index and Series subclasses inherit from them. This worked very well, and the @type_check_only decorator ensures users of pandas cannot import these fake classes.

In the future I may revisit making Index and Series generic over the numpy type but that will take some serious time and debugging to work properly.
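
The workaround described above can be sketched roughly as follows. All class names here (`ToyIndex`, `_ToyIndexWithDtype`, `ToyDatetimeIndex`) are hypothetical and exist only for illustration. In a real stub file the helper class would live only in the `.pyi` and the method bodies would be `...`; this runnable version gives the methods bodies and substitutes a no-op decorator at runtime, because `typing.type_check_only` is visible only to type checkers.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Any, Generic, TypeVar

import numpy as np

if TYPE_CHECKING:
    # Only type checkers can see this name; it does not exist at runtime.
    from typing import type_check_only
else:
    def type_check_only(cls):
        """Runtime placeholder so this sketch can execute."""
        return cls

ScalarT = TypeVar("ScalarT", bound=np.generic)


class ToyIndex:
    """Non-generic base, standing in for Index."""

    def __init__(self, values: list[Any]) -> None:
        self._values = values

    def to_numpy(self) -> np.ndarray:
        return np.asarray(self._values)


@type_check_only
class _ToyIndexWithDtype(ToyIndex, Generic[ScalarT]):
    """Helper that is generic over the numpy scalar type; only to_numpy is narrowed."""

    def to_numpy(self) -> np.ndarray[tuple[int], np.dtype[ScalarT]]:
        return super().to_numpy()


class ToyDatetimeIndex(_ToyIndexWithDtype[np.datetime64]):
    """Concrete subclass whose to_numpy() is typed as a 1D datetime64 array."""


stamps = ToyDatetimeIndex([np.datetime64("2025-08-17")]).to_numpy()
```

The base class stays non-generic, so existing annotations that mention it are unaffected, while each concrete subclass picks up a precise return type through the intermediate helper.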

Collaborator

@Dr-Irv Dr-Irv left a comment

Very close. One small thing in the tests that I think should be changed, but I could be convinced otherwise.

Collaborator

@Dr-Irv Dr-Irv left a comment

Thanks @hamdanal. Nice contribution.

@Dr-Irv Dr-Irv merged commit 3aa4044 into pandas-dev:main Aug 17, 2025
13 checks passed
@hamdanal hamdanal deleted the numpy-shapes branch August 17, 2025 21:18
@hamdanal
Contributor Author

thank you!

Development

Successfully merging this pull request may close these issues.

Shape types for numpy arrays in return types
2 participants