fix overload defaults for `as_index` in `groupby` #1321

MarcoGorelli · 2025-08-13T13:14:11Z

These overloads currently show as_index: Literal[False] = ..., but that's not quite correct because the default is True

https://github.com/pandas-dev/pandas/blob/9597c0397962b228f00805e0750b91d0e5272ce9/pandas/core/frame.py#L9375

If as_index: Literal[False] were to be used instead, then mypy would complain that a non-default parameter (as_index) follows a default one (level / axis). Fixing this would require making an extra overload: one for when as_index is passed positionally and one for when it's passed by name

Given that:

- groupby already has 16 (!) overloads
- as_index is boolean, so arguably it's a best practice to pass it by name anyway

Is it OK to just type it if it's passed by name? The alternative would be to bring the number of overloads to 24 🤯. That's certainly doable, but my preference would be to keep the count down and encourage passing as_index by name.
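To illustrate the constraint mentioned above: a parameter without a default cannot follow parameters that have one, unless it is made keyword-only. The snippet below is only a sketch that checks this with Python's own parser; `is_valid_def` and the trimmed-down `groupby` signatures are hypothetical and are not taken from pandas or pandas-stubs.

```python
def is_valid_def(source: str) -> bool:
    """Return True if `source` compiles as Python code (hypothetical helper)."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


# as_index has no default but follows `level`, which has one: rejected.
positional = "def groupby(by, level=None, as_index): ..."

# Making as_index keyword-only (after the bare `*`) lifts the restriction.
keyword_only = "def groupby(by, level=None, *, as_index): ..."

print(is_valid_def(positional))
print(is_valid_def(keyword_only))
```

This is why typing `as_index: Literal[False]` without a default is straightforward only when the argument is passed by name.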

Dr-Irv · 2025-08-13T13:58:50Z

I'm not OK with this change for a couple of reasons.

1. If we require as_index to be a named argument, then it will be incompatible with the implementation, which allows as_index to be positional. So a change should be made in pandas first before we make the change in the stubs.
2. I think the solution to make True the default is to flip the pairs of overloads. E.g., right now, we have:

    @overload
    def groupby(  # pyright: ignore reportOverlappingOverload
        self,
        by: Scalar,
        axis: AxisIndex | _NoDefaultDoNotUse = ...,
        level: IndexLabel | None = ...,
        as_index: Literal[True] = True,
        sort: _bool = ...,
        group_keys: _bool = ...,
        observed: _bool | _NoDefaultDoNotUse = ...,
        dropna: _bool = ...,
    ) -> DataFrameGroupBy[Scalar, Literal[True]]: ...
    @overload
    def groupby(
        self,
        by: Scalar,
        axis: AxisIndex | _NoDefaultDoNotUse = ...,
        level: IndexLabel | None = ...,
        as_index: Literal[False] = ...,
        sort: _bool = ...,
        group_keys: _bool = ...,
        observed: _bool | _NoDefaultDoNotUse = ...,
        dropna: _bool = ...,
    ) -> DataFrameGroupBy[Scalar, Literal[False]]: ...

I think the right thing to do is to flip the order:

    @overload
    def groupby(
        self,
        by: Scalar,
        axis: AxisIndex | _NoDefaultDoNotUse = ...,
        level: IndexLabel | None = ...,
        as_index: Literal[False] = False,
        sort: _bool = ...,
        group_keys: _bool = ...,
        observed: _bool | _NoDefaultDoNotUse = ...,
        dropna: _bool = ...,
    ) -> DataFrameGroupBy[Scalar, Literal[False]]: ...
    @overload
    def groupby(  # pyright: ignore reportOverlappingOverload
        self,
        by: Scalar,
        axis: AxisIndex | _NoDefaultDoNotUse = ...,
        level: IndexLabel | None = ...,
        as_index: Literal[True] = True,
        sort: _bool = ...,
        group_keys: _bool = ...,
        observed: _bool | _NoDefaultDoNotUse = ...,
        dropna: _bool = ...,
    ) -> DataFrameGroupBy[Scalar, Literal[True]]: ...

So if False is specified, it will match first. Otherwise, True is assumed.

Can you try that instead?

MarcoGorelli · 2025-08-13T14:10:48Z

The issue with that is that if as_index isn't specified, then it'll match the first overload, whereas it should match the second one

Perhaps 24 overloads isn't that bad? Or should I first deprecate non-keyword arguments except level in pandas itself?
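The first-match problem described above can be reproduced with a reduced pair of overloads. This is a sketch only: `FrameSketch` and `GroupBySketch` are hypothetical stand-ins for `DataFrame` and `DataFrameGroupBy`, and the signature is cut down to three parameters.

```python
from typing import Generic, Literal, Optional, TypeVar, overload

T = TypeVar("T")


class GroupBySketch(Generic[T]):
    """Hypothetical stand-in for DataFrameGroupBy; stores the as_index flag."""

    def __init__(self, as_index: bool) -> None:
        self.as_index = as_index


class FrameSketch:
    """Hypothetical stand-in for DataFrame with the overloads in flipped order."""

    # Type checkers may also report these two signatures as overlapping.
    @overload
    def groupby(
        self,
        by: str,
        level: Optional[int] = ...,
        as_index: Literal[False] = ...,
    ) -> GroupBySketch[Literal[False]]: ...
    @overload
    def groupby(
        self,
        by: str,
        level: Optional[int] = ...,
        as_index: Literal[True] = ...,
    ) -> GroupBySketch[Literal[True]]: ...
    def groupby(self, by, level=None, as_index=True):
        # Runtime default is True, mirroring the pandas implementation.
        return GroupBySketch(as_index)


# as_index is omitted, so both overloads accept the call and a type checker
# takes the first one: the static type is GroupBySketch[Literal[False]],
# even though the runtime value of as_index is True.
result = FrameSketch().groupby("a")
print(result.as_index)
```

Because every parameter after `by` has a default in both overloads, nothing distinguishes them when as_index is left out, so whichever overload is listed first determines the inferred type.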

Dr-Irv · 2025-08-13T14:52:54Z

> The issue with that is that if as_index isn't specified, then it'll match the first overload, whereas it should match the second one

Right. I see your point.

> Perhaps 24 overloads isn't that bad? Or should I first deprecate non-keyword arguments except level in pandas itself?

I'm OK with 24 overloads, although I think we should deprecate non-keyword arguments except level in pandas itself first.
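For reference, the 24-overload option would presumably split each as_index=False overload into a positional and a keyword variant. Assuming the current 16 overloads are eight True/False pairs, that gives 8 + 2 × 8 = 24. The sketch below shows the pattern for a single `by` type; `FrameSketch` and `GroupBySketch` are hypothetical stand-ins, and the parameter list is shortened.

```python
from typing import Generic, Literal, Optional, TypeVar, overload

T = TypeVar("T")


class GroupBySketch(Generic[T]):
    """Hypothetical stand-in for DataFrameGroupBy; stores the as_index flag."""

    def __init__(self, as_index: bool) -> None:
        self.as_index = as_index


class FrameSketch:
    """Hypothetical stand-in for DataFrame; three overloads per `by` type."""

    # Default case: as_index omitted or True.
    @overload
    def groupby(
        self,
        by: str,
        level: Optional[int] = ...,
        as_index: Literal[True] = True,
        sort: bool = ...,
    ) -> GroupBySketch[Literal[True]]: ...
    # as_index=False passed positionally, so `level` must be given as well.
    @overload
    def groupby(
        self,
        by: str,
        level: Optional[int],
        as_index: Literal[False],
        sort: bool = ...,
    ) -> GroupBySketch[Literal[False]]: ...
    # as_index=False passed by name.
    @overload
    def groupby(
        self,
        by: str,
        level: Optional[int] = ...,
        *,
        as_index: Literal[False],
        sort: bool = ...,
    ) -> GroupBySketch[Literal[False]]: ...
    def groupby(self, by, level=None, as_index=True, sort=True):
        return GroupBySketch(as_index)


df = FrameSketch()
print(df.groupby("a").as_index)
print(df.groupby("a", None, False).as_index)
print(df.groupby("a", as_index=False).as_index)
```

Dropping the middle overload gives the keyword-only variant discussed earlier in the thread, at the cost of leaving positional `as_index=False` calls untyped.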

fix overload defaults for as_index in groupby

c362a3e

MarcoGorelli marked this pull request as draft August 13, 2025 18:17
