Skip to content

don't allow overriding a namedtuple element #584

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 37 commits into
base: main
Choose a base branch
from

Conversation

vagabond-0
Copy link

@vagabond-0 vagabond-0 commented Jun 27, 2025

resolves #568

This PR adds a check to prevent overriding NamedTuple elements (fields defined on classes directly extending NamedTuple) in subclasses.

Problem
Previously, the type checker did not prevent fields defined in a NamedTuple class from being redefined in subclasses, which is not allowed.

Added a check in the check_class_field_for_override_mismatch function to verify if:

A parent class has NamedTupleMetadata (meaning it extends NamedTuple)
The current field being defined exists in the parent's NamedTuple elements

@facebook-github-bot
Copy link
Contributor

Hi @vagabond-0!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot
Copy link
Contributor

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@yangdanny97 yangdanny97 changed the title done with the text and feature don't allow overriding a namedtuple element Jun 27, 2025
None,
format!("Cannot override named tuple element `{}`", name),
);
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should early return here

@@ -1146,6 +1146,19 @@ impl<'a, Ans: LookupAnswer> AnswersSolver<'a, Ans> {

for (parent, parent_metadata) in parents {
parent_has_any = parent_has_any || parent_metadata.has_base_any();
//don't allow overriding a namedtuple element
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: space after //

@yangdanny97
Copy link
Contributor

Thanks @vagabond-0 ! Could you please run test.py and commit any generated changes?

@vagabond-0
Copy link
Author

image
Test generated by running test.py

@yangdanny97
Copy link
Contributor

@vagabond-0 please rebase. That test issue was fixed in #576

yangdanny97 and others added 19 commits June 27, 2025 22:48
Summary: Make it clear that GetAttr and NotFound are handled the same way for attribute deletion/setting.

Reviewed By: ndmitchell

Differential Revision: D77329051

fbshipit-source-id: a1e80c1454d509d3ae8b876dd89c0de0701a569b
Summary: With only one additional word we can avoid most of the Box values in BindingExpect. I hope that should reduce memory allocation.

Reviewed By: SamChou19815

Differential Revision: D77372215

fbshipit-source-id: b3a90a3a58ab49bb13db9ad0643b3e218fdcf737
Summary: Means we allocate less. Also less `box` matching to unpack.

Reviewed By: SamChou19815

Differential Revision: D77372213

fbshipit-source-id: c4e82c7c58fcd9c085440dc3d9d1b2b90cb8332c
Summary:
## Description

This PR allows for field-level overrides to set `kw_only=False` for a dataclass when the class has `kw_only` set to `True`.

## Relevant Issue
Closes facebook#494.

## Testing Instructions

To test this PR manually, create a Python script like
```
from dataclasses import dataclass, field

dataclass(kw_only=True)
class SomeClass:
    x: int = field(kw_only=False)

SomeClass(x=1)
SomeClass(1)

```
and run `pyrefly check`. Verify that no errors are generated.

Note: this PR does not handle cases where there are mixed positional and keyword-only arguments; that will be handled in a follow-up PR (related issue: facebook#493).

Pull Request resolved: facebook#566

Test Plan:
Imported from GitHub, without a `Test Plan:` line.

Rollback Plan:

Reviewed By: ndmitchell

Differential Revision: D77305257

Pulled By: yangdanny97

fbshipit-source-id: 7dcc8d49b9d25a35f4479fc5cca9c89b834f79ce
Summary:
I believe that deterministic cycle breaking will require the ability to use some strategy
other than immediately recording a recursive placeholder result when a cylce is detected.

With this in mind, I want to work on decoupling the detection of a cycle versus the
recording of a recursive placeholder to break it.

This diff does so while keeping the same public interface; the `calculate` and
`calculate_with_recursive` methods are now built out of simpler APIs that I believe
can be extended to support deterministic cycle breaking.

Later in the stack I expect to need a (slightly more powerful variation of) the new
interface in `get_idx`.

----------

Edit: just pushing down notes from the top of the stack: implementing this on Pyrefly
today appears to roughly halve the nondeterminism on an example project. It's known
that we still leak recursive `Var`s in ways prone to data races; my current theory is that
most of the remaining nondeterminism comes from that; any reasonable solution I've been
able to think of so far would build on top of cycle-breaking logic.

Reviewed By: ndmitchell

Differential Revision: D76948138

fbshipit-source-id: 6e74ff378f32aaa22fc0d972b1afcffb55e9691e
Summary:
I need more flexibility eventually, which means this pre-boxed-up API isn't
quite what I want long term.

It's also actually slightly simpler now in my opinion, since there are no more
callbacks to deal with.

The `calculate` helper is still present, because it's useful in much simpler
environments where we don't need complex cycle logic (we currently rely on it
when computing `Definitions`, before bindings.

Reviewed By: ndmitchell

Differential Revision: D76948123

fbshipit-source-id: 851bbc613c089c5301bde60b217e261a9f424822
Summary:
I'm preparing to experiment with deterministic cycle breaking, which I think
we might be able to just implement faster than we can debug nondeterminism in
some of the target projects we want to roll out to.

At this stage I don't care what the ordering actually is, but I need it
to exist - the idea is that to deterministically break cycles, I need a way
to pick an element of the cycle at which to break, and the obvious way to
implement this is to use an ordering and pick the minimal element.

This approach should work at least okay, I think:
- we use the order of variants in `AnyIdx`; there's probably an optimal ordering
  for these in cases where a cycle involves more than one kind of `K::Value`,
  but I'm not sure what it is, and it's easy to experiment later... all we have
  to do is re-order the variants in the definition of `AnyIdx`.
- within any given `K` type, we'll go in the order that the indices were created,
  which I think is almost certainly the best possible order. Some specific cases:
  - It's a great order for loops, because the Phi always preceeds in-loop bindings,
    this is exactly what we want, the Phi is the *only* acceptable place to break
    the cycle because that's the only binding with a `Default` (it's a pretty
    clear shortcoming of the existing setup that we can break somewhere else
    and wind up ignoring the default, I'd like any deterministic algorithm to fix
    that issue, and this order will do so).
  - In general, it's going to usually cause the first AST node visited to be the
    preferred place for cycle breaking, which (a) is reasonably predictable,
    which is good both for our own debugging and for end users, and (b) should
    be *mostly* consistent with what we get today in the absence of data races,
    because we solve the bindings in order.
- I break ties by sorting using lexicographically order of module names. My
  rational for this (which is only relevant for inter-module cycles) is:
  - in most cases, it probably doesn't matter much where we break; we force all
    Vars at module boundaries, so *usually* things will be reasonably sanitized
    anyway
  - it's more efficient to sort based on the idx first, for two reasons:
    - First, a lot of cycles are going to be single-module, and the module
      comparison would be useless in that case
    - Second, even in an inter-module cycle, comparing the AnyIdx is much
      cheaper, and ties should be very rare so we'll almost never compare strings
  - if it turns out that the order of preference for `K::Value` kinds matters
    for cycle semantics, we almost certainly want to sort by that with higher
    preference than module.

Reviewed By: ndmitchell

Differential Revision: D76948122

fbshipit-source-id: 2be03481cb7fa8cee69d98ee21038308de813c7e
Summary:
This diff doesn't actually do anything, but I wire up the plumbing
for `AnswersSolver` to be aware of cycle-breaking.

The idea here is that solving cyclic graphs involves breaking cycles,
and the cycles form a stack because you can encounter a cycle while
solving a different cycle. Breaking cycles deterministically will
require doing more than we do to day, and actually tracking some
data as we resolve the cycle - we'll use the `Cycle` data structure,
which at the moment is just a unit type, for that.

Cycles are thread-local, so the cycles get passed around and created
in exactly the same places as the calculation stack; we might want to
eventually coalesce these into some kind of "thread state" data structure,
but I'd rather delay that kind of bikeshedding until after the core
algorithm is working.

Reviewed By: ndmitchell

Differential Revision: D76948125

fbshipit-source-id: 55bd1b10ef4bc9e6ea7ab78b9f4d7c80e69ef362
Summary:
Deterministic cycle breaking will require this, so let's move it out
of `debugging.rs`

Reviewed By: ndmitchell

Differential Revision: D76948139

fbshipit-source-id: 888faefc096564811606ae543343d40066bf6fd9
Summary:
Now that I'm looking at doing something other than just debug logging
with this, it becomes somewhat important to know at the type level that
the cycle is never empty.

Reviewed By: ndmitchell

Differential Revision: D76948124

fbshipit-source-id: 3a820ccbcc51b3c519feea3c0af05c21f59520a9
Summary:
The `answers.rs` module is getting too busy with cycle breaking work
in progress. I was going to put off factoring things out becasue it's
a bit bikesheddy, but I realized that I'll probably be slower actually
implementing things if I don't isolate the tricky stuff better.

The easiest way to break up `answers` is to move the AnswerSolver out;
the main blocker for that is the use of direct struct construction.

Reviewed By: ndmitchell

Differential Revision: D76948126

fbshipit-source-id: dbfd922cccdac89da3bc7d4c31d3651050d42ccf
Summary:
Having all the Answers and Solutions logic - which mostly
has very little to do with solving answers, with a few
exceptions - living in the same module as the core of the
answers solver was making it harder for me to add the extra
state needed for cycle breaking.

I think this separation of concerns is probably good hygiene
anyway, but my main motivation is that it will make adding more
logic to `Cycles` and `get_idx` a lot more manageable.

Reviewed By: ndmitchell

Differential Revision: D76948129

fbshipit-source-id: aa999afa91323122cb40c98346fc097810d7b47d
Reviewed By: ndmitchell

Differential Revision: D76948130

fbshipit-source-id: 700d154c4a74d41394cb73410244f5014e02f2d2
Summary: Not yet used; we'll call this when we detect a cycle.

Reviewed By: ndmitchell

Differential Revision: D76948131

fbshipit-source-id: d439ebd750899b1fe1bc91a7c223af179234b336
Summary:
We'll need a pre-calculation check for deterministic cycle breaking because we
need to know when we are recursing from a cycle detection point toward the
break-at point: all the intervening calculations have to bump their per-thread
recursion limit.

We'll need a post-calculation check for two reasons:
- In the simplest version, we'll need to know when we finished the cycle, so
  we can pop it from the cycle stack.
- Eventually, we will also need some extra logic for all the intervening
  calculations to avoid writing answers that contain unpinned `Var`s (without
  this, we'll still be exposed to data races).

Even thought the uses of the two checks differ, the same enum works pretty
well for both, because there are always three cases:
- There's no cycle to worry about, just do default things
- We're a participant in the cycle but not the break point, do whatever that entails
- We're at the breakpoint, do whatever that entails

Reviewed By: ndmitchell

Differential Revision: D76948141

fbshipit-source-id: 0592d18848453353cbfe0204fa9db38dc86648b6
Summary:
We used to only use it for the calculation stack so we created
it inside `push`, but we're about to need it for other things.

Reviewed By: ndmitchell

Differential Revision: D76948128

fbshipit-source-id: 5d9b77a6a111cfee1d7749ce7081c3fbd7440160
Summary:
In order to implement deterministic cycle breaks, I'm going to wind up needing
a double match, where the exact logic will depend on some combination of the
cycle state of the current idx and the calculation state.

It's a little hard to see how to do that well without rearranging some of the
fiddly logic more clearly - this diff mostly leaves things as they were but
moves the fiddling into helpers.

Reviewed By: ndmitchell

Differential Revision: D76948134

fbshipit-source-id: f533c3323a2724030385e1fa1084003f85fd07a1
Summary:
A lot of the complexity in the existing logic came from making every branch of
a match return a type complicated enough to control another match because we were
trying to calculate or fetch the result and then record recursive correspondence
independently.

But these two operations aren't actually independent - the only scenario where we would
record recursive is when we've just stored an answer. Inlining that makes things
a bit simpler, the types make more sense and we get rid of one of the blocks in
`get_idx`.

Reviewed By: ndmitchell

Differential Revision: D76948142

fbshipit-source-id: 34fe1a0082805a151ed0adec5035bc79a946a6d0
Summary:
Eliminate another unnecessary result type and match; now the body of `get_idx` is mostly
just one big match. This is a big help since I need to figure out how to add some complexity
while keeping it readable.

Reviewed By: ndmitchell

Differential Revision: D76948135

fbshipit-source-id: ec8dac7c1cef41cd4950e06dfaa669b7fd0dc38d
stroxler and others added 16 commits June 27, 2025 22:48
Summary:
Trying to keep the upcoming diff where I implement cycle detection
as clean as possible; I realized a couple methods are cleaner if we
can pass `CalcId` by reference, and we still need an api to pop the
cycle when we finish it.

Reviewed By: ndmitchell

Differential Revision: D76948133

fbshipit-source-id: f2e42dcf3b52e441ca8ad47192274a47d519a02b
Summary: Otherwise windows users generate them with `\r\n`

Reviewed By: ndmitchell

Differential Revision: D77378036

fbshipit-source-id: f53556dafc61dedb038762ef8fe1f7ab19084551
Summary:
reported [here](https://discord.com/channels/1319086885024567347/1387701966527795200/1387814602158506124)

we might also want to allow auto-imports of builtins < 3 characters

Reviewed By: SamChou19815

Differential Revision: D77379581

fbshipit-source-id: b808302e6c5fb7124b9a75339fcdf7956a217647
Reviewed By: grievejia, yangdanny97

Differential Revision: D77389555

fbshipit-source-id: 0d9b7ee311d0db4c2482a3f38d18769adae66479
Summary:
Pull Request resolved: facebook#576

This code finds module prefixes in the order they appear on disk. Often files written in one order will read in that order, but there is no guarantee, and it's file system dependent. To make sure these tests are deterministic on all file systems, sort the result.

Note that before the result is used in LSP we sort it anyway, so it doesn't need sorting in the real method.

Reviewed By: grievejia

Differential Revision: D77373942

fbshipit-source-id: fc333f02c1c4f48aaef81c3032ae5be051f8936b
Summary: Nicer API.

Reviewed By: SamChou19815

Differential Revision: D77389554

fbshipit-source-id: ca9cfa021d03b6b5fdf3146ac3f7d71601165fcc
Summary: We always make this `Var` or `()`. So might as well simplify and avoid a lot of boilerplate per key type.

Reviewed By: rchen152

Differential Revision: D77391369

fbshipit-source-id: e3f8b8064ceb5fa64e47953f60ce9e24a5882bff
Summary: Just an alias for Keyed.

Reviewed By: rchen152

Differential Revision: D77391368

fbshipit-source-id: 76173bc48de5a53999e4013939ad1026ad8a90d0
Summary:
We are adding a new Partial type that wraps TypedDicts to be able to type "udpate". For context, see D77181950.

After constructing the partial type when synthesizing an "update" method, we want to propagate the type hint to the expr infer stage.

However, to do so, we must add some special handling here https://www.internalfb.com/code/fbsource/[b4a4e4f927f1]/fbcode/pyrefly/pyrefly/lib/alt/call.rs?lines=663

to do this, we want to know that the method we are looking at at this point in the code, is an update method belonging to a TypedDict.

This diff allows the type hint to propagate.

Reviewed By: rchen152

Differential Revision: D77336201

fbshipit-source-id: c641a0c2b09bd3cf05e5a65a810c0ffe1bc2f1b7
…ook#577)

Summary:
Pull Request resolved: facebook#577

fixes facebook#554

This diff changes the magic dunder attr lookup to work the same as other attribute lookups, allowing it to handle bounded type vars

Reviewed By: grievejia

Differential Revision: D77377201

fbshipit-source-id: 0b5ce74dded55bd5775a6b7e64c217779c9e7558
Summary:
This PR completes the architectural refactoring to provide more descriptive errors for assignments to read-only fields, addressing issue facebook#521. The core change replaces the simple readonly: bool flag with a more expressive readonly: Option<ReadOnlyReason> type across the checker.

Pull Request resolved: facebook#570

Test Plan:
test.py

Rollback Plan:

Reviewed By: rchen152

Differential Revision: D77376730

Pulled By: yangdanny97

fbshipit-source-id: 2878b38db1999379210f628bc60bf0ed46b65331
Summary:
Pull Request resolved: facebook#565

This diff adds 2 workflows to our github CI.
1. runs mypy primer
2. runs when 1 completes, downloads the artifacts, and comments with the error difference

This is more or less copied from https://github.com/python/mypy/blob/master/.github/workflows/mypy_primer.yml and https://github.com/python/mypy/blob/master/.github/workflows/mypy_primer_comment.yml

Reviewed By: ndmitchell

Differential Revision: D77228326

fbshipit-source-id: 8fc9cb2eaad652010ebf6fb8e0339da2b35ab7e9
Summary: I think python/typeshed#14347 is the fix to the stdlib, have sent that upstream.

Reviewed By: migeed-z

Differential Revision: D77434542

fbshipit-source-id: a1f4dadebb2fdaa359577ed885d7cdd9bf7e4686
Summary:
# Problem
While Xavi was working on his intern project, he was running into issues where a workspace configuration change wasn't being reflected in the Python configuration being used.

After looking into it, we realized configurations weren't being reloaded on a workspace change. This shouldn't be happening, and in any project with an on-disk configuration, we couldn't replicate the issue. `State::invalidate_config()` should be invalidating the caches, and we saw that the file was being iterated over, so we deduced the config wasn't being reloaded.

It turns out that there are two caches in `standard_config_finder` that aren't cleared on `State::invalidate_config`. We were able to figure out that for any projects without a config file, the caches there wouldn't regenerated the config.

## Rationale

This makes sense for CLI checks, since once we're able to generate a synthetic config, it doesn't ever change unless a config file is added on disk, which we pick up properly.

Pyrefly's LSP has extra hooks in the `configure()` functions that update based on state changes from VSCode, so this approach doesn't work for projects without configs.

# Solution

The solution I've come up with so far is to pass `standard_config_finder`'s caches through to the `ConfigFinder`, and have `ConfigFinder::clear()` handle clearing them for us.

## Alternative Approaches

I really wanted to instead turn the finding functionality of the `ConfigFinder` into a trait, but it turns out we can't do that with the generic functionality `ConfigFinder` is designed around right now. There might be a way to do this that I'm just too Rust-novice to know about right now, so open to alternatives. For now, we do this with an extra `clear_extra_caches` function that's passed a long with references to the caches the `standard_config_finder` creates.

The other approaches I've considered as well are:
1. We could get rid of the generic functionality there, since it's unlikely we'll be loading other config types. Then we could move `before`, `fallback`, and `clear` into the trait and have `StandardConfigFinder` be an implementation of that trait.
2. Instead of passing a clear caches function to `ConfigFinder`, we could have the caches be static mutexes, which we only clear in the IDE, but then the IDE's implementation is tied to `standard_config_finder`, which is probably not a great approach. The good part about this though is that CLI checks won't ever clear those caches, so we won't redo the work there where it won't ever be needed.

Any thoughts on the approach? Any alternatives we should try before committing to this?

Reviewed By: ndmitchell

Differential Revision: D77417749

fbshipit-source-id: 8e7a37449a105b0bb2aa7f5e70dea16760a938f6
@vagabond-0
Copy link
Author

vagabond-0 commented Jun 27, 2025

@yangdanny97 Rebased.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[feature] don't allow overriding a namedtuple element
10 participants