Skip to content

Record project deletions in ProjectStateRegistry #130225

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

pxsalehi
Copy link
Member

@pxsalehi pxsalehi commented Jun 27, 2025

This will be later used to update the Stateless lease blob, as a project deletion is a notable event that that our stateless cluster consistency check should consider before acknowledging writes.

Relates ES-11207

@pxsalehi pxsalehi added >non-issue :Distributed Coordination/Distributed A catch all label for anything in the Distributed Coordination area. Please avoid if you can. labels Jun 27, 2025
import java.io.IOException;
import java.util.stream.IntStream;

public class ProjectStateRegistrySerializationTests extends SimpleDiffableWireSerializationTestCase<ClusterState.Custom> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was no serialization test for this, so I added this.


private final Map<ProjectId, Settings> projectsSettings;
// Projects that have been marked for deletion based on their file-based setting
private final Set<ProjectId> projectsMarkedForDeletion;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the file-based deletion trigger is there, it should update this.

// Projects that have been marked for deletion based on their file-based setting
private final Set<ProjectId> projectsMarkedForDeletion;
// A counter that is incremented each time one or more projects are marked for deletion.
private final long projectsMarkedForDeletionGeneration;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very similar to DiscoveryNodes#getNodeLeftGeneration but for project deletions.

@pxsalehi pxsalehi marked this pull request as ready for review June 27, 2025 15:53
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@elasticsearchmachine elasticsearchmachine added the Team:Distributed Coordination Meta label for Distributed Coordination team label Jun 27, 2025
Copy link
Member

@ywangd ywangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. I left some comments.

Comment on lines +144 to +157
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o instanceof ProjectStateRegistry == false) return false;
ProjectStateRegistry that = (ProjectStateRegistry) o;
return projectsMarkedForDeletionGeneration == that.projectsMarkedForDeletionGeneration
&& Objects.equals(projectsSettings, that.projectsSettings)
&& Objects.equals(projectsMarkedForDeletion, that.projectsMarkedForDeletion);
}

@Override
public int hashCode() {
return Objects.hash(projectsSettings, projectsMarkedForDeletion, projectsMarkedForDeletionGeneration);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of the objects in cluster state compares equality by identity, e.g. DiscoveryNodes, so that it is a bit more performant when there is no change which is often the case. So I'd prefer we use identity equality here if these methods are only used for tests and use a wrapper class in tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasted a bit of time trying to make it happen before realizing how ugly it can get since this is a bit more than ClusterBlocks and would require more hacking. In fact, SnapshotsInProgress which is a more similar case to this (also a custom) and does have serialization tests, also does what I've done here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(same with HealthMetadata, RestoreInProgress, to name a few other customs)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. There is indeed inconsistency in how we handle equals for cluster state objects. Since this object is linear to number of projects, the upper-bound should not be too high for performing a non-identity equals. We can also revisit it if it proves to be a problem.

Copy link
Member

@ywangd ywangd Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we make this class support the more granular diff than the generic CompleteNameDiff. The identity equality will be much less interesting since I think creating diff is one place equals is called the most often. That is the case for SnapshotsInProgress, but not RestoreInProgress, again inconsistency. In any case, it's not for this PR.

return new ProjectStateRegistry(
projectsSettings.build(),
projectsMarkedForDeletion,
newProjectMarkedForDeletion ? projectsMarkedForDeletionGeneration + 1 : projectsMarkedForDeletionGeneration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to reset this generation when cluster term changes similar to nodeLeftGeneration as stated on the original draft PR? But I don't see how we can reset it?

Copy link
Member Author

@pxsalehi pxsalehi Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have to. AFAICT, with nodeLeftGeneration this can happen naturally since we build a new DiscoveryNodes when master changes. For project deletions that's not the case. We don't need it either. When comparing leases, term and nodeLeftGeneration are compared first and only if those two are equal we would consider deletions. Point being that the consistency check would have to wait anyway if term is different. Admittedly I think that means whether we rest or not is not important, so resetting is not a bad idea. But since there is no overflow concern here, I'd say we can leave it since as you mentioned doing it would be a bit convoluted probably. We can certainly revisit this, but for now this is how I see it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I also think it's safe to not reset. Overflow does not seem to be a conern either. I do wonder why we reset nodeLeftGeneration. But we can always revisit this if necessary like you suggested.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

conceptually it makes sense there and it is possible to do it easier (code-wise) underneath when we build a DiscoveryNodes. If it is not needed, convenient to do, and it makes sense only within a term, why not reset it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(that's how I see it at least after staring at it for a while!)

@pxsalehi pxsalehi requested a review from ywangd July 1, 2025 10:23
}

private ProjectStateRegistry mutate(ProjectStateRegistry instance) {
if (randomBoolean() && instance.size() > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have separate tests for different scenarios instead of a single test that changes behavior based on randomBoolean?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically yes, but this is a very common pattern in serialization tests. unless there is a certain combinations that you think is not covered? Is there something in particular that you have in mind?

@pxsalehi pxsalehi requested a review from alexey-ivanov-es July 1, 2025 16:12
Copy link
Member

@ywangd ywangd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines +144 to +157
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o instanceof ProjectStateRegistry == false) return false;
ProjectStateRegistry that = (ProjectStateRegistry) o;
return projectsMarkedForDeletionGeneration == that.projectsMarkedForDeletionGeneration
&& Objects.equals(projectsSettings, that.projectsSettings)
&& Objects.equals(projectsMarkedForDeletion, that.projectsMarkedForDeletion);
}

@Override
public int hashCode() {
return Objects.hash(projectsSettings, projectsMarkedForDeletion, projectsMarkedForDeletionGeneration);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. There is indeed inconsistency in how we handle equals for cluster state objects. Since this object is linear to number of projects, the upper-bound should not be too high for performing a non-identity equals. We can also revisit it if it proves to be a problem.

return new ProjectStateRegistry(projectsSettings.build());
final var unknownButUnderDeletion = Sets.difference(projectsMarkedForDeletion, projectsSettings.keys());
if (unknownButUnderDeletion.isEmpty() == false) {
assert unknownButUnderDeletion.stream().noneMatch(projectsSettings::containsKey);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I had in mind is something like assert false : "Cannot mark projects ...". On a second thought, an invalid file-based configuration can actually trigger it so that it is not only a programming error. Therefore, I think we don't really want the assertion that I had mind. We also don't need this assertion here since it just asserts the correctness of the above Sets.difference call. Sorry for the back and forth.

return new ProjectStateRegistry(
projectsSettings.build(),
projectsMarkedForDeletion,
newProjectMarkedForDeletion ? projectsMarkedForDeletionGeneration + 1 : projectsMarkedForDeletionGeneration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I also think it's safe to not reset. Overflow does not seem to be a conern either. I do wonder why we reset nodeLeftGeneration. But we can always revisit this if necessary like you suggested.

@pxsalehi pxsalehi enabled auto-merge (squash) July 2, 2025 08:41
@pxsalehi pxsalehi merged commit 54c9db9 into elastic:main Jul 2, 2025
32 checks passed
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jul 3, 2025
Project deletions is used to update the Stateless lease blob, as a project deletion 
is a notable event that that our stateless cluster consistency check should consider 
before acknowledging writes. 

Relates ES-11207
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Distributed A catch all label for anything in the Distributed Coordination area. Please avoid if you can. >non-issue Team:Distributed Coordination Meta label for Distributed Coordination team v9.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants