Skip to content

Interrupt production on irreversible block #1569

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jun 26, 2025
Merged

Interrupt production on irreversible block #1569

merged 10 commits into from
Jun 26, 2025

Conversation

heifner
Copy link
Member

@heifner heifner commented Jun 2, 2025

Interrupt production if an irreversible block received on a different branch.

See new test in #1570

Resolves #1492

@heifner heifner requested review from greg7mdp and spoonincode June 2, 2025 17:15
@heifner heifner added the OCI Work exclusive to OCI team label Jun 2, 2025
@heifner heifner modified the milestone: 1492 Jun 2, 2025
@greg7mdp
Copy link
Contributor

greg7mdp commented Jun 6, 2025

Why would head not be a descendant of pending_lib?

@heifner
Copy link
Member Author

heifner commented Jun 6, 2025

Why would head not be a descendant of pending_lib?

b3 <- b4 <-b5 <-b6
Producer A starts producing on b3 because it has not received b4, b5, b6.
Producer A: b3 <- b4'.
Meanwhile the rest of the network votes on and moves LIB to b4.
After starting to produce, producer A receives blocks b4, b5, b6 and moves pending lib to b4.
Before this change, producer A would keep producing all of its blocks b5', b6', etc even though it has received b4, b5, b6 and put them in the forkdb and updated pending_lib.

@greg7mdp
Copy link
Contributor

greg7mdp commented Jun 6, 2025

After starting to produce, producer A receives blocks b4, b5, b6 and moves pending lib to b4.
Before this change, producer A would keep producing all of its blocks b5', b6', etc even though it has received b4, b5, b6 and put them in the forkdb and updated pending_lib.

When pending lib is moved to b4, shouldn't we do a fork switch and update head to b6?
Which would prevent the producer pluging from comtinuing to produce on an obsolete branch

@heifner
Copy link
Member Author

heifner commented Jun 6, 2025

When pending lib is moved to b4, shouldn't we do a fork switch and update head to b6? Which would prevent the producer pluging from comtinuing to produce on an obsolete branch

Yes, which is the point of this PR.

@greg7mdp
Copy link
Contributor

greg7mdp commented Jun 6, 2025

When pending lib is moved to b4, shouldn't we do a fork switch and update head to b6? Which would prevent the producer pluging from comtinuing to produce on an obsolete branch

Yes, which is the point of this PR.

So if we do a fork switch and update head to b6 when pending lib is updated to b4, why check !chain.head_child_of_pending_lib() in producer plugin? Why would it ever be true?

@heifner
Copy link
Member Author

heifner commented Jun 6, 2025

So if we do a fork switch and update head to b6 when pending lib is updated to b4, why check !chain.head_child_of_pending_lib() in producer plugin? Why would it ever be true?

Sorry, I'm not being clear. Before this PR, once block production was started by the producer plugin it would produce its complete round of 12 blocks before evaluating anything in the forkdb. Blocks might be coming into the node and put into the forkdb, but a needed fork switch was not performed until all 12 blocks of the producer was complete. This PR makes the producer plugin checks to see if there is a pending lib greater than head before it produces each block which allows it to switch to that pending lib if available.

Before Savanna, we actually dropped those blocks on the floor so they were not even in the forkdb. Pre-Savanna the longest chain would win which would mean that normally the 12 blocks produced by the producer would win and any dropped blocks would not become final. With Savanna, voting can happen on those late blocks so they can become the canonical chain. In practice, if the producer was running slow or had some hiccup so that everyone else had those blocks then those blocks are going to be the canonical chain.

This PR takes the very conservative approach of only interrupting production on proof of LIB that makes the produced chain impossible to become LIB. We talked about and might in the future switch to a diff fork if we find it is likely to become the canonical chain. For example, if we know that there are not enough outstanding votes to allow our branch to win.

@greg7mdp
Copy link
Contributor

greg7mdp commented Jun 6, 2025

So if we do a fork switch and update head to b6 when pending lib is updated to b4, why check !chain.head_child_of_pending_lib() in producer plugin? Why would it ever be true?

Sorry, I'm not being clear. Before this PR, once block production was started by the producer plugin it would produce its complete round of 12 blocks before evaluating anything in the forkdb. Blocks might be coming into the node and put into the forkdb, but a needed fork switch was not performed until all 12 blocks of the producer was complete. This PR makes the producer plugin checks to see if there is a pending lib greater than head before it produces each block which allows it to switch to that pending lib if available.

Before Savanna, we actually dropped those blocks on the floor so they were not even in the forkdb. Pre-Savanna the longest chain would win which would mean that normally the 12 blocks produced by the producer would win and any dropped blocks would not become final. With Savanna, voting can happen on those late blocks so they can become the canonical chain. In practice, if the producer was running slow or had some hiccup so that everyone else had those blocks then those blocks are going to be the canonical chain.

This PR takes the very conservative approach of only interrupting production on proof of LIB that makes the produced chain impossible to become LIB. We talked about and might in the future switch to a diff fork if we find it is likely to become the canonical chain. For example, if we know that there are not enough outstanding votes to allow our branch to win.

Thanks for the explanation! I finally understand the logic of what we are trying to do, but I find the code hard to follow.
It could probably use a refactor eventually, but it may be risky to change it.

@heifner heifner requested a review from linh2931 June 12, 2025 13:43
Copy link
Member

@linh2931 linh2931 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add a test for the new code logic in producer_plugin?

@heifner heifner merged commit e8952eb into main Jun 26, 2025
36 checks passed
@heifner heifner deleted the GH-1492-produce-lib branch June 26, 2025 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCI Work exclusive to OCI team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Test Failure: restart-scenarios-if-test-none
3 participants