-
Notifications
You must be signed in to change notification settings - Fork 304
firedancer-dev: repair profiler #5687
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@@ -318,7 +318,7 @@ publish_stake_weights( fd_replay_tile_ctx_t * ctx, | |||
if( epoch_stakes_root!=NULL ) { | |||
ulong * stake_weights_msg = fd_chunk_to_laddr( ctx->stake_out->mem, ctx->stake_out->chunk ); | |||
ulong epoch = fd_slot_to_leader_schedule_epoch( epoch_schedule, fd_bank_slot_get( slot_ctx->bank ) ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think based on fd_bank.h and comments in fd_runtime.c
Assuming we are executing in epoch E
next_epoch_stakes (E - 1)
epoch_stakes (E - 2)
stakes->vote_accounts (E)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't look right to me.
To compute leader schedule in epoch E, we need the list of (identity, stake) at E-1 -- and soon (vote, stake, identity). And similarly for epoch E-1 we need the data at E-2.
We do store next_epoch_stakes(E-1) and epoch_stakes(E-2).
But using vote_accounts(E) is incorrect. We need to use the correct vote accounts at epoch boundaries, that are part of the manifest/bank.epoch_stakes (or bank.versioned_epoch_stakes -- I've been told these are the same but I haven't checked yet. Agave seems to be switching to versioned in... 2.4? not sure about this either).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use epoch_stakes
(E-2), which is what I think stake_weights
is supposed to represent in this message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can just rename this stuff to be the same and add copious documentation in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is some documentation at the top of fd_runtime_process_new_epoch
:
/* Starting a new epoch.
New epoch: T
Just ended epoch: T-1
Epoch before: T-2
In this function:
- stakes in T-2 (epoch_stakes) should be replaced by T-1 (next_epoch_stakes)
- stakes at T-1 (next_epoch_stakes) should be replaced by updated stakes at T (stakes->vote_accounts)
- leader schedule should be calculated using new T-2 stakes (epoch_stakes)
Invariant during an epoch T:
next_epoch_stakes holds the stakes at T-1
epoch_stakes holds the stakes at T-2
*/
Agree we should make this clearer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stake weight changes look good to me
4d17463
to
93a3c72
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice! this will be super useful for us
struct fd_stake_out_link { | ||
ulong idx; | ||
fd_frag_meta_t * mcache; | ||
ulong * sync; | ||
ulong depth; | ||
ulong seq; | ||
fd_wksp_t * mem; | ||
ulong chunk0; | ||
ulong wmark; | ||
ulong chunk; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align
FD_FN_CONST static inline ulong | ||
manifest_bank_align( void ) { | ||
return 128UL; | ||
} | ||
|
||
FD_FN_CONST static inline ulong | ||
manifest_bank_footprint( void ) { | ||
return 2UL*1024UL*1024UL*1024UL; | ||
} | ||
|
||
FD_FN_CONST static inline ulong | ||
manifest_spad_align( void ) { | ||
return 128UL; | ||
} | ||
|
||
FD_FN_CONST static inline ulong | ||
manifest_spad_footprint( void ) { | ||
return 1UL*1024UL*1024UL*1024UL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these magic values seem a little strange. are they just copied over from the snapshot tile?
if( FD_LIKELY( ctx->in_kind[ in_idx ]==SNAP_IN ) ) { | ||
return 0; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesnt seem needed given return 0
below?
93a3c72
to
584d06f
Compare
Adding firedancer-dev command repair. This is a standalone application.
The fd_shredcap_tile has been modified to publish_stake_weights from the snapshot manifest supplied by the snapshot tile(s).
This PR also includes an upgrade to the replay tile's generate_stake_weight_msg.