Store the last TaskId in the RTC registers #2162


Draft · wants to merge 4 commits into master

Conversation

labbott (Collaborator) commented Jul 18, 2025

We have an awkward problem: if some task is looping forever, we are stuck and can't access anything except via SWD. There are some situations (read: racks) where we can't get dongle access. We already have a watchdog that is used for reset.

The RTC block has a small set of backup registers that will not be reset so long as we don't lose power to VDD (i.e. don't ignition cycle). This gives us the following scheme (a rough code sketch follows the list):

  • Set up our RTC block and use backup registers 0 and 1 for our purposes
  • On context switch, store the task ID into Register 0
  • On bootup, copy the contents of Register 0 to Register 1
  • In a separate task, set a timer to turn the watchdog off/on
  • Log the value of Register 1 to see which task was running last
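
As a rough illustration of the scheme above (not the code in this PR), here is a minimal sketch in Rust. The backup-register helpers `rtc_bkp_read` / `rtc_bkp_write` are hypothetical stand-ins; a real implementation would go through the RTC driver or PAC and would need the backup domain unlocked before writing.

```rust
// Hypothetical stand-ins for the RTC backup register (BKPxR) accessors.
// Real code would go through the PAC or an RTC driver, with backup-domain
// write protection disabled first.
fn rtc_bkp_read(_idx: usize) -> u32 {
    0
}
fn rtc_bkp_write(_idx: usize, _val: u32) {}

const BKP_LAST_TASK: usize = 0; // written on every context switch
const BKP_PREV_BOOT: usize = 1; // snapshot of register 0, taken once per boot

/// Run once early in kernel startup, before any task is scheduled.
fn record_previous_boot() {
    // Register 0 still holds whichever task was running when the watchdog
    // fired (as long as VDD was never lost), so park it in Register 1
    // before this boot's context switches start overwriting it.
    let last = rtc_bkp_read(BKP_LAST_TASK);
    rtc_bkp_write(BKP_PREV_BOOT, last);
}

/// Run on every context switch with the ID of the task being resumed.
fn record_context_switch(task_id: u32) {
    rtc_bkp_write(BKP_LAST_TASK, task_id);
}
```

A separate task can then read `BKP_PREV_BOOT` and log it after a watchdog reset, and also own the timer that turns the watchdog off/on.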

labbott marked this pull request as draft July 18, 2025 15:12
@@ -170,7 +170,9 @@ pub(crate) fn event_timer_isr_exit() {
}

pub(crate) fn event_context_switch(tcb: usize) {
Collaborator

I don't understand this change, unless it was broken before?

labbott (Collaborator, Author)

The existing profiling just stores the base of the task, which is fast and may have been fine for Cliff's debugging (thanks again, Cliff, for adding this). I was struggling to figure out how to go from the task base to useful information, so I did the calculation here (roughly the arithmetic sketched below). If we really hate this, we can get Humility to do this work for us.
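
For illustration only (the names below are made up, not the kernel's actual types): assuming tasks live in one contiguous table, the index falls out of simple pointer arithmetic.

```rust
/// Hypothetical helper: recover a small, human-meaningful task index from a
/// raw TCB address, assuming all task structs sit in one contiguous array.
fn task_index(tcb_addr: usize, table_base: usize, task_size: usize) -> usize {
    (tcb_addr - table_base) / task_size
}

// Usage would look something like (all names hypothetical):
//   let idx = task_index(tcb, TASK_TABLE.as_ptr() as usize, size_of::<Task>());
```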

&mut self,
_mgs: &RecvMessage,
) -> Result<(), RequestError<core::convert::Infallible>> {
loop { }
Collaborator

Nit: add cortex_m::asm::nop() here?
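
For reference, the shape the nit suggests (a sketch, assuming the `cortex-m` crate is a dependency):

```rust
/// Spin forever, but with an explicit no-op in the loop body rather than an
/// empty `loop { }`.
fn spin_forever() -> ! {
    loop {
        cortex_m::asm::nop();
    }
}
```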

hawkw self-requested a review July 18, 2025 16:28
hawkw (Member) commented Jul 18, 2025

@labbott do we intend to eventually merge this change to master or is this just being used for the present debugging?

labbott (Collaborator, Author) commented Jul 18, 2025

> @labbott do we intend to eventually merge this change to master or is this just being used for the present debugging?

I'm split. I don't think we want to use the SWD watchdog in production because that does an automatic bank swap, which is not what we want. I also think the hack to get the TaskId is, well, hacky. Longer term, it's probably time to actually add a watchdog task, but that may require more discussion about how that should work for Hubris. I do think we need some kind of debugging for the last state before reset. So maybe once we figure out the current issue we'll have a better idea of what we wish we'd had.

hawkw removed their request for review July 18, 2025 18:54