Skip to content

Collect log using SerialConsole when SshShell fails to connect #3845

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

adityagesh
Copy link
Collaborator

@adityagesh adityagesh commented Jun 5, 2025

ssh may not succeed due to multiple reasons, at present if ssh fails it is either very difficult or impossible to triage the issue without reproducing it.
This change uses NonSshExecutor to run commands for log collection.

@adityagesh
Copy link
Collaborator Author

@squirrelsc @LiliDeng , this PR is for early design review.

This will enhance triaging of failures for both engineers and AI


# Prepare the RunCommandInput for Azure
command = RunCommandInput(
command_id="RunShellScript",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this Linux specific? Consider whether you can add support for Windows. If not possible, think about how to exclude the Feature for non-Posix systems.

@adityagesh adityagesh force-pushed the aditya/serial_log_after_test_failure branch 2 times, most recently from ff40ee5 to bc92811 Compare July 7, 2025 10:15
@adityagesh adityagesh force-pushed the aditya/serial_log_after_test_failure branch from df59804 to a69b551 Compare July 8, 2025 09:17
@adityagesh adityagesh marked this pull request as draft July 8, 2025 09:23
@adityagesh adityagesh force-pushed the aditya/serial_log_after_test_failure branch from a69b551 to 84d4f4a Compare July 8, 2025 09:50
@adityagesh adityagesh force-pushed the aditya/serial_log_after_test_failure branch 2 times, most recently from 741e95e to 4ff4695 Compare July 14, 2025 10:21
@adityagesh adityagesh force-pushed the aditya/serial_log_after_test_failure branch from 4ff4695 to e033ae1 Compare July 14, 2025 15:01
@adityagesh
Copy link
Collaborator Author

adityagesh commented Jul 14, 2025

Testing with non-Linux environments are pending.
I will test with one Windows and Freebsd image.

_ = serial_console.read()
for command in commands:
serial_console.write(self._add_newline(command))
out.append(serial_console.read())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read() may not get all output. How about use previous _ = serial_console.read() to get prompt, and check if the prompt is printed out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, added logic

    serial_console.write("\n")
    response = serial_console.read()
    if not response or "$" not in response and "#" not in response:

Do you recommend the above or response.strip().endswith()?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After writing \n, it will return a full prompt. So, you can check the full prompt instead of only special chars.

_ = serial_console.read()
for command in commands:
serial_console.write(self._add_newline(command))
out.append(serial_console.read())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

besides return the output, also printed them out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def _collect_logs_using_non_ssh_executor (the caller) is currently printing it.

Should I add an optional argument to print it in this function itself?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method should print by itself, instead of caller. If it's used by others, they don't need to copy code to print. It's similar to how node.execute print out.

@adityagesh adityagesh force-pushed the aditya/serial_log_after_test_failure branch 3 times, most recently from c7b41ac to 4d6e826 Compare July 22, 2025 09:06
@adityagesh
Copy link
Collaborator Author

@squirrelsc
The log is currently printed to node.log and lisa.log. It does not show up in test case log.

lisa.log
image

node.log
image

Do you think we should add it to test case log as well? I think people would miss out this log if it's not in case log

@squirrelsc
Copy link
Member

@squirrelsc The log is currently printed to node.log and lisa.log. It does not show up in test case log.

Do you think we should add it to test case log as well? I think people would miss out this log if it's not in case log

It should be, but the node.initialize happens before test case is chosen. So, it needs some changes on the flow to put the initialize log into test case log.

)

except Exception:
self._log.info("RunCommand failed to return expected result.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the exception is raised below, it doesn't need to print log before it.

try:
# Since wait_operation returns a dict (result.as_dict()), access as dict
value = result.get("value")
if value and len(value) > 0 and value[0].get("message"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code below is enough.

if value and value[0].get("message"):

@adityagesh adityagesh force-pushed the aditya/serial_log_after_test_failure branch from 4d6e826 to f5b0fbd Compare August 11, 2025 06:57
@adityagesh adityagesh force-pushed the aditya/serial_log_after_test_failure branch from f5b0fbd to c8cd146 Compare August 11, 2025 07:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants