Collect log using SerialConsole when SshShell fails to connect #3845
@squirrelsc @LiliDeng, this PR is for early design review. This will enhance triaging of failures for both engineers and AI.

# Prepare the RunCommandInput for Azure
command = RunCommandInput(
    command_id="RunShellScript",
Is this Linux specific? Consider whether you can add support for Windows. If not possible, think about how to exclude the Feature for non-Posix systems.
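One possible way to address this comment is to pick the Run Command ID from the guest OS family. The sketch below is only an illustration: `select_run_command_id` is a hypothetical helper, not part of this PR, and it assumes the caller already knows whether the guest is POSIX. `RunShellScript` and `RunPowerShellScript` are the Azure Run Command IDs for Linux and Windows guests.

```python
def select_run_command_id(is_posix: bool) -> str:
    """Return the Azure Run Command ID that matches the guest OS family.

    Hypothetical helper for illustration; not part of the PR.
    """
    # Linux guests run shell scripts; Windows guests run PowerShell scripts.
    return "RunShellScript" if is_posix else "RunPowerShellScript"
```

The scripts passed with each ID would still need to be OS-specific, so this only covers the choice of command ID.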
ff40ee5 to bc92811
df59804 to a69b551
a69b551 to 84d4f4a
741e95e to 4ff4695
4ff4695 to e033ae1
Testing with non-Linux environments is pending.
lisa/features/non_ssh_executor.py
Outdated
_ = serial_console.read()
for command in commands:
    serial_console.write(self._add_newline(command))
    out.append(serial_console.read())
read() may not get all the output. How about using the previous _ = serial_console.read() to capture the prompt, and then checking whether the prompt has been printed?
Makes sense, added logic
serial_console.write("\n")
response = serial_console.read()
if not response or "$" not in response and "#" not in response:
Do you recommend the above or response.strip().endswith()?
After writing \n, it will return a full prompt. So you can check for the full prompt instead of only the special characters.
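The suggestion in this thread can be sketched roughly as follows. This is not the PR's implementation: `detect_prompt` and `read_until_prompt` are hypothetical helpers, and `read` stands in for something like `serial_console.read`, assumed here to return whatever text is currently available.

```python
import time
from typing import Callable


def detect_prompt(output: str) -> str:
    """Treat the last non-empty line printed after a bare newline as the prompt."""
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no prompt found in console output")
    return lines[-1].strip()


def read_until_prompt(
    read: Callable[[], str],
    prompt: str,
    timeout: float = 30.0,
    interval: float = 0.5,
) -> str:
    """Keep reading until the accumulated output ends with the full prompt."""
    deadline = time.monotonic() + timeout
    buffer = ""
    while True:
        buffer += read()
        # Compare against the whole prompt, not only "$" or "#".
        if buffer.rstrip().endswith(prompt.rstrip()):
            return buffer
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prompt {prompt!r} not seen within {timeout}s")
        time.sleep(interval)
```

One caveat with this approach: many shells include the working directory in the prompt, so a command that changes directory would change the prompt and it would need to be detected again.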
lisa/features/non_ssh_executor.py
Outdated
_ = serial_console.read()
for command in commands:
    serial_console.write(self._add_newline(command))
    out.append(serial_console.read())
Besides returning the output, also print it out.
def _collect_logs_using_non_ssh_executor (the caller) is currently printing it.
Should I add an optional argument to print it in this function itself?
The method should print the output itself, instead of leaving it to the caller. If it's used by others, they don't need to copy code to print. It's similar to how node.execute prints its output.
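A minimal sketch of that idea, assuming a standard `logging.Logger`: `run_and_log` is a hypothetical function, and `run` stands in for whatever executes a single command (for example, a serial console write followed by a read).

```python
import logging
from typing import Callable, List


def run_and_log(
    commands: List[str],
    run: Callable[[str], str],
    log: logging.Logger,
) -> List[str]:
    """Run each command, log its output here, and return all outputs."""
    outputs: List[str] = []
    for command in commands:
        output = run(command)
        # Logging inside the method means callers do not repeat this code.
        log.info("cmd: %s\n%s", command, output)
        outputs.append(output)
    return outputs
```

With this shape, every caller gets the same log format without having to copy the printing code.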
c7b41ac to 4d6e826
@squirrelsc Do you think we should add it to the test case log as well? I think people would miss this log if it's not in the case log.
It should be, but node.initialize happens before the test case is chosen. So it needs some changes to the flow to put the initialize log into the test case log.
)

except Exception:
    self._log.info("RunCommand failed to return expected result.")
Because the exception is raised below, it doesn't need to print log before it.
try:
    # Since wait_operation returns a dict (result.as_dict()), access as dict
    value = result.get("value")
    if value and len(value) > 0 and value[0].get("message"):
The code below is enough.
if value and value[0].get("message"):
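The simplification works because an empty list (and `None`) is falsy in Python, so the explicit `len(value) > 0` check adds nothing. A small sketch, where `first_message` is a hypothetical helper and the dict shape is assumed from the snippet above:

```python
from typing import Any, Dict, Optional


def first_message(result: Dict[str, Any]) -> Optional[str]:
    """Return the first message in the result, or None if it is missing."""
    value = result.get("value")
    # `value` is falsy when it is None or an empty list, so the short-circuit
    # prevents indexing into an empty list.
    if value and value[0].get("message"):
        return value[0]["message"]
    return None
```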
SSH may fail for many reasons. At present, if SSH fails, it is either very difficult or impossible to triage the issue without reproducing it. This change uses SerialConsole to run commands for log collection.
Introduce NonSshExecutor. It internally uses SerialConsole/RunCommand.
4d6e826 to f5b0fbd
f5b0fbd to c8cd146
SSH may fail for multiple reasons. At present, if SSH fails, it is either very difficult or impossible to triage the issue without reproducing it.
This change uses NonSshExecutor to run commands for log collection.
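The overall flow described here could look roughly like the sketch below. It is an assumption about the control flow, not the PR's code: `connect_or_collect` is a hypothetical function, `connect` stands in for the SSH connection attempt, and `diagnose` stands in for running the log collection commands through a non-SSH path.

```python
from typing import Callable, Iterable


def connect_or_collect(
    connect: Callable[[], None],
    diagnose: Callable[[], Iterable[str]],
    log: Callable[[str], None],
) -> None:
    """Try to connect; on failure, collect diagnostics and re-raise."""
    try:
        connect()
    except Exception:
        # The connection failed, so gather logs through the fallback path
        # before surfacing the original error to the caller.
        for output in diagnose():
            log(output)
        raise
```

Re-raising keeps the original failure visible while the collected output gives engineers something to triage without reproducing the issue.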