Collect log using SerialConsole when SshShell fails to connect #3845

adityagesh · 2025-06-05T10:04:15Z

SSH can fail for many reasons. At present, when SSH fails, it is very difficult or impossible to triage the issue without reproducing it.
This change uses NonSshExecutor to run commands for log collection.

adityagesh · 2025-06-05T10:06:26Z

@squirrelsc @LiliDeng, this PR is for early design review.

This will improve failure triage for both engineers and AI.

kamalca · 2025-06-18T18:09:13Z

lisa/sut_orchestrator/azure/features.py

+
+        # Prepare the RunCommandInput for Azure
+        command = RunCommandInput(
+            command_id="RunShellScript",


Is this Linux specific? Consider whether you can add support for Windows. If that is not possible, think about how to exclude the feature on non-POSIX systems.
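One possible direction, as a minimal sketch: Azure Run Command defines `RunShellScript` for Linux guests and `RunPowerShellScript` for Windows guests, so the command ID could be chosen from the guest OS type. The helper name `build_run_command_payload` and the `is_posix` flag are hypothetical, and a plain dict stands in for `RunCommandInput` so the sketch does not depend on the Azure SDK.

```python
from typing import Dict, List, Union

# Command IDs defined by Azure Run Command for each guest OS family.
LINUX_COMMAND_ID = "RunShellScript"
WINDOWS_COMMAND_ID = "RunPowerShellScript"


def build_run_command_payload(
    is_posix: bool, script_lines: List[str]
) -> Dict[str, Union[str, List[str]]]:
    """Hypothetical helper: pick the command ID from the guest OS type."""
    command_id = LINUX_COMMAND_ID if is_posix else WINDOWS_COMMAND_ID
    # A plain dict stands in for RunCommandInput in this sketch.
    return {"command_id": command_id, "script": list(script_lines)}


linux_payload = build_run_command_payload(True, ["uname -a"])
windows_payload = build_run_command_payload(False, ["Get-ComputerInfo"])
```

If a guest type cannot be supported at all, the same flag could be used to skip the feature instead of building a payload.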

adityagesh · 2025-07-14T15:27:37Z

Testing with non-Linux environments is pending.
I will test with one Windows image and one FreeBSD image.

squirrelsc · 2025-07-14T15:44:15Z

lisa/features/non_ssh_executor.py

+        _ = serial_console.read()
+        for command in commands:
+            serial_console.write(self._add_newline(command))
+            out.append(serial_console.read())


read() may not return all of the output. How about using the earlier _ = serial_console.read() call to capture the prompt, and then checking whether the prompt is printed out?

Makes sense, added logic

serial_console.write("\n")
response = serial_console.read()
if not response or "$" not in response and "#" not in response:

Do you recommend the above or response.strip().endswith()?

After writing \n, it will return a full prompt. So, you can check the full prompt instead of only special chars.
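The full-prompt check suggested above could look roughly like the sketch below. It is not the PR's implementation: `FakeConsole`, `detect_prompt`, and `run_until_prompt` are hypothetical names, and the only assumption taken from the thread is that the console offers `write()` and `read()` and that one `read()` may return only part of the output.

```python
import time
from typing import List


class FakeConsole:
    """In-memory stand-in for a serial console (hypothetical, for illustration)."""

    def __init__(self, prompt: str) -> None:
        self._prompt = prompt
        self._chunks: List[str] = []

    def write(self, data: str) -> None:
        command = data.strip()
        if command:
            # Queue the echo and output first and the prompt second,
            # so a single read() sees only part of the response.
            self._chunks.append(f"{command}\noutput of {command}\n")
            self._chunks.append(self._prompt)
        else:
            self._chunks.append(f"\n{self._prompt}")

    def read(self) -> str:
        return self._chunks.pop(0) if self._chunks else ""


def detect_prompt(console: FakeConsole) -> str:
    """Send a bare newline and take the last non-empty line as the prompt."""
    console.write("\n")
    lines = [line.strip() for line in console.read().splitlines() if line.strip()]
    if not lines:
        raise RuntimeError("no prompt received from the console")
    return lines[-1]


def run_until_prompt(
    console: FakeConsole, command: str, prompt: str, timeout: float = 5.0
) -> str:
    """Write a command and keep reading until the prompt reappears."""
    console.write(command + "\n")
    output = ""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        output += console.read()
        if output.rstrip().endswith(prompt):
            return output
        time.sleep(0.01)
    raise TimeoutError(f"prompt not seen after {command!r}")


console = FakeConsole("lisa@vm:~$ ")
prompt = detect_prompt(console)
output = run_until_prompt(console, "uname -a", prompt)
```

Matching the whole prompt string avoids false positives from a stray `$` or `#` inside command output, which a check on special characters alone would accept.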

squirrelsc · 2025-07-14T15:45:59Z

lisa/features/non_ssh_executor.py

+        _ = serial_console.read()
+        for command in commands:
+            serial_console.write(self._add_newline(command))
+            out.append(serial_console.read())


Besides returning the output, the method should also print it.

_collect_logs_using_non_ssh_executor (the caller) currently prints it.

Should I add an optional argument to print it in this function itself?

The method should print the output itself instead of leaving it to the caller. That way, other callers do not need to copy the printing code. This is similar to how node.execute prints its output.
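A minimal sketch of that suggestion, assuming nothing about the real NonSshExecutor beyond what the thread says: the function name `execute_and_log`, the logger name, and the `run` callback are all hypothetical. The point is only that logging happens inside the method, so every caller gets it for free.

```python
import logging
from typing import Callable, List

log = logging.getLogger("non_ssh_executor_sketch")


def execute_and_log(commands: List[str], run: Callable[[str], str]) -> List[str]:
    """Run each command, log its output here, and return all outputs."""
    outputs: List[str] = []
    for command in commands:
        output = run(command)
        # Logging inside the method means callers do not repeat this code.
        log.info("command: %s\noutput:\n%s", command, output)
        outputs.append(output)
    return outputs


# Usage with a stand-in runner instead of a real console.
results = execute_and_log(["uptime", "dmesg"], lambda c: f"ran {c}")
```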

adityagesh · 2025-07-23T13:29:27Z

@squirrelsc
The log is currently printed to node.log and lisa.log. It does not show up in test case log.

lisa.log

node.log

Do you think we should add it to the test case log as well? I think people would miss this log if it is not in the test case log.

squirrelsc · 2025-07-23T16:01:09Z

It should be, but node.initialize runs before the test case is chosen. So the flow needs some changes to put the initialization log into the test case log.

squirrelsc · 2025-07-23T16:10:54Z

lisa/sut_orchestrator/azure/features.py

+                )
+
+        except Exception:
+            self._log.info("RunCommand failed to return expected result.")


Because the exception is raised below, there is no need to print a log message before it.
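As an illustration of this comment only, the sketch below raises with the message and chains the original error instead of logging first. `RunCommandError` and `parse_result` are hypothetical names, and the dict layout is assumed from the snippet in this thread.

```python
from typing import Any, Dict


class RunCommandError(Exception):
    """Hypothetical error type for this sketch."""


def parse_result(result: Dict[str, Any]) -> str:
    try:
        return str(result["value"][0]["message"])
    except (KeyError, IndexError, TypeError) as e:
        # No log call here: the raised exception carries the message,
        # and `from e` keeps the original error attached as the cause.
        raise RunCommandError(
            "RunCommand failed to return expected result."
        ) from e


message = parse_result({"value": [{"message": "ok"}]})
```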

squirrelsc · 2025-07-23T16:11:29Z

lisa/sut_orchestrator/azure/features.py

+        try:
+            # Since wait_operation returns a dict (result.as_dict()), access as dict
+            value = result.get("value")
+            if value and len(value) > 0 and value[0].get("message"):


The code below is enough.

if value and value[0].get("message"):
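The shorter condition works because an empty list and None are both falsy in Python, so `value` alone already covers the `len(value) > 0` check. A small sketch, with a hypothetical `extract_message` helper and made-up sample inputs:

```python
from typing import Any, Dict, Optional


def extract_message(result: Dict[str, Any]) -> Optional[str]:
    value = result.get("value")
    # An empty list or None is falsy, so no separate len() check is needed.
    if value and value[0].get("message"):
        return str(value[0]["message"])
    return None


samples = [
    {},
    {"value": []},
    {"value": [{}]},
    {"value": [{"message": "hello"}]},
]
messages = [extract_message(sample) for sample in samples]
```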

adityagesh requested review from squirrelsc and LiliDeng as code owners June 5, 2025 10:04

adityagesh added the DRAFT label Jun 5, 2025

squirrelsc reviewed Jun 5, 2025: lisa/features/serial_console.py (outdated, resolved)

squirrelsc reviewed Jun 5, 2025: lisa/node.py (outdated, resolved)

squirrelsc reviewed Jun 9, 2025: lisa/features/serial_console.py (outdated, resolved)

squirrelsc reviewed Jun 9, 2025: lisa/features/__init__.py (outdated, resolved)

kamalca reviewed Jun 18, 2025

adityagesh force-pushed the aditya/serial_log_after_test_failure branch 2 times, most recently from ff40ee5 to bc92811, July 7, 2025 10:15

adityagesh commented Jul 7, 2025: lisa/node.py (outdated, resolved)

adityagesh force-pushed the aditya/serial_log_after_test_failure branch from df59804 to a69b551, July 8, 2025 09:17

adityagesh marked this pull request as draft July 8, 2025 09:23

adityagesh force-pushed the aditya/serial_log_after_test_failure branch from a69b551 to 84d4f4a, July 8, 2025 09:50

squirrelsc reviewed Jul 8, 2025: lisa/features/non_ssh_executor.py (outdated, resolved)

squirrelsc reviewed Jul 8, 2025: lisa/node.py (outdated, resolved)

squirrelsc reviewed Jul 8, 2025: lisa/node.py (outdated, resolved)

squirrelsc reviewed Jul 8, 2025: lisa/features/non_ssh_executor.py (outdated, resolved)

squirrelsc reviewed Jul 8, 2025: lisa/node.py (outdated, resolved)

squirrelsc reviewed Jul 8, 2025: lisa/sut_orchestrator/azure/features.py (resolved)

squirrelsc reviewed Jul 9, 2025: lisa/features/serial_console.py (outdated, resolved)

adityagesh force-pushed the aditya/serial_log_after_test_failure branch 2 times, most recently from 741e95e to 4ff4695, July 14, 2025 10:21

squirrelsc reviewed Jul 14, 2025: lisa/features/non_ssh_executor.py (outdated, resolved)

squirrelsc reviewed Jul 14, 2025: lisa/node.py (outdated, resolved)

adityagesh force-pushed the aditya/serial_log_after_test_failure branch from 4ff4695 to e033ae1, July 14, 2025 15:01

squirrelsc reviewed Jul 14, 2025

adityagesh force-pushed the aditya/serial_log_after_test_failure branch 3 times, most recently from c7b41ac to 4d6e826, July 22, 2025 09:06

squirrelsc reviewed Jul 23, 2025

adityagesh added 13 commits August 11, 2025 06:02

Collect log using SerialConsole when SshShell fails to connect

bbfc6a5

ssh may not succeed due to many reasons, at present if ssh fails it is either very difficult or impossible to triage the issue without reproducing it. This change uses SerialConsole to run commands for log collection.

SerialConsole login

f995104

Add RunCommand feature in Azure and use it to collect logs

c063261

Revert changes - to be squashed at the end

a1ca77b

Remove serial console log collection for network failure

cbf9894

Add RunCommand class

7935505

Changes based on review commands

82ed779

Introduce NonSshExecutor. It Internally uses SerialConsole/RunCommand

Review comments

74c92e8

Update commands to execute

4809777

Serial Console: check for prompt before executing

40957a3

Fix connection alive issue

372506c

RunCommand to run one command at a time due to return size limitation

83abbd5

Update review comments

737ca41

adityagesh force-pushed the aditya/serial_log_after_test_failure branch from 4d6e826 to f5b0fbd, August 11, 2025 06:57

Update Serial Console prompt check

c8cd146

adityagesh force-pushed the aditya/serial_log_after_test_failure branch from f5b0fbd to c8cd146, August 11, 2025 07:01
