-
Notifications
You must be signed in to change notification settings - Fork 1
FE HA validation framework #22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
@@ -0,0 +1,6 @@ | |||
--- | |||
#################### TEST CONFIGURATION VARIABLES #################### |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this parameter and follow the decision we made after Bruno's feedback to create a hierarchy of config params.
See the docs here: https://github.com/OpenNebula/one-infra/wiki/Development-and-Testing-Post%E2%80%90deployment-Validation-tools#variables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's discuss to which extend this is applicable for validation framework and where we should store role specific variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, lets discuss and conclude in chat.
I think the role specific variables can also follow the same hierarchy. And a control flag for skip/execute. As I understood we agreed on this decision with this PR #20
- name: Wait for leader failover | ||
pause: | ||
seconds: 20 | ||
prompt: "Waiting for OpenNebula to elect a new leader. Press Enter to continue after 1 minutes, or Ctrl+C to abort." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not wait for user input here. Rather look for some log message or zone state (or check leader selection state directly somehow) and wait with a specified timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, make sense. Will remove prompt from here and will wait for 20 seconds
pause: | ||
seconds: 20 | ||
prompt: "Waiting for OpenNebula to elect a new leader. Press Enter to continue after 1 minutes, or Ctrl+C to abort." | ||
run_once: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
run_once should not be used, as recommended by Michal in one-deploy coding style: https://github.com/OpenNebula/one-deploy/wiki/code_style#4-be-careful-with-run_once
Due to this I am using also in the connectivity matrix a logic to make sure it only runs once in the first frontend. In this case maybe we have to find another option, because we might have different behaviour if we happen to run the leader status checking on the "initial_leader" vs. any other one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a bit another case, i.e. we don't have parallel operation here, but have to get output of the onezone command from one of the nodes. Let's discuss if you see any risk here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, as I understand the run_once will just run on random node of the group. And the test would anyway fail earlier if we do not run these on the FE nodes... So I guess it is fine, I would prefer if we have the same approach for run_once, but we can live with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we include this in the main "validation.yml" so it will be run together with the other tests? In that case we can use also the same HTML report, no need for a new file just for this testcase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Majority of the latest deployment is non-FA, so IMO this roles should be conditionally included only for HA deployments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, then we should figure out what should be the condition for it. I think still the easiest would be to also include this test casse under "validation.fe_ha" for the params and control it with "validation.run_fe_ha" that would be pretty consistent, and we can disable it by default.
#when: hostvars[groups[frontend_group | d('frontend')][1]]['ansible_host'] == ansible_host | ||
run_once: true | ||
|
||
- name: Save a new Leader node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's use leader with downcase for consistency
No description provided.