-
Notifications
You must be signed in to change notification settings - Fork 54
chore: add capability of e2e tests to observe provider cluster state and check workload side effects #284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThis update introduces Kubernetes pod state observation and verification into end-to-end deployment tests. New observer utilities are added to integration helpers, enabling tests to track pod creation and restarts during deployment lifecycle events, including updates and provider restarts. Several documentation and environment files for local development environments are removed. Changes
Sequence Diagram(s)sequenceDiagram
participant Test as E2E Test
participant Kube as Kubernetes
participant Provider as Provider Service
participant Blockchain as Blockchain
Test->>Blockchain: Create Deployment
Blockchain-->>Test: Deployment Created
Test->>Provider: Query for Bid
Provider-->>Test: Bid Exists
Test->>Blockchain: Create Lease
Blockchain-->>Test: Lease Created
Test->>Provider: Send Manifest
Provider-->>Test: Manifest Accepted
Test->>Kube: Query Pods (Observer.Observe)
Kube-->>Test: Pod State Returned
Test->>Provider: Restart Provider
Provider-->>Test: Provider Restarted
Test->>Kube: Query Pods (Observer.VerifyNoChangeOccurred/VerifyNewPodCreate)
Kube-->>Test: Pod State Verified
Poem
📜 Recent review detailsConfiguration used: .coderabbit.yaml 📒 Files selected for processing (16)
💤 Files with no reviewable changes (12)
🚧 Files skipped from review as they are similar to previous changes (2)
🧰 Additional context used🧬 Code Graph Analysis (2)integration/e2e_test.go (1)
integration/deployment_test.go (5)
⏰ Context from checks skipped due to timeout of 90000ms (1)
🔇 Additional comments (3)
✨ Finishing Touches
🧪 Generate Unit Tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
integration/e2e_test.go (1)
249-257
:⚠️ Potential issueFix incorrect transaction validation.
The
res
variable from line 229 (provider creation) is being incorrectly used to validate unrelated transactions on lines 257 and 278. Each transaction should have its result captured and validated independently.Apply this diff to fix the issue:
// Publish provider's certificate -_, err = ccli.TxPublishServerExec( +res, err = ccli.TxPublishServerExec( context.Background(), s.validator.ClientCtx, s.keyProvider.GetAddress(), cliGlobalFlags()..., ) s.Require().NoError(err) s.Require().NoError(s.network.WaitForNextBlock()) clitestutil.ValidateTxSuccessful(s.T(), s.validator.ClientCtx, res.Bytes())// Publish tenant's certificate -_, err = ccli.TxPublishClientExec( +res, err = ccli.TxPublishClientExec( context.Background(), s.validator.ClientCtx, s.keyTenant.GetAddress(), cliGlobalFlags()..., ) s.Require().NoError(err) s.Require().NoError(s.network.WaitForNextBlock()) clitestutil.ValidateTxSuccessful(s.T(), s.validator.ClientCtx, res.Bytes())Also applies to: 270-278
🧹 Nitpick comments (1)
_run/.envrc (1)
8-10
: Remove dead commented-out code
Since thetqdm
check is no longer enforced, it’s clearer to delete these lines entirely rather than leave them commented out in.envrc
. You can rely on VCS history to restore if needed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
_run/.envrc
(1 hunks)_run/minikube/.envrc
(0 hunks)_run/minikube/Makefile
(0 hunks)_run/minikube/README.md
(0 hunks)_run/minikube/deployment.yaml
(0 hunks)_run/minikube/provider.yaml
(0 hunks)_run/single/.envrc
(0 hunks)_run/single/Makefile
(0 hunks)_run/single/README.md
(0 hunks)_run/single/deployment.yaml
(0 hunks)_run/single/deployment2.yaml
(0 hunks)_run/single/kind-config.yaml
(0 hunks)_run/single/provider.yaml
(0 hunks)integration/deployment_test.go
(1 hunks)integration/deployment_update_test.go
(6 hunks)integration/e2e_test.go
(2 hunks)integration/test_helpers.go
(3 hunks)
💤 Files with no reviewable changes (12)
- _run/minikube/.envrc
- _run/single/.envrc
- _run/single/kind-config.yaml
- _run/single/provider.yaml
- _run/minikube/provider.yaml
- _run/single/deployment.yaml
- _run/minikube/deployment.yaml
- _run/single/README.md
- _run/single/deployment2.yaml
- _run/minikube/Makefile
- _run/minikube/README.md
- _run/single/Makefile
🧰 Additional context used
🧬 Code Graph Analysis (2)
integration/deployment_test.go (5)
integration/e2e_test.go (1)
IntegrationTestSuite
(54-73)testutil/provider/provider.go (4)
TestSendManifest
(48-62)RunLocalProvider
(107-136)RunLocalHostnameOperator
(138-149)RunLocalIPOperator
(151-164)tools/fromctx/context.go (2)
MustKubeClientFromCtx
(174-181)CtxKeyErrGroup
(31-31)cluster/util/lease_id_to_namespace.go (1)
LeaseIDToNamespace
(12-20)integration/test_helpers.go (1)
NewObserver
(198-204)
integration/e2e_test.go (1)
integration/deployment_test.go (1)
E2EDeploymentCreate
(28-30)
🪛 GitHub Actions: tests
integration/deployment_test.go
[error] 195-195: undefined: ptestutil.RunProviderJWTServer
🔇 Additional comments (5)
integration/deployment_update_test.go (2)
117-137
: LGTM! Good integration of pod state verification.The Observer pattern is correctly integrated to verify that deployment updates result in new pod creation, which aligns with the PR objectives.
206-285
: LGTM! Proper verification of pod stability during lease shell operations.The Observer correctly verifies that lease shell operations don't trigger pod restarts or recreations, which fulfills one of the key PR objectives.
integration/test_helpers.go (3)
184-222
: Well-designed Observer implementation for pod state tracking.The Observer correctly tracks both pod UIDs and container restart counts, providing comprehensive coverage for detecting pod lifecycle changes. The implementation properly handles the label selector for web services.
242-271
: Good timeout and retry logic for pod creation verification.The method uses appropriate timeouts (30s) and polling intervals (5s) to wait for pod creation, with clear error messages.
273-295
: Correct implementation for verifying pod stability.The method properly waits for the timeout period while continuously checking for unexpected changes, returning nil if no changes are detected (the expected behavior).
@cloud-j-luna lets rebase on to latest main and knock failing e2e out |
6e9f364
to
f6d5dba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (1)
integration/deployment_test.go (1)
38-41
: Consider using dynamic deployment sequence to avoid conflicts.The hardcoded deployment sequence (102) could cause test failures if run concurrently or multiple times without cleanup.
deploymentID := dtypes.DeploymentID{ Owner: s.keyTenant.GetAddress().String(), - DSeq: uint64(102), + DSeq: uint64(time.Now().UnixNano() / 1000000), // Use timestamp for uniqueness }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
integration/deployment_test.go
(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
integration/deployment_test.go (5)
integration/e2e_test.go (1)
IntegrationTestSuite
(54-73)testutil/provider/provider.go (4)
TestSendManifest
(48-62)RunLocalProvider
(107-136)RunLocalHostnameOperator
(138-149)RunLocalIPOperator
(151-164)tools/fromctx/context.go (2)
MustKubeClientFromCtx
(176-183)CtxKeyErrGroup
(32-32)cluster/util/lease_id_to_namespace.go (1)
LeaseIDToNamespace
(12-20)integration/test_helpers.go (1)
NewObserver
(198-204)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: integration-tests / crd-e2e
- GitHub Check: release-dry-run
- GitHub Check: lint
- GitHub Check: build-bins
- GitHub Check: tests
🔇 Additional comments (1)
integration/deployment_test.go (1)
33-214
: Excellent implementation of provider restart resilience test.This test successfully implements the PR objective of verifying that provider restarts don't affect deployed workloads. The use of the Observer pattern to track pod state before and after restart is well-designed and provides robust validation.
d84d3a9
to
f66b63d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
integration/deployment_test.go (1)
62-62
: Fix incorrect transaction validation - still using wrong result variable.This is the same issue flagged in previous reviews. Line 62 validates the lease creation using
res.Bytes()
, butres
contains the deployment creation result from line 35, not the lease creation result.- _, err = mcli.TxCreateLeaseExec( + leaseRes, err := mcli.TxCreateLeaseExec( s.validator.ClientCtx, bidID, s.keyTenant.GetAddress(), cliGlobalFlags()..., ) s.Require().NoError(err) s.Require().NoError(s.waitForBlocksCommitted(2)) - clitestutil.ValidateTxSuccessful(s.T(), s.validator.ClientCtx, res.Bytes()) + clitestutil.ValidateTxSuccessful(s.T(), s.validator.ClientCtx, leaseRes.Bytes())
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
_run/.envrc
(1 hunks)integration/deployment_test.go
(1 hunks)integration/deployment_update_test.go
(6 hunks)integration/test_helpers.go
(3 hunks)make/test-integration.mk
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- _run/.envrc
🚧 Files skipped from review as they are similar to previous changes (2)
- integration/deployment_update_test.go
- integration/test_helpers.go
🧰 Additional context used
🧬 Code Graph Analysis (1)
integration/deployment_test.go (4)
testutil/provider/provider.go (2)
TestSendManifest
(48-62)RunLocalProvider
(107-132)tools/fromctx/context.go (1)
MustKubeClientFromCtx
(172-179)cluster/util/lease_id_to_namespace.go (1)
LeaseIDToNamespace
(12-20)integration/test_helpers.go (1)
NewObserver
(198-204)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: integration-tests / crd-e2e
- GitHub Check: release-dry-run
- GitHub Check: lint
- GitHub Check: tests
- GitHub Check: build-bins
🔇 Additional comments (3)
integration/deployment_test.go (2)
24-41
: Test implementation looks good overall.The deployment creation logic follows the expected pattern with proper error handling, block waiting, and transaction validation. The test structure aligns well with the PR objectives of observing cluster state during deployment updates.
110-110
: Ignore missing import for NewObserverThe
NewObserver
function is defined inintegration/test_helpers.go
within the sameintegration
package, so it’s already in scope and requires no additional import.
- integration/test_helpers.go:198–204 — defines
func NewObserver(namespace string) *Observer
Likely an incorrect or invalid review comment.
make/test-integration.mk (1)
28-28
: Consider if narrowing test scope is intentional for this PR.The test execution has been narrowed from running the entire
TestIntegrationTestSuite
to onlyTestE2ELeaseShell
. While this can be useful for focused development testing, it might not provide comprehensive coverage for CI/CD pipelines.Verify that this change aligns with the testing strategy - if this is temporary for development, consider reverting it before merging.
This issue is stale until I get back. Changes were overwritten with old branch and work is on a machine I don't have access right now. Will be fixed and merged by 10th of July. |
d84d3a9
to
1b08a34
Compare
Signed-off-by: Joao Luna <[email protected]>
Signed-off-by: Joao Luna <[email protected]>
…suer (#299) refs akash-network/support#302 Signed-off-by: Artur Troian <[email protected]> Co-authored-by: Artur Troian <[email protected]> Signed-off-by: Joao Luna <[email protected]>
Signed-off-by: Artur Troian <[email protected]> Co-authored-by: Artur Troian <[email protected]> Signed-off-by: Joao Luna <[email protected]>
Signed-off-by: Artur Troian <[email protected]> Co-authored-by: Artur Troian <[email protected]> Signed-off-by: Joao Luna <[email protected]>
Signed-off-by: Joao Luna <[email protected]> Signed-off-by: Joao Luna <[email protected]>
Signed-off-by: Joao Luna <[email protected]> Signed-off-by: Joao Luna <[email protected]>
1b08a34
to
e2a3d95
Compare
This PR adds the capability of observing a provider cluster state and performing verifications.
It adds several e2e tests to verify if workloads had unexpected impact from several scenarios.
Current tests