-
Notifications
You must be signed in to change notification settings - Fork 1
feat: Add configurable stack concurrency limit #335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughIntroduces configurable max concurrent reconciles across multiple module reconcilers by adding an option in core and wiring it into resource initializations, reading from MAX_CONCURRENT_RECONCILES with defined defaults. Adds documentation for stack concurrency. Also includes minor formatting cleanups and a small function signature fix without behavioral changes. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant User as Cluster Env
participant Operator as Operator
participant Core as core.GetMaxConcurrentReconciles
participant Builder as controller-runtime Builder
participant Ctrl as Controller(s)
participant K8s as API Server
User->>Operator: Set env MAX_CONCURRENT_RECONCILES
Operator->>Core: GetMaxConcurrentReconciles()
Core-->>Operator: int (0→1000, invalid→5)
Operator->>Builder: WithMaxConcurrentReconciles(max)
Builder-->>Ctrl: Start controllers with max workers
K8s-->>Ctrl: Enqueue reconcile requests
par Up to max concurrent
Ctrl->>Operator: Reconcile(resource)
Operator-->>Ctrl: Result/Status
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (1)
helm/operator/STACK_CONCURRENCY.md (1)
86-104
: Consider adding language specifiers to fenced code blocks.The markdownlint tool flags fenced code blocks without language specifiers on lines 86, 95. While these are ASCII diagrams rather than code, you could add
text
as the language identifier to satisfy the linter and improve rendering consistency:-**Without limit (default):** -``` +**Without limit (when set to 0):** +```text Stack A ──┐ Stack B ──┤ Stack C ──┼─> All processed in parallel Stack D ──┤ Stack E ──┘-With limit of 5:
-+**With limit of 5:** +
text
Stack A ──┐
Stack B ──┤
Stack C ──┼─> Max 5 in parallel
Stack D ──┤
Stack E ──┘
Stack F ──> Queued (waiting)
Stack G ──> Queued (waiting)Similarly for the code block on line 118 showing the expected output.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (2)
helm/operator/templates/deployment.yaml
is excluded by!**/*.yaml
helm/operator/values.yaml
is excluded by!**/*.yaml
📒 Files selected for processing (8)
api/formance.com/v1beta1/benthos_types.go
(1 hunks)helm/operator/STACK_CONCURRENCY.md
(1 hunks)internal/core/reconciler.go
(4 hunks)internal/resources/orchestrations/deployments.go
(1 hunks)internal/resources/payments/deployments.go
(1 hunks)internal/resources/stacks/config.go
(1 hunks)internal/resources/stacks/init.go
(1 hunks)pkg/client/formance.com/v1beta1/databases.go
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
internal/resources/stacks/init.go (2)
internal/core/reconciler.go (1)
WithMaxConcurrentReconciles
(86-90)internal/resources/stacks/config.go (1)
GetStackConcurrency
(10-18)
internal/core/reconciler.go (1)
internal/core/context.go (1)
Context
(10-16)
internal/resources/payments/deployments.go (6)
internal/core/context.go (1)
Context
(10-16)api/formance.com/v1beta1/stack_types.go (1)
Stack
(77-83)api/formance.com/v1beta1/payments_types.go (1)
Payments
(45-51)api/formance.com/v1beta1/database_types.go (1)
Database
(73-79)internal/resources/registries/registries.go (1)
ImageConfiguration
(18-23)internal/resources/applications/liveness.go (1)
ProbeOpts
(24-24)
🪛 markdownlint-cli2 (0.18.1)
helm/operator/STACK_CONCURRENCY.md
86-86: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
95-95: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
118-118: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (2)
internal/core/reconciler.go (1)
86-90
: LGTM!The option constructor follows the established pattern and correctly sets the MaxConcurrentReconciles field.
internal/resources/stacks/init.go (1)
369-369
: LGTM!The integration correctly applies the stack concurrency limit by calling
GetStackConcurrency()
and passing the result to the newWithMaxConcurrentReconciles
option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Configuration is applied on all resources without exception.
Can lead to deadlock between different reconcilers.
8c3147e
to
58a33ef
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (2)
helm/operator/templates/deployment.yaml
is excluded by!**/*.yaml
helm/operator/values.yaml
is excluded by!**/*.yaml
📒 Files selected for processing (15)
helm/operator/STACK_CONCURRENCY.md
(1 hunks)internal/core/concurrency.go
(1 hunks)internal/core/reconciler.go
(4 hunks)internal/resources/analytics/init.go
(1 hunks)internal/resources/auths/init.go
(1 hunks)internal/resources/gateways/init.go
(1 hunks)internal/resources/ledgers/init.go
(1 hunks)internal/resources/orchestrations/init.go
(1 hunks)internal/resources/payments/init.go
(1 hunks)internal/resources/reconciliations/init.go
(1 hunks)internal/resources/searches/init.go
(1 hunks)internal/resources/stacks/init.go
(1 hunks)internal/resources/stargates/init.go
(1 hunks)internal/resources/wallets/init.go
(1 hunks)internal/resources/webhooks/init.go
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- helm/operator/STACK_CONCURRENCY.md
🚧 Files skipped from review as they are similar to previous changes (8)
- internal/resources/searches/init.go
- internal/resources/ledgers/init.go
- internal/resources/wallets/init.go
- internal/resources/analytics/init.go
- internal/resources/reconciliations/init.go
- internal/resources/stacks/init.go
- internal/resources/orchestrations/init.go
- internal/resources/gateways/init.go
🧰 Additional context used
🧬 Code graph analysis (4)
internal/resources/auths/init.go (2)
internal/core/reconciler.go (1)
WithMaxConcurrentReconciles
(86-90)internal/core/concurrency.go (1)
GetMaxConcurrentReconciles
(10-18)
internal/resources/payments/init.go (2)
internal/core/reconciler.go (1)
WithMaxConcurrentReconciles
(86-90)internal/core/concurrency.go (1)
GetMaxConcurrentReconciles
(10-18)
internal/resources/stargates/init.go (2)
internal/core/reconciler.go (1)
WithMaxConcurrentReconciles
(86-90)internal/core/concurrency.go (1)
GetMaxConcurrentReconciles
(10-18)
internal/resources/webhooks/init.go (2)
internal/core/reconciler.go (1)
WithMaxConcurrentReconciles
(86-90)internal/core/concurrency.go (1)
GetMaxConcurrentReconciles
(10-18)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
internal/core/concurrency.go (1)
8-23
: 0-value semantics conflict with PR summary; align and extract constantsCode maps 0 → 1000 (“unlimited-ish”), but PR summary states 0 delegates to controller-runtime default (≈1). Please pick one and make docs + code consistent. I suggest keeping 0 → high value (1000) as implemented and updating docs accordingly. Also, extract constants for the env key and default values.
Apply this diff to improve clarity and avoid magic values:
package core import ( "os" "strconv" ) +const ( + EnvMaxConcurrentReconciles = "MAX_CONCURRENT_RECONCILES" + DefaultMaxConcurrentReconciles = 5 + UnlimitedMaxConcurrent = 1000 +) + // GetMaxConcurrentReconciles returns the maximum number of concurrent reconciliations // from the MAX_CONCURRENT_RECONCILES environment variable, or a default value of 5. // A value of 0 is treated as "unlimited" and mapped to a very high value (1000). func GetMaxConcurrentReconciles() int { - if v := os.Getenv("MAX_CONCURRENT_RECONCILES"); v != "" { + if v := os.Getenv(EnvMaxConcurrentReconciles); v != "" { if n, err := strconv.Atoi(v); err == nil && n >= 0 { if n == 0 { // Treat 0 as "unlimited" by using a very high value - return 1000 + return UnlimitedMaxConcurrent } return n } } // Default: 5 concurrent reconciliations (good balance for most clusters) - return 5 + return DefaultMaxConcurrentReconciles }To verify alignment with documentation, please confirm STACK_CONCURRENCY.md reflects the chosen 0-value behavior. If not, update the doc, Helm values description, and release notes to match.
internal/core/reconciler.go (2)
65-70
: Clarify field semantics (document 0/negative handling)Add a brief comment explaining expected values:
0 applies as-is,
- 0 is treated via GetMaxConcurrentReconciles mapping (near-unlimited),
- negatives are ignored.
This prevents misuse by future call sites.
86-90
: Add doc and (optionally) normalize hereDocument that callers should pass GetMaxConcurrentReconciles() to honor env semantics (0 mapped to high). Optionally, normalize here (map 0 to 1000; clamp negatives) to centralize behavior.
Suggested doc:
// WithMaxConcurrentReconciles sets the controller concurrency. // Prefer passing GetMaxConcurrentReconciles() so env semantics (0 -> near-unlimited) are honored.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
helm/operator/values.yaml
is excluded by!**/*.yaml
📒 Files selected for processing (3)
helm/operator/STACK_CONCURRENCY.md
(1 hunks)internal/core/concurrency.go
(1 hunks)internal/core/reconciler.go
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- helm/operator/STACK_CONCURRENCY.md
🧰 Additional context used
🧬 Code graph analysis (1)
internal/core/reconciler.go (1)
internal/core/context.go (1)
Context
(10-16)
🔇 Additional comments (2)
internal/core/reconciler.go (2)
14-14
: Import alias LGTMAlias for controller options is appropriate and clear.
237-243
: <run_scripts>#!/bin/bash # Find any literal zero assignments to MaxConcurrentReconciles rg -nP 'MaxConcurrentReconciles\s*:\s*0' -n # Spot-check all controllerconfig.Options literals with MaxConcurrentReconciles for their values rg -nP 'controllerconfig\.Options\s*{[^}]*MaxConcurrentReconciles\s*:' -n
Add support for limiting concurrent stack reconciliations to prevent cluster overload. This feature uses the native controller-runtime MaxConcurrentReconciles mechanism. Key changes: - Add MaxConcurrentReconciles support in core reconciler - Create GetStackConcurrency() to read STACK_MAX_CONCURRENT env var - Configure Helm chart with stackMaxConcurrent parameter (default: 5) - Add comprehensive Helm documentation in STACK_CONCURRENCY.md - Minor formatting fixes in unrelated files The default limit of 5 concurrent reconciliations provides a good balance for most clusters. Can be configured via Helm values or set to 0 for unlimited concurrency.
Extend the concurrency control from Stack-only to ALL reconcilers (Stacks, Modules, and Resources) to truly prevent "big bang" deployments. Key changes: - Move concurrency config to core package (internal/core/concurrency.go) - Apply MaxConcurrentReconciles to all Module reconcilers: * Analytics, Auth, Gateway, Ledger, Orchestration * Payments, Reconciliation, Search, Stargate * Wallets, Webhooks - Rename env var: STACK_MAX_CONCURRENT → MAX_CONCURRENT_RECONCILES - Rename Helm param: stackMaxConcurrent → maxConcurrentReconciles - Update documentation to reflect global scope This now controls: - Stack reconciliations (namespace creation, config updates) - Module reconciliations (Ledger, Payments, etc. deployments) - Resource reconciliations (Database, Broker management) Default remains 5 concurrent reconciliations for all resource types.
- Translate French comment to English in reconciler.go - Clarify value 0 behavior in documentation (uses controller-runtime default, not unlimited) - Add language specifiers to Markdown code blocks - Recommend value 1000 for near-unlimited concurrency instead of 0
Fixes CodeRabbit review feedback where value 0 should mean "unlimited" instead of falling back to controller-runtime default. Changes: - Map value 0 to 1000 (effectively unlimited) in GetMaxConcurrentReconciles() - Change condition from > 0 to >= 0 to always apply the option - Update documentation to clarify 0 = unlimited behavior - Update Helm values.yaml comment This ensures that setting MAX_CONCURRENT_RECONCILES=0 provides true unlimited concurrency as documented, rather than defaulting to 1.
a25fde3
to
9afbfac
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
internal/core/reconciler.go (1)
237-242
: Consider using> 0
for cleaner semantics.The current logic with
>= 0
explicitly applies the option even whenMaxConcurrentReconciles
is 0. Since controller-runtime treatsMaxConcurrentReconciles: 0
the same as omitting the option entirely (both default to 1), using> 0
would be semantically cleaner: only apply the option when you want to override the default with a specific positive value. This treats 0 and negative values consistently as "use default."Apply this diff for slightly cleaner semantics:
- // Apply MaxConcurrentReconciles if specified (always apply when >= 0) - if options.MaxConcurrentReconciles >= 0 { + // Apply MaxConcurrentReconciles if a positive value is specified + if options.MaxConcurrentReconciles > 0 { b = b.WithOptions(controllerconfig.Options{ MaxConcurrentReconciles: options.MaxConcurrentReconciles, }) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (2)
helm/operator/templates/deployment.yaml
is excluded by!**/*.yaml
helm/operator/values.yaml
is excluded by!**/*.yaml
📒 Files selected for processing (18)
api/formance.com/v1beta1/benthos_types.go
(1 hunks)helm/operator/STACK_CONCURRENCY.md
(1 hunks)internal/core/concurrency.go
(1 hunks)internal/core/reconciler.go
(4 hunks)internal/resources/auths/init.go
(1 hunks)internal/resources/gateways/init.go
(1 hunks)internal/resources/ledgers/init.go
(1 hunks)internal/resources/orchestrations/deployments.go
(1 hunks)internal/resources/orchestrations/init.go
(1 hunks)internal/resources/payments/deployments.go
(1 hunks)internal/resources/payments/init.go
(1 hunks)internal/resources/reconciliations/init.go
(1 hunks)internal/resources/searches/init.go
(1 hunks)internal/resources/stacks/init.go
(1 hunks)internal/resources/stargates/init.go
(1 hunks)internal/resources/wallets/init.go
(1 hunks)internal/resources/webhooks/init.go
(1 hunks)pkg/client/formance.com/v1beta1/databases.go
(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- pkg/client/formance.com/v1beta1/databases.go
- internal/resources/orchestrations/deployments.go
🚧 Files skipped from review as they are similar to previous changes (9)
- internal/resources/stacks/init.go
- internal/resources/wallets/init.go
- internal/resources/orchestrations/init.go
- internal/resources/searches/init.go
- internal/resources/webhooks/init.go
- internal/resources/gateways/init.go
- api/formance.com/v1beta1/benthos_types.go
- internal/resources/auths/init.go
- internal/core/concurrency.go
🔇 Additional comments (8)
internal/resources/reconciliations/init.go (1)
80-80
: LGTM!The addition of
WithMaxConcurrentReconciles
correctly applies the configurable concurrency limit to the Reconciliation module reconciler, consistent with the pattern applied across all 11 module reconcilers in this PR.internal/resources/payments/init.go (1)
129-129
: LGTM! Concurrency control correctly implemented.The addition of
WithMaxConcurrentReconciles[*v1beta1.Payments](GetMaxConcurrentReconciles())
correctly implements the configurable concurrency limit for the Payments module reconciler as described in the PR objectives. The generic type parameter matches the module type, and the placement as the first option is appropriate for a configuration setting.internal/resources/ledgers/init.go (1)
153-153
: LGTM! Concurrency limit integration looks correct.The addition of
WithMaxConcurrentReconciles[*v1beta1.Ledger](GetMaxConcurrentReconciles())
is consistent with the pattern applied across all module reconcilers in this PR. The generic type parameter matches the reconciler type, and the placement within the options is appropriate.internal/resources/stargates/init.go (1)
40-40
: LGTM! Concurrency control correctly applied.The addition of
WithMaxConcurrentReconciles[*v1beta1.Stargate](GetMaxConcurrentReconciles())
properly implements configurable concurrency limiting for Stargate reconciliations, aligning with the PR's objective to prevent cluster overload.To confirm consistency across all 11 module reconcilers mentioned in the PR objectives, run:
internal/core/reconciler.go (3)
14-14
: LGTM! Clear import alias.The
controllerconfig
alias clearly distinguishes the controller-runtime controller package and avoids potential naming conflicts.
65-69
: LGTM! Struct field addition is clean.The
MaxConcurrentReconciles
field is appropriately typed and positioned. The changes integrate well with the existing struct definition.
86-90
: LGTM! Follows the functional options pattern.The function signature and implementation are consistent with the existing
With*
option setters in this file.internal/resources/payments/deployments.go (1)
293-293
: LGTM!Removing the trailing comma from the parameter list is a valid style improvement that aligns with common Go conventions.
- **Default value: `5`** (good balance for most clusters) | ||
- Set to `0` for unlimited concurrency (mapped to 1000 internally) | ||
- Any positive value (1-999) sets that exact limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarify the supported value range.
Controller-runtime accepts any positive integer here; we don't clamp at 999. Saying “1-999” understates what works (e.g., 1000, 2000). Please adjust the wording to note that any positive integer sets that exact limit.
🤖 Prompt for AI Agents
In helm/operator/STACK_CONCURRENCY.md around lines 27 to 29, the doc currently
states the concurrency range as "1-999" which is inaccurate; update the wording
to state that any positive integer is accepted and that the provided value sets
the exact concurrency limit (do not imply a 999 clamp). Replace the
bounded-range text with a sentence such as "Any positive integer; the controller
will use the exact value you provide" and optionally note examples like 1000 or
2000 to clarify.
This setting limits **all types** of reconciliations: | ||
- **Stack reconciliations**: Namespace creation, configuration updates | ||
- **Module reconciliations**: Ledger, Payments, Wallets, Gateway, etc. deployments | ||
- **Resource reconciliations**: Database, Broker, BrokerTopic management | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Align scope with the actual implementation.
The current code wires the max concurrency into stack and module reconcilers, but per this PR’s scope it does not yet cover every resource reconciler. Documenting “all types” (including resource reconcilers) over-promises behavior. Please update this section to match today’s coverage or defer the claim until resource reconcilers are actually limited.
Based on PR objectives
🤖 Prompt for AI Agents
In helm/operator/STACK_CONCURRENCY.md around lines 246 to 250, the documentation
overstates coverage by claiming concurrency limits apply to "all types"
including resource reconcilers, but the implementation currently only wires max
concurrency into stack and module reconcilers; update the text to accurately
reflect present behavior (limit applies to stack and module reconcilers only) or
remove/soften the "all types" claim and add a note indicating resource
reconcilers are not yet covered and will be addressed in a future change; keep
wording concise and aligned with PR scope.
Summary
This PR adds configurable concurrency control to all reconcilers (Stacks, Modules, and Resources) to prevent cluster overload and manage deployment pace effectively.
Evolution of the Implementation
Initial Implementation (First Commit)
Extended Implementation (Second Commit)
Code Quality Fixes (Third Commit)
0
behavior (uses controller-runtime default of 1, not unlimited)1000
for near-unlimited concurrencyChanges
Core Infrastructure
internal/core/concurrency.go
- Centralized concurrency configurationGetMaxConcurrentReconciles()
readsMAX_CONCURRENT_RECONCILES
env varWithMaxConcurrentReconciles[T]()
Module Updates
Updated all 11 module reconcilers to apply concurrency limit:
Helm Integration
operator.maxConcurrentReconciles
(default:5
)MAX_CONCURRENT_RECONCILES
helm/operator/STACK_CONCURRENCY.md
Configuration
Via Helm Values
Via Command Line
Behavior Notes
5
(good balance for most clusters)0
: Uses controller-runtime default (typically 1 concurrent reconciliation)1000
Recommended Values
How It Works
MAX_CONCURRENT_RECONCILES
environment variableGetMaxConcurrentReconciles()
Testing
Compilation
go build -o bin/manager ./cmd/main.go # ✅ Success
Runtime Verification
Breaking Changes
operator.stackMaxConcurrent
but this has been updated tooperator.maxConcurrentReconciles
Users should use the new parameter name in their values files.
What This Controls
This setting limits all Module reconciliations:
This prevents "big bang" deployments where all resources are processed simultaneously, giving you control over deployment pace.
Benefits
✅ True concurrency control: Limits all module deployments, not just Stacks
✅ Prevents cluster overload: Controlled deployment pace
✅ Configurable per environment: Different values for dev/staging/prod
✅ No recompilation needed: Runtime configuration via Helm
✅ Native Kubernetes: Uses controller-runtime's built-in mechanism
✅ Code quality: All comments in English, clear documentation