Conversation

maru-ava
Contributor

@maru-ava maru-ava commented Aug 1, 2025

Why this should be merged

Our current monitoring stack was only ever intended to be a stop-gap until a managed alternative became available. Grafana Cloud is that alternative.

How this was tested

  • CI jobs are able to verify collection of metrics and logs
  • Local monitored process testing
  • Local monitored kube testing

Need to be documented in RELEASES.md?

N/A

@maru-ava maru-ava self-assigned this Aug 1, 2025
@maru-ava maru-ava added the ci (This focuses on changes to the CI process) and tooling (Build, test and development tooling) labels Aug 1, 2025
@maru-ava maru-ava force-pushed the ci-grafana-cloud branch 3 times, most recently from fbaf095 to 0ef4d40 Compare August 1, 2025 05:54
@maru-ava maru-ava moved this to In Progress 🏗️ in avalanchego Aug 1, 2025
@maru-ava maru-ava force-pushed the ci-grafana-cloud branch 9 times, most recently from 4fe235f to 74e3cfe Compare August 4, 2025 18:09
@maru-ava maru-ava moved this from In Progress 🏗️ to Ready 🚦 in avalanchego Aug 5, 2025
@maru-ava maru-ava marked this pull request as ready for review August 5, 2025 03:09
@Copilot Copilot AI review requested due to automatic review settings August 5, 2025 03:09
@maru-ava maru-ava requested a review from aaronbuchwald as a code owner August 5, 2025 03:09
@maru-ava maru-ava marked this pull request as draft August 5, 2025 03:10
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR migrates the monitoring infrastructure from a self-hosted solution to Grafana Cloud, replacing the previous "poc" endpoints with Grafana Cloud's managed services.

  • Updates all hardcoded URLs from self-hosted monitoring instances to Grafana Cloud endpoints
  • Adds URL configuration support for both Prometheus and Loki collectors
  • Updates secret management to include URLs alongside credentials

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/reexecute/c/README.md | Updates Grafana URLs and dashboard IDs in documentation |
| tests/fixture/tmpnet/yaml/promtail-daemonset.yaml | Adds Loki URL configuration and updates secret name |
| tests/fixture/tmpnet/yaml/prometheus-agent.yaml | Adds Prometheus URL configuration and updates secret name |
| tests/fixture/tmpnet/network.go | Updates Grafana URI and dashboard ID constants |
| tests/fixture/tmpnet/monitor_processes.go | Refactors collector configuration to include URLs and removes hardcoded endpoints |
| tests/fixture/tmpnet/monitor_kube.go | Updates Kubernetes secret creation to include URL configuration |
| tests/fixture/tmpnet/check_monitoring.go | Removes URL validation and adds Prometheus path suffix for Grafana Cloud |
| tests/fixture/tmpnet/README.md | Updates documentation with new URL requirements and endpoints |
| tests/e2e/README.md | Updates documentation with new configuration requirements |
| scripts/configure-local-metrics-collection.sh | Updates Grafana link generation |
| .github/workflows/*.yml | Updates all workflow files to use new URL-based configuration |
| .github/actions/run-monitored-tmpnet-cmd/action.yml | Adds URL inputs and updates default dashboard ID |

Comments suppressed due to low confidence (1)

tests/fixture/tmpnet/monitor_processes.go:464

  • Error message is missing 'env' prefix for consistency with other error messages in the same function. Should be 'env var not set' to match the pattern used for other environment variables.
		return "", "", "", fmt.Errorf("%s env var not set", passwordEnvVar)

@maru-ava maru-ava force-pushed the ci-grafana-cloud branch 5 times, most recently from 8d9afbc to d8ffaf6 Compare August 7, 2025 04:58
@maru-ava maru-ava marked this pull request as ready for review August 7, 2025 04:58
@maru-ava maru-ava requested a review from joshua-kim as a code owner August 7, 2025 04:58
Contributor

@joshua-kim joshua-kim left a comment

LGTM pending a fix for the rate-limiting.

@joshua-kim joshua-kim moved this from Ready 🚦 to In Progress 🏗️ in avalanchego Aug 8, 2025
@maru-ava maru-ava force-pushed the ci-grafana-cloud branch 2 times, most recently from 664ea18 to 69cf999 Compare August 25, 2025 15:08
@StephenButtolph StephenButtolph moved this from In Progress 🏗️ to In Review 🔎 in avalanchego Aug 25, 2025
Collaborator

@aaronbuchwald aaronbuchwald left a comment

LGTM - left one question on a comment I did not understand

Labels
ci (This focuses on changes to the CI process), tooling (Build, test and development tooling)
Projects
Status: In Review 🔎
4 participants