Skip to content

Fix another race condition that causes lost metrics #137

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

pwinckles
Copy link
Contributor

@pwinckles pwinckles commented Feb 21, 2025

This fixes another race condition that causes metrics to be lost when they are updated while the writer is stopped and after the monitorable has been re-added to the writer. In this case, the update needs to set the initial value on PcpValueInfo so that the correct value is written when the writer is started.

Additionally, while I was working on this, I noticed that if you have thousands of monitorables, then re-adding them in PcpMonitorBridge.startMonitoring() can take a very long time. For example, on one system I was seeing it take >20 seconds to add 40k monitorables. To help alleviate this problem, I added a cache to the metric name mapper and validator.

Resolves #132

This fixes another race condition that causes metrics to be lost when
they are updated while the writer is stopped and after the monitorable
has been re-added to the writer. In this case, the update needs to set
the initial value on `PcpValueInfo` so that the correct value is
written when the writer is started.
Adds a cache to the `MetricNameMapper` and `MetricNameValidator`. This
improves performance of mapping metric names because a regex no longer
has to be executed over and over again. The validator cache helps
because the metric name no longer has to be encoded over and over
again.

Neither of these caches expire entries, but this is not an issue
because it's currently not possible to remove monitorables from
Parfait.
@pwinckles pwinckles requested a review from tallpsmith February 21, 2025 16:40
@pwinckles
Copy link
Contributor Author

@tallpsmith @natoscott

Any thoughts on a path forward here?

@natoscott
Copy link
Member

@pwinckles it conceptually makes sense to me - I'm guessing retired life is treating @tallpsmith very well and we wont hear from him, so please go ahead and merge when you are happy with your PRs (I don't have the java experience to give you deep and meaningful reviews).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Race condition causes metric updates to be lost
2 participants