Skip to content
This repository was archived by the owner on May 25, 2021. It is now read-only.

Add new metric (histogram) to track number of reads using _bulk_get #118

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

brkolla
Copy link

@brkolla brkolla commented May 3, 2016

Add a new counter "bulk_request_docs" to track the number of documents being handled as part of the bulk request.

FogBugz 64560.

@kxepal
Copy link
Member

kxepal commented May 4, 2016

Why? When you have a line above: couch_stats:update_histogram([couchdb, httpd, bulk_docs], length(DocsArray)) which is more informative than just constantly incremented counter.

@brkolla
Copy link
Author

brkolla commented May 4, 2016

@kxepal ? I have asked the same question in 64560 and discussed it with Aaron. As per my understanding, counter gives a different perspective and probably easy to identify the spikes and trends? Can you add a comment to Aaron/Russel/Ops in 64560 and get his perspective on why this metric would help Operations?

@kxepal
Copy link
Member

kxepal commented May 4, 2016

@brkolla Sorry, I can't since I'm not from Cloudant team. Can you ping them here or share the reason why? because I'm -1 on having two metrics that does the same thing while one of them (histogram) is actually what is really useful.

@brkolla
Copy link
Author

brkolla commented May 4, 2016

I am actually asking Aaron about this in Slack. To be honest, I don't fully understand the difference between the histogram and the regular counter and if its possible to have one metric that does the both. Will ask around and find out more.

@kxepal
Copy link
Member

kxepal commented May 4, 2016

@brkolla Histogram uses sliding time window and provides percentile-wise information about the recorded numbers. Just compare output of both these metrics to see the difference.

@chewbranca
Copy link
Contributor

Yeah I agree with @kxepal on this, using a histogram is the correct choice here. It looks like there was also some confusion on this issue as @rnewson snuck in the actual fix for this issue back in cloudant@a49c474.

I do think it's worthwhile to also track the doc count in _bulk_get and that should be switched to a histogram like the above commit. The big question there is whether to use a different histogram to be able to distinguish between _bulk_get reads and _bulk_docs writes. I think having them separate would be reasonable.

Also, we should get similar stats for quantify of docs returned and provided in _all_docs and views and what not.

@kxepal
Copy link
Member

kxepal commented May 4, 2016

Oh, it also counts _bulk_get. I miss that moment, sorry. But that should be a different metric and should have a correct name anyway.

@brkolla
Copy link
Author

brkolla commented May 4, 2016

I think I have to revisit this and come up with a new metric to track the bulk gets and probably use one metric that tracks all the request that would result in bulk gets (_all_docs etc..) . We are still discussing on how we can do that.
I guess I will have to close this PR as we are not going to go ahead with this.

@chewbranca
Copy link
Contributor

Yeah agreed, _bulk_get is sufficiently different we should track it separately.

@brkolla brkolla force-pushed the 64560-add-metrics-bulk-docs-count branch from 374476b to dab6b16 Compare May 5, 2016 13:17
@brkolla brkolla force-pushed the 64560-add-metrics-bulk-docs-count branch from dab6b16 to b9efb96 Compare May 5, 2016 13:22
@brkolla brkolla changed the title 64560 add metrics bulk docs count Add new metric (histogram) to track number of reads using _bulk_get May 5, 2016
@brkolla
Copy link
Author

brkolla commented May 9, 2016

@chewbranca I have updated the code to add a new metric bulk_reads to track the number of doc reads done as part of the bulk_get. Can you review this code?

@@ -448,6 +448,7 @@ db_req(#httpd{method='POST', path_parts=[_, <<"_bulk_get">>]}=Req, Db) ->
undefined ->
throw({bad_request, <<"Missing JSON list of 'docs'.">>});
Docs ->
couch_stats:update_histogram([couchdb, httpd, bulk_reads], length(Docs)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the name is confusing. If we want to count only "/_bulk_get" requests here, better give such name to the metric instead of generic bulk_reads. Otherwise, if you want to use "bulk_reads", should we count requests with include_docs=true to views and all_docs as bulk reads?

@brkolla
Copy link
Author

brkolla commented May 10, 2016

@kxepal I am not really sure, but this seems to be only used during the replication. I guess, we could change the name to bulk_get_docs? @chewbranca?

@kxepal
Copy link
Member

kxepal commented May 10, 2016

@brkolla Not only. Replication is just a client for CouchDB. Other users may be this endpoint in own way (and for profit!).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants