Get distinct values of metric
In my setup I have a Java component that reads data from the YARN manager and exposes the results of various jobs as metrics. For example, I have a metric for job duration which just holds the duration of the last app run. It may look like this:
duration_time_millis{job="probe",app_name="import-results",app_type="MAPREDUCE",status="SUCCEEDED"}
1991392 @1542770979.823
1991392 @1542770994.823
1991392 @1542771009.823
...
265722 @1542781554.823
265722 @1542781569.823
265722 @1542781584.823
...
The thing is, I am scraping the exposing server every 15s or so, but the jobs run irregularly, once every several hours. That means that over the past 6 hours I am getting 563x the first value and 520x the second value, as there is only one change in the interval.
Is there a way to compute avg or stddev only on distinct values? Getting the number of distinct values would also mean better handling in histograms and heatmaps in Grafana, where count_values does not seem to be a good solution.
Thanks for any help on this!
prometheus prometheus-java
asked Nov 21 at 12:30
Milano Nicolum
You seem to be on the right track with count_values. To get the current number of distinct values for a metric you could use something like count(count_values("hi there stack overflow", up)). I don't think there is currently any PromQL function that would do anything like count_values_over_time, so there is no way that I am aware of to calculate avg or avg_over_time based on unique values. Sorry to break it to ya :(
– wbh1 Nov 21 at 15:41
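To make the semantics concrete, here is a small Python sketch (not PromQL itself) of what count(count_values(...)) works out at a single evaluation instant. The helper names, the dict-based model of an instant vector, and the extra app names are all hypothetical and only serve the illustration.

```python
from collections import Counter

def count_values(instant_vector):
    """Mimic PromQL count_values: map each distinct sample value to the
    number of series that currently hold that value."""
    return Counter(instant_vector.values())

def count_distinct(instant_vector):
    """Mimic count(count_values(...)): how many distinct values exist right now."""
    return len(count_values(instant_vector))

# Hypothetical instant vectors: one current value per series.
many_series = {
    'app_name="import-results"': 265722.0,
    'app_name="export-results"': 265722.0,  # made-up series
    'app_name="cleanup"': 1991392.0,        # made-up series
}
single_series = {'app_name="import-results"': 265722.0}

print(count_distinct(many_series))    # 2
print(count_distinct(single_series))  # 1
```

The grouping happens across series at one point in time, not along the history of one series, so a single series can only ever yield 1.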
What a pity. If I check only one time series, count_values always returns 1, as there is only one value at a time. And since there is no such function working with a range vector, I cannot get much useful data for a selected interval. Though I am a bit surprised there is no workaround, at least for such a simple query.
– Milano Nicolum Nov 22 at 8:51
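One possible way around this is to deduplicate outside Prometheus. The following is only a sketch under stated assumptions, not a PromQL solution: it assumes the raw samples have already been fetched somehow into a time-ordered list of (timestamp, value) pairs, and collapse_runs is a hypothetical helper name. It rebuilds the shape of the data from the question (563 scrapes of the first value, then 520 of the second) and compares statistics over all samples with statistics over one value per run.

```python
from itertools import groupby
from statistics import mean, pstdev

def collapse_runs(samples):
    """Keep one value per run of consecutive equal samples.

    `samples` is a time-ordered list of (timestamp, value) pairs.
    Assumption: two consecutive job runs never report exactly the same
    duration; if they do, they are merged into a single run here.
    """
    return [value for value, _ in groupby(v for _, v in samples)]

# Shape of the data from the question: scrapes 15 s apart.
start = 1542770979.823
values = [1991392] * 563 + [265722] * 520
samples = [(start + 15 * i, v) for i, v in enumerate(values)]

runs = collapse_runs(samples)
print(runs)  # [1991392, 265722]

# Statistics per job run (pstdev is the population standard deviation,
# which is also what PromQL's stddev computes).
print(mean(runs), pstdev(runs))

# Statistics over every scraped sample, weighted by how long each value was exposed.
all_values = [v for _, v in samples]
print(mean(all_values), pstdev(all_values))
```

The per-run mean is (1991392 + 265722) / 2 = 1128557, while the mean over all samples is pulled toward the first value, because it was scraped 563 times against 520. The caveat in the docstring matters: identical consecutive durations cannot be told apart from a single run using the values alone.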