In Elasticsearch how do I get an average of time differences of documents for a specific user?



























Let's say a single Elasticsearch document may look like this:



{
  "created": "2018-11-26T22:20:01+00:00",
  "user_id": 2,
  "text": "Test!",
  "verb": "comment_posted",
  "thread_id": 1
}


I would like to keep only documents whose verb is "comment_posted", then, for each user, get the average time between each of their comments and the previous comment in the same thread (based on the created field).



Here's an example dataset and expected results:




  • User 'A' posts on thread '1' (starting the thread) at 1:30


  • User 'B' posts on thread '2' (starting the thread) at 1:45


  • User 'A' posts on thread '2' at 2:00


  • User 'B' posts on thread '1' at 3:30


  • User 'B' posts on thread '1' at 4:30


  • User 'A' posts on thread '1' at 5:15



User 'A' would have an average of 30 minutes (2:00 - 1:45 = 15 and 5:15 - 4:30 = 45), and user 'B' would have an average of 90 minutes (3:30 - 1:30 = 120 and 4:30 - 3:30 = 60).
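As a sanity check on the expected numbers, here is a minimal Python sketch (not an Elasticsearch query) that computes, per user, the average gap between each comment and the previous comment in the same thread. The tuple layout and the function name average_gap_per_user are made up for illustration, and times are written as minutes since midnight. Note that averaging the two gaps listed for user 'B' (120 and 60 minutes) gives 90 minutes.

```python
from collections import defaultdict
from statistics import mean

# (user, thread, minutes since midnight) for the six example comments
comments = [
    ("A", 1, 1 * 60 + 30),
    ("B", 2, 1 * 60 + 45),
    ("A", 2, 2 * 60 + 0),
    ("B", 1, 3 * 60 + 30),
    ("B", 1, 4 * 60 + 30),
    ("A", 1, 5 * 60 + 15),
]


def average_gap_per_user(comments):
    """Average minutes between each comment and the previous one in its thread."""
    last_seen = {}            # thread -> time of the most recent comment
    gaps = defaultdict(list)  # user -> gaps in minutes
    for user, thread, minute in sorted(comments, key=lambda c: c[2]):
        if thread in last_seen:  # a thread's first comment has no predecessor
            gaps[user].append(minute - last_seen[thread])
        last_seen[thread] = minute
    return {user: mean(values) for user, values in gaps.items()}


averages = average_gap_per_user(comments)
print(averages)
```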



What would my query look like?










  • Welcome to SO! The times in your second-to-last paragraph don't match any of the times mentioned before, which makes it a little harder to understand the result you want.

    – AdrienF
    Nov 27 '18 at 10:13











  • @AdrienF Thank you for pointing that out. I've corrected the times.

    – Aaron Cheever
    Dec 3 '18 at 19:38


















elasticsearch






edited Dec 3 '18 at 19:37







Aaron Cheever

















asked Nov 26 '18 at 23:46









Aaron Cheever

1 Answer
Short answer



It's potentially possible to do this but not recommended.



Long answer



In general, to do something like this you'd need to use an aggregation.



The only aggregation that allows computing deltas is the Serial Differencing Aggregation. However, it's meant to be used in the context of a histogram or date histogram aggregation.



In your case, to get a single comment per histogram bucket you could create a date histogram with very small buckets (e.g. a subdivision of a second), then use serial differencing to get the time deltas.
However, as mentioned in this answer on the Elasticsearch forum, this would be terrible performance-wise.



So the answer here is that you need to compute those deltas at index-time, or using data from another store if you have one (this would be very easy to compute in Postgres for example).
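To make the index-time option concrete, here is a minimal Python sketch. It assumes the indexing code can look up the created timestamp of the previous comment in the same thread before sending the document. The field name seconds_since_previous, the helper with_gap and the aggregation names per_user and average_gap are hypothetical, and the term filter assumes verb is mapped as a keyword field. Once the gap is stored on each document, the per-user average becomes a plain terms aggregation with an avg sub-aggregation.

```python
from datetime import datetime


def with_gap(doc, previous_created):
    """Return a copy of doc that also stores the gap to the previous comment.

    previous_created is the ISO 8601 timestamp of the latest earlier comment
    in the same thread, or None for the comment that starts the thread.
    """
    enriched = dict(doc)
    if previous_created is not None:
        current = datetime.fromisoformat(doc["created"])
        previous = datetime.fromisoformat(previous_created)
        # Hypothetical field name, chosen for this example only.
        enriched["seconds_since_previous"] = (current - previous).total_seconds()
    return enriched


# Search request body: average the stored gap for each user.
average_gap_request = {
    "size": 0,
    "query": {"term": {"verb": "comment_posted"}},
    "aggs": {
        "per_user": {
            "terms": {"field": "user_id"},
            "aggs": {
                "average_gap": {"avg": {"field": "seconds_since_previous"}}
            },
        }
    },
}

doc = {
    "created": "2018-11-26T22:20:01+00:00",
    "user_id": 2,
    "text": "Test!",
    "verb": "comment_posted",
    "thread_id": 1,
}
enriched = with_gap(doc, "2018-11-26T22:05:01+00:00")
print(enriched["seconds_since_previous"])
```

Thread-starting comments are left without the extra field, and the avg aggregation ignores documents that lack the field by default, so they should not skew the average.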






  • Thank you for your recommendation. I thought this might be a bit too much for aggregations but I wasn't sure if I was just missing some obvious functionality I didn't know about. Would it make sense to create another document containing this information, or are you suggesting attaching this info to the new/existing document? I suppose the answer to this question is up to me and my own decision for implementation, but what would you do given the information provided?

    – Aaron Cheever
    Nov 27 '18 at 18:48













  • @AaronCheever It's hard to tell without knowing more. I'd question the need to put this data in Elasticsearch in the first place. ES is meant for search, and the kind of search you can do on a time delta seems quite limited, so you don't need ES for that.

    – AdrienF
    Nov 28 '18 at 17:48













  • In the end the requirements changed, so I didn't need to come up with a solution. Making another document or creating that data at index time seems like the best solution if I still needed to do this.

    – Aaron Cheever
    Dec 3 '18 at 19:41











answered Nov 27 '18 at 10:52









AdrienF
