ClickHouse moving average











Input: ClickHouse

Table A
business_dttm (datetime)
amount (float)



I need to calculate a moving sum over the last 15 minutes (or over the last 3 records) for each business_dttm.



For example:

amount  business_dttm        moving sum
0.3     2018-11-19 13:00:00
0.3     2018-11-19 13:05:00
0.4     2018-11-19 13:10:00  1
0.5     2018-11-19 13:15:00  1.2
0.6     2018-11-19 13:15:00  1.5
0.7     2018-11-19 13:20:00  1.8
0.8     2018-11-19 13:25:00  2.1
0.9     2018-11-19 13:25:00  2.4
0.5     2018-11-19 13:30:00  2.2


Unfortunately, ClickHouse has neither window functions nor joins on non-equality conditions.



How can I do this without a cross join and a WHERE condition?










Tags: moving-average, clickhouse






asked Nov 21 at 16:29 by Vsevolod Lukovsky

1 Answer
If the window size is small and fixed, you can do something like this:



SELECT
    sum(window.2) AS amount,
    max(dttm) AS business_dttm,
    sum(amt) AS moving_sum
FROM
(
    SELECT
        -- Emit each row three times, keyed by its own row number and the next two;
        -- only the first copy carries the row's amount in window.2.
        arrayJoin([(rowNumberInAllBlocks(), amount), (rowNumberInAllBlocks() + 1, 0), (rowNumberInAllBlocks() + 2, 0)]) AS window,
        amount AS amt,
        business_dttm AS dttm
    FROM
    (
        SELECT
            amount,
            business_dttm
        FROM A
        ORDER BY business_dttm
    )
)
GROUP BY window.1
-- Keep only complete windows of three rows.
HAVING count() = 3
ORDER BY window.1;


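The first two rows are missing from the result: their windows hold fewer than three rows, so HAVING count() = 3 filters them out (ClickHouse doesn't collapse such aggregates into NULL). You can prepend them later.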
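
To see why the arrayJoin query produces a moving sum, here is a minimal Python sketch of the same fan-out idea. It is an illustration only, not ClickHouse code: the function name moving_sum_fanout and the in-memory list are assumptions, and the amounts are the sample values from the question.

```python
from collections import defaultdict


def moving_sum_fanout(amounts, window_size=3):
    """Fan each row out into every window it belongs to, then group by window."""
    totals = defaultdict(float)  # window key -> sum of contributing amounts
    counts = defaultdict(int)    # window key -> number of contributing rows
    for n, amount in enumerate(amounts):
        # Row n contributes to the windows ending at rows n, n + 1, ..., n + window_size - 1.
        for k in range(n, n + window_size):
            totals[k] += amount
            counts[k] += 1
    # Keep only complete windows, mirroring HAVING count() = 3.
    return {k: totals[k] for k in sorted(totals) if counts[k] == window_size}


amounts = [0.3, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.5]
result = moving_sum_fanout(amounts)
for row_number, total in result.items():
    print(row_number, round(total, 10))
```

Windows keyed by the first two rows (too few predecessors) and by the two keys past the last row (too few successors) are incomplete, which is why the count filter is needed. The work grows with rows times window size, so this only suits small windows.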



Update:

It's still possible to compute a moving sum for arbitrary window sizes. Set window_size as needed (3 in this example).



-- Note: rowNumberInAllBlocks() gives incorrect results if declared inside the WITH block, because it is stateful.
WITH
    (
        SELECT arrayCumSum(groupArray(amount))
        FROM
        (
            SELECT
                amount
            FROM A
            ORDER BY business_dttm
        )
    ) AS arr,
    3 AS window_size
SELECT
    amount,
    business_dttm,
    -- Moving sum = difference of two cumulative sums. For the first full window the
    -- subtracted index is 0; this relies on ClickHouse returning 0 for an out-of-range array index.
    if(rowNumberInAllBlocks() + 1 < window_size, NULL, arr[rowNumberInAllBlocks() + 1] - arr[rowNumberInAllBlocks() + 1 - window_size]) AS moving_sum
FROM
(
    SELECT
        amount,
        business_dttm
    FROM A
    ORDER BY business_dttm
)


Or this variant:



SELECT
    amount,
    business_dttm,
    moving_sum
FROM
(
    WITH 3 AS window_size
    SELECT
        groupArray(amount) AS amount_arr,
        groupArray(business_dttm) AS business_dttm_arr,
        arrayCumSum(amount_arr) AS amount_cum_arr,
        arrayMap(i -> if(i < window_size, NULL, amount_cum_arr[i] - amount_cum_arr[(i - window_size)]), arrayEnumerate(amount_cum_arr)) AS moving_sum_arr
    FROM
    (
        SELECT *
        FROM A
        ORDER BY business_dttm ASC
    )
)
ARRAY JOIN
    amount_arr AS amount,
    business_dttm_arr AS business_dttm,
    moving_sum_arr AS moving_sum


Fair warning: both approaches are far from optimal, but they show the unique power of ClickHouse beyond standard SQL.
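
Both arrayCumSum queries rest on one identity: the sum of the last window_size rows equals the cumulative sum at the current row minus the cumulative sum window_size rows earlier. The following Python sketch shows that arithmetic on the sample data from the question. The function name moving_sum_prefix is hypothetical, and the explicit leading 0.0 stands in for the value the queries read at index 0, assuming ClickHouse returns 0 for that out-of-range index.

```python
from itertools import accumulate


def moving_sum_prefix(amounts, window_size=3):
    """Moving sum computed as a difference of two cumulative sums."""
    # cum[i] is the sum of the first i amounts; cum[0] is 0.0 by construction.
    cum = [0.0] + list(accumulate(amounts))
    sums = []
    for i in range(1, len(amounts) + 1):  # 1-based row number, as in the SQL
        if i < window_size:
            sums.append(None)  # incomplete window, NULL in the SQL
        else:
            sums.append(cum[i] - cum[i - window_size])
    return sums


amounts = [0.3, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.5]
sums = moving_sum_prefix(amounts)
for amount, total in zip(amounts, sums):
    print(amount, None if total is None else round(total, 10))
```

The cost is one pass over the data regardless of window size, which is why this form copes with a window of thousands of rows. The trade-off, in the queries as in this sketch, is that the whole column is held in a single array.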






answered Nov 22 at 9:33 by Amos (edited Nov 22 at 18:46)












• Unfortunately, the window size is ~10000 rows.
  – Vsevolod Lukovsky, Nov 22 at 12:37










• Thanks for the answer, but one thing: I was asking about a moving sum, not a cumulative sum. Is it possible to do this for a moving sum?
  – Vsevolod Lukovsky, Nov 23 at 21:49












• Sorry! I just tried it, and it works :)
  – Vsevolod Lukovsky, Nov 23 at 22:07










• @VsevolodLukovsky, please accept the answer if it solves your problem.
  – Amos, Nov 24 at 4:09

















