ClickHouse moving average
Input: a ClickHouse table A with two columns:

    business_dttm (datetime)
    amount (float)

I need to calculate a moving sum over the last 15 minutes (or over the last 3 records) for each business_dttm.

For example (the moving sum here covers the last 3 records):

    amount  business_dttm        moving sum
    0.3     2018-11-19 13:00:00
    0.3     2018-11-19 13:05:00
    0.4     2018-11-19 13:10:00  1
    0.5     2018-11-19 13:15:00  1.2
    0.6     2018-11-19 13:15:00  1.5
    0.7     2018-11-19 13:20:00  1.8
    0.8     2018-11-19 13:25:00  2.1
    0.9     2018-11-19 13:25:00  2.4
    0.5     2018-11-19 13:30:00  2.2

Unfortunately, ClickHouse has no window functions and does not support joins on non-equality conditions.
How can I do this without a cross join and a WHERE condition?
moving-average clickhouse
asked Nov 21 at 16:29
Vsevolod Lukovsky
1 Answer
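If the window size is small and fixed, you can do something like this:

    SELECT
        sum(window.2) AS amount,        -- only the row's own tuple carries a non-zero amount
        max(dttm) AS business_dttm,     -- latest timestamp in the window, i.e. the current row's
        sum(amt) AS moving_sum          -- current row plus the two rows before it
    FROM
    (
        SELECT
            -- emit each row three times, keyed by its own row number and the next two
            arrayJoin([(rowNumberInAllBlocks(), amount), (rowNumberInAllBlocks() + 1, 0), (rowNumberInAllBlocks() + 2, 0)]) AS window,
            amount AS amt,
            business_dttm AS dttm
        FROM
        (
            SELECT
                amount,
                business_dttm
            FROM A
            ORDER BY business_dttm
        )
    )
    GROUP BY window.1
    HAVING count() = 3                  -- keep only complete windows
    ORDER BY window.1;

The first two rows are dropped (by HAVING count() = 3) because ClickHouse does not turn the aggregates of incomplete windows into NULL. You can prepend them afterwards.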
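To make the fan-out idea easier to follow outside ClickHouse, here is a minimal Python sketch of the same logic. It is not part of the original answer: the function name moving_sum_fanout is hypothetical, row numbers are assumed to start at 0, and the amounts are the sample values from the question.

```python
from collections import defaultdict


def moving_sum_fanout(amounts, window_size=3):
    """Fan each row out to every window it belongs to, then group by window key.

    Hypothetical helper that mirrors the arrayJoin + GROUP BY query above.
    """
    groups = defaultdict(list)
    for n, amount in enumerate(amounts):
        # Row n contributes to the windows ending at rows n, n+1, ..., n+window_size-1.
        for offset in range(window_size):
            groups[n + offset].append(amount)
    # Keep only complete windows, like HAVING count() = 3 in the query.
    return {
        key: sum(values)
        for key, values in sorted(groups.items())
        if len(values) == window_size
    }


# Sample amounts from the question, already ordered by business_dttm.
amounts = [0.3, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.5]
result = moving_sum_fanout(amounts)
print({key: round(value, 1) for key, value in result.items()})
```

Rounded to one decimal, the values match the moving sum column in the question. The windows for the first two rows, and the two partial windows past the last row, are filtered out. Every row is copied window_size times, which is why this approach only suits small windows.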
Update:

It is still possible to compute a moving sum for arbitrary window sizes. Set window_size as needed (3 in this example).

    -- Note: rowNumberInAllBlocks() is stateful, so it gives incorrect results if declared inside the WITH block
    WITH
        (
            -- cumulative sums of amount in business_dttm order, as a single array
            SELECT arrayCumSum(groupArray(amount))
            FROM
            (
                SELECT
                    amount
                FROM A
                ORDER BY business_dttm
            )
        ) AS arr,
        3 AS window_size
    SELECT
        amount,
        business_dttm,
        -- moving sum = difference of two cumulative sums; NULL until the window is full
        if(rowNumberInAllBlocks() + 1 < window_size, NULL, arr[rowNumberInAllBlocks() + 1] - arr[rowNumberInAllBlocks() + 1 - window_size]) AS moving_sum
    FROM
    (
        SELECT
            amount,
            business_dttm
        FROM A
        ORDER BY business_dttm
    )
Or this variant, which does everything with arrays:

    SELECT
        amount,
        business_dttm,
        moving_sum
    FROM
    (
        WITH 3 AS window_size
        SELECT
            groupArray(amount) AS amount_arr,
            groupArray(business_dttm) AS business_dttm_arr,
            arrayCumSum(amount_arr) AS amount_cum_arr,
            -- i is the 1-based position; NULL until the window is full
            arrayMap(i -> if(i < window_size, NULL, amount_cum_arr[i] - amount_cum_arr[(i - window_size)]), arrayEnumerate(amount_cum_arr)) AS moving_sum_arr
        FROM
        (
            SELECT *
            FROM A
            ORDER BY business_dttm ASC
        )
    )
    ARRAY JOIN
        amount_arr AS amount,
        business_dttm_arr AS business_dttm,
        moving_sum_arr AS moving_sum
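Both cumulative-sum queries rest on the same identity: the sum of the last window_size values equals the cumulative sum at the current position minus the cumulative sum window_size positions earlier. A minimal Python sketch of that identity follows. It is not part of the original answer: the function name moving_sum_prefix is hypothetical, and a leading 0.0 is prepended so that position 0 is a valid index.

```python
from itertools import accumulate


def moving_sum_prefix(amounts, window_size=3):
    """Moving sum via differences of cumulative sums; None until the window is full.

    Hypothetical helper that mirrors the arrayCumSum-based queries above.
    """
    # cum[i] is the sum of the first i amounts, so cum[0] == 0.0.
    cum = [0.0] + list(accumulate(amounts))
    return [
        None if i < window_size else cum[i] - cum[i - window_size]
        for i in range(1, len(amounts) + 1)
    ]


# Sample amounts from the question, already ordered by business_dttm.
amounts = [0.3, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.5]
sums = moving_sum_prefix(amounts)
print([None if s is None else round(s, 1) for s in sums])
```

The cost per row does not depend on window_size, which is what makes this form usable for large windows. One caveat: subtracting cumulative floating-point sums can accumulate rounding error over long series, so the results here are rounded before comparison.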
Fair warning: both approaches are far from optimal, but they show some of ClickHouse's unique capabilities beyond standard SQL.
Unfortunately, my window size is about 10,000 rows.
– Vsevolod Lukovsky
Nov 22 at 12:37
Thanks for the answer, but one thing: I was asking about a moving sum, not a cumulative sum. Is it possible to do this for a moving sum?
– Vsevolod Lukovsky
Nov 23 at 21:49
Sorry! I have just tried it, and it works :)
– Vsevolod Lukovsky
Nov 23 at 22:07
@VsevolodLukovsky please accept the answer if it solves your problem
– Amos
Nov 24 at 4:09
edited Nov 22 at 18:46
answered Nov 22 at 9:33
Amos