Hive - Select distinct unique IDs per date without creating external tables or using JOINS
I am working on a data set which has the following columns:
unique_ID Date
a 2018_09_08
a 2018_09_18
a 2018_09_28
d 2018_09_08
I want to select the unique_IDs that occur on all three dates, i.e. 2018_09_08, 2018_09_18 and 2018_09_28.
My output should be just 'a'.
There is a long solution to this problem: extract the unique_IDs for each date, create an external table on top of each of the three results, and then join the three tables to get the unique_IDs present on all three dates. I believe there should be a better solution. There are just 3 dates in this case, but the number might grow later, so I am looking for a more general solution.
Here is the query that I have written:
select distinct(unique_ID) from table_name where Date = '2018_09_08' and Date = '2018_09_18' and Date = '2018_09_28'
which returns null.
I also tried to write a sub-query, but I doubt Hive supports sub-queries of this kind. Here is what I have written:
select count(distinct(unique_ID)) from (
(select distinct(unique_ID) from table_name where Date = '2018_09_08') a
union all
(select distinct(unique_ID) from table_name where Date = '2018_09_18') b
union all
(select distinct(unique_ID) from table_name where Date = '2018_09_28') c
);
and I am getting the following parsing error: FAILED: ParseException line 3:0 missing ) at 'union' near ')' line 4:87 missing EOF at 'b' near ')'
How can I get the unique_IDs in this case?
hive bigdata hiveql hadoop2
Have you tried OR instead of AND?
– JARS, Nov 29 '18 at 6:33
OR will give me the distinct unique_IDs for all three days combined. So it will give me both a and d in the above case, whereas I just want a, as it is common to all three dates.
– Rishabh Dixit, Nov 29 '18 at 6:44
asked Nov 29 '18 at 6:30 by Rishabh Dixit, edited Nov 29 '18 at 6:46
1 Answer
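This can be accomplished with group by and having. The where clause keeps only the three dates of interest, and having keeps the IDs that appear on all three of them.
select unique_id, count(distinct date)
from tbl
where date in ('2018_09_08','2018_09_18','2018_09_28')
group by unique_id
having count(distinct date) = 3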
answered Nov 29 '18 at 10:48 by Vamsi Prabhala
I got that yesterday myself and forgot to post the answer. That is exactly what I did! Thank you for the effort, I really appreciate it.
– Rishabh Dixit, Nov 30 '18 at 10:09
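The group by / having approach from the answer can be checked against the sample rows from the question. What follows is only a minimal sketch under stated assumptions: it runs the query on an in-memory SQLite database as a stand-in for Hive, the helper name ids_on_all_dates is made up for illustration, and the Date column is renamed to dt (a hypothetical name), because DATE is listed as a reserved keyword in recent Hive versions and may need backticks there. The date count in the having clause is derived from the list of dates, so the same query should extend to more than three dates.

```python
import sqlite3

# Sample rows from the question: (unique_ID, Date).
ROWS = [
    ("a", "2018_09_08"),
    ("a", "2018_09_18"),
    ("a", "2018_09_28"),
    ("d", "2018_09_08"),
]


def ids_on_all_dates(rows, dates):
    """Return the IDs that occur on every date in `dates` (hypothetical helper)."""
    wanted = sorted(set(dates))  # de-duplicate so the count below is exact
    conn = sqlite3.connect(":memory:")  # SQLite stands in for Hive here
    try:
        # `dt` is an assumed column name replacing `Date` from the question.
        conn.execute("CREATE TABLE tbl (unique_id TEXT, dt TEXT)")
        conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
        placeholders = ", ".join("?" for _ in wanted)
        query = (
            "SELECT unique_id FROM tbl "
            f"WHERE dt IN ({placeholders}) "   # keep only the dates of interest
            "GROUP BY unique_id "
            "HAVING COUNT(DISTINCT dt) = ? "   # ID must appear on all of them
            "ORDER BY unique_id"
        )
        params = (*wanted, len(wanted))
        return [row[0] for row in conn.execute(query, params)]
    finally:
        conn.close()


print(ids_on_all_dates(ROWS, ["2018_09_08", "2018_09_18", "2018_09_28"]))
# prints ['a']
```

The original attempt with Date = ... and Date = ... cannot match any row, since a single row carries only one date. Grouping by ID and counting distinct dates moves the "all three" condition from the row level to the group level.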