Hive - Select distinct unique IDs per date without creating external tables or using JOINS




















I am working on a data set which has the following columns:



unique_ID   Date
a           2018_09_08
a           2018_09_18
a           2018_09_28
d           2018_09_08


I want to select the unique_IDs that occur on all three dates, i.e. 2018_09_08, 2018_09_18 and 2018_09_28.



My output should be just 'a'.



There is a long solution to this problem: extract the unique_IDs for each date, create an external table on top of each of the three results, and then join the three tables to get the unique_IDs common to all three dates. I believe there should be a better solution. We have just 3 dates in this case, but the number might grow later, so I am looking for a more generalized solution.



Here is the query that I have written: select distinct(unique_ID) from table_name where Date = '2018_09_08' and Date = '2018_09_18' and Date = '2018_09_28'. It returns nothing.



I also tried writing a sub-query, but I doubt Hive supports sub-queries of this kind. Here is what I have written:



select count(distinct(unique_ID)) from (
(select distinct(unique_ID) from table_name where Date = '2018_09_08') a
union all
(select distinct(unique_ID) from table_name where Date = '2018_09_18') b
union all
(select distinct(unique_ID) from table_name where Date = '2018_09_28') c
);


and I am getting the following parse error: FAILED: ParseException line 3:0 missing ) at 'union' near ')' line 4:87 missing EOF at 'b' near ')'



How can we get the Unique_IDs in this case?
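A note on the parse error above: it appears to come from the sub-query syntax rather than from a missing Hive feature. Each UNION ALL branch is wrapped in parentheses and given its own alias (a, b, c), whereas Hive expects the whole derived table in the FROM clause to carry a single alias. The sketch below keeps the table and column names from the question and is one way the same idea could be written. The alias u and the backticks around Date (a precaution in case the Hive version treats it as a reserved word) are assumptions, not part of the original query.

```sql
-- Each branch yields at most one row per unique_ID (because of DISTINCT),
-- so an ID that appears in all three branches ends up with exactly 3 rows.
select unique_ID
from (
  select distinct unique_ID from table_name where `Date` = '2018_09_08'
  union all
  select distinct unique_ID from table_name where `Date` = '2018_09_18'
  union all
  select distinct unique_ID from table_name where `Date` = '2018_09_28'
) u                      -- one alias for the whole derived table (assumed name)
group by unique_ID
having count(*) = 3;     -- 3 = number of UNION ALL branches
```

Note that count(distinct(unique_ID)) over the union, as in the original attempt, would count the IDs seen on any of the dates rather than the IDs common to all of them, which is why this sketch groups by unique_ID and filters on the per-ID row count instead.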










  • Have you tried OR instead of AND?

    – JARS
    Nov 29 '18 at 6:33











  • OR will give me the distinct Unique_IDs for all three days combined, so it would return both a and d in the example above, whereas I only want a, since it is the only ID common to all three dates.

    – Rishabh Dixit
    Nov 29 '18 at 6:44


















hive bigdata hiveql hadoop2






edited Nov 29 '18 at 6:46







Rishabh Dixit

















asked Nov 29 '18 at 6:30









Rishabh Dixit













1 Answer

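This can be accomplished with group by and having: filter to the dates of interest, group by unique_id, and keep only the groups whose distinct date count equals the number of dates.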



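-- Keep only rows for the dates of interest, then count distinct dates per ID.
select unique_id, count(distinct date) as date_count
from tbl
where date in ('2018_09_08','2018_09_18','2018_09_28')
group by unique_id  -- must match the selected column
having count(distinct date) = 3  -- 3 = number of dates in the IN list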
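The group by / having approach can be checked against the sample rows from the question without creating any table. The following is a minimal sketch, assuming a Hive version that supports common table expressions and SELECT without a FROM clause (0.13 or later); sample_rows is a hypothetical name, and the backticks around date are only a precaution in case the Hive version treats it as a reserved word.

```sql
-- Inline copy of the question's sample rows; sample_rows is a hypothetical name.
with sample_rows as (
  select 'a' as unique_id, '2018_09_08' as `date`
  union all
  select 'a' as unique_id, '2018_09_18' as `date`
  union all
  select 'a' as unique_id, '2018_09_28' as `date`
  union all
  select 'd' as unique_id, '2018_09_08' as `date`
)
select unique_id
from sample_rows
where `date` in ('2018_09_08', '2018_09_18', '2018_09_28')
group by unique_id                   -- one group per ID
having count(distinct `date`) = 3;   -- keep IDs seen on all 3 listed dates
```

On these rows, a has three distinct dates and d has one, so only a passes the having filter. If more dates are added to the IN list later, the constant in the having clause has to be changed to match the new list length.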





  • I figured that out myself yesterday but forgot to post the answer. That is exactly what I did! Thank you for the effort, I really appreciate it.

    – Rishabh Dixit
    Nov 30 '18 at 10:09












answered Nov 29 '18 at 10:48









Vamsi Prabhala












