Creating Hive schema for frequently changing metadata structure for SAS table












I have a SAS master dataset that is appended to every month, with a corresponding new column added for each month's values. The columns look like this:

Name Address Column_201809 Column_201810 Column_201811

Can you please suggest how I should handle these schema changes when writing this data back to Hadoop?
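If the wide layout has to stay, one possible approach (a minimal sketch, not a confirmed solution) is to compare the SAS column list with the columns the Hive table already has and issue an `ALTER TABLE ... ADD COLUMNS` statement for the difference each month. The table name `master`, the `DOUBLE` column type, and the helper `add_columns_ddl` are assumptions made up for illustration.

```python
def add_columns_ddl(table, hive_columns, sas_columns, col_type="DOUBLE"):
    """Build an ALTER TABLE statement for columns present in SAS but not in Hive.

    Returns None when the Hive table already has every column.
    The table name and column type are illustrative assumptions.
    """
    # Hive treats column names case-insensitively, so compare in lowercase.
    existing = {c.lower() for c in hive_columns}
    missing = [c.lower() for c in sas_columns if c.lower() not in existing]
    if not missing:
        return None
    cols = ", ".join(f"`{c}` {col_type}" for c in missing)
    return f"ALTER TABLE {table} ADD COLUMNS ({cols})"


hive_cols = ["name", "address", "column_201809", "column_201810"]
sas_cols = ["Name", "Address", "Column_201809", "Column_201810", "Column_201811"]
print(add_columns_ddl("master", hive_cols, sas_cols))
```

Rows written before the change would typically read the new column as NULL. For a partitioned table, it is worth checking whether the statement also needs `CASCADE` so that existing partitions pick up the new column.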










  • Do you have to keep adding extra columns? This would be much easier to deal with if you had yyyymm as an extra column and appended new rows to your table instead.

    – user667489
    Nov 28 '18 at 10:13











  • Unfortunately I cannot do that, as my master data should have unique rows only, and inserting rows for each month would duplicate the data.

    – P.Sharma
    Nov 28 '18 at 10:15






  • In that case you would just need to change the constraint so that each combination of date + id is unique.

    – user667489
    Nov 28 '18 at 10:40













  • That could be one option. What I am looking for is something like schema evolution, so that such changes can be incorporated efficiently at the schema level on the Hadoop side.

    – P.Sharma
    Nov 28 '18 at 10:53











  • That is simply a difficult, denormalized table design. You are placing date information in the table schema itself instead of in rows. The *date columns seem to be detail information and might not be considered master-level records. Is it possible you are looking for a slowly changing dimension (SCD) solution?

    – Richard
    Nov 28 '18 at 13:03
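The row-based layout suggested in the comments above can be sketched as follows. This is only an illustration under assumptions: `Name` plus `Address` stand in for the id (the question does not name a key column), the `value` column name and the sample rows are made up, and the monthly columns are assumed to follow the `Column_yyyymm` pattern shown in the question.

```python
import re

MONTH_COL = re.compile(r"^Column_(\d{6})$")  # matches e.g. Column_201809


def wide_to_long(rows, id_cols=("Name", "Address")):
    """Turn one wide row per entity into one long row per (entity, yyyymm)."""
    long_rows = []
    for row in rows:
        ids = {c: row[c] for c in id_cols}
        for col, value in row.items():
            m = MONTH_COL.match(col)
            # Skip non-monthly columns and months with no value.
            if m and value is not None:
                long_rows.append({**ids, "yyyymm": m.group(1), "value": value})
    return long_rows


def key_is_unique(long_rows, key=("Name", "Address", "yyyymm")):
    """Check that each combination of the key columns appears only once."""
    seen = {tuple(r[k] for k in key) for r in long_rows}
    return len(seen) == len(long_rows)


# Made-up sample data, for illustration only.
wide = [
    {"Name": "A", "Address": "X", "Column_201809": 1.0, "Column_201810": 2.0},
    {"Name": "B", "Address": "Y", "Column_201809": 3.0, "Column_201810": None},
]
long_rows = wide_to_long(wide)
print(len(long_rows))
print(key_is_unique(long_rows))
```

With this shape the target table keeps a fixed set of columns (id columns, `yyyymm`, `value`), so a new month adds rows rather than a new column, and uniqueness is enforced on the id plus `yyyymm` combination instead of on the id alone.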
















hadoop hive sas hdfs






asked Nov 28 '18 at 9:07









P.Sharma













