Should I use a composite primary key (clustered index) or a surrogate key & secondary index?












Problem Statement



I have the following candidate key:

{Student Id, Entity Id, Action Id......}

I'm trying to decide whether to use it as a composite primary key or to add a surrogate key (auto-increment integer, timestamp, or GUID).



Research & Analysis



I looked up a couple of questions and discussions on Stack Overflow but did not find clear direction specific to my scenario:

Should a composite key be a primary key

Composite primary key or primary surrogate key



Here are the access patterns to the table



Write operations



The data is written to this table in response to student actions in the web app, i.e. it can be viewed as time-series data capturing user actions on different entities. Over time, I expect the number of students logged into the web app concurrently to be very high (depending on adoption of the product by the student population!). I'm thus expecting a high number of concurrent writes and a large table over time.



Read operations



The table is read per student: whenever a student logs in, the system shows them an overall summary of their past actions.



Planned Approach



Keeping in mind the access patterns, the projected size of the table, and the time-series nature (events!) of the data, I'm inclined to use a unique surrogate key (auto-increment integer) as the primary key plus a composite secondary index, to get the best balance of write and read performance. The alternative is to create a composite primary key and avoid a surrogate key altogether.



Can you validate my approach or give your suggestions on alternatives or refinements to my approach?
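
To make the two options concrete, here is a minimal sketch of both table layouts and the per-student read. It is only an illustration under assumptions: it uses Python's built-in sqlite3 as a runnable stand-in rather than MySQL/Aurora, the table, column, and index names are hypothetical, and the candidate key is cut down to three columns. SQLite's WITHOUT ROWID tables store rows in primary-key order, which is loosely comparable to an InnoDB clustered index, but nothing here measures write performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Option A: surrogate primary key plus a composite secondary index.
CREATE TABLE action_log_surrogate (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    student_id INTEGER NOT NULL,
    entity_id  INTEGER NOT NULL,
    action_id  INTEGER NOT NULL
);
CREATE INDEX idx_student_entity_action
    ON action_log_surrogate (student_id, entity_id, action_id);

-- Option B: composite primary key. WITHOUT ROWID makes SQLite keep the
-- rows in primary-key order (a rough analogue of a clustered index).
CREATE TABLE action_log_composite (
    student_id INTEGER NOT NULL,
    entity_id  INTEGER NOT NULL,
    action_id  INTEGER NOT NULL,
    PRIMARY KEY (student_id, entity_id, action_id)
) WITHOUT ROWID;
""")

# A few hypothetical (student, entity, action) events, loaded into both tables.
events = [(s, e, a) for s in (1, 2, 3) for e in (1, 2) for a in (1, 2)]
conn.executemany(
    "INSERT INTO action_log_surrogate (student_id, entity_id, action_id) "
    "VALUES (?, ?, ?)",
    events,
)
conn.executemany("INSERT INTO action_log_composite VALUES (?, ?, ?)", events)

SUMMARY_SQL = (
    "SELECT entity_id, action_id FROM {table} "
    "WHERE student_id = ? ORDER BY entity_id, action_id"
)


def student_summary(table, student_id):
    """Per-student read: every action recorded for one student."""
    return conn.execute(SUMMARY_SQL.format(table=table), (student_id,)).fetchall()


def query_plan(table):
    """Return SQLite's query-plan text for the per-student read on `table`."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN " + SUMMARY_SQL.format(table=table), (1,)
    ).fetchall()
    return " | ".join(row[3] for row in rows)


for table in ("action_log_surrogate", "action_log_composite"):
    print(table, "->", query_plan(table))
    print(student_summary(table, 2))
```

In both layouts the read is an index search on the leading student_id column; the difference is whether that index is the table itself (option B) or a separate structure maintained alongside the surrogate-keyed table (option A).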










      mysql database database-design amazon-rds-aurora






      edited Nov 28 '18 at 14:19







      Alwyn - Numino Labs

















      asked Nov 27 '18 at 7:19









      Alwyn - Numino Labs

























          1 Answer
          Technically, you are better off using a composite key. The only downside is that it's not suitable for entities.



          What I mean is that an M:N table may become an entity over time. It will gain columns, and its individual rows will be manipulated and referenced, which is complicated with composite keys.



          A plain M:N table, on the other hand, is required to have a unique key on the referencing columns, so using that key as the primary key makes a lot of sense. You get rid of the extra ID column, the sequence, and an additional index.
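
The point about the unique key can be sketched as follows. This is a hedged illustration, not a MySQL benchmark: it uses Python's built-in sqlite3 as a stand-in, and the table and index names (link_composite, link_surrogate, idx_link) are hypothetical. It shows that a composite primary key enforces uniqueness of the referencing columns by itself, while a surrogate key with only a plain (non-unique) secondary index does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- M:N table keyed directly on the referencing columns.
CREATE TABLE link_composite (
    student_id INTEGER NOT NULL,
    entity_id  INTEGER NOT NULL,
    PRIMARY KEY (student_id, entity_id)
);

-- Same columns behind a surrogate key, with a plain secondary index.
CREATE TABLE link_surrogate (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    student_id INTEGER NOT NULL,
    entity_id  INTEGER NOT NULL
);
CREATE INDEX idx_link ON link_surrogate (student_id, entity_id);
""")


def rejects_duplicate(table):
    """Insert the same pair twice; return True if the second insert fails."""
    sql = f"INSERT INTO {table} (student_id, entity_id) VALUES (1, 10)"
    conn.execute(sql)
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


composite_rejects = rejects_duplicate("link_composite")
surrogate_rejects = rejects_duplicate("link_surrogate")
print(composite_rejects, surrogate_rejects)  # prints: True False
```

Declaring idx_link with CREATE UNIQUE INDEX would restore the constraint on the surrogate-keyed table, which is the extra index the composite primary key avoids.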






          • If I keep a composite key, then since the MySQL InnoDB engine uses a clustered index, couldn't a high number of concurrent writes cause a lot of rearranging of existing records in the database, resulting in poor write performance and thus poor response times for users?

            – Alwyn - Numino Labs
            Nov 28 '18 at 13:01











          • There should not be much difference. You would have a UNIQUE KEY in one case and a PRIMARY KEY in the other; the implementation is basically the same. Furthermore, if you are building a solution to handle thousands of transactions per second, you should not have used MySQL in the first place.

            – Boris Schegolev
            Nov 28 '18 at 14:06











          • Boris, I should correct myself. I plan to use AWS RDS Aurora Serverless (MySQL InnoDB engine). I expect Aurora to carry the product to at least 100k-200k total students on the platform, with at least a few thousand able to act/log in concurrently. Also, I plan to avoid the unique key and have only a secondary index. I expect the secondary index to be rebuilt asynchronously and not significantly impact write performance?

            – Alwyn - Numino Labs
            Nov 28 '18 at 14:16











          answered Nov 28 '18 at 11:48









          Boris Schegolev













