Assign rank to each row in a group in R

I have the following input dataframe and would like to first group it by Gene and then arrange by descending Expression. Once I've done that, I'd like to add a Rank column that ranks each row per Gene according to the Expression value, so that within each gene the row with the highest Expression gets rank 1.

I've already done the group by and arrange by part (below), but I'm struggling with how to do the ranking.

dat_sorted <- dat %>%
  select(Gene, Expression, Sample) %>%
  group_by(Gene) %>%
  arrange(Gene, desc(Expression))


**INPUT (dat)**

Gene Expression Sample
ENSG00000000027 2.79336700 HSB431
ENSG00000000938 0.83478860 HSB414
ENSG00000000003 2.40009100 HSB618
ENSG00000000938 1.75148448 HSB671
ENSG00000000938 1.52182467 HSB670
ENSG00000000938 0.62174432 HSB459
ENSG00000000003 2.81561500 HSB671



**EXPECTED OUTPUT**

Gene Expression Sample Rank
ENSG00000000003 2.81561500 HSB671 1
ENSG00000000003 2.40009100 HSB618 2
ENSG00000000027 2.79336700 HSB431 1
ENSG00000000938 1.75148448 HSB671 1
ENSG00000000938 1.52182467 HSB670 2
ENSG00000000938 0.83478860 HSB414 3
ENSG00000000938 0.62174432 HSB459 4


UPDATE



When trying:



dat %>%
  group_by(Gene) %>%
  mutate(Rank = dense_rank(Expression)) %>%
  arrange(Gene, Expression, Rank)


I get:



Gene            Sample Expression  Rank
ENSG00000000003 HSB626  3.52200400 31107
ENSG00000000938 HSB152 -1.60663921  1585
ENSG00000000938 HSB425 -0.40209856  3536
ENSG00000000938 HSB627 -1.09598712  2244
ENSG00000000938 HSB645 -0.82846242  2666
ENSG00000000971 HSB154  4.61434903 53421
ENSG00000000971 HSB154  4.61434903 53421
ENSG00000000971 HSB154  4.61434903 53421
ENSG00000000971 HSB195  2.45561878 18041
ENSG00000000971 HSB222  5.54389646 79697









  • Add ... %>% mutate(Rank = 1:n()) to your pipeline; this relies on your arrange ordering. Or use the rank function, which ranks low to high, so the values need to be negated: ... %>% mutate(Rank = rank(-Expression)). That uses only the Expression values, so it doesn't depend on the row order. rank also has several options for dealing with ties.

    – Gregor
    Nov 28 '18 at 17:07













  • @Gregor: Do you mind providing a complete example? I've tried variations of what you suggested but am still not having any luck.

    – claudiadast
    Nov 28 '18 at 17:32








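  • From your edit it looks like you've loaded plyr after dplyr and ignored the warnings, so plyr's mutate is masking dplyr's and the ranks are computed over the whole data frame instead of per Gene. See this R-FAQ. Run detach(package:plyr) or specify dplyr::mutate and it should work.

    – Gregor
    Nov 28 '18 at 17:34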
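
A minimal sketch of the two suggestions in the comments above, assuming only dplyr is attached. The two-gene data frame and the names grouped and ungrouped are hypothetical and used purely for illustration. The first call checks which package the bare name mutate resolves to; the rest contrasts a per-gene rank with a rank over the whole data frame, which is roughly what results when the grouping is ignored.

```r
library(dplyr)

# Which package does the unqualified name resolve to?
# "dplyr" is what we want; "plyr" would indicate masking.
mutate_source <- environmentName(environment(mutate))

# Hypothetical two-gene example (not the data from the question).
dat <- data.frame(
  Gene = c("A", "A", "B", "B", "B"),
  Expression = c(1.0, 3.0, 2.0, 5.0, 4.0)
)

# Per-gene ranking with the namespace spelled out, so masking cannot interfere.
# rank() ranks low to high, hence the negation.
grouped <- dat %>%
  dplyr::group_by(Gene) %>%
  dplyr::mutate(Rank = rank(-Expression)) %>%
  dplyr::ungroup()

# The same ranking without grouping: ranks run over all rows at once.
ungrouped <- dat %>%
  dplyr::mutate(Rank = rank(-Expression))

print(grouped)
print(ungrouped)
```

If the ranks in real output exceed the number of rows in any single gene, as in the UPDATE above, that is a hint that the grouping was not honoured.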


















Tags: r, dataframe, ranking






asked Nov 28 '18 at 17:04 by claudiadast, edited Nov 28 '18 at 17:20








2 Answers
We can use dense_rank. Wrapping Expression in desc() gives rank 1 to the highest value within each Gene:

dat %>%
  group_by(Gene) %>%
  mutate(Rank = dense_rank(desc(Expression))) %>%
  arrange(Gene, Rank)





answered Nov 28 '18 at 17:07 by akrun
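
For a complete, reproducible version of the dense_rank approach, here is a minimal sketch that rebuilds the INPUT table from the question as a data frame and ranks within each Gene. It assumes dplyr's mutate is the one in use (see the comment about plyr above); the name dat_ranked is only illustrative.

```r
library(dplyr)

# The INPUT (dat) table from the question.
dat <- data.frame(
  Gene = c("ENSG00000000027", "ENSG00000000938", "ENSG00000000003",
           "ENSG00000000938", "ENSG00000000938", "ENSG00000000938",
           "ENSG00000000003"),
  Expression = c(2.79336700, 0.83478860, 2.40009100, 1.75148448,
                 1.52182467, 0.62174432, 2.81561500),
  Sample = c("HSB431", "HSB414", "HSB618", "HSB671",
             "HSB670", "HSB459", "HSB671"),
  stringsAsFactors = FALSE
)

dat_ranked <- dat %>%
  group_by(Gene) %>%
  # desc() flips the order so the highest Expression gets rank 1.
  mutate(Rank = dense_rank(desc(Expression))) %>%
  ungroup() %>%
  arrange(Gene, Rank)

print(dat_ranked)
```

The ranking is computed from the Expression values inside each group, so the final arrange() only affects display order, not the ranks themselves.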

























The following worked once mutate was called as dplyr::mutate, so that plyr's version is not picked up:

dat_rank <- dat %>%
  group_by(Gene) %>%
  arrange(Gene, desc(Expression)) %>%
  dplyr::mutate(Rank = 1:n())






answered Nov 28 '18 at 17:42 by claudiadast
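
One thing to keep in mind with 1:n() is that tied Expression values still receive distinct ranks, decided by row position. The sketch below, on a small hypothetical data frame with a tie in gene "A", compares three dplyr ranking helpers side by side; row_number() with no argument numbers rows the same way as 1:n(), while row_number(desc(Expression)) does not depend on a prior arrange(). Column names such as RowNumber are made up for the example.

```r
library(dplyr)

# Hypothetical data with a tie inside gene "A".
dat <- data.frame(
  Gene = c("A", "A", "A", "B"),
  Expression = c(1.5, 2.5, 2.5, 0.7)
)

dat_rank <- dat %>%
  group_by(Gene) %>%
  mutate(
    RowNumber = row_number(desc(Expression)),  # ties broken by position
    MinRank   = min_rank(desc(Expression)),    # ties share a rank, gap afterwards
    DenseRank = dense_rank(desc(Expression))   # ties share a rank, no gap
  ) %>%
  ungroup() %>%
  arrange(Gene, RowNumber)

print(dat_rank)
```

Which variant is appropriate depends on whether equal expression values should share a rank; the question's data has no ties within a gene, so all three agree there.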





























