R dataframe, split one column by the values in another column [duplicate]












0
















This question already has an answer here:




  • Transpose / reshape dataframe without “timevar” from long to wide format

    6 answers




I have a dataframe in R that looks similar to the following:



A    B     C

A X 1
A Y 3
A Z 3
A Z 2


How can I convert the single column 'C' into multiple columns that correspond to their values in B, given that all the columns contain factors and not strings or integers?



I want something that looks like below, either with or without the B/C columns still in the dataframe.



A    B    C     X    Y    Z

A X 1 1 NA NA
A Y 3 NA 3 NA
A Z 3 NA NA 3
A Z 2 NA NA 2


Ideally, I would like my final output to be a dataframe in R, as I intend to merge it with another dataframe that has matching values A values.



Apologies if this post is duplicate but I didn't spot any where I could tell somebody was asking the same thing. - Thanks!










share|improve this question













marked as duplicate by Henrik r
Users with the  r badge can single-handedly close r questions as duplicates and reopen them as needed.

StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 19:34


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.























    0
















    This question already has an answer here:




    • Transpose / reshape dataframe without “timevar” from long to wide format

      6 answers




    I have a dataframe in R that looks similar to the following:



    A    B     C

    A X 1
    A Y 3
    A Z 3
    A Z 2


    How can I convert the single column 'C' into multiple columns that correspond to their values in B, given that all the columns contain factors and not strings or integers?



    I want something that looks like below, either with or without the B/C columns still in the dataframe.



    A    B    C     X    Y    Z

    A X 1 1 NA NA
    A Y 3 NA 3 NA
    A Z 3 NA NA 3
    A Z 2 NA NA 2


    Ideally, I would like my final output to be a dataframe in R, as I intend to merge it with another dataframe that has matching values A values.



    Apologies if this post is duplicate but I didn't spot any where I could tell somebody was asking the same thing. - Thanks!










    share|improve this question













    marked as duplicate by Henrik r
    Users with the  r badge can single-handedly close r questions as duplicates and reopen them as needed.

    StackExchange.ready(function() {
    if (StackExchange.options.isMobile) return;

    $('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
    var $hover = $(this).addClass('hover-bound'),
    $msg = $hover.siblings('.dupe-hammer-message');

    $hover.hover(
    function() {
    $hover.showInfoMessage('', {
    messageElement: $msg.clone().show(),
    transient: false,
    position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
    dismissable: false,
    relativeToBody: true
    });
    },
    function() {
    StackExchange.helpers.removeMessages();
    }
    );
    });
    });
    Nov 28 '18 at 19:34


    This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.





















      0












      0








      0









      This question already has an answer here:




      • Transpose / reshape dataframe without “timevar” from long to wide format

        6 answers




      I have a dataframe in R that looks similar to the following:



      A    B     C

      A X 1
      A Y 3
      A Z 3
      A Z 2


      How can I convert the single column 'C' into multiple columns that correspond to their values in B, given that all the columns contain factors and not strings or integers?



      I want something that looks like below, either with or without the B/C columns still in the dataframe.



      A    B    C     X    Y    Z

      A X 1 1 NA NA
      A Y 3 NA 3 NA
      A Z 3 NA NA 3
      A Z 2 NA NA 2


      Ideally, I would like my final output to be a dataframe in R, as I intend to merge it with another dataframe that has matching values A values.



      Apologies if this post is duplicate but I didn't spot any where I could tell somebody was asking the same thing. - Thanks!










      share|improve this question















      This question already has an answer here:




      • Transpose / reshape dataframe without “timevar” from long to wide format

        6 answers




      I have a dataframe in R that looks similar to the following:



      A    B     C

      A X 1
      A Y 3
      A Z 3
      A Z 2


      How can I convert the single column 'C' into multiple columns that correspond to their values in B, given that all the columns contain factors and not strings or integers?



      I want something that looks like below, either with or without the B/C columns still in the dataframe.



      A    B    C     X    Y    Z

      A X 1 1 NA NA
      A Y 3 NA 3 NA
      A Z 3 NA NA 3
      A Z 2 NA NA 2


      Ideally, I would like my final output to be a dataframe in R, as I intend to merge it with another dataframe that has matching values A values.



      Apologies if this post is duplicate but I didn't spot any where I could tell somebody was asking the same thing. - Thanks!





      This question already has an answer here:




      • Transpose / reshape dataframe without “timevar” from long to wide format

        6 answers








      r dataframe






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Nov 28 '18 at 18:38









      Jamie GorzynskiJamie Gorzynski

      63




      63




      marked as duplicate by Henrik r
      Users with the  r badge can single-handedly close r questions as duplicates and reopen them as needed.

      StackExchange.ready(function() {
      if (StackExchange.options.isMobile) return;

      $('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
      var $hover = $(this).addClass('hover-bound'),
      $msg = $hover.siblings('.dupe-hammer-message');

      $hover.hover(
      function() {
      $hover.showInfoMessage('', {
      messageElement: $msg.clone().show(),
      transient: false,
      position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
      dismissable: false,
      relativeToBody: true
      });
      },
      function() {
      StackExchange.helpers.removeMessages();
      }
      );
      });
      });
      Nov 28 '18 at 19:34


      This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.









      marked as duplicate by Henrik r
      Users with the  r badge can single-handedly close r questions as duplicates and reopen them as needed.

      StackExchange.ready(function() {
      if (StackExchange.options.isMobile) return;

      $('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
      var $hover = $(this).addClass('hover-bound'),
      $msg = $hover.siblings('.dupe-hammer-message');

      $hover.hover(
      function() {
      $hover.showInfoMessage('', {
      messageElement: $msg.clone().show(),
      transient: false,
      position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
      dismissable: false,
      relativeToBody: true
      });
      },
      function() {
      StackExchange.helpers.removeMessages();
      }
      );
      });
      });
      Nov 28 '18 at 19:34


      This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.


























          2 Answers
          2






          active

          oldest

          votes


















          0














          Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.





          library(tidyr)
          library(dplyr)


          To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.



          df %>%
          tibble::rowid_to_column() %>%
          spread(key = B, value = C)
          #> rowid A X Y Z
          #> 1 1 A 1 NA NA
          #> 2 2 A NA 3 NA
          #> 3 3 A NA NA 3
          #> 4 4 A NA NA 2


          You can do that, then column-bind this to the original data frame, but your columns will be out of order and you'll have a column A1 that is identical to column A. Using select, you can pick the columns you want in the proper order.



          df %>%
          tibble::rowid_to_column() %>%
          spread(key = B, value = C) %>%
          bind_cols(df) %>%
          select(A, B, C, X, Y, Z)
          #> A B C X Y Z
          #> 1 A X 1 1 NA NA
          #> 2 A Y 3 NA 3 NA
          #> 3 A Z 3 NA NA 3
          #> 4 A Z 2 NA NA 2


          A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.



          bind_cols(
          df,
          df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)
          ) %>%
          select(-rowid, -A1)
          #> A B C X Y Z
          #> 1 A X 1 1 NA NA
          #> 2 A Y 3 NA 3 NA
          #> 3 A Z 3 NA NA 3
          #> 4 A Z 2 NA NA 2


          Created on 2018-11-28 by the reprex package (v0.2.1)






          share|improve this answer































            0














            We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset



            library(data.table)
            cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,
            value.var = 'C')[, -1, with = FALSE])
            # A B C X Y Z
            #1: A X 1 1 NA NA
            #2: A Y 3 NA 3 NA
            #3: A Z 3 NA NA 3
            #4: A Z 2 NA NA 2


            data



            df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z", 
            "Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA,
            -4L))





            share|improve this answer
































              2 Answers
              2






              active

              oldest

              votes








              2 Answers
              2






              active

              oldest

              votes









              active

              oldest

              votes






              active

              oldest

              votes









              0














              Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.





              library(tidyr)
              library(dplyr)


              To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.



              df %>%
              tibble::rowid_to_column() %>%
              spread(key = B, value = C)
              #> rowid A X Y Z
              #> 1 1 A 1 NA NA
              #> 2 2 A NA 3 NA
              #> 3 3 A NA NA 3
              #> 4 4 A NA NA 2


              You can do that, then column-bind this to the original data frame, but your columns will be out of order and you'll have a column A1 that is identical to column A. Using select, you can pick the columns you want in the proper order.



              df %>%
              tibble::rowid_to_column() %>%
              spread(key = B, value = C) %>%
              bind_cols(df) %>%
              select(A, B, C, X, Y, Z)
              #> A B C X Y Z
              #> 1 A X 1 1 NA NA
              #> 2 A Y 3 NA 3 NA
              #> 3 A Z 3 NA NA 3
              #> 4 A Z 2 NA NA 2


              A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.



              bind_cols(
              df,
              df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)
              ) %>%
              select(-rowid, -A1)
              #> A B C X Y Z
              #> 1 A X 1 1 NA NA
              #> 2 A Y 3 NA 3 NA
              #> 3 A Z 3 NA NA 3
              #> 4 A Z 2 NA NA 2


              Created on 2018-11-28 by the reprex package (v0.2.1)






              share|improve this answer




























                0














                Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.





                library(tidyr)
                library(dplyr)


                To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.



                df %>%
                tibble::rowid_to_column() %>%
                spread(key = B, value = C)
                #> rowid A X Y Z
                #> 1 1 A 1 NA NA
                #> 2 2 A NA 3 NA
                #> 3 3 A NA NA 3
                #> 4 4 A NA NA 2


                You can do that, then column-bind this to the original data frame, but your columns will be out of order and you'll have a column A1 that is identical to column A. Using select, you can pick the columns you want in the proper order.



                df %>%
                tibble::rowid_to_column() %>%
                spread(key = B, value = C) %>%
                bind_cols(df) %>%
                select(A, B, C, X, Y, Z)
                #> A B C X Y Z
                #> 1 A X 1 1 NA NA
                #> 2 A Y 3 NA 3 NA
                #> 3 A Z 3 NA NA 3
                #> 4 A Z 2 NA NA 2


                A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.



                bind_cols(
                df,
                df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)
                ) %>%
                select(-rowid, -A1)
                #> A B C X Y Z
                #> 1 A X 1 1 NA NA
                #> 2 A Y 3 NA 3 NA
                #> 3 A Z 3 NA NA 3
                #> 4 A Z 2 NA NA 2


                Created on 2018-11-28 by the reprex package (v0.2.1)






                share|improve this answer


























                  0












                  0








                  0







                  Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.





                  library(tidyr)
                  library(dplyr)


                  To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.



                  df %>%
                  tibble::rowid_to_column() %>%
                  spread(key = B, value = C)
                  #> rowid A X Y Z
                  #> 1 1 A 1 NA NA
                  #> 2 2 A NA 3 NA
                  #> 3 3 A NA NA 3
                  #> 4 4 A NA NA 2


                  You can do that, then column-bind this to the original data frame, but your columns will be out of order and you'll have a column A1 that is identical to column A. Using select, you can pick the columns you want in the proper order.



                  df %>%
                  tibble::rowid_to_column() %>%
                  spread(key = B, value = C) %>%
                  bind_cols(df) %>%
                  select(A, B, C, X, Y, Z)
                  #> A B C X Y Z
                  #> 1 A X 1 1 NA NA
                  #> 2 A Y 3 NA 3 NA
                  #> 3 A Z 3 NA NA 3
                  #> 4 A Z 2 NA NA 2


                  A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.



                  bind_cols(
                  df,
                  df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)
                  ) %>%
                  select(-rowid, -A1)
                  #> A B C X Y Z
                  #> 1 A X 1 1 NA NA
                  #> 2 A Y 3 NA 3 NA
                  #> 3 A Z 3 NA NA 3
                  #> 4 A Z 2 NA NA 2


                  Created on 2018-11-28 by the reprex package (v0.2.1)






                  share|improve this answer













                  Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.





                  library(tidyr)
                  library(dplyr)


                  To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.



                  df %>%
                  tibble::rowid_to_column() %>%
                  spread(key = B, value = C)
                  #> rowid A X Y Z
                  #> 1 1 A 1 NA NA
                  #> 2 2 A NA 3 NA
                  #> 3 3 A NA NA 3
                  #> 4 4 A NA NA 2


                  You can do that, then column-bind this to the original data frame, but your columns will be out of order and you'll have a column A1 that is identical to column A. Using select, you can pick the columns you want in the proper order.



                  df %>%
                  tibble::rowid_to_column() %>%
                  spread(key = B, value = C) %>%
                  bind_cols(df) %>%
                  select(A, B, C, X, Y, Z)
                  #> A B C X Y Z
                  #> 1 A X 1 1 NA NA
                  #> 2 A Y 3 NA 3 NA
                  #> 3 A Z 3 NA NA 3
                  #> 4 A Z 2 NA NA 2


                  A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.



                  bind_cols(
                  df,
                  df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)
                  ) %>%
                  select(-rowid, -A1)
                  #> A B C X Y Z
                  #> 1 A X 1 1 NA NA
                  #> 2 A Y 3 NA 3 NA
                  #> 3 A Z 3 NA NA 3
                  #> 4 A Z 2 NA NA 2


                  Created on 2018-11-28 by the reprex package (v0.2.1)







                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Nov 28 '18 at 19:22









                  camillecamille

                  7,87531833




                  7,87531833

























                      0














                      We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset



                      library(data.table)
                      cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,
                      value.var = 'C')[, -1, with = FALSE])
                      # A B C X Y Z
                      #1: A X 1 1 NA NA
                      #2: A Y 3 NA 3 NA
                      #3: A Z 3 NA NA 3
                      #4: A Z 2 NA NA 2


                      data



                      df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z", 
                      "Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA,
                      -4L))





                      share|improve this answer






























                        0














                        We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset



                        library(data.table)
                        cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,
                        value.var = 'C')[, -1, with = FALSE])
                        # A B C X Y Z
                        #1: A X 1 1 NA NA
                        #2: A Y 3 NA 3 NA
                        #3: A Z 3 NA NA 3
                        #4: A Z 2 NA NA 2


                        data



                        df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z", 
                        "Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA,
                        -4L))





                        share|improve this answer




























                          0












                          0








                          0







                          We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset



                          library(data.table)
                          cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,
                          value.var = 'C')[, -1, with = FALSE])
                          # A B C X Y Z
                          #1: A X 1 1 NA NA
                          #2: A Y 3 NA 3 NA
                          #3: A Z 3 NA NA 3
                          #4: A Z 2 NA NA 2


                          data



                          df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z", 
                          "Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA,
                          -4L))





                          share|improve this answer















                          We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset



                          library(data.table)
                          cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,
                          value.var = 'C')[, -1, with = FALSE])
                          # A B C X Y Z
                          #1: A X 1 1 NA NA
                          #2: A Y 3 NA 3 NA
                          #3: A Z 3 NA NA 3
                          #4: A Z 2 NA NA 2


                          data



                          df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z", 
                          "Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA,
                          -4L))






                          share|improve this answer














                          share|improve this answer



                          share|improve this answer








                          edited Nov 28 '18 at 19:25

























                          answered Nov 28 '18 at 18:40









                          akrunakrun

                          417k13206279




                          417k13206279















                              Popular posts from this blog

                              Lallio

                              Futebolista

                              Jornalista