R dataframe, split one column by the values in another column [duplicate]

This question already has an answer here:

Transpose / reshape dataframe without “timevar” from long to wide format

6 answers

I have a dataframe in R that looks similar to the following:

A    B     C



A    X     1   

A    Y     3   

A    Z     3

A    Z     2

How can I convert the single column 'C' into multiple columns that correspond to their values in B, given that all the columns contain factors and not strings or integers?

I want something that looks like below, either with or without the B/C columns still in the dataframe.

A    B    C     X    Y    Z



A    X    1     1    NA   NA

A    Y    3     NA   3    NA

A    Z    3     NA   NA   3

A    Z    2     NA   NA   2

Ideally, I would like my final output to be a dataframe in R, as I intend to merge it with another dataframe that has matching values A values.

Apologies if this post is duplicate but I didn't spot any where I could tell somebody was asking the same thing. - Thanks!

asked Nov 28 '18 at 18:38

Jamie Gorzynski

marked as duplicate by Henrik r
Users with the r badge can single-handedly close r questions as duplicates and reopen them as needed.

StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 19:34

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

add a comment |

This question already has an answer here:

Transpose / reshape dataframe without “timevar” from long to wide format

6 answers

I have a dataframe in R that looks similar to the following:

A    B     C



A    X     1   

A    Y     3   

A    Z     3

A    Z     2

How can I convert the single column 'C' into multiple columns that correspond to their values in B, given that all the columns contain factors and not strings or integers?

I want something that looks like below, either with or without the B/C columns still in the dataframe.

A    B    C     X    Y    Z



A    X    1     1    NA   NA

A    Y    3     NA   3    NA

A    Z    3     NA   NA   3

A    Z    2     NA   NA   2

Ideally, I would like my final output to be a dataframe in R, as I intend to merge it with another dataframe that has matching values A values.

Apologies if this post is duplicate but I didn't spot any where I could tell somebody was asking the same thing. - Thanks!

asked Nov 28 '18 at 18:38

Jamie Gorzynski

marked as duplicate by Henrik r
Users with the r badge can single-handedly close r questions as duplicates and reopen them as needed.

StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 19:34

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

add a comment |

This question already has an answer here:

Transpose / reshape dataframe without “timevar” from long to wide format

6 answers

I have a dataframe in R that looks similar to the following:

A    B     C



A    X     1   

A    Y     3   

A    Z     3

A    Z     2

How can I convert the single column 'C' into multiple columns that correspond to their values in B, given that all the columns contain factors and not strings or integers?

I want something that looks like below, either with or without the B/C columns still in the dataframe.

A    B    C     X    Y    Z



A    X    1     1    NA   NA

A    Y    3     NA   3    NA

A    Z    3     NA   NA   3

A    Z    2     NA   NA   2

Ideally, I would like my final output to be a dataframe in R, as I intend to merge it with another dataframe that has matching values A values.

Apologies if this post is duplicate but I didn't spot any where I could tell somebody was asking the same thing. - Thanks!

asked Nov 28 '18 at 18:38

Jamie Gorzynski

This question already has an answer here:

Transpose / reshape dataframe without “timevar” from long to wide format

6 answers

I have a dataframe in R that looks similar to the following:

A    B     C



A    X     1   

A    Y     3   

A    Z     3

A    Z     2

How can I convert the single column 'C' into multiple columns that correspond to their values in B, given that all the columns contain factors and not strings or integers?

I want something that looks like below, either with or without the B/C columns still in the dataframe.

A    B    C     X    Y    Z



A    X    1     1    NA   NA

A    Y    3     NA   3    NA

A    Z    3     NA   NA   3

A    Z    2     NA   NA   2

Ideally, I would like my final output to be a dataframe in R, as I intend to merge it with another dataframe that has matching values A values.

Apologies if this post is duplicate but I didn't spot any where I could tell somebody was asking the same thing. - Thanks!

This question already has an answer here:

Transpose / reshape dataframe without “timevar” from long to wide format

6 answers

r dataframe

asked Nov 28 '18 at 18:38

Jamie Gorzynski

asked Nov 28 '18 at 18:38

Jamie Gorzynski

asked Nov 28 '18 at 18:38

Jamie Gorzynski

asked Nov 28 '18 at 18:38

Jamie Gorzynski

asked Nov 28 '18 at 18:38

Jamie Gorzynski

marked as duplicate by Henrik r
Users with the r badge can single-handedly close r questions as duplicates and reopen them as needed.

StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 19:34

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

marked as duplicate by Henrik r
Users with the r badge can single-handedly close r questions as duplicates and reopen them as needed.

StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 19:34

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

add a comment |

2 Answers
2

active

oldest

votes

Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.

library(tidyr)

library(dplyr)

To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.

df %>%

  tibble::rowid_to_column() %>%

  spread(key = B, value = C)

#>   rowid A  X  Y  Z

#> 1     1 A  1 NA NA

#> 2     2 A NA  3 NA

#> 3     3 A NA NA  3

#> 4     4 A NA NA  2

You can do that, then column-bind this to the original data frame, but your columns will be out of order and you'll have a column A1 that is identical to column A. Using select, you can pick the columns you want in the proper order.

df %>%

  tibble::rowid_to_column() %>%

  spread(key = B, value = C) %>%

  bind_cols(df) %>%

  select(A, B, C, X, Y, Z)

#>   A B C  X  Y  Z

#> 1 A X 1  1 NA NA

#> 2 A Y 3 NA  3 NA

#> 3 A Z 3 NA NA  3

#> 4 A Z 2 NA NA  2

A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.

bind_cols(

  df, 

  df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)

) %>%

  select(-rowid, -A1)

#>   A B C  X  Y  Z

#> 1 A X 1  1 NA NA

#> 2 A Y 3 NA  3 NA

#> 3 A Z 3 NA NA  3

#> 4 A Z 2 NA NA  2

^{Created on 2018-11-28 by the reprex package (v0.2.1)}

answered Nov 28 '18 at 19:22

camille

7,87531833

add a comment |

We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset

library(data.table)

cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,

           value.var = 'C')[, -1, with = FALSE])

#    A B C  X  Y  Z

#1: A X 1  1 NA NA

#2: A Y 3 NA  3 NA

#3: A Z 3 NA NA  3

#4: A Z 2 NA NA  2

data

df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z", 

 "Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA, 

 -4L))

edited Nov 28 '18 at 19:25

answered Nov 28 '18 at 18:40

akrun

417k13206279

add a comment |

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.

library(tidyr)

library(dplyr)

To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.

df %>%

  tibble::rowid_to_column() %>%

  spread(key = B, value = C)

#>   rowid A  X  Y  Z

#> 1     1 A  1 NA NA

#> 2     2 A NA  3 NA

#> 3     3 A NA NA  3

#> 4     4 A NA NA  2

df %>%

  tibble::rowid_to_column() %>%

  spread(key = B, value = C) %>%

  bind_cols(df) %>%

  select(A, B, C, X, Y, Z)

#>   A B C  X  Y  Z

#> 1 A X 1  1 NA NA

#> 2 A Y 3 NA  3 NA

#> 3 A Z 3 NA NA  3

#> 4 A Z 2 NA NA  2

A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.

bind_cols(

  df, 

  df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)

) %>%

  select(-rowid, -A1)

#>   A B C  X  Y  Z

#> 1 A X 1  1 NA NA

#> 2 A Y 3 NA  3 NA

#> 3 A Z 3 NA NA  3

#> 4 A Z 2 NA NA  2

^{Created on 2018-11-28 by the reprex package (v0.2.1)}

answered Nov 28 '18 at 19:22

camille

7,87531833

add a comment |

Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.

library(tidyr)

library(dplyr)

To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.

df %>%

  tibble::rowid_to_column() %>%

  spread(key = B, value = C)

#>   rowid A  X  Y  Z

#> 1     1 A  1 NA NA

#> 2     2 A NA  3 NA

#> 3     3 A NA NA  3

#> 4     4 A NA NA  2

df %>%

  tibble::rowid_to_column() %>%

  spread(key = B, value = C) %>%

  bind_cols(df) %>%

  select(A, B, C, X, Y, Z)

#>   A B C  X  Y  Z

#> 1 A X 1  1 NA NA

#> 2 A Y 3 NA  3 NA

#> 3 A Z 3 NA NA  3

#> 4 A Z 2 NA NA  2

A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.

bind_cols(

  df, 

  df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)

) %>%

  select(-rowid, -A1)

#>   A B C  X  Y  Z

#> 1 A X 1  1 NA NA

#> 2 A Y 3 NA  3 NA

#> 3 A Z 3 NA NA  3

#> 4 A Z 2 NA NA  2

^{Created on 2018-11-28 by the reprex package (v0.2.1)}

answered Nov 28 '18 at 19:22

camille

7,87531833

add a comment |

Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.

library(tidyr)

library(dplyr)

To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.

df %>%

  tibble::rowid_to_column() %>%

  spread(key = B, value = C)

#>   rowid A  X  Y  Z

#> 1     1 A  1 NA NA

#> 2     2 A NA  3 NA

#> 3     3 A NA NA  3

#> 4     4 A NA NA  2

df %>%

  tibble::rowid_to_column() %>%

  spread(key = B, value = C) %>%

  bind_cols(df) %>%

  select(A, B, C, X, Y, Z)

#>   A B C  X  Y  Z

#> 1 A X 1  1 NA NA

#> 2 A Y 3 NA  3 NA

#> 3 A Z 3 NA NA  3

#> 4 A Z 2 NA NA  2

A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.

bind_cols(

  df, 

  df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)

) %>%

  select(-rowid, -A1)

#>   A B C  X  Y  Z

#> 1 A X 1  1 NA NA

#> 2 A Y 3 NA  3 NA

#> 3 A Z 3 NA NA  3

#> 4 A Z 2 NA NA  2

^{Created on 2018-11-28 by the reprex package (v0.2.1)}

answered Nov 28 '18 at 19:22

camille

7,87531833

Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.

library(tidyr)

library(dplyr)

To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.

df %>%

  tibble::rowid_to_column() %>%

  spread(key = B, value = C)

#>   rowid A  X  Y  Z

#> 1     1 A  1 NA NA

#> 2     2 A NA  3 NA

#> 3     3 A NA NA  3

#> 4     4 A NA NA  2

df %>%

  tibble::rowid_to_column() %>%

  spread(key = B, value = C) %>%

  bind_cols(df) %>%

  select(A, B, C, X, Y, Z)

#>   A B C  X  Y  Z

#> 1 A X 1  1 NA NA

#> 2 A Y 3 NA  3 NA

#> 3 A Z 3 NA NA  3

#> 4 A Z 2 NA NA  2

A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.

bind_cols(

  df, 

  df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)

) %>%

  select(-rowid, -A1)

#>   A B C  X  Y  Z

#> 1 A X 1  1 NA NA

#> 2 A Y 3 NA  3 NA

#> 3 A Z 3 NA NA  3

#> 4 A Z 2 NA NA  2

^{Created on 2018-11-28 by the reprex package (v0.2.1)}

answered Nov 28 '18 at 19:22

camille

7,87531833

answered Nov 28 '18 at 19:22

camille

7,87531833

answered Nov 28 '18 at 19:22

camille

7,87531833

answered Nov 28 '18 at 19:22

camille

7,87531833

add a comment |

We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset

library(data.table)

cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,

           value.var = 'C')[, -1, with = FALSE])

#    A B C  X  Y  Z

#1: A X 1  1 NA NA

#2: A Y 3 NA  3 NA

#3: A Z 3 NA NA  3

#4: A Z 2 NA NA  2

data

df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z", 

 "Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA, 

 -4L))

edited Nov 28 '18 at 19:25

answered Nov 28 '18 at 18:40

akrun

417k13206279

add a comment |

We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset

library(data.table)

cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,

           value.var = 'C')[, -1, with = FALSE])

#    A B C  X  Y  Z

#1: A X 1  1 NA NA

#2: A Y 3 NA  3 NA

#3: A Z 3 NA NA  3

#4: A Z 2 NA NA  2

data

df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z", 

 "Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA, 

 -4L))

edited Nov 28 '18 at 19:25

answered Nov 28 '18 at 18:40

akrun

417k13206279

add a comment |

We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset

library(data.table)

cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,

           value.var = 'C')[, -1, with = FALSE])

#    A B C  X  Y  Z

#1: A X 1  1 NA NA

#2: A Y 3 NA  3 NA

#3: A Z 3 NA NA  3

#4: A Z 2 NA NA  2

data

df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z", 

 "Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA, 

 -4L))

edited Nov 28 '18 at 19:25

answered Nov 28 '18 at 18:40

akrun

417k13206279

We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset

library(data.table)

cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,

           value.var = 'C')[, -1, with = FALSE])

#    A B C  X  Y  Z

#1: A X 1  1 NA NA

#2: A Y 3 NA  3 NA

#3: A Z 3 NA NA  3

#4: A Z 2 NA NA  2

data

df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z", 

 "Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA, 

 -4L))

edited Nov 28 '18 at 19:25

answered Nov 28 '18 at 18:40

akrun

417k13206279

edited Nov 28 '18 at 19:25

answered Nov 28 '18 at 18:40

akrun

417k13206279

answered Nov 28 '18 at 18:40

akrun

417k13206279

answered Nov 28 '18 at 18:40

akrun

417k13206279

add a comment |

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Btukfyl