R dataframe, split one column by the values in another column [duplicate]
This question already has an answer here:
Transpose / reshape dataframe without “timevar” from long to wide format
6 answers
I have a dataframe in R that looks similar to the following:
A B C
A X 1
A Y 3
A Z 3
A Z 2
How can I convert the single column 'C' into multiple columns that correspond to their values in B, given that all the columns contain factors and not strings or integers?
I want something that looks like below, either with or without the B/C columns still in the dataframe.
A B C X Y Z
A X 1 1 NA NA
A Y 3 NA 3 NA
A Z 3 NA NA 3
A Z 2 NA NA 2
Ideally, I would like my final output to be a dataframe in R, as I intend to merge it with another dataframe that has matching values A values.
Apologies if this post is duplicate but I didn't spot any where I could tell somebody was asking the same thing. - Thanks!
r dataframe
marked as duplicate by Henrik
StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;
$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');
$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 19:34
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
add a comment |
This question already has an answer here:
Transpose / reshape dataframe without “timevar” from long to wide format
6 answers
I have a dataframe in R that looks similar to the following:
A B C
A X 1
A Y 3
A Z 3
A Z 2
How can I convert the single column 'C' into multiple columns that correspond to their values in B, given that all the columns contain factors and not strings or integers?
I want something that looks like below, either with or without the B/C columns still in the dataframe.
A B C X Y Z
A X 1 1 NA NA
A Y 3 NA 3 NA
A Z 3 NA NA 3
A Z 2 NA NA 2
Ideally, I would like my final output to be a dataframe in R, as I intend to merge it with another dataframe that has matching values A values.
Apologies if this post is duplicate but I didn't spot any where I could tell somebody was asking the same thing. - Thanks!
r dataframe
marked as duplicate by Henrik
StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;
$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');
$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 19:34
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
add a comment |
This question already has an answer here:
Transpose / reshape dataframe without “timevar” from long to wide format
6 answers
I have a dataframe in R that looks similar to the following:
A B C
A X 1
A Y 3
A Z 3
A Z 2
How can I convert the single column 'C' into multiple columns that correspond to their values in B, given that all the columns contain factors and not strings or integers?
I want something that looks like below, either with or without the B/C columns still in the dataframe.
A B C X Y Z
A X 1 1 NA NA
A Y 3 NA 3 NA
A Z 3 NA NA 3
A Z 2 NA NA 2
Ideally, I would like my final output to be a dataframe in R, as I intend to merge it with another dataframe that has matching values A values.
Apologies if this post is duplicate but I didn't spot any where I could tell somebody was asking the same thing. - Thanks!
r dataframe
This question already has an answer here:
Transpose / reshape dataframe without “timevar” from long to wide format
6 answers
I have a dataframe in R that looks similar to the following:
A B C
A X 1
A Y 3
A Z 3
A Z 2
How can I convert the single column 'C' into multiple columns that correspond to their values in B, given that all the columns contain factors and not strings or integers?
I want something that looks like below, either with or without the B/C columns still in the dataframe.
A B C X Y Z
A X 1 1 NA NA
A Y 3 NA 3 NA
A Z 3 NA NA 3
A Z 2 NA NA 2
Ideally, I would like my final output to be a dataframe in R, as I intend to merge it with another dataframe that has matching values A values.
Apologies if this post is duplicate but I didn't spot any where I could tell somebody was asking the same thing. - Thanks!
This question already has an answer here:
Transpose / reshape dataframe without “timevar” from long to wide format
6 answers
r dataframe
r dataframe
asked Nov 28 '18 at 18:38
Jamie GorzynskiJamie Gorzynski
63
63
marked as duplicate by Henrik
StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;
$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');
$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 19:34
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
marked as duplicate by Henrik
StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;
$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');
$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 19:34
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.
library(tidyr)
library(dplyr)
To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.
df %>%
tibble::rowid_to_column() %>%
spread(key = B, value = C)
#> rowid A X Y Z
#> 1 1 A 1 NA NA
#> 2 2 A NA 3 NA
#> 3 3 A NA NA 3
#> 4 4 A NA NA 2
You can do that, then column-bind this to the original data frame, but your columns will be out of order and you'll have a column A1 that is identical to column A. Using select, you can pick the columns you want in the proper order.
df %>%
tibble::rowid_to_column() %>%
spread(key = B, value = C) %>%
bind_cols(df) %>%
select(A, B, C, X, Y, Z)
#> A B C X Y Z
#> 1 A X 1 1 NA NA
#> 2 A Y 3 NA 3 NA
#> 3 A Z 3 NA NA 3
#> 4 A Z 2 NA NA 2
A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.
bind_cols(
df,
df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)
) %>%
select(-rowid, -A1)
#> A B C X Y Z
#> 1 A X 1 1 NA NA
#> 2 A Y 3 NA 3 NA
#> 3 A Z 3 NA NA 3
#> 4 A Z 2 NA NA 2
Created on 2018-11-28 by the reprex package (v0.2.1)
add a comment |
We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset
library(data.table)
cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,
value.var = 'C')[, -1, with = FALSE])
# A B C X Y Z
#1: A X 1 1 NA NA
#2: A Y 3 NA 3 NA
#3: A Z 3 NA NA 3
#4: A Z 2 NA NA 2
data
df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z",
"Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA,
-4L))
add a comment |
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.
library(tidyr)
library(dplyr)
To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.
df %>%
tibble::rowid_to_column() %>%
spread(key = B, value = C)
#> rowid A X Y Z
#> 1 1 A 1 NA NA
#> 2 2 A NA 3 NA
#> 3 3 A NA NA 3
#> 4 4 A NA NA 2
You can do that, then column-bind this to the original data frame, but your columns will be out of order and you'll have a column A1 that is identical to column A. Using select, you can pick the columns you want in the proper order.
df %>%
tibble::rowid_to_column() %>%
spread(key = B, value = C) %>%
bind_cols(df) %>%
select(A, B, C, X, Y, Z)
#> A B C X Y Z
#> 1 A X 1 1 NA NA
#> 2 A Y 3 NA 3 NA
#> 3 A Z 3 NA NA 3
#> 4 A Z 2 NA NA 2
A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.
bind_cols(
df,
df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)
) %>%
select(-rowid, -A1)
#> A B C X Y Z
#> 1 A X 1 1 NA NA
#> 2 A Y 3 NA 3 NA
#> 3 A Z 3 NA NA 3
#> 4 A Z 2 NA NA 2
Created on 2018-11-28 by the reprex package (v0.2.1)
add a comment |
Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.
library(tidyr)
library(dplyr)
To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.
df %>%
tibble::rowid_to_column() %>%
spread(key = B, value = C)
#> rowid A X Y Z
#> 1 1 A 1 NA NA
#> 2 2 A NA 3 NA
#> 3 3 A NA NA 3
#> 4 4 A NA NA 2
You can do that, then column-bind this to the original data frame, but your columns will be out of order and you'll have a column A1 that is identical to column A. Using select, you can pick the columns you want in the proper order.
df %>%
tibble::rowid_to_column() %>%
spread(key = B, value = C) %>%
bind_cols(df) %>%
select(A, B, C, X, Y, Z)
#> A B C X Y Z
#> 1 A X 1 1 NA NA
#> 2 A Y 3 NA 3 NA
#> 3 A Z 3 NA NA 3
#> 4 A Z 2 NA NA 2
A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.
bind_cols(
df,
df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)
) %>%
select(-rowid, -A1)
#> A B C X Y Z
#> 1 A X 1 1 NA NA
#> 2 A Y 3 NA 3 NA
#> 3 A Z 3 NA NA 3
#> 4 A Z 2 NA NA 2
Created on 2018-11-28 by the reprex package (v0.2.1)
add a comment |
Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.
library(tidyr)
library(dplyr)
To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.
df %>%
tibble::rowid_to_column() %>%
spread(key = B, value = C)
#> rowid A X Y Z
#> 1 1 A 1 NA NA
#> 2 2 A NA 3 NA
#> 3 3 A NA NA 3
#> 4 4 A NA NA 2
You can do that, then column-bind this to the original data frame, but your columns will be out of order and you'll have a column A1 that is identical to column A. Using select, you can pick the columns you want in the proper order.
df %>%
tibble::rowid_to_column() %>%
spread(key = B, value = C) %>%
bind_cols(df) %>%
select(A, B, C, X, Y, Z)
#> A B C X Y Z
#> 1 A X 1 1 NA NA
#> 2 A Y 3 NA 3 NA
#> 3 A Z 3 NA NA 3
#> 4 A Z 2 NA NA 2
A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.
bind_cols(
df,
df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)
) %>%
select(-rowid, -A1)
#> A B C X Y Z
#> 1 A X 1 1 NA NA
#> 2 A Y 3 NA 3 NA
#> 3 A Z 3 NA NA 3
#> 4 A Z 2 NA NA 2
Created on 2018-11-28 by the reprex package (v0.2.1)
Here are two tidyverse ways that I will admit are slightly clunky. They both make use of tidyr::spread to get wide-shaped data, and then bind the wide version back to the original.
library(tidyr)
library(dplyr)
To use spread, you first need some sort of identification for each row. A quick way to do that is with tibble::rowid_to_column.
df %>%
tibble::rowid_to_column() %>%
spread(key = B, value = C)
#> rowid A X Y Z
#> 1 1 A 1 NA NA
#> 2 2 A NA 3 NA
#> 3 3 A NA NA 3
#> 4 4 A NA NA 2
You can do that, then column-bind this to the original data frame, but your columns will be out of order and you'll have a column A1 that is identical to column A. Using select, you can pick the columns you want in the proper order.
df %>%
tibble::rowid_to_column() %>%
spread(key = B, value = C) %>%
bind_cols(df) %>%
select(A, B, C, X, Y, Z)
#> A B C X Y Z
#> 1 A X 1 1 NA NA
#> 2 A Y 3 NA 3 NA
#> 3 A Z 3 NA NA 3
#> 4 A Z 2 NA NA 2
A little nicer is to do the spread operation inside your column binding, then just drop the 2 extraneous columns.
bind_cols(
df,
df %>% tibble::rowid_to_column() %>% spread(key = B, value = C)
) %>%
select(-rowid, -A1)
#> A B C X Y Z
#> 1 A X 1 1 NA NA
#> 2 A Y 3 NA 3 NA
#> 3 A Z 3 NA NA 3
#> 4 A Z 2 NA NA 2
Created on 2018-11-28 by the reprex package (v0.2.1)
answered Nov 28 '18 at 19:22
camillecamille
7,87531833
7,87531833
add a comment |
add a comment |
We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset
library(data.table)
cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,
value.var = 'C')[, -1, with = FALSE])
# A B C X Y Z
#1: A X 1 1 NA NA
#2: A Y 3 NA 3 NA
#3: A Z 3 NA NA 3
#4: A Z 2 NA NA 2
data
df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z",
"Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA,
-4L))
add a comment |
We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset
library(data.table)
cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,
value.var = 'C')[, -1, with = FALSE])
# A B C X Y Z
#1: A X 1 1 NA NA
#2: A Y 3 NA 3 NA
#3: A Z 3 NA NA 3
#4: A Z 2 NA NA 2
data
df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z",
"Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA,
-4L))
add a comment |
We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset
library(data.table)
cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,
value.var = 'C')[, -1, with = FALSE])
# A B C X Y Z
#1: A X 1 1 NA NA
#2: A Y 3 NA 3 NA
#3: A Z 3 NA NA 3
#4: A Z 2 NA NA 2
data
df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z",
"Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA,
-4L))
We can use dcast from data.table to convert the 'column 'C' to 'wide' and then cbind with the original dataset
library(data.table)
cbind(df1, dcast(setDT(df1), seq_len(nrow(df1))~ B,
value.var = 'C')[, -1, with = FALSE])
# A B C X Y Z
#1: A X 1 1 NA NA
#2: A Y 3 NA 3 NA
#3: A Z 3 NA NA 3
#4: A Z 2 NA NA 2
data
df1 <- structure(list(A = c("A", "A", "A", "A"), B = c("X", "Y", "Z",
"Z"), C = c(1L, 3L, 3L, 2L)), class = "data.frame", row.names = c(NA,
-4L))
edited Nov 28 '18 at 19:25
answered Nov 28 '18 at 18:40
akrunakrun
417k13206279
417k13206279
add a comment |
add a comment |