R - combining ifelse and substr [duplicate]
This question already has an answer here:
Difference between `%in%` and `==`
3 answers
My sample data is:
df <- as.data.frame(c("10M_Amts", "D2B_Exp", "D3C_Exp", "D2_Amt", "D5_Amt", "53D_Amt"))
colnames(df) <- c("Label")
I'd like to adhere to the following rule:
If the first 2 letters are either D2, D3, D4, D5 or if the first 3 letters are D1A or D1_ then I would like to return the word "Work" in a new column called Work. If not, then return "NA".
I've searched around but wasn't able to find an example of dplyr combining ifelse and multiple substr commands. My attempted code using dplyr is:
df2 <- df %>%
mutate(Work = ifelse(substr(Label, 1, 3) == c("D1_", "D1A") |
substr(Label, 1, 2) == c("D2", "D3", "D4", "D5"), Work, "NA"))
As you can observe, there are multiple OR's going on for example for the first three strings I attempted to use c("D1_", "D1A") to represent D1_ or D1A. This is the same for the first two strings c("D2", "D3", "D4", "D5") to represent D2 or D3 or D4 or D5. In all, if there is D1_ or D1A or D2 or D3 or D4 or D5 in the first 2 or 3 letters, then it should return "Work" in a new column and if not, "NA". However, using substr function I resorted to split both of these categories.
My ideal output is:
Label Work
1 10M_Amts NA
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt NA
As you can see the new column name is Work. In excel, I would write the following:
=IF(OR(LEFT(A1,3)="D1_",LEFT(A1,3)="D1A",LEFT(A1,2)={"D2","D3","D4","D5"}),
"Work", "")
where column A is the Label column as per above. Sorry for the small sample, this worked when I did this in excel for ~5000 rows and for multiple categories other than "Work" but because the sheet will be too big, we'd like to convert to R.
Thank you so much in advance!
r if-statement substr
marked as duplicate by nicola
StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;
$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');
$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 11:21
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
add a comment |
This question already has an answer here:
Difference between `%in%` and `==`
3 answers
My sample data is:
df <- as.data.frame(c("10M_Amts", "D2B_Exp", "D3C_Exp", "D2_Amt", "D5_Amt", "53D_Amt"))
colnames(df) <- c("Label")
I'd like to adhere to the following rule:
If the first 2 letters are either D2, D3, D4, D5 or if the first 3 letters are D1A or D1_ then I would like to return the word "Work" in a new column called Work. If not, then return "NA".
I've searched around but wasn't able to find an example of dplyr combining ifelse and multiple substr commands. My attempted code using dplyr is:
df2 <- df %>%
mutate(Work = ifelse(substr(Label, 1, 3) == c("D1_", "D1A") |
substr(Label, 1, 2) == c("D2", "D3", "D4", "D5"), Work, "NA"))
As you can observe, there are multiple OR's going on for example for the first three strings I attempted to use c("D1_", "D1A") to represent D1_ or D1A. This is the same for the first two strings c("D2", "D3", "D4", "D5") to represent D2 or D3 or D4 or D5. In all, if there is D1_ or D1A or D2 or D3 or D4 or D5 in the first 2 or 3 letters, then it should return "Work" in a new column and if not, "NA". However, using substr function I resorted to split both of these categories.
My ideal output is:
Label Work
1 10M_Amts NA
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt NA
As you can see the new column name is Work. In excel, I would write the following:
=IF(OR(LEFT(A1,3)="D1_",LEFT(A1,3)="D1A",LEFT(A1,2)={"D2","D3","D4","D5"}),
"Work", "")
where column A is the Label column as per above. Sorry for the small sample, this worked when I did this in excel for ~5000 rows and for multiple categories other than "Work" but because the sheet will be too big, we'd like to convert to R.
Thank you so much in advance!
r if-statement substr
marked as duplicate by nicola
StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;
$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');
$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 11:21
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
1
What aboutD2A
or similar? if that can happen should it be "work"?
– sindri_baldur
Nov 28 '18 at 11:15
D2A/ similar would also return "Work". Yep!
– ynitSed
Nov 28 '18 at 11:17
1
You have to use%in%
instead of==
.
– nicola
Nov 28 '18 at 11:18
add a comment |
This question already has an answer here:
Difference between `%in%` and `==`
3 answers
My sample data is:
df <- as.data.frame(c("10M_Amts", "D2B_Exp", "D3C_Exp", "D2_Amt", "D5_Amt", "53D_Amt"))
colnames(df) <- c("Label")
I'd like to adhere to the following rule:
If the first 2 letters are either D2, D3, D4, D5 or if the first 3 letters are D1A or D1_ then I would like to return the word "Work" in a new column called Work. If not, then return "NA".
I've searched around but wasn't able to find an example of dplyr combining ifelse and multiple substr commands. My attempted code using dplyr is:
df2 <- df %>%
mutate(Work = ifelse(substr(Label, 1, 3) == c("D1_", "D1A") |
substr(Label, 1, 2) == c("D2", "D3", "D4", "D5"), Work, "NA"))
As you can observe, there are multiple OR's going on for example for the first three strings I attempted to use c("D1_", "D1A") to represent D1_ or D1A. This is the same for the first two strings c("D2", "D3", "D4", "D5") to represent D2 or D3 or D4 or D5. In all, if there is D1_ or D1A or D2 or D3 or D4 or D5 in the first 2 or 3 letters, then it should return "Work" in a new column and if not, "NA". However, using substr function I resorted to split both of these categories.
My ideal output is:
Label Work
1 10M_Amts NA
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt NA
As you can see the new column name is Work. In excel, I would write the following:
=IF(OR(LEFT(A1,3)="D1_",LEFT(A1,3)="D1A",LEFT(A1,2)={"D2","D3","D4","D5"}),
"Work", "")
where column A is the Label column as per above. Sorry for the small sample, this worked when I did this in excel for ~5000 rows and for multiple categories other than "Work" but because the sheet will be too big, we'd like to convert to R.
Thank you so much in advance!
r if-statement substr
This question already has an answer here:
Difference between `%in%` and `==`
3 answers
My sample data is:
df <- as.data.frame(c("10M_Amts", "D2B_Exp", "D3C_Exp", "D2_Amt", "D5_Amt", "53D_Amt"))
colnames(df) <- c("Label")
I'd like to adhere to the following rule:
If the first 2 letters are either D2, D3, D4, D5 or if the first 3 letters are D1A or D1_ then I would like to return the word "Work" in a new column called Work. If not, then return "NA".
I've searched around but wasn't able to find an example of dplyr combining ifelse and multiple substr commands. My attempted code using dplyr is:
df2 <- df %>%
mutate(Work = ifelse(substr(Label, 1, 3) == c("D1_", "D1A") |
substr(Label, 1, 2) == c("D2", "D3", "D4", "D5"), Work, "NA"))
As you can observe, there are multiple OR's going on for example for the first three strings I attempted to use c("D1_", "D1A") to represent D1_ or D1A. This is the same for the first two strings c("D2", "D3", "D4", "D5") to represent D2 or D3 or D4 or D5. In all, if there is D1_ or D1A or D2 or D3 or D4 or D5 in the first 2 or 3 letters, then it should return "Work" in a new column and if not, "NA". However, using substr function I resorted to split both of these categories.
My ideal output is:
Label Work
1 10M_Amts NA
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt NA
As you can see the new column name is Work. In excel, I would write the following:
=IF(OR(LEFT(A1,3)="D1_",LEFT(A1,3)="D1A",LEFT(A1,2)={"D2","D3","D4","D5"}),
"Work", "")
where column A is the Label column as per above. Sorry for the small sample, this worked when I did this in excel for ~5000 rows and for multiple categories other than "Work" but because the sheet will be too big, we'd like to convert to R.
Thank you so much in advance!
This question already has an answer here:
Difference between `%in%` and `==`
3 answers
r if-statement substr
r if-statement substr
edited Nov 28 '18 at 12:12
ynitSed
asked Nov 28 '18 at 11:08
ynitSedynitSed
666
666
marked as duplicate by nicola
StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;
$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');
$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 11:21
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
marked as duplicate by nicola
StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;
$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');
$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 11:21
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
1
What aboutD2A
or similar? if that can happen should it be "work"?
– sindri_baldur
Nov 28 '18 at 11:15
D2A/ similar would also return "Work". Yep!
– ynitSed
Nov 28 '18 at 11:17
1
You have to use%in%
instead of==
.
– nicola
Nov 28 '18 at 11:18
add a comment |
1
What aboutD2A
or similar? if that can happen should it be "work"?
– sindri_baldur
Nov 28 '18 at 11:15
D2A/ similar would also return "Work". Yep!
– ynitSed
Nov 28 '18 at 11:17
1
You have to use%in%
instead of==
.
– nicola
Nov 28 '18 at 11:18
1
1
What about
D2A
or similar? if that can happen should it be "work"?– sindri_baldur
Nov 28 '18 at 11:15
What about
D2A
or similar? if that can happen should it be "work"?– sindri_baldur
Nov 28 '18 at 11:15
D2A/ similar would also return "Work". Yep!
– ynitSed
Nov 28 '18 at 11:17
D2A/ similar would also return "Work". Yep!
– ynitSed
Nov 28 '18 at 11:17
1
1
You have to use
%in%
instead of ==
.– nicola
Nov 28 '18 at 11:18
You have to use
%in%
instead of ==
.– nicola
Nov 28 '18 at 11:18
add a comment |
2 Answers
2
active
oldest
votes
This will work, you have two problems Work should be 'Work' and using %in%
instead of ==
as you have multiple comparisons
df %>%
mutate(WRE = ifelse(substr(Label, 1, 3) %in% c("D1_", "D1A")|
substr(Label, 1, 2) %in% c("D2", "D3", "D4", "D5"), 'Work', "NA"))
add a comment |
If anything starting in D1, D2, ..., D5
translates into "work"
you could do use some simple regular expressions instead of substr()
:
df %>%
mutate(work = ifelse(grepl("^D[0-5]", Label), "Work", NA))
Label work
1 10M_Amts <NA>
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt <NA>
add a comment |
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
This will work, you have two problems Work should be 'Work' and using %in%
instead of ==
as you have multiple comparisons
df %>%
mutate(WRE = ifelse(substr(Label, 1, 3) %in% c("D1_", "D1A")|
substr(Label, 1, 2) %in% c("D2", "D3", "D4", "D5"), 'Work', "NA"))
add a comment |
This will work, you have two problems Work should be 'Work' and using %in%
instead of ==
as you have multiple comparisons
df %>%
mutate(WRE = ifelse(substr(Label, 1, 3) %in% c("D1_", "D1A")|
substr(Label, 1, 2) %in% c("D2", "D3", "D4", "D5"), 'Work', "NA"))
add a comment |
This will work, you have two problems Work should be 'Work' and using %in%
instead of ==
as you have multiple comparisons
df %>%
mutate(WRE = ifelse(substr(Label, 1, 3) %in% c("D1_", "D1A")|
substr(Label, 1, 2) %in% c("D2", "D3", "D4", "D5"), 'Work', "NA"))
This will work, you have two problems Work should be 'Work' and using %in%
instead of ==
as you have multiple comparisons
df %>%
mutate(WRE = ifelse(substr(Label, 1, 3) %in% c("D1_", "D1A")|
substr(Label, 1, 2) %in% c("D2", "D3", "D4", "D5"), 'Work', "NA"))
answered Nov 28 '18 at 11:18
A. SulimanA. Suliman
5,63241223
5,63241223
add a comment |
add a comment |
If anything starting in D1, D2, ..., D5
translates into "work"
you could do use some simple regular expressions instead of substr()
:
df %>%
mutate(work = ifelse(grepl("^D[0-5]", Label), "Work", NA))
Label work
1 10M_Amts <NA>
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt <NA>
add a comment |
If anything starting in D1, D2, ..., D5
translates into "work"
you could do use some simple regular expressions instead of substr()
:
df %>%
mutate(work = ifelse(grepl("^D[0-5]", Label), "Work", NA))
Label work
1 10M_Amts <NA>
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt <NA>
add a comment |
If anything starting in D1, D2, ..., D5
translates into "work"
you could do use some simple regular expressions instead of substr()
:
df %>%
mutate(work = ifelse(grepl("^D[0-5]", Label), "Work", NA))
Label work
1 10M_Amts <NA>
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt <NA>
If anything starting in D1, D2, ..., D5
translates into "work"
you could do use some simple regular expressions instead of substr()
:
df %>%
mutate(work = ifelse(grepl("^D[0-5]", Label), "Work", NA))
Label work
1 10M_Amts <NA>
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt <NA>
answered Nov 28 '18 at 11:18
sindri_baldursindri_baldur
8,2451033
8,2451033
add a comment |
add a comment |
1
What about
D2A
or similar? if that can happen should it be "work"?– sindri_baldur
Nov 28 '18 at 11:15
D2A/ similar would also return "Work". Yep!
– ynitSed
Nov 28 '18 at 11:17
1
You have to use
%in%
instead of==
.– nicola
Nov 28 '18 at 11:18