R - combining ifelse and substr [duplicate]












0
















This question already has an answer here:




  • Difference between `%in%` and `==`

    3 answers




My sample data is:



df <- as.data.frame(c("10M_Amts", "D2B_Exp", "D3C_Exp", "D2_Amt", "D5_Amt", "53D_Amt"))
colnames(df) <- c("Label")


I'd like to adhere to the following rule:



If the first 2 letters are either D2, D3, D4, D5 or if the first 3 letters are D1A or D1_ then I would like to return the word "Work" in a new column called Work. If not, then return "NA".



I've searched around but wasn't able to find an example of dplyr combining ifelse and multiple substr commands. My attempted code using dplyr is:



df2 <- df %>%
mutate(Work = ifelse(substr(Label, 1, 3) == c("D1_", "D1A") |
substr(Label, 1, 2) == c("D2", "D3", "D4", "D5"), Work, "NA"))


As you can observe, there are multiple OR's going on for example for the first three strings I attempted to use c("D1_", "D1A") to represent D1_ or D1A. This is the same for the first two strings c("D2", "D3", "D4", "D5") to represent D2 or D3 or D4 or D5. In all, if there is D1_ or D1A or D2 or D3 or D4 or D5 in the first 2 or 3 letters, then it should return "Work" in a new column and if not, "NA". However, using substr function I resorted to split both of these categories.



My ideal output is:



     Label       Work
1 10M_Amts NA
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt NA


As you can see the new column name is Work. In excel, I would write the following:



=IF(OR(LEFT(A1,3)="D1_",LEFT(A1,3)="D1A",LEFT(A1,2)={"D2","D3","D4","D5"}), 
"Work", "")


where column A is the Label column as per above. Sorry for the small sample, this worked when I did this in excel for ~5000 rows and for multiple categories other than "Work" but because the sheet will be too big, we'd like to convert to R.



Thank you so much in advance!










share|improve this question















marked as duplicate by nicola r
Users with the  r badge can single-handedly close r questions as duplicates and reopen them as needed.

StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 11:21


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.














  • 1





    What about D2A or similar? if that can happen should it be "work"?

    – sindri_baldur
    Nov 28 '18 at 11:15













  • D2A/ similar would also return "Work". Yep!

    – ynitSed
    Nov 28 '18 at 11:17








  • 1





    You have to use %in% instead of ==.

    – nicola
    Nov 28 '18 at 11:18
















0
















This question already has an answer here:




  • Difference between `%in%` and `==`

    3 answers




My sample data is:



df <- as.data.frame(c("10M_Amts", "D2B_Exp", "D3C_Exp", "D2_Amt", "D5_Amt", "53D_Amt"))
colnames(df) <- c("Label")


I'd like to adhere to the following rule:



If the first 2 letters are either D2, D3, D4, D5 or if the first 3 letters are D1A or D1_ then I would like to return the word "Work" in a new column called Work. If not, then return "NA".



I've searched around but wasn't able to find an example of dplyr combining ifelse and multiple substr commands. My attempted code using dplyr is:



df2 <- df %>%
mutate(Work = ifelse(substr(Label, 1, 3) == c("D1_", "D1A") |
substr(Label, 1, 2) == c("D2", "D3", "D4", "D5"), Work, "NA"))


As you can observe, there are multiple OR's going on for example for the first three strings I attempted to use c("D1_", "D1A") to represent D1_ or D1A. This is the same for the first two strings c("D2", "D3", "D4", "D5") to represent D2 or D3 or D4 or D5. In all, if there is D1_ or D1A or D2 or D3 or D4 or D5 in the first 2 or 3 letters, then it should return "Work" in a new column and if not, "NA". However, using substr function I resorted to split both of these categories.



My ideal output is:



     Label       Work
1 10M_Amts NA
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt NA


As you can see the new column name is Work. In excel, I would write the following:



=IF(OR(LEFT(A1,3)="D1_",LEFT(A1,3)="D1A",LEFT(A1,2)={"D2","D3","D4","D5"}), 
"Work", "")


where column A is the Label column as per above. Sorry for the small sample, this worked when I did this in excel for ~5000 rows and for multiple categories other than "Work" but because the sheet will be too big, we'd like to convert to R.



Thank you so much in advance!










share|improve this question















marked as duplicate by nicola r
Users with the  r badge can single-handedly close r questions as duplicates and reopen them as needed.

StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 11:21


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.














  • 1





    What about D2A or similar? if that can happen should it be "work"?

    – sindri_baldur
    Nov 28 '18 at 11:15













  • D2A/ similar would also return "Work". Yep!

    – ynitSed
    Nov 28 '18 at 11:17








  • 1





    You have to use %in% instead of ==.

    – nicola
    Nov 28 '18 at 11:18














0












0








0









This question already has an answer here:




  • Difference between `%in%` and `==`

    3 answers




My sample data is:



df <- as.data.frame(c("10M_Amts", "D2B_Exp", "D3C_Exp", "D2_Amt", "D5_Amt", "53D_Amt"))
colnames(df) <- c("Label")


I'd like to adhere to the following rule:



If the first 2 letters are either D2, D3, D4, D5 or if the first 3 letters are D1A or D1_ then I would like to return the word "Work" in a new column called Work. If not, then return "NA".



I've searched around but wasn't able to find an example of dplyr combining ifelse and multiple substr commands. My attempted code using dplyr is:



df2 <- df %>%
mutate(Work = ifelse(substr(Label, 1, 3) == c("D1_", "D1A") |
substr(Label, 1, 2) == c("D2", "D3", "D4", "D5"), Work, "NA"))


As you can observe, there are multiple OR's going on for example for the first three strings I attempted to use c("D1_", "D1A") to represent D1_ or D1A. This is the same for the first two strings c("D2", "D3", "D4", "D5") to represent D2 or D3 or D4 or D5. In all, if there is D1_ or D1A or D2 or D3 or D4 or D5 in the first 2 or 3 letters, then it should return "Work" in a new column and if not, "NA". However, using substr function I resorted to split both of these categories.



My ideal output is:



     Label       Work
1 10M_Amts NA
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt NA


As you can see the new column name is Work. In excel, I would write the following:



=IF(OR(LEFT(A1,3)="D1_",LEFT(A1,3)="D1A",LEFT(A1,2)={"D2","D3","D4","D5"}), 
"Work", "")


where column A is the Label column as per above. Sorry for the small sample, this worked when I did this in excel for ~5000 rows and for multiple categories other than "Work" but because the sheet will be too big, we'd like to convert to R.



Thank you so much in advance!










share|improve this question

















This question already has an answer here:




  • Difference between `%in%` and `==`

    3 answers




My sample data is:



df <- as.data.frame(c("10M_Amts", "D2B_Exp", "D3C_Exp", "D2_Amt", "D5_Amt", "53D_Amt"))
colnames(df) <- c("Label")


I'd like to adhere to the following rule:



If the first 2 letters are either D2, D3, D4, D5 or if the first 3 letters are D1A or D1_ then I would like to return the word "Work" in a new column called Work. If not, then return "NA".



I've searched around but wasn't able to find an example of dplyr combining ifelse and multiple substr commands. My attempted code using dplyr is:



df2 <- df %>%
mutate(Work = ifelse(substr(Label, 1, 3) == c("D1_", "D1A") |
substr(Label, 1, 2) == c("D2", "D3", "D4", "D5"), Work, "NA"))


As you can observe, there are multiple OR's going on for example for the first three strings I attempted to use c("D1_", "D1A") to represent D1_ or D1A. This is the same for the first two strings c("D2", "D3", "D4", "D5") to represent D2 or D3 or D4 or D5. In all, if there is D1_ or D1A or D2 or D3 or D4 or D5 in the first 2 or 3 letters, then it should return "Work" in a new column and if not, "NA". However, using substr function I resorted to split both of these categories.



My ideal output is:



     Label       Work
1 10M_Amts NA
2 D2B_Exp Work
3 D3C_Exp Work
4 D2_Amt Work
5 D5_Amt Work
6 53D_Amt NA


As you can see the new column name is Work. In excel, I would write the following:



=IF(OR(LEFT(A1,3)="D1_",LEFT(A1,3)="D1A",LEFT(A1,2)={"D2","D3","D4","D5"}), 
"Work", "")


where column A is the Label column as per above. Sorry for the small sample, this worked when I did this in excel for ~5000 rows and for multiple categories other than "Work" but because the sheet will be too big, we'd like to convert to R.



Thank you so much in advance!





This question already has an answer here:




  • Difference between `%in%` and `==`

    3 answers








r if-statement substr






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 28 '18 at 12:12







ynitSed

















asked Nov 28 '18 at 11:08









ynitSedynitSed

666




666




marked as duplicate by nicola r
Users with the  r badge can single-handedly close r questions as duplicates and reopen them as needed.

StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 11:21


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.









marked as duplicate by nicola r
Users with the  r badge can single-handedly close r questions as duplicates and reopen them as needed.

StackExchange.ready(function() {
if (StackExchange.options.isMobile) return;

$('.dupe-hammer-message-hover:not(.hover-bound)').each(function() {
var $hover = $(this).addClass('hover-bound'),
$msg = $hover.siblings('.dupe-hammer-message');

$hover.hover(
function() {
$hover.showInfoMessage('', {
messageElement: $msg.clone().show(),
transient: false,
position: { my: 'bottom left', at: 'top center', offsetTop: -7 },
dismissable: false,
relativeToBody: true
});
},
function() {
StackExchange.helpers.removeMessages();
}
);
});
});
Nov 28 '18 at 11:21


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.










  • 1





    What about D2A or similar? if that can happen should it be "work"?

    – sindri_baldur
    Nov 28 '18 at 11:15













  • D2A/ similar would also return "Work". Yep!

    – ynitSed
    Nov 28 '18 at 11:17








  • 1





    You have to use %in% instead of ==.

    – nicola
    Nov 28 '18 at 11:18














  • 1





    What about D2A or similar? if that can happen should it be "work"?

    – sindri_baldur
    Nov 28 '18 at 11:15













  • D2A/ similar would also return "Work". Yep!

    – ynitSed
    Nov 28 '18 at 11:17








  • 1





    You have to use %in% instead of ==.

    – nicola
    Nov 28 '18 at 11:18








1




1





What about D2A or similar? if that can happen should it be "work"?

– sindri_baldur
Nov 28 '18 at 11:15







What about D2A or similar? if that can happen should it be "work"?

– sindri_baldur
Nov 28 '18 at 11:15















D2A/ similar would also return "Work". Yep!

– ynitSed
Nov 28 '18 at 11:17







D2A/ similar would also return "Work". Yep!

– ynitSed
Nov 28 '18 at 11:17






1




1





You have to use %in% instead of ==.

– nicola
Nov 28 '18 at 11:18





You have to use %in% instead of ==.

– nicola
Nov 28 '18 at 11:18












2 Answers
2






active

oldest

votes


















2














This will work, you have two problems Work should be 'Work' and using %in% instead of == as you have multiple comparisons



df %>%
mutate(WRE = ifelse(substr(Label, 1, 3) %in% c("D1_", "D1A")|
substr(Label, 1, 2) %in% c("D2", "D3", "D4", "D5"), 'Work', "NA"))





share|improve this answer































    2














    If anything starting in D1, D2, ..., D5 translates into "work" you could do use some simple regular expressions instead of substr():



    df %>%
    mutate(work = ifelse(grepl("^D[0-5]", Label), "Work", NA))

    Label work
    1 10M_Amts <NA>
    2 D2B_Exp Work
    3 D3C_Exp Work
    4 D2_Amt Work
    5 D5_Amt Work
    6 53D_Amt <NA>





    share|improve this answer






























      2 Answers
      2






      active

      oldest

      votes








      2 Answers
      2






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      2














      This will work, you have two problems Work should be 'Work' and using %in% instead of == as you have multiple comparisons



      df %>%
      mutate(WRE = ifelse(substr(Label, 1, 3) %in% c("D1_", "D1A")|
      substr(Label, 1, 2) %in% c("D2", "D3", "D4", "D5"), 'Work', "NA"))





      share|improve this answer




























        2














        This will work, you have two problems Work should be 'Work' and using %in% instead of == as you have multiple comparisons



        df %>%
        mutate(WRE = ifelse(substr(Label, 1, 3) %in% c("D1_", "D1A")|
        substr(Label, 1, 2) %in% c("D2", "D3", "D4", "D5"), 'Work', "NA"))





        share|improve this answer


























          2












          2








          2







          This will work, you have two problems Work should be 'Work' and using %in% instead of == as you have multiple comparisons



          df %>%
          mutate(WRE = ifelse(substr(Label, 1, 3) %in% c("D1_", "D1A")|
          substr(Label, 1, 2) %in% c("D2", "D3", "D4", "D5"), 'Work', "NA"))





          share|improve this answer













          This will work, you have two problems Work should be 'Work' and using %in% instead of == as you have multiple comparisons



          df %>%
          mutate(WRE = ifelse(substr(Label, 1, 3) %in% c("D1_", "D1A")|
          substr(Label, 1, 2) %in% c("D2", "D3", "D4", "D5"), 'Work', "NA"))






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Nov 28 '18 at 11:18









          A. SulimanA. Suliman

          5,63241223




          5,63241223

























              2














              If anything starting in D1, D2, ..., D5 translates into "work" you could do use some simple regular expressions instead of substr():



              df %>%
              mutate(work = ifelse(grepl("^D[0-5]", Label), "Work", NA))

              Label work
              1 10M_Amts <NA>
              2 D2B_Exp Work
              3 D3C_Exp Work
              4 D2_Amt Work
              5 D5_Amt Work
              6 53D_Amt <NA>





              share|improve this answer




























                2














                If anything starting in D1, D2, ..., D5 translates into "work" you could do use some simple regular expressions instead of substr():



                df %>%
                mutate(work = ifelse(grepl("^D[0-5]", Label), "Work", NA))

                Label work
                1 10M_Amts <NA>
                2 D2B_Exp Work
                3 D3C_Exp Work
                4 D2_Amt Work
                5 D5_Amt Work
                6 53D_Amt <NA>





                share|improve this answer


























                  2












                  2








                  2







                  If anything starting in D1, D2, ..., D5 translates into "work" you could do use some simple regular expressions instead of substr():



                  df %>%
                  mutate(work = ifelse(grepl("^D[0-5]", Label), "Work", NA))

                  Label work
                  1 10M_Amts <NA>
                  2 D2B_Exp Work
                  3 D3C_Exp Work
                  4 D2_Amt Work
                  5 D5_Amt Work
                  6 53D_Amt <NA>





                  share|improve this answer













                  If anything starting in D1, D2, ..., D5 translates into "work" you could do use some simple regular expressions instead of substr():



                  df %>%
                  mutate(work = ifelse(grepl("^D[0-5]", Label), "Work", NA))

                  Label work
                  1 10M_Amts <NA>
                  2 D2B_Exp Work
                  3 D3C_Exp Work
                  4 D2_Amt Work
                  5 D5_Amt Work
                  6 53D_Amt <NA>






                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Nov 28 '18 at 11:18









                  sindri_baldursindri_baldur

                  8,2451033




                  8,2451033















                      Popular posts from this blog

                      A CLEAN and SIMPLE way to add appendices to Table of Contents and bookmarks

                      Calculate evaluation metrics using cross_val_predict sklearn

                      Insert data from modal to MySQL (multiple modal on website)