Why am I getting this New York output alone?

Consider the following vector ‘tels’, which contains telephone numbers from the “KANSAS”, “TEXAS” and “NEW YORK” regions.

tels <- c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482", "510-853-5695", "510-882-9898", "650-555-6311", "707-885-6351", "650-231-1234", "650-096-0023", "707-691-6763")

If the number starts with 510, the phone number is from “KANSAS”; if it starts with 707, it is from “NEW YORK”; and if it starts with 650, it is from “TEXAS”.

Use R concepts to obtain the following data frame as output.

Expected Output:

   PhoneNumbers    State
1    5105482238   KANSAS
2    7072312440 NEW YORK
3    6507521300    TEXAS
4    5106743482   KANSAS
5    5108535695   KANSAS
6    5108829898   KANSAS
7    6505556311    TEXAS
8    7078856351 NEW YORK
9    6502311234    TEXAS
10   6500960023    TEXAS
11   7076916763 NEW YORK

This is my code:

z <- substr(tels, 1, 3)
dirt <- data.frame(tels, z)
dirt
for (i in z) {
  if (i == 510) {
    sta <- "ddfdd"
  }
  if (i == 707) {
    sta <- "NEW YORK"
  }
  if (i == 650) {
    sta <- "TEXAS"
  }
}
das <- data.frame(tels, sta)
das

But I'm getting this output:

           tels      sta
1  510-548-2238 NEW YORK
2  707-231-2440 NEW YORK
3  650-752-1300 NEW YORK
4  510-674-3482 NEW YORK
5  510-853-5695 NEW YORK
6  510-882-9898 NEW YORK
7  650-555-6311 NEW YORK
8  707-885-6351 NEW YORK
9  650-231-1234 NEW YORK
10 650-096-0023 NEW YORK
11 707-691-6763 NEW YORK

asked Nov 28 '18 at 5:44
Chandan

3 Answers

You can use factor(), with the levels being the first 3 digits and the labels being the states:

data.frame(tels,
           state = factor(substr(tels, 1, 3),
                          levels = c('510', '650', '707'),
                          labels = c('KANSAS', 'TEXAS', 'NEW YORK')))
#            tels    state
# 1  510-548-2238   KANSAS
# 2  707-231-2440 NEW YORK
# 3  650-752-1300    TEXAS
# 4  510-674-3482   KANSAS
# 5  510-853-5695   KANSAS
# 6  510-882-9898   KANSAS
# 7  650-555-6311    TEXAS
# 8  707-885-6351 NEW YORK
# 9  650-231-1234    TEXAS
# 10 650-096-0023    TEXAS
# 11 707-691-6763 NEW YORK
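
To make the pairing of levels and labels more explicit, here is a small sketch on a shortened vector. The names `codes`, `state` and `unknown`, and the number "415-000-0000", are made up for illustration and are not part of the answer. It shows that the levels are looked up in the data while the labels replace them position by position, and that a code missing from the levels comes out as NA.

```r
tels <- c("510-548-2238", "707-231-2440", "650-752-1300")
codes <- substr(tels, 1, 3)

# levels are matched against the data; labels replace them position by position
state <- factor(codes,
                levels = c("510", "650", "707"),
                labels = c("KANSAS", "TEXAS", "NEW YORK"))
as.character(state)

# hypothetical number with an area code that is not among the levels
unknown <- factor(substr("415-000-0000", 1, 3),
                  levels = c("510", "650", "707"),
                  labels = c("KANSAS", "TEXAS", "NEW YORK"))
is.na(unknown)
```

If a plain character column is preferred over a factor, wrapping the result in as.character() as above should be enough.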






answered Nov 28 '18 at 5:49
Onyambu

We extract the first three characters of 'tels' with substr() and use them to index a named vector whose names are the area codes and whose values are the states:

data.frame(PhoneNumbers = tels,
           state = setNames(c("KANSAS", "NEW YORK", "TEXAS"),
                            c('510', '707', '650'))[substr(tels, 1, 3)])
#    PhoneNumbers    state
# 1  510-548-2238   KANSAS
# 2  707-231-2440 NEW YORK
# 3  650-752-1300    TEXAS
# 4  510-674-3482   KANSAS
# 5  510-853-5695   KANSAS
# 6  510-882-9898   KANSAS
# 7  650-555-6311    TEXAS
# 8  707-885-6351 NEW YORK
# 9  650-231-1234    TEXAS
# 10 650-096-0023    TEXAS
# 11 707-691-6763 NEW YORK
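
The named-vector lookup can also be written out step by step. The sketch below uses a shortened vector; the names `lookup`, `state` and `result` are illustrative, and removing the hyphens with gsub() is an assumption about what the expected output in the question asks for (it shows the numbers without hyphens).

```r
tels <- c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482")

# named vector used as a lookup table: names are area codes, values are states
lookup <- c("510" = "KANSAS", "707" = "NEW YORK", "650" = "TEXAS")

# indexing by name returns one state per phone number;
# unname() drops the area codes so they are not carried along
state <- unname(lookup[substr(tels, 1, 3)])

# strip the hyphens so the numbers look like those in the expected output
result <- data.frame(PhoneNumbers = gsub("-", "", tels, fixed = TRUE),
                     State = state)
result
```

Here the phone numbers stay character strings; whether they should be converted to numbers is left open, since the question does not say.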






edited Nov 28 '18 at 5:56
answered Nov 28 '18 at 5:47
akrun

• I want the output as mentioned in the question.
  – Chandan, Nov 28 '18 at 5:50

• It is the output as shown in the question.
  – akrun, Nov 28 '18 at 8:16

You can match the leading area code with the patterns ^510, ^650 and ^707. To add the new column easily, I have used the dplyr package.

library(tidyverse) # has dplyr and stringr
# data set -------------------------------
# tibble() replaces the deprecated data_frame()
(dirt <- tibble(PhoneNumbers = c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482", "510-853-5695", "510-882-9898", "650-555-6311", "707-885-6351", "650-231-1234", "650-096-0023", "707-691-6763")))
#> # A tibble: 11 x 1
#>    PhoneNumbers
#>    <chr>
#>  1 510-548-2238
#>  2 707-231-2440
#>  3 650-752-1300
#>  4 510-674-3482
#>  5 510-853-5695
#>  6 510-882-9898
#>  7 650-555-6311
#>  8 707-885-6351
#>  9 650-231-1234
#> 10 650-096-0023
#> 11 707-691-6763

You can write a function that finds each region by detecting each pattern with stringr::str_detect().

You can do it all at once using sapply(). If you apply str_detect() to each of c("^510", "^650", "^707"), you get an 11 x 3 logical matrix with one row per phone number and one column per pattern. Each value says whether the number matches the pattern (TRUE or FALSE).

For each row, exactly one value is TRUE by construction. You can find its column index and use it to subset c("KANSAS", "TEXAS", "NEW YORK").

find_region <- function(x) {
  sta <- c("^510", "^650", "^707")
  stt <- sapply(sta, function(p) {
    str_detect(x, pattern = p)
  }) %>% # 11 x 3 matrix of TRUE/FALSE, one column per pattern (510, 650, 707)
    apply(1, which) # column index of the TRUE in each row
  c("KANSAS", "TEXAS", "NEW YORK")[stt]
}
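
To see the intermediate matrix that the function builds, the same idea can be sketched in base R, with grepl() swapped in for str_detect() so that no package is needed. The names `patterns`, `states`, `hits` and `idx` are illustrative only, and the vector is shortened to four numbers.

```r
tels <- c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482")
patterns <- c("^510", "^650", "^707")
states <- c("KANSAS", "TEXAS", "NEW YORK")

# one row per phone number, one column per pattern
hits <- sapply(patterns, function(p) grepl(p, tels))
hits

# which() on each row gives the column that holds the single TRUE
idx <- apply(hits, 1, which)
states[idx]
```

This relies on every number matching exactly one pattern; a number with no match, or with several, would make apply() return a list rather than a plain index vector.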


Using this function, you can add the new column with dplyr::mutate():

dirt %>%
  mutate(State = find_region(PhoneNumbers))
#> # A tibble: 11 x 2
#>    PhoneNumbers State
#>    <chr>        <chr>
#>  1 510-548-2238 KANSAS
#>  2 707-231-2440 NEW YORK
#>  3 650-752-1300 TEXAS
#>  4 510-674-3482 KANSAS
#>  5 510-853-5695 KANSAS
#>  6 510-882-9898 KANSAS
#>  7 650-555-6311 TEXAS
#>  8 707-885-6351 NEW YORK
#>  9 650-231-1234 TEXAS
#> 10 650-096-0023 TEXAS
#> 11 707-691-6763 NEW YORK
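
As for the loop in the question: `sta` holds a single string that is overwritten on every iteration, so once the loop finishes it contains only the value assigned for the last element of `z` ("707", hence "NEW YORK"), and data.frame() recycles that one value down the whole column. A minimal sketch of a loop that keeps one state per number follows. It assumes the intended label for 510 is "KANSAS" (the question's code has a placeholder string there), and the index name `k` is illustrative.

```r
tels <- c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482",
          "510-853-5695", "510-882-9898", "650-555-6311", "707-885-6351",
          "650-231-1234", "650-096-0023", "707-691-6763")
z <- substr(tels, 1, 3)

# pre-allocate one slot per phone number instead of a single value
sta <- character(length(z))
for (k in seq_along(z)) {
  if (z[k] == "510") {
    sta[k] <- "KANSAS"
  } else if (z[k] == "707") {
    sta[k] <- "NEW YORK"
  } else if (z[k] == "650") {
    sta[k] <- "TEXAS"
  }
}

das <- data.frame(tels, sta)
das
```

The vectorised approaches in the answers do the same thing without an explicit loop, which is usually the more idiomatic choice in R.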





                      share|improve this answer






























                        1














                        You can find first pattern using ^510, ^650, and ^707. To easily add new column, I have used dplyr package.



                        library(tidyverse) # has dplyr and stringr
                        # data set -------------------------------
                        (dirt <- data_frame(PhoneNumbers = c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482", "510-853-5695", "510-882-9898", "650-555-6311", "707-885-6351", "650-231-1234", "650-096-0023", "707-691-6763")))
                        #> # A tibble: 11 x 1
                        #> PhoneNumbers
                        #> <chr>
                        #> 1 510-548-2238
                        #> 2 707-231-2440
                        #> 3 650-752-1300
                        #> 4 510-674-3482
                        #> 5 510-853-5695
                        #> 6 510-882-9898
                        #> 7 650-555-6311
                        #> 8 707-885-6351
                        #> 9 650-231-1234
                        #> 10 650-096-0023
                        #> 11 707-691-6763


                        You can make function to find each region by finding each pattern: stringr::str_detect()



                        You can do it at once using sapply(). If you perform str_detect to c("^510", "^650", "^707"), you will get a matrix each of which column is the number. Each value is whether the number contains the pattern(TRUE or FALSE), i.e. 11 x 3.



                        For each row, you have only one TRUE by construction. You can find this index and subset c("KANSAS", "TEXAS", "NEW YORK").



                        find_region <- function(x) {
                        sta <- c("^510", "^650", "^707")
                        stt <- sapply(sta, function(p) {
                        str_detect(x, pattern = p)
                        }) %>% # produce matrix 11x3 of TRUE and FALSE, each column = 510, 650, 707, TRUE if x contains the pattern
                        apply(1, which) # get the index
                        c("KANSAS", "TEXAS", "NEW YORK")[stt]
                        }


                        Using this function, you can add new column: dplyr::mutate()



                        dirt %>% 
                        mutate(State = find_region(PhoneNumbers))
                        #> # A tibble: 11 x 2
                        #> PhoneNumbers State
                        #> <chr> <chr>
                        #> 1 510-548-2238 KANSAS
                        #> 2 707-231-2440 NEW YORK
                        #> 3 650-752-1300 TEXAS
                        #> 4 510-674-3482 KANSAS
                        #> 5 510-853-5695 KANSAS
                        #> 6 510-882-9898 KANSAS
                        #> 7 650-555-6311 TEXAS
                        #> 8 707-885-6351 NEW YORK
                        #> 9 650-231-1234 TEXAS
                        #> 10 650-096-0023 TEXAS
                        #> 11 707-691-6763 NEW YORK





                        share|improve this answer




























                          1












                          1








                          1







                          You can find first pattern using ^510, ^650, and ^707. To easily add new column, I have used dplyr package.



                          library(tidyverse) # has dplyr and stringr
                          # data set -------------------------------
                          (dirt <- data_frame(PhoneNumbers = c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482", "510-853-5695", "510-882-9898", "650-555-6311", "707-885-6351", "650-231-1234", "650-096-0023", "707-691-6763")))
                          #> # A tibble: 11 x 1
                          #> PhoneNumbers
                          #> <chr>
                          #> 1 510-548-2238
                          #> 2 707-231-2440
                          #> 3 650-752-1300
                          #> 4 510-674-3482
                          #> 5 510-853-5695
                          #> 6 510-882-9898
                          #> 7 650-555-6311
                          #> 8 707-885-6351
                          #> 9 650-231-1234
                          #> 10 650-096-0023
                          #> 11 707-691-6763


                          You can make function to find each region by finding each pattern: stringr::str_detect()



                          You can do it at once using sapply(). If you perform str_detect to c("^510", "^650", "^707"), you will get a matrix each of which column is the number. Each value is whether the number contains the pattern(TRUE or FALSE), i.e. 11 x 3.



                          For each row, you have only one TRUE by construction. You can find this index and subset c("KANSAS", "TEXAS", "NEW YORK").



                          find_region <- function(x) {
                          sta <- c("^510", "^650", "^707")
                          stt <- sapply(sta, function(p) {
                          str_detect(x, pattern = p)
                          }) %>% # produce matrix 11x3 of TRUE and FALSE, each column = 510, 650, 707, TRUE if x contains the pattern
                          apply(1, which) # get the index
                          c("KANSAS", "TEXAS", "NEW YORK")[stt]
                          }


                          Using this function, you can add new column: dplyr::mutate()



                          dirt %>% 
                          mutate(State = find_region(PhoneNumbers))
                          #> # A tibble: 11 x 2
                          #> PhoneNumbers State
                          #> <chr> <chr>
                          #> 1 510-548-2238 KANSAS
                          #> 2 707-231-2440 NEW YORK
                          #> 3 650-752-1300 TEXAS
                          #> 4 510-674-3482 KANSAS
                          #> 5 510-853-5695 KANSAS
                          #> 6 510-882-9898 KANSAS
                          #> 7 650-555-6311 TEXAS
                          #> 8 707-885-6351 NEW YORK
                          #> 9 650-231-1234 TEXAS
                          #> 10 650-096-0023 TEXAS
                          #> 11 707-691-6763 NEW YORK
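As an alternative closer to the substr() approach in the question, the same mapping can be sketched in base R with a named lookup vector. This is not the method used above, only a possible variant; the names lookup, prefix and result are illustrative. It also strips the dashes, since the expected output in the question shows the numbers without them.

```r
tels <- c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482",
          "510-853-5695", "510-882-9898", "650-555-6311", "707-885-6351",
          "650-231-1234", "650-096-0023", "707-691-6763")

# Named vector: the names are the three-digit prefixes, the values are the states
lookup <- c("510" = "KANSAS", "707" = "NEW YORK", "650" = "TEXAS")

# First three characters of every number, computed for the whole vector at once
prefix <- substr(tels, 1, 3)

result <- data.frame(
  PhoneNumbers = gsub("-", "", tels, fixed = TRUE),  # remove the dashes
  State = unname(lookup[prefix]),                    # NA for an unknown prefix
  stringsAsFactors = FALSE
)
print(result)
```

Because the lookup is indexed by the whole prefix vector, every row gets its own state, and no loop that overwrites a single value is needed.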


































                          edited Nov 28 '18 at 9:28

























                          answered Nov 28 '18 at 7:32









                          Blended





























