Why am I getting this New York output alone?

Consider the following vector ‘tels’ which contains telephone numbers from “KANSAS”, “TEXAS” and “NEW YORK” regions.

tels <- c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482", "510-853-5695", "510-882-9898", "650-555-6311", "707-885-6351", "650-231-1234", "650-096-0023", "707-691-6763")

If the number starts with 510, the phone number is from “KANSAS”; if it starts with 707, it is from “NEW YORK”; and if it starts with 650, it is from “TEXAS”.

Use R concepts to obtain the following data frame as output.

Expected output:

    PhoneNumbers     State
1     5105482238    KANSAS
2     7072312440  NEW YORK
3     6507521300     TEXAS
4     5106743482    KANSAS
5     5108535695    KANSAS
6     5108829898    KANSAS
7     6505556311     TEXAS
8     7078856351  NEW YORK
9     6502311234     TEXAS
10    6500960023     TEXAS
11    7076916763  NEW YORK

This is my code:

z <- substr(tels, 1, 3)
dirt <- data.frame(tels, z)
dirt
for (i in z) {
  if (i == 510) {
    sta <- "ddfdd"
  }
  if (i == 707) {
    sta <- "NEW YORK"
  }
  if (i == 650) {
    sta <- "TEXAS"
  }
}
das <- data.frame(tels, sta)
das

but I'm getting this output:

           tels      sta
1  510-548-2238 NEW YORK
2  707-231-2440 NEW YORK
3  650-752-1300 NEW YORK
4  510-674-3482 NEW YORK
5  510-853-5695 NEW YORK
6  510-882-9898 NEW YORK
7  650-555-6311 NEW YORK
8  707-885-6351 NEW YORK
9  650-231-1234 NEW YORK
10 650-096-0023 NEW YORK
11 707-691-6763 NEW YORK

asked Nov 28 '18 at 5:44

Chandan

3 Answers

You can use factor(), with the levels being the first three digits and the labels being the state names.

data.frame(tels,
           state = factor(substr(tels, 1, 3),
                          levels = c('510', '650', '707'),
                          labels = c('KANSAS', 'TEXAS', 'NEW YORK')))

           tels    state
1  510-548-2238   KANSAS
2  707-231-2440 NEW YORK
3  650-752-1300    TEXAS
4  510-674-3482   KANSAS
5  510-853-5695   KANSAS
6  510-882-9898   KANSAS
7  650-555-6311    TEXAS
8  707-885-6351 NEW YORK
9  650-231-1234    TEXAS
10 650-096-0023    TEXAS
11 707-691-6763 NEW YORK

answered Nov 28 '18 at 5:49

Onyambu


Take the first three characters of each element of 'tels' with substr(), then use them to index a named vector whose names are the area codes and whose values are the states.

data.frame(PhoneNumbers = tels, state = setNames(c("KANSAS", "NEW YORK", "TEXAS"),
               c('510', '707', '650'))[substr(tels, 1, 3)])
#   PhoneNumbers    state
#1  510-548-2238   KANSAS
#2  707-231-2440 NEW YORK
#3  650-752-1300    TEXAS
#4  510-674-3482   KANSAS
#5  510-853-5695   KANSAS
#6  510-882-9898   KANSAS
#7  650-555-6311    TEXAS
#8  707-885-6351 NEW YORK
#9  650-231-1234    TEXAS
#10 650-096-0023    TEXAS
#11 707-691-6763 NEW YORK

edited Nov 28 '18 at 5:56

answered Nov 28 '18 at 5:47

akrun


I want the output as mentioned in the question.

– Chandan
Nov 28 '18 at 5:50

It is the output as shown in the question.

– akrun
Nov 28 '18 at 8:16
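
The expected output in the question shows the numbers without hyphens and a column named State, which may be the difference Chandan means. A minimal sketch under that assumption, reusing the named-vector lookup from this answer; `state_by_code` and `result` are names made up here, and the sample is shortened to three numbers:

```r
tels <- c("510-548-2238", "707-231-2440", "650-752-1300")  # shortened sample

# Lookup table: names are the area codes, values are the states
state_by_code <- c("510" = "KANSAS", "707" = "NEW YORK", "650" = "TEXAS")

result <- data.frame(
  PhoneNumbers = gsub("-", "", tels, fixed = TRUE),   # drop the hyphens
  State = unname(state_by_code[substr(tels, 1, 3)]),  # index the lookup by name
  stringsAsFactors = FALSE
)
result
```

Here PhoneNumbers stays a character column; a prefix that is not in the lookup table would give NA in State.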


You can match the leading digits with the patterns ^510, ^650, and ^707. To add the new column easily, I used the dplyr package.

library(tidyverse) # has dplyr and stringr
# data set -------------------------------
# tibble() replaces the deprecated data_frame()
(dirt <- tibble(PhoneNumbers = c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482", "510-853-5695", "510-882-9898", "650-555-6311", "707-885-6351", "650-231-1234", "650-096-0023", "707-691-6763")))
#> # A tibble: 11 x 1
#>    PhoneNumbers
#>    <chr>
#>  1 510-548-2238
#>  2 707-231-2440
#>  3 650-752-1300
#>  4 510-674-3482
#>  5 510-853-5695
#>  6 510-882-9898
#>  7 650-555-6311
#>  8 707-885-6351
#>  9 650-231-1234
#> 10 650-096-0023
#> 11 707-691-6763

You can write a function that finds each region by detecting each pattern with stringr::str_detect().

You can do it in one go with sapply(). Applying str_detect() for each of c("^510", "^650", "^707") gives an 11 x 3 logical matrix, with one row per phone number and one column per pattern. Each value is TRUE if the number matches the pattern and FALSE otherwise.

For each row, there is exactly one TRUE by construction. You can find its index and use it to subset c("KANSAS", "TEXAS", "NEW YORK").

find_region <- function(x) {
  sta <- c("^510", "^650", "^707")
  stt <- sapply(sta, function(p) {
    str_detect(x, pattern = p)
  }) %>% # logical matrix (11 x 3 here): one column per pattern, TRUE where x matches it
    apply(1, which) # column index of the TRUE in each row
  c("KANSAS", "TEXAS", "NEW YORK")[stt]
}

Using this function, you can add the new column with dplyr::mutate().

dirt %>%
  mutate(State = find_region(PhoneNumbers))
#> # A tibble: 11 x 2
#>    PhoneNumbers State
#>    <chr>        <chr>
#>  1 510-548-2238 KANSAS
#>  2 707-231-2440 NEW YORK
#>  3 650-752-1300 TEXAS
#>  4 510-674-3482 KANSAS
#>  5 510-853-5695 KANSAS
#>  6 510-882-9898 KANSAS
#>  7 650-555-6311 TEXAS
#>  8 707-885-6351 NEW YORK
#>  9 650-231-1234 TEXAS
#> 10 650-096-0023 TEXAS
#> 11 707-691-6763 NEW YORK
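
To see the intermediate logical matrix this answer describes, here is a small base R sketch. It swaps stringr::str_detect() for base startsWith(), which is a substitution made for this illustration and not what the answer uses; `codes`, `states`, `m`, and `idx` are made-up names, and the sample has four numbers.

```r
x <- c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482")
codes <- c("510", "650", "707")
states <- c("KANSAS", "TEXAS", "NEW YORK")

# One row per phone number, one column per area code
m <- sapply(codes, function(p) startsWith(x, p))
m

# Column index of the single TRUE in each row
idx <- apply(m, 1, which)
states[idx]
```

This relies on every number matching exactly one prefix. If a row had no TRUE, or more than one, apply() would return a list and the final subsetting would fail.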

edited Nov 28 '18 at 9:28

answered Nov 28 '18 at 7:32

Blended
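
None of the answers above spell out why the loop in the question gives NEW YORK on every row. Reading the code as posted: `sta` is a single value that every iteration overwrites, so after the loop it holds only the result for the last number (707-691-6763, hence NEW YORK), and data.frame() recycles that one value across all rows. A minimal sketch of a loop that keeps one result per number, assuming the intended label for 510 is "KANSAS"; the index `k` and the shortened sample are choices made here:

```r
tels <- c("510-548-2238", "707-231-2440", "650-752-1300", "707-691-6763")
z <- substr(tels, 1, 3)

sta <- character(length(z))  # one slot per phone number
for (k in seq_along(z)) {
  if (z[k] == "510") {
    sta[k] <- "KANSAS"
  } else if (z[k] == "707") {
    sta[k] <- "NEW YORK"
  } else if (z[k] == "650") {
    sta[k] <- "TEXAS"
  }
}

das <- data.frame(tels, sta)
das
```

The vectorised approaches in the answers do the same job without an explicit loop.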
