Why am I getting this New York output alone?
Consider the following vector ‘tels’ which contains telephone numbers from “KANSAS”, “TEXAS” and “NEW YORK” regions.
tels <- c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482", "510-853-5695", "510-882-9898", "650-555-6311", "707-885-6351", "650-231-1234", "650-096-0023", "707-691-6763")
If the number starts with 510, the phone number is from “KANSAS”; if it starts with 707, it is from “NEW YORK”; and if it starts with 650, it is from “TEXAS”.
Use R concepts to obtain the following data frame as output.
Expected Output:
PhoneNumbers State
1 5105482238 KANSAS
2 7072312440 NEW YORK
3 6507521300 TEXAS
4 5106743482 KANSAS
5 5108535695 KANSAS
6 5108829898 KANSAS
7 6505556311 TEXAS
8 7078856351 NEW YORK
9 6502311234 TEXAS
10 6500960023 TEXAS
11 7076916763 NEW YORK
This is my code:
z<-substr(tels,1,3)
dirt<-data.frame(tels,z)
dirt
for(i in z){
if(i==510){
sta<-"ddfdd"
}if(i==707){
sta<-"NEW YORK"
}
if((i==650)){
sta<-"TEXAS"
}
}
das<-data.frame(tels,sta)
das
But I'm getting this output:
tels sta
1 510-548-2238 NEW YORK
2 707-231-2440 NEW YORK
3 650-752-1300 NEW YORK
4 510-674-3482 NEW YORK
5 510-853-5695 NEW YORK
6 510-882-9898 NEW YORK
7 650-555-6311 NEW YORK
8 707-885-6351 NEW YORK
9 650-231-1234 NEW YORK
10 650-096-0023 NEW YORK
11 707-691-6763 NEW YORK
r
asked Nov 28 '18 at 5:44 by Chandan
3 Answers
You can use factor(), with the levels being the first 3 digits and the labels being the state names:
data.frame(tels,
           state = factor(substr(tels, 1, 3),
                          levels = c('510', '650', '707'),
                          labels = c('KANSAS', 'TEXAS', 'NEW YORK')))
tels state
1 510-548-2238 KANSAS
2 707-231-2440 NEW YORK
3 650-752-1300 TEXAS
4 510-674-3482 KANSAS
5 510-853-5695 KANSAS
6 510-882-9898 KANSAS
7 650-555-6311 TEXAS
8 707-885-6351 NEW YORK
9 650-231-1234 TEXAS
10 650-096-0023 TEXAS
11 707-691-6763 NEW YORK
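To make the levels/labels pairing concrete, here is a small sketch on a shortened vector. The vector prefixes and the prefix 415 are made up for illustration and do not appear in the question.

```r
# Hypothetical prefixes; "415" is deliberately not among the levels.
prefixes <- c("510", "707", "650", "415")

state <- factor(prefixes,
                levels = c("510", "650", "707"),
                labels = c("KANSAS", "TEXAS", "NEW YORK"))

# levels are matched against the data; labels are what gets displayed.
as.character(state)

# A prefix that is not listed in levels becomes NA.
is.na(state)
```

If a plain character column is preferred over a factor, wrapping the result in as.character() should be enough.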
We take the first three characters of 'tels' with substr() and use them to index a named vector that maps each prefix to its state name:
data.frame(PhoneNumbers = tels, state = setNames(c("KANSAS", "NEW YORK", "TEXAS"),
c('510', '707', '650'))[substr(tels, 1, 3)])
# PhoneNumbers state
#1 510-548-2238 KANSAS
#2 707-231-2440 NEW YORK
#3 650-752-1300 TEXAS
#4 510-674-3482 KANSAS
#5 510-853-5695 KANSAS
#6 510-882-9898 KANSAS
#7 650-555-6311 TEXAS
#8 707-885-6351 NEW YORK
#9 650-231-1234 TEXAS
#10 650-096-0023 TEXAS
#11 707-691-6763 NEW YORK
I want the output as mentioned in the question. – Chandan, Nov 28 '18 at 5:50
It is the output as shown in the question. – akrun, Nov 28 '18 at 8:16
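The expected output in the question shows the numbers without hyphens, which may be what the comment above refers to. A sketch, assuming that is the only difference wanted: strip the hyphens with gsub() before building the data frame. The names lookup and res are introduced here for illustration.

```r
tels <- c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482",
          "510-853-5695", "510-882-9898", "650-555-6311", "707-885-6351",
          "650-231-1234", "650-096-0023", "707-691-6763")

# Named vector mapping each prefix to its state (same mapping as above).
lookup <- c("510" = "KANSAS", "707" = "NEW YORK", "650" = "TEXAS")

res <- data.frame(
  PhoneNumbers = gsub("-", "", tels, fixed = TRUE),  # drop the hyphens
  State = unname(lookup[substr(tels, 1, 3)]),        # look up by prefix
  stringsAsFactors = FALSE
)
res
```

The numbers are kept as character strings here rather than converted to numeric, so they print exactly as typed.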
You can match the leading digits with the patterns ^510, ^650, and ^707. To add the new column easily, I have used the dplyr package.
library(tidyverse) # has dplyr and stringr
# data set -------------------------------
(dirt <- tibble(PhoneNumbers = c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482", "510-853-5695", "510-882-9898", "650-555-6311", "707-885-6351", "650-231-1234", "650-096-0023", "707-691-6763")))
#> # A tibble: 11 x 1
#> PhoneNumbers
#> <chr>
#> 1 510-548-2238
#> 2 707-231-2440
#> 3 650-752-1300
#> 4 510-674-3482
#> 5 510-853-5695
#> 6 510-882-9898
#> 7 650-555-6311
#> 8 707-885-6351
#> 9 650-231-1234
#> 10 650-096-0023
#> 11 707-691-6763
You can write a function that finds each region by detecting each pattern with stringr::str_detect().
You can test all the patterns at once using sapply(). If you apply str_detect() to each of c("^510", "^650", "^707"), you get an 11 x 3 logical matrix with one row per phone number and one column per pattern. Each value says whether the number matches the pattern (TRUE or FALSE).
Each row has exactly one TRUE by construction. You can find its column index and use it to subset c("KANSAS", "TEXAS", "NEW YORK").
find_region <- function(x) {
  sta <- c("^510", "^650", "^707")
  stt <- sapply(sta, function(p) {
    str_detect(x, pattern = p)
  }) %>% # logical matrix: one row per number, one column per pattern (510, 650, 707); TRUE if x starts with the pattern
    apply(1, which) # column index of the TRUE in each row
  c("KANSAS", "TEXAS", "NEW YORK")[stt]
}
Using this function, you can add the new column with dplyr::mutate():
dirt %>%
  mutate(State = find_region(PhoneNumbers))
#> # A tibble: 11 x 2
#> PhoneNumbers State
#> <chr> <chr>
#> 1 510-548-2238 KANSAS
#> 2 707-231-2440 NEW YORK
#> 3 650-752-1300 TEXAS
#> 4 510-674-3482 KANSAS
#> 5 510-853-5695 KANSAS
#> 6 510-882-9898 KANSAS
#> 7 650-555-6311 TEXAS
#> 8 707-885-6351 NEW YORK
#> 9 650-231-1234 TEXAS
#> 10 650-096-0023 TEXAS
#> 11 707-691-6763 NEW YORK
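To see the intermediate matrix that find_region() relies on, here is a sketch on three numbers. It uses base R grepl() in place of stringr::str_detect() so that it runs without extra packages; that swap, and the names x, patterns, hits and idx, are choices made for this illustration.

```r
x <- c("510-548-2238", "707-231-2440", "650-752-1300")
patterns <- c("^510", "^650", "^707")

# One row per phone number, one column per pattern.
hits <- sapply(patterns, function(p) grepl(p, x))
dim(hits)

# Column index of the single TRUE in each row.
idx <- apply(hits, 1, which)
c("KANSAS", "TEXAS", "NEW YORK")[idx]
```

If a number matched none of the patterns, which() would return an empty result for that row and apply() would return a list instead of a vector, so the final subsetting step would need extra handling.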
Answer using factor(): answered Nov 28 '18 at 5:49 by Onyambu
Answer using a named vector: answered Nov 28 '18 at 5:47 by akrun, edited Nov 28 '18 at 5:56
Answer using dplyr and stringr: answered Nov 28 '18 at 7:32 by Blended, edited Nov 28 '18 at 9:28
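As for why the loop in the question returns only NEW YORK, a likely reading is this: inside the loop, sta is a single value that is overwritten on every pass, so after the loop it holds the state for the last prefix (707), and data.frame() recycles that one value across all rows. A sketch of one way to keep the loop, assuming the intent was one state per number: preallocate sta and fill it by position. The use of seq_along() and else if here is a choice made for this sketch, not the asker's code.

```r
tels <- c("510-548-2238", "707-231-2440", "650-752-1300", "510-674-3482",
          "510-853-5695", "510-882-9898", "650-555-6311", "707-885-6351",
          "650-231-1234", "650-096-0023", "707-691-6763")
z <- substr(tels, 1, 3)

# Preallocate one slot per phone number instead of a single value.
sta <- character(length(z))
for (i in seq_along(z)) {
  if (z[i] == "510") {
    sta[i] <- "KANSAS"
  } else if (z[i] == "707") {
    sta[i] <- "NEW YORK"
  } else if (z[i] == "650") {
    sta[i] <- "TEXAS"
  }
}

das <- data.frame(tels, sta)
das
```

The vectorised answers above avoid the loop altogether, which is usually the more idiomatic route in R.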