Get the number of redirects from a URL in R
I have to extract a feature, the number of redirects, from each URL in my data frame. Is there a way to find that number in R, like there is in Python?
    import requests

    r = requests.get(url)
    i = 0
    for h in r.history:
        i = i + 1
    print(i)
r curl httr
asked Nov 28 '18 at 23:11 by aastha
You can take a look at what longurl does to get to the bottom of things: github.com/hrbrmtr/longurl; also, str() works on most anything, including the equivalent of your r object in R.
– hrbrmstr
Nov 29 '18 at 0:09
2 Answers
The return value from httr::GET is completely undocumented, but the headers etc. from redirects seem to appear in the $all_headers element:
    > url = "http://github.com"
    > g = httr::GET(url)
    > length(g$all_headers)
    [1] 2
That's because http redirects to https. Each entry is one response in the chain, including the final one, so the number of redirects here is length(g$all_headers) - 1. If you go straight to https you don't see a redirect:
    > url = "https://github.com"
    > g = httr::GET(url)
    > length(g$all_headers)
    [1] 1
answered Nov 28 '18 at 23:53 by Spacedman
For some interesting definition of "completely": ?httr::response
– hrbrmstr
Nov 29 '18 at 0:38
It's undocumented in ?GET and in every one of the "See Also" entries. The examples provide no further elucidation or any clue to go look at ?httr::response. If I'd had any clue that it even existed, I'd have filed a bug report by now: "See Also: response" or "Return Value: a response object".
– Spacedman
Nov 29 '18 at 8:32
That's hardly "completely" undocumented.
– hrbrmstr
Nov 29 '18 at 13:24
Documentation that can't be easily found is not useful documentation. Anyway, issue submitted: github.com/r-lib/httr/issues/551
– Spacedman
Nov 29 '18 at 14:07
The return value of httr::GET is an httr::response object, which has its core documentation at ?httr::response. You can examine the whole object with str() to see the parts that aren't salient to most R users. It's been documented, like, forever. I don't know why folks might think it has no docs. Perhaps heads are above the clouds… perhaps in orbit or space or something.
Since what you want is a count of redirects, you might actually care about the redirects themselves vs. a naive count of all the response headers, e.g.:
    res <- httr::GET("http://1.usa.gov/1J6GNoW")
    # integer-divide by 100 so that every 3xx status (301, 302, 307, ...) is counted
    sum(sapply(res$all_headers, `[[`, "status") %/% 100 == 3)
That's 3 (and may not be exactly what you want either). length(res$all_headers) is 4, and I doubt you should be including 4xx responses in the redirects, but you could be clearer in your question about whether you want just the number of 3xx's or the total in the HTTP chain.
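To turn that idea into a per-URL feature for a data frame, a minimal sketch might look like the following. It assumes the httr package is installed. The names count_redirects, fetch, df and url are hypothetical, introduced only for illustration. The fetch argument exists only so the function can be tried with a stub instead of a live request.

```r
# Count the 3xx responses in the chain for one URL.
# `fetch` defaults to httr::GET; it is a hypothetical hook for testing,
# not something httr itself provides.
count_redirects <- function(url, fetch = httr::GET) {
  res <- tryCatch(fetch(url), error = function(e) NULL)
  if (is.null(res)) return(NA_integer_)  # request failed (bad host, timeout, ...)
  statuses <- vapply(res$all_headers,
                     function(h) as.integer(h$status),
                     integer(1))
  sum(statuses %/% 100L == 3L)           # keep only 3xx statuses
}

# Assumed usage on a data frame `df` with a character column `url`:
# df$n_redirects <- vapply(df$url, count_redirects, integer(1), USE.NAMES = FALSE)
```

Returning NA for failed requests keeps the result the same length as the input column, so it can be assigned straight back to the data frame.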
You might also want to consider:
    cat(rawToChar(curl::curl_fetch_memory("http://1.usa.gov/1J6GNoW")$headers))
and count the actual redirects from that output (depending on what the actual "mission" is).
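Counting from the raw header text could be sketched as below, using only base R. The helper name status_codes and the sample header text are made up for illustration, and real output from curl will contain more header lines than this.

```r
# Pull the status code out of every "HTTP/x 3xx ..." status line in a raw
# header dump (one status line per response in the redirect chain).
status_codes <- function(header_text) {
  lines <- strsplit(header_text, "\r?\n")[[1]]
  status_lines <- grep("^HTTP/[0-9.]+ [0-9]{3}", lines, value = TRUE)
  as.integer(sub("^HTTP/[0-9.]+ ([0-9]{3}).*$", "\\1", status_lines))
}

# Made-up header text standing in for rawToChar(...$headers).
example <- paste(
  "HTTP/1.1 301 Moved Permanently", "Location: https://example.com/", "",
  "HTTP/1.1 200 OK", "Content-Type: text/html", "",
  sep = "\r\n"
)

codes <- status_codes(example)
sum(codes %/% 100L == 3L)  # 1 redirect in this made-up chain
```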
answered Nov 29 '18 at 0:17 by hrbrmstr (edited Nov 29 '18 at 0:35)
3xx codes being success status codes, right? And 4xx the broken ones?
– aastha
Nov 29 '18 at 0:26
3xx include "redirect" and 4xx are "yeah, not so much": w3.org/Protocols/rfc2616/rfc2616-sec10.html
– hrbrmstr
Nov 29 '18 at 0:36
Great solution. Cheers.
– aastha
Nov 29 '18 at 0:43
Hopefully you're counting the right things.
– hrbrmstr
Nov 29 '18 at 1:03