get the number of redirects from a url in R












I have to extract a feature, the number of redirects, for each URL in my data frame. Is there a way to find that number in R, like there is in Python:



import requests

r = requests.get(url)
i = 0
for h in r.history:  # r.history holds one response per redirect followed
    i = i + 1
print(i)






r curl httr
















asked Nov 28 '18 at 23:11









aastha













  • You can take a look at what longurl does to get to the bottom of things: github.com/hrbrmtr/longurl; also, str() works on almost anything, including the R equivalent of your r object.

    – hrbrmstr
    Nov 29 '18 at 0:09



































2 Answers
The return value of httr::GET is completely undocumented, but the headers etc. from each redirect seem to appear in the $all_headers element:



> url = "http://github.com"
> g = httr::GET(url)
> length(g$all_headers)
[1] 2


The length is 2 because http redirects to https. If you go straight to https, you don't see a redirect:



> url = "https://github.com"
> g = httr::GET(url)
> length(g$all_headers)
[1] 1
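To make the counting explicit: as far as I can tell, each element of all_headers is a list with status, version and headers entries, one per response in the chain, so the number of redirects followed is the length minus one. Here is a minimal sketch on a hand-built list. mock_all_headers and its values are made up for illustration; they are not captured output from GitHub.

```r
# Hand-built stand-in for g$all_headers after one http -> https redirect.
# The values are illustrative assumptions, not captured output.
mock_all_headers <- list(
  list(status = 301L, version = "HTTP/1.1",
       headers = list(location = "https://github.com/")),
  list(status = 200L, version = "HTTP/1.1",
       headers = list(`content-type` = "text/html"))
)

# One entry per response, so redirects followed = responses - 1.
n_responses <- length(mock_all_headers)
n_redirects_followed <- n_responses - 1L

# The status code of each response in the chain.
statuses <- vapply(mock_all_headers, function(h) h$status, integer(1))
```

The "minus one" shortcut assumes the last entry is a final, non-redirect response. If redirect following stops early, the last entry may itself be a 3xx, and inspecting the status codes is the safer route.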
















answered Nov 28 '18 at 23:53









Spacedman













  • For some interesting definition of "completely": ?httr::response

    – hrbrmstr
    Nov 29 '18 at 0:38











  • It's undocumented in ?GET and in none of its "See Also" entries. The examples provide no further elucidation or any clue to go look at ?httr::response. If I'd had any clue that it even existed, I'd have filed a bug report by now: "See Also: response" or "Return Value: a response object"

    – Spacedman
    Nov 29 '18 at 8:32













  • That's hardly "completely" undocumented.

    – hrbrmstr
    Nov 29 '18 at 13:24











  • Documentation that can't be easily found is not useful documentation. Anyway, issue submitted: github.com/r-lib/httr/issues/551

    – Spacedman
    Nov 29 '18 at 14:07



















The return value of httr::GET is an httr::response object, whose core documentation is at ?httr::response. You can examine the whole object with str() to see the parts that aren't salient to most R users. It's been documented, like, forever. I don't know why folks might think it has no docs. Perhaps heads are above the clouds…perhaps in orbit or space or something.



Since what you want is a count of redirects, you probably care about the actual redirect responses rather than a naive count of all the responses in the chain, e.g.:



res <- httr::GET("http://1.usa.gov/1J6GNoW")
# count the responses in the chain whose status code is 3xx
sum(sapply(res$all_headers, `[[`, "status") %/% 100 == 3)


That's 3 (and may not be exactly what you want either).



length(res$all_headers)


is 4, and I doubt you should be including 4xx responses in the redirects, but your question could be clearer about whether you want just the number of 3xx responses or the total number of responses in the HTTP chain.
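If it is the 3xx count you are after, something along these lines might work for a whole column of URLs. This is only a sketch: count_3xx and n_redirects are names I made up, the 10 second timeout is arbitrary, and the url column is an assumption about your data frame.

```r
# Count the 3xx entries in a vector of HTTP status codes.
count_3xx <- function(statuses) {
  sum(statuses >= 300L & statuses < 400L)
}

# Hypothetical helper: number of 3xx responses seen while fetching one URL,
# or NA if the request fails (timeout, DNS error, ...).
n_redirects <- function(url) {
  tryCatch({
    res <- httr::GET(url, httr::timeout(10))
    statuses <- vapply(res$all_headers,
                       function(h) as.integer(h$status), integer(1))
    count_3xx(statuses)
  }, error = function(e) NA_integer_)
}

# Usage on a data frame with an assumed `url` column:
# df$n_redirects <- vapply(as.character(df$url), n_redirects,
#                          integer(1), USE.NAMES = FALSE)
```

Returning NA on failure keeps one bad URL from aborting the whole column, which seems preferable when the result is meant to be a feature.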



You might also want to consider:



cat(rawToChar(curl::curl_fetch_memory("http://1.usa.gov/1J6GNoW")$headers))


and count the actual redirects from that output (depending on what the actual "mission" is).
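One way to count from that raw header text is to pull out the status lines and keep the 3xx ones. A rough sketch in base R follows. count_redirects_in_headers is a made-up name, and it assumes every response in the chain starts with a line like "HTTP/1.1 301 ..." or "HTTP/2 301".

```r
# Count 3xx status lines in the concatenated header text of a redirect chain.
count_redirects_in_headers <- function(header_text) {
  # Split on CRLF or bare LF.
  lines <- strsplit(header_text, "\r?\n")[[1]]
  # Keep only the status lines, e.g. "HTTP/1.1 301 Moved Permanently".
  status_lines <- grep("^HTTP/[0-9.]+ [0-9]{3}", lines, value = TRUE)
  # Extract the three-digit status code from each status line.
  codes <- as.integer(sub("^HTTP/[0-9.]+ ([0-9]{3}).*$", "\\1", status_lines))
  sum(codes >= 300L & codes < 400L)
}

# Usage with the call above:
# raw_headers <- curl::curl_fetch_memory("http://1.usa.gov/1J6GNoW")$headers
# count_redirects_in_headers(rawToChar(raw_headers))
```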















edited Nov 29 '18 at 0:35

























answered Nov 29 '18 at 0:17









hrbrmstr













  • So 3xx codes are the success status codes, right? And 4xx are the broken ones?

    – aastha
    Nov 29 '18 at 0:26











  • 3xx include "redirect" and 4xx are "yeah, not so much" : w3.org/Protocols/rfc2616/rfc2616-sec10.html

    – hrbrmstr
    Nov 29 '18 at 0:36











  • great solution. cheers

    – aastha
    Nov 29 '18 at 0:43











  • Hopefully you're counting the right things.

    – hrbrmstr
    Nov 29 '18 at 1:03


















