Programmatically identified cookie is not getting accepted

I am working on a web scraper in Python 2 that reads some of the contents of a website. To access the contents, I need to pass a cookie. Right now, I find the cookie by opening the website in Chrome and reading it from the site information. I hardcode this cookie into my scraper to get the contents from the website. However, the cookie gets invalidated after a few hours, and then no information can be extracted from the website. To address this, I am trying to refresh the cookie within the scraper itself whenever a new cookie is needed.

I have tried the following two approaches.

First approach

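import requests
import browsercookie

# base_url is assumed to be defined elsewhere as the site's URL.
try:
    # Load the cookies stored by the local Chrome profile.
    cj = browsercookie.chrome()
    session = requests.Session()
    r = session.get(base_url, cookies=cj)
    new_cookie = str(session.cookies.get_dict()['JSESSIONID'])
except Exception as e:
    # Report the failure instead of silently ignoring it.
    print(e)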

Second approach

# Start from a fresh session and let the server issue a new cookie.
with requests.Session() as s:
    headers = {
        'User-Agent': 'Mozilla/5.0 (X11; Fedora; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 Safari/537.36',
        'X-Requested-With': 'XMLHttpRequest',
        'Connection': 'keep-alive',
    }
    r = s.get(base_url, headers=headers)
    new_cookie = s.cookies.get_dict()['JSESSIONID']

Both of these snippets return cookies that look perfectly fine. The problem is that these programmatically obtained cookies do not let the scraper extract any results. When I send the hardcoded cookie found in the browser with a request from the scraper, the scraper gets the DOM of the website. But when I send the programmatically obtained cookie instead, the scraper cannot access the DOM of the website.

The cookie information in the browser says that the cookie gets invalidated "When the browsing session ends".

This is very puzzling. What am I missing in this whole process?
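
One way to narrow this down is sketched below, assuming both cookie sets are available as plain dicts. The function name `diff_cookies` and the sample values are hypothetical. The idea is to compare every cookie the browser sends against what the script obtained, since the browser may be sending more than `JSESSIONID` alone.

```python
def diff_cookies(browser_cookies, script_cookies):
    """Return cookie names missing from, or differing in, the script's set."""
    missing = sorted(set(browser_cookies) - set(script_cookies))
    changed = sorted(
        name for name in browser_cookies
        if name in script_cookies
        and browser_cookies[name] != script_cookies[name]
    )
    return {'missing': missing, 'changed': changed}


# Hypothetical sample values, for illustration only.
browser = {'JSESSIONID': 'AAA111', 'csrftoken': 'xyz'}
script = {'JSESSIONID': 'BBB222'}

result = diff_cookies(browser, script)
print(result)
```

A differing `JSESSIONID` value is expected, because each new session gets its own ID. Cookies listed under `missing` are the more interesting lead, as is the possibility that the fresh session ID is not yet tied to whatever state (for example a login) the browser session carries.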

asked Nov 22 at 7:42 by harshvardhan (edited Nov 22 at 7:52)

If you need Chrome to get a good session cookie, then you should use Selenium or headless Chrome.
– pguardiario, Nov 22 at 9:33
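
The suggestion in this comment could be sketched roughly as follows, assuming Selenium and a matching ChromeDriver are installed. The function names `selenium_cookies_to_dict` and `fetch_session_cookies` are made up for illustration, and whether the site accepts a session created this way has not been verified.

```python
def selenium_cookies_to_dict(cookies):
    """Convert the list of cookie dicts from driver.get_cookies()
    into a simple name -> value mapping usable with requests."""
    return {c['name']: c['value'] for c in cookies}


def fetch_session_cookies(url):
    """Open the page in headless Chrome and return its cookies.

    Assumption: loading the page in a real browser is enough for the
    site to issue a usable session cookie. A login step, if the site
    needs one, would have to be added before reading the cookies.
    """
    # Imported lazily so the helper above works without Selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return selenium_cookies_to_dict(driver.get_cookies())
    finally:
        # Always close the browser, even if the page load fails.
        driver.quit()
```

The resulting dict can then be passed to the scraper's requests, for example `requests.get(base_url, cookies=fetch_session_cookies(base_url))`, and refreshed whenever the site starts rejecting the old cookie.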

python cookies web-scraping request
