How to get the correct XPATH CSS on airbnb website
Hello everyone. I'm learning to use XPath for scraping the Airbnb website, using PHP PhantomJS in Laravel 5.1.
Based on this Airbnb page: https://www.airbnb.com/rooms/1064946
I want to get the hotel name and the price, but I'm confused about how to set the correct CSS class in my XPath code. Here is my code:
    $client = Client::getInstance();
    $request = $client->getMessageFactory()->createRequest('https://www.airbnb.com/rooms/1064946', 'GET');
    $response = $client->getMessageFactory()->createResponse();
    $client->send($request, $response);
    $htmlstr = $response->getContent();
    $dom = new DOMDocument;
    @$dom->loadHTML($htmlstr);
    $xpath = new DOMXPath($dom);
    $entries = [];
    foreach ($xpath->query('//div[@class="with-new-header has-epcot-header"]') as $node) {
        $entries = [
            'hotel_name' => $xpath->evaluate('string(//div[@class="_12ei9u44"])', $node),
            'price' => $xpath->evaluate('string(//div[@class="_doc79r"])', $node)
        ];
    }
    var_dump($entries);
But the result is:
    array(0) { }
What is wrong with this code? Please help me. Thank you.
php laravel xpath phantomjs
asked Nov 27 '18 at 6:34 by tjandra, edited Nov 27 '18 at 6:47
What is var_dump($page)?
– Rahul Gurung
Nov 27 '18 at 6:38
It's for debugging, like print() or echo.
– tjandra
Nov 27 '18 at 6:40
I'm asking for the output of var_dump($page). What I meant is: where does the $page variable in @$dom->loadHTML($page) come from?
– Rahul Gurung
Nov 27 '18 at 6:41
Ahh, I get your point. I've already fixed that by changing it to @$dom->loadHTML($htmlstr);, but it still doesn't work.
– tjandra
Nov 27 '18 at 6:47
Remove the backslash next to DOMXPath($dom).
– Rahul Gurung
Nov 27 '18 at 6:57
2 Answers
I think you need to check the various tag elements and classes you're using; the ones you have all seem to be looking for things that I can't find in the page. I have managed to extract some of the data, but I'm not using Laravel, which may affect the result:
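    // Match the container div, then read the name and price from the spans.
    // Note: a path starting with // searches the whole document, not just $node.
    foreach ($xpath->query('//div[@class="_1kzvqab3"]') as $node) {
        $entries = [
            'hotel_name' => $xpath->evaluate('string(//span[@class="_12ei9u44"])', $node),
            'price' => $xpath->evaluate('string(//span[@class="_doc79r"])', $node)
        ];
    }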
One thing I've found useful is to write the HTML to a temporary file so I can inspect its contents, something like:
file_put_contents("out.html", $htmlstr);
I can then use this to check what the code is actually running against and see what tags and classes are being used.
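The same check can also be done in code rather than by eye. The following is only a rough sketch: the markup in `$htmlstr` is a hypothetical stand-in for the saved page (the real Airbnb HTML will differ), and `tagsForClass()` is a helper name invented here for illustration. It reports which tag actually carries a given class value, which shows whether an XPath step should start with `div`, `span`, or something else.

```php
<?php
// Hypothetical stand-in for the rendered page; the real markup will differ.
$htmlstr = '<html><body class="with-new-header has-epcot-header">'
    . '<div class="_1kzvqab3">'
    . '<span class="_12ei9u44">Sample listing</span>'
    . '<span class="_doc79r">$100</span>'
    . '</div></body></html>';

// Illustrative helper (not part of any library): returns the tag names of
// all elements whose class attribute is exactly $class.
// Assumes $class contains no double quotes.
function tagsForClass(string $html, string $class): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $tags = [];
    // * matches any element, so the result tells us the real tag name.
    foreach ($xpath->query('//*[@class="' . $class . '"]') as $node) {
        $tags[] = $node->nodeName;
    }
    return $tags;
}

$bodyTags  = tagsForClass($htmlstr, 'with-new-header has-epcot-header');
$titleTags = tagsForClass($htmlstr, '_12ei9u44');
$priceTags = tagsForClass($htmlstr, '_doc79r');

var_dump($bodyTags, $titleTags, $priceTags);
```

Running this against the real `$htmlstr` from the PhantomJS response, instead of the sample string, would show which element names to use in the queries.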
answered Nov 27 '18 at 7:29 by Nigel Ren
Many thanks! That solved it.
– tjandra
Nov 27 '18 at 9:12
You are looking for a class that doesn't belong to a div:
//div[@class="with-new-header has-epcot-header"]
It belongs to the body:
//body[@class="with-new-header has-epcot-header"]
The elements targeted by the following XPath expressions are not divs either:
//div[@class="_12ei9u44"]
//div[@class="_doc79r"]
They are spans:
//span[@class="_12ei9u44"]
//span[@class="_doc79r"]
Do you see the pattern? An XPath step doesn't always start with div; it starts with the name of the tag you are matching.
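To make the point concrete, here is a small sketch on made-up markup. The class names `listing`, `name` and `price` are hypothetical and chosen only for readability; they are not Airbnb's. It shows that a query naming the wrong tag matches nothing, and that a relative path (`.//`) keeps each lookup inside the current container node, whereas a path starting with `//` is always evaluated from the document root.

```php
<?php
// Made-up markup with two containers; class names are illustrative only.
$htmlstr = '<html><body>'
    . '<div class="listing"><span class="name">First</span><span class="price">$10</span></div>'
    . '<div class="listing"><span class="name">Second</span><span class="price">$20</span></div>'
    . '</body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$dom->loadHTML($htmlstr);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// The class sits on a span, so asking for a div with that class finds nothing.
$wrongTag = $xpath->query('//div[@class="name"]')->length;   // 0
$rightTag = $xpath->query('//span[@class="name"]')->length;  // 2

// Collect one entry per container; .// restricts the search to $node.
$entries = [];
foreach ($xpath->query('//div[@class="listing"]') as $node) {
    $entries[] = [
        'hotel_name' => $xpath->evaluate('string(.//span[@class="name"])', $node),
        'price'      => $xpath->evaluate('string(.//span[@class="price"])', $node),
    ];
}

var_dump($wrongTag, $rightTag, $entries);
```

Appending with `$entries[] = ...` keeps every match; assigning with `$entries = ...` inside the loop would keep only the last one.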
answered Nov 27 '18 at 17:36 by IamBatman, edited Nov 27 '18 at 17:42