Why does my method not build a proper XML query from user input?











up vote
0
down vote

favorite












I'm currently trying to take user input from a text box and build an XML query with it.



The query should search within the document for an element with a given attribute 'dataItemId' and return the innerText value.
The two lines of code which are commented out within the try{} statement work as expected, returning the correct value, yet when I try to use the two lines above these and type 'Xabs' into my input box I am getting:




Your resultant Query XPath: //*[dataItemId = 'Xabs']
No Items Found."




It must be something to do with how the string is parsed but I'm stumped as to what.



public void MazakButton_Click(object sender, EventArgs e)
{
string userInput = searchInput.Text;
ResultBox.Items.Clear();
string query = "No Query Found";
string searchResult = "No Items Found";

if (userInput.Length > 3)// If query paramater is long enough
{
//string Query = ""//*[dataItemId = '" + userInput + "']""; // Build Attribite Query
query = "//*[dataItemId = '" + userInput + "']"; // Build Attribite Query
XmlDocument MTData = MTFunctions.ScrapeXMLData(MazakSourceURL, false);

try
{
XmlNode target = MTData.SelectSingleNode(query);
searchResult = userInput + ": " + target.InnerText;
//XmlNode target = MTData.SelectSingleNode("//*[@dataItemId = 'Xabs']");
//searchResult = target.InnerText;
}
catch
{
Console.WriteLine("!!!!!!!!!!!!!!! - XML SEARCH FAILED - !!!!!!!!!!!!!!!");
}
}
else{searchResult = "Invalid request (Too short)";}


searchInput.Focus();
ResultBox.Items.Add("Your Resultant Query XPath: " + query);
ResultBox.Items.Add(searchResult);
searchInput.Text = string.Empty; // Clear searchInput
}









share|improve this question
























  • Two queries are different, you use "[dataItemId" in the first one and "[@dataItemId" in the second one. See the "@" before "dataItemId"
    – NthDeveloper
    Nov 22 at 10:14










  • @NthDeveloper well spotted! :')
    – GigaJoules
    Nov 22 at 10:28















up vote
0
down vote

favorite












I'm currently trying to take user input from a text box and build an XML query with it.



The query should search within the document for an element with a given attribute 'dataItemId' and return the innerText value.
The two lines of code which are commented out within the try{} statement work as expected, returning the correct value, yet when I try to use the two lines above these and type 'Xabs' into my input box I am getting:




Your resultant Query XPath: //*[dataItemId = 'Xabs']
No Items Found."




It must be something to do with how the string is parsed but I'm stumped as to what.



public void MazakButton_Click(object sender, EventArgs e)
{
string userInput = searchInput.Text;
ResultBox.Items.Clear();
string query = "No Query Found";
string searchResult = "No Items Found";

if (userInput.Length > 3)// If query paramater is long enough
{
//string Query = ""//*[dataItemId = '" + userInput + "']""; // Build Attribite Query
query = "//*[dataItemId = '" + userInput + "']"; // Build Attribite Query
XmlDocument MTData = MTFunctions.ScrapeXMLData(MazakSourceURL, false);

try
{
XmlNode target = MTData.SelectSingleNode(query);
searchResult = userInput + ": " + target.InnerText;
//XmlNode target = MTData.SelectSingleNode("//*[@dataItemId = 'Xabs']");
//searchResult = target.InnerText;
}
catch
{
Console.WriteLine("!!!!!!!!!!!!!!! - XML SEARCH FAILED - !!!!!!!!!!!!!!!");
}
}
else{searchResult = "Invalid request (Too short)";}


searchInput.Focus();
ResultBox.Items.Add("Your Resultant Query XPath: " + query);
ResultBox.Items.Add(searchResult);
searchInput.Text = string.Empty; // Clear searchInput
}









share|improve this question
























  • Two queries are different, you use "[dataItemId" in the first one and "[@dataItemId" in the second one. See the "@" before "dataItemId"
    – NthDeveloper
    Nov 22 at 10:14










  • @NthDeveloper well spotted! :')
    – GigaJoules
    Nov 22 at 10:28













up vote
0
down vote

favorite









up vote
0
down vote

favorite











I'm currently trying to take user input from a text box and build an XML query with it.



The query should search within the document for an element with a given attribute 'dataItemId' and return the innerText value.
The two lines of code which are commented out within the try{} statement work as expected, returning the correct value, yet when I try to use the two lines above these and type 'Xabs' into my input box I am getting:




Your resultant Query XPath: //*[dataItemId = 'Xabs']
No Items Found."




It must be something to do with how the string is parsed but I'm stumped as to what.



public void MazakButton_Click(object sender, EventArgs e)
{
string userInput = searchInput.Text;
ResultBox.Items.Clear();
string query = "No Query Found";
string searchResult = "No Items Found";

if (userInput.Length > 3)// If query paramater is long enough
{
//string Query = ""//*[dataItemId = '" + userInput + "']""; // Build Attribite Query
query = "//*[dataItemId = '" + userInput + "']"; // Build Attribite Query
XmlDocument MTData = MTFunctions.ScrapeXMLData(MazakSourceURL, false);

try
{
XmlNode target = MTData.SelectSingleNode(query);
searchResult = userInput + ": " + target.InnerText;
//XmlNode target = MTData.SelectSingleNode("//*[@dataItemId = 'Xabs']");
//searchResult = target.InnerText;
}
catch
{
Console.WriteLine("!!!!!!!!!!!!!!! - XML SEARCH FAILED - !!!!!!!!!!!!!!!");
}
}
else{searchResult = "Invalid request (Too short)";}


searchInput.Focus();
ResultBox.Items.Add("Your Resultant Query XPath: " + query);
ResultBox.Items.Add(searchResult);
searchInput.Text = string.Empty; // Clear searchInput
}









share|improve this question















I'm currently trying to take user input from a text box and build an XML query with it.



The query should search within the document for an element with a given attribute 'dataItemId' and return the innerText value.
The two lines of code which are commented out within the try{} statement work as expected, returning the correct value, yet when I try to use the two lines above these and type 'Xabs' into my input box I am getting:




Your resultant Query XPath: //*[dataItemId = 'Xabs']
No Items Found."




It must be something to do with how the string is parsed but I'm stumped as to what.



public void MazakButton_Click(object sender, EventArgs e)
{
string userInput = searchInput.Text;
ResultBox.Items.Clear();
string query = "No Query Found";
string searchResult = "No Items Found";

if (userInput.Length > 3)// If query paramater is long enough
{
//string Query = ""//*[dataItemId = '" + userInput + "']""; // Build Attribite Query
query = "//*[dataItemId = '" + userInput + "']"; // Build Attribite Query
XmlDocument MTData = MTFunctions.ScrapeXMLData(MazakSourceURL, false);

try
{
XmlNode target = MTData.SelectSingleNode(query);
searchResult = userInput + ": " + target.InnerText;
//XmlNode target = MTData.SelectSingleNode("//*[@dataItemId = 'Xabs']");
//searchResult = target.InnerText;
}
catch
{
Console.WriteLine("!!!!!!!!!!!!!!! - XML SEARCH FAILED - !!!!!!!!!!!!!!!");
}
}
else{searchResult = "Invalid request (Too short)";}


searchInput.Focus();
ResultBox.Items.Add("Your Resultant Query XPath: " + query);
ResultBox.Items.Add(searchResult);
searchInput.Text = string.Empty; // Clear searchInput
}






c# xml xpath web-scraping






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 22 at 9:49

























asked Nov 22 at 9:35









GigaJoules

146




146












  • Two queries are different, you use "[dataItemId" in the first one and "[@dataItemId" in the second one. See the "@" before "dataItemId"
    – NthDeveloper
    Nov 22 at 10:14










  • @NthDeveloper well spotted! :')
    – GigaJoules
    Nov 22 at 10:28


















  • Two queries are different, you use "[dataItemId" in the first one and "[@dataItemId" in the second one. See the "@" before "dataItemId"
    – NthDeveloper
    Nov 22 at 10:14










  • @NthDeveloper well spotted! :')
    – GigaJoules
    Nov 22 at 10:28
















Two queries are different, you use "[dataItemId" in the first one and "[@dataItemId" in the second one. See the "@" before "dataItemId"
– NthDeveloper
Nov 22 at 10:14




Two queries are different, you use "[dataItemId" in the first one and "[@dataItemId" in the second one. See the "@" before "dataItemId"
– NthDeveloper
Nov 22 at 10:14












@NthDeveloper well spotted! :')
– GigaJoules
Nov 22 at 10:28




@NthDeveloper well spotted! :')
– GigaJoules
Nov 22 at 10:28












1 Answer
1






active

oldest

votes

















up vote
0
down vote













First point: you should never build a query by string concatenation incorporating user input. Google "SQL injection attacks" to find out why (most of the literature is in SQL terms, but XPath has exactly the same vulnerabilities).



If you use an expression with a variable, like //*[dataItemId = $userInput] then (a) you are protected against injection attacks, (b) you don't need to worry about escaping special characters in the input (like apostrophes), and (c) it's much more efficient because you only need to compile the XPath expression once.



The only difficulty is that you need to work out how to supply a parameter to the query (that is, a value for $userInput) before executing it. That depends on the API to your XPath processor and I'm not familiar with the XPath processor that you are using, so you'll need to research this yourself.






share|improve this answer





















  • Yeah I'm aware of the security risks, however, this is just a little application I am writing to familiarise myself with c# and won't ever be in the hands of anyone else or web facing so I'm not too worried about that here. Good to keep in mind though.
    – GigaJoules
    Nov 26 at 9:56










  • Even without the security risks, it's easier to write the code using a variable rather than using string concatenation.
    – Michael Kay
    Nov 26 at 10:06










  • In this case there is controlled language for the terms that are being searched, so perhaps something like a closest match lookup table which returns the actual text to be passed to the query construction would work? This way I could ensure that only a list of pre-approved strings is used to build queries since there is such a limited selection of searches the user needs to do?
    – GigaJoules
    Nov 27 at 11:46










  • In that case the benefits of compiling a parameterized query once and executing it repeatedly are even greater. Remember that compiling the query can cost 100 times as much as executing it. I'm not sure why you are so resistant to that approach.
    – Michael Kay
    Nov 27 at 11:58










  • I'm a little confused as to what you're actually suggesting? I was thinking of giving the user a text box to search for dataItemIDs and then providing a list of accepted IDs, and having the user select one of these from something like a drop down menu, and then doing the xpath query part with a hard coded string associated with that option which surely would isolate the user input from the final query?
    – GigaJoules
    Nov 27 at 15:37











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53427813%2fwhy-does-my-method-not-build-a-proper-xml-query-from-user-input%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
0
down vote













First point: you should never build a query by string concatenation incorporating user input. Google "SQL injection attacks" to find out why (most of the literature is in SQL terms, but XPath has exactly the same vulnerabilities).



If you use an expression with a variable, like //*[dataItemId = $userInput] then (a) you are protected against injection attacks, (b) you don't need to worry about escaping special characters in the input (like apostrophes), and (c) it's much more efficient because you only need to compile the XPath expression once.



The only difficulty is that you need to work out how to supply a parameter to the query (that is, a value for $userInput) before executing it. That depends on the API to your XPath processor and I'm not familiar with the XPath processor that you are using, so you'll need to research this yourself.






share|improve this answer





















  • Yeah I'm aware of the security risks, however, this is just a little application I am writing to familiarise myself with c# and won't ever be in the hands of anyone else or web facing so I'm not too worried about that here. Good to keep in mind though.
    – GigaJoules
    Nov 26 at 9:56










  • Even without the security risks, it's easier to write the code using a variable rather than using string concatenation.
    – Michael Kay
    Nov 26 at 10:06










  • In this case there is controlled language for the terms that are being searched, so perhaps something like a closest match lookup table which returns the actual text to be passed to the query construction would work? This way I could ensure that only a list of pre-approved strings is used to build queries since there is such a limited selection of searches the user needs to do?
    – GigaJoules
    Nov 27 at 11:46










  • In that case the benefits of compiling a parameterized query once and executing it repeatedly are even greater. Remember that compiling the query can cost 100 times as much as executing it. I'm not sure why you are so resistant to that approach.
    – Michael Kay
    Nov 27 at 11:58










  • I'm a little confused as to what you're actually suggesting? I was thinking of giving the user a text box to search for dataItemIDs and then providing a list of accepted IDs, and having the user select one of these from something like a drop down menu, and then doing the xpath query part with a hard coded string associated with that option which surely would isolate the user input from the final query?
    – GigaJoules
    Nov 27 at 15:37















up vote
0
down vote













First point: you should never build a query by string concatenation incorporating user input. Google "SQL injection attacks" to find out why (most of the literature is in SQL terms, but XPath has exactly the same vulnerabilities).



If you use an expression with a variable, like //*[dataItemId = $userInput] then (a) you are protected against injection attacks, (b) you don't need to worry about escaping special characters in the input (like apostrophes), and (c) it's much more efficient because you only need to compile the XPath expression once.



The only difficulty is that you need to work out how to supply a parameter to the query (that is, a value for $userInput) before executing it. That depends on the API to your XPath processor and I'm not familiar with the XPath processor that you are using, so you'll need to research this yourself.






share|improve this answer





















  • Yeah I'm aware of the security risks, however, this is just a little application I am writing to familiarise myself with c# and won't ever be in the hands of anyone else or web facing so I'm not too worried about that here. Good to keep in mind though.
    – GigaJoules
    Nov 26 at 9:56










  • Even without the security risks, it's easier to write the code using a variable rather than using string concatenation.
    – Michael Kay
    Nov 26 at 10:06










  • In this case there is controlled language for the terms that are being searched, so perhaps something like a closest match lookup table which returns the actual text to be passed to the query construction would work? This way I could ensure that only a list of pre-approved strings is used to build queries since there is such a limited selection of searches the user needs to do?
    – GigaJoules
    Nov 27 at 11:46










  • In that case the benefits of compiling a parameterized query once and executing it repeatedly are even greater. Remember that compiling the query can cost 100 times as much as executing it. I'm not sure why you are so resistant to that approach.
    – Michael Kay
    Nov 27 at 11:58










  • I'm a little confused as to what you're actually suggesting? I was thinking of giving the user a text box to search for dataItemIDs and then providing a list of accepted IDs, and having the user select one of these from something like a drop down menu, and then doing the xpath query part with a hard coded string associated with that option which surely would isolate the user input from the final query?
    – GigaJoules
    Nov 27 at 15:37













up vote
0
down vote










up vote
0
down vote









First point: you should never build a query by string concatenation incorporating user input. Google "SQL injection attacks" to find out why (most of the literature is in SQL terms, but XPath has exactly the same vulnerabilities).



If you use an expression with a variable, like //*[dataItemId = $userInput] then (a) you are protected against injection attacks, (b) you don't need to worry about escaping special characters in the input (like apostrophes), and (c) it's much more efficient because you only need to compile the XPath expression once.



The only difficulty is that you need to work out how to supply a parameter to the query (that is, a value for $userInput) before executing it. That depends on the API to your XPath processor and I'm not familiar with the XPath processor that you are using, so you'll need to research this yourself.






share|improve this answer












First point: you should never build a query by string concatenation incorporating user input. Google "SQL injection attacks" to find out why (most of the literature is in SQL terms, but XPath has exactly the same vulnerabilities).



If you use an expression with a variable, like //*[dataItemId = $userInput] then (a) you are protected against injection attacks, (b) you don't need to worry about escaping special characters in the input (like apostrophes), and (c) it's much more efficient because you only need to compile the XPath expression once.



The only difficulty is that you need to work out how to supply a parameter to the query (that is, a value for $userInput) before executing it. That depends on the API to your XPath processor and I'm not familiar with the XPath processor that you are using, so you'll need to research this yourself.







share|improve this answer












share|improve this answer



share|improve this answer










answered Nov 22 at 15:47









Michael Kay

108k659114




108k659114












  • Yeah I'm aware of the security risks, however, this is just a little application I am writing to familiarise myself with c# and won't ever be in the hands of anyone else or web facing so I'm not too worried about that here. Good to keep in mind though.
    – GigaJoules
    Nov 26 at 9:56










  • Even without the security risks, it's easier to write the code using a variable rather than using string concatenation.
    – Michael Kay
    Nov 26 at 10:06










  • In this case there is controlled language for the terms that are being searched, so perhaps something like a closest match lookup table which returns the actual text to be passed to the query construction would work? This way I could ensure that only a list of pre-approved strings is used to build queries since there is such a limited selection of searches the user needs to do?
    – GigaJoules
    Nov 27 at 11:46










  • In that case the benefits of compiling a parameterized query once and executing it repeatedly are even greater. Remember that compiling the query can cost 100 times as much as executing it. I'm not sure why you are so resistant to that approach.
    – Michael Kay
    Nov 27 at 11:58










  • I'm a little confused as to what you're actually suggesting? I was thinking of giving the user a text box to search for dataItemIDs and then providing a list of accepted IDs, and having the user select one of these from something like a drop down menu, and then doing the xpath query part with a hard coded string associated with that option which surely would isolate the user input from the final query?
    – GigaJoules
    Nov 27 at 15:37


















  • Yeah I'm aware of the security risks, however, this is just a little application I am writing to familiarise myself with c# and won't ever be in the hands of anyone else or web facing so I'm not too worried about that here. Good to keep in mind though.
    – GigaJoules
    Nov 26 at 9:56










  • Even without the security risks, it's easier to write the code using a variable rather than using string concatenation.
    – Michael Kay
    Nov 26 at 10:06










  • In this case there is controlled language for the terms that are being searched, so perhaps something like a closest match lookup table which returns the actual text to be passed to the query construction would work? This way I could ensure that only a list of pre-approved strings is used to build queries since there is such a limited selection of searches the user needs to do?
    – GigaJoules
    Nov 27 at 11:46










  • In that case the benefits of compiling a parameterized query once and executing it repeatedly are even greater. Remember that compiling the query can cost 100 times as much as executing it. I'm not sure why you are so resistant to that approach.
    – Michael Kay
    Nov 27 at 11:58










  • I'm a little confused as to what you're actually suggesting? I was thinking of giving the user a text box to search for dataItemIDs and then providing a list of accepted IDs, and having the user select one of these from something like a drop down menu, and then doing the xpath query part with a hard coded string associated with that option which surely would isolate the user input from the final query?
    – GigaJoules
    Nov 27 at 15:37
















Yeah I'm aware of the security risks, however, this is just a little application I am writing to familiarise myself with c# and won't ever be in the hands of anyone else or web facing so I'm not too worried about that here. Good to keep in mind though.
– GigaJoules
Nov 26 at 9:56




Yeah I'm aware of the security risks, however, this is just a little application I am writing to familiarise myself with c# and won't ever be in the hands of anyone else or web facing so I'm not too worried about that here. Good to keep in mind though.
– GigaJoules
Nov 26 at 9:56












Even without the security risks, it's easier to write the code using a variable rather than using string concatenation.
– Michael Kay
Nov 26 at 10:06




Even without the security risks, it's easier to write the code using a variable rather than using string concatenation.
– Michael Kay
Nov 26 at 10:06












In this case there is controlled language for the terms that are being searched, so perhaps something like a closest match lookup table which returns the actual text to be passed to the query construction would work? This way I could ensure that only a list of pre-approved strings is used to build queries since there is such a limited selection of searches the user needs to do?
– GigaJoules
Nov 27 at 11:46




In this case there is controlled language for the terms that are being searched, so perhaps something like a closest match lookup table which returns the actual text to be passed to the query construction would work? This way I could ensure that only a list of pre-approved strings is used to build queries since there is such a limited selection of searches the user needs to do?
– GigaJoules
Nov 27 at 11:46












In that case the benefits of compiling a parameterized query once and executing it repeatedly are even greater. Remember that compiling the query can cost 100 times as much as executing it. I'm not sure why you are so resistant to that approach.
– Michael Kay
Nov 27 at 11:58




In that case the benefits of compiling a parameterized query once and executing it repeatedly are even greater. Remember that compiling the query can cost 100 times as much as executing it. I'm not sure why you are so resistant to that approach.
– Michael Kay
Nov 27 at 11:58












I'm a little confused as to what you're actually suggesting? I was thinking of giving the user a text box to search for dataItemIDs and then providing a list of accepted IDs, and having the user select one of these from something like a drop down menu, and then doing the xpath query part with a hard coded string associated with that option which surely would isolate the user input from the final query?
– GigaJoules
Nov 27 at 15:37




I'm a little confused as to what you're actually suggesting? I was thinking of giving the user a text box to search for dataItemIDs and then providing a list of accepted IDs, and having the user select one of these from something like a drop down menu, and then doing the xpath query part with a hard coded string associated with that option which surely would isolate the user input from the final query?
– GigaJoules
Nov 27 at 15:37


















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.





Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


Please pay close attention to the following guidance:


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53427813%2fwhy-does-my-method-not-build-a-proper-xml-query-from-user-input%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

A CLEAN and SIMPLE way to add appendices to Table of Contents and bookmarks

Calculate evaluation metrics using cross_val_predict sklearn

Insert data from modal to MySQL (multiple modal on website)