How long should a query that returns 5 million records take?
I realise the answer should probably be 'as little time as possible' but I'm trying to learn how to optimise databases and I have no idea what an acceptable time is for my hardware.
For a start I'm using my local machine with a copy of sql server 2008 express. I have a dual-core processor, 2GB ram and a 64bit OS (if that makes a difference). I'm only using a simple table with about 6 varchar fields.
At first I queried the data without any indexing. This took a ridiculously long amount of time so I cancelled and added a clustered index (using the PK) to the table. This cut the time down to 1 minute 14 sec. I have no idea if this is the best I can get or whether I'm still able to cut this down even further?
Am I limited by my hardware or is there anything else I can do to my table/database/queries to get results faster?
FYI I'm only using a standard SELECT * FROM to retrieve my results.
Thanks!
EDIT: Just to clarify, I'm only doing this for testing purposes. I don't NEED to pull out all the data, I'm just using that as a consistent test to see if I can cut down the query times.
I suppose what I'm asking is: Is there anything I can do to speed up the performance of my queries other than a) upgrading hardware and b) adding indexes (assuming the schema is already good)?
sql
add a comment |
I realise the answer should probably be 'as little time as possible' but I'm trying to learn how to optimise databases and I have no idea what an acceptable time is for my hardware.
For a start I'm using my local machine with a copy of sql server 2008 express. I have a dual-core processor, 2GB ram and a 64bit OS (if that makes a difference). I'm only using a simple table with about 6 varchar fields.
At first I queried the data without any indexing. This took a ridiculously long amount of time so I cancelled and added a clustered index (using the PK) to the table. This cut the time down to 1 minute 14 sec. I have no idea if this is the best I can get or whether I'm still able to cut this down even further?
Am I limited by my hardware or is there anything else I can do to my table/database/queries to get results faster?
FYI I'm only using a standard SELECT * FROM to retrieve my results.
Thanks!
EDIT: Just to clarify, I'm only doing this for testing purposes. I don't NEED to pull out all the data, I'm just using that as a consistent test to see if I can cut down the query times.
I suppose what I'm asking is: Is there anything I can do to speed up the performance of my queries other than a) upgrading hardware and b) adding indexes (assuming the schema is already good)?
sql
5
FYI you should probably test by inserting into a#TEMPtable instead of just selecting it. You will definitely be paying for display overhead for SSMS to show all 5m rows, and it will eat into your memory and IO
– JNK
Apr 3 '12 at 13:00
7
To speed it up, try to avoid using '*' and just ask for the data you want
– Matt Gibson
Apr 3 '12 at 13:06
+1 for Matt's suggestion of just ask for the data you want... for example, if all you want is the id and name of a row, you can do "select id, name from ..."
– joelparkerhenderson
Apr 3 '12 at 13:26
add a comment |
I realise the answer should probably be 'as little time as possible' but I'm trying to learn how to optimise databases and I have no idea what an acceptable time is for my hardware.
For a start I'm using my local machine with a copy of sql server 2008 express. I have a dual-core processor, 2GB ram and a 64bit OS (if that makes a difference). I'm only using a simple table with about 6 varchar fields.
At first I queried the data without any indexing. This took a ridiculously long amount of time so I cancelled and added a clustered index (using the PK) to the table. This cut the time down to 1 minute 14 sec. I have no idea if this is the best I can get or whether I'm still able to cut this down even further?
Am I limited by my hardware or is there anything else I can do to my table/database/queries to get results faster?
FYI I'm only using a standard SELECT * FROM to retrieve my results.
Thanks!
EDIT: Just to clarify, I'm only doing this for testing purposes. I don't NEED to pull out all the data, I'm just using that as a consistent test to see if I can cut down the query times.
I suppose what I'm asking is: Is there anything I can do to speed up the performance of my queries other than a) upgrading hardware and b) adding indexes (assuming the schema is already good)?
sql
I realise the answer should probably be 'as little time as possible' but I'm trying to learn how to optimise databases and I have no idea what an acceptable time is for my hardware.
For a start I'm using my local machine with a copy of sql server 2008 express. I have a dual-core processor, 2GB ram and a 64bit OS (if that makes a difference). I'm only using a simple table with about 6 varchar fields.
At first I queried the data without any indexing. This took a ridiculously long amount of time so I cancelled and added a clustered index (using the PK) to the table. This cut the time down to 1 minute 14 sec. I have no idea if this is the best I can get or whether I'm still able to cut this down even further?
Am I limited by my hardware or is there anything else I can do to my table/database/queries to get results faster?
FYI I'm only using a standard SELECT * FROM to retrieve my results.
Thanks!
EDIT: Just to clarify, I'm only doing this for testing purposes. I don't NEED to pull out all the data, I'm just using that as a consistent test to see if I can cut down the query times.
I suppose what I'm asking is: Is there anything I can do to speed up the performance of my queries other than a) upgrading hardware and b) adding indexes (assuming the schema is already good)?
sql
sql
edited Apr 3 '12 at 16:45
VMAtm
23.7k166178
23.7k166178
asked Apr 3 '12 at 12:58
alimac83alimac83
1,22952444
1,22952444
5
FYI you should probably test by inserting into a#TEMPtable instead of just selecting it. You will definitely be paying for display overhead for SSMS to show all 5m rows, and it will eat into your memory and IO
– JNK
Apr 3 '12 at 13:00
7
To speed it up, try to avoid using '*' and just ask for the data you want
– Matt Gibson
Apr 3 '12 at 13:06
+1 for Matt's suggestion of just ask for the data you want... for example, if all you want is the id and name of a row, you can do "select id, name from ..."
– joelparkerhenderson
Apr 3 '12 at 13:26
add a comment |
5
FYI you should probably test by inserting into a#TEMPtable instead of just selecting it. You will definitely be paying for display overhead for SSMS to show all 5m rows, and it will eat into your memory and IO
– JNK
Apr 3 '12 at 13:00
7
To speed it up, try to avoid using '*' and just ask for the data you want
– Matt Gibson
Apr 3 '12 at 13:06
+1 for Matt's suggestion of just ask for the data you want... for example, if all you want is the id and name of a row, you can do "select id, name from ..."
– joelparkerhenderson
Apr 3 '12 at 13:26
5
5
FYI you should probably test by inserting into a
#TEMP table instead of just selecting it. You will definitely be paying for display overhead for SSMS to show all 5m rows, and it will eat into your memory and IO– JNK
Apr 3 '12 at 13:00
FYI you should probably test by inserting into a
#TEMP table instead of just selecting it. You will definitely be paying for display overhead for SSMS to show all 5m rows, and it will eat into your memory and IO– JNK
Apr 3 '12 at 13:00
7
7
To speed it up, try to avoid using '*' and just ask for the data you want
– Matt Gibson
Apr 3 '12 at 13:06
To speed it up, try to avoid using '*' and just ask for the data you want
– Matt Gibson
Apr 3 '12 at 13:06
+1 for Matt's suggestion of just ask for the data you want... for example, if all you want is the id and name of a row, you can do "select id, name from ..."
– joelparkerhenderson
Apr 3 '12 at 13:26
+1 for Matt's suggestion of just ask for the data you want... for example, if all you want is the id and name of a row, you can do "select id, name from ..."
– joelparkerhenderson
Apr 3 '12 at 13:26
add a comment |
5 Answers
5
active
oldest
votes
I think you are asking the wrong question.
First of all - why do you need so many articles at one time at the local machine? What do you want to do with them?
Why I'm asking? I think this amount of data will be transfered to somewhere. And only at this time you should measure the time of transfering the data.
And even in this situation I want to make advice to you:
Your applications should not select 5 million records at the time. Try to split your query, and get data partially.
UPDATE:
As you say are doing this for testing, I suggest you to:
- Remove
*from your query - SQL server spends some time to resolve this. - Try to put your data to some temp data storage. Try to use
VIEWor temp table for this. - Try to use some cache plan on your server
But I still don't understand - why do you need such tests if your application would not ever use such query? Testing only for testing is bad time spending.
And I would add that optimizing for issues found in those tests, which I assume is the reason for running them, would be an even bigger waste of time.
– Samuel Lindblom
Sep 30 '13 at 9:03
add a comment |
Look at the query execution plan. If your query is doing a table scan, it will obviously take a long time. The query execution plan can help you decide what kind of indexing you would need on the table. Also, creating table partitions can help sometimes in cases where the data is partitioned by a condition (usually date and time).
add a comment |
The best optimized way depends on the indexing strategy you choose. As many of the above answers, i too would say partitioning the table would help sometimes. And its not the best practise to query all the billion record in a single time frame. Will give you much bettered resuld if you could try to query partially with the iterations. you may check this link to clear the doubts on the minimum requirements for the Sql server 2008 Minimum H/W and S/W Requirements for Sql server 2008
add a comment |
When fecthing 5 million rows you are almost 100% going spool to tempdb. you should try to optimize your temp Db by adding additional files. if you have multiple drives on seperate disks you should split the table data into different ndf files located on seperate disks. parititioning wont help when querying all the data on the disk
U can also use a query hint to force parrallelism MAXDOP this will increase the CPU utilization. Ensure that the columns contain few nulls as possible and rebuild ur indexes and stats
add a comment |
I did 5.5 million in 20 seconds. That's taking over 100k schedules with different frequencies and forecasting them for the next 25 years. Just max scenario testing, but proves the speed you can achieve in a scheduling system as an example.
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f9993761%2fhow-long-should-a-query-that-returns-5-million-records-take%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
5 Answers
5
active
oldest
votes
5 Answers
5
active
oldest
votes
active
oldest
votes
active
oldest
votes
I think you are asking the wrong question.
First of all - why do you need so many articles at one time at the local machine? What do you want to do with them?
Why I'm asking? I think this amount of data will be transfered to somewhere. And only at this time you should measure the time of transfering the data.
And even in this situation I want to make advice to you:
Your applications should not select 5 million records at the time. Try to split your query, and get data partially.
UPDATE:
As you say are doing this for testing, I suggest you to:
- Remove
*from your query - SQL server spends some time to resolve this. - Try to put your data to some temp data storage. Try to use
VIEWor temp table for this. - Try to use some cache plan on your server
But I still don't understand - why do you need such tests if your application would not ever use such query? Testing only for testing is bad time spending.
And I would add that optimizing for issues found in those tests, which I assume is the reason for running them, would be an even bigger waste of time.
– Samuel Lindblom
Sep 30 '13 at 9:03
add a comment |
I think you are asking the wrong question.
First of all - why do you need so many articles at one time at the local machine? What do you want to do with them?
Why I'm asking? I think this amount of data will be transfered to somewhere. And only at this time you should measure the time of transfering the data.
And even in this situation I want to make advice to you:
Your applications should not select 5 million records at the time. Try to split your query, and get data partially.
UPDATE:
As you say are doing this for testing, I suggest you to:
- Remove
*from your query - SQL server spends some time to resolve this. - Try to put your data to some temp data storage. Try to use
VIEWor temp table for this. - Try to use some cache plan on your server
But I still don't understand - why do you need such tests if your application would not ever use such query? Testing only for testing is bad time spending.
And I would add that optimizing for issues found in those tests, which I assume is the reason for running them, would be an even bigger waste of time.
– Samuel Lindblom
Sep 30 '13 at 9:03
add a comment |
I think you are asking the wrong question.
First of all - why do you need so many articles at one time at the local machine? What do you want to do with them?
Why I'm asking? I think this amount of data will be transfered to somewhere. And only at this time you should measure the time of transfering the data.
And even in this situation I want to make advice to you:
Your applications should not select 5 million records at the time. Try to split your query, and get data partially.
UPDATE:
As you say are doing this for testing, I suggest you to:
- Remove
*from your query - SQL server spends some time to resolve this. - Try to put your data to some temp data storage. Try to use
VIEWor temp table for this. - Try to use some cache plan on your server
But I still don't understand - why do you need such tests if your application would not ever use such query? Testing only for testing is bad time spending.
I think you are asking the wrong question.
First of all - why do you need so many articles at one time at the local machine? What do you want to do with them?
Why I'm asking? I think this amount of data will be transfered to somewhere. And only at this time you should measure the time of transfering the data.
And even in this situation I want to make advice to you:
Your applications should not select 5 million records at the time. Try to split your query, and get data partially.
UPDATE:
As you say are doing this for testing, I suggest you to:
- Remove
*from your query - SQL server spends some time to resolve this. - Try to put your data to some temp data storage. Try to use
VIEWor temp table for this. - Try to use some cache plan on your server
But I still don't understand - why do you need such tests if your application would not ever use such query? Testing only for testing is bad time spending.
edited Apr 13 '12 at 7:19
answered Apr 3 '12 at 13:02
VMAtmVMAtm
23.7k166178
23.7k166178
And I would add that optimizing for issues found in those tests, which I assume is the reason for running them, would be an even bigger waste of time.
– Samuel Lindblom
Sep 30 '13 at 9:03
add a comment |
And I would add that optimizing for issues found in those tests, which I assume is the reason for running them, would be an even bigger waste of time.
– Samuel Lindblom
Sep 30 '13 at 9:03
And I would add that optimizing for issues found in those tests, which I assume is the reason for running them, would be an even bigger waste of time.
– Samuel Lindblom
Sep 30 '13 at 9:03
And I would add that optimizing for issues found in those tests, which I assume is the reason for running them, would be an even bigger waste of time.
– Samuel Lindblom
Sep 30 '13 at 9:03
add a comment |
Look at the query execution plan. If your query is doing a table scan, it will obviously take a long time. The query execution plan can help you decide what kind of indexing you would need on the table. Also, creating table partitions can help sometimes in cases where the data is partitioned by a condition (usually date and time).
add a comment |
Look at the query execution plan. If your query is doing a table scan, it will obviously take a long time. The query execution plan can help you decide what kind of indexing you would need on the table. Also, creating table partitions can help sometimes in cases where the data is partitioned by a condition (usually date and time).
add a comment |
Look at the query execution plan. If your query is doing a table scan, it will obviously take a long time. The query execution plan can help you decide what kind of indexing you would need on the table. Also, creating table partitions can help sometimes in cases where the data is partitioned by a condition (usually date and time).
Look at the query execution plan. If your query is doing a table scan, it will obviously take a long time. The query execution plan can help you decide what kind of indexing you would need on the table. Also, creating table partitions can help sometimes in cases where the data is partitioned by a condition (usually date and time).
answered Apr 3 '12 at 13:39
rvphxrvphx
1,61862757
1,61862757
add a comment |
add a comment |
The best optimized way depends on the indexing strategy you choose. As many of the above answers, i too would say partitioning the table would help sometimes. And its not the best practise to query all the billion record in a single time frame. Will give you much bettered resuld if you could try to query partially with the iterations. you may check this link to clear the doubts on the minimum requirements for the Sql server 2008 Minimum H/W and S/W Requirements for Sql server 2008
add a comment |
The best optimized way depends on the indexing strategy you choose. As many of the above answers, i too would say partitioning the table would help sometimes. And its not the best practise to query all the billion record in a single time frame. Will give you much bettered resuld if you could try to query partially with the iterations. you may check this link to clear the doubts on the minimum requirements for the Sql server 2008 Minimum H/W and S/W Requirements for Sql server 2008
add a comment |
The best optimized way depends on the indexing strategy you choose. As many of the above answers, i too would say partitioning the table would help sometimes. And its not the best practise to query all the billion record in a single time frame. Will give you much bettered resuld if you could try to query partially with the iterations. you may check this link to clear the doubts on the minimum requirements for the Sql server 2008 Minimum H/W and S/W Requirements for Sql server 2008
The best optimized way depends on the indexing strategy you choose. As many of the above answers, i too would say partitioning the table would help sometimes. And its not the best practise to query all the billion record in a single time frame. Will give you much bettered resuld if you could try to query partially with the iterations. you may check this link to clear the doubts on the minimum requirements for the Sql server 2008 Minimum H/W and S/W Requirements for Sql server 2008
answered Apr 12 '12 at 7:19
user824910user824910
47251332
47251332
add a comment |
add a comment |
When fecthing 5 million rows you are almost 100% going spool to tempdb. you should try to optimize your temp Db by adding additional files. if you have multiple drives on seperate disks you should split the table data into different ndf files located on seperate disks. parititioning wont help when querying all the data on the disk
U can also use a query hint to force parrallelism MAXDOP this will increase the CPU utilization. Ensure that the columns contain few nulls as possible and rebuild ur indexes and stats
add a comment |
When fecthing 5 million rows you are almost 100% going spool to tempdb. you should try to optimize your temp Db by adding additional files. if you have multiple drives on seperate disks you should split the table data into different ndf files located on seperate disks. parititioning wont help when querying all the data on the disk
U can also use a query hint to force parrallelism MAXDOP this will increase the CPU utilization. Ensure that the columns contain few nulls as possible and rebuild ur indexes and stats
add a comment |
When fecthing 5 million rows you are almost 100% going spool to tempdb. you should try to optimize your temp Db by adding additional files. if you have multiple drives on seperate disks you should split the table data into different ndf files located on seperate disks. parititioning wont help when querying all the data on the disk
U can also use a query hint to force parrallelism MAXDOP this will increase the CPU utilization. Ensure that the columns contain few nulls as possible and rebuild ur indexes and stats
When fecthing 5 million rows you are almost 100% going spool to tempdb. you should try to optimize your temp Db by adding additional files. if you have multiple drives on seperate disks you should split the table data into different ndf files located on seperate disks. parititioning wont help when querying all the data on the disk
U can also use a query hint to force parrallelism MAXDOP this will increase the CPU utilization. Ensure that the columns contain few nulls as possible and rebuild ur indexes and stats
answered Feb 7 '14 at 13:58
Jayanth KurupJayanth Kurup
1
1
add a comment |
add a comment |
I did 5.5 million in 20 seconds. That's taking over 100k schedules with different frequencies and forecasting them for the next 25 years. Just max scenario testing, but proves the speed you can achieve in a scheduling system as an example.
add a comment |
I did 5.5 million in 20 seconds. That's taking over 100k schedules with different frequencies and forecasting them for the next 25 years. Just max scenario testing, but proves the speed you can achieve in a scheduling system as an example.
add a comment |
I did 5.5 million in 20 seconds. That's taking over 100k schedules with different frequencies and forecasting them for the next 25 years. Just max scenario testing, but proves the speed you can achieve in a scheduling system as an example.
I did 5.5 million in 20 seconds. That's taking over 100k schedules with different frequencies and forecasting them for the next 25 years. Just max scenario testing, but proves the speed you can achieve in a scheduling system as an example.
answered Nov 25 '18 at 19:16
AndyAndy
1
1
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f9993761%2fhow-long-should-a-query-that-returns-5-million-records-take%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
5
FYI you should probably test by inserting into a
#TEMPtable instead of just selecting it. You will definitely be paying for display overhead for SSMS to show all 5m rows, and it will eat into your memory and IO– JNK
Apr 3 '12 at 13:00
7
To speed it up, try to avoid using '*' and just ask for the data you want
– Matt Gibson
Apr 3 '12 at 13:06
+1 for Matt's suggestion of just ask for the data you want... for example, if all you want is the id and name of a row, you can do "select id, name from ..."
– joelparkerhenderson
Apr 3 '12 at 13:26