What is the best way to limit a MySQL query on a large table to optimize execution speed?
-
Here is the scenario: I've got a MySQL table "posts" with 1.5 million rows. Selecting a set of 50 ordered by creation timestamp can take up to 3 seconds because MySQL has to analyze some 300k rows first. I've found a good solution for the main page, which loads the last 50 posts ordered by timestamp desc. Basically, every time a new post is submitted, a query is run to look up the last post_id on the page (the 50th post), and its ID is written to a flat text file. Then, when the page is loaded, the query is limited with WHERE post_id >= $lastPostIdOnPage, using the value fetched from that file. This works for the main page (0.002 sec), but for any other page it takes nearly 3 seconds. How would you optimize this for all pages? Using the above methodology, I would need to know a range of post_ids before executing the query to fetch results. Any thoughts would be appreciated!
-
Answer:
First: try to normalize the database. Database normalization is very important once you reach a certain number of rows. Second: always use numbers. If you have to run SELECTs that order by a field or put a condition on a field, try to encode that field as a number. Third: indexes. That's your word: indexes. Every field you use in a WHERE or an ORDER BY must be indexed. Fourth: sometimes it is better to do something in application code rather than doing everything in the database core. I've worked with a table with 50 million rows (yes, 50 million), and I joined that table with two other tables of 8 million rows each. The time was less than a second.
Angel Rojas at Quora
Other answers
Try optimizing the table and see what that gets you.
Robert Ross
You are finding that indexes don't help when you're using LIMIT ... OFFSET for a specific page's worth of rows. The reason is that indexes index by value, not by position.
But if you can make the value and the position be the same, then you can use an index for paging.
Add a column rownum in your table with an index on it. This will be like an auto-increment number, but it won't have any gaps. That means you may have to renumber the table if you delete a row or something.
    SET @r := 0;
    UPDATE posts SET rownum = (@r:=@r+1) ORDER BY PostId;
Then you can query for any page of 50 rows in the collection:
    SELECT * FROM posts WHERE rownum BETWEEN ? AND ?
This will use the index on rownum, and then it will examine only the rows that are part of the page you want to return.
Bill Karwin
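The rownum idea above can be tried end to end with a small sketch. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, it renumbers with a Python loop instead of the MySQL user-variable UPDATE, and the helper names (page_bounds, renumber, fetch_page) and the demo table are hypothetical.

```python
import sqlite3

PAGE_SIZE = 50

def page_bounds(page, page_size=PAGE_SIZE):
    """Return the inclusive (first, last) rownum for a 1-based page number."""
    first = (page - 1) * page_size + 1
    return first, first + page_size - 1

def renumber(conn):
    """Reassign gap-free rownum values in post_id order (needed after deletes)."""
    ids = [row[0] for row in conn.execute("SELECT post_id FROM posts ORDER BY post_id")]
    conn.executemany(
        "UPDATE posts SET rownum = ? WHERE post_id = ?",
        [(n, post_id) for n, post_id in enumerate(ids, start=1)],
    )
    conn.commit()

def fetch_page(conn, page):
    """Fetch one page with a range lookup on the indexed rownum column."""
    first, last = page_bounds(page)
    return conn.execute(
        "SELECT post_id, rownum FROM posts WHERE rownum BETWEEN ? AND ? ORDER BY rownum",
        (first, last),
    ).fetchall()

# Demo data (assumed): 500 posts whose ids have gaps, as if rows had been deleted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (post_id INTEGER PRIMARY KEY, rownum INTEGER)")
conn.execute("CREATE INDEX idx_posts_rownum ON posts (rownum)")
conn.executemany("INSERT INTO posts (post_id) VALUES (?)", [(i * 3,) for i in range(1, 501)])
renumber(conn)

page_3 = fetch_page(conn, 3)
```

The page number maps directly to a rownum range, so the lookup cost does not grow with the page number the way a large OFFSET does. The trade-off noted in the answer still applies: every delete leaves a gap until the table is renumbered.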
The suggestions in the other answers make sense but without knowing the full query you're running, we can really only offer generic advice. As a first step, run an EXPLAIN query (https://dev.mysql.com/doc/refman/5.0/en/using-explain.html) to figure out what MySQL is actually trying to do. Assuming MySQL isn't using indexes, as others have pointed out you'll want to add indexes to try and improve the performance. Apart from that, you'll want to look at the specific conditions in the WHERE clause and also take a look at any JOINs you're doing. MySQL has some well known "worst case" performance patterns when you do things like run dependent subqueries, JOINs across unindexed columns, or SELECT on a table without a primary key. Just a heads up, it is possible to "over index" a table and actually cause the performance to drop.
Ashish Datta
Try creating an index on the creation timestamp if that's what you are querying with. From your description, it looks like MySQL is doing a table scan, resulting in poor performance. To begin with, run EXPLAIN on the exact SELECT command you are using now to understand MySQL's query plan. The output of EXPLAIN will tell you the rows scanned to return the output you've requested. You can then add an index or use other optimization strategies to speed up queries. FWIW, MySQL can handle 1.5 million rows very easily.
Raghavendra Kidiyoor
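To see the effect of the advice above (inspect the plan, then index the timestamp), here is a minimal sketch. It assumes SQLite through Python's built-in sqlite3 module as a stand-in, so it uses EXPLAIN QUERY PLAN where MySQL would use EXPLAIN, and its output format differs from MySQL's. The created_at column name, the index name, and the query_plan helper are assumptions for illustration.

```python
import sqlite3

QUERY = "SELECT post_id, created_at FROM posts ORDER BY created_at DESC LIMIT 50"

def query_plan(conn, sql):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (post_id INTEGER PRIMARY KEY, created_at INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO posts (created_at, body) VALUES (?, ?)",
    [(1_600_000_000 + i * 60, "post") for i in range(1000)],
)

# Plan without an index on the sort column.
plan_before = query_plan(conn, QUERY)

# Add the index, then ask the planner again.
conn.execute("CREATE INDEX idx_posts_created_at ON posts (created_at)")
plan_after = query_plan(conn, QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

On MySQL the equivalent steps would be EXPLAIN in front of the real SELECT and a CREATE INDEX on the timestamp column. The point of the comparison is the same: after the index exists, the plan should mention it instead of a full scan followed by a sort.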
The key is to create a composite index on all the fields you query, with createdTime at the end. You may also want to limit the index scan by specifying a date range for createdTime in the query, such as the last few days. Don't use COUNT(*) or SQL_CALC_FOUND_ROWS, as they can kill your servers; you will have to live with not showing the total number of pages.
Ravi Periasamy
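A rough sketch of the composite-index advice above, again using Python's built-in sqlite3 module as a stand-in for MySQL. The author_id filter column, the index name, and the recent_posts helper are hypothetical. Fetching one row more than the page size is a common way to show a "next" link without counting all rows; it is an addition for illustration, not something the answer prescribes.

```python
import sqlite3

PAGE_SIZE = 50

def recent_posts(conn, author_id, since_ts, page_size=PAGE_SIZE):
    """Return (rows, has_more) for one author's posts created at or after since_ts."""
    rows = conn.execute(
        "SELECT post_id, created_at FROM posts "
        "WHERE author_id = ? AND created_at >= ? "
        "ORDER BY created_at DESC LIMIT ?",
        (author_id, since_ts, page_size + 1),  # one extra row signals another page
    ).fetchall()
    return rows[:page_size], len(rows) > page_size

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (post_id INTEGER PRIMARY KEY, author_id INTEGER, created_at INTEGER)"
)
# Equality column first, the range/sort column (created_at) last.
conn.execute("CREATE INDEX idx_posts_author_created ON posts (author_id, created_at)")
conn.executemany(
    "INSERT INTO posts (author_id, created_at) VALUES (?, ?)",
    [(i % 4, 1_600_000_000 + i * 60) for i in range(1000)],
)

# A wide date range: more than one page of results for this author.
rows, has_more = recent_posts(conn, author_id=1, since_ts=1_600_000_000)

# A narrow date range: only the last few posts qualify.
last_rows, last_has_more = recent_posts(conn, author_id=1, since_ts=1_600_000_000 + 990 * 60)
```

Because the query never counts the full result set, the page can offer "next" and "previous" links but not a total page count, which is the compromise the answer describes.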
Try horizontal partitioning. The difficulty is finding the best way to partition for your schema. It will be transparent to you, but MySQL will split the big table into x partitions. http://dev.mysql.com/tech-resources/articles/performance-partitioning.html
Antoine Guiral
Check out this link, where query optimization is nicely explained: http://www.ezeestudy.com/2010/06/22/mysql-query-optimization-3/
Deivanaathan A Krishnan