SQL: Is there a performance gain using one join vs. several?
-
When designing a database schema, is there a performance gain in using a pseudo-key (an auto-incremented key) vs. a natural key made up of two or more fields? In other words, would one join be faster than two joins, and by what percentage?

Table 1: id (auto-number), description, addedByUserId, dateTimeAdded
Table 2: id (auto-number), price, dateOfSale

vs.

Table 1: description, addedByUserId, dateTimeAdded
Table 2: price, dateOfSale, addedByUserId, dateTimeAdded

In this example I could use addedByUserId and dateTimeAdded as my primary key, or I could use an auto-number. What is the performance gain in using the auto-number approach?
-
Answer:
Don't try to prematurely optimise for performance. In my opinion you should design your schema for correctness and integrity first.
Toby Thain at Quora
Other answers
As already said, don't try to optimize your database performance prematurely. Let's take the example you used.

Start with dateTimeAdded alone as the primary key. That is fine if only one user is active at a time, but it may interfere with your other relations. With more than one active user it becomes a real problem: only one row can exist for any given dateTime, so two users adding items at the same moment will collide, and collisions become more likely as the number of users grows.

Now take addedByUserId alone as the primary key. That fails as soon as a user wants to add two items, so it will not work.

Then there is the case you specified: addedByUserId and dateTimeAdded together. Good work: this can do the job and eliminates both problems, provided a user can add only one thing at a time.

There are performance issues with your solution too. First, joins are costly. Not hugely, but compared with reading a single table, yes. The fewer the joins, the better the performance. As a little background, join results depend on elimination rather than selection. Second, there is redundancy, which you obviously know about. Try to keep redundancy to a minimum.

So, to improve your schema, I would add an auto-number primary key for every activity and use that auto-number for joins. Why copy data when a single additional column does the job and is a better identifier than the combination?

Note: my explanation does not always hold, but it works in most cases. In some of the other cases your way might work well.

Some reading material I would suggest:
http://stackoverflow.com/questions/2623852/why-are-joins-bad-when-considering-scalability
http://stackoverflow.com/questions/2451428/what-is-so-bad-about-using-sql-inner-join
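To make the two designs concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table names (item_a, sale_a, item_b, sale_b), the sample data, and the item_id column are assumptions made for illustration: the question's auto-number version of Table 2 lists no column linking it to Table 1, so a hypothetical item_id is added here. Other database engines use different auto-number syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Design A: auto-number keys. item_id is a hypothetical link column.
CREATE TABLE item_a (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    description   TEXT NOT NULL,
    addedByUserId INTEGER NOT NULL,
    dateTimeAdded TEXT NOT NULL
);
CREATE TABLE sale_a (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    item_id    INTEGER NOT NULL REFERENCES item_a (id),
    price      REAL NOT NULL,
    dateOfSale TEXT NOT NULL
);

-- Design B: composite key, repeated in the second table.
CREATE TABLE item_b (
    description   TEXT NOT NULL,
    addedByUserId INTEGER NOT NULL,
    dateTimeAdded TEXT NOT NULL,
    PRIMARY KEY (addedByUserId, dateTimeAdded)
);
CREATE TABLE sale_b (
    price         REAL NOT NULL,
    dateOfSale    TEXT NOT NULL,
    addedByUserId INTEGER NOT NULL,
    dateTimeAdded TEXT NOT NULL,
    FOREIGN KEY (addedByUserId, dateTimeAdded)
        REFERENCES item_b (addedByUserId, dateTimeAdded)
);
""")

# Sample data (made up for this sketch).
cur = conn.execute(
    "INSERT INTO item_a (description, addedByUserId, dateTimeAdded) "
    "VALUES (?, ?, ?)",
    ("widget", 7, "2020-01-01 10:00:00"),
)
conn.execute(
    "INSERT INTO sale_a (item_id, price, dateOfSale) VALUES (?, ?, ?)",
    (cur.lastrowid, 9.99, "2020-02-01"),
)
conn.execute(
    "INSERT INTO item_b VALUES (?, ?, ?)",
    ("widget", 7, "2020-01-01 10:00:00"),
)
conn.execute(
    "INSERT INTO sale_b VALUES (?, ?, ?, ?)",
    (9.99, "2020-02-01", 7, "2020-01-01 10:00:00"),
)

# Design A: the join condition compares one column.
rows_a = conn.execute(
    "SELECT i.description, s.price "
    "FROM item_a AS i JOIN sale_a AS s ON s.item_id = i.id"
).fetchall()

# Design B: the join condition compares two columns.
rows_b = conn.execute(
    "SELECT i.description, s.price "
    "FROM item_b AS i JOIN sale_b AS s "
    "ON s.addedByUserId = i.addedByUserId "
    "AND s.dateTimeAdded = i.dateTimeAdded"
).fetchall()
```

Both queries return the same row. Note that each design needs a single JOIN; what differs is how many columns the join condition compares and how much data the second table repeats. Any speed difference depends on the engine, the indexes, and the data, so it is worth measuring rather than assuming a percentage.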
Neeraj Khandelwal
This is not premature optimization. Considering your table's primary key, and natural vs. pseudo, is *not* premature optimization in a database. It is the very core of the "correctness" and "integrity" of the DB. The question "What identifies my data?" *is* the absolute crux of databases and should be given its proper due.

The answer depends on the DB and, in the case of MySQL, on the storage engine. It is important to remember that a natural key is different from a key that "happens to uniquely identify the data". In the case of your tables above, I would think about them the way you think about objects. Each table should represent instances of something, and while user_id and time added happen to uniquely identify your data, that combination isn't really a natural key.

There are, of course, exceptions, but in general you should prefer a natural key over an auto-increment; just don't shoehorn the wrong data into being a key. In MySQL, in the absence of a natural key, use an auto-increment. If you focus carefully on the primary identifiers for your data and on reducing repetition of data, better performance will follow and, importantly, later performance optimization will not be stymied.
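The difference between a natural key and columns that merely "happen to uniquely identify the data" can be shown with a small sketch, again using Python's sqlite3 module. The table names, the sample rows, and the assumption that timestamps are stored with one-second resolution are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Composite key: (user, timestamp) is assumed to be unique.
CREATE TABLE item_composite (
    description   TEXT NOT NULL,
    addedByUserId INTEGER NOT NULL,
    dateTimeAdded TEXT NOT NULL,
    PRIMARY KEY (addedByUserId, dateTimeAdded)
);
-- Auto-increment key: no assumption about user or timestamp.
CREATE TABLE item_auto (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    description   TEXT NOT NULL,
    addedByUserId INTEGER NOT NULL,
    dateTimeAdded TEXT NOT NULL
);
""")

# The same user adds two items within the same second.
rows = [
    ("first item", 7, "2020-01-01 10:00:00"),
    ("second item", 7, "2020-01-01 10:00:00"),
]

def insert_all(table):
    """Try to insert every sample row; return how many were accepted."""
    accepted = 0
    for row in rows:
        try:
            conn.execute(
                f"INSERT INTO {table} "
                "(description, addedByUserId, dateTimeAdded) "
                "VALUES (?, ?, ?)",
                row,
            )
            accepted += 1
        except sqlite3.IntegrityError:
            # Rejected: the row duplicates an existing primary key.
            pass
    return accepted

composite_accepted = insert_all("item_composite")
auto_accepted = insert_all("item_auto")
```

Under these assumptions the composite-key table rejects the second row while the auto-increment table accepts both. The pair (addedByUserId, dateTimeAdded) identifies rows only for as long as no user ever adds two items with the same timestamp, which is a property of the data so far rather than of the thing being modelled.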
Stephen Johnston