How do I generate unique, non-serial order numbers (similar to Amazon)?

Amazon generates a 17-digit order number of the form 3-7-7, where the last 14 digits are seemingly pseudo-random. The "birthday problem" means that I cannot simply generate random numbers and expect to avoid collisions. I am using a database, and I know I can check the database index for a collision before inserting, but as the rate of collisions increases, my order number generation may slow down arbitrarily. What is a good way to generate these pseudo-random numbers for an order? The Amazon order number has other nice properties, e.g. the 14-digit part never begins with 0. I was thinking about starting from a seed permutation and stepping through lexicographic permutations.
Answer:

You don’t need to worry. Yes, the birthday problem tells you that you will probably see one collision in a 14-digit order number space after some ten million orders, but if you can check for collisions and deal with them by generating another number, an occasional collision is no big deal. Even if you get a trillion orders, you’ll still only fill up 1% of the space, so an order number will only need to be regenerated with probability 1%. But if you really want to avoid that check anyway, you could generate internal IDs sequentially (1000000-0000000, 1000000-0000001, 1000000-0000002, …), then apply format-preserving encryption (http://en.wikipedia.org/wiki/Format-preserving_encryption) to make them look random before showing them to the customer. See also

Anders Kaseorg at Quora
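
A rough sketch of the sequential-ID-plus-encryption idea from the answer above, not the answerer's own code. It assumes a toy keyed permutation (an 8-round Feistel network over 48 bits with HMAC-SHA256 as the round function) plus cycle-walking to stay inside the 9 * 10^13 values that form a 14-digit number with a non-zero first digit. The names `DOMAIN`, `scramble`, `unscramble` and `order_number` are made up for illustration, and this construction is not a vetted standard such as FF1, so treat it as a demonstration only.

```python
import hashlib
import hmac

DOMAIN = 9 * 10**13            # 14-digit numbers whose first digit is not 0
HALF_BITS = 24                 # 2 * 24 = 48 bits, smallest even width covering DOMAIN
HALF_MASK = (1 << HALF_BITS) - 1
ROUNDS = 8


def _round(key: bytes, rnd: int, half: int) -> int:
    """Keyed round function: 24 bits derived from HMAC-SHA256."""
    msg = bytes([rnd]) + half.to_bytes(3, "big")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:3], "big")


def _feistel(key: bytes, value: int) -> int:
    left, right = value >> HALF_BITS, value & HALF_MASK
    for rnd in range(ROUNDS):
        left, right = right, left ^ _round(key, rnd, right)
    return (left << HALF_BITS) | right


def _feistel_inv(key: bytes, value: int) -> int:
    left, right = value >> HALF_BITS, value & HALF_MASK
    for rnd in reversed(range(ROUNDS)):
        left, right = right ^ _round(key, rnd, left), left
    return (left << HALF_BITS) | right


def scramble(key: bytes, seq: int) -> int:
    """Map a sequential ID in [0, DOMAIN) to a unique value in [0, DOMAIN)."""
    if not 0 <= seq < DOMAIN:
        raise ValueError("sequence number out of range")
    value = _feistel(key, seq)
    while value >= DOMAIN:      # cycle-walking: re-encrypt until back in range
        value = _feistel(key, value)
    return value


def unscramble(key: bytes, value: int) -> int:
    """Inverse of scramble(), for looking up the internal ID."""
    value = _feistel_inv(key, value)
    while value >= DOMAIN:
        value = _feistel_inv(key, value)
    return value


def order_number(key: bytes, seq: int) -> str:
    """Format the scrambled ID as the 7-7 part of the order number."""
    n = 10**13 + scramble(key, seq)
    return f"{n // 10**7:07d}-{n % 10**7:07d}"
```

Because the mapping is a permutation, two different sequence numbers can never produce the same order number under one key, so no collision check is needed; the cost is that the key must stay secret and must never change once orders exist.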

Other answers

If your strings get long enough, the chance of a collision becomes vanishingly small. As suggested, check out GUIDs or UUIDs. The Wikipedia page on UUIDs has a section on the probability of duplicates: http://en.wikipedia.org/wiki/UUID

Jon Moter
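
For reference, a minimal sketch of the UUID route using Python's standard `uuid` module. The helper names are hypothetical, and the probability function is only the usual birthday approximation n^2 / (2 * 2^bits), assuming the 122 random bits of a version 4 UUID.

```python
import uuid


def new_order_id() -> str:
    """Random (version 4) UUID rendered as a 36-character string."""
    return str(uuid.uuid4())


def collision_probability(n: int, random_bits: int = 122) -> float:
    """Birthday approximation for at least one collision among n IDs."""
    return n * n / (2 * 2**random_bits)
```

Note that a UUID is a 36-character hexadecimal string, so it does not match the 14-digit numeric format asked about in the question.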

Can you use a timestamp as a component of the ID? If you combine a millisecond or nanosecond precision timestamp with a good, long pseudo-random number, the chances of collision would be vanishingly small.

Tim Moore
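
One way this could look, as a sketch only: a millisecond timestamp followed by a fixed-width random suffix. The function name and the six-digit default are assumptions, not something the answer specifies.

```python
import secrets
import time


def timestamped_id(random_digits: int = 6) -> str:
    """Millisecond timestamp plus a zero-padded random suffix."""
    millis = time.time_ns() // 1_000_000
    suffix = secrets.randbelow(10**random_digits)
    return f"{millis}-{suffix:0{random_digits}d}"
```

The timestamp prefix still increases over time, so IDs built this way reveal the order in which they were created unless the result is scrambled or hashed afterwards.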

The lowest real-time-overhead method of doing this is to pre-populate a database table with your desired IDs. Write a script that starts at the lowest number you want and increments upward, discarding numbers that don't meet your criteria and writing the others to a table. Generate 1,000 or 1,000,000 at a time. Run the script again when you get close to running out of numbers, starting from the highest number previously generated. Only run the script when you have extra cycles to spare.

Michael Stajer
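
A small sketch of the pre-populated table, using an in-memory SQLite database. The table name `id_pool`, the example criterion (skip numbers ending in 000) and the random draw in `claim()` are all assumptions added for illustration; the answer does not say how an ID is picked from the table.

```python
import sqlite3


def open_pool() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE id_pool (id INTEGER PRIMARY KEY)")
    return conn


def is_acceptable(n: int) -> bool:
    """Hypothetical criterion: reject numbers that end in 000."""
    return n % 1000 != 0


def refill(conn: sqlite3.Connection, start: int, count: int) -> int:
    """Add `count` acceptable numbers from `start` upward; return the next start."""
    n, added = start, 0
    while added < count:
        if is_acceptable(n):
            conn.execute("INSERT INTO id_pool (id) VALUES (?)", (n,))
            added += 1
        n += 1
    conn.commit()
    return n


def claim(conn: sqlite3.Connection) -> int:
    """Remove and return one random ID from the pool (single writer assumed)."""
    row = conn.execute("SELECT id FROM id_pool ORDER BY RANDOM() LIMIT 1").fetchone()
    if row is None:
        raise RuntimeError("pool is empty; run refill() first")
    conn.execute("DELETE FROM id_pool WHERE id = ?", (row[0],))
    conn.commit()
    return row[0]
```

With several order processes claiming IDs at once, the select and delete would need to run inside one transaction or use row locking, which this sketch leaves out.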

Take a text file (say paste.txt); the more long words it has, the better for you. Use a one-liner like for i in `cat paste.txt`; do echo "$RANDOM $i"; done | sort | sed -r 's/^[0-9]+ //' > randorder.txt to get shuffled words from the file, then use the output, word by word, as input for a hash function that fits. Even minimal functions like https://github.com/carnotweat/lopt/blob/master/pwjhash.c or https://github.com/carnotweat/lopt/blob/master/elfhash.c give decent outputs, e.g. ./pwjhash sameer 0x00000073 0x00000791 0x0000797d 0x00079835 0x007983b5 0x07983bc2 0x00000309 & ./elfhash sameer 00000111100110000011101111000010 sameer=127417282. This can be followed by using similar *nix utilities or regexes to clean up the strings that don't match your criteria. Processing any decent text file as above will generate the sought strings. Like everyone else said, collisions are unlikely for a large number of generated strings.

Sameer Gupta
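
A Python sketch of the same pipeline, assuming the classic ELF hash. It reproduces the sameer=127417282 output quoted above, but whether it matches the linked elfhash.c for every input is an assumption. `shuffled_hashes` is a hypothetical helper standing in for the shell one-liner.

```python
import random


def elf_hash(text: str) -> int:
    """Classic ELF hash, emulating 32-bit unsigned arithmetic."""
    h = 0
    for byte in text.encode("utf-8"):
        h = ((h << 4) + byte) & 0xFFFFFFFF
        high = h & 0xF0000000
        if high:
            h ^= high >> 24
        h &= ~high & 0xFFFFFFFF     # clear the top four bits
    return h


def shuffled_hashes(words, rng=None):
    """Shuffle the words, then hash each one."""
    rng = rng or random.Random()
    words = list(words)
    rng.shuffle(words)
    return [(word, elf_hash(word)) for word in words]


print(elf_hash("sameer"))   # 127417282, as in the output quoted above
```

The top four bits are always cleared, so this hash has at most 2^28 distinct values. By the birthday bound, collisions become likely after roughly twenty thousand distinct words, so a uniqueness check is still needed with a hash this short.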

This is a very interesting question, since generating unique IDs can become technically challenging once a system has scaled to a certain level. However, if your ID is a 17-digit number like Amazon's, the chance of a collision is extremely low. With that in mind, whenever a new random ID is generated, you can just check whether there's a collision. If by any chance you get one, just generate again. Since the chance is very low, this adds almost zero cost. I think your concern mostly applies to cases like generating user IDs. Usually, the system keeps incrementing the user ID based on the time of registration, and when there are a large number of requests every second, it's easy to get duplicate user IDs. Remember that the biggest difference here is that user IDs are usually incremental, while Amazon's order numbers are not. Solving the user ID generation problem is much harder, as you'd better distribute the generators across multiple machines. Take a look at the article http://blog.gainlo.co/index.php/2016/06/07/random-id-generator/?utm_source=quora&utm_medium=How+do+I+generate+unique%2C+non-serial+order+numbers+(similar+to+Amazon)%3F&utm_campaign=quora that has an in-depth discussion of this problem.

Mark Ali
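
The generate-check-retry loop can be sketched as follows, with a Python set standing in for the database index. The names are hypothetical, and `expected_attempts` is just the mean of a geometric distribution, assuming IDs are drawn uniformly from the space.

```python
import secrets

SPACE = 9 * 10**13      # 14-digit numbers that do not start with 0


def random_order_number() -> int:
    return 10**13 + secrets.randbelow(SPACE)


def new_unique_order_number(taken: set, max_attempts: int = 10) -> int:
    """Draw random numbers until one is unused, then record it in `taken`."""
    for _ in range(max_attempts):
        candidate = random_order_number()
        if candidate not in taken:
            taken.add(candidate)
            return candidate
    raise RuntimeError("too many collisions; the ID space may be nearly full")


def expected_attempts(existing: int, space: int = SPACE) -> float:
    """Average number of draws needed when `existing` IDs are already used."""
    return 1.0 / (1.0 - existing / space)
```

Even with a trillion orders in a 10^14 space, `expected_attempts(10**12, 10**14)` is only about 1.01, which is why the retry cost stays negligible until the space is mostly used up.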

Suppose the number is to be generated in the interval [1, 1000]. Generate a random number, say 300. Now the valid intervals for the next number are [1, 299] and [301, 1000]. After the second number is generated, we get up to three disjoint intervals for the next valid serial number. Say that at some point we have k valid intervals. We can then generate two pseudo-random numbers: rand(0, k-1), which randomly selects a valid interval, and rand(interval beginning, interval ending), which selects a number within the chosen interval. Once the serial number is selected, split the chosen interval around it and update k accordingly (k grows by one unless the number fell on an interval boundary). Hence we always need just two random numbers to get a valid unique number without any collisions.

Kumaran Gunasekaran
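
A minimal sketch of the interval scheme described above, keeping the free intervals in a plain list. The class name and the use of `random.Random` are assumptions; a production version would probably use a cryptographically secure generator and persistent storage.

```python
import random


class IntervalIdPool:
    """Hands out each integer in [low, high] exactly once, in random order."""

    def __init__(self, low: int, high: int, rng=None):
        self.intervals = [(low, high)]      # inclusive, disjoint free ranges
        self.rng = rng or random.Random()

    def take(self) -> int:
        if not self.intervals:
            raise RuntimeError("no identifiers left")
        # First random number: which free interval to use.
        idx = self.rng.randrange(len(self.intervals))
        low, high = self.intervals[idx]
        # Second random number: which value inside that interval.
        value = self.rng.randint(low, high)
        # Replace the interval with whatever remains on either side.
        pieces = []
        if low <= value - 1:
            pieces.append((low, value - 1))
        if value + 1 <= high:
            pieces.append((value + 1, high))
        self.intervals[idx:idx + 1] = pieces
        return value
```

Two caveats: the interval list grows as numbers are handed out, so it has to be stored somewhere, and picking an interval uniformly makes numbers in short intervals more likely than numbers in long ones, so the output is collision-free but not uniformly distributed.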

If you don't want to go with an existing solution (like UUIDs), here's what you do. Take a representation of a bunch of data about the order (the user ID of the person who placed the order, the IDs of the ordered items, the current timestamp, the IP of the user, ...) and concatenate them (or put them in a hashable data structure, like an array). Add some random number at the end. Generate a hash value of that. Because a user doesn't place the same order at the same time from the same IP address twice, your original value (before hashing) will be unique. If you want to make really sure that there are no duplicates even after hashing, most database systems can check the uniqueness of values. If you're using some flavour of SQL database, you can make the order ID column unique (it is automatically unique if it's your primary key). If you then try to insert an order with a duplicate order ID, you will get an error and can generate a new order ID and try again. You can also pre-generate the IDs in the background and feed them into a queue. Then you have plenty of time to check uniqueness, and when your order manager needs an order ID, you just take one off the queue.

Tobias Kommerell
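
The hash-then-insert flow might look like this, sketched with SHA-256 and an in-memory SQLite table. The schema, the function names and the reduction of the hash to 14 digits are assumptions made for illustration.

```python
import hashlib
import secrets
import sqlite3
import time


def create_schema(conn: sqlite3.Connection) -> None:
    # The primary key gives the order ID column its uniqueness check.
    conn.execute(
        "CREATE TABLE orders (order_id TEXT PRIMARY KEY, user_id INTEGER NOT NULL)"
    )


def candidate_order_id(user_id: int, item_ids: list, ip: str) -> str:
    """Hash order data plus a timestamp and random salt into 14 digits."""
    material = "|".join([
        str(user_id),
        ",".join(map(str, item_ids)),
        ip,
        str(time.time_ns()),
        secrets.token_hex(8),
    ])
    digest = hashlib.sha256(material.encode("utf-8")).digest()
    n = int.from_bytes(digest[:8], "big") % (9 * 10**13)
    return str(10**13 + n)          # never starts with 0


def insert_order(conn, user_id: int, item_ids: list, ip: str, max_attempts: int = 5) -> str:
    """Insert the order, regenerating the ID if the database reports a duplicate."""
    for _ in range(max_attempts):
        order_id = candidate_order_id(user_id, item_ids, ip)
        try:
            with conn:              # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO orders (order_id, user_id) VALUES (?, ?)",
                    (order_id, user_id),
                )
            return order_id
        except sqlite3.IntegrityError:
            continue                # duplicate ID: try again with a fresh salt
    raise RuntimeError("could not find a free order ID")
```

Cutting the hash down to 14 digits means two different inputs can still map to the same order ID, so the unique column is what actually guarantees uniqueness here, not the hash.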
