What is the ideal database server architecture for a Facebook App that is anticipated to scale up very quickly?
-
We intend to use MySQL, and users will be uploading media (mostly photos) through the app. We anticipate a few hundred thousand users in a relatively short period of time, and want to architect the app and the server side to be prepared to handle the load. Any recommendations on preferred services, alternative platforms, and best practices would be greatly appreciated. Thank you in advance.
-
Answer:
Ensure Application Layer Scalability From The Start
On the backend, I would use VMs to create several "small" servers instead of one big one. Ensure your application code can run on two separate application VMs. This will essentially prove that the application layer is scalable and will allow you to scale the app and database layers independently.
Tune The Platform
I would focus on making things fast across the board by tuning your infrastructure first. Read the tuning docs for your platform / language of choice. Even simple stress tests, along with simple systems-level measurement tools like 'vmstat', can reveal performance bottlenecks. Figure out the best practices for high performance on whatever application platform you're working with, and get that out of the way.
Use Proven Technology
NoSQL and other hot, more "experimental" tech designed for exotic scaling is very tempting, but not proven. Use well-documented and well-understood technology. You'll be able to get out of a sticky situation much quicker and find help when you need it. In addition, if you can, leverage existing infrastructure that's already been proven to scale, like some of the services provided by Amazon Web Services or Google App Engine.
Always Measure
Don't assume something is faster because some guy on Quora says it is ;). Have ways of measuring it. Even if they are simple and crude, any measurement is several orders of magnitude better than shooting in the dark.
Caching is Tempting, but Hard. Use Sparingly
I would stay away from caching unless the implementation is extremely simple. Cache invalidation can be hard and complex, and often you won't realize exactly what you need to cache until you've got hard performance data from the real world. If you cache the wrong thing or do not invalidate things properly, things can really get messed up. You also open yourself up to things like "cache stampedes" that can suddenly overload your database.
If you've got a dynamic page that you know will get a ton of traffic and looks the same for everyone, like a home page, go ahead and cache it. Use something simple like a background job that runs, say, every five seconds, to update the cached copy of the page. Of course, as soon as you personalize any page, caching becomes much harder.
Uploading Sucks
Most web platforms are terrible at uploading. Most have crude implementations that will tie up an entire worker (think 50-250MB RAM, depending on platform) while the upload is taking place. Considering this is going to be a core feature of your application, I'd make sure you can handle thousands of concurrent uploads without crashing and burning. Nginx can buffer uploads to disk while they are taking place, and then flush the entire upload to your application once it's complete. This means your application code is only actually running while Nginx is transferring the upload to it (happens at hundreds or thousands of megabytes per second), versus while the client is streaming it (happens at kilobytes per second).
Files Don't Belong In The Database
As far as the database goes, definitely do not store files in MySQL. As others have said, use a hosted storage service like S3. That removes an entire class of scaling problems and lets you focus on other things.
The Database Will Likely Be Your Bottleneck
I prefer PostgreSQL, but if you must use MySQL, use at least 5.4 and always use the InnoDB storage engine. If you expect lots of writes (hundreds of INSERTs per second at peak) and can afford to lose a few transactions if the database crashes (by few I mean less than a second's worth of transactions, worst case), make sure you set innodb_flush_log_at_trx_commit to 0 and sync_binlog to 10. This allows the database to perform disk writes more efficiently, by batching them and performing them once per second.
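For reference, the two settings named above might look roughly like this in a MySQL option file. This is only a sketch: the file location (often my.cnf) depends on your installation, and the values are the ones suggested in this answer, not general-purpose defaults.

```ini
[mysqld]
# Write and flush the InnoDB log about once per second instead of at every commit.
# Trade-off: a crash can lose up to roughly the last second of transactions.
innodb_flush_log_at_trx_commit = 0

# Sync the binary log to disk after every 10 commit groups rather than after each one.
sync_binlog = 10
```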
These settings also allow a synchronous client (like PHP, Rails, etc.) to continue operating without waiting for the physical disk write to complete. In most free/consumer applications, losing a few transactions isn't going to impact you much.
Try to avoid master/slave replication for scaling as long as possible, as it relaxes consistency guarantees in ways that are often difficult to predict and can cause coding problems. Use as few indexes as you can get away with.
Don't put sessions in your database. Instead, store sessions using an encrypted string that the client stores in a cookie, or use an in-memory store like memcached.
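To make the cookie-based session idea a little more concrete, here is a minimal Python sketch using only the standard library. Note that it signs the session data with an HMAC rather than encrypting it, so the client can read the contents but cannot alter them undetected; real encryption would need an additional library. The names `SECRET_KEY`, `encode_session`, and `decode_session` are hypothetical and chosen for illustration.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical secret; in practice load a long random value from configuration.
SECRET_KEY = b"replace-with-a-long-random-secret"


def encode_session(data):
    """Serialize session data and append an HMAC-SHA256 signature."""
    raw = json.dumps(data, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw)
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_session(cookie):
    """Return the session dict, or None if the cookie is malformed or was tampered with."""
    try:
        payload, signature = cookie.rsplit(".", 1)
    except ValueError:
        return None  # no "." separator, so this is not one of our cookies
    expected = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))
```

With this approach the database is never touched to load a session: the server only recomputes the signature on each request. Anything that must stay hidden from the user should not go into such a cookie.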
Rick Branson at Quora
Other answers
Whatever you do, make sure that you do not store the photos in the database.
Tim Lossen
Avoid focusing too much on future load while developing the application initially. Most approaches to scalability issues only add unnecessary complexity, and will probably hurt performance if they are implemented too early. I suggest you start out simple with one server and one database. However, it would probably be a good idea to prepare your code for multiple databases in case you end up needing to shard your users and their data. As Tim Lossen said, you should avoid storing photos in the database and instead write them directly to disk. To prevent a bunch of database reads, cache whatever you can using memcached. Further down the road, if you still experience heavy database reads, you can set up MySQL replication in a master-slave topology where you direct all your writes to the master and most of your reads to the slave. It does duplicate the writes, though, since every write must also be applied on the slave. One thing you should think about is high bandwidth usage as a consequence of serving plenty of photos. On upload you should run the image through an appropriate lossless optimization tool. Investigate whether it would be financially viable to serve photos via a CDN (Content Delivery Network) in order to spare your servers these requests. It will also greatly improve the user experience in the form of faster response times.
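The master-slave read/write split described above can be sketched in a few lines of Python. This is an illustrative routing helper, not a feature of any MySQL driver: `ConnectionRouter`, `pick`, and `needs_fresh_read` are hypothetical names, and the connection objects are represented by plain labels.

```python
import random

# Statements that only read data; anything else is treated as a write.
READ_PREFIXES = ("SELECT", "SHOW")


class ConnectionRouter:
    """Chooses which database connection a statement should run on."""

    def __init__(self, master, slaves, rng=None):
        self.master = master
        self.slaves = list(slaves)
        self.rng = rng or random.Random()

    def pick(self, sql, needs_fresh_read=False):
        is_read = sql.lstrip().upper().startswith(READ_PREFIXES)
        # Writes always go to the master. Reads that must see the latest data
        # also go there, because a slave may lag behind the master.
        if not is_read or needs_fresh_read or not self.slaves:
            return self.master
        return self.rng.choice(self.slaves)
```

The `needs_fresh_read` flag is one simple way to cope with replication lag, for example when a user reloads a page immediately after uploading a photo.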
Birk Nilson
I don't think your bottleneck will be in the DB layer, but in photo management (upload, processing, storage), which will probably use orders of magnitude more CPU and file I/O than the DB layer. You should focus on splitting your application and letting all the photo upload, processing, and storage be handled by a cloud compute and storage service like Amazon EC2 and S3. That means your application and DB servers won't store or process any of the photos (not even temporary uploads). Once you have that in place and can properly scale the photo processing, you can focus on database scalability. It's also important to know the average size of the expected images (avatars, or full-quality professional photos?).
Carlos Fdz
I'd recommend hiring someone clueful in this space (someone who can write answers like the folks above here). Either as an employee or a consultant. You can waste a vast amount of time (during which your service is very visible, sucking in public view) fixing newbie scaling mistakes. Look for someone who has been through the scale-up of at least one production site, in the trenches.
David Boreham
Have a good plan for load testing, and be sure to watch your system stats as you slowly turn on the traffic tap. For the most part, the fact that this is an FB app doesn't change any of the recommendations you might find on general web app architecture. You'll still be dealing with the same data flows and storage requirements. And 100% agreed with the previous commenters -- don't store the files in MySQL. Store them on disk/CDN and store URLs in your DB. You will kill yourself trying to manage a MySQL DB with files like that. Also, build quick and test it out. Without any context on your product or business, I assume you're in a startup, and you should remember that many traffic estimates turn out to be wrong. If you make sure that you're horizontally scalable and you have a good team on hand, you'll be fine starting small, and it'll conserve your resources when you inevitably have to make changes.
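As a rough illustration of "store URLs in your DB", the Python sketch below keeps only a small metadata row per photo. It uses the standard-library `sqlite3` module as a stand-in for MySQL so that it runs anywhere; the table layout, the function names, and the example URL are all assumptions made for this sketch.

```python
import sqlite3

# In-memory SQLite database standing in for the application's MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE photos (
        id        INTEGER PRIMARY KEY,
        user_id   INTEGER NOT NULL,
        url       TEXT    NOT NULL,  -- where the file lives (disk path or CDN URL)
        byte_size INTEGER NOT NULL
    )
    """
)


def record_photo(conn, user_id, url, byte_size):
    """Store a reference to an already-uploaded file; the bytes never enter the DB."""
    cur = conn.execute(
        "INSERT INTO photos (user_id, url, byte_size) VALUES (?, ?, ?)",
        (user_id, url, byte_size),
    )
    conn.commit()
    return cur.lastrowid


def photo_urls_for(conn, user_id):
    """Return the stored URLs for one user, oldest first."""
    rows = conn.execute(
        "SELECT url FROM photos WHERE user_id = ? ORDER BY id", (user_id,)
    )
    return [row[0] for row in rows]
```

Each row stays a few hundred bytes regardless of how large the photo is, which keeps backups, replication, and queries on the table cheap.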
Tim Rosenblatt
OK, after building architectures on all of the above platforms, I would suggest looking at http://xeround.com/. DAAS (database as a service) is the next generation of DB hosting. I do not work for Xeround, but I appreciate what they are doing.
Pinaki Saha
I would strongly recommend a call to Joyent. They will actually help you with the ideal FB DB architecture, whether it's a MySQL setup or something like Riak, which is a distributed NoSQL system. They run big games/apps on FB already, including LinkedIn's Bumpersticker (Rails), Kingdoms of Camelot from Kabam (formerly Watercooler), Family Feud from Backstage Games, THQ, CountryLife and more. These guys know how to scale apps very quickly, growing from zero to 1 million users in 46 days, averaging 8 million MAU and over 2 billion page views.
Nima Badiey
Did this scale up as fast as expected? I'm just curious.
Alexandru Rada