What are the best data model and database system for storing a social graph?

Is it fair to benchmark the Titan database against Neo4j on a single database instance in isolation?

As far as I know, Neo4j is not a distributed database and runs very fast on a single server:

"Neo4j is a robust (fully ACID) transactional http://www.neo4j.org/learn/graphdatabase database. Due to its graph data model, Neo4j is highly agile and blazing fast. For connected data operations, http://www.neo4j.org/learn/neo4j runs a thousand times faster than relational databases."

Titan is a graph database based on Cassandra (highly scalable and distributed):

"Titan is a scalable http://en.wikipedia.org/wiki/Graph_database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. Titan is a transactional database that can support http://thinkaurelius.com/2012/08/06/titan-provides-real-time-big-graph-data/ executing http://thinkaurelius.com/2013/05/13/educating-the-planet-with-pearson/. (http://thinkaurelius.github.io/titan/)"

So, before running any benchmark, I would like to know: is it a fair comparison between a non-distributed database (Neo4j) running on a single machine and a distributed database (Titan) that may only gain an advantage in "shard architectures"?

Answer:

If you need a distributed graph because of the amount of data you expect to store, then it would be fairer to build your test around that amount of data rather than a minimal subset, so that the distributed capability is actually used.

On a single machine, Neo4j is considerably faster. I think the Titan team even published a simple graph showing this. Index-free adjacency, which speeds up traversals, is slowed down considerably when the graph is scattered over multiple servers.

In short: if you need a distributed graph, compare solutions that support that. If you don't, compare solutions that are not trying to solve that use case.
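To make the point about scattered graphs concrete, here is a minimal sketch in plain Python. It does not use Neo4j or Titan at all; it only models adjacency lists in memory and counts how many followed edges cross a partition boundary during a two-hop traversal. The names `build_graph`, `traverse`, `EDGES` and `PLACEMENT` are hypothetical and exist only for this illustration, and the placement of vertices on two "servers" is an arbitrary assumption.

```python
from collections import deque

# Hypothetical toy social graph (undirected friendships).
EDGES = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "dave"),
    ("carol", "dave"),
    ("dave", "erin"),
]

# Hypothetical assignment of vertices to two servers (0 and 1).
PLACEMENT = {"alice": 0, "bob": 1, "carol": 0, "dave": 1, "erin": 0}


def build_graph(edges):
    """Build adjacency lists: each vertex maps directly to its neighbours."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def traverse(graph, start, depth, partition_of):
    """Breadth-first walk of at most `depth` hops from `start`.

    Returns the set of reached vertices and the number of followed edges
    whose endpoints live on different partitions. In a real cluster each
    such edge would likely cost a network round trip (an assumption of
    this model, not a measurement).
    """
    visited = {start}
    remote_hops = 0
    frontier = deque([(start, 0)])
    while frontier:
        vertex, dist = frontier.popleft()
        if dist == depth:
            continue  # do not expand vertices at the depth limit
        for neighbour in graph[vertex]:
            if partition_of(vertex) != partition_of(neighbour):
                remote_hops += 1
            if neighbour not in visited:
                visited.add(neighbour)
                frontier.append((neighbour, dist + 1))
    return visited, remote_hops


if __name__ == "__main__":
    graph = build_graph(EDGES)
    # Everything on one machine: no edge crosses a partition.
    print(traverse(graph, "alice", 2, lambda v: 0)[1])
    # Same traversal with the graph split over two machines.
    print(traverse(graph, "alice", 2, PLACEMENT.get)[1])
```

The traversal reaches the same vertices in both cases; only the number of cross-partition edges changes, which is the cost a single-instance benchmark never exercises.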

Other answers

Benchmarks are very hard to get right. You can take a look at a few comparisons here:

1) http://db-engines.com/en/system/Neo4j%3BTitan
2) https://docs.google.com/spreadsheet/ccc?key=0AlHPKx74VyC5dERyMHlLQ2lMY3dFQS1JRExYQUNhdVE#gid=0

--> Neo4j is part open source, part paid (when you want an HA cluster).
--> Neo4j has its own proprietary database (file structure).

whereas,

--> Titan is Apache licensed (completely open source).
--> Titan relies on Cassandra, HBase, etc. for its storage.

Vishvesh Deshmukh

Neo4j is faster on a single machine; Titan will scale better for very large graphs (Neo4j, as far as I know, has writes going through the master as its main bottleneck). As someone said, Cassandra/HBase make it possible to have a fully distributed storage layer.

One very important thing to keep in mind is the read/write requirements and their implications for I/O demands. If you have a lot of reads, you want to avoid I/O by writing sequentially (which is probably slower). If you have a lot of writes, I/O becomes less important. Read http://www.violin-memory.com/blog/understanding-io-random-vs-sequential/ on the topic.

Interesting explanations on Titan and OrientDB here:

Flavio Graf
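On the sequential versus random I/O point in the answer above, the difference in access pattern can be sketched with the Python standard library alone. This is only an illustration of the two patterns, not a measurement of any graph database: `read_blocks`, `compare` and the 4096-byte `BLOCK` size are assumptions made for the example, and on a small, freshly written file the operating system's cache will probably hide most of the timing difference.

```python
import os
import random
import tempfile
import time

BLOCK = 4096  # assumed block size in bytes, chosen only for illustration


def read_blocks(path, offsets):
    """Read one BLOCK-sized chunk at each offset.

    Returns (total bytes read, elapsed seconds).
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        for offset in offsets:
            f.seek(offset)
            total += len(f.read(BLOCK))
    return total, time.perf_counter() - start


def compare(num_blocks=256, seed=0):
    """Read the same temporary file in sequential and in shuffled order."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(BLOCK * num_blocks))
        path = tmp.name
    try:
        sequential = [i * BLOCK for i in range(num_blocks)]
        shuffled = sequential[:]
        random.Random(seed).shuffle(shuffled)  # same blocks, random order
        return {
            "sequential": read_blocks(path, sequential),
            "random": read_blocks(path, shuffled),
        }
    finally:
        os.remove(path)


if __name__ == "__main__":
    for pattern, (nbytes, seconds) in compare().items():
        print(f"{pattern}: {nbytes} bytes in {seconds:.6f} s")
```

Both runs read exactly the same bytes; only the order of the offsets differs. For a meaningful comparison the file would need to be much larger than available memory and stored on the kind of disk the database will actually use.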
