Is it fair to benchmark the Titan database against Neo4j on a single database instance in isolation?
As far as I know, Neo4j is not a distributed database and runs really fast on a single server:

"Neo4j is a robust (fully ACID) transactional property graph database (http://www.neo4j.org/learn/graphdatabase). Due to its graph data model, Neo4j is highly agile and blazing fast. For connected data operations, Neo4j (http://www.neo4j.org/learn/neo4j) runs a thousand times faster than relational databases."

Titan is a graph database built on Cassandra (highly scalable and distributed):

"Titan is a scalable graph database (http://en.wikipedia.org/wiki/Graph_database) optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. Titan is a transactional database that can support thousands of concurrent users (http://thinkaurelius.com/2012/08/06/titan-provides-real-time-big-graph-data/) executing complex graph traversals (http://thinkaurelius.com/2013/05/13/educating-the-planet-with-pearson/)." (http://thinkaurelius.github.io/titan/)

So, before running any benchmark, I would like to know whether this is a fair comparison, given that Titan is a distributed database that may only gain an advantage in sharded architectures.
Answer:
If you need a distributed graph because of the amount of data you expect to store, then I think it would be fairer to build your test around that amount of data rather than a minimal subset, so that the distributed capability is actually used. On a single server, Neo4j is considerably faster. I think the Titan team even published a simple graph showing this. Index-free adjacency, which speeds up traversals, is slowed down considerably if the graph is scattered over multiple servers. In short: if you need a distributed graph, then compare solutions that support that. If you don't, then compare solutions that are not trying to solve that use case.
Stefan Baxter at Quora
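The point in the answer above about traversals slowing down once a graph is scattered over several servers can be illustrated with a toy model. This is only a sketch under stated assumptions: the function name `count_remote_hops` and the placement rule `vertex % num_partitions` are hypothetical, invented for the illustration, and say nothing about how Titan or Neo4j actually place data. The sketch counts how many edge hops in a breadth-first traversal stay inside one partition and how many cross a partition boundary, which on a real cluster would mean a network round trip.

```python
from collections import deque


def count_remote_hops(adjacency, start, num_partitions):
    """Breadth-first traversal that counts local and cross-partition edge hops.

    Vertices are assigned to partitions by `vertex % num_partitions`,
    a hypothetical placement rule used only for this illustration.
    Returns a tuple (local_hops, remote_hops).
    """
    def partition(vertex):
        return vertex % num_partitions

    seen = {start}
    queue = deque([start])
    local = 0
    remote = 0
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            # Every examined edge is either local or crosses a partition.
            if partition(u) == partition(v):
                local += 1
            else:
                remote += 1
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return local, remote


# A small ring of 8 vertices, each linked to its two neighbours.
ring = {v: [(v - 1) % 8, (v + 1) % 8] for v in range(8)}

single = count_remote_hops(ring, 0, 1)   # everything on one server
sharded = count_remote_hops(ring, 0, 4)  # spread over four partitions
print("single server (local, remote):", single)
print("four partitions (local, remote):", sharded)
```

With one partition every hop is local. With four partitions under this deliberately unfavourable placement, every hop crosses a boundary. A benchmark that runs on a single instance never exercises that second case, which is why the data size and deployment should match what you intend to run in production.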
Other answers
Benchmarks are very hard to get right. You can take a look at a few comparisons here:
1) http://db-engines.com/en/system/Neo4j%3BTitan
2) https://docs.google.com/spreadsheet/ccc?key=0AlHPKx74VyC5dERyMHlLQ2lMY3dFQS1JRExYQUNhdVE#gid=0
- Neo4j is part open source, part paid (when you want an HA cluster).
- Neo4j has its own proprietary database format (file structure).
whereas
- Titan is Apache licensed (completely open source).
- Titan relies on Cassandra, HBase, etc. for its storage.
Vishvesh Deshmukh
Neo4j is faster on a single machine; Titan will scale better for very large graphs (Neo4j, as far as I know, has writes going through the master as its main bottleneck). As someone said, Cassandra/HBase make it possible to have a fully distributed storage layer. One very important thing to keep in mind is your read/write requirements and their implications for I/O demands. If your workload is read-heavy, you want to avoid random I/O at read time by writing sequentially (which will probably make writes slower). If your workload is write-heavy, read I/O becomes less important. Read http://www.violin-memory.com/blog/understanding-io-random-vs-sequential/ on the topic. Interesting explanations on Titan and OrientDB here:
Flavio Graf
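The random versus sequential I/O distinction raised in the answer above can be made concrete with a small model. This is a rough sketch, not a measurement: the helper `count_seeks` is a hypothetical name, and "seek" here simply means an access that does not continue directly from the previously read block. Real storage engines and devices behave in more complicated ways.

```python
import random


def count_seeks(block_order):
    """Count accesses that do not continue directly from the previous block."""
    seeks = 0
    previous = None
    for block in block_order:
        # The first access, and any jump to a non-adjacent block, is a seek.
        if previous is None or block != previous + 1:
            seeks += 1
        previous = block
    return seeks


blocks = list(range(1000))

# Data laid out and read in order: one initial seek, then a straight scan.
sequential_seeks = count_seeks(blocks)

# The same blocks visited in a shuffled order (fixed seed for repeatability).
shuffled = blocks[:]
random.Random(42).shuffle(shuffled)
random_seeks = count_seeks(shuffled)

print("sequential access seeks:", sequential_seeks)
print("random access seeks:", random_seeks)
```

When benchmarking either database, it may be worth checking which of these two access patterns your test queries actually produce, since a workload that happens to scan data in storage order can flatter any backend.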