After our last post about benchmarking graph databases, Neo Technology representatives contacted us and said that they repeated the shortest path queries on Neo4j with the Facebook dataset and they observed 2-4 orders of magnitude better performance. This was an intriguing comment and we were very interested in finding out why this could be the case.
The numbers reported in the earlier post were all (1) with cold cache, i.e., resetting cache after every run, (2) using two edges to represent every connection in the undirected graph, and (3) based on weighted Dijkstra’s shortest path. We had the same setup for all systems (e.g., for Vertica we also measured weighted Dijkstra’s with a cold cache and two unidirectional edges for each undirected edge.) In order to understand the differences between our numbers and those reported to us by Neo Technology representatives, we made the following (step-by-step) changes in our experimental setup:
- Exclude the connection and other instantiation times. For Neo4j, this means to exclude the following two lines from time measurement:
g = new Neo4jGraph(DATA_DIR);
dijkstra = GraphAlgoFactory.dijkstra(Traversal.expanderForAllTypes(), CommonEvaluators.doubleCostEvaluator(“weight”));
- These commands set up the database connection and load the graph algorithm. Including or excluding these could depend on whether we run a one-off analytical query or a workload of several queries.
- Enabling warm cache (cache_type=strong), along with running a warm-up query before actual time measurements. Whether warm or cold caches are more representative of the real world is unclear: an exploratory analysis is likely to encounter warm cache while a daily log analysis is likely to see cold cache.
- Using a simpler shortest path implementation in Neo4j than Dijkstra’s.
By default, Neo4j traverses relationships in both directions, although a user could also specify the traversal direction. Therefore, modeling an undirected edge as a single relationship instead of two relationships in Neo4j is recommended. We fixed this data modeling problem. We re-ran our shortest path benchmark on Neo4j using the new input file and applied the above four changes one by one. As in the previous post, we ran the experiment on a single machine having CentOS 6.4, 2.4 GHz Intel Xeon processor, and 48GB of memory. We repeated each measurement 100 times. Table 1 shows the overall results and Figure 1 shows the mean run times.
We can see from Figure 1 that the measured run times vary from several seconds to a few milliseconds, depending on how and what is measured. This explains why the numbers observed by Neo Technology representatives were 2-4 orders of magnitude different from what we reported previously.
Our conclusions from this are that, like any of the complex systems we tested, properly tuning Neo4j can be tricky and getting optimal performance may require some experimentation with parameters. Whether a user of Neo4j can expect to see runtimes on graphs like this measured in milliseconds or seconds depends on workload characteristics (warm / cold cache) and whether setup steps can be amortized across many queries or not.
These new results are useful extensions to our ongoing effort to benchmark and understand different systems for large-scale graph analytics. We highly appreciate the feedback from Neo Technology representatives!