Mind Your Matches
I recently had to track down a performance issue with one of our Cypher queries, which was taking an obscenely long time to run (15 seconds!) given how simple the query is.
Can you spot the problem?
MATCH (dist:Distribution), (user:User)
MATCH (docType:DocumentType)
WHERE dist.Uid = '...'
AND user.Uid = '...'
AND docType.Uid = '...'
It was not easy to track down as this is an entirely valid query that gives the exact result desired.
However, the WHERE clause belongs to the MATCH immediately before it, so the first MATCH has no filter of its own and will load every (!) Distribution and User in the graph into memory.
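One way to check this, assuming a Neo4j version that supports PROFILE, is to prefix the query with it and look at the row counts in the resulting plan. This is only a sketch: the plan you see depends on the Neo4j version and query planner, and the RETURN clause is an assumption added here to make the query complete.

```cypher
// PROFILE runs the query and reports how many rows each operator produced.
PROFILE
MATCH (dist:Distribution), (user:User)
MATCH (docType:DocumentType)
WHERE dist.Uid = '...'
  AND user.Uid = '...'
  AND docType.Uid = '...'
// RETURN is an assumption added for illustration; the original query's tail is not shown.
RETURN dist, user, docType
```

A CartesianProduct operator with a large row count ahead of the filters is the thing to look for.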
Putting all three patterns into a single MATCH, so the WHERE applies to all of them, fixes it:
MATCH (dist:Distribution), (user:User), (docType:DocumentType)
WHERE dist.Uid = '...'
AND user.Uid = '...'
AND docType.Uid = '...'
It was an easy fix, as you can see, but also an easy mistake to make!
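If keeping the patterns in separate clauses reads better, another option is to give each MATCH its own WHERE so that no clause is left unfiltered. This is a sketch of that alternative, not the fix we actually applied, and the RETURN clause is again an assumption added to complete the query.

```cypher
// Each WHERE filters the MATCH directly above it.
MATCH (dist:Distribution), (user:User)
WHERE dist.Uid = '...'
  AND user.Uid = '...'
MATCH (docType:DocumentType)
WHERE docType.Uid = '...'
// RETURN is an assumption added for illustration.
RETURN dist, user, docType
```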
Probably easier to spot if you write the queries that just do a lookup on a property as:
MATCH (dist:Distribution {Uid:{distUid}}), (user:User {Uid:{userUid}}), (docType:DocumentType {Uid:{docTypeUid}})
….
That might be the case. However, we currently use the Neo4jClient library (.NET), and using .Where() seems to make the queries easier for our devs to read.