Fix APOC jaroWinklerDistance threshold (distance ≠ similarity)
Prerequisites: access to neo4j-kg (bolt://localhost:7688), and at least one Concept node in the graph whose canonicalName shares its first two characters with your test string.
Expected outcome after the fix: the entity resolution test reports pass=jaro_winkler for near-identical strings (similarity >= 0.92), and the JW pass fires before the embedding and LLM passes.
1. Confirm the symptom: entity resolution pass=1 (JW) never fires even for near-identical strings. Run the manual test:
MATCH (c:Concept) WHERE c.canonicalName = "OpenAI"
RETURN apoc.text.jaroWinklerDistance("OpenAI", "OpenAI Inc") AS dist
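Expected result: ~0.079. This is a distance (1 - similarity), not the similarity 0.921, so a filter that expects a high score for a close match can never succeed.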
2. Locate the Cypher query in lib/entity-resolver.js that calls apoc.text.jaroWinklerDistance.
3. The query must filter on distance, not similarity. Use:
WHERE dist < $distThreshold
where distThreshold = 1 - desiredSimilarityThreshold (e.g. 1 - 0.92 = 0.08).
Do NOT use: WHERE score > 0.92
4. The ORDER BY clause must be "ORDER BY dist ASC" (lowest distance = closest match).
5. The returned confidence must be computed as "(1 - dist)", not dist, so that a higher confidence means a closer match.
6. Rebuild and redeploy zil-graph-worker, then rerun the entity resolution test:
docker exec -e NEO4J_KG_PASSWORD="<password>" zil-graph-worker node test/test-entity-resolution.js
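The changes in steps 3 to 5 can be sketched in JavaScript as follows. This is a minimal illustration, not the actual contents of lib/entity-resolver.js: the function names, the exact query shape (including the two-character prefix filter), and the second sample row are assumptions. The in-memory helper mirrors the WHERE / ORDER BY / confidence logic so it can be checked without a database.

```javascript
// Sketch only: names, query shape, and sample rows are assumptions,
// not the real code in lib/entity-resolver.js.

// Step 3: convert a desired similarity threshold into a distance threshold.
function toDistanceThreshold(similarityThreshold) {
  return 1 - similarityThreshold;
}

// Step 5: convert a Jaro-Winkler distance into a confidence (similarity).
function toConfidence(dist) {
  return 1 - dist;
}

// Assumed shape of the corrected Cypher query (steps 3 to 5).
// The prefix filter on the first two characters is a guess at how
// candidates are narrowed before the distance is computed.
const JW_QUERY = `
MATCH (c:Concept)
WHERE left(c.canonicalName, 2) = left($name, 2)
WITH c, apoc.text.jaroWinklerDistance(c.canonicalName, $name) AS dist
WHERE dist < $distThreshold
RETURN c.canonicalName AS name, dist, 1 - dist AS confidence
ORDER BY dist ASC
LIMIT 1`;

// In-memory stand-in for the query logic: keep rows under the distance
// threshold, sort by ascending distance, and return the closest one.
function pickJwMatch(rows, similarityThreshold) {
  const distThreshold = toDistanceThreshold(similarityThreshold);
  const hits = rows
    .filter((row) => row.dist < distThreshold)
    .sort((a, b) => a.dist - b.dist);
  if (hits.length === 0) return null;
  const best = hits[0];
  return {
    name: best.name,
    confidence: toConfidence(best.dist),
    pass: "jaro_winkler",
  };
}

// The buggy filter: treats a distance as if it were a similarity,
// so close matches (small distances) are always rejected.
function buggyPick(rows) {
  return rows.filter((row) => row.dist > 0.92);
}

const sampleRows = [
  { name: "OpenAI", dist: 0.079 }, // value from the manual test in step 1
  { name: "OpenAPI", dist: 0.2 },  // illustrative value, not measured
];

console.log(pickJwMatch(sampleRows, 0.92)); // closest row under the threshold
console.log(buggyPick(sampleRows));         // empty: the JW pass never fires
```

When running the real query, the parameters would be passed as something like { name, distThreshold: toDistanceThreshold(0.92) }, keeping the similarity threshold as the single configured value and deriving the distance threshold from it.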