Calculating Date Differentials In Neo4j
(Or calculating date differentials in any environment where you don’t have a date object)
For all the awesomeness of graph databases and Neo4j, the lack of a native date object seems a glaring omission and makes it difficult to calculate date differentials if you have two date strings and did not have the foresight to store it as a numeric value.
To get around this, you will need to either update all of your relevant nodes to calculate a numeric value which can be used for date operations. Alternatively, you can use the magic of Julian Dates! Why Julian? Because it’s a continuous numeric value starting from the beginning of the Julian Period and can be “easily” calculated given a date string.
Consider a case where we have a task and a state and we want the average number of days between the created date of the task and the created date of the state.
In the sample Cypher below, the dates are stored in yyyy-MM-dd format:
toInt(substring(task.CreatedUtc, 0, 4)) AS task_year,
toInt(substring(task.CreatedUtc, 5, 2)) AS task_month,
toInt(substring(task.CreatedUtc, 8, 2)) AS task_day,
toInt(substring(state.CreatedUtc, 0, 4)) AS state_year,
toInt(substring(state.CreatedUtc, 5, 2)) AS state_month,
toInt(substring(state.CreatedUtc, 8, 2)) AS state_day,
substring(task.CreatedUtc, 0, 10) AS task_created,
substring(state.CreatedUtc, 0, 10) AS state_created
task_year, task_month, task_day, floor((14-task_month)/12) AS task_a,
state_year, state_month, state_day, floor((14-state_month)/12) AS state_a,
task_year, task_month, task_day, task_a, task_year + 4800 - task_a AS task_y, task_month + (12 * task_a) - 3 AS task_m,
state_year, state_month, state_day, state_a, state_year + 4800 - state_a AS state_y, state_month + (12 * state_a) - 3 AS state_m,
task_day + floor(((153 * task_m) + 2)/5) + (365 * task_y) + floor(task_y/4) - floor(task_y/100) + floor(task_y/400) - 32045 AS task_jd,
state_day + floor(((153 * state_m) + 2)/5) + (365 * state_y) + floor(state_y/4) - floor(state_y/100) + floor(state_y/400) - 32045 AS state_jd,
RETURN avg(state_jd - task_jd) AS averageTimeToCompletionForScope
The actual algorithm is included below from the Wiki article:
Have you seen the apoc library with it’s apoc.date.parse and apoc.date.format functions(3.1) / procedures (3.0) ?
WITH apoc.date.parse(task.CreatedUtc,’s’,’yyyy-MM-dd’) as task_s,
apoc.date.parse(state.CreatedUtc,’s’,’yyyy-MM-dd’) as state_s
RETURN apoc.date.format(state_s – task_s, ‘s’, ‘D’) as deltaDays
Customer is in 2.3.x
what about this though
“Apoc was the technician and driver on board of the Nebuchadnezzar in the Matrix movie. He was killed by Cypher.” so rad
3.x onwards only