Until this week, all of the recursive CTE posts I had read
used an employee hierarchy example. Then my colleague pointed me to the best
technical blog post I have ever read where Brad Schulz uses a sales running
total example in his blog post entitled, “This Article On Recursion Is Entitled “This Article On Recursion Is Entitled “This Article… ∞” ” Even if you are not interested
in SQL you should check out this article for the great story, writing style,
examples or recursion, and links to Wikipedia pages about things like fractals
and mobius strips.
My blog post is about using a recursive CTE in SQL Server
for part of the logic in the Center for Disease Control’s (CDC) Ventilator Associated Event (VAE) Protocol.
Imagine you have a temp table called #vent_days where each row
represents a patient and a day that they were on a mechanical ventilator. You
have already identified which vent days meet the criteria for a VAE day of
event except you haven’t yet taken into account the rule: when a VAE day of
event occurs, another VAE day of event cannot occur for 14 days. You have
filtered the temp table down to just those patient visits that have a VAE. Your temp table
has the columns represented by the following SELECT statement.
SELECT
patient_visit_id
, vent_day_date /*date of mechanical ventilation*/
, vent_day /*the number of days a patient is on a mechanical ventilator as of the vent_day_date*/
, pos_VAE_doe_flg /*possible VAE day of event flag. 1 indicates that this day meets the
criteria for a VAE day of event except the rule that a VAE day of event may not occur until 14 days after a previous VAE day of event*/
, pos_VAE_doe_order /*only populated where pos_VAE_doe_flg is 1. The ordered number of possible VAE days of event per patient visit.*/
FROM
#vent_days
Below is how you can use a recursive CTE to complete the
logic by fulfilling the criteria that a VAE day of event must not occur until
14 days have passed since a previous VAE day of event.
;
WITH VAE_doe (patient_visit_id, vent_day_date, vent_day, pos_VAE_doe_order, prev_doe, doe_flg)
AS (
/* the anchor query is the first possible VAE day of event for each patient visit. doe_flag is set to 1 because it will always be an actual day of event.*/
SELECT
v.patient_visit_id
, v.vent_day_date
, v.vent_day
, v.pos_VAE_doe_order
, 0 AS prev_doe
, 1 AS doe_flag
FROM
#vent_days AS v
WHERE
v.pos_VAE_doe_order = 1
AND v.pos_VAE_doe_flg = 1
UNION ALL
/* second query recurses across the anchor query*/
SELECT
v.patient_visit_id
, v.vent_day_date
, v.vent_day
, v.pos_VAE_doe_order
/* when the previous VAE date of event occurs less than 14 days early
then do not change prev_doe, otherwise set prev_doe equal to vent_day
and assign a 1 to doe_flag. */
, CASE WHEN v.vent_day - VAE_doe.doe_vent_day >= 14
THEN VAE_doe.vent_day ELSE VAE_doe.prev_doe END AS prev_doe
, CASE WHEN v.vent_day - VAE_doe.doe_vent_day >= 14
THEN 1 ELSE 0 END AS doe_flag
FROM
#vent_days AS v
INNER JOIN
VAE_doe
ON v.patient_visit_id = VAE_doe.patient_visit_id
AND v.pos_VAE_doe_order = VAE_doe.pos_VAE_doe_order + 1 /*for each visit recurse by order of possible VAE days of envet*/
WHERE
v.pos_VAE_doe_flg = 1
)
SELECT
v.patient_visit_id
, v.vent_day_date
, v.vent_day
, COALESCE(VAE_doe.doe_flg,0) AS doe_flg
FROM
#vent_days AS v
LEFT OUTER JOIN
VAE_doe
ON v.patient_visit_id = VAE_doe.patient_visit_id
AND v.vent_day = VAE_doe.vent_day
ORDER BY
v.patient_visit_id
, v.vent_day
OPTION(MAXRECURSION 25)
;
If I had read Brad Schulz’s post before coming up with this I
would have used a WHILE loop. Recursive CTEs do not use set-based processing but process everything row-by-row using a Stack Spool. I don’t know if I will ever use a recursive CTE
again.
Besides a possible argument for convenience, does a
legitimate reason for using a recursive CTE in SQL Server exist?