SQL Query for Multi-Values In An Encoded String

Consider a table with a textual column that encodes multiple values like so:
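As a stand-in for such a table, here is a minimal sketch. The table variable `@People`, its column names, the comma delimiter, and every row other than Charles's US/CA/MX codes are assumptions made for illustration.

```sql
-- Hypothetical sample data: names and codes (apart from Charles's) are illustrative.
DECLARE @People TABLE (
    Id        INT IDENTITY(1,1) PRIMARY KEY,
    Name      NVARCHAR(50) NOT NULL,
    Countries VARCHAR(200) NOT NULL   -- comma-delimited country codes (assumed delimiter)
);

INSERT INTO @People (Name, Countries) VALUES
    ('Charles', 'US,CA,MX'),
    ('Steven',  'US'),
    ('Arturo',  'MX,BR'),
    ('John',    'CA,GB'),
    ('Maria',   'ES,PT');

SELECT Id, Name, Countries FROM @People;
```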

How can we query for all rows that match any one of several values (that is, using OR criteria)?

For a single value, it’s quite easy: just search for the string (in this case, a two-letter ISO country code). But what if we need to search for the occurrence of any one of n strings?
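The single-value case might look like the sketch below, assuming a hypothetical `@People` table whose `Countries` column is comma-delimited. Wrapping both the column and the search term in delimiters is one way to avoid matching across code boundaries.

```sql
-- Hypothetical data; the table, columns, and delimiter are assumptions.
DECLARE @People TABLE (Name NVARCHAR(50), Countries VARCHAR(200));
INSERT INTO @People (Name, Countries) VALUES
    ('Charles', 'US,CA,MX'),
    ('Steven',  'US'),
    ('Maria',   'ES,PT');

-- Pad the column with delimiters so the pattern only matches whole codes.
SELECT Name
FROM @People
WHERE ',' + Countries + ',' LIKE '%,US,%';
```

Extending this to n values means chaining n LIKE predicates with OR, which gets unwieldy quickly.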

The following query achieves this using the SQL Server xml data type, the nodes() method, and CROSS APPLY:
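A minimal sketch of such a query follows. The `@People` table, its columns, the comma delimiter, the `<c>` element name, and the target codes are all assumptions chosen for illustration.

```sql
-- Hypothetical data; only the names and Charles's codes come from the text.
DECLARE @People TABLE (Name NVARCHAR(50), Countries VARCHAR(200));
INSERT INTO @People (Name, Countries) VALUES
    ('Charles', 'US,CA,MX'),
    ('Steven',  'US'),
    ('Arturo',  'MX,BR'),
    ('John',    'CA,GB'),
    ('Maria',   'ES,PT');

SELECT DISTINCT p.Name
FROM (
    -- Turn 'US,CA,MX' into <c>US</c><c>CA</c><c>MX</c> via string replacement.
    SELECT Name,
           CAST('<c>' + REPLACE(Countries, ',', '</c><c>') + '</c>' AS XML) AS CountriesXml
    FROM @People
) AS p
-- Shred the XML: one output row per <c> element of each base row.
CROSS APPLY p.CountriesXml.nodes('/c') AS n(c)
WHERE n.c.value('.', 'VARCHAR(2)') IN ('US', 'CA', 'MX');
```

Note that the CAST will fail if the encoded values contain XML-special characters such as `&` or `<`; those would need escaping first.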

This should yield the values Charles, Steven, Arturo, and John. The query first converts the delimited values into XML using simple string replacement. Next, the XML is “shredded” using nodes(): for each base row, the shredding generates one row per node (for example, for Charles, we would have one row for US, one row for CA, and one row for MX).

Here is the result of the inner sub-select:
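The conversion step can be tried in isolation on a single value. The variable name and the `<c>` element name below are assumptions for illustration.

```sql
-- Hypothetical single encoded value, comma-delimited.
DECLARE @Countries VARCHAR(200) = 'US,CA,MX';

-- Wrap the string in <c>...</c> and replace each delimiter with a close/open tag pair.
SELECT CAST('<c>' + REPLACE(@Countries, ',', '</c><c>') + '</c>' AS XML) AS CountriesXml;
-- returns <c>US</c><c>CA</c><c>MX</c>
```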

And here is the resultset after shredding:

You can see the intermediate resultset using this query:
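A sketch of such a query, assuming a hypothetical `@People` table with a comma-delimited `Countries` column: it is the same shape as the filtering query, but without the DISTINCT and the WHERE clause, so every shredded node is visible.

```sql
-- Hypothetical data; the table, columns, and delimiter are assumptions.
DECLARE @People TABLE (Name NVARCHAR(50), Countries VARCHAR(200));
INSERT INTO @People (Name, Countries) VALUES
    ('Charles', 'US,CA,MX'),
    ('Steven',  'US');

-- One output row per (Name, Code) pair.
SELECT p.Name,
       n.c.value('.', 'VARCHAR(2)') AS Code
FROM (
    SELECT Name,
           CAST('<c>' + REPLACE(Countries, ',', '</c><c>') + '</c>' AS XML) AS CountriesXml
    FROM @People
) AS p
CROSS APPLY p.CountriesXml.nodes('/c') AS n(c);
```

With these two rows, the result has four rows: three for Charles (US, CA, MX) and one for Steven (US).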

Finally, the DISTINCT clause collapses the resultset once again, so a row that matches more than one of the target values is returned only once.

In my own testing, better performance can be achieved by creating a table variable with the target values and using a JOIN instead of an IN. For 100k records, using IN takes about 13.430s, while using a JOIN to a table variable takes about 2.293s (roughly one-sixth of the time).
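The JOIN variant might be sketched as follows. The `@Targets` and `@People` table variables, their columns, and the comma delimiter are assumptions for illustration; the timings above come from the original test, not from this sketch.

```sql
-- Hypothetical data; the table, columns, and delimiter are assumptions.
DECLARE @People TABLE (Name NVARCHAR(50), Countries VARCHAR(200));
INSERT INTO @People (Name, Countries) VALUES
    ('Charles', 'US,CA,MX'),
    ('Steven',  'US'),
    ('Arturo',  'MX,BR'),
    ('John',    'CA,GB'),
    ('Maria',   'ES,PT');

-- Target values live in a table variable instead of an IN list.
DECLARE @Targets TABLE (Code VARCHAR(2) PRIMARY KEY);
INSERT INTO @Targets (Code) VALUES ('US'), ('CA'), ('MX');

SELECT DISTINCT p.Name
FROM (
    SELECT Name,
           CAST('<c>' + REPLACE(Countries, ',', '</c><c>') + '</c>' AS XML) AS CountriesXml
    FROM @People
) AS p
CROSS APPLY p.CountriesXml.nodes('/c') AS n(c)
-- Keep only shredded codes that appear in the target table.
JOIN @Targets AS t
    ON t.Code = n.c.value('.', 'VARCHAR(2)');
```

A side benefit of this form is that the target list can be populated from a parameter or another query rather than being hard-coded.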
