SharePoint, Large Lists, Content Iterator, and an Alternative

When retrieving large datasets from large SharePoint lists, SharePoint 2010 provides a new class called ContentIterator (CI) that is supposed to help make large dataset retrievals from large lists possible.

There’s a bunch of great documentation on the web regarding this class, but one interesting observation that I’ve made is that it seems that it limits your query to one field only.  This means that in your query’s where clause, you can only include one field, even when used with the order clause generated by ContentIterator.ItemEnumerationOrderByNVPField.

I tested with a list containing over 22,000 items with the default thresholds:

list-settings

And randomly generated data like so:

random-data

It turns out that if I use more than one field in the query, even with an index on each field in the query and using SPQueryThrottleOption.Override, the CI will fail the query with a threshold error.

What’s one to do if you need to get all of the items in a list?

It seems that you should be able to just simply write a loop that executes the query and retrieves data, page-by-page, until you reach the end of the set.  So I rigged up the code myself:

This can be invoked like so:

It turns out that this works just fine provided that at least one of the columns in your query has an index. I tested with indices on all columns, on two columns, on one column, and on no columns.  With no indices, this query will fail as well (must have at least one index on one of the columns in your query in my testing, but my guess is that you will need more as the number of items increases).  With one to three indices, it made no difference in performance.  In fact, it got a little slower with three indices.

The batch size also had an impact.  Larger batch sizes were more efficient, which makes sense given that another database roundtrip is made for each batch.  For 32,000 items (I added more to test), a batch of 1000 (seems to be a sweet spot in my testing) completed in 2487ms for 7902 items.  A batch of 500 completed in 2550ms.  A batch of 100 completed in 3116ms.

It doesn’t have the fancy bits of the CI, but it will work with multiple columns in your where clause.

To test for yourself, you can download this handy-dandy, multi-threaded list filler

You may also like...

2 Responses

  1. October 1, 2013

    […] SharePoint, Large Lists, Content Iterator, and an Alternative […]

  2. October 1, 2013

    […] SharePoint, Large Lists, Content Iterator, and an Alternative […]