<CharlieDigital/> Programming, Politics, and uhh…pineapples

12Aug/14

Invoking Custom WCF Services in SharePoint with Claims

Posted by Charles Chen

In SharePoint, if you host a custom WCF service in a claims-enabled application, authenticating via NTLM is actually quite tricky if you are attempting to invoke the service from, say, a console application.

There are various articles and Stack Overflow entries on using System.ServiceModel.Description.ClientCredentials on either the ChannelFactory or the client instance, but none of these worked: on the server side, SPContext.Current.Web.CurrentUser was null and ServiceSecurityContext.Current.IsAnonymous returned true.

It seems like it should be possible to invoke the service authenticating through NTLM as if the user were accessing it through the web site.

In fact, it is possible, but without doing Windows Identity Foundation programming (and consequently setting up tons of infrastructure), it requires some manual HTTP requests to get what seems like a relatively simple and straightforward scenario to work.

The first step is to actually manually retrieve the FedAuth token:

/// <summary>
///     Gets a claims based authentication token by logging in through the NTLM endpoint.
/// </summary>
/// <returns>The FedAuth token required to connect and authenticate the session.</returns>
private string GetAuthToken()
{
    string authToken = string.Empty;

    CredentialCache credentialCache = new CredentialCache();
    credentialCache.Add(new Uri(_portalBaseUrl), "NTLM", new NetworkCredential(_username, _password, _domain));

    HttpWebRequest request = WebRequest.Create(string.Format(
        "{0}/_windows/default.aspx?ReturnUrl=%2f_layouts%2fAuthenticate.aspx%3fSource%3d%252F&Source=%2F",
        _portalBaseUrl)) as HttpWebRequest;
    request.Credentials = credentialCache;
    request.AllowAutoRedirect = false;
    request.PreAuthenticate = true;

    // SharePoint seems to reject requests without these headers (403 Forbidden).
    request.UserAgent = "Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko";
    request.Accept = "text/html, application/xhtml+xml, */*";

    using (HttpWebResponse response = request.GetResponse() as HttpWebResponse)
    {
        // The FedAuth token is returned in the Set-Cookie header.
        authToken = response.Headers["Set-Cookie"];
    }

    return authToken;
}

There are three keys here:

  1. The first is that AllowAutoRedirect must be false or you will get an error that there are too many redirects.  The cookies are not carried correctly across automatic redirects, so the redirect chain continues until an exception is thrown.  In Fiddler, this shows up as a long cycle of requests and redirects.
  2. The second is that the URL must be the NTLM authentication endpoint (/_windows/default.aspx); any other URL will return a 302, which you cannot follow with AllowAutoRedirect set to false.
  3. The third is that SharePoint seems to reject requests when the user agent and accept headers are not included.  I tried it later without these and it seemed to work, but initially I could not get it to work without them (403 Forbidden).

Once you have the FedAuth token, you can effectively impersonate the user.  To do so, you will need to include the cookie in your outgoing HTTP request headers:

// Get the FedAuth cookie
var authToken = GetAuthToken();

// Create the connection artifacts.            
EndpointAddress endpointAddress = new EndpointAddress(endpointUrl);
BasicHttpBinding binding = new BasicHttpBinding();            

ChannelFactory<ISomeService> channelFactory = 
    new ChannelFactory<ISomeService>(binding, endpointAddress);

// Initiate the client proxy using the connection and binding information.
ISomeService client = channelFactory.CreateChannel();

using (new OperationContextScope((IContextChannel) client))
{
    // Set the authentication cookie on the outgoing WCF request.
    WebOperationContext.Current.OutgoingRequest.Headers.Add("Cookie", authToken);

    // YOUR API CALLS HERE    
}

The key is to add the header on the outgoing request before making your service API calls.
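As an aside, WebOperationContext is primarily designed for WebHttpBinding; if it is unavailable in your setup, the same cookie can be attached through an HttpRequestMessageProperty instead. This is a hedged sketch of that alternative (client and authToken are from the example above; exact behavior should be verified against your binding):

```csharp
using (new OperationContextScope((IContextChannel) client))
{
    // Attach the FedAuth cookie to the underlying HTTP request.
    var httpRequest = new HttpRequestMessageProperty();
    httpRequest.Headers[HttpRequestHeader.Cookie] = authToken;

    OperationContext.Current.OutgoingMessageProperties[
        HttpRequestMessageProperty.Name] = httpRequest;

    // YOUR API CALLS HERE
}
```

HttpRequestMessageProperty lives in System.ServiceModel.Channels and works with plain BasicHttpBinding, so it does not require a reference to System.ServiceModel.Web.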

With this, you should see that you are able to invoke SharePoint hosted custom WCF service calls in claims-based web applications with NTLM authentication.

Filed under: .Net, SharePoint, WCF No Comments
2Mar/14

Programmatically Add SharePoint Lists with Schemas

Posted by Charles Chen

So you want to add a custom list with a schema, eh?

In SharePoint, this is "easily" (lol) done by adding a custom list with a Schema.xml file and a list template XML file.

But what if you don't want to add a custom list template and you want to do it programmatically?  You'd want to do this, of course, if you wanted to define custom views on the list.  I've seen this done programmatically (as in verbose, custom code to build lists, add content types, build views, etc.), but SharePoint already offers you a mechanism for defining custom views using the list schema XML file.  Why duplicate what SharePoint already gives you for free?

In looking through the API, it seems that there is a call that would support this, but how it's actually invoked is quite cryptic.

After a bit of testing, I found that it's actually quite easy.

Here is the API definition from Microsoft:

public virtual Guid Add(
   string title,
   string description,
   string url,
   string featureId,
   int templateType,
   string docTemplateType,
   string customSchemaXml,
   SPFeatureDefinition listInstanceFeatureDefintion,
   SPListTemplate.QuickLaunchOptions quickLaunchOptions
)

Here is the invocation in PowerShell:

$listId = $web.Lists.Add("Test List", "Test", "TestList2", 
    "00BFEA71-DE22-43B2-A848-C05709900100", 100, "100", $xml, $feature, 0)

A little explanation is in order here.  The first three parameters are straightforward.  The fourth one is where it starts to get "funny".  Here, you will want to search your 14\TEMPLATE\FEATURES\ directory for the feature that contains the template that you want to use.  In this case, I am creating a list based on the generic custom list type, so the feature is located in 14\TEMPLATE\FEATURES\CustomList.  You need the GUID from that feature's Feature.xml file here, not your own custom feature GUID.

The fifth and sixth parameters are straightforward.

We'll skip the seventh parameter for now.

The eighth parameter is the feature definition that contains the template your list will be based on.  Because we are using an out-of-the-box list template, we simply load the feature definition for the GUID in parameter four:

$features = [Microsoft.SharePoint.Administration.SPFarm]::Local.FeatureDefinitions

$id = [Guid]("00BFEA71-DE22-43B2-A848-C05709900100") 

$feature = $features[$id]

Again, because we are using the out-of-the-box template, we need to use the out-of-the-box feature definition that contains the template.

The ninth parameter is, again, straightforward.

Now back to that seventh parameter.  It is simply the XML that would be generated by adding a new list in Visual Studio.  I've added a simple example here:

<?xml version='1.0' encoding='utf-8'?>
<List xmlns:ows='Microsoft SharePoint' Title='List1' FolderCreation='FALSE' Direction='$Resources:Direction;' Url='Lists/List1' BaseType='0' EnableContentTypes='True' xmlns='http://schemas.microsoft.com/sharepoint/'>
    <MetaData>
        <ContentTypes>
            <ContentTypeRef ID='0x01'>
                <Folder TargetName='Item' />
            </ContentTypeRef>
            <ContentTypeRef ID='0x0120' />
            <ContentTypeRef ID='0x010800B66F73F12643464793530152868EEE87'/>
        </ContentTypes>
        <Fields>
            <Field ID='{fa564e0f-0c70-4ab9-b863-0177e6ddd247}' Type='Text' Name='Title' DisplayName='$Resources:core,Title;' Required='TRUE' SourceID='http://schemas.microsoft.com/sharepoint/v3' StaticName='Title' MaxLength='255' />
        </Fields>
        <Views>
            <View BaseViewID='0' Type='HTML' MobileView='TRUE' TabularView='FALSE'>
                <Toolbar Type='Standard' />
                <XslLink Default='TRUE'>main.xsl</XslLink>
                <RowLimit Paged='TRUE'>30</RowLimit>
                <ViewFields>
                    <FieldRef Name='LinkTitleNoMenu'></FieldRef>
                </ViewFields>
                <Query>
                    <OrderBy>
                        <FieldRef Name='Modified' Ascending='FALSE'></FieldRef>
                    </OrderBy>
                </Query>
                <ParameterBindings>
                    <ParameterBinding Name='AddNewAnnouncement' Location='Resource(wss,addnewitem)' />
                    <ParameterBinding Name='NoAnnouncements' Location='Resource(wss,noXinviewofY_LIST)' />
                    <ParameterBinding Name='NoAnnouncementsHowTo' Location='Resource(wss,noXinviewofY_ONET_HOME)' />
                </ParameterBindings>
            </View>
            <View BaseViewID='1' Type='HTML' WebPartZoneID='Main' DisplayName='Hello, World' DefaultView='TRUE' MobileView='TRUE' MobileDefaultView='TRUE' SetupPath='pages\viewpage.aspx' ImageUrl='/_layouts/images/generic.png' Url='AllItems.aspx'>
                <Toolbar Type='Standard' />
                <XslLink Default='TRUE'>main.xsl</XslLink>
                <RowLimit Paged='TRUE'>30</RowLimit>
                <ViewFields>
                    <FieldRef Name='Attachments'></FieldRef>
                    <FieldRef Name='LinkTitle'></FieldRef>
                    <FieldRef Name='IntegrationID'></FieldRef>
                </ViewFields>
                <Query>
                    <OrderBy>
                        <FieldRef Name='ID'></FieldRef>
                    </OrderBy>
                </Query>
                <ParameterBindings>
                    <ParameterBinding Name='NoAnnouncements' Location='Resource(wss,noXinviewofY_LIST)' />
                    <ParameterBinding Name='NoAnnouncementsHowTo' Location='Resource(wss,noXinviewofY_DEFAULT)' />
                </ParameterBindings>
            </View>
        </Views>
        <Forms>
            <Form Type='DisplayForm' Url='DispForm.aspx' SetupPath='pages\form.aspx' WebPartZoneID='Main' />
            <Form Type='EditForm' Url='EditForm.aspx' SetupPath='pages\form.aspx' WebPartZoneID='Main' />
            <Form Type='NewForm' Url='NewForm.aspx' SetupPath='pages\form.aspx' WebPartZoneID='Main' />
        </Forms>
    </MetaData>
</List>

It is easily customized with additional custom views, specification of the fields on those views, and even specification of content types to associate to the list!
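The pieces above can also be put together from C# (in a console app or feature receiver, for example).  A hedged sketch; the site URL, schema file path, and list names are illustrative:

```csharp
using (SPSite site = new SPSite("http://portal.dev.local"))
using (SPWeb web = site.OpenWeb())
{
    // The out-of-the-box CustomList feature that contains template type 100.
    Guid customListFeatureId = new Guid("00BFEA71-DE22-43B2-A848-C05709900100");
    SPFeatureDefinition feature =
        SPFarm.Local.FeatureDefinitions[customListFeatureId];

    // The schema XML shown above, loaded from disk.
    string schemaXml = File.ReadAllText("ListSchema.xml");

    Guid listId = web.Lists.Add(
        "Test List",                    // title
        "Test",                         // description
        "TestList2",                    // url
        customListFeatureId.ToString(), // featureId
        100,                            // templateType (generic custom list)
        "100",                          // docTemplateType
        schemaXml,                      // customSchemaXml
        feature,                        // listInstanceFeatureDefintion
        SPListTemplate.QuickLaunchOptions.Off);
}
```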

So why would you want to do this?  If you want a custom list with content types and custom views and all of that jazz, you can get it without writing a lot of custom code to build lists and without the hassle of custom templates (a pain in the butt); you can just write the schema XML (or maybe better yet, configure and export the list) and let SharePoint do its magic!

Filed under: Dev, SharePoint
5Dec/13

A Simple Way to Improve CAML Query Performance

Posted by Charles Chen

There are many ways to improve the performance of your CAML queries, but I've recently found that in some cases, it's as easy as switching the order of your filter operations.

In this case, I was searching across a list of 1,000,000 items for a set of 41.

The list consists of tasks with, among other fields, a Status and Assigned To field.

Both of these fields were indexed, but the following query was still running in the 10 second range:

<Where>
    <And>
        <And>
            <Neq>
                <FieldRef Name='Status' />
                <Value Type='Choice'>Completed</Value>
            </Neq>
            <Neq>
                <FieldRef Name='Status' />
                <Value Type='Choice'>Terminated</Value>
            </Neq>
        </And>                                             
        <Eq>
            <FieldRef Name='AssignedTo' LookupId='TRUE' />
            <Value Type='Int'>159</Value>
        </Eq>
    </And>
</Where>

One small tweak and the same query ran in 1.5s:

<Where>
    <And>
        <Eq>
            <FieldRef Name='AssignedTo' LookupId='TRUE' />
            <Value Type='Int'>159</Value>
        </Eq>	
        <And>
            <Neq>
                <FieldRef Name='Status' />
                <Value Type='Choice'>Completed</Value>
            </Neq>
            <Neq>
                <FieldRef Name='Status' />
                <Value Type='Choice'>Terminated</Value>
            </Neq>
        </And>                                             
    </And>
</Where>

All that was done was to shift the order of the query conditions.

The first query reads as "All tasks that are not Completed and not Terminated and Assigned To user 159".

The second query reads as "All tasks that are Assigned To user 159 that are not Completed and not Terminated".

I didn't trace the generated SQL, but it's not hard to imagine that the SQL now performs an initial filter on the data set against the user ID and returns a much smaller data set for subsequent operations (filter by Status).

So the lesson learned is that for large lists, you should follow Microsoft's guidance on large lists, but also ensure that your queries are written to take advantage of the indexes and to reduce the data set as early as possible (preferably by filtering against an indexed field first).

16Oct/13

SharePoint’s Image Problem

Posted by Charles Chen

SharePoint has an image problem with many folks.

I've discussed it before, but I think I have a new perspective on the issue.

You see, I recently interviewed a candidate who was a Microsoft Certified IT Professional: SharePoint Administrator 2010 and a Microsoft Certified Professional Developer: SharePoint Developer 2010.  His resume looked impressive and promising, with many SharePoint projects under his belt.

He failed miserably on my moderate-to-advanced web developer assessment, skipping nearly every single item in the assessment -- things that I think any mid-level web/.NET developer should know.

You see, the problem is not that SharePoint is an inherently weak platform, inherently un-scalable, or inherently poorly performing; it's that Microsoft has made the platform so approachable, so well documented, and built such a strong community of developers and developer evangelists around it, that anyone can get into it.

You don't need to know how garbage collection works or understand the difference between a Func and an Action (or even know what they are!) or what IL is.

You don't need to know how to write a generic class or how to refactor code to reduce complexity.

You don't need to know many of the higher level programming concepts of .NET or sound object oriented programming practices to build solutions on SharePoint because most of the time, you can just google it and find an example (Of course, most examples -- even the Microsoft MSDN ones -- are not meant for production code; they are merely designed to convey the concept and usage of a particular member in the library but not necessarily how to use it in an architecturally sound manner).

So therein lies the root of SharePoint's image problem: it's so easy to build basic solutions that a developer who is fundamentally weak on the .NET platform and fundamental web technologies like JavaScript and CSS can be productive in many companies and even get Microsoft certified!  And then these developers are let loose on projects that ultimately under-deliver in one way or another, be it performance, usability, or maintainability.

SharePoint is like the mystery basket on Chopped.  There is nothing inherently good or bad about those mystery ingredients but for the skill of the individual chefs who combine them through the application of experience, creativity, and technique to create a dish.  With different chefs, you will get drastically different results each round and the winners are usually the folks that have a fundamental understanding of the art and science of flavor and cooking as well as a deep understanding of the ingredients.

It is the same with SharePoint; it is nothing but a basket of ingredients (capabilities or features) to start from (a very generic one at that).  At the end, whether your dish is successful or not is a function of whether you have good chefs that understand the ingredients and understand the art and science of melding the ingredients into one harmonious amalgamation.  Likewise, it is important that when staffing for SharePoint projects, you focus not only on SharePoint, but also on the fundamentals of .NET, computer science, and good programming in general.

You can also think of it like a Porsche.  Give it to a typical housewife to drive around a track and you'll get very different results versus having Danica Patrick drive it.  Should it reflect poorly on the Porsche that the housewife couldn't push it anywhere near the limits?  Or is it a function of the driver?  Danica Patrick in a Corolla could probably lap my wife in a Porsche around a track.

The bottom line is that SharePoint, in my view, is just a platform and a good starting point for many enterprise applications.  When some SharePoint projects fail or fall short of expectations, the blame is often assigned to the platform (ingredients) and not to the folks doing the cooking (or driving).  SharePoint is indeed a tricky basket of ingredients and it still takes a skilled team of architects, developers, testers, and business analysts to put it together in a way that is palatable.

16Oct/13

Watch Out For SPListItemCollection.Count and Judicious Use of RowLimit

Posted by Charles Chen

This seemingly innocuous call can be quite dangerous when used incorrectly.

The reason is that this property invocation actually executes the query.

This is OK if you plan on iterating the results because the results are cached, but costly if you don't plan on iterating the results.  The following code sample can be used to test this effect for yourself:

static void Main(string[] args)
{
    using(SPSite site = new SPSite("http://internal.dev.com/sites/oncology"))
    using (SPWeb web = site.OpenWeb())
    {
        SPList list = web.Lists.TryGetList("General Tasks");

        SPQuery query = new SPQuery();
        query.RowLimit = 1;
        query.Query = @"
<Where>
<Contains>
<FieldRef Name='Title'/>
<Value Type='Text'>500KB_1x100_Type_I_R1</Value>
</Contains>
</Where>";
        query.QueryThrottleMode = SPQueryThrottleOption.Override;

        SPListItemCollection items = list.GetItems(query);

        Stopwatch timer = new Stopwatch();

        timer.Start();

        Console.Out.WriteLine("{0} items match the criteria.", items.Count);

        var timeForCount = timer.ElapsedMilliseconds;

        Console.Out.WriteLine("{0} milliseconds elapsed for count.", timer.ElapsedMilliseconds);

        foreach (var i in items)
        {
            Console.Out.WriteLine("{0} milliseconds elapsed for start of iteration.", timer.ElapsedMilliseconds - timeForCount);

            break;
        }
    }
}

(And of course, you can check the implementation of Count in Reflector or dotPeek.)

You will see that the start of iteration will be very fast once you've invoked Count once.

Now here is where it gets interesting:

  1. The total time it takes to execute the query is longer when you invoke Count than when you just iterate (~3200ms vs. ~3000ms, about 5-10% in my tests).
  2. When I set the RowLimit to 1, I could reduce the time by roughly 40-50% (~1600ms vs. ~3200ms for a result set of 230 items out of a list of 150,000).

Try it yourself by commenting and uncommenting the RowLimit line and commenting and uncommenting the line that invokes Count.

What does this mean for you?  Well, if you don't need the count, don't use it; it's slower than just iterating the results.  If you plan on iterating the results anyway, don't invoke Count; if you do need the count, you are better off keeping a counter yourself during the iteration.

And in a use case where you don't plan on iterating the result set (for example, checking to see if there is at least one occurrence of a type of object), be sure to set the RowLimit in your query!
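Putting the two observations together, an existence check can be sketched like so (hedged; list and the CAML query are from the sample above):

```csharp
SPQuery query = new SPQuery();
query.RowLimit = 1; // fetch at most one row from the database
query.Query = @"
<Where>
    <Contains>
        <FieldRef Name='Title'/>
        <Value Type='Text'>500KB_1x100_Type_I_R1</Value>
    </Contains>
</Where>";

SPListItemCollection items = list.GetItems(query);

// Count executes the query, but with RowLimit = 1 only a single
// row is materialized, so this is cheap.
bool exists = items.Count > 0;
```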

6Sep/13

SharePoint, Large Lists, Content Iterator, and an Alternative

Posted by Charles Chen

For retrieving large datasets from large lists, SharePoint 2010 provides a new class called ContentIterator (CI) that is supposed to make such retrievals possible without tripping the throttling thresholds.

There's a bunch of great documentation on the web regarding this class, but one interesting observation I've made is that it seems to limit your query to one field only.  This means that your query's where clause can only include one field, even when used with the order clause generated by ContentIterator.ItemEnumerationOrderByNVPField.

I tested with a list containing over 22,000 items, using the default throttling thresholds and randomly generated data.

It turns out that if I use more than one field in the query -- even with an index on each field in the query and SPQueryThrottleOption.Override -- the CI will fail the query with a threshold error.
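For reference, the kind of CI call that was failing looks roughly like this.  This is a hedged sketch from memory (ContentIterator lives in Microsoft.Office.Server.Utilities; check the SDK for the exact overloads and the error-callout semantics):

```csharp
SPQuery query = new SPQuery();
query.Query = @"
<Where>
    <Eq>
        <FieldRef Name='TstProgram' />
        <Value Type='Text'>Program 1</Value>
    </Eq>
</Where>" + ContentIterator.ItemEnumerationOrderByNVPField;

ContentIterator iterator = new ContentIterator();
iterator.ProcessListItems(
    list,
    query,
    false, // fRecursive
    item =>
    {
        // Process each item here.
    },
    (item, e) =>
    {
        // Error callout; return value controls whether the error is rethrown
        // (hedged; verify against the SDK).
        return true;
    });
```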

What's one to do if you need to get all of the items in a list?

It seems that you should be able to simply write a loop that executes the query and retrieves the data page by page until you reach the end of the set.  So I rigged up the code myself:

/// <summary>
///     Executes a query and returns the result in batches.
/// </summary>
public class BatchQueryExector
{
    private SPQuery _query;

    private BatchQueryExector() {}

    /// <summary>
    ///     Creates an instance of the executor against the specified query.
    /// </summary>
    /// <param name="query">The query to execute.</param>
    /// <returns>The instance of the executor.</returns>
    public static BatchQueryExector WithQuery(SPQuery query)
    {
        BatchQueryExector executor = new BatchQueryExector();

        executor._query = query;

        return executor;
    }

    /// <summary>
    ///     Specifies the list the query will be executed over.
    /// </summary>
    /// <param name="list">The SharePoint list that contains the data.</param>
    /// <returns>An instance of <c>ExecutionContext</c>.</returns>
    public ExecutionContext OverList(SPList list)
    {
        return new ExecutionContext(_query, list);
    }

    /// <summary>
    ///     Inner class used to encapsulate the execution logic.
    /// </summary>
    public class ExecutionContext
    {
        private readonly SPList _list;
        private readonly SPQuery _query;

        /// <summary>
        ///     Creates a new instance of the context.
        /// </summary>
        /// <param name="query">The query to execute.</param>
        /// <param name="list">The SharePoint list that contains the data.</param>
        public ExecutionContext(SPQuery query, SPList list)
        {
            _query = query;
            _list = list;
        }

        /// <summary>
        ///     Retrieves the items in the list in batches based on the <c>RowLimit</c> and 
        ///     invokes the handler for each item.
        /// </summary>
        /// <param name="handler">A method which is invoked for each item.</param>
        public void GetItems(Action<SPListItem> handler)
        {
            string pagingToken = string.Empty;

            while (true)
            {
                _query.ListItemCollectionPosition = new SPListItemCollectionPosition(pagingToken);

                SPListItemCollection results = _list.GetItems(_query);

                foreach (SPListItem item in results)
                {
                    handler(item);
                }

                if (results.ListItemCollectionPosition == null)
                {
                    break; // EXIT; no more pages.
                }

                pagingToken = results.ListItemCollectionPosition.PagingInfo;
            }
        }
    }
}

This can be invoked like so:

internal class Program
{
    private static void Main(string[] args)
    {
        Program program = new Program();

        program.Run();
    }

    private void Run()
    {
        using (SPSite site = new SPSite("http://internal.dev.com"))
        using (SPWeb web = site.OpenWeb())
        {
            SPList list = web.Lists.TryGetList("TestPaging");

            SPQuery query = new SPQuery();
            query.Query = @"
<Where>        
        <And>
            <Eq>
                <FieldRef Name=""TstProgram"" />
                <Value Type=""Text"">Program 1</Value>
            </Eq>
            <Eq>
                <FieldRef Name=""TstDocumentType"" />
                <Value Type=""Text"">15 Day SUSAR</Value>
            </Eq>
        </And>      
</Where>";
            query.RowLimit = 100; // Effective batch size.
            query.QueryThrottleMode = SPQueryThrottleOption.Override;

            query.Query += ContentIterator.ItemEnumerationOrderByNVPField;

            Stopwatch stopwatch = new Stopwatch();

            stopwatch.Start();

            int count = 0;

            BatchQueryExector.WithQuery(query).OverList(list).GetItems(i => { count++; });

            stopwatch.Stop();

            Console.Out.WriteLine("{0}ms, {1} items", stopwatch.ElapsedMilliseconds, count);
        }
    }
}

It turns out that this works just fine, provided that at least one of the columns in your query has an index.  I tested with indices on all columns, on two columns, on one column, and on no columns.  With no indices, this query fails with the same threshold error (in my testing, you must have an index on at least one of the columns in your query; my guess is that you will need more as the number of items increases).  With one to three indices, there was no difference in performance; in fact, it got a little slower with three indices.

The batch size also had an impact.  Larger batch sizes were more efficient, which makes sense given that a database roundtrip is made for each batch.  For 32,000 items (I added more to test), a batch size of 1000 (a sweet spot in my testing) completed in 2487ms for 7902 matching items; a batch of 500 completed in 2550ms; a batch of 100 completed in 3116ms.

It doesn't have the fancy bits of the CI, but it will work with multiple columns in your where clause.

To test for yourself, you can download this handy-dandy, multi-threaded list filler.

Filed under: SharePoint
17Jul/13

Has SharePoint Jumped the Shark?

Posted by Charles Chen

I've worked with SharePoint for some 7 years now, starting from SharePoint 2007 beta 2, and I've worked with portals and CMS systems in some form or another since 2003.

Since 2006, most of my domain experience has been in life sciences, but I think that what I've observed applies to most other industries and verticals as well.

So the question is: has SharePoint jumped the shark?  Is it the second coming of Lotus Notes?  Have businesses soured on it as an enterprise application platform?  Is it a failed vision?  Is it time for your organization to put it out to pasture?

One of my observations in the last two years or so is that, increasingly, I have been called upon to defend the choice of building a solution on top of SharePoint (I've even heard audible snickers).  You can see the eyes rolling from certain folks when we mention that our solution at InnovoCommerce is built on top of SharePoint.  The skepticism and questions about scalability soon follow, but it's not unfounded: it's most often based on firsthand experience with the pains of owning and managing SharePoint.  As the saying goes: fool me once, shame on you; fool me twice, shame on me.

Indeed, I think that the popularity and rise of SharePoint has been its own worst enemy in a sense.  As adoption of SharePoint increased with 2007 and 2010 (in part, thanks to the incredible marketing of Microsoft that often oversold the ease and value), organizations rushed to roll it out and make it available to their users.  In many cases, proper governance and information architecture were either an afterthought or poorly thought out from the onset, leading to deployments which were a nightmare to manage and maintain as rogue site collections and sub-sites cropped up like toadstools after an early summer shower.

Now, after one or two upgrade cycles, and maybe half a decade of ownership, organizations are starting to sour on shelling out for another (multi-) million dollar project to upgrade to 2013.

Organizations eventually -- more or less -- run into the same key problems:

Scalability and Performance.

One of the most troublesome byproducts of poor information architecture and planning is a considerable degradation of performance.  SharePoint provides endless ways with which to shoot yourself in the foot with regards to building scalable, performant applications; it's a field of landmines that even experienced SharePoint architects have to navigate carefully on each project as understanding the scale and intended usage patterns of data is key to coming up with a plan to build a system that will scale.  The indiscriminate use of and reliance on item level permissions is a good example of this.  Failure to break up data into manageable scopes is another.  In the case of custom code, bad or lazy coding practices can also have severe consequences in terms of performance.

Fortunately, most performance issues can be managed -- if use cases are well designed and well understood -- through some thoughtful design up front to build an information architecture that will align with how SharePoint is built to scale.  On the custom code side, following guidelines and best practices along with code reviews can help prevent or alleviate many of the most common traps that developers fall into.  For this, experienced, battle-tested SharePoint architects and developers are worth their weight in gold (wink, wink)!

Of course, this feeds well into the next point...

Cost.

Getting a large SharePoint architecture and deployment right is expensive (getting it wrong, even more so!).  From the cost of the hardware to the cost of the licenses of SharePoint (don't forget Windows Server, SQL Server, etc.) to the cost of the human resources required to plan the infrastructure, gather the requirements, design the architecture, manage the deployment, and offer on-going support.

It is a huge endeavor made all the more challenging when you factor in custom solutions that get built and installed on top of SharePoint and all that entails -- including risk.  Of course, many companies I've worked with have tried hard to minimize these costs through governance policies that severely limit the amount of customizations available -- often requiring several rounds of evaluation, validation, and approvals to get custom solutions installed.

While this does tend to help keep costs down from an ongoing maintenance perspective, I contend that it also severely limits the ROI and utility of SharePoint by crippling it.  Strip away the ability to build sophisticated business solutions on top of SharePoint, and you are left with a very expensive portal; it's like going to Chipotle and getting a rice bowl with only rice in it.

In a way, you could say that overzealous organizations addressed the cost of risk by squashing creativity and innovation in terms of mapping business processes into powerful solutions in SharePoint; IT simply didn't allow teams to come up with novel, creative, and innovative ways to extract ROI from SharePoint because of fear (I don't blame them!).  Oddly, by limiting SharePoint to basic ECM functionality, I believe IT organizations simultaneously decreased their own value in terms of delivering real solutions to business users and of justifying the investment in SharePoint -- everyone loses if you just treat SharePoint like a giant file share or merely a portal.

Planned Obsolescence.

While I'm a huge advocate of SharePoint as an enterprise application platform (it's what I've been doing for years now) and not just a portal, working in the solution and product development side means that I also have an appreciation of the pain and costs associated with planned obsolescence.  You see, every three years or so, Microsoft releases a new version of SharePoint that will eventually "force" you to upgrade.  Perhaps it's a new feature of Office that can only be used with a certain version of SharePoint.  Or it's a critical fix or feature that addresses a key issue of your current deployment.  Or the product simply reaches end-of-life.  Or, from a product development perspective, you are simply forced to upgrade your codebase to capture or keep your customer base that is moving onto a new version.  It's expensive (I could imagine circumstances where migrations could cost as much or more than the original platform itself) and you are kind of forced to do it every few years.

It creates so much friction and hand-wringing that some companies simply skip the migration and leave little zombie SharePoint deployments sitting around (our own document management portal at InnovoCommerce is still on SharePoint 2007, and we're a SharePoint shop with deep SharePoint experience).

From the product development side, it is a particularly challenging balancing act to decide how to allocate resources and time to upgrading your codebase while still developing capabilities and maintaining support for existing customers who may not move off of their current version of SharePoint for another two or three years (because of the aforementioned issues of cost and time).  When the fundamental architecture of the platform changes as drastically as it has with the shift from 2010 to 2013, this becomes all the more challenging (and expensive).

Analytics (Or Lack Thereof).

While SharePoint does offer many great features for building dashboards and KPIs, integration with SQL Server Reporting Services, and other reporting capabilities, I find it severely lacking from an analytics perspective.  A good example: for a platform that bills document management as one of its key features, you would think it would be simple to see a document access report that shows all documents, the number of times they were accessed, who accessed them, what actions each user took, etc.  You'd like to see it with some basic charts and visualizations that give you a good overview of the data (in an increasingly data-driven and yet data-saturated world).

Nope.  Instead, to get even these basic document metrics, you have to build quite a complex solution (especially given the barely sufficient audit log query APIs).

Another challenge from an analytics perspective is complexity.  For any significant deployment, it is a huge challenge to extract data from SharePoint to build sophisticated reports, especially if you follow best practices and segment data into separate site collections and databases -- not to mention the pitfalls of working directly against the raw SharePoint databases (practically a must if you plan on doing any serious analytics across your data).

What's the Future?

For many organizations, I can see SharePoint 2010 being their last dip into the SharePoint pool, at least as an IT-owned and -managed solution.  I believe that the experience of owning 2007 and then migrating to 2010 has left a bad taste in the mouths of many IT organizations (not to mention the business sponsors).  Microsoft is making a strong push with Office 365, but many organizations are understandably reluctant to put their data into Microsoft's cloud (or any cloud, for that matter).  Other organizations might opt to migrate to less costly, open source solutions like Alfresco, Liferay, or any number of cloud-based platforms, including Google (one of our very large pharma customers has already moved its mail systems off Exchange and also uses Google Docs).

It will be interesting to see how SharePoint 2013 and the future of SharePoint unfold given the challenges organizations have had with 2007 and 2010.  In life sciences -- and I suspect in many other industries as well -- there has also been a large movement to downsize IT budgets and IT ownership of platforms, making owning SharePoint and delivering useful business solutions on top of it an increasingly challenging task.

While SharePoint 2013 is still plenty expensive when you factor in all of the licenses and expertise required to deploy and manage it, I think it addresses many of the key problems with regards to the deployment of custom solutions -- where I think the most business value is realized -- into SharePoint and is a huge leap forward from 2010 (a much bigger gap, in my opinion, than from 2007 to 2010).

At InnovoCommerce, we have yet to see any of our customers move in this direction (but then again, life sciences companies tend to move slowly anyway -- some of our customers are just going live with 2010), so it will be interesting to see how the landscape of enterprise portals and application platforms evolves over the next year or so.

We are increasingly mulling moving more and more of our codebase out of SharePoint to insulate against the cost of upgrade cycles, to ensure that our platforms can be deployed, managed, and integrated more readily, and to prepare for a future where our customers may be looking the other way when it comes to their enterprise portal and business application platform of choice.

11Jul/13Off

Use ContentIterator to Read Items from Large SharePoint Lists

Posted by Charles Chen

New to me: ContentIterator for reading items from large lists.
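A minimal sketch of how it can be used, assuming the SharePoint 2010+ server object model (Microsoft.SharePoint.dll plus Microsoft.Office.Server.dll) and a hypothetical list named "Documents"; this only runs inside a SharePoint farm, so treat it as illustrative rather than a drop-in implementation:

```csharp
using System;
using Microsoft.SharePoint;
using Microsoft.Office.Server.Utilities;

// Enumerates every item in a large list in batches, which avoids the
// list view threshold problems that a naive SPQuery over the whole
// list can run into.
public static void ProcessLargeList(SPWeb web)
{
    SPList list = web.Lists["Documents"]; // hypothetical list name
    ContentIterator iterator = new ContentIterator();

    iterator.ProcessListItems(
        list,
        item =>
        {
            // Called once per item; ContentIterator pages through the
            // list in batches behind the scenes so memory stays bounded.
            Console.WriteLine(item["Title"]);
        },
        (item, e) =>
        {
            // Returning true tells the iterator to continue past the
            // failed item instead of aborting the whole enumeration.
            return true;
        });
}
```

Overloads also exist that accept a CAML query string if you need to filter or order the items being iterated.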

28Mar/13Off

Why Your CAML Query Might Be Returning No Results

Posted by Charles Chen

A curious thing.

We were working on a custom view for one of our lists and found that we were not able to retrieve a list item using a CAML query that should have worked.

After spending quite a bit of time studying the query itself, I decided to look at the actual SQL that was being generated.

This list has a large number of columns, and SharePoint was wrapping the rows according to its row-wrapping boundaries.  For example, once you have 16 integer or user fields, the 17th field of that type forces the item to wrap onto a second row.

When this occurs, in the AllUserData table, you will see multiple rows for the same item, each with a different tp_RowOrdinal. Looking in the table, it was clear that the value was in the table, but it was not on the first row (tp_RowOrdinal="0").

The problem is that the SQL query generated when you query on a field that has wrapped onto a later row inexplicably filters on the first row only.

In the generated SQL, I found the following:

...AND (UserData.tp_RowOrdinal=0) AND ((UserData.[int10] = N''34'') ...

In this case, "34" represents the ID of a user and "int10" represents the column that the data is in.  But the key is that even though this data occurred in the row with ordinal 2, the SQL will filter only on the row with ordinal 0.

This is kind of inexplicably bad but not all is lost.

The dirty workaround is that you can do one of two things:

  1. If you are adding the columns manually or in a single content type, ensure that the field(s) you want to query against appear early enough in the content type definition (or are added early enough) that they land on the first row.
  2. If you are adding the columns via content types in a Schema.xml file, ensure (1) above and add that content type to the list earlier in the XML.
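For illustration, a hypothetical query against a wrapped user field might look like this (the field name is made up; the value 34 matches the user ID from the generated SQL above).  The CAML itself is perfectly valid, but it silently returns nothing when the field has wrapped past the first row:

```xml
<Where>
  <Eq>
    <FieldRef Name="AssignedUser" LookupId="TRUE" />
    <Value Type="User">34</Value>
  </Eq>
</Where>
```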

There is an MSDN discussion on this, but Microsoft seems to claim that this is not a "bug" (it most certainly is).

Filed under: SharePoint No Comments
10Oct/12Off

SharePoint DirectoryNotFoundException (0x80070003)

Posted by Charles Chen

I've been dealing with an interesting SharePoint error for the better part of a day-and-a-half now and I thought it was worth sharing.

The error surfaced was a DirectoryNotFoundException (0x80070003) when attempting to call BreakRoleInheritance in an asynchronous event receiver.

The purpose of the receiver was to read a set of rules which specified how to configure permissions for objects based on metadata and content type. However, this would fail with the aforementioned error, but only for folders.

Of course, this was a weird error because I could certainly see the folder in the list.

It turns out that the root cause is how we were setting the titles/names on our folders.  One issue with folders is that depending on how you add the list item, you may have to go back and rename it.  Otherwise, it gets a title based on its ID.

Our original logic looked like this (note the use of the Name property); it ran after the item was created in the list:

if (entity.ContentTypeId.StartsWith("0x0120")) // Only for folders
{
    item = list.GetItemById(item.ID);
    item["Name"] = entity.Title;
    item.SystemUpdate(false);
}

(Assume that entity is simply a container that describes the list item to create.)

The exception occurs in the case where the rename executes before the asynchronous event receiver finishes executing, thus the directory -- as originally named -- no longer exists since it's been renamed by the code above.  This caused random errors on our systems based on the order of execution and of course, when the debugger is attached, it works perfectly fine (I think because it changes the threading model).

We've changed our code now to something like this instead:

private SPListItem AddItem<T>(T entity, SPList list, string parentFileRef) where T : CtmoModelBase, new()
{
    if (string.IsNullOrEmpty(parentFileRef)) // Default to the root folder
    {
        parentFileRef = list.RootFolder.Url;
    }

    _log.DebugFormat("Saving object to container URL: {0}", parentFileRef);

    try
    {
        Web.AllowUnsafeUpdates = true;

        if (entity.ContentTypeId.StartsWith("0x0120"))
        {
            // Add a folder (same in all cases)
            return list.AddItem(parentFileRef, SPFileSystemObjectType.Folder, entity.Title);
        }

        if (list.BaseType == SPBaseType.DocumentLibrary)
        {
            // Add a file to a document library.
            SPFolder parentFolder = Web.GetFolder(parentFileRef);

            return parentFolder.Files.Add(entity.Title, entity.BinaryContents, true).Item;
        }
        else
        {
            // Add a list item to a custom list.
            return list.AddItem(parentFileRef, SPFileSystemObjectType.File, entity.Title);
        }
    }
    finally
    {
        Web.AllowUnsafeUpdates = false;
    }
}

This solved the issue, as the names of folders and items created from folder-based content types no longer need to be updated after creation to set the display name.

Filed under: SharePoint No Comments