Inserting HTML into Word Documents

For a recent prototype, I was tasked with building a solution that lets authors create “standard text” entries in SharePoint 2010 and then reuse them in their documents.

One of the challenges was figuring out the best way to store this type of content.  Word documents are one obvious solution, but I approached it differently: using SharePoint’s HTML field type.

List with standard text items.

The screenshot on the left shows a list with a column called Default Text.  This is simply a multi-line text field set to either Rich text or Enhanced rich text.

SharePoint’s rich text editor is fairly competent for most text formatting scenarios.

This design allows authors to create basic formatted text bodies entirely in SharePoint.

The next question is how to take advantage of this.

Add-in panel in Word 2010

In the authoring environment, Word 2010, an add-in was written to allow users to connect to SharePoint and browse the list contents with fully formatted text.

You can see the screenshot on the right.  The UI is a mish-mash of WPF and classic Windows Forms controls; specifically, the Windows Forms WebBrowser control renders the HTML contents for preview.

The add-in connects to a configured list and retrieves its items through the trusty old SharePoint Lists web service (the tool was written to support both SharePoint 2007 and 2010).
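For readers unfamiliar with the Lists web service, here is a minimal sketch of the request and response shapes involved. It is written in Python for brevity rather than in the add-in's own .NET code. The list name "Standard Text" and the internal field name "DefaultText" are hypothetical, and the sample response is hand-written for illustration, not captured from a server.

```python
import xml.etree.ElementTree as ET
from typing import List

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SP_NS = "http://schemas.microsoft.com/sharepoint/soap/"
ROWSET_NS = "#RowsetSchema"
# SOAPAction header to send with the POST to <site>/_vti_bin/Lists.asmx
SOAP_ACTION = SP_NS + "GetListItems"


def build_get_list_items(list_name: str) -> bytes:
    """Build a SOAP envelope that asks the Lists service for a list's items."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{SP_NS}}}GetListItems")
    ET.SubElement(request, f"{{{SP_NS}}}listName").text = list_name
    return ET.tostring(envelope, encoding="utf-8")


def parse_rows(response_xml: str, field: str) -> List[str]:
    """Return one field value per z:row; attributes carry an 'ows_' prefix."""
    root = ET.fromstring(response_xml)
    return [row.get("ows_" + field, "") for row in root.iter(f"{{{ROWSET_NS}}}row")]


# Hand-written sample in the shape of a GetListItems result (illustrative only).
SAMPLE_RESPONSE = """
<listitems xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema">
  <rs:data ItemCount="1">
    <z:row ows_Title="Disclaimer"
           ows_DefaultText="&lt;div&gt;&lt;b&gt;Standard&lt;/b&gt; text&lt;/div&gt;" />
  </rs:data>
</listitems>
"""

html_fragments = parse_rows(SAMPLE_RESPONSE, "DefaultText")
```

Each value in `html_fragments` is the rich text field's HTML, ready to be previewed or inserted.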

The next step is to figure out how to add the text to the body of the Word document and retain the formatting.  In all honesty, I figured that this would be the most complicated part.  In fact, it was the easiest part and accomplished with only a few meager lines of code.

The trick is to use the InsertFile method of the Range object.  It inserts the contents of a file at the range, so we can insert arbitrary HTML strings by first saving them as files to a temporary location on disk.

The only catch is that <html> must be the root element of the saved file.
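A minimal sketch of that wrapping step, in Python for brevity (the add-in itself would be .NET code). The function names, the `.html` suffix, and the UTF-8 `<meta>` declaration are assumptions, not details from the original add-in; the declaration is there so that non-ASCII characters are more likely to survive the round trip through disk.

```python
import os
import tempfile


def wrap_fragment(fragment: str) -> str:
    """Wrap an HTML fragment so that <html> is the root element."""
    return (
        "<html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        "</head><body>" + fragment + "</body></html>"
    )


def save_temp_html(fragment: str) -> str:
    """Write the wrapped fragment to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".html")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(wrap_fragment(fragment))
    return path
```

The caller owns the returned path and should delete the file once it has been inserted.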

Once the temporary file is saved to disk, we can insert it into the Word document with InsertFile.
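The insertion step might look like the sketch below, again in Python for brevity. `Range.InsertFile` is the real Word object model method; everything else (the function name, the choice to delete the temporary file afterwards, and the `FakeRange` stand-in that lets the example run without Word) is an assumption for illustration.

```python
import os
import tempfile


def insert_html_file(word_range, path: str) -> None:
    """Insert the HTML file at `path` into `word_range`, then delete the file."""
    try:
        # ConfirmConversions=False asks Word not to prompt about the conversion.
        word_range.InsertFile(FileName=path, ConfirmConversions=False)
    finally:
        os.remove(path)


class FakeRange:
    """Stand-in for a Word Range so the sketch runs without Word installed."""

    def __init__(self):
        self.inserted = []

    def InsertFile(self, FileName, ConfirmConversions=True):
        # Record the file's contents instead of rendering them into a document.
        with open(FileName, encoding="utf-8") as handle:
            self.inserted.append(handle.read())


# Write a small HTML document to a temporary file, then "insert" it.
fd, temp_path = tempfile.mkstemp(suffix=".html")
with os.fdopen(fd, "w", encoding="utf-8") as handle:
    handle.write("<html><body><p>Standard text</p></body></html>")

fake = FakeRange()
insert_html_file(fake, temp_path)
```

With Word available, the same function could be handed a real Range object, such as the range of the current selection, in place of `FakeRange`.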

HTML content inserted into the Word document

The screenshot to the left is an example of HTML content inserted into a Word document.

As you can see, it inherited the local formatting (font, font size) wherever the HTML makes no explicit font specification.  Otherwise, it faithfully rendered the HTML content entered in the list.

That’s it.  I like it when things are simple yet powerful.  On the server side (Open XML manipulation), the story is a little bit different; we’ll explore that some other time.
