Inserting HTML into Word Documents
For a recent prototype, I was tasked with building a solution that lets authors create “standard text” entries in SharePoint 2010 and then reuse them in their documents.
One of the challenges was figuring out the best way to store this type of content. Word documents are one obvious solution, but I approached it differently: using SharePoint’s HTML field type.
The screenshot on the left shows a list with a column called Default Text. This is simply a multi-line text field set to either Rich text or Enhanced rich text.
SharePoint’s rich text editor is fairly competent for most text formatting scenarios.
This design allows authors to create basic text bodies with formatting entirely in SharePoint.
The next question is how to take advantage of this.
In the authoring environment, Word 2010, an add-in was written to allow users to connect to SharePoint and browse the list contents with fully formatted text.
You can see the add-in in the screenshot on the right. The UI is a mish-mash of WPF and classic Windows Forms controls (specifically, the WebBrowser control, which renders the HTML contents for preview).
The add-in connects to a configured list and retrieves its items using the trusty old Lists web service (the tool was written to support both SharePoint 2007 and 2010).
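The add-in's own retrieval code is not shown here, so what follows is only a minimal sketch of one part of that step: pulling the title and HTML body out of the XML that the Lists web service's GetListItems operation returns. It assumes the usual response shape, where each item is a z:row element in the #RowsetSchema namespace and each field is an attribute prefixed with ows_. The class name ListItemsParser and the attribute name ows_Default_x0020_Text are assumptions for illustration; the real internal name of the Default Text column depends on how the column was created.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the original add-in.
public static class ListItemsParser
{
    // Namespace of the z:row elements in a GetListItems response.
    private static readonly XNamespace Rowset = "#RowsetSchema";

    // Returns (title, html) pairs, one per list item.
    // Assumption: the rich text column's internal name is "Default_x0020_Text".
    public static List<KeyValuePair<string, string>> Parse(string responseXml)
    {
        XDocument doc = XDocument.Parse(responseXml);

        return doc.Descendants(Rowset + "row")
            .Select(row => new KeyValuePair<string, string>(
                (string)row.Attribute("ows_Title") ?? string.Empty,
                (string)row.Attribute("ows_Default_x0020_Text") ?? string.Empty))
            .ToList();
    }
}
```

The HTML arrives XML-escaped inside the attribute, and XDocument unescapes it when the attribute value is read, so the returned string can be handed straight to a preview control or written to a file.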
The next step is to figure out how to add the text to the body of the Word document and retain the formatting. In all honesty, I figured that this would be the most complicated part. In fact, it was the easiest part and accomplished with only a few meager lines of code.
The trick is to leverage the InsertFile method of the Range object. This lets us insert arbitrary HTML strings by first saving them as files to a temporary location on disk.
```csharp
// Needs: using System.IO;

/// <summary>
/// Saves the default text to a temporary .html file which can then be inserted.
/// </summary>
/// <returns>The path to the temporary .html file.</returns>
public string SaveToTemporaryFile()
{
    // Build a unique file name in the user's temp directory.
    string htmlTempFile = Path.Combine(Path.GetTempPath(),
        string.Format("{0}.html", Path.GetRandomFileName()));

    using (StreamWriter writer = File.CreateText(htmlTempFile))
    {
        // Wrap the stored fragment so that <html> is the root element.
        string html = string.Format("<html>{0}</html>", Default);

        writer.WriteLine(html);
    }

    return htmlTempFile;
}
```
The only trick is that <html> must be the root element of the document.
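As a variation on the wrapping step, here is a minimal sketch that also declares the character encoding. File.CreateText writes UTF-8 without a byte order mark, and the assumption here (not verified against the original add-in) is that Word could misread accented or other non-ASCII characters if the file does not state its encoding. The class name HtmlFileWriter and the head/body wrapper are illustrative additions, not the original code.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical variation on SaveToTemporaryFile, not the original code.
public static class HtmlFileWriter
{
    // Wraps an HTML fragment in a full document with <html> as the root
    // element and a meta tag declaring UTF-8.
    public static string WrapFragment(string fragment)
    {
        return "<html><head>"
            + "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">"
            + "</head><body>" + fragment + "</body></html>";
    }

    // Writes the wrapped fragment to a temporary .html file as UTF-8
    // (no byte order mark) and returns the file path.
    public static string SaveToTemporaryFile(string fragment)
    {
        string path = Path.Combine(Path.GetTempPath(),
            Path.GetRandomFileName() + ".html");

        File.WriteAllText(path, WrapFragment(fragment), new UTF8Encoding(false));

        return path;
    }
}
```

If the list only ever holds plain ASCII text, the simpler wrapper is enough.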
Once the temporary file is saved to the disk, we can insert it into the Word document:
```csharp
// Needs: using System.Reflection; using Microsoft.Office.Interop.Word;

// Placeholder passed for the optional COM arguments we don't set.
private object _opt = Missing.Value;

/// <summary>
/// Inserts the standard text into the active selection range by creating a
/// temporary file and using the <c>InsertFile</c> method to inject HTML into a
/// Word document.
/// </summary>
/// <param name="standardText">The standard text.</param>
/// <param name="contentControl">The content control.</param>
public void InsertStandardText(StandardText standardText,
    ContentControl contentControl)
{
    object range = Application.Selection.Range;

    // No target supplied: create a rich text content control at the selection.
    if (contentControl == null)
    {
        contentControl = _localDocument.ContentControls.Add(
            WdContentControlType.wdContentControlRichText, ref range);
    }

    contentControl.Title = standardText.Title;

    string path = standardText.SaveToTemporaryFile();

    // Word converts the HTML file and inserts it into the control's range.
    contentControl.Range.InsertFile(path,
        ref _opt, ref _opt, ref _opt, ref _opt);

    // Snipped for brevity...
}
```
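The snipped portion is not shown, so whether the add-in removes its temporary files is unknown. As a minimal sketch of one way to do it, the helper below runs an insert action and then deletes the file, whether or not the action threw. TempFileHelper and InsertThenDelete are hypothetical names. The sketch assumes the inserted content is copied into the document rather than linked (the Link argument is left at its default above), so the file is no longer needed once InsertFile returns.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original add-in.
public static class TempFileHelper
{
    // Runs the insert action with the given path, then deletes the file
    // even if the action throws.
    public static void InsertThenDelete(string path, Action<string> insert)
    {
        try
        {
            insert(path);
        }
        finally
        {
            try
            {
                File.Delete(path);
            }
            catch (IOException)
            {
                // The file is still in use; leave it behind rather than
                // turn a cleanup problem into a failed insert.
            }
        }
    }
}
```

Inside InsertStandardText, the call might look like TempFileHelper.InsertThenDelete(path, p => contentControl.Range.InsertFile(p, ref _opt, ref _opt, ref _opt, ref _opt)).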
The screenshot to the left is an example of HTML content inserted into a Word document.
As you can see, it inherited the local formatting (font, font size) wherever the HTML makes no explicit font specification. Otherwise, it faithfully rendered the HTML content entered in the list.
That’s it. I like it when things are simple yet powerful. On the server side (OpenXML manipulation), the story is a little bit different – we’ll explore that some other time.