Programmatic generation of Word documents

I had the opportunity to play around with generating Word documents last week. At first I looked at what I had used before: the Office PIA (Primary Interop Assemblies).

I quickly remembered that it is not very easy to work with, because most of it is based on COM interop anyway. Furthermore, I stumbled across a little MSDN article stating that the Office PIA should not be used to generate documents on a server; it suggested using it only in desktop applications where a user is controlling the application itself. I imagine that is because the Office PIA really only launches the Word application and tells it what to do in real time (or something like that), and that makes sense to me. Can you imagine a web server where even 100 users each spawn their own WINWORD.EXE process to generate a doc?! Production... Nightmare...

Trying to think of some alternatives, I stumbled across Microsoft's Open XML API. What a wonderful thing. Sure, it's not generating documents in the well-established .doc format from Word 97, but then again, that format was established in 1997. Isn't it about time we moved on? Newer versions of Office even default to the .docx format. I used to reset the default back to .doc, but eventually gave up and started paying attention to who the audience of my documents was before I hit the save button, which is something a good writer should do anyway.

In any event, after deciding that it was OK to generate .docx and then convert to .doc only if needed, I played around with the API a bit and was astounded at how much easier it is to create a Word document with the Open XML API than with the Office PIA. Take a look at these little examples:

Creating a document package:

    // Needs: using System.IO;
    //        using DocumentFormat.OpenXml.Packaging;
    //        using DocumentFormat.OpenXml.Wordprocessing;
    using (MemoryStream docStream = new MemoryStream())
    {
        using (WordprocessingDocument document = WordprocessingDocument.Create(
            docStream, DocumentFormat.OpenXml.WordprocessingDocumentType.Document))
        {
            document.AddMainDocumentPart();

            // A method I use to set up the CSS (sorta) styles for the document
            SetupStyles(document.MainDocumentPart.AddNewPart<StyleDefinitionsPart>());

            document.MainDocumentPart.Document = new Document(
                new Body());

            // DO STUFF TO THE DOC HERE
        } // Leaving this block saves and closes the document

        // Read the memory stream to get the binary contents of the document
    }
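SetupStyles is not part of the SDK, and its body isn't shown in this post. As a rough sketch only, and not necessarily what the real helper does, a version that defines nothing but a "Heading1" paragraph style might look something like this. The class name StyleHelper, the bold setting, and the font size are all assumptions made for illustration:

```csharp
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class StyleHelper
{
    // Hypothetical helper: fills the styles part with a single paragraph
    // style whose id matches the ParagraphStyleId used for headings.
    public static void SetupStyles(StyleDefinitionsPart stylesPart)
    {
        Style heading1 = new Style(
            new StyleName() { Val = "heading 1" },
            new StyleRunProperties(
                new Bold(),
                new FontSize() { Val = "32" })) // half-points, so 16pt
        {
            Type = StyleValues.Paragraph,
            StyleId = "Heading1"
        };

        // Assigning the root element attaches it to the part; it is written
        // out when the document is saved or closed.
        stylesPart.Styles = new Styles(heading1);
    }
}
```

The one detail that matters is that StyleId matches the value a paragraph refers to; without a matching style in the styles part, a paragraph that asks for "Heading1" has no formatting to pick up.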

Add a header line:

    // Add a Heading1 paragraph
    Paragraph p1 = new Paragraph(
        new ParagraphProperties(
            new ParagraphStyleId() { Val = "Heading1" }),
        new Run(
            new Text("This is a test header")));
    document.MainDocumentPart.Document.Body.AppendChild(p1);

Create a table:

    // Add a table
    Table t = new Table();
    TableRow r1 = new TableRow(
        new TableCell(
            new Paragraph(
                new Run(
                    new Text("Row1-Cell1")))),
        new TableCell(
            new Paragraph(
                new Run(
                    new Text("Row1-Cell2")))));
    TableRow r2 = new TableRow(
        new TableCell(
            new Paragraph(
                new Run(
                    new Text("Row2-Cell1")))),
        new TableCell(
            new Paragraph(
                new Run(
                    new Text("Row2-Cell2")))));
    t.AppendChild(r1);
    t.AppendChild(r2);
    document.MainDocumentPart.Document.Body.AppendChild(t);

Those are only three of the snippets I was able to come up with in a couple of hours. I have other examples of adding images, creating internal hyperlinks, and creating external hyperlinks, all of which are about equally easy.
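To give a flavor of the external hyperlink case, here is a minimal sketch. It is an illustration under assumptions rather than the code referred to above: the HyperlinkHelper class and AppendExternalLink method are made-up names, and the main document part is assumed to already have a Document with a Body.

```csharp
using System;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class HyperlinkHelper
{
    // Hypothetical helper: appends a paragraph that links to an external URL.
    public static Paragraph AppendExternalLink(
        MainDocumentPart mainPart, string text, Uri target)
    {
        // The URL is stored as a relationship on the part rather than in the
        // body XML; "true" marks the target as external to the package.
        HyperlinkRelationship rel = mainPart.AddHyperlinkRelationship(target, true);

        // The Hyperlink element points at that relationship by its id.
        Hyperlink link = new Hyperlink(
            new Run(
                new Text(text)))
        {
            Id = rel.Id
        };

        Paragraph p = new Paragraph(link);
        mainPart.Document.Body.AppendChild(p);
        return p;
    }
}
```

Called as HyperlinkHelper.AppendExternalLink(document.MainDocumentPart, "Example", new Uri("http://example.com/")), this should produce a clickable link. The sketch applies no character style to the run, so the link text will not get the usual blue underline unless a style is defined and referenced.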

The only downside I can see so far to the MS Open XML API is that it doesn't seem very well documented. I found better examples contributed by Joe Nobodys like myself than I did from Microsoft. Sure, they have class documentation, but that doesn't compare to worked examples when you're dealing with a complex API like this.

I hope I get to work with the API more soon, because it seems fun!