PHPOpenDoc – Creating Word2007 Documents using PHP
I recently started a new open source project that lets me create Microsoft Word 2007 documents from a PHP script. You can find the project on my GitHub page. In a moment I’ll show you a quick “Hello World” example of how the library creates a document with minimal effort on the programmer’s part.
For those of you who may not be aware, Microsoft Word 2007 “docx” files are actually “Office Open XML” (OOXML) files (not to be confused with “OpenOffice”). The OOXML format is an ECMA standard (ECMA-376) that many organizations have adopted. Note: Word 2003 “doc” files use a totally different format.
I know there are already a couple of PHP libraries out there that can create Word documents, but none of them had an API that I liked, or they were buggy, or they lacked features that I wanted. So I put my nose to the grindstone and started coding.
My number one goal was to create an API that is extremely easy and intuitive to use. With that in mind, I decided to make PHP 5.3 the minimum requirement. That version adds namespaces and other OOP-related features that make PHP code much more elegant.
Example 1: Hello World
Please note that at the time of this writing the library is less than a month old and has a long way to go before all features are completed or fully working. However, the example shown below does work.
    require __DIR__ . '/autoload.php';

    use PHPDOC\Document,
        PHPDOC\Document\Writer;

    // start a new document
    $doc = new Document();

    // start a new section (basically translates to a "page")
    $sec = $doc->addSection();

    // Add a single paragraph by appending a string to the section
    $sec[] = "Hello World";

    // Save document
    Writer\Word2007::saveDocument($doc, 'helloworld.docx');

Creating a docx file can’t get much simpler than that. Now let’s dissect that code a little bit.
First, we create a new Document object. This object is the core object that allows you to build up a document from different elements (e.g. Sections, Paragraphs, Images, Tables, etc.).
Next, the first thing you will usually do is start a new Section. In Word, a “section” usually translates to a physical page (though this is not always true). The Section is where you add all of the document elements that make up your content.
Next, we added some paragraph text to the Section by assigning a string to the $sec array. This is your first glimpse of how I’ve tried to make the library easy and flexible to use. In this case we just added a plain string to make a paragraph, but in more complex cases we could have added a new Text object with formatting properties (e.g. bold, colors). Each time you add an element to the Section array you’re actually adding a new block-level element (like a paragraph or table).
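If you are wondering how an object can accept array-style appends such as `$sec[] = ...`, PHP's built-in ArrayAccess interface is the usual mechanism. The following is only a minimal sketch of that technique, not the library's actual code: the class name SketchSection and the array shape used to represent a paragraph are hypothetical, and the type declarations assume PHP 8.0 or later rather than the 5.3 minimum mentioned above.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a Section; NOT the library's real class.
final class SketchSection implements ArrayAccess, Countable
{
    /** @var array<int|string, mixed> block-level elements in insertion order */
    private array $elements = [];

    public function offsetSet(mixed $offset, mixed $value): void
    {
        // Assumption for illustration: a plain string is promoted to a paragraph.
        $element = is_string($value)
            ? ['type' => 'paragraph', 'text' => $value]
            : $value;

        if ($offset === null) {
            // "$sec[] = ..." arrives here with a null offset: append.
            $this->elements[] = $element;
        } else {
            $this->elements[$offset] = $element;
        }
    }

    public function offsetGet(mixed $offset): mixed
    {
        return $this->elements[$offset] ?? null;
    }

    public function offsetExists(mixed $offset): bool
    {
        return isset($this->elements[$offset]);
    }

    public function offsetUnset(mixed $offset): void
    {
        unset($this->elements[$offset]);
    }

    public function count(): int
    {
        return count($this->elements);
    }
}

$sec = new SketchSection();
$sec[] = "Hello World";
$sec[] = "Second paragraph";

echo count($sec), "\n";      // prints 2
echo $sec[0]['text'], "\n";  // prints Hello World
```

The key detail is that PHP passes a null offset to offsetSet() when the empty-bracket append syntax is used, which is what lets the object decide how to store each new block.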
Finally, we save the Document to a file and that’s it! If you open that file in Word you’ll get a single-page document with the words “Hello World” in it. And here’s a neat trick: change the extension of a “docx” file to “zip” and you can open it as a normal ZIP file to see its various components.
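The ZIP trick also works from PHP without renaming anything. Here is a small sketch using the standard ZipArchive extension to list the parts inside a docx package. The function name listDocxParts is my own, and the usage at the bottom assumes a helloworld.docx file sits in the current directory.

```php
<?php
declare(strict_types=1);

/**
 * Return the names of all entries stored in a docx (ZIP) package.
 * listDocxParts is a hypothetical helper name, not part of any library.
 *
 * @return string[]
 */
function listDocxParts(string $path): array
{
    $zip = new ZipArchive();
    // open() returns true on success or an integer error code on failure
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path as a ZIP archive");
    }

    $names = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $names[] = $zip->getNameIndex($i);
    }
    $zip->close();

    return $names;
}

// Usage: only runs if the file from Example 1 is present.
if (is_file('helloworld.docx')) {
    print_r(listDocxParts('helloworld.docx'));
}
```

A Word document will typically contain entries such as [Content_Types].xml, _rels/.rels and word/document.xml, though the exact list depends on what the document uses.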
Example 2: Slightly more complex
Here’s a slightly more complex example that highlights a couple of the ways the API actually works (after all, most documents aren’t going to be built solely from plain strings).

    require __DIR__ . '/autoload.php';

    use PHPDOC\Document,
        PHPDOC\Document\Writer,
        PHPDOC\Element\Text,
        PHPDOC\Element\Paragraph,
        PHPDOC\Element\Table;

    // start a new document
    $doc = new Document();

    // start a new section (basically translates to a "page")
    $sec = $doc->addSection();

    // append a paragraph built from plain strings and formatted Text objects
    $sec[] = new Paragraph(array(
        "The ",
        new Text("quick", array('i' => true)),
        new Text("brown", array('color' => 'A52A2A')),
        "fox ",
        new Text("jumped", array('b' => true)),
        "over the ",
        new Text("lazy", array('u' => true)),
        "dog."
    ));

    Writer\Word2007::saveDocument($doc, 'helloworld2.docx');
In this example we created a paragraph with the phrase:
The quick brown fox jumped over the lazy dog.
It may seem like a lot of code to make a single sentence, but this just shows you the flexibility that the library has. It allows you to work at a very low or a very high level depending on your requirements. In the future I plan to add a feature that reads HTML tags and applies their styles to the text (work on that has not started yet).
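For the curious, here is roughly what those formatting keys end up as inside the docx. This is a sketch under my own assumptions, not the library's writer code: the function sketchRunXml is hypothetical, and the mapping of the 'b', 'i', 'u' and 'color' keys is inferred from the example above. The element names themselves (w:r, w:rPr, w:b, w:i, w:color, w:u, w:t) come from the WordprocessingML part of the OOXML standard. The w: namespace declaration is omitted because it lives on the document root.

```php
<?php
declare(strict_types=1);

/**
 * Build the XML for one text run with optional formatting.
 * Hypothetical helper for illustration only.
 */
function sketchRunXml(string $text, array $props = []): string
{
    $rPr = '';
    if (!empty($props['b'])) {
        $rPr .= '<w:b/>';                    // bold
    }
    if (!empty($props['i'])) {
        $rPr .= '<w:i/>';                    // italic
    }
    if (isset($props['color'])) {
        // hex RGB value without a leading "#"
        $color = htmlspecialchars((string) $props['color'], ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $rPr .= '<w:color w:val="' . $color . '"/>';
    }
    if (!empty($props['u'])) {
        $rPr .= '<w:u w:val="single"/>';     // single underline
    }

    $escaped = htmlspecialchars($text, ENT_XML1, 'UTF-8');

    return '<w:r>'
        . ($rPr !== '' ? '<w:rPr>' . $rPr . '</w:rPr>' : '')
        . '<w:t xml:space="preserve">' . $escaped . '</w:t>'
        . '</w:r>';
}

echo sketchRunXml('quick', array('i' => true)), "\n";
// prints <w:r><w:rPr><w:i/></w:rPr><w:t xml:space="preserve">quick</w:t></w:r>
```

The xml:space="preserve" attribute matters for strings like "The " in the example, since it tells Word to keep leading and trailing spaces in the run.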
Example 3: Tables
    require __DIR__ . '/autoload.php';

    use PHPDOC\Document,
        PHPDOC\Document\Writer,
        PHPDOC\Element\Image,
        PHPDOC\Element\Table;

    // start a new document
    $doc = new Document();

    // start a new section (basically translates to a "page")
    $sec = $doc->addSection();

    // append a table built with the fluent Table API
    $sec[] = Table::create()
        ->row()
            ->cell("R1C1")
            ->cell("R1C2")
            ->cell(new Image("http://php.net/images/php.gif"))
        ->row()
            ->cell("R2C1")
            ->cell("R2C2")
            ->table() // start nested table
                ->row()
                    ->cell("N1R1C1")
                ->row()
                    ->cell("N1R2C1")
            ->end()
        ->end();

    Writer\Word2007::saveDocument($doc, 'helloworld2.docx');

In this example we created a table that contains an image and even a nested table. The Table class makes it very easy to create complex table structures with minimal coding. It’s important to note that each “cell” accepts any type of “block”-level content, such as paragraphs or other tables.
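To show how a chained table()/end() pair can work in general, here is a minimal sketch of a fluent builder that keeps a reference to its parent. It is not the library's Table class: SketchTable and its toArray() method are hypothetical, cells only hold strings or nested tables, and the union type requires PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical fluent builder; NOT the library's real Table class.
final class SketchTable
{
    /** @var array<int, array<int, string|SketchTable>> rows, each a list of cells */
    private array $rows = [];

    private function __construct(private ?SketchTable $parent = null)
    {
    }

    public static function create(): self
    {
        return new self();
    }

    /** Start a new, empty row. */
    public function row(): self
    {
        $this->rows[] = [];
        return $this;
    }

    /** Add a cell to the current row (starting a row first if none exists). */
    public function cell(string|SketchTable $content): self
    {
        if ($this->rows === []) {
            $this->row();
        }
        $this->rows[array_key_last($this->rows)][] = $content;
        return $this;
    }

    /** Start a nested table in a new cell and return it for further chaining. */
    public function table(): self
    {
        $nested = new self($this);
        $this->cell($nested);
        return $nested;
    }

    /** Finish this table: return the parent, or this table if it is the outermost. */
    public function end(): self
    {
        return $this->parent ?? $this;
    }

    /** Flatten the structure into plain arrays for inspection. */
    public function toArray(): array
    {
        return array_map(
            fn (array $row) => array_map(
                fn ($cell) => $cell instanceof self ? $cell->toArray() : $cell,
                $row
            ),
            $this->rows
        );
    }
}

$table = SketchTable::create()
    ->row()->cell('R1C1')->cell('R1C2')
    ->row()->cell('R2C1')
        ->table()
            ->row()->cell('N1R1C1')
            ->row()->cell('N1R2C1')
        ->end()
    ->end();

print_r($table->toArray());
```

Because table() returns the nested builder and end() returns its parent, the whole structure can be written as one expression whose indentation mirrors the table layout.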
The Image in this example points to an external resource and when the document is saved it automatically pulls down the image and stores it in the document so you never have to worry about external dependencies. The image could also have been a local file or even a memory buffer.
As the library matures I’ll write up more articles and tutorials that will explain its various features. If you’re interested in contributing to the library please go to my GitHub page and contribute!