Back in the blogsphere
July 2nd, 2009
After a hiatus of some 19 months, I am glad to be back in the blogsphere. Check out what I’ve been doing at Go MD Web – feedback welcome.
The file formats in Microsoft Office 2007 for Word, Excel, and PowerPoint (.docx, .xlsx, and .pptx, respectively) are based on ZIP technology. Just change the file extension to .zip, open the file in WinZip or a similar program, and you expose its internals. And presto, you have XML files (broken into a set of files and folders) that can be edited in Notepad, by a batch script, or by any other means.
So, what does this mean for you, who need to efficiently manage your content in Word or other Office apps? You can edit, delete, add or verify information automatically for all files in a folder, for example, without opening individual documents.
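If you prefer a script to WinZip, here is a minimal sketch in Python that lists what is inside one of these files. It relies only on the standard `zipfile` module; the function name `list_package_parts` is my own, not anything Office defines.

```python
import zipfile


def list_package_parts(docx_path):
    """Return the names of the files stored inside an Office 2007 package.

    Works for .docx, .xlsx and .pptx alike, because all three are ZIP
    archives; there is no need to rename the file to .zip first.
    """
    with zipfile.ZipFile(docx_path) as package:
        return package.namelist()
```

Calling `list_package_parts("guide.docx")` (a hypothetical file name) returns the paths of the XML parts, which is usually the first thing you want to know before editing any of them.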
Here is an example (from the user guide for Author Max™):
<dc:title>Author Max™ Toolbar Pro</dc:title>
<dc:subject>User guide</dc:subject>
So if I wanted to change the metadata field subject (doc property subject in Wordspeak) for many documents, I could write a script that looked in the core.xml file for all Word documents and presto make the change. (Of course, it might be simpler just to use Author Max™ to enforce rules for document properties and styles in the first place.)
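Such a script might look roughly like the sketch below. It assumes the property lives in the `docProps/core.xml` part and is written as `<dc:subject>…</dc:subject>` with the `dc:` prefix, as in the example above; an empty `<dc:subject/>` element or a different prefix would be skipped. The function names are mine, and you should try it on copies of your documents first.

```python
import re
import zipfile
from pathlib import Path
from xml.sax.saxutils import escape

CORE_PART = "docProps/core.xml"  # assumed location of the document properties
SUBJECT_RE = re.compile(r"<dc:subject>.*?</dc:subject>", re.DOTALL)


def set_subject(docx_path, new_subject):
    """Rewrite dc:subject in one .docx file; return True if it was changed."""
    docx_path = Path(docx_path)
    tmp_path = docx_path.with_name(docx_path.name + ".tmp")
    replacement = "<dc:subject>%s</dc:subject>" % escape(new_subject)
    changed = False
    # A ZIP member cannot be edited in place, so copy every part to a new
    # archive and swap in the modified core.xml on the way.
    with zipfile.ZipFile(docx_path) as src, \
            zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == CORE_PART:
                text = data.decode("utf-8")
                text, count = SUBJECT_RE.subn(lambda m: replacement, text)
                changed = count > 0
                data = text.encode("utf-8")
            dst.writestr(item, data)
    if changed:
        tmp_path.replace(docx_path)  # overwrite the original document
    else:
        tmp_path.unlink()            # nothing to change, discard the copy
    return changed


def set_subject_in_folder(folder, new_subject):
    """Apply set_subject to every .docx in a folder; return changed names."""
    return [path.name for path in sorted(Path(folder).glob("*.docx"))
            if set_subject(path, new_subject)]
```

For example, `set_subject_in_folder("manuals", "Reference guide")` (folder and value are made up) would update every Word 2007 document in that folder without opening Word once.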
Here’s another example, also taken from the Author Max documentation.
< ...w:val="center"/>© 2007 Method M Ltd. (All rights reserved)…
So if I wanted to change the footer for many documents to be left aligned, and the copyright year to be 2008, all that would be needed is to write a script that looked in the footer3.xml file for all Word documents and presto make the change. (Of course, it might be simpler just to use Author Max™ to enforce rules for headers and footers in the first place.) Need help implementing this or other Word functionality? Hey, that’s what our e-mail is for (info at methodm dot com).
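The text change itself can be sketched in a few lines of Python. This works on the XML text of a footer part only; reading the part out of the .docx and writing it back is left out. It assumes the alignment is stored in a `<w:jc w:val="center"/>` element and that the copyright notice sits in a single run of text, which real documents do not guarantee. The function name is mine.

```python
import re

# Paragraph alignment in WordprocessingML: <w:jc w:val="center"/>
JC_CENTER_RE = re.compile(r'(<w:jc\b[^>]*\bw:val=")center(")')
COPYRIGHT_SIGN = "\u00a9"  # the (c) symbol, escaped to keep this file ASCII


def update_footer_xml(xml_text, old_year="2007", new_year="2008"):
    """Left-align centered paragraphs and bump the copyright year."""
    xml_text = JC_CENTER_RE.sub(r"\1left\2", xml_text)
    return xml_text.replace(COPYRIGHT_SIGN + " " + old_year,
                            COPYRIGHT_SIGN + " " + new_year)
```

Only the year that directly follows the copyright sign is touched, so other occurrences of 2007 in the footer stay as they are.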
Best wishes for clear, efficient and great writing!
Katriel
Tech writing forums regularly get hit with questions in the vein of “Word has corrupted my styles”. And the answers that come in are useful for some cases. Such as:
However, what to do when my files already have helter skelter formatting? For this very need we have included “fix styles” in our Author Max toolkit. Details here.
Be a victim of Word’s automatic formatting “features” no longer. Equip yourself with Author Max and fight back.
Katriel
A useful list of DITA sites (thanks to Bob Doyle):
Authors working in FrameMaker or Word have hard-coded links within topics to other topics using cross-references. When moving to DITA, authors often tend towards hard-coding links in topics, inserting cross references or using the element.
What’s wrong with hard-coded links?
They decrease reusability, they tend to break, they tend to get out of date, and they are high maintenance.
- Decreased reusability: hard-coded links may not make much sense when a topic is reused but if you hard-coded the links you’re stuck with them.
- They tend to break: if the target topic is renamed or moved, the link will break.
- They tend to get out of date: if a related topic is added, the author has to look through many topics, find the appropriate locations, and insert the appropriate link many times.
- High maintenance: see reasons 2 and 3 above.
Hard-coded links are not a good idea in FrameMaker or Word either, but when working in unstructured DTP tools you didn’t have much choice. In DITA you do, and you should use it. “Relationship tables” in DITA allow you to control linking from one place, for many topics, rather than hard-coding links within many topics.
It’s not often that this blog for power authors is able to offer relationship advice, but today we are. Use relationship tables and start improving your documents!
Best wishes,
Abby, … oops, I mean Katriel
If you will be using any fancy Word 2007 features (see the previous post for a listing), and sharing your Word files with users of Word 2003 or earlier versions, you should consider working in compatibility mode.
Compatibility mode ensures that content created in 2007 can be opened and edited in 2003. For example, when you choose “Insert SmartArt” in compatibility mode, the 2003 diagramming tool appears. (When not in compatibility mode, the content created by the 2007 SmartArt diagramming tool will not be fully editable in Word 2003.)
Congratulations, you have Microsoft Word 2007. Excellent choice. But you have to share files with less evolved colleagues still using Word 2003 (or, gasp, an even earlier version). The bad news is that while Microsoft has a free download that enables Office 2003 users to open Office 2007 files, you may experience some disruptions. Microsoft’s compatibility checker sometimes refers to these issues with a message that includes the phrase “you may experience some minor loss of fidelity”. Well, minor is subjective - so here is a listing of issues (from Microsoft TechNet) that you should be aware of.
The next post will describe how to use Compatibility Mode when writing/editing in Word 2007 to proactively avoid these issues.
| Name | Description | Compatibility Mode Behavior |
|---|---|---|
| Math | Equation building is new to Office Word 2007. | Equations are represented as non-editable images. These images are refreshed when the document is converted. The Equations UI is disabled in compatibility mode. |
| Themes | Themes are new to Office Word 2007. | Themes are permanently converted to styles. The Themes UI is disabled in compatibility mode. |
| Colors (Theme Chunk) | Themes are new to Office Word 2007. | Themes are permanently converted to styles. The Themes UI is disabled in compatibility mode. |
| Font (Theme Chunk) | Themes are new to Office Word 2007. | Themes are permanently converted to styles. The Themes UI is disabled in compatibility mode. |
| Effects (Theme Chunk) | Themes are new to Office Word 2007. | Themes are permanently converted to styles. The Themes UI is disabled in compatibility mode. |
| Content Controls | Content controls are new to Office Word 2007. | Content controls are permanently converted to static text. The Content Controls UI is disabled in compatibility mode. |
| Tracked Moves | Tracked moves are new to Office Word 2007. | Tracked moves are permanently converted to “Insert” and “Delete.” |
| Major/Minor Fonts | Major/minor fonts are new to Office Word 2007. | Major and minor fonts are permanently converted to static formatting. |
| Relative Text Boxes | The ability to set the position of a text box relative to some part of a document. Relative text boxes are new to Office Word 2007. | Relative positioning of text boxes is permanently converted to absolute positioning. |
| Margin Tabs | Margin tabs are new to Office Word 2007. | Margin tabs are permanently converted to absolutely defined tabs. |
| Bibliography | New to Office Word 2007. | Bibliographies are permanently converted to static text. |
| Citations | New to Office Word 2007. | Citations are permanently converted to static text. |
| Placeholder text | New to Office Word 2007. | Placeholder text is permanently converted to static text. |
| Office Art 2007 | Office Art engine is improved upon in the 2007 Office release. | All Office Art 2007 objects are converted to Office 97–2003 objects. These objects are refreshed when the document is converted. When a user selects SmartArt in Office Word 2007, the Diagram Gallery from Word 2003 appears. |
| SmartArt Diagram | Some diagrams are new to the 2007 Office release. | Diagrams in the 2007 Office release are converted to non-editable images. When the document is converted, these images are refreshed to 2007 Office release again. When a user selects SmartArt in Office Word 2007, the Diagram Gallery from Word 2003 appears. |
| Custom XML Data store | New to the Open XML Formats, custom-defined XML information can be stored as a separate component within the Open XML Formats, to help organizations include content from their own data sources, using their own languages. | The XML data store is removed during conversion, and XML data and content within XML bindings are permanently converted to text. |
| Vertical Text Box Alignment | Vertical text box alignment of center or bottom is new to Office Word 2007. | Vertical text box alignment of center or bottom is permanently converted to top vertical text box alignment. |
| Office Charts | Charts can now exist as native objects in Office Word 2007. | Office charts are converted to Excel OLE objects. These objects are refreshed when the document is converted back to 2007 full functionality mode. When a user selects Charts in Office Word 2007, the Diagram Gallery from Word 2003 appears. |
| ActiveX | ActiveX controls can be added to Word documents to deliver enhanced functionality. | Disabled ActiveX controls are converted to their image representation when saved to a downlevel file type or opened in a downlevel application version through the converter. |
Katriel
Advertising Age reports yesterday that 20 MINUTES is “the average amount of time a consumer spends trying to set up a device before giving up”. Figure out how many returns or support calls that generates for your company, and figure out how much those returns or support calls cost your company, and you have your business case for better documentation!
Katriel
IMHO, the Word 2007 ribbon interface is a big improvement. However, it takes time to get your bearings. (I’ve been using Word 2007 full-time for about 9 months, since the beta period, and I still find myself scratching my head trying to remember where to find a particular function that I could find in my sleep in earlier versions.)
So — you may want to check out the Get Started tab (shown below). Download from the Microsoft site.

You can also download a workbook from the Microsoft site that lists the locations of Word 2003 commands in Word 2007. Recommended!
Katriel
A.S., a loyal reader, writes, “I’m a bit hazy on the difference between XML and XSD”. Well, hopefully this post will clarify the issue for you.
The schema (XSD) describes what must be in the XML document. For example, it might say that every item must have one catalog number, and one name, but may have one or more sizes (e.g. 500 gram and 750 gram).
The XML document would list what’s in the catalog. For example:
100
Corn Flakes
500
700
200
Bran Flakes
500
750
1000
1250
In the above case, the schema (XSD) would declare the XML file invalid if it had no catalog number – or if it had 2 or more catalog numbers.
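To make the rule concrete, here is a small Python sketch. It is not real XSD validation (the Python standard library does not include an XSD validator); it is a hand-written check of the same rule, just to show what a schema would enforce. The element names `catalog`, `item`, `catalogNumber`, `name` and `size` are my assumptions for illustration; the values are the ones from the listing above.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for the catalog; the tag names are assumptions.
CATALOG_XML = """
<catalog>
  <item>
    <catalogNumber>100</catalogNumber>
    <name>Corn Flakes</name>
    <size>500</size>
    <size>700</size>
  </item>
  <item>
    <catalogNumber>200</catalogNumber>
    <name>Bran Flakes</name>
    <size>500</size>
    <size>750</size>
    <size>1000</size>
    <size>1250</size>
  </item>
</catalog>
"""


def find_problems(xml_text):
    """Return a list of rule violations; an empty list means 'valid'."""
    problems = []
    items = ET.fromstring(xml_text).findall("item")
    for index, item in enumerate(items, start=1):
        # Exactly one catalog number and exactly one name per item.
        for tag in ("catalogNumber", "name"):
            count = len(item.findall(tag))
            if count != 1:
                problems.append("item %d: expected exactly one <%s>, found %d"
                                % (index, tag, count))
        # One or more sizes per item.
        if not item.findall("size"):
            problems.append("item %d: expected at least one <size>" % index)
    return problems
```

`find_problems(CATALOG_XML)` returns an empty list, while a catalog whose item has two catalog numbers, or none, produces a message for that item. A real schema expresses the same constraints declaratively instead of in code.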
Katriel
BTW, DITA processors generally use DTDs rather than XSDs, but that’s another post.
Word 2007 creates, by default, “.docx” files rather than “.doc” files. If you need to share .docx files with users of earlier versions of Word, you can save them as .doc files. If you do not have Word 2007 but have received .docx files, there is no need to worry. Just download the compatibility pack from Microsoft, which allows older versions of Word to open .docx files.
When saving as .doc files Word will warn you about any features that are likely to be problematic. In my experience to date, Word has been conservative — warning about relatively minor problems.
Katriel
Avi, a loyal and critical reader, asks “I already have a suitable tool (RHX5), what could I possibly benefit from… ”
Well, RoboHelp is certainly a reputable tool. And, if it works for you, then remember the first rule from Engineering 101: “If it works, don’t fix it”.
But if you need to deliver content in multiple channels (PDF), if you need to tailor content for specific audiences, if you want to reuse content for different needs (implementation, training, user guide, troubleshooting, support, etc.), if you need to cut down on translation costs… then IMHO you should be thinking seriously about DITA.
Katriel
P.S. We have posted a new white paper: Find out why DITA matters and what’s in it for you.