To those who know where to look, Microsoft Word documents can harbor all sorts of compromising and embarrassing information, quite aside from poor grammar and stilted language.
Unwanted content can hide in numerous places and go along for the ride when you exchange files. Microsoft was so concerned about it that it gave us a simple way of cleansing files of that stuff, although you have to do it manually. I’ll get to that process in a bit.
The cargo of information can include mildly undesirable things like the document version, your name, your computer’s name, your hard drive’s name and where you stored the document, but it can also include snide comments, internal discussions and reviews, and graphics you’re not allowed to share.
Note that here I’m talking about Microsoft’s Big Three: Word, PowerPoint and Excel. Less-used Microsoft Office applications like Access and Publisher rarely cause information leakage.
One particularly big hiding place is in comments and revisions. Many documents are circulated for review and comment. When the final edits are done, the comments appear to be gone, but often they are only hidden from view. This gives document recipients the ability to see what your staff had to say before the document was released. Pre-release commentary is often direct, brutal and potentially embarrassing, so it should be eliminated before shipment.
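For readers comfortable with a little scripting, the point about comments that are hidden rather than gone can be checked directly. The following is a minimal sketch, not an official Microsoft tool. It assumes the document is saved in the newer XML-based .docx format (not the older binary .doc), where the file is a ZIP archive, comments are stored in word/comments.xml, and tracked insertions and deletions appear as ins and del elements in word/document.xml. The function name find_review_residue is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by .docx files
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def find_review_residue(docx_file):
    """Count comments and tracked changes left in a .docx file.

    Hypothetical helper for illustration; docx_file may be a path
    or a file-like object.
    """
    counts = {"comments": 0, "insertions": 0, "deletions": 0}
    with zipfile.ZipFile(docx_file) as z:
        names = set(z.namelist())
        # Comments live in their own part, if the document has any
        if "word/comments.xml" in names:
            root = ET.fromstring(z.read("word/comments.xml"))
            counts["comments"] = len(root.findall(f"{{{W_NS}}}comment"))
        # Tracked changes are marked inline in the main document body
        if "word/document.xml" in names:
            root = ET.fromstring(z.read("word/document.xml"))
            counts["insertions"] = sum(1 for _ in root.iter(f"{{{W_NS}}}ins"))
            counts["deletions"] = sum(1 for _ in root.iter(f"{{{W_NS}}}del"))
    return counts
```

Calling find_review_residue("report.docx") on a hypothetical file returns a small dictionary of counts. Any nonzero number means the review history is still traveling with the document.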
Content also can hide in headers and footers in Word and Excel documents. This is particularly problematic in Excel, because headers and footers don’t show up on-screen, but they do when printed. If you reuse home-grown templates for oft-used documentation, it’s easy to forget to change the headers and footers. Just to be safe, it’s best to enter that information into the fields.
Excel has some other traps. Big spreadsheets beg to have certain columns hidden when you don’t need them, and those columns could contain information you’d rather the recipient not see. Indeed, Excel can hide entire worksheets. It’s easy to forget that there is concealed information when you ship the document, and it’s only later that you realize the cat has scampered out of the bag.
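Hidden worksheets and columns can also be spotted with a short script. This is a rough sketch under stated assumptions: the workbook is in the XML-based .xlsx format, hidden sheets carry a state attribute of "hidden" or "veryHidden" in xl/workbook.xml, and hidden columns are flagged on col elements in each worksheet part. The function name find_hidden_parts is hypothetical, and the sketch reports worksheet part names rather than mapping them back to tab names.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML namespace used by .xlsx files
S_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def find_hidden_parts(xlsx_file):
    """Return (hidden sheet names, hidden column ranges per worksheet part).

    Hypothetical helper for illustration; xlsx_file may be a path
    or a file-like object.
    """
    hidden_sheets = []
    hidden_cols = {}
    with zipfile.ZipFile(xlsx_file) as z:
        # The workbook part lists every sheet and its visibility state
        workbook = ET.fromstring(z.read("xl/workbook.xml"))
        for sheet in workbook.iter(f"{{{S_NS}}}sheet"):
            if sheet.get("state") in ("hidden", "veryHidden"):
                hidden_sheets.append(sheet.get("name"))
        # Each worksheet part records hidden columns as min/max ranges
        for name in sorted(z.namelist()):
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                ws = ET.fromstring(z.read(name))
                ranges = [
                    (int(col.get("min")), int(col.get("max")))
                    for col in ws.iter(f"{{{S_NS}}}col")
                    if col.get("hidden") in ("1", "true")
                ]
                if ranges:
                    hidden_cols[name] = ranges
    return hidden_sheets, hidden_cols
```

Column ranges are numeric, so (2, 3) means columns B through C. Hidden rows are not checked here and would need a similar pass over the row elements.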
In Word, text can actually be buried in plain sight and not give any outward sign that it still exists. This often happens when “hidden” is applied to text as a font format. This is sometimes done in Word when a document has multiple purposes and some text isn’t used in both versions. If you forget and leave some of the text hidden, the recipient can reveal it after receiving the document, which is why security authorities recommend against using hidden text at all.
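Text formatted as hidden can be listed with a few lines of script. The sketch below assumes a .docx file, where hidden formatting applied directly to text shows up as a vanish element in a run's properties. It will not catch text hidden through a style rather than direct formatting, and the function name find_hidden_text is invented for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by .docx files
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def find_hidden_text(docx_file):
    """Return the text of runs directly formatted as hidden.

    Hypothetical helper for illustration; docx_file may be a path
    or a file-like object.
    """
    with zipfile.ZipFile(docx_file) as z:
        root = ET.fromstring(z.read("word/document.xml"))

    hidden = []
    for run in root.iter(f"{{{W_NS}}}r"):
        vanish = run.find(f"{{{W_NS}}}rPr/{{{W_NS}}}vanish")
        if vanish is None:
            continue
        # An explicit off value means the run is NOT hidden
        if vanish.get(f"{{{W_NS}}}val") in ("0", "false", "off"):
            continue
        text = "".join(t.text or "" for t in run.iter(f"{{{W_NS}}}t"))
        if text:
            hidden.append(text)
    return hidden
```

An empty list from find_hidden_text("proposal.docx"), a hypothetical file name, means no directly hidden runs were found in the main document body. Headers, footers and footnotes live in separate parts and are not scanned by this sketch.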
PowerPoint has its own special “gotchas,” where unwanted things can persist. Text and other content can disappear under other elements on a PowerPoint slide as you’re moving them around. And the area just off to the side of slides can accumulate content you meant to use, but didn’t.
As any PowerPoint user knows, finding small things underneath bigger things can be a chore. The document’s recipient has little better chance of finding forgotten graphs or text boxes than you do, but there’s always that chance, and if you reuse slide decks from job to job, it’s possible to leave something catastrophic underneath today’s items.
One final place where content can fly under the radar is as custom XML. Over several releases, Microsoft has moved its documents from the old binary formats, which only hackers could penetrate, to open XML-based formats that anybody can open and change; these became the default in Office 2007. Most people never bother, but it’s entirely possible to hide lots of content in there that doesn’t show up in the Office interfaces.
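Because the XML-based Office files are ordinary ZIP archives, custom XML is easy to look for. This is a minimal sketch that assumes custom XML parts sit under a customXml/ folder inside the package, which is where Office normally puts them. The function name list_custom_xml is hypothetical.

```python
import zipfile


def list_custom_xml(office_file):
    """Return (part name, size in bytes) for each custom XML part.

    Hypothetical helper for illustration; works on .docx, .xlsx or
    .pptx files given as a path or a file-like object.
    """
    with zipfile.ZipFile(office_file) as z:
        return [
            (info.filename, info.file_size)
            for info in z.infolist()
            if info.filename.startswith("customXml/")
        ]
```

An empty list means no custom XML parts were found. Anything that does show up can be opened in a text editor to see what it contains.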
Fortunately, Microsoft has taken pity on us and provided a way to automatically detect most of these leaks. It’s called “Document Inspector,” and it has shipped with the product since Office 2007 was released. It applies to all of the Big Three. In Office 2007, it’s accessed through the “Prepare” selection in the Office Button menu.
To use it, simply click “Inspect Document.” The Document Inspector opens and scans the document, giving you a dialogue box with its findings. It doesn’t tell you everything that’s in the sections it finds, only that it found content there.
For example, if it finds comments, it will merely tell you that there are comments, not what those comments are. If Document Inspector finds anything, it will give you a button next to that section in the dialogue box that will delete all the material.
Use the “Remove All” button carefully, because Document Inspector can’t distinguish between content you want to keep and content you don’t. And unfortunately, it can’t find small things under bigger things in PowerPoint. There, you’re on your own.
Altom is an independent local technology consultant. His column appears every other week. He can be reached at firstname.lastname@example.org.