Sanitize Office: How to remove personal metadata

Do you ever sanitize Microsoft Office products? Have you ever stopped to consider the potential privacy risk from personal metadata that is embedded in Microsoft Office products? Here’s how to remove it.

Yesterday Microsoft announced Office 365 for Government. Since "security and privacy play a big role in any decision to move to the cloud," the feds' version will reside on servers separate from those used by standard Office 365 users. As Google has done in the past with Data Liberation, the Microsoft Office 365 Trust Center offers customers the option to download "a copy of all of your data at any time and for any reason." The same "it's your data" portability page states, "To download a copy of end-user metadata (such as email address, first and last name, etc.), you can use Powershell cmdlets, including the Get-MsolUser Windows Powershell cmdlet. If you use Exchange Online, you can also utilize the Get-MailUser and Get-User Exchange Powershell commands."

While Get-User with its -Identity parameter is a handy optional command, at other times users do not stop to think about the potential privacy risk from personal metadata that is embedded in Microsoft Office files.

The NSA tackled the subject years ago with a document called Redacting with Confidence: How to Safely Publish Sanitized Reports Converted from Word to PDF [PDF]. Metadata has the potential for "exposing information unintentionally." The US Cyber Labs blog states, "According to Microsoft's knowledge base article on the Metadata, the best way to remove all personal metadata from a document is to go to Tools | Options | Security Tab | 'Remove personal information from this file on save'. Be warned that this does NOT remove hidden text and comment text that may have been added, but those tasks are also covered in that article. Microsoft also provides the Remove Hidden Data Tool that apparently accomplishes those same functions but from outside of Microsoft Office." If, however, you have a version more recent than Microsoft Office 2003 or XP, there's a different way to do it.

There are Group Policy Administrative Templates for Office 2010 where admins can tweak and customize Office files as well as the "privacy options because you have a special security environment," but there is no default privacy for regular folks.

Many people may have no reason to remove the identifying data, but you might if you want to publish something anonymously, or perhaps if you are sharing a document. Microsoft shows how to view the options and settings for Office 2007 and Office 2010, but here's how to protect your privacy and strip out the information that Microsoft automatically embeds in Office files.

First, the long "hard" way, so you can see that "privacy by default" does not seem to be part of Microsoft's plan.

If you have the 2010 version of Office, then under the File tab on Excel, PowerPoint or Word there is "Info." It shows Permissions, Prepare for Sharing, and Versions. On the right is Properties or Show More Properties. On the left there is "Options" listed directly beneath "Help." Clicking on Options opens another dialogue box to set specific options for whichever Office product you are using. Go to Trust Center. Then click on Trust Center Settings.

Next find "Privacy Options" and you will notice the checkbox for "Remove personal information from file properties on save" is grayed out under "Document-specific settings."

To activate it, you must run the "Document Inspector."

After running Document Inspector, you will be given the option to "Remove all" document properties and personal information.
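If you are comfortable with a little scripting, the same idea can be sketched outside of Office. Office 2007 and 2010 files (.docx, .xlsx, .pptx) are ZIP packages, and the author fields live in a part named docProps/core.xml. The Python sketch below blanks the creator and last-modified-by fields in a copy of the package. The function name is my own invention, it covers only those two fields, and it is not a replacement for Document Inspector: comments, tracked changes, hidden text and custom properties are left alone. Try it on copies, not originals.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core-properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}
# Keep the familiar prefixes when the XML is written back out.
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)

# Only these two fields are touched in this sketch.
PERSONAL_FIELDS = ("dc:creator", "cp:lastModifiedBy")


def strip_personal_properties(package: bytes) -> bytes:
    """Return a copy of the package with the author fields emptied.

    Hypothetical helper for illustration; every other part is copied unchanged.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            payload = src.read(name)
            if name == "docProps/core.xml":
                root = ET.fromstring(payload)
                for tag in PERSONAL_FIELDS:
                    for element in root.findall(tag, NS):
                        element.text = None  # keep the element, drop the name
                payload = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            dst.writestr(name, payload)
    return out.getvalue()
```

To use it, read a file's bytes, pass them to strip_personal_properties, and write the result to a new file name so the original stays intact.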

In Excel, after the metadata has been stripped, the result is listed under File > Info in the Prepare for Sharing section. It alerts users that properties and personal information were removed and offers the option to "allow this information to be saved in your file."

I showed the long way because I could not find a "privacy by default" setting that always strips the metadata from Office products like Word, Excel or PowerPoint unless you choose otherwise and want to save it. Luckily, there is a very easy way to do this: go to Info, then "Prepare for Sharing," and then Inspect Document.

If you want the data removed then, sadly, you must do so for each document, spreadsheet or presentation. If there is an option to assign privacy by default other than admin templates for group policy, then I didn't see it and Microsoft certainly didn't bother to reply to my emailed questions. After the post that started as an idea for a how-to and ended as "This is why people pirate Windows," I'm not sure why I even bothered to ask Microsoft PR. (By the way, that error "This copy of Windows is not genuine" has come back twice since that article.)
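Since the cleanup has to be repeated for every file, it can help to know which files still carry a name. The sketch below is one way to check a folder, assuming the files are Office 2007/2010 formats (ZIP packages with a docProps/core.xml part). The function name and the choice of fields are mine, and it only reports what it finds. It changes nothing.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespaces used by the core-properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}
OFFICE_SUFFIXES = {".docx", ".xlsx", ".pptx"}


def authors_in_folder(folder):
    """Map each Office file under `folder` to the names in its core properties.

    Hypothetical helper for illustration; files without names are left out.
    """
    report = {}
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in OFFICE_SUFFIXES:
            continue
        if not zipfile.is_zipfile(path):
            continue  # e.g. a renamed legacy or corrupt file
        with zipfile.ZipFile(path) as package:
            if "docProps/core.xml" not in package.namelist():
                continue
            root = ET.fromstring(package.read("docProps/core.xml"))
        names = {
            (root.findtext(tag, default="", namespaces=NS) or "").strip()
            for tag in ("dc:creator", "cp:lastModifiedBy")
        }
        names.discard("")
        if names:
            report[str(path)] = sorted(names)
    return report
```

Calling authors_in_folder on a directory returns a dictionary of file paths and the names found in each, which gives you a to-do list for Document Inspector.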

One last little note about "properties" that you may find handy for sorting fact from fiction. Suppose you are waiting on another person who keeps saying he or she is working on a document, yet time is rolling on. When you finally get it back, you can check the properties to see how long that person actually worked on it.

That's in theory, at least, since some people keep numerous documents open without working on them, and that still adds to "total editing time." It is a property that is not removed even after stripping out metadata.
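For the curious, the editing time can also be read without opening Office. In the Office 2007/2010 file formats it is stored as a TotalTime element, in minutes, in a part named docProps/app.xml. The sketch below reads it. The function name is my own, and it returns None when the value is absent, which I would expect for some files (not every application writes it).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the extended-properties part (docProps/app.xml).
EXT_NS = {
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}


def total_editing_minutes(package: bytes):
    """Return the recorded total editing time in minutes, or None if absent.

    Hypothetical helper for illustration.
    """
    with zipfile.ZipFile(io.BytesIO(package)) as z:
        if "docProps/app.xml" not in z.namelist():
            return None
        root = ET.fromstring(z.read("docProps/app.xml"))
    text = root.findtext("ep:TotalTime", namespaces=EXT_NS)
    if text is None or not text.strip().isdigit():
        return None
    return int(text)
```

Keep the caveat above in mind: the number counts time the file was open, not time someone was actually typing.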

There are all kinds of reasons, security and privacy, to strip out personal metadata embedded in photos. I don't know if you would ever want to strip out the metadata in Office products, but even the NSA pointed out that people often forget the "sanitizing" step before publishing.

Follow me on Twitter @PrivacyFanatic
