A study by Workshare, a company focused on secure file sharing applications, says that 68 percent of the 800 professionals surveyed failed to remove metadata before sharing documents.
Due to this oversight, potentially sensitive information was unknowingly leaked to the outside.
Moreover, 65 percent of those surveyed said it was their responsibility to ensure that sensitive company information is protected, yet only 32 percent confirmed that they actually take steps to remove metadata from files before publishing them, sharing them outside of the organization, or hosting them externally.
In this context, metadata is the embedded information included with PDF files, Microsoft Office documents, images, and similar files. Most organizations don't even know this embedded information exists, let alone know to protect it.
For example, in 2011, a 1.2 GB Torrent file published by someone representing Anonymous led many to believe the U.S. Chamber of Commerce and the American Legislative Exchange Council (ALEC) had suffered a data breach. They hadn't, but the wealth of information published at the time certainly made it look like one.
As it turns out, the information mistaken for a post-breach data flood was document-based metadata. The U.S. Chamber of Commerce file set, which included 194 Word documents, 724 PDF files, 59 PowerPoint files, and 12 Excel files, contained nearly 300 names, most of them representing network IDs.
Moreover, there were email addresses exposed, network naming conventions (including network shares), system paths, a sizable list of software titles and version numbers, IP addresses, and operating system data.
All of this information was discovered by a tool named FOCA (Fingerprinting Organizations with Collected Archives). It's such a big deal because the data leaked in those documents could help an attacker map out a plan of attack, whether via phishing or malware delivery.
Workshare has a commercial offering that will address the metadata problem. However, fixing this problem is something that can be done in-house, and the only cost associated with it is time.
While FOCA can be used by the bad guys, it can also be used internally to see what information is being inadvertently leaked to the outside. The best place to start is with all of the publicly available documents that are related to your organization.
These can be located on the company website, as well as on the websites maintained by business partners. Once those are checked, collecting the documents that exist on network shares and other places within the network could be a solid second step, as they could be shared with the outside depending on their purpose.
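To give a rough sense of what such an in-house check involves, here is a minimal Python sketch that lists the properties embedded in newer Office files (.docx, .xlsx, .pptx). It is not how FOCA works; it only relies on the fact that these files are ZIP packages that normally carry their properties in docProps/core.xml and docProps/app.xml. The function name read_ooxml_metadata is made up for this illustration, and PDFs, images, and legacy .doc files are not covered.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# Parts of an Office Open XML package that normally hold document properties.
PROPERTY_PARTS = ("docProps/core.xml", "docProps/app.xml")


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_ooxml_metadata(source):
    """Return {property name: value} for a .docx/.xlsx/.pptx path or file object.

    Hypothetical helper for illustration only. Only simple text properties
    are collected; nested entries are skipped.
    """
    found = {}
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
        for part in PROPERTY_PARTS:
            if part not in names:
                continue
            root = ET.fromstring(package.read(part))
            for element in root:
                text = (element.text or "").strip()
                if text:
                    found[local_name(element.tag)] = text
    return found


if __name__ == "__main__":
    # Usage: python check_metadata.py report.docx budget.xlsx ...
    for path in sys.argv[1:]:
        try:
            metadata = read_ooxml_metadata(path)
        except (zipfile.BadZipFile, ET.ParseError, OSError) as error:
            print(f"{path}: skipped ({error})")
            continue
        for name, value in sorted(metadata.items()):
            print(f"{path}: {name} = {value}")
```

Run against a folder of public documents, output such as creator, lastModifiedBy, Application, and AppVersion values gives a quick look at the kind of names and software details described above.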
After the checks, if metadata is in fact a problem, correcting it is a matter of process and habit.