A memo dated July 7, issued by the FBI and the National Counterterrorism Center, warns law enforcement and private security agencies about the practice of Google Dorking (or Google Hacking if you prefer) and what can be done about it.
The memo is rather plain, and at first glance its content makes one wonder why such a warning is needed at all. That is, until you start searching for documents in the government domain space. Then it all makes perfect sense.
The FBI's warning tells agencies that actors will use Google Hacking to "locate information that organizations may not have intended to be discoverable by the public or to find website vulnerabilities for use in subsequent cyber attacks."
It goes on to reference various operators that can be used on Google to narrow a search, including the filetype, site, inurl, and intext operators. Google makes a complete list of valid operators available online.
As an example, the memo highlights an incident in 2011, where attackers used Google Hacking to discover Social Security Numbers on a Yale University FTP server. Another incident singled out in the memo focused on the 35,000 websites that were compromised after attackers used Google to locate vulnerable vBulletin installations.
A quick search on Google shows that the memo makes a valid point, as many of the websites indexed in the government space offer a variety of documents available for public consumption.
However, from an attacker's perspective, the internal forms and documents - as well as the contact details on some of them - offer a way to fake legitimacy during a targeted attack. Many of the documents have an internal context, something that can be leveraged by attackers in order to get someone to open an attachment, follow a link, or share information.
Moreover, the documents themselves contain metadata.
Putting those documents into a tool like FOCA (Fingerprinting Organizations with Collected Archives) reveals additional details such as author names, email addresses, network naming conventions (including network shares), system paths (useful for mapping a network or system), software titles and version numbers, IP addresses, and operating system data.
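To make the metadata point concrete, here is a minimal sketch of how a website operator might check what author information a .docx file carries before publishing it. It is not how FOCA works internally; it only reads the standard `docProps/core.xml` part of an Office Open XML file using the Python standard library. The function names and the sample values ("jdoe", "asmith") are hypothetical, and the sample file is built in memory so the sketch is self-contained.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of Office Open XML files.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_core_properties(docx_bytes):
    """Return the author fields stored in a .docx file's docProps/core.xml."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    creator = root.find("dc:creator", NS)
    modifier = root.find("cp:lastModifiedBy", NS)
    return {
        "creator": creator.text if creator is not None else None,
        "last_modified_by": modifier.text if modifier is not None else None,
    }


def make_sample_docx():
    """Build a tiny stand-in archive holding only the metadata part (hypothetical values)."""
    core_xml = (
        '<cp:coreProperties xmlns:cp="' + NS["cp"] + '" xmlns:dc="' + NS["dc"] + '">'
        "<dc:creator>jdoe</dc:creator>"
        "<cp:lastModifiedBy>asmith</cp:lastModifiedBy>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", core_xml)
    return buffer.getvalue()


properties = read_core_properties(make_sample_docx())
print(properties)
```

Running a check like this over documents queued for publication shows which usernames would be exposed once the files are indexed.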
The search below offers an example of what the memo is talking about, but it will need to be tuned in order to discover some of the sensitive documents. In addition, readily available lists of search terms for website vulnerabilities can be found all over the Web, such as the list found here.
filetype:"xls | xlsx | doc | docx | ppt | pptx | pdf" site:gov "FOUO" | "NOFORN" | "Confidential"
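For operators who want to run this kind of check against their own site, the query can be assembled programmatically. The sketch below is only an illustration: the `build_dork` helper and the `example.gov` domain are hypothetical, and it writes each file type as its own `filetype:` term joined with `OR`, which is a different form from the quoted, pipe-separated list above.

```python
def build_dork(filetypes, site, phrases):
    """Compose a search query from file types, a site restriction, and exact phrases."""
    # One filetype: term per extension, joined with OR.
    type_clause = " OR ".join(f"filetype:{ext}" for ext in filetypes)
    # Each marking is quoted so it is matched as an exact phrase.
    phrase_clause = " OR ".join(f'"{phrase}"' for phrase in phrases)
    return f"({type_clause}) site:{site} ({phrase_clause})"


# Hypothetical self-audit: restrict the search to a domain you operate.
query = build_dork(["xls", "pdf"], "example.gov", ["FOUO", "NOFORN"])
print(query)
```

Restricting `site:` to a single domain keeps the results to documents the operator is responsible for removing.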
The memo recommends that website operators use robots.txt to prevent directories with sensitive information from being indexed, and encourages them to use Google Hacking themselves to discover files that are already publicly exposed. From there, the files can be removed from Google by following the search giant's guidelines.
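As a rough illustration of the robots.txt recommendation, the sketch below uses Python's standard `urllib.robotparser` module to confirm that a set of rules actually blocks a crawler from a sensitive directory. The directory names, the `example.gov` URLs, and the `is_crawlable` helper are assumptions made for the example, not details from the memo.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules asking all crawlers to stay out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal/
Disallow: /forms/drafts/
"""


def is_crawlable(robots_txt, url, agent="Googlebot"):
    """Return True if the given rules allow the agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


blocked = is_crawlable(ROBOTS_TXT, "https://example.gov/internal/roster.xls")
allowed = is_crawlable(ROBOTS_TXT, "https://example.gov/press/release.pdf")
print(blocked, allowed)
```

Keep in mind that robots.txt is itself publicly readable and is only a request to well-behaved crawlers. It does not restrict access to the files it lists.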
A full copy of the memo is available from Public Intelligence.