Facebook will likely reach one billion users this year or next.
The privacy and security implications of this astonishing amassing of personal information are mind-boggling.
Imagine having access to the political views, sexual preferences, relationships, tastes, foibles, emotional states, workplace attitudes, etc. of a billion people.
An effort to collect such data on behalf of a government, a corporation, a geopolitical alliance, an industrial sector, or even a seemingly benign world organization would meet with fierce opposition. It would be difficult if not impossible; it would require lawyers, money and, yes, maybe even guns.
But in the era of social media, an extraordinary and rapidly growing number of us have been willingly posting such sensitive information (or at least the keys to unlocking it) online, where it is accessible either directly or indirectly to marketers, stalkers, reporters, law enforcement, private investigators, human resource personnel, and rivals in love, business or politics, whether by subterfuge or inference or subpoena, whether legally or illegally, whether ethically or unethically.
It is all out there now, and not just scattered across cyberspace in fragments; no, it has been happily, willfully offered up in an organized way.
Consider for example the Facebook profile photo.
No matter how tightly you zip up your Facebook account, people whom you have not "friended" are going to come across your profile photo. And isn't that the point for most of us, not just to share status updates, photos, videos, "likes" and comments with our current circle of friends and colleagues, but to expand that circle?
Indeed. But what if a stranger on the street could snap a smartphone photo of you, run it against Facebook profile photos, and learn not only your name, your date of birth, your circle of friends and other such data, but then use some of that data to "guess" your Social Security number? With that Social Security number, of course, the stranger could gain access to the most sensitive details of your financial and medical information.
Well, it is possible, as The Economist (which broke this story) recounts:
"By mining public sources, including Facebook profiles and government databases, the researchers could identify at least one personal interest of each student and, in a few cases, the first five digits of a social security number. All this helps to explain concerns over the use of face-recognition software by the likes of Google and Facebook, which have been acquiring firms that specialise in that technology, or licensing software from them. (Google recently snapped up Pittsburgh Pattern Recognition, the firm which owns the programme the researchers used for their tests.) Privacy officials in Europe have said they will scrutinise Facebook's use of face-recognition software to help people 'tag', or identify, friends in photos they upload. And privacy campaigners in America have made a formal complaint to regulators. (Facebook notes that people can opt out of the photo-tagging service by altering their privacy settings.)" The Economist, 7-28-11
Yes, Alessandro Acquisti and Ralph Gross, the two Carnegie Mellon University researchers who rocked the world a couple of years ago with their blockbuster study proving that Social Security numbers could be guessed, have done it again.
The experiment that yielded the information on study subjects' Social Security numbers was the third of three experiments.
The first experiment was about online-to-online re-identification. The researchers showed that they could take unidentified profiles from a popular dating site (where people use pseudonyms to protect privacy) and compare them, using face recognition, to identified profiles from Facebook. They did this without even logging onto the network itself; they simply used the parts of a Facebook profile that are visible via a search engine. They ended up re-identifying a significant proportion of the dating site's users.
The second experiment was about offline-to-online re-identification. It was conceptually similar to the first experiment, but here the researchers tried to re-identify students on the Carnegie Mellon University campus after taking three shots of each with a cheap webcam. More than 32% of the students were identified, and identification took three seconds on average.
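For readers who want a feel for the matching step common to these two experiments, here is a minimal sketch in Python. It is not the researchers' software, and nothing in it comes from their study: it assumes each face has already been reduced to a short list of numbers by some face recognizer, and the names, vectors, the 0.9 threshold and the function names `cosine_similarity` and `best_match` are all made up for illustration. The idea is simply to compare an unidentified face against a gallery of identified ones and report the closest match only if it is close enough.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two feature vectors: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_match(probe, gallery, threshold=0.9):
    """Return (name, score) for the most similar identified face.

    The name is None when even the best candidate scores below the
    threshold, i.e. the probe is left unidentified.
    """
    best_name, best_score = None, -1.0
    for name, vector in gallery.items():
        score = cosine_similarity(probe, vector)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score

# Hypothetical "identified" gallery (toy vectors standing in for profile photos).
gallery = {
    "Alice Example": [0.9, 0.1, 0.3],
    "Bob Example": [0.2, 0.8, 0.5],
}

# Hypothetical "unidentified" faces (toy vectors standing in for webcam shots).
probe_known = [0.88, 0.12, 0.31]   # close to Alice's gallery vector
probe_unknown = [0.1, 0.1, 0.9]    # not close to anyone in the gallery

print(best_match(probe_known, gallery))
print(best_match(probe_unknown, gallery))
```

With these toy numbers the first probe is matched to "Alice Example" and the second is left unidentified. The threshold is the design choice that matters: set it too low and strangers get mislabeled, set it too high and fewer people are identified.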
Acquisti, a co-author of the study, will present its results at Black Hat Briefings (where else?) on August 4, 2011. Acquisti is a colleague of mine at Carnegie Mellon University CyLab, an advanced academic research program exploring 21st Century cyber security and privacy. He is also an associate professor of information technology and public policy at Carnegie Mellon's Heinz College.
We recently sat down to conduct this interview for the readers of CSO:
Tell us about the germination of this study. How did it come about? What are the intersecting trends that drew you and your co-researchers to look into this issue?
Alessandro Acquisti: We actually started thinking about this study six years ago. Ralph Gross and I had written what turned out to be the first peer-reviewed published article about Facebook ("Information Revelation and Privacy in Online Social Networks," Proceedings of the 2005 Workshop on Privacy in the Electronic Society). Facebook was very young at the time, but its members were already revealing lots of personal information, and in particular identified primary profile photos. We thought that this could lead to visual re-identification, but we only started serious work on this idea after completing another one of our studies, the one about predicting Social Security numbers from public data, including, in fact, Web 2.0 profiles.
In your presentation, you mention that FB is perhaps evolving into a default "Real ID." Explain.
Acquisti: Facebook users tend to create profiles under their real first and last names. This is due to a combination of reasons. First, when Facebook started, it was a campus-based social network, where members felt they shared something in common. Therefore members felt more comfortable using their real identities, as compared to behavior on MySpace or Friendster at the time. Of course, that Facebook "community" is in reality very much open - we call it an "imagined" community in one of our papers. Second, as Facebook expanded outside college networks, it realized that forcing a "verified identity" policy was good business - it meant better data on members and consumers. As a result, according to some of our estimates, about 90% of Facebook users use their real identities on the network. If you combine this fact with another, i.e., that the vast majority also use frontal face photos of themselves as their primary profile photos (which, by the way, Facebook makes visible to all by default), you end up with the concept of a de facto Real ID.
Tell us what you mean by "Augmented Reality" and what this research shows us about its consequences and implications?
Acquisti: We use the term, "augmented reality," in an expanded sense, to refer to the merging of online and offline data that new technologies make possible. When I can recognize your face in the street, using a face recognizer, and also find your Facebook profile that way, I can not only identify you, but also infer additional sensitive information about you (such as, in our third experiment, your Social Security number). Effectively, we start from an anonymous face in the street, and we end up with very sensitive information about that person. This is the kind of future we are walking into whether we like it or not, and the future consequences and implications of this seamless blending of online and offline data are anybody's guess.
You also mentioned scalability issues in your presentation. In what ways may scalability impact this trend?
Acquisti: As of today, automated face recognition is still pretty bad, but it keeps improving. If you look at the technological trends in cloud computing, the accuracy of face recognizers, and online self-disclosures, it is hard not to conclude that what we present today as a proof-of-concept in our study will tomorrow become as common as everyday text-based searches on a search engine.
What are the immediate or near term implications of this study for users of Facebook and social media both personally and professionally? And likewise, what are the immediate or near term implications for organizations in regard to their workforce? What do governments and advocacy groups need to get their minds around in regard to these technological capabilities?
Acquisti: There is no obvious answer or solution to the privacy concerns raised by widely available face recognition. Google's Eric Schmidt observed that, in the future, young individuals may be inclined to change their names to disown youthful improprieties. It is much harder, however, to change someone's face. Other than adapting to a world where every stranger in the street could quite accurately predict your credit score and sexual orientation, we need to think about policy solutions that can balance the benefits and risks of peer-based face recognition. Self-regulation is not going to work.
Richard Power is a Distinguished Fellow at Carnegie Mellon University CyLab. He writes and speaks on security, risk and sustainability issues. Power is the author of seven books, and has conducted executive briefings and led security training in forty countries. He also writes a frequent column for CSO Magazine, and serves on its Technical Advisory Board.