Where did that data come from? And do you want fries with it?

Factors like participant compensation may taint results, says Ben Rothke

He uses statistics as a drunken man uses lamp-posts – for support rather than illumination — Andrew Lang (1844-1912)

One is hard pressed to go a day in the IT world without encountering an article, white paper or PowerPoint deck that uses a metric, poll, statistic or some other number to make a point.


But have you ever considered, in the spirit of Juvenal's "quis custodiet ipsos custodes" – who guards the guards – where that data really came from?

Data from Gartner, Forrester and J.D. Power is often accepted as the final word and used to set corporate security strategy and direction. For example, when Forrester claims its results are based on a survey of 500 CISOs, how did it ensure the respondents are legitimate CISOs? Just how did it get that data? These are just a few of the validation questions that should be asked before acting on the data.

The reason to treat such results with some hesitation is that Forrester, like many other firms, uses paid surveys. While it is unclear what percentage of its respondents come from paid panels, the fact that they are used at all may taint the results.

A paid survey is a type of statistical survey in which the participant is rewarded through an incentive program – generally entry into a sweepstakes or a small cash reward – for completing the survey. Traditional surveys, such as Gallup or Quinnipiac University polls, are usually unpaid.

Opinion Miles Club and e-Rewards are two of many paid-survey companies with large databases of eager participants. They offer rewards for completing surveys – be it in the form of airline miles, gift cards and the like. Research, marketing and analyst firms use them because they provide a huge pool of people willing to answer surveys.

It seems like a legitimate barter: you give them the information they want, and they give you a reward. There are problems with that model, and this is not the place to detail all of them. But two of the most significant are that incentives may compromise the data, and, more importantly, that the incentive is far too low for the caliber of respondent required.


What this means is that significant technology questions are being answered at a minimum-wage rate, even though qualifying to answer them requires being a senior technologist.

Let me give you an example from a recent survey from e-Rewards.

Every survey has a qualification section. Before one can participate, the polling firm needs to ensure that the respondent is qualified. In this one, the first qualifier was the respondent's job title. The selection criteria offered 12 choices, ranging from the senior-most business leader (owner, president, C-level executive) and VP in IT, down to student, retired or not currently employed.


For this one, the respondent stated that they were at the CIO level. At every step, the survey can either end, with a screen stating the respondent is not qualified, or move on to the next question if they meet the criteria.

The next screen asked for the size of the firm, with 12 selections ranging from a few employees to over 25,000. The respondent clicked the over-25,000 button.

The next screen inquired what sector the respondent works in. The respondent selected financial services.

Next, the survey asked what types of cloud providers the respondent's organization uses. The selections were IaaS, PaaS, SaaS, none, and don't know. The respondent selected all three cloud types.

Based on those answers, the respondent is a CIO of a financial services firm with over 25,000 employees that uses IaaS, PaaS and SaaS technologies. While the survey didn't ask for the person's salary, it is safe to estimate that such a person would make at least $200,000 annually.

Based on those qualifiers, the person is given the option to take the survey.
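The screening flow just described can be sketched in a few lines of Python. The field names and accepted values here are my own illustration, not e-Rewards' actual implementation; the point is that every check rests on a self-reported answer, with nothing independently verified:

```python
# Hypothetical sketch of the qualification screener described above.
# Field names and values are illustrative, not e-Rewards' actual code.
# Note: every check trusts the respondent's self-reported answer.
def screen(answers: dict) -> bool:
    """Return True if the respondent 'qualifies' for the survey."""
    return all([
        answers.get("job_title") == "CIO",
        answers.get("company_size") == "over 25,000",
        answers.get("sector") == "financial services",
        # At least one of IaaS/PaaS/SaaS selected (not "none"/"don't know")
        bool(set(answers.get("cloud_types", [])) & {"IaaS", "PaaS", "SaaS"}),
    ])

respondent = {
    "job_title": "CIO",
    "company_size": "over 25,000",
    "sector": "financial services",
    "cloud_types": ["IaaS", "PaaS", "SaaS"],
}
```

Anyone who knows which answers the screener wants can simply produce them, which is precisely the validation problem raised above.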

The next screen states that if the person wants to continue, the survey will take about 10 minutes, and the respondent will get $4.00 in e-Rewards currency (note that this is not real money) if they complete it, or 25 cents in e-Rewards currency if they don't. With that, they are being paid the equivalent of $24 per hour, in theory. Note that e-Rewards clearly states that "e-Rewards Currency has no cash value."
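As a quick back-of-the-envelope check of that theoretical rate, using only the figures quoted above:

```python
# Effective hourly rate implied by the incentive: $4.00 in
# e-Rewards currency (not cash) for a roughly 10-minute survey.
reward_currency = 4.00  # e-Rewards currency, no cash value
survey_minutes = 10
hourly_rate = reward_currency * (60 / survey_minutes)
# hourly_rate == 24.0, the theoretical $24 per hour
```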

Given that the survey firm has "rewards" in its name, it is clear that it is all about the incentive. Once someone has answered enough surveys, they can convert their e-Rewards currency into rewards.

Using gift cards as an example, the currency is worth only about one-third of the value of a gift card: obtaining a $50 Starbucks card requires $145 in earnings.


Using this survey as an example, one would need roughly 6.25 hours of survey time to obtain a $50 gift card. That works out to $8.00 per hour, just above the U.S. minimum wage of $7.25.
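The same arithmetic, worked through in Python with the figures quoted above (the exact division lands slightly under the rounded "roughly 6.25 hours"):

```python
# Converting e-Rewards currency into a $50 Starbucks card:
# the card costs $145 in currency, earned at roughly $24/hour.
card_value = 50.00         # gift card face value, dollars
currency_cost = 145.00     # e-Rewards currency required
currency_per_hour = 24.00  # from the $4-per-10-minute survey
hours_needed = currency_cost / currency_per_hour  # ~6.0 hours
real_hourly_rate = card_value / hours_needed      # ~$8.28/hour
```

Either way it is rounded, the respondent's real compensation lands near the U.S. minimum wage.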

So in essence, the information shaping your cloud strategy is being provided by an executive willing to answer a survey for the equivalent of $8.00 per hour.

When it comes to airline miles, the reward is even smaller. In one survey on the Opinion Miles Club, the respondent gets 60 miles for 20 minutes, or 180 frequent-flyer miles per hour. Since most airlines require 25,000 miles for a domestic ticket, that would require roughly 138 hours of surveys.

According to the U.S. Department of Transportation's Bureau of Transportation Statistics, the U.S. domestic average itinerary fare in Q1 2013 was $379.00. A $379 ticket at 138 hours of surveys works out to $2.74 per hour.
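A back-of-the-envelope check of the frequent-flyer math, again using only the figures quoted above:

```python
# Opinion Miles Club example: 60 miles per 20-minute survey,
# 25,000 miles for a domestic ticket averaging $379 (BTS, Q1 2013).
miles_per_survey = 60
minutes_per_survey = 20
miles_per_hour = miles_per_survey * (60 / minutes_per_survey)  # 180.0
hours_for_ticket = 25_000 / miles_per_hour                     # ~138.9
effective_rate = 379.00 / hours_for_ticket                     # ~$2.73/hour
```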

Suggestions and Conclusion

Does it make sense that the CIO of a large financial services firm would spend hours completing surveys in order to get a Starbucks or iTunes gift card? Such a CIO likely has such items thrown at them by vendors trying to get on their calendar.

Metadata is data about data. When it comes to surveys, find out the data about the data: How exactly did they obtain it? How did they validate that the respondents were qualified to answer the questions?

This is not to say that all data is bad or that every survey should be discarded. But you can't take the results at face value.


So how do you get strategic answers you can rely on and act on? One solution is to join a group where you know the people you are dealing with, rather than simply relying on blind data. Practitioner-based IT research services such as Wisegate (of which I am a member) and IANS offer the ability to interact directly with your peers. Networking at industry conferences is another method.

Of course, my suggestion of the above two groups is itself a form of selection bias. And as Pete Lindstrom of Spire Security observed, "in my experience, every concern with numerical surveys has a similar problem in the qualitative environment. The cool thing about numbers is that it is very easy to see the biases and problems. I consider that a plus, not a minus. In qualitative world, it all gets masked, even though there is a very long list of these psychological scenarios."

Getting good answers to hard technology questions is not easy. If you find a better way, let me know.

Ben Rothke CISSP, CISA (@benrothke) is an information security manager and the author of Computer Security: 20 Things Every Employee Should Know (McGraw-Hill).
