Contributing writer

How to protect chatbot data and user privacy

Sep 26, 2017 | 7 mins
Data and Information Security, Privacy, Security

Employees and customers often enter sensitive information during chatbot sessions, but you can minimize chatbot security and privacy risks.


Are chatbots your next big data vulnerability? Yes, chatbots, those little add-ons to Slack and other messaging apps that answer basic HR questions, conduct company-wide polls, or get information from customers before connecting them to a person, pose a security risk. 

Because of the way we buy bots, Rob May, CEO of chatbot vendor Talla, says the IT industry is heading toward a data security crisis. “In the early days of SaaS [software as a service],” he explains, software “was sold as, ‘Hey, marketing department, guess what? IT doesn’t have to sign off, you just need a web browser,’ and IT thought that was fine until one day your whole company was SaaS.” Suddenly, critical operations were managed by platforms bought without any user or data management best practices in place. To head off similar data vulnerability from chatbots, May recommends streamlining bot purchasing and implementation now.

Unfortunately, employees might already be using chatbots to share salary information, health insurance details, and similar data. So what steps can IT take now to keep that data safe? How do you stop this vulnerability before it starts? What other questions should you be asking?

Understand how chatbots will be used

Start by triaging the current situation, says Priya Dodwad, a developer at computer and network security provider Rapid7. Then, before you build or buy anything else, interview users. This helps in two ways: First, user responses show whether the chatbots you’re considering will be used as planned. This improves user adoption and productivity. Interviewing also helps assess threat level: You can better prepare for chatbot privacy concerns when you know the type of data you’re protecting.

When Rapid7 considers a new chatbot, Dodwad says, “We start off thinking, ‘Okay, what is the information that’s going to be with it? Is it going to be PII [personally identifiable information] data or data that’s confidential or revenue related?’ Those bots concern us the most.” Rapid7 runs bots that do something non-critical—like paste gifs into Slack chat—through a less stringent process.

The problem with this, though, is that sometimes people can chat about serious stuff while using a frivolous tool. Jim O’Neill, former CIO at Hubspot, says, “Learn that your humans will volunteer data.” Using a gif bot, for example, one employee might send another a funny get-well message. Next thing you know, they’re discussing the latter’s cancer diagnosis. “If you think about conversational interactions with bots, we’re naturally going to be giving up more information than we intend to,” he continues.

To do their jobs, chatbots need to ask questions. The data they collect helps them assess the situation and train their models. O’Neill says, “As the bots ask more—because they’re trying to be helpful and learn more—sensitive data will just naturally get in there.” For example, think about a bot that routes health insurance customers to the appropriate department for help. First, it asks for the customer’s claim number, but then the user types, “It’s 4562 and I need to know if STD tests are covered because I gotta do something about this rash.”
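
One way to limit what a message like that leaves behind is to store only the field the bot asked for. The Python sketch below is illustrative rather than any vendor's implementation: the function names are hypothetical, and it assumes claim numbers are standalone runs of 4 to 10 digits. A real routing bot would also need some signal about the topic, so treat this only as an example of keeping less than the bot hears.

```python
import re
from typing import Optional

# Assumption: claim numbers are standalone runs of 4 to 10 digits.
CLAIM_NUMBER = re.compile(r"\b\d{4,10}\b")


def extract_claim_number(message: str) -> Optional[str]:
    """Return the first claim-number-like token, or None if there is none."""
    match = CLAIM_NUMBER.search(message)
    return match.group(0) if match else None


def record_for_routing(message: str) -> dict:
    """Build the record the bot stores; the raw message is not kept."""
    return {"claim_number": extract_claim_number(message)}


message = ("It's 4562 and I need to know if STD tests are covered "
           "because I gotta do something about this rash.")
print(record_for_routing(message))  # {'claim_number': '4562'}
```

The health details in the user's message never reach the stored record, so a later leak or human review of that record exposes only the claim number.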

Who else sees chatbot information?

Not only does IT need to prepare for unexpected data to be entered in the system, but CSOs should ask who’ll see this information as well. When considering a new vendor, May recommends asking where the data will inevitably go. Is it stored locally or in the cloud? To whom is it routed? How does the bot get trained?

As with most machine learning, real people often check an enterprise chatbot’s work to improve the engine. If human review is part of your vendor’s process, May says to ask, “Who sees the data? Does it go out on [Amazon’s] Mechanical Turk? Does it go out on a crowd file? Do you care?”

“There’s a tradeoff,” he continues. “Sometimes [the chatbot] might be the only way to get done what you need done and so you have to deal with that. You have to decide: Can your data go out there? Where does it go and how do these things train?”

One solution, May adds, would be to implement a service level agreement (SLA) addressing chatbot risks. In addition to including uptime requirements, quality expectations, and other matters you’d typically find in an SLA, make sure your agreement addresses chatbot encryption and similar security expectations: What external providers—like Turk—does the vendor work with? Will they maintain SSAE-16/SSAE-18 certification or SOC 2 compliance for the length of the contract? What happens if they don’t?

Start with a chatbot proof of concept

To mitigate risk, Dodwad says most external chatbots at Rapid7 start off as a proof of concept (POC). Only after a successful POC are they more broadly deployed. She says the POC is also a chance to reassess need: “It’s important to see what’s the coverage of that bot: Is it going to reach all the employees or is it just for a particular department? Things like that influence how we plan the deployment and the training around it.” Despite being a technology company, Dodwad says many Rapid7 employees “are not very technical, so we need to make sure [the bot] is very intuitive.”

The more intuitive, the better—not just so the chatbot can provide the solution it was bought for, but also so users won’t enter private, unnecessary data. Going back to our health insurance example, if users are providing too much data, make the chatbot easier to use. If the bot asks, “Please tell me your claim number and your claim number only,” fewer users will talk about their rash.

For employee-facing chatbots, user training teaches staff what level of information is and is not appropriate to share. Employee training also lowers the risk of rogue implementation—like the kind companies saw in SaaS’s early days. If employees understand why chatbot privacy concerns are important, they’ll be more likely to run new bots by IT before installation.

Beware of small chatbot data leaks

May recommends that IT decide sooner rather than later who has permission to install bots. “It’s one thing to install a bot that does a lunch poll,” he says, but “a lot of these bots are going to use credentials to connect to systems. How do you monitor that?”
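
A simple starting point for that kind of monitoring is an inventory of installed bots and the access each one holds. The Python sketch below is a hypothetical illustration, not a feature of Slack or any other platform: the scope names, the `BotInstall` record, and the approval flag are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical scope names; real platforms define their own.
SENSITIVE_SCOPES = {"read_messages", "read_files", "hr_records"}


@dataclass(frozen=True)
class BotInstall:
    name: str
    installed_by: str
    scopes: frozenset
    approved_by_it: bool


def needs_review(bot: BotInstall) -> bool:
    """Flag bots that hold sensitive scopes without IT approval."""
    return bool(bot.scopes & SENSITIVE_SCOPES) and not bot.approved_by_it


installs = [
    BotInstall("lunch-poll", "alice", frozenset({"post_messages"}), False),
    BotInstall("hr-helper", "bob", frozenset({"post_messages", "hr_records"}), False),
    BotInstall("ticket-bot", "carol", frozenset({"read_messages"}), True),
]

flagged = [bot.name for bot in installs if needs_review(bot)]
print(flagged)  # ['hr-helper']
```

The lunch poll passes without review because it holds no sensitive access, while the unapproved bot that can read HR records is flagged.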

To further limit ad hoc implementation, remind employees that a series of small data leaks can cause just as much damage as a major breach. May shares the story of an attempted hack at Backupify, a cloud-to-cloud backup provider he owned before founding Talla: Someone used his CFO’s email address to unsuccessfully try to transfer funds from the company’s bank account. “If a bot knows things about your company,” he says, “when people ask a series of questions and retrieve pieces of information and you add it all together, they find out something they shouldn’t find out.”
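
The pattern May describes, many small questions that add up to a sensitive answer, can be watched for by counting each user's sensitive queries over time. The Python sketch below is a minimal illustration under stated assumptions: the topic keywords, the one-hour window, and the threshold of three are hypothetical values, and keyword matching is a crude stand-in for whatever classification a real bot would use.

```python
import re
from collections import defaultdict, deque

# Hypothetical values: tune the topic list, window, and threshold locally.
SENSITIVE_TOPICS = {"salary", "bank", "payroll", "acquisition"}
WINDOW_SECONDS = 3600
MAX_SENSITIVE_QUERIES = 3


class QueryMonitor:
    """Counts each user's sensitive questions inside a sliding time window."""

    def __init__(self):
        self._hits = defaultdict(deque)  # user -> timestamps of sensitive queries

    def record(self, user: str, question: str, now: float) -> bool:
        """Record a question; return True when the user exceeds the threshold."""
        words = set(re.findall(r"[a-z]+", question.lower()))
        if not words & SENSITIVE_TOPICS:
            return False
        hits = self._hits[user]
        hits.append(now)
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] > WINDOW_SECONDS:
            hits.popleft()
        return len(hits) > MAX_SENSITIVE_QUERIES


monitor = QueryMonitor()
questions = [
    "Who handles payroll?",
    "Which bank do we use?",
    "What is the CFO salary band?",
    "Any acquisition news?",
]
alerts = [monitor.record("mallory", q, now=100.0 + i) for i, q in enumerate(questions)]
print(alerts)  # [False, False, False, True]
```

No single question here looks alarming, which is the point: only the fourth sensitive query inside the window raises a flag for someone to review.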

Don’t scare your colleagues into submission. Of the many business operations that bots currently impact, HR is most affected. Enterprise chatbots are used to make recruiting more efficient, to onboard new hires, and to predict how likely an employee will be to leave. Talla itself sells to the HR space, answering common questions like “How much vacation do I have left?” so HR reps don’t have to. For any chatbot—and especially HR chatbots—to work, employees must be comfortable talking to them. O’Neill says, “PII, PHI, and all this information is going to be there. Don’t try to skirt it; embrace it. If you get comfortable that this data’s going to be there and you trust the companies that you’re doing your bot integrations with and chat interactions with, then you can trust that you’re going to have better results. You’re going to get better data, make better informed decisions.”