



Is language-theoretic security the answer to Internet insecurity?

Feb 17, 2016
Application Security, Data and Information Security, Security

How some industry innovators are putting theory to practice with language-theoretic security (LANGSEC).

Historically, traditional security technologies have not been very effective. For the better part of the last two decades, IT and security teams have focused on defending the network, and attackers have figured out how to work around network controls.

Every five to 10 years, a new technology comes along that needs to be tested. According to Kunal Anand, co-founder and CTO at Prevoty, language-theoretic security is the next generation of application security controls, one that broadens the solution space.

According to Upstanding Hackers, “Language-theoretic security, or LangSec, is the emerging field of digital security that treats code patterns and data formats as languages and their grammars for the purpose of preventing the introduction of malicious code into software.”

LangSec enables companies to secure their prized data and applications without lag-time or delay. “It is security across initiatives,” Anand said.

Anand has been a bit of a fish swimming upstream as he has grown in his understanding of security technologies. One key question that drives his ongoing development in application security, he said, is asking, “Is there an ability for us to add security so that we can make sure those apps don’t get compromised?”

Across sectors and industries, whether it’s an enterprise in financial services or retail, companies all have a preponderance of software they’ve developed. For most of these well-established companies, the issue with application security is rooted in the reality that the team that originally developed the software may no longer be there, said Anand. The vulnerabilities left in that code are the ones attackers expose.

Understanding which controls can predict actions before they happen can strengthen security and mitigate threats, but how is it possible to predict an event before it happens? Anand said controls like runtime application self-protection (RASP) and LangSec offer a deeper level of security than pattern matching.

“Pattern matching can block content in list of patterns, but attacker can get more specific and put spaces in between or insert upper/lower case letters,” said Anand. “With new types of attack, payloads are generated really quickly. Criminals are using scripts like JJencode, so how do you guard against that using patterns?” 
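The evasion Anand describes is easy to reproduce. The following is a minimal sketch in Python, assuming a small hypothetical blocklist (the two patterns and the `blocked_by_patterns` helper are illustrative and not taken from any product): a change of case, an extra space, or a comment in place of a space is enough to slip past a literal pattern.

```python
import re

# Hypothetical blocklist for illustration only; real products ship far more patterns.
BLOCKLIST = [
    re.compile(r"<script>"),
    re.compile(r"UNION SELECT"),
]


def blocked_by_patterns(payload: str) -> bool:
    """Return True if any blocklist pattern occurs in the payload."""
    return any(pattern.search(payload) for pattern in BLOCKLIST)


payloads = [
    "<script>alert(1)</script>",              # caught: literal match
    "<ScRiPt >alert(1)</script>",             # missed: mixed case plus a space
    "1 UNION SELECT password FROM users",     # caught: literal match
    "1 UnIoN/**/SeLeCt password FROM users",  # missed: comment replaces the space
]

for payload in payloads:
    verdict = "blocked" if blocked_by_patterns(payload) else "allowed"
    print(f"{verdict}: {payload}")
```

Adding case-insensitive matching or a pattern for `/**/` closes these two gaps, but each fix invites the next variation, which is the treadmill Anand is pointing to.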

When Anand worked at MySpace, he said, “We had to run thousands of patterns. The problem is that false positives and false negatives are high in pattern matching.” When security is focused on anomaly detection, you first have to determine what is normal.

“Applications are always changing, some change up to 40 times a day. The thing you are trying to detect is changing because the underlying application is always changing,” said Anand.  

LangSec is the idea of understanding what something is going to do before it executes, so it looks at the intent within the context. Anand gave the example of how network controls handle SQL injection.

“Network controls are looking for 0s and 1s, but they aren’t living inside the application. If you can see every database query going between the application and database, you can look at it the same way that the database is going to look at it. You don’t have to guess. You can look at the query and understand exactly what it is going to do,” Anand said.

Coupling visibility that exists only inside the application with language analysis of payloads, for things like command injection or SQL injection, allows security professionals to understand how the database would execute a query.
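One way to picture "looking at the query the way the database does" is to tokenize it and check whether user input changed its structure. The sketch below is an assumption for illustration, not Prevoty's implementation: the tiny tokenizer, the `KEYWORDS` set and the `changes_structure` helper are all hypothetical, and a real tool would need a full grammar for the target database.

```python
import re

# A deliberately small SQL tokenizer; it covers only what this example needs.
TOKEN_RE = re.compile(r"""
    (?P<ws>\s+)
  | (?P<comment>--[^\n]*|/\*.*?\*/)
  | (?P<string>'(?:[^']|'')*')
  | (?P<number>\d+(?:\.\d+)?)
  | (?P<word>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<op><>|<=|>=|!=|[=<>*,();.+\-/])
""", re.VERBOSE | re.DOTALL)

KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR", "UNION", "DROP", "TABLE"}


def tokenize(sql):
    """Return a list of (kind, value) pairs; literal values are deliberately dropped."""
    tokens, pos = [], 0
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            raise ValueError(f"unrecognized input at offset {pos}")
        kind, text, pos = match.lastgroup, match.group(), match.end()
        if kind == "ws":
            continue
        if kind == "word":
            upper = text.upper()
            tokens.append(("keyword", upper) if upper in KEYWORDS else ("identifier", text))
        elif kind == "op":
            tokens.append(("op", text))
        else:
            tokens.append((kind, None))  # strings, numbers, comments: type only
    return tokens


def changes_structure(template, user_input, benign="x"):
    """True if user_input alters the token structure the template was meant to have."""
    expected = tokenize(template.format(benign))
    try:
        actual = tokenize(template.format(user_input))
    except ValueError:
        return True  # input that does not tokenize is rejected outright
    return actual != expected


TEMPLATE = "SELECT * FROM users WHERE name = '{}'"
print(changes_structure(TEMPLATE, "alice"))           # ordinary value
print(changes_structure(TEMPLATE, "' OR '1'='1"))     # classic injection
print(changes_structure(TEMPLATE, "' oR/**/'1'='1"))  # case and comment tricks
```

Because keywords are normalized and comments are recognized as tokens, the case and spacing tricks that defeat a blocklist make no difference here: the injected queries contain extra tokens that the intended query does not.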

“We don’t care about patterns because patterns change all the time. Applications are always changing. Data flow analysis may change all the time, but the benefit of being in the application is you don’t have to guess,” Anand said. The result is a much lower rate of false positives, delivered much faster.

For LangSec to be accurate, though, you need to look at the payload in the right context. “Building a LangSec approach is really difficult on its own,” Anand said. “An enterprise can have lots of databases that all diverge in unique ways. Oracle may have different functionalities from SQL. You have to understand how each target system is going to execute. You have to build formal tokenizers and language analysis tools for each one of those pieces of the tool chain.” 
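A small example shows why each target needs its own tokenizer. MySQL, by default, treats a backslash inside a string literal as an escape character, while standard SQL treats it as an ordinary character. The sketch below is a simplification with hypothetical names (`STRING_RULES`, `first_string_literal`); it models only that one difference and ignores server settings that can change it.

```python
import re

# Simplified string-literal rules for two illustrative dialect profiles.
STRING_RULES = {
    # Backslash escapes the next character (MySQL's default behaviour).
    "backslash_escapes": re.compile(r"'(?:[^'\\]|\\.|'')*'"),
    # Backslash is an ordinary character; only '' escapes a quote (standard SQL).
    "standard": re.compile(r"'(?:[^']|'')*'"),
}


def first_string_literal(sql, dialect):
    """Return the first string literal as the given dialect profile would read it."""
    match = STRING_RULES[dialect].search(sql)
    return match.group() if match else None


fragment = r"'a\' OR 1=1 -- '"

for dialect in STRING_RULES:
    print(dialect, "->", first_string_literal(fragment, dialect))
```

Under the backslash-escape rule the whole fragment is one harmless string. Under the standard rule the string ends after the backslash, and `OR 1=1` becomes live SQL followed by a comment. The same bytes are safe on one system and an injection on another, so an analyzer has to model the system that will actually execute them.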

While some have overlooked LangSec as too theoretical or nerdy, more organizations are waking up to it. “Gaming companies are talking about applying LangSec.”  

The greatest challenge right now is moving security professionals forward, beyond the habits of defending the network. Anand asked, “How do you explain to people who are entrenched in patterns and pattern matching?”  

The proof, as they say, will be in the pudding.


Kacy Zurkus is a freelance writer for CSO and has contributed to several other publications, including The Parallax and K12 Tech Decisions. She covers a variety of security and risk topics as well as technology in education, privacy and dating. She has also self-published a memoir, Finding My Way Home: A Memoir about Life, Love, and Family, under the pseudonym "C.K. O'Neil."

Zurkus has nearly 20 years' experience as a high school English teacher and holds an MFA in Creative Writing from Lesley University (2011). She earned a Master's in Education from University of Massachusetts (1999) and a BA in English from Regis College (1996). Recently, The University of Southern California invited Zurkus to give a guest lecture on social engineering.

The opinions expressed in this blog are those of Kacy Zurkus and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.
