According to Steve McConnell, author of “Code Complete”, software development projects that reach 512,000 lines of code or more can see four to 100 coding errors per thousand lines of code. Coding errors create the software vulnerabilities that criminal hackers attack in order to enter and pillage the enterprise. Anything that can help to prevent those holes should be of interest to CISOs and their teams.
One example is the NSA-funded, research-based Wyvern programming language from Carnegie Mellon University. Wyvern seeks to limit coding errors by various means, including enabling the secure use of five different programming languages inside the host language, according to Jonathan Aldrich, associate professor at the Institute for Software Research in the School of Computer Science, Carnegie Mellon University. Aldrich leads the research group behind the Wyvern project.
But not everyone is convinced that Wyvern is on the right track. In fact, Wyvern could be vulnerable to attack, according to Robert Coleridge, CTO, Secure Channels. Coleridge is a 40-year veteran software engineer. Secure Channels just patented a new encryption technology that 155 black hat hackers at Black Hat USA 2014 failed to break, according to a Secure Channels media release.
CSO explores the potential and the risk of Wyvern.
Wyvern, Programming Languages, and Security
The Wyvern Programming Language is a host language that enables developers to import any five programming languages for use on a software project, says Aldrich. Programmers can import existing languages or languages that they create that are uniquely equipped with the proper expressiveness for their industry domain. They or anyone in their field can import that language into Wyvern and use it with other languages, Aldrich explains. So extensibility is one of Wyvern’s strengths.
Programmers need to use multiple languages to complete projects. “One language is not enough because different problems and solutions are expressed in different ways,” says Aldrich. For example, SQL is optimized to describe queries to databases and has advantages over the query language for C#. This is one example of why coders prefer to go outside a single language to incorporate other languages.
Programmers can already include multiple programming languages in projects, but the way they do it today introduces coding errors, according to Aldrich. Today, they code database queries into projects as strings: they first import a database library, then pass strings to it, which the library interprets as a domain-specific database language. These strings are not as safe as writing actual SQL queries in SQL, according to Aldrich.
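To make the string problem concrete, here is a minimal sketch in Python rather than Wyvern. It uses the standard sqlite3 module and a hypothetical users table invented for illustration; neither comes from the Wyvern project. The first lookup builds its query as a string, so attacker-supplied text becomes part of the SQL. The second uses a placeholder, so the same text stays data.

```python
import sqlite3


def build_demo_db():
    # Hypothetical in-memory table, used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "admin"), ("bob", "staff")],
    )
    return conn


def find_user_unsafe(conn, name):
    # The query is assembled as a plain string, so the input
    # becomes part of the SQL text that the database parses.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def find_user_safe(conn, name):
    # The placeholder keeps the input as a value; it is never parsed as SQL.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


conn = build_demo_db()
malicious = "x' OR '1'='1"
unsafe_rows = find_user_unsafe(conn, malicious)  # matches every row
safe_rows = find_user_safe(conn, malicious)      # matches no row
```

Placeholders are the workaround available in most languages today. Wyvern's stated aim, per Aldrich, is to go further and let the developer write the query in SQL itself.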
In another example, HTML is great for expressing the structure of web pages. Developers can use strings to include HTML but, again, strings are not as secure as using HTML itself, Aldrich says. “HTML strings can lead to cross-site scripting errors, for example,” says Aldrich. Wyvern will offer each language the developer cares to add as a native extension inside the host language.
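The cross-site scripting risk Aldrich mentions can be sketched the same way. The Python example below is an illustration, not Wyvern code, and the function names are hypothetical. It builds a page fragment by string concatenation, first without and then with escaping through the standard library's html.escape.

```python
from html import escape


def render_comment_unsafe(comment):
    # User text is pasted straight into the markup, so any tags
    # it contains become part of the page.
    return "<p>" + comment + "</p>"


def render_comment_escaped(comment):
    # escape() turns <, > and & into entities, so the browser
    # shows the text instead of interpreting it as markup.
    return "<p>" + escape(comment) + "</p>"


payload = "<script>alert(1)</script>"
unsafe_html = render_comment_unsafe(payload)
escaped_html = render_comment_escaped(payload)
```

With string-built HTML, safety depends on every developer remembering the escape call at every call site, which is the kind of slip a language-level HTML extension is meant to rule out.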
Wyvern further seeks to address software vulnerabilities by fixing a core technical issue: the lack of modular extensibility. “Today, we can already import a library and use that library to write SQL into our code. But if that library approach is not modular, the library can conflict with other programming libraries,” says Aldrich. For example, if a developer has two languages that both use less-than and greater-than signs as delimiters, there could be conflicts. “The conflict occurs when the compiler sees ambiguous source code that could be part of more than one language. The compiler can’t do its job properly when this happens and the program either won’t run or won’t run properly. We fixed that in that even if the languages overlap in syntax they are guaranteed not to conflict,” says Aldrich.
The Wyvern language uses what Aldrich calls types to avoid these conflicts. Once a developer selects a type, Wyvern tells the programmer which language she can use with that type, which ensures that the code is interpreted as belonging to that language and not another. “We have associated a syntax for a new language for each type optionally, using a domain-specific notation,” says Aldrich. “Adding the domain-specific notation helps the programmer express the program more efficiently and without introducing security vulnerabilities.” Associating the domain-specific notation with the type ensures that the compiler can always tell what the intended language is.
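The idea of letting the expected type pick the language can be sketched in Python. This is a loose analogy under stated assumptions, not Wyvern's implementation: the HtmlFragment and ComparisonChain types, the PARSERS table and the literal function are all hypothetical names invented here. The same text, which uses angle brackets, is handed to a different parser depending on the type the caller asks for, so neither parser has to guess.

```python
import re
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass(frozen=True)
class HtmlFragment:
    tags: tuple


@dataclass(frozen=True)
class ComparisonChain:
    operands: tuple
    operators: tuple


class _TagCollector(HTMLParser):
    # Records the name of every start tag it sees.
    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_html(text):
    collector = _TagCollector()
    collector.feed(text)
    collector.close()
    return HtmlFragment(tuple(collector.tags))


def parse_comparison(text):
    # Split on < and >, keeping the operators between the operands.
    parts = re.split(r"\s*([<>])\s*", text.strip())
    return ComparisonChain(tuple(parts[0::2]), tuple(parts[1::2]))


# Each type is associated with exactly one notation (hypothetical table).
PARSERS = {HtmlFragment: parse_html, ComparisonChain: parse_comparison}


def literal(expected_type, text):
    # The expected type, not the text itself, decides which parser runs.
    return PARSERS[expected_type](text)


same_text = "x <b> y"
as_html = literal(HtmlFragment, same_text)
as_comparison = literal(ComparisonChain, same_text)
```

Here the text is read as markup containing a b tag in one case and as a chain of two comparisons in the other. In Wyvern the selection would happen in the compiler; in this sketch it happens at run time.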
That’s not all CMU plans for Wyvern’s security enhancements. Wyvern project developers would like to add architectural control as a feature of the language. According to Aldrich, this would give development project leads additional tools and a compiler to ensure that all developer team members follow the required security practices to maintain secure coding and avoid coding errors. “Perhaps the lead developer insists that because they are writing distributed systems, the developers must use TLS. So, the lead developer could use the architectural control mechanism to ensure that no developer can introduce a code patch allowing non-encrypted communication with a server,” Aldrich explains.
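Aldrich's TLS example can be approximated in Python, with the caveat that Wyvern's architectural control is still a planned feature and would be enforced by the compiler, whereas this sketch checks at run time. The ALLOWED_SCHEMES set, the PolicyViolation exception and the open_channel function are hypothetical names. The function stands in for the single approved way to reach a server and only returns a description instead of opening a real connection.

```python
from urllib.parse import urlsplit

# Project-wide rule set by the lead developer (hypothetical).
ALLOWED_SCHEMES = {"https"}


class PolicyViolation(Exception):
    """Raised when code tries to use a channel the policy forbids."""


def open_channel(url):
    # Every outbound connection is expected to go through this gate.
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        raise PolicyViolation(f"scheme {scheme!r} is not permitted; use TLS")
    # Stand-in for opening a real TLS connection.
    return f"secure channel to {parts.hostname}"


ok = open_channel("https://example.com/api")

try:
    open_channel("http://example.com/api")
    blocked = False
except PolicyViolation:
    blocked = True
```

A run-time gate like this only helps if nobody bypasses it. The appeal of a compiler-enforced version is that such a bypass would not build.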
Wyvern Programming Logic
While Aldrich holds that strings can be responsible for insecure code, Coleridge contends that these strings are just a bunch of ones and zeros and are not inherently vulnerable or dangerous. “It is how the developer compiles the code and whether it is coded securely that can make software unsecure. But software cannot be unsecure solely based on whether you choose to use strings,” says Coleridge.
And though Aldrich says that Wyvern could use architectural control to maintain secure coding practices, Coleridge says that architectural control and secure coding practices are two ideas that don’t really have anything to do with each other. For those concerned about either one, solutions exist on the market today. “There are already good tools out there for both architectural control and to make sure developers follow secure coding practices,” says Coleridge.
According to Aldrich, the goal of Wyvern is to improve secure coding.
But Wyvern is a meta-language intended to let people use different languages, not a true programming language, explains Coleridge. “With anything that flexible, it could be easy to slip malware and viruses into it,” says Coleridge. If that is true, Wyvern could be a new source of vulnerabilities rather than the cure.
Selection criteria for NSA funding
In 2013, the NSA sent a broad announcement to approximately 200 university departments about funding available to small research labs, or lablets, for unclassified research into the fundamental scientific components of cybersecurity, with the aim of creating a scientific foundation for security. “This effort is aimed at providing the nation with assurance in its information systems,” says Stuart Krohn, technical director for the Science of Security at the NSA. The CMU research proposal was one of only four that the NSA eventually funded under the Science of Security initiative.
The NSA selected the CMU research because CMU’s research proposal addressed finding scientific bases for creation and scalability of security components so the nation can trust the resulting systems. Success means Wyvern could help developers to construct secure systems with known security properties without having to re-analyze the components themselves. These are hard problems.
Whether Wyvern sinks or swims, the larger goal of the project is right on target. While enterprises have focused on network security, people are realizing that it is equally important to have developers write applications in a secure manner. “Attackers can find a way inside the network to attack those applications. For end-to-end security, every application should be secure,” says Aldrich. The CMU developers intend Wyvern to address that.