As if tracking down bugs in a complex application isn’t difficult enough, programmers now must worry about a newly emerging and potentially dangerous trap, one in which a program compiler simply eliminates chunks of code it doesn’t understand, often without alerting the programmer of the missing functionality.
The code that can lead to this behavior is called optimization-unstable code, or “unstable code,” though it is more of a problem with how compilers optimize code, rather than the code itself, said Xi Wang, a researcher at the Massachusetts Institute of Technology. Wang discussed his team’s work at the USENIX annual technical conference, being held this week in Philadelphia.
With unstable code, programs can lose functionality or even critical safety checks without the programmer’s knowledge.
That this problem is only now coming to the attention of researchers may mean that many programs considered as secure, especially those written in C or other low-level system languages, may have undiscovered vulnerabilities.
The researchers have developed a new technique for finding unstable code in C and C++ programs, called Stack, that they hope compiler makers will use when updating their products.
Using Stack, the research team has found over 160 bugs in various programs due to unstable code.
They found 11 bugs in the open source Kerberos network authentication protocol, all of which were subsequently fixed by the Kerberos developers.
Stack also found 68 potential bugs in the PostgreSQL database management software. Only after they had fashioned some sample code using bugs that crashed PostgreSQL did the database’s core developers remedy the issues with 29 new patches.
Unstable code may be hard to pinpoint because, to the developer, it may look, and behave, like functional code. It may also compile into a working program with no problems. Only when the compiler tries to optimize the code for better performance do the issues arise.
A compiler translates the source code of a program into machine code, using the specifications of the programming language itself. Compilers can also optimize code, or examine the code logic to look for ways it can execute more efficiently, which would improve the performance of the running program.
A compiler could, for example, drop a subroutine that is never called. But compilers could also drop code that falls outside the typical programming behavior, even if the programmer may have specific reasons for crafting the program in such a way.
For instance, a routine that guards against buffer overflows may check such a large boundary of memory beyond what is allocated for the program that the compiler may assume it is a mistake and eliminate that safety check altogether, Wang noted. The programmer would never know that the resulting program has no defense against buffer overflow attacks.
The research looked at 16 open source and commercial C/C++ compilers — from companies such as Intel, IBM and Microsoft — and had found they all dropped unstable code.
A compiler can issue warnings when it drops code, though compilers typically issue so many warnings, especially for large programs, that a notice of code being eliminated may be lost in the deluge of other largely inconsequential messages.
“I think compiler developers have known about this for years,” Wang said.
Not all the blame should be placed on the compiler makers, noted Peng Wu, a researcher at Huawei America Labs who was at the presentation.
In many cases, the specification of the language itself, which the compilers are based on, does not offer any guidance on how to handle certain conditions, she noted. So each compiler handles the cases of unstable code differently.
Also, the programmer should understand the trade-offs of using optimization, Wu said. For instance, if the entire code absolutely must stay fully intact, it shouldn’t be optimized, even if optimization does speed the time it takes to build the program and helps the resulting program perform better.
Wu noted that optimization was a chief priority for compiler makers in previous decades, when developers tried to get the best performance from the hardware as possible. Over the past decade however, has more attention been placed on finding bugs, due to the growing impact of security vulnerabilities, and so the problem of unstable code is now surfacing.