Malware: Windows is only part of the problem

On coding secure and resilient applications

We’ve all been hearing a lot about secure applications recently, or more accurately about insecure applications; specifically those that are exploited in identity theft raids or that we can be “tricked” into running on our PCs.

Insecure applications are such a problem that Microsoft has spent the last five years and many millions of dollars re-engineering its operating system and much of its other software in order to improve the situation [and can one ever really overcome the temptation to bolt-on security to a fundamentally insecure design, in pursuit of “backwards compatibility”, in such circumstances – Ed].

Other software providers are doing the same thing, and there has been an explosion of anti-virus and spyware removal vendors in the industry. It’s not that software has suddenly become insecure; rather, the internet now gives criminals a viable means of exploiting these insecurities for ill-gotten gains.

It’s estimated that there are 15 million instances of identity theft each year. Many of these identities are stolen in direct attacks against e-commerce websites, universities and government systems, but others are the result of malware installed on our home computers recording our keystrokes.

According to the Microsoft security intelligence report (available here) we saw the emergence of over 40,000 new trojans and 30,000 password loggers specifically designed to steal identity information for online banking in the first half of 2006. In the same period, the Microsoft malicious software removal tool disinfected almost 10,000,000 computers.

With these figures, it’s fairly safe to say it’s a significant problem; however, there are a number of misunderstandings about the cause. Frequently, the tendency is to blame the operating system and in Microsoft’s own words (here [2Mb Powerpoint]) “Windows 98 clients cannot be effectively secured”.

However, it really isn’t just a question of operating system shortcomings: it’s often the applications and services running on the operating system that provide the open backdoors to malware, and the operating system simply can’t stop them.

The worst example of this is the now defunct Kazaa, which was software explicitly designed to mislead the user about its true function - while pretending to provide p2p functions it secretly installed spyware and adware all over our PCs.

Clearly, nobody should trust software coming from an unknown source with unknown motives; this was the lesson one should have learnt from Kazaa and it’s an extreme one. Nowadays, however, malware finds its way into our systems through security holes even in application software that was designed and implemented honestly.

The classic example of such a backdoor is a buffer overflow attack arising from malicious input. In the early days, a URL with over 256 characters could cause Internet Explorer to execute arbitrary code. The arbitrary code is chosen by the attacker and almost certainly, the payload will be either malware, or something that installs malware.

Another example of a real vulnerability is a jpg decoder that allowed arbitrary code to be executed when decoding an incorrectly formatted image. All the attacker had to do was place such an image on a website or send it in an email; your browser would try to load it and his/her code would be executed.

However, the point here isn’t these specific issues (which shouldn’t be a problem any longer, as long as you've patched your operating system and applications up-to-date) but that the applications rather than the operating system are now the entry point for malware and that the vulnerabilities can arise in subtle scenarios. Who would have thought that simply looking at a picture could leave your computer wide open to an attacker?

Buffer overflow attacks have been around since the early days and are possible because of how programs execute in computers such as the PC running Windows, which have stack-based architectures. There’s nothing particularly clever about them, but new code continues to be vulnerable; finding such holes is something hackers are good at and, unfortunately, we good guys don’t have the best record at preventing them. In this series of articles, we want to look at how this and other attacks work and to highlight what can be done to make our code less vulnerable.

So how does the buffer overflow work? On a Windows PC, programs start executing in the main function and call other functions to perform calculations. As each function is called, it pushes its stack frame onto the stack. This stack frame consists of local variables and housekeeping information, including the stack base pointer of the calling function and the return address. The last part is particularly important for the buffer overflow attack: after the function returns, the next instruction to be executed is the one pointed to by the return address.

In C and C++ programs, coders allocate arrays to fixed sizes appropriate for the expected, valid input. In a buffer overflow attack, an attacker provides input that is deliberately malformed so that more data is assigned to the array than it can handle and as a result, the return address mentioned above is overwritten.

When the function returns the system continues execution at the address specified by the attacker, and we can assume that it will point to code that we don’t want executing on our system; and, of course, since malware coders are no better at testing their code than anyone else, and it’s quite hard to test this stuff in practice anyway, at best your system may crash nastily.

The following example shows how this works with carefully crafted parameters. You’ll see that bar is never called explicitly; however, a buffer overflow in foo causes the code of bar to be executed nonetheless.

Buffer Overflow example

#include <cstdio>
#include <cstring>

namespace {
  void bar() {
    printf("hello world");
  }

  void foo(char *psPtr) {
    char vsBuf[8] = "ABCDEFG";
    strcpy(vsBuf, psPtr);     // [1] deliberately unchecked copy
  }
}

In this example, the code that calls foo is located just before memory address ’00 40 11 46’ and the code of bar is located at ’00 40 10 30’. These addresses appear on the stack in reverse byte order because our test system is little endian. (00 40 11 46 is seen below as 46 11 40 00).
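The byte reversal can be sketched in a few lines of C++. The helper names toLittleEndianBytes and asMemoryDump are ours, purely for illustration. The sketch spells out the byte order with shifts rather than peeking at real memory, so it should give the same answer on any host.

```cpp
#include <array>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the article's test program): returns the
// four bytes of a 32-bit address in the order a little-endian machine
// stores them, lowest-order byte first.
std::array<std::uint8_t, 4> toLittleEndianBytes(std::uint32_t address) {
  return {
    static_cast<std::uint8_t>(address & 0xFF),
    static_cast<std::uint8_t>((address >> 8) & 0xFF),
    static_cast<std::uint8_t>((address >> 16) & 0xFF),
    static_cast<std::uint8_t>((address >> 24) & 0xFF),
  };
}

// Formats those bytes the way a debugger memory window shows them.
std::string asMemoryDump(std::uint32_t address) {
  char text[12];  // "XX XX XX XX" is 11 characters plus the terminator
  const auto b = toLittleEndianBytes(address);
  std::snprintf(text, sizeof text, "%02X %02X %02X %02X",
                b[0], b[1], b[2], b[3]);
  return text;
}
```

Calling asMemoryDump(0x00401146) yields "46 11 40 00", which is the form in which the return address shows up in the dumps.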

We stop execution at [1] and examine the stack. The local variable containing “ABCDEFG” is clearly visible followed by a pointer (not important; but it’s the stack base of the calling function). After this we see the value ’46 11 40 00’ – this is the return address of the function if you read it in reverse.

0012FF08  CC CC CC CC 41 42  ÌÌÌÌAB
0012FF0E  43 44 45 46 47 00  CDEFG.
0012FF14  80 FF 12 00 46 11  .ÿ..F.
0012FF1A  40 00 6C FF 12 00  @.lÿ..

We now execute the string copy.

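int main() {
  // Eight filler bytes for vsBuf, then a replacement stack base and
  // the address of bar (00 40 10 30) in little-endian byte order.
  char myString[20] = {
    'Z',  'Z',  'Z',  'Z',  'Z',  'Z',  'Z',  'Z',
    '\x80', '\xFF', '\x12', '\x10', '\x30', '\x10', '\x40', '\x00',
  };
  foo(myString);  // overflows vsBuf inside foo
  return 0;
}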

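After executing the string copy, we see that the function return address has been overwritten with ’30 10 40 00’, the address of bar in reverse byte order. (The fourth byte of the saved stack base has also changed, from 00 to 10: the input cannot contain a zero byte before its end, since strcpy stops copying there.) When the function returns, execution will now continue at the location of bar.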

0012FF0E  5A 5A 5A 5A 5A 5A  ZZZZZZ
0012FF14  80 FF 12 10 30 10  .ÿ..0.
0012FF1A  40 00 6C FF 12 00  @.lÿ.. 
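For readers who would rather not smash a real stack, the two dumps above can be mimicked in an ordinary byte array. This is only a model under our own assumptions: the SimulatedFrame type, the offsets and the helper names are invented for illustration, and the 16 bytes stand in for vsBuf, the saved stack base and the return address as they appear in the dumps.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Model of foo's frame in a plain array (not the real stack):
// bytes 0-7 vsBuf, bytes 8-11 saved stack base, bytes 12-15 return address.
struct SimulatedFrame {
  unsigned char bytes[16];
};

constexpr std::size_t kBufSize = 8;
constexpr std::size_t kReturnOffset = 12;

// Builds the frame as it looks in the first dump.
SimulatedFrame makeFrame() {
  SimulatedFrame f{};
  std::memcpy(f.bytes, "ABCDEFG", kBufSize);  // 7 letters plus the NUL
  const unsigned char rest[8] = {0x80, 0xFF, 0x12, 0x00,
                                 0x46, 0x11, 0x40, 0x00};
  std::memcpy(f.bytes + kBufSize, rest, sizeof rest);
  return f;
}

// Behaves like strcpy: copies up to and including the first zero byte and
// pays no attention to kBufSize. The loop bound only keeps the model itself
// inside its 16-byte array.
void uncheckedCopy(SimulatedFrame& f, const unsigned char* input) {
  for (std::size_t i = 0; i < sizeof f.bytes; ++i) {
    f.bytes[i] = input[i];
    if (input[i] == 0) break;
  }
}

// Reads the simulated return address back as a little-endian 32-bit value.
std::uint32_t returnAddress(const SimulatedFrame& f) {
  const unsigned char* p = f.bytes + kReturnOffset;
  return p[0] | (p[1] << 8) | (p[2] << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}
```

Starting from makeFrame(), returnAddress gives 0x00401146. After uncheckedCopy with the sixteen bytes of myString, it gives 0x00401030, the address of bar.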

Finding exploits

But how do hackers find code like this that they can exploit? Well, the applications that we run are freely available. All a hacker has to do is take the application and start feeding it random data or corrupted files as input. In some cases, this data will cause crashes and a percentage of these crashes are exploitable. The specific corrupted data that caused the crash becomes part of the hacker's analysis, in the process of delivering a new exploit. In other cases, where the source code is available, the attacker can analyze the code and find security holes. In any event, it’s not safe to assume that insecure code will never be exploited because it will never be found; it does get found, using methods like those described here and others, as the statistics quoted earlier show.
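The "feed it random data and watch for crashes" approach can be sketched roughly as follows. Everything here is a stand-in: parseSurvives is a hypothetical target that merely reports when an input would not fit its 8-byte buffer, where a real vulnerable program would corrupt memory, and fuzz is a bare-bones loop rather than a serious fuzzing tool.

```cpp
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical target standing in for a real parser. It "crashes"
// (returns false) when the input, plus its terminating NUL, would not
// fit into an 8-byte buffer.
bool parseSurvives(const std::string& input) {
  constexpr std::size_t kBufSize = 8;
  return input.size() + 1 <= kBufSize;
}

// Feeds `rounds` random inputs to the target and keeps the ones that
// crash it, which is the raw material for further analysis.
std::vector<std::string> fuzz(unsigned seed, int rounds) {
  std::mt19937 rng(seed);
  std::uniform_int_distribution<int> length(0, 32);
  std::uniform_int_distribution<int> byteValue(1, 255);  // no embedded NULs
  std::vector<std::string> crashes;
  for (int i = 0; i < rounds; ++i) {
    std::string input;
    const int n = length(rng);
    for (int j = 0; j < n; ++j) {
      input.push_back(static_cast<char>(byteValue(rng)));
    }
    if (!parseSurvives(input)) {
      crashes.push_back(input);
    }
  }
  return crashes;
}
```

A call such as fuzz(42, 200) returns the inputs that broke the target. With a real application, each of those would then be examined to see whether the crash is exploitable.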

We know that organized crime is moving into technical areas and that there’s a black market for exploits that can be used to install malware on your PC (see here for example). “Zero day exploits” (that is, exploits for which no patch yet exists) for Windows Vista are reported here to be on sale for upwards of €10,000.

Why would criminals pay this kind of money? Well, one working exploit works against every machine running the vulnerable software, at least until a patch is developed and installed. A subverted machine can bring in revenue in many different ways: participating in botnets, sending spam, collecting identity information on its users, the list goes on…

So faced with this kind of well-resourced and determined attacker, what can developers do? As always, awareness helps: developers who are aware of the problem and its causes are less likely to make the mistake.
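For the foo function shown earlier, avoiding the mistake could look something like the sketch below. The name fooChecked and its boolean return value are our additions, and the sketch assumes the caller passes a NUL-terminated string; it is one possible fix, not the only one.

```cpp
#include <cstddef>
#include <cstring>

// Bounds-checked variant of foo. It refuses input that does not fit in
// vsBuf instead of writing past the end of the array.
// Assumption: psPtr points to a NUL-terminated string.
bool fooChecked(const char* psPtr) {
  char vsBuf[8] = "ABCDEFG";
  const std::size_t len = std::strlen(psPtr);
  if (len >= sizeof vsBuf) {
    return false;  // too long: the stack frame is left untouched
  }
  std::memcpy(vsBuf, psPtr, len + 1);  // +1 copies the terminating NUL
  return std::strcmp(vsBuf, psPtr) == 0;
}
```

With this check in place, a sixteen-byte input like myString is rejected before any copying takes place, while anything of seven characters or fewer is copied as before.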

Additionally, code analysis tools, such as those from Fortify and Programming Research, compile source code and produce analyses that help identify security problems. Some C and C++ compilers also contain features that provide a level of protection from these vulnerabilities.

In VC7 this is called Buffer Security Check (the /GS compiler option). Unfortunately, the implementation does not cover all scenarios, and for the scenarios it does cover, questions have been raised about its effectiveness here. However, the idea of increasingly using tools to ensure secure software is a good one and, hopefully, future evolutions of this will be more complete.
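The general idea behind this kind of compiler protection, as we understand it, is to place a guard value between the local buffers and the return address and to check it before the function returns. The sketch below models that idea in a plain byte array. The layout, the names and the fixed guard bytes are all our own assumptions for illustration, not VC7's actual implementation, and a real compiler would use a value that is hard to predict.

```cpp
#include <cstddef>
#include <cstring>

constexpr std::size_t kFrameSize = 16;
constexpr std::size_t kGuardOffset = 8;  // directly after the 8-byte buffer
constexpr unsigned char kGuard[4] = {0xDE, 0xC0, 0xAD, 0x0B};  // made-up value

// Model frame: bytes 0-7 buffer, bytes 8-11 guard, bytes 12-15 return address.
struct GuardedFrame {
  unsigned char bytes[kFrameSize];
};

// On function entry the guard is written into the frame.
GuardedFrame enterFunction() {
  GuardedFrame f{};
  std::memcpy(f.bytes + kGuardOffset, kGuard, sizeof kGuard);
  return f;
}

// strcpy-like copy that ignores the buffer size; the loop bound only
// keeps the model inside its own array.
void uncheckedCopy(GuardedFrame& f, const unsigned char* input) {
  for (std::size_t i = 0; i < kFrameSize; ++i) {
    f.bytes[i] = input[i];
    if (input[i] == 0) break;
  }
}

// Before returning, the function checks that the guard is unchanged.
// A mismatch means something has written past the end of the buffer.
bool guardIntact(const GuardedFrame& f) {
  return std::memcmp(f.bytes + kGuardOffset, kGuard, sizeof kGuard) == 0;
}
```

An overflow that reaches the return address has to run through the guard first, so guardIntact turns false and the program can stop instead of jumping to the attacker's address. Overwrites that never touch the guard go unnoticed by this check, which is one way such a scheme can fall short of covering every scenario.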

Microsoft has produced a website describing how the software development process can be improved to produce more secure code, and in upcoming articles we’ll look at some of these improvements, including threat modelling and abuse cases. We’ll also look at some of the other security holes that can creep into software and be exploited through attacks such as SQL injection or cross-site scripting. In the meantime, give some thought to disabling some of the services and applications you don’t use! ®

Biting the hand that feeds IT © 1998–2021