Introduction

In this chapter we’ll learn about common computer virus infection techniques that target various file formats and system areas.

Boot viruses

The first known successful computer viruses were boot sector viruses; they are rarely used nowadays but are interesting since they can infect any computer by taking advantage of the boot process of personal computers.

Because most computers do not contain an operating system in their ROM, they need to load the system from somewhere else, such as from a disk or from the network. In early systems, the boot order could not be defined, and thus the machine would boot from the diskette, allowing great opportunity for computer viruses to load before the OS.

On newer systems, each partition can be further divided into additional partitions. A hard disk is always addressed by heads, tracks and sectors. The master boot record (MBR) is located at head 0, track 0 and sector 1, which is the first sector on the hard disk. The MBR contains generic, processor-specific code that locates the active boot partition from the partition table (PT) records. The PT is stored in the data area of the MBR. At the beginning of the MBR, there’s a bootstrap loader that finds the active partition and loads its first logical sector as the boot sector. The MBR code can be easily replaced with virus code that loads the original MBR after itself and stays in memory, depending on the installed operating system.
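To make the layout concrete, here is a minimal read-only sketch that parses the partition table out of a 512-byte MBR sector. It assumes the classic PC layout: 446 bytes of loader code, four 16-byte partition entries, and the two-byte signature 55 AA at the end of the sector. The function name parse_mbr and the dictionary keys are illustrative choices, not part of any standard API.

```python
import struct

SECTOR_SIZE = 512
PT_OFFSET = 446              # the partition table follows 446 bytes of loader code
ENTRY_SIZE = 16              # four entries of 16 bytes each
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a valid MBR


def parse_mbr(sector: bytes) -> list[dict]:
    """Return the non-empty partition entries found in an MBR sector."""
    if len(sector) < SECTOR_SIZE or sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("not a valid MBR sector")
    entries = []
    for index in range(4):
        # status, CHS of first sector, type, CHS of last sector, LBA start, sector count
        status, _chs_first, ptype, _chs_last, lba_start, count = struct.unpack_from(
            "<B3sB3sII", sector, PT_OFFSET + index * ENTRY_SIZE
        )
        if ptype == 0:       # type 0 marks an unused slot
            continue
        entries.append({
            "index": index,
            "active": status == 0x80,   # 0x80 flags the active (bootable) partition
            "type": ptype,
            "lba_start": lba_start,
            "sectors": count,
        })
    return entries
```

A scanner could compare the loader code in the first 446 bytes against known-good MBR code; the sketch above only covers the partition table part.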

File infection techniques

Some viruses simply locate another file on the disk and overwrite it with their own copy. This is a primitive technique but also the easiest of all. Overwriting viruses cannot be disinfected from a system; the infected files must be deleted from the disk.

Another overwriting virus infection method is used by the so-called tiny viruses. During the 90s, many virus writers attempted to write the shortest possible binary virus. The shortest viruses are often unable to infect more than a single host program in the same directory in which the virus was executed. This is because finding the next host file would be too “expensive”: it requires writing more code.

Often the virus code is optimized to take advantage of the content of the registers during program execution as they are passed in by the operating system. Thus the virus code itself does not need to initialize registers that have known content set by the system loader. By relying on this condition, virus writers can make their creations even shorter.

Other techniques consist of prepending the virus code to the front of the host program, appending the virus code to the end of the host program, or embedding the host program inside the virus body.

It’s also good to note that a virus can inject a decryptor into the executable’s code. The entry point of the host program is modified to point to the decryptor code, whose location is randomly selected and which may be split into many parts.

When the infected application starts, the decryptor is executed. It decrypts the encrypted virus body and gives it control.

Entry point obscurity viruses

EPO viruses do not change the entry point of the application to infect it; neither do they change the code at the entry point. Instead, they change the program code somewhere else in such a way that the virus gets control at an unpredictable point during execution.

On Win32 systems, EPO techniques became highly advanced. The PE file format can be attacked in different ways. One of the most common EPO techniques is based on hooking an instruction pattern in the program’s code section. A typical Win32 application makes a lot of calls to APIs, and many EPO viruses take advantage of these API CALL sites by changing their pointers to point to the virus’s own start code.

Another common technique of EPO viruses is to reliably locate, in the application’s code section, a function call to a subroutine of the program. Because the byte pattern of a CALL instruction could be part of another instruction’s data, the virus would not be able to identify the instruction boundaries properly by looking for the CALL opcode alone.

To solve this problem, viruses often check to see whether the CALL instruction points to a pattern that appears to be the start of a typical subroutine.
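The ambiguity is easy to demonstrate from the analysis side, since disassemblers and scanners face the same problem when they look for call sites. The sketch below assumes 32-bit x86 code, where a near CALL is encoded as opcode E8 followed by a signed 32-bit displacement relative to the next instruction, and it assumes that a "typical" subroutine opens with push ebp / mov ebp, esp (bytes 55 8B EC). Both the function name and the choice of prologue are illustrative assumptions; real code uses many other prologues.

```python
import struct

PROLOGUE = b"\x55\x8b\xec"  # push ebp; mov ebp, esp (one assumed "typical" prologue)


def find_plausible_calls(code: bytes) -> list[tuple[int, int]]:
    """Return (call_offset, target_offset) pairs whose target looks like a subroutine."""
    hits = []
    for pos in range(len(code) - 4):
        if code[pos] != 0xE8:          # not a near CALL opcode
            continue
        (rel,) = struct.unpack_from("<i", code, pos + 1)
        target = pos + 5 + rel         # displacement is relative to the next instruction
        in_range = 0 <= target <= len(code) - len(PROLOGUE)
        if in_range and code.startswith(PROLOGUE, target):
            hits.append((pos, target))
    return hits


# A real CALL at offset 0, then "mov eax, 0xE8" whose operand contains a stray E8 byte.
sample = bytes.fromhex("e805000000" "b8e8000000" "558becc3")
print(find_plausible_calls(sample))  # prints [(0, 10)]
```

The E8 byte at offset 6 is operand data, and its would-be target falls far outside the buffer, so the prologue check rejects it while the genuine call at offset 0 is kept.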

Newer Win32 viruses infect Win32 executables in such a way that they do not need to modify the original code of the program to take control. To get control, the virus simply changes the import address table entries of the PE host in such a way that each API call the application makes through the import address table runs the virus code instead.

An in-depth look at Win32 viruses

The world of computer antivirus research has changed drastically since Windows 95 appeared on the market. One reason this happened was that a certain number of DOS viruses became incompatible with the newer versions of Windows. In particular, the tricky viruses that used stealth techniques and undocumented DOS features failed to replicate under the new system. Since not many virus writers had enough knowledge about Windows internals, they found a first shortcut: macro viruses, which are generally not dependent on the operating system or on hardware differences.

Infection techniques of 32-bit Windows

Because the most common executable file format on 32-bit Windows is the PE format, most of the infection methods are related to it. The PE format makes it possible for viruses to jump easily from one 32-bit Windows platform to another.

Introduction to the Portable Executable File format

The most important thing to know about PE files is that the executable code on disk is very similar to what the module looks like after Windows has loaded it for execution. This makes the system loader’s job much simpler. In 16-bit Windows, for example, the loader must spend a long time preparing the code for execution, because all the functions that call out to a DLL must be relocated. PE applications don’t need relocation for library calls anymore. Instead, the system loader uses a special area of the PE file, the import address table (IAT), for that functionality.

For Win32, all the memory used by the module for code, data, resources, import tables and export tables is in one contiguous range of linear address space. The only thing that an application knows is the address where the loader mapped the executable file into memory. When the base address is known, the various pieces of the module can easily be found by following pointers stored as part of the image.
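As an illustration of following those pointers, the sketch below reads a few header fields from the raw bytes of a PE file. It relies on the documented layout: the DOS header stores the offset of the PE signature at 0x3C, a 20-byte COFF header follows the signature, AddressOfEntryPoint sits 16 bytes into the optional header, and each section header is 40 bytes long. The function name read_pe_summary and the returned dictionary keys are illustrative assumptions, and the sketch does only minimal validation.

```python
import struct


def read_pe_summary(image: bytes) -> dict:
    """Extract entry point, image base and section table from raw PE bytes."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)   # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")

    coff = pe_offset + 4
    _machine, num_sections, _stamp, _symtab, _nsyms, opt_size, _flags = (
        struct.unpack_from("<HHIIIHH", image, coff)
    )

    opt = coff + 20
    (magic,) = struct.unpack_from("<H", image, opt)
    (entry_rva,) = struct.unpack_from("<I", image, opt + 16)  # AddressOfEntryPoint
    if magic == 0x10B:                                        # PE32
        (image_base,) = struct.unpack_from("<I", image, opt + 28)
    elif magic == 0x20B:                                      # PE32+
        (image_base,) = struct.unpack_from("<Q", image, opt + 24)
    else:
        raise ValueError("unknown optional header magic")

    sections = []
    table = opt + opt_size              # section table follows the optional header
    for i in range(num_sections):
        name, vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", image, table + i * 40
        )
        sections.append({
            "name": name.rstrip(b"\0").decode("ascii", "replace"),
            "virtual_address": vaddr,
            "virtual_size": vsize,
            "raw_pointer": raw_ptr,
            "raw_size": raw_size,
        })
    return {"entry_rva": entry_rva, "image_base": image_base, "sections": sections}
```

Given a file read with open(path, "rb").read(), the entry point’s virtual address at the preferred load address is image_base + entry_rva.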

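Another thing that we should get familiar with is the relative virtual address (RVA), which is the offset of an item from the address where the file is mapped into memory. For example, if the image is mapped at 0x400000, an item at RVA 0x1000 is found at virtual address 0x401000.

Last but not least, a section holds either code or data.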
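Because sections are usually aligned differently on disk and in memory, an RVA is generally not the same as an offset into the file. A minimal sketch of the conversion follows, assuming each section is described by a hypothetical tuple of (virtual_address, virtual_size, raw_pointer, raw_size) taken from the section table. RVAs that fall inside the headers are not handled here.

```python
from typing import Optional

# (virtual_address, virtual_size, raw_pointer, raw_size) for one section
Section = tuple[int, int, int, int]


def rva_to_offset(rva: int, sections: list[Section]) -> Optional[int]:
    """Translate an RVA into a file offset, or None if no file data backs it."""
    for vaddr, vsize, raw_ptr, raw_size in sections:
        span = vsize if vsize else raw_size   # some files leave VirtualSize at 0
        if vaddr <= rva < vaddr + span:
            delta = rva - vaddr
            if delta >= raw_size:
                return None                   # zero-filled at load time, not on disk
            return raw_ptr + delta
    return None                               # RVA is outside every section


sections = [(0x1000, 0x500, 0x400, 0x600), (0x2000, 0x300, 0xA00, 0x200)]
print(hex(rva_to_offset(0x1010, sections)))   # prints 0x410
```

In the example, the first section starts at RVA 0x1000 in memory but at offset 0x400 in the file, so RVA 0x1010 maps to file offset 0x410.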

Header infection

A virus inserts itself between the end of the PE header and the beginning of the first section. It modifies the AddressOfEntryPoint field in the PE header to point to the entry of the virus instead.
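From a scanner’s point of view, this leaves a visible trace: the entry point no longer falls inside any section. The following sketch shows one way such a check could look. The function name locate_entry_point, the (name, virtual_address, virtual_size) tuples, and the decision to flag anything outside a section are assumptions made for illustration, not a description of how any particular antivirus product works.

```python
def locate_entry_point(entry_rva: int,
                       sections: list[tuple[str, int, int]]) -> tuple[str, bool]:
    """Return (location, suspicious) for an entry point RVA.

    sections holds (name, virtual_address, virtual_size) tuples from the section table.
    """
    if not sections:
        raise ValueError("section table is empty")
    first_section_rva = min(vaddr for _name, vaddr, _size in sections)
    if entry_rva < first_section_rva:
        # Entry point lies in the header area, before the first section begins.
        return ("headers", True)
    for name, vaddr, vsize in sections:
        if vaddr <= entry_rva < vaddr + vsize:
            return (name, False)
    # Entry point is past the headers but not covered by any section.
    return ("unmapped", True)


layout = [(".text", 0x1000, 0x800), (".data", 0x2000, 0x400)]
print(locate_entry_point(0x1234, layout))  # prints ('.text', False)
print(locate_entry_point(0x0300, layout))  # prints ('headers', True)
```

A flag raised by this check is only a hint, since some packers and hand-crafted files also place the entry point in unusual locations.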

Prepending viruses

The easiest way to infect PE files is to overwrite their beginning. Some applications won’t work correctly after the infection, which raises a red flag for the antivirus and exposes the virus.

Appending viruses with no section header

This is a more advanced method used by some viruses that do not add a new section header at the end of the section table. Rather, the virus patches the last section’s header so that the virus code fits into that section. In this way, the virus can infect all PE EXE files easily.

Appending viruses with no Entry Point modification

There are some viruses that do not modify the AddressOfEntryPoint field of the infected program. Instead, they calculate where the original AddressOfEntryPoint points and place a JMP instruction there that points to the virus body.

Kernel32.dll infection

Kernel32.dll infectors don’t attack the entry point of a program. Instead, this type of virus must gain control differently. PE files have many other entry points that are useful for viruses, especially DLLs, which export APIs by nature. Therefore, the easiest way to attack Kernel32.dll is to patch the export RVA of one of the APIs (for instance, GetFileAttributesA) to point to the virus code at the end of the DLL image. After that, every program that imports from Kernel32.dll is linked to the infected DLL, and whenever the application calls the API to which the virus code has been attached, the virus code gets control.

DLL load insertion technique

This particular infection technique is based on manipulation of PE files in such a way that when the host application is loaded, it will load an extra DLL, which is the virus code.

References

  1. The Art of Computer Virus Research and Defense, Chapter 4
  2. Master Boot Record
  3. Partition Table
  4. PE file format
  5. DLL