An introduction to binary diffing for ethical hackers

Binary diffing is a reverse-engineering technique that compares two versions of the same software to reveal newly introduced code changes, much like the spot-the-difference puzzles in Reader’s Digest.

In ethical hacking, the goal of binary diffing is to flag new security patches in order to locate and identify the vulnerabilities they fix. Penetration testers and red teams can then use this information to, for example, launch N-day exploits against unpatched systems.

Although simple in theory, binary diffing is complex in practice. The following excerpt from Chapter 18, “Next-Generation Patch Exploitation,” of Gray Hat Hacking: The Ethical Hacker’s Handbook, Sixth Edition by authors Allen Harper, Ryan Linn, Stephen Sims, Michael Baucom, Daniel Fernandez, Huáscar Tejeda and Moses Frost, published by McGraw Hill, explains how to get started and presents four tools for binary diffing. Download a PDF of the whole chapter here.

And check out this Q&A, in which lead author Harper explains why it’s so important for hackers to ethically disclose vulnerabilities they discover and the devastating impact of unethical disclosures.

In response to the lucrative growth of vulnerability research, interest in the binary diffing of patched vulnerabilities continues to rise. Privately disclosed and internally discovered vulnerabilities typically offer limited technical details to the public. The more details released, the easier it is for others to locate the vulnerability. Without these details, patch diffing allows a researcher to quickly identify the code changes related to mitigating a vulnerability, which can sometimes lead to successful weaponization. The inability of many organizations to patch quickly presents a lucrative opportunity for offensive security practitioners.

Introduction to binary diffing

When changes are made to compiled code, such as libraries, applications, and drivers, the delta between the patched and unpatched versions can offer an opportunity to discover vulnerabilities. At its most basic level, binary diffing is the process of identifying the differences between two versions of the same file, such as version 1.2 and version 1.3. Arguably the most common targets of binary diffs are Microsoft patches; however, the technique can be applied to many different types of compiled code. Various tools are available to simplify the binary diffing process, allowing a reviewer to quickly identify code changes between versions of a disassembled file.
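To make the idea of a delta concrete, here is a minimal sketch in C of the most naive form of comparison: counting the byte positions at which two versions of a file differ. The helper name count_byte_diffs and its parameters are assumptions made for illustration. Raw byte comparison like this tends to be too noisy for real patches, because even a small source change can shift code and addresses throughout a recompiled file, which is why the tools discussed later compare disassembled functions instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the byte positions at which two buffers differ.
 * Bytes that exist in only one buffer (length mismatch) count as differences.
 * If first_diff is non-NULL, it receives the offset of the first difference,
 * or the longer length when the buffers are identical. */
size_t count_byte_diffs(const unsigned char *old_buf, size_t old_len,
                        const unsigned char *new_buf, size_t new_len,
                        size_t *first_diff)
{
    assert(old_buf != NULL && new_buf != NULL);

    size_t shared = old_len < new_len ? old_len : new_len;
    size_t longest = old_len > new_len ? old_len : new_len;
    size_t diffs = 0;
    size_t first = longest;

    /* Compare the region that both versions have in common. */
    for (size_t i = 0; i < shared; i++) {
        if (old_buf[i] != new_buf[i]) {
            if (diffs == 0) {
                first = i;
            }
            diffs++;
        }
    }

    /* Trailing bytes present in only one version. */
    if (longest > shared) {
        if (diffs == 0) {
            first = shared;
        }
        diffs += longest - shared;
    }

    if (first_diff != NULL) {
        *first_diff = first;
    }
    return diffs;
}
```

A caller would read both file versions into memory and pass the two buffers with their lengths; a return value of zero means the files are byte-for-byte identical.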


Application diffing

New versions of applications are commonly released on an ongoing basis. The reasons for a release can include introducing new features, changing code to support new platforms or kernel versions, leveraging new compile-time security controls such as canaries or Control Flow Guard (CFG), and fixing vulnerabilities. Often, a new version combines several of these. The more changes made to the application’s code, the more difficult it can be to identify those related to a patched vulnerability. Much of the success in identifying code changes related to vulnerability fixes depends on limited disclosures. Many organizations choose to release minimal information about the nature of a security patch. The more clues we can obtain from this information, the more likely we are to discover the vulnerability. If a disclosure announcement states that there is a vulnerability in the handling and processing of JPEG files, and we identify a changed function named RenderJpegHeaderType, we can infer that it is related to the patch. These types of clues will be shown in real-world scenarios later in the chapter.
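The JPEG clue above amounts to filtering the list of changed functions by a keyword taken from the advisory. The following C sketch shows one way to do that, assuming the changed function names have already been exported from a diffing tool. The helpers contains_keyword and filter_changed_functions are hypothetical names, and a plain case-insensitive substring match is only a first-pass filter, not a substitute for reading the changed code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: case-insensitive substring test.
 * Returns 1 if keyword occurs anywhere in name, 0 otherwise. */
int contains_keyword(const char *name, const char *keyword)
{
    assert(name != NULL && keyword != NULL);

    if (*keyword == '\0') {
        return 1;
    }
    for (const char *start = name; *start != '\0'; start++) {
        const char *n = start;
        const char *k = keyword;
        while (*n != '\0' && *k != '\0' &&
               tolower((unsigned char)*n) == tolower((unsigned char)*k)) {
            n++;
            k++;
        }
        if (*k == '\0') {
            return 1; /* matched the whole keyword */
        }
    }
    return 0;
}

/* Hypothetical helper: stores the indices of names that contain keyword
 * in out (which must hold at least count entries) and returns how many
 * names matched. */
size_t filter_changed_functions(const char *const names[], size_t count,
                                const char *keyword, size_t out[])
{
    assert(names != NULL && out != NULL);

    size_t hits = 0;
    for (size_t i = 0; i < count; i++) {
        if (contains_keyword(names[i], keyword)) {
            out[hits++] = i;
        }
    }
    return hits;
}
```

Given a list of changed functions that includes RenderJpegHeaderType, filtering on "jpeg" narrows the review to that function first.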

A simple example of a C code snippet that includes a vulnerability is shown here:

/* Unpatched code that includes the unsafe gets() function. */
#include <stdio.h>

int get_Name(){
    char name[20];
    printf("\nPlease enter your name: ");
    gets(name);
    printf("\nYour name is %s.\n\n", name);
    return 0;
}

And here is the patched code:

/* Patched code that includes the safer fgets() function. */
#include <stdio.h>

int get_Name(){
    char name[20];
    printf("\nPlease enter your name: ");
    fgets(name, sizeof(name), stdin);
    printf("\nYour name is %s.\n\n", name);
    return 0;
}

The problem with the first snippet is the use of the gets() function, which offers no bounds checking and therefore creates an opportunity for a buffer overflow. The patched code uses fgets(), which requires a size argument and thus helps to prevent a buffer overflow. The fgets() function is still not an ideal choice, because it cannot properly handle null bytes, such as those in binary data; however, it is a better choice than gets() if used correctly. We’ll look at this simple example later using a binary diffing tool.
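The effect of the size argument can be seen in a small C sketch. The helper read_name is a hypothetical wrapper written for this illustration, not part of the book's example. It relies on the standard behavior of fgets(), which stores at most size - 1 characters and always null-terminates the buffer, so a 40-character line read into a 20-byte buffer is cut to 19 characters rather than overflowing it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in into buf, storing at most
 * size - 1 characters, and strips a trailing newline if one was read.
 * Returns the stored length, or -1 on end of file or read error. */
int read_name(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);

    if (fgets(buf, (int)size, in) == NULL) {
        return -1;
    }
    /* fgets() keeps the newline when the line fits; remove it. */
    buf[strcspn(buf, "\n")] = '\0';
    return (int)strlen(buf);
}
```

Any input beyond the buffer size stays in the stream and is returned by the next read, so a caller that needs whole lines has to detect and handle truncation.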

Patch diffing

Security patches, such as those from Microsoft and Oracle, are among the most lucrative targets for binary diffing. Microsoft has long had a well-planned patch management process that follows a monthly schedule, with patches released on the second Tuesday of each month. The files patched are most often dynamic link libraries (DLLs) and driver files, although many other file types, such as .exe files, also receive updates. Many organizations do not patch their systems quickly, leaving an opportunity for attackers and penetration testers to compromise these systems with publicly disclosed exploits or with exploits developed privately through patch diffing. Starting with Windows 10, Microsoft has been much more aggressive with patching requirements, making it difficult to defer updates. Depending on the complexity of the patched vulnerability and the difficulty of locating the relevant code, a working exploit can sometimes be developed within days or weeks of the patch's release. Exploits developed after reverse engineering security patches are commonly referred to as 1-day or n-day exploits. This differs from 0-day exploits, for which no patch is available at the time they are discovered in the wild.

As we progress through this chapter, you will quickly see the benefits of diffing code changes to drivers, libraries, and applications. Although not a new discipline, binary diffing has steadily attracted the attention of security researchers, hackers, and vendors as a viable technique for discovering vulnerabilities and profiting from them. The price of a 1-day exploit is usually not as high as that of a 0-day exploit; however, it is not uncommon to see attractive payouts for highly sought-after exploits. Because most vulnerabilities are privately disclosed with no publicly available exploit, exploit framework vendors want to offer more exploits tied to these privately disclosed vulnerabilities than their competitors do.

Binary diffing tools

Manually analyzing the compiled code of large binaries with a disassembler such as the Interactive Disassembler (IDA) Pro or Ghidra can be a daunting task for even the most skilled researcher. Freely available and commercial binary diffing tools can simplify the process of zeroing in on the code of interest related to a patched vulnerability. Such tools can save hundreds of hours that would otherwise be spent reversing code that may have no relation to the vulnerability being sought. Here are some of the most widely known binary diffing tools:

Each of these tools works as a plug-in for IDA (or Ghidra if indicated), using various techniques and heuristics to determine the code changes between two versions of the same file. You may get different results from each tool when using the same input files. Each of the tools requires access to IDA database (.idb) files, hence the requirement for a licensed version of IDA, or the free version with turbodiff. For the examples in this chapter, we will use the commercial tool BinDiff as well as turbodiff, because the latter works with the free version of IDA 5.0, which can still be found online at various sites, such as https://www.scummvm.org/news/20180331/. This allows those who do not have a commercial version of IDA to complete the exercises. The only tools on the list that are actively maintained are Diaphora and BinDiff. The authors of each should be highly commended for providing such great tools, which save us countless hours trying to find code changes.
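As a rough illustration of what a heuristic for finding code changes can look like, the C sketch below hashes the raw bytes of every function and flags any function whose hash differs from its same-named counterpart in the older version. Everything here is an assumption made for the example: the func_rec record, the fnv1a and flag_changed helpers, and matching by name. It does not describe how BinDiff, turbodiff, or Diaphora work internally, and real tools also have to match functions when names are unavailable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record describing one function extracted from a binary. */
struct func_rec {
    const char *name;
    const unsigned char *bytes;
    size_t len;
};

/* 32-bit FNV-1a hash of a byte buffer. */
uint32_t fnv1a(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* For each function in the new version, sets changed[i] to 0 when a
 * same-named function with an equal hash exists in the old version, and
 * to 1 when the hash differs or no counterpart exists (a new function).
 * Returns the number of flagged functions. Equal hashes suggest, but do
 * not prove, identical bytes. */
size_t flag_changed(const struct func_rec *old_funcs, size_t old_count,
                    const struct func_rec *new_funcs, size_t new_count,
                    int changed[])
{
    assert(old_funcs != NULL && new_funcs != NULL && changed != NULL);

    size_t flagged = 0;
    for (size_t i = 0; i < new_count; i++) {
        changed[i] = 1; /* flagged until an unchanged match is found */
        for (size_t j = 0; j < old_count; j++) {
            if (strcmp(new_funcs[i].name, old_funcs[j].name) == 0) {
                changed[i] = fnv1a(new_funcs[i].bytes, new_funcs[i].len) !=
                             fnv1a(old_funcs[j].bytes, old_funcs[j].len);
                break;
            }
        }
        flagged += (size_t)changed[i];
    }
    return flagged;
}
```

Exact hashing like this only separates unchanged functions from everything else; it says nothing about how similar two changed functions are, which is where the more elaborate heuristics of dedicated diffing tools come in.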
