Discovering potentially abusable binaries with streamlined PE Import Table searching

Introduction

I decided to put this blog post together only to share a simple idea which could potentially be useful or inspirational to some folks who are into – or are thinking about getting into – binary analysis.

Those familiar with malware analysis know that examining the PE Import Table is one of the first steps of static analysis. Its entries list the external functions (exported by shared libraries, i.e. DLLs) that the executable uses, most notably common WinAPI functions, and thus reflect most of the basic functionality the application implements.

For example, let’s have a look at some of the functions imported by putty.exe (displayed with Process Hacker’s PEview):

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image1

Based solely on these few entries, we can assume that the application can create files and run external programs. Together with networking functions (imports from WSOCK32.dll, which I expected to see in putty.exe as well), these make up the usual feature set of a typical backdoor, even though the executable is a legitimate tool. Keep in mind that legitimate tools like this can be, and often are, used as backdoors; when it comes to classifying something as legitimate or malicious, the context of usage matters a lot.

It is worth mentioning that reviewing Import Tables is a superficial method of analysis and cannot be relied upon to determine whether a given application is malicious. Numerous techniques make it possible to use or implement the required features without revealing them in Import Table entries (e.g., reflective DLL loading or static linking). An Import Table entry is just an indicator. And again, the fact that an application uses features usually found in backdoors does not automatically mean that it is not legitimate. These are features, after all.

PE Import Tables can also be abused (modified), for example to change the original behavior of the executable (e.g. make it load a different shared library).

With the basics covered, the rest of this post is about using information extracted from Import Tables to make an initial selection of executables that appear to use specific functionality we could be interested in for research purposes. The idea is to extract the Import Table from every executable in scope, save the entries in a database (SQLite did the trick for me), and then search it to pick candidates for further analysis. The goal is up to us – we could be looking for:

  • backdoors,
  • abusable functionalities (like persistence, proxy execution, file downloading, or other features covered by the LOLBAS and GTFOBins projects),
  • vulnerabilities.

Demonstration

For my own convenience, I decided to parse PE Import Tables with radare2 under Windows Subsystem for Linux (the command line is powerful and easily scriptable).

Extracting the Import Table entries from an executable is as simple as running rabin2 -i path_to_executable:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image2

Executed on my version of notepad.exe, the full output produced 378 lines in total:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image3

Now, all we want from this output are the fifth and sixth columns: the shared library name and the function name (lib + name, e.g. KERNEL32.dll + GetProcAddress):

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image4

To avoid irrelevant output lines, we also want to skip those where the fourth column contains a value other than FUNC.
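The filtering described above can be sketched in a few lines of Python. This is only an illustration, not necessarily how the script linked in this post does it: it assumes the rabin2 -i output consists of six whitespace-separated columns in the order described above (the last three being type, lib and name), and the function name parse_rabin2_imports is made up for this example.

```python
from typing import Iterator, Tuple


def parse_rabin2_imports(output: str) -> Iterator[Tuple[str, str]]:
    """Yield (library, function) pairs from the text printed by `rabin2 -i`.

    Assumption: every import row has six whitespace-separated columns,
    with the type in the fourth, the library in the fifth and the
    function name in the sixth column.
    """
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, separators and any row whose
        # fourth column is something other than FUNC.
        if len(fields) != 6 or fields[3] != "FUNC":
            continue
        yield fields[4], fields[5]
```

Working on the captured text rather than on a live rabin2 process keeps the parser easy to test; the output could be obtained with something like subprocess.run(["rabin2", "-i", path], capture_output=True, text=True).stdout.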

Then it would be nice to store these records in a database, so we can collect and then analyze this data for multiple executables. As I mentioned earlier, I chose SQLite, with the following table layout:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image5
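For readers who cannot see the screenshot, here is a rough sketch of what such a table could look like when created from Python. The table name PE_import and the columns image_path, library_name and function_name are taken from the queries used later in this post; the column types and the helper names open_catalog and store_imports are assumptions made for illustration only.

```python
import sqlite3
from typing import Iterable, Tuple


def open_catalog(db_path: str) -> sqlite3.Connection:
    """Open (or create) the catalog database and make sure the table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS PE_import (
            image_path    TEXT NOT NULL,  -- path of the scanned executable
            library_name  TEXT NOT NULL,  -- e.g. KERNEL32.dll
            function_name TEXT NOT NULL   -- e.g. GetProcAddress
        )
        """
    )
    return conn


def store_imports(
    conn: sqlite3.Connection,
    image_path: str,
    imports: Iterable[Tuple[str, str]],
) -> None:
    """Insert one row per (library, function) pair found in image_path."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT INTO PE_import (image_path, library_name, function_name) "
            "VALUES (?, ?, ?)",
            [(image_path, lib, name) for lib, name in imports],
        )
```

One flat table is enough here because every question we want to ask boils down to "which images import function X from library Y".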

To automate the process, I put the rabin2 output parsing and database storing logic into a simple Python script: https://gist.github.com/ewilded/36b3ed598045c316820a6ac5ac1834b9

Then finally, to build the database for *exe files, I ran it like this:
$ find /mnt/c/Windows/ -iname '*exe' -exec python3 PE_parse_and_catalog.py -i -P {} \;
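If find is not available, the same file selection can be approximated in Python. The sketch below is an assumption-laden equivalent of the -iname '*exe' filter (the name find_executables is made up for this example); unlike find, it only yields files, never directories.

```python
import os
from typing import Iterator


def find_executables(root: str) -> Iterator[str]:
    """Yield paths of files under root whose name ends with 'exe' (any case)."""
    # os.walk silently skips directories it cannot list, which is handy
    # when crawling a system directory without full permissions.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith("exe"):
                yield os.path.join(dirpath, name)
```

Each yielded path could then be handed to the cataloging script, one process per file, just like the -exec part of the command above does.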

The entire process took quite a bit of time and generated a 62-megabyte import_tables.db output file:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image6

Now, from this point, browsing is as simple as regular SQLite usage.

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image7

Now, let’s take the CreateProcess WinAPI function and search for its uses. WinAPI functions that take strings as arguments come in two variants depending on the character encoding, with the suffix A for ANSI or W for wide characters (Unicode), so in this case we distinguish CreateProcessA and CreateProcessW.

Let’s start by obtaining the full list of all unique CreateProcess-like functions present in the Import Tables of the executables from C:\Windows and its subdirectories, by simply issuing a SELECT DISTINCT query for any names containing the CreateProcess string:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image8
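The same kind of lookup can be scripted. The following is a minimal sketch that assumes the PE_import table with library_name and function_name columns, as used in the queries in this post; the helper name distinct_functions_like is hypothetical.

```python
import sqlite3
from typing import List, Tuple


def distinct_functions_like(
    conn: sqlite3.Connection, fragment: str
) -> List[Tuple[str, str]]:
    """Return unique (library, function) pairs whose name contains fragment."""
    rows = conn.execute(
        "SELECT DISTINCT library_name, function_name FROM PE_import "
        "WHERE function_name LIKE ? "
        "ORDER BY library_name, function_name",
        (f"%{fragment}%",),  # bound parameter, so no quoting problems
    )
    return rows.fetchall()
```

Note that SQLite's LIKE is case-insensitive for ASCII characters by default, so a search for createprocess would return the same rows.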

Such lists can by themselves reveal functions that are potentially useful (e.g. for detection evasion) and not yet known to us.

Anyway, for starters, let’s just stick to CreateProcessA and CreateProcessW from KERNEL32.dll, by issuing the following query:

select image_path from PE_import WHERE library_name='KERNEL32.dll' and (function_name = 'CreateProcessA' or function_name = 'CreateProcessW')

With my database, the query returned 275 executables:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image9
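For repeated searches it can be convenient to wrap this query in a small helper that builds the A and W variants from a base name. The sketch below is hypothetical (images_importing is not part of the original script) and deviates from the query above in one respect: it compares the library name case-insensitively, on the assumption that some binaries may spell it differently (e.g. kernel32.dll).

```python
import sqlite3
from typing import List


def images_importing(
    conn: sqlite3.Connection, library: str, base_name: str
) -> List[str]:
    """List images importing base_name + 'A' or base_name + 'W' from library."""
    rows = conn.execute(
        "SELECT DISTINCT image_path FROM PE_import "
        "WHERE library_name = ? COLLATE NOCASE "
        "AND function_name IN (?, ?) "
        "ORDER BY image_path",
        (library, base_name + "A", base_name + "W"),
    )
    return [row[0] for row in rows]
```

DISTINCT matters here because an image importing both variants would otherwise be listed twice.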

Obviously, the same sort of Import Table gathering and searching can be done for DLL files. A similar approach could also be applied to DLL Export Tables to find DLLs implementing functions with interesting-sounding names.

Some analysis examples

Even though the main goal of this post was to just share the concept, let’s complement it with some examples of how executables revealed this way could be analyzed for abusable behaviors.

 

C:\Windows\System32\Sysprep\sysprep.exe

Let’s pick one of the results of the CreateProcess-like query (select image_path from PE_import WHERE function_name LIKE '%CreateProcess%'): sysprep.exe from C:\Windows\System32\Sysprep\:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image10

The screenshot below shows its Import Table entries displayed with Process Hacker’s PEview, manually confirming the presence of KERNEL32.dll::CreateProcessW:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image11

Now, to find out whether the way the application uses this function could be abused, we need to identify all of its uses (in other words, every place in the program where this function is called) and whether the user can control the arguments of any such call (so that arbitrary executables or arguments could be provided, e.g. to achieve proxy execution or command/argument injection). There are many tools for static analysis; I personally love NSA’s Ghidra.

Before we create a new project and import the executable file into it, it is good to download debugging symbols first (if they are available), so we can enjoy descriptive and thus intuitive function names once we start our analysis.

For most Microsoft binaries, this can be achieved by running the symchk command, e.g. for sysprep.exe:

C:\WINDOWS\system32>symchk /v /if C:\Windows\System32\Sysprep\sysprep.exe

If successful, we should see output like this:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image12

Now, we can run Ghidra, create a new project and import the executable into it:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image13

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image14

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image15

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image16

If the relevant symbol database PDB file was found and successfully loaded, we should see something like this in the Import Results Summary that pops up once the import is finished:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image17

Now, to run the CodeBrowser tool, we select the image in the project tree and double-click it:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image18

Then Ghidra displays the following pop up to confirm whether we want it to automatically analyze the executable (yes, we do):

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image19

We go with the default settings and click Analyze:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image20

Now, once analysis is finished, let’s get right into the Symbol Tree and select Imports:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image21

Then expand it, pick KERNEL32.dll, expand it as well to finally locate and pick the CreateProcessW entry:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image22

That is all Ghidra needs to generate the lists of incoming and outgoing references in the Function Call Trees window:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image23

Incoming references are call trees leading to every use of the CreateProcessW function (every place in the program where it is called). Thus, let’s expand the trees under Incoming References:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image24

In this case there is only one place in the program where CreateProcessW is called. It’s somewhere in the WdsExecuteApplication function (even the name suggests that it is a wrapper around CreateProcessW), although two different code branches lead to it.

Both branches start from the executable’s entry point (the function named entry), then go through the standard __mainCRTStartup and WinMain functions (these, just like entry, are standard functions automatically created by the compiler, so a significant number of function call trees start this way). Then there are two different calls to the custom HandleFatalError function, both of which end up calling OrchestrateHandleError in the same way. From there, finally, the custom wrapper function named WdsExecuteApplication is run.

So let’s select the OrchestrateHandleError function in the call tree:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image25

And then call up Decompiler by clicking the CodeBrowser -> Window -> Decompiler menu option:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image26

We can see that the OrchestrateHandleError function does not take any arguments (void), and that it runs the WdsExecuteApplication function using a fixed string %windir%\\Setup\\Scripts\\ErrorHandler.cmd as the second argument:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image27

So, it turns out there are no arguments we could control here (e.g. values derived from the command line, i.e. WinMain’s third argument, LPSTR lpCmdLine), and the script is always run the same way. However, it seems we could trick it into running something other than the C:\Windows\Setup\Scripts\ErrorHandler.cmd script if we (temporarily) tamper with the %windir% environment variable before running sysprep.exe: set it to point to a directory we control, create the Setup\Scripts subdirectories there, and place our own script under the ErrorHandler.cmd name, e.g. put it at C:\Users\Public\Setup\Scripts\ErrorHandler.cmd and temporarily set %windir% to C:\Users\Public.
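To illustrate the reasoning only, the sketch below mimics Windows-style %NAME% expansion against an arbitrary environment mapping. How sysprep.exe actually expands the string is not shown in this post, so the assumption here is that it behaves like the usual WinAPI expansion (case-insensitive names, unknown names left untouched); expand_percent_vars is a made-up helper, not a real API.

```python
import re
from typing import Mapping

# Matches %NAME% references such as %windir%.
_PERCENT_VAR = re.compile(r"%([^%]+)%")


def expand_percent_vars(template: str, env: Mapping[str, str]) -> str:
    """Expand %NAME% references in template using the given environment.

    Names are matched case-insensitively and unknown names are left as-is,
    which is the behavior assumed for the target binary.
    """
    lowered = {key.lower(): value for key, value in env.items()}

    def replace(match: "re.Match[str]") -> str:
        return lowered.get(match.group(1).lower(), match.group(0))

    return _PERCENT_VAR.sub(replace, template)
```

With env set to {"windir": r"C:\Users\Public"}, the fixed string from the decompiled function resolves to a path under C:\Users\Public instead of C:\Windows, which is exactly what the tampering relies on.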

 

C:\Windows\System32\runexehelper.exe

Now let’s see our second example:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image28

After downloading symbols, creating a new Ghidra project and loading the binary (the same way we did for sysprep.exe), we get just one simple Function Call Tree:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image29

We can then navigate it with Decompiler, starting from the top (entry):

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image30

From here we can double-click the selected __scrt_common_main_seh call and make Ghidra decompiler step into the function body (this is how we can navigate down the tree):

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image31

Eventually on line 54 we find arguments from the command line and environment variables being collected and passed as arguments to the wmain function:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image32

Once we enter wmain, on lines 97 and 106 we see references to, respectively, a file named runexewithargs_output.txt and an environment variable named %diagtrack_action_output%. Both turned out to be quite relevant a bit later:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image33

Then finally, we get to the bottom of this:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image34

From that point, a quick round of dynamic testing of runexehelper.exe with x64dbg, using arbitrary command lines, revealed that the executable can in fact be used for proxy execution, but only if the following two conditions are met:

  • an environment variable named diagtrack_action_output must exist and point to an existing writable directory,
  • a file named runexewithargs_output.txt must not exist in the directory pointed to by the %diagtrack_action_output% variable (this output file is created upon every execution).
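These two conditions are easy to check up front. The sketch below is a hypothetical pre-flight check (the function unmet_conditions is made up for this example) that takes an environment mapping and reports which of the conditions above would not be met; it does not run runexehelper.exe itself.

```python
import os
from typing import List, Mapping

ENV_NAME = "diagtrack_action_output"
OUTPUT_FILE = "runexewithargs_output.txt"


def unmet_conditions(env: Mapping[str, str]) -> List[str]:
    """Return human-readable reasons why the conditions above are not met."""
    # Windows treats variable names case-insensitively, so do the same here.
    lowered = {key.lower(): value for key, value in env.items()}
    out_dir = lowered.get(ENV_NAME)
    if not out_dir:
        return [f"{ENV_NAME} is not set"]
    if not os.path.isdir(out_dir):
        return [f"{out_dir} is not an existing directory"]

    problems = []
    if not os.access(out_dir, os.W_OK):
        problems.append(f"{out_dir} is not writable")
    if os.path.exists(os.path.join(out_dir, OUTPUT_FILE)):
        problems.append(f"{OUTPUT_FILE} already exists in {out_dir}")
    return problems
```

An empty list means both conditions hold; calling unmet_conditions(os.environ) checks the current process environment.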

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image35

That first execution created an empty file under C:\Users\ewilded\runexewithargs_output.txt. Thus, the second attempt – this time to call mspaint.exe – failed, and succeeded only after the output file was removed first:

Atos cybersecurity Blog Security Dive Julian Horoszkiewicz image36

As the results of various SELECT queries run against the SQLite database suggest, there are plenty more examples to follow up on. Some of them suggest potential for proxy execution or even privilege escalation, others appear to point at new persistence mechanisms, and so on. The point is, searching binaries this way will often take us to interesting places.

 

Conclusions

Streamlined Import Table parsing and searching provides an easy way to identify potentially interesting executables. In addition to manual analysis, some automation could be built on top of it, e.g. scheduled decompilation combined with taint checking of arguments as they pass through the call trees between application entry points and particular WinAPI functions, run as new binaries and new versions get deployed to our systems with installations and upgrades.
