Vaughn Of The Dead Pt III: Some small-fry

December 17th, 2007 Greg

Since we last spoke, Vaughn has seen very little action. The problem is not the week or so of down-time he’s experienced, but the fact that his virtual environment lives inside my computer. While the virtual PC itself is about as safe as a chainsaw-piñata, the internet connection to which it is bridged is protected by the firewalls and antivirus of my computer and router. This is such an elementary design-flaw that I was tempted to keep it quiet, but let’s move on and never speak of it again. The problem has gone unfixed since I discovered it (a fortnight ago) and, as this is the lowest-priority project on my agenda, it will probably remain so for a little while. At least until I work out why my router’s demilitarised-zone setting seems to do absolutely nothing.

Anyway, a piece of malware found its way into my possession via a different medium, and so I kept my side of the bargain and gave it to Vaughn. But don’t get too excited - this adware trojan is so uninteresting that none of the antivirus companies bothered to give it an identity any more unique than ‘Generic Delphi Downloader’ (yes, that’s right, Delphi). I still don’t know where it came from (after China).

The original infection was on my rarely-used laptop. First came the mysterious popup advertisements, then the rogue ‘Add Favourite’ dialog boxes. Both coincided with the installation of Internet Explorer 7 and so - being a long-time Firefox convert - I didn’t take too much notice. But when the activity persisted even after IE7 had been removed from the picture, I took a peek at my process list. Windows’s Task Manager showed nothing suspicious - mainly because it is borderline-useless - but OllyDbg’s attach menu showed up seven instances of svchost.exe. If you know anything about Windows Services, you’ll know that these are nothing more than user-mode processes that house DLL-based shared-service modules. At least that’s true of the six instances resident at “%systemroot%/system32”, but the one at “%ProgramFiles%/Internet Explorer” was rather more spurious.

A brief analysis showed that this file (with the ‘hidden’ and ‘system’ file attributes set) was compiled by Borland Delphi 7.0 and that despite its obviously trojan nature, the process - true to its name - actually did host a Windows Service. Only, a bad one. After familiarising myself with the verbose assembly produced by Delphi (like any other very-high-level language) and the fastcall-esque nature of the internals, I produced a flow-chart of the trojan’s life-cycle.

  • Initialise the Delphi run-time library.
  • Get the executable path and set the file attributes to ‘SYSTEM | HIDDEN’.
  • Look for a service named ‘windownetpker’. If none such is registered with the system, create it.

    If you look for this service in the administrative tools, you’ll see it just points straight back at the same executable file. It hides under the name ‘Window Image Worker’, which is presumably supposed to resemble the legitimate ‘Windows Image Acquisition’. Naturally, it is set to auto-start.

  • If the service isn’t running, start it.
  • Check the user-name of the process environment and determine how the process was launched.

    When a program is launched manually, this user-name is that of whoever is logged in. But when launched as a service, GetUserName returns ‘SYSTEM’, or in certain cases ‘LOCAL SERVICE’. By comparing against these two the trojan works out whether it is supposed to act as a service or just a plain and filthy malware executable.

  • In the normal-user case, clean up and quit. Otherwise, enter the secondary phase, idling in a service event-handling loop.

    It is made quite clear in the MSDN documentation that, when launched, a service should not do anything before calling StartServiceCtrlDispatcher. Not only does our trojan violate this, but it also goes through all the unnecessary work of installing and starting the service, even when it probably is the service. Now, I’m quite happy for people to infect my computer without consent but breaking the rules is just plain rude.
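The launch-mode decision in that flow-chart comes down to a string comparison. Here is a minimal sketch of it, assuming the account name has already been fetched (by GetUserName, in the trojan's case). The names `LaunchMode` and `classify_launch` are my own, and the exact, case-sensitive match is an assumption rather than something read off the disassembly.

```cpp
#include <cassert>
#include <string>

enum class LaunchMode { Service, Manual };

// Hypothetical helper: decide how the process was started, judging only by
// the account name it is running under.
LaunchMode classify_launch(const std::string& user_name) {
    // The two names the trojan is described as comparing against.
    if (user_name == "SYSTEM" || user_name == "LOCAL SERVICE")
        return LaunchMode::Service;
    // Anything else is taken to be a logged-in user running the exe by hand.
    return LaunchMode::Manual;
}
```

A caller would clean up and quit on `LaunchMode::Manual` and drop into the service loop otherwise.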

The call to CreateServiceA:


The service payload isn’t much more interesting. It opens up a few UDP ports (starting from 1025), establishes a TCP connection with cpk4.easy78.cn (HTTP) and waits for the spam to come rolling in. When such an item does come along, the service displays a popup (usually offering ‘great savings’ on something or other) or attempts to add a page to your IE Favourites. The mundaneness was all too much for me and so I didn’t probe much further, but the trojan doesn’t seem to use any suspicious APIs and it even gives you the offer to ‘Cancel’ the popup ads. A little courtesy goes a long way.

That’s all I have to say about ‘Generic Delphi Downloader’ and any other ‘generic’ downloaders that I may run into in future. Here’s hoping that something truly wicked finds its way to me before too long. And for the record, I don’t have any problems with Delphi. It just struck me as an odd choice for the task.

Armadillo, Nanomites and vectored exception-handling

December 11th, 2007 Greg

Let me tell you about a problem I ran into a couple of years ago, and the solution I ended up with. If you’ve ever heard of ArmInline, then this is the story behind its Nanomites tool.

The Background

If you’re not already aware, Armadillo is a commercial anti-cracking software scheme for Windows: you buy a license, throw your exe (or DLL) at it, and you end up with a new, protected, file. This new program does just what the old one did, but it’s far harder to reverse-engineer. As the attacker, our goal is to remove the protection so that we can have our wicked way with the program inside.

Among other things, Armadillo employs a system known as Debug Blocker. Briefly put, this causes the program to create two instances whenever it is run - we call them the ‘parent’ and ‘child’ processes. The parent acts as a user-mode debugger, nannying the child (which does all the real work) to make sure that no bad guys can get too close. This system was fairly easy to defeat - all you needed to do was detach the parent process’s debugger at an appropriate moment and attach your own.

So to prevent this happening, the developers of Armadillo invented what they call Nanomites. When the protector is installed on the program, user-marked parts of the code section are scanned for jump instructions (JZ, JNZ, JBE and so on), and a database is created containing the address, type and offset of each. These jump instructions are patched over with ‘INT 3’s (user-mode breakpoint interrupt) and the database is put in the hands of the debugger. The idea is that the child process will raise a debug-break exception whenever one of these instructions fires, whence the parent steps in, grabs the thread context, looks up the appropriate jump in the database and sets the child process on its merry way.

This works very well. If the Nanomite-enabled code regions are chosen carefully then performance is virtually unaffected, and any attempt to sever the child-parent bond results in an immediate and unrecoverable crash. Even worse for the would-be cracker, the information needed to recover the code to a working state is locked up in this database, which is encrypted several times over and accessed only by heavily obfuscated, anti-debug-ridden routines. Reverse-engineering this would be a royal pain.

Getting the table

Many successful efforts had been made to reverse this encryption process and produce a working Nanomite table, but with each offence from the crackers came a counter-offence from the developers and pretty soon there were several variants of the Nanomite system floating around. It was time for a unified approach. Being lazy as I am, I insisted on making the computer do as much of the work as possible. So the plan was this:

Write a program to debug the parent process. That is, debug the debugger. With this level of control, it would be reasonably easy to fool the parent into processing Nanomites at our will. Three function hooks need to be created in the parent process:

  1. WaitForDebugEvent - This is the primary source of information for any debugger. With a hook in here, we could forge any conceivable exception and let the parent attempt to handle it.
  2. GetThreadContext - When alerted of an INT 3 exception, the parent calls this to find out where the Nanomite was struck. Another hook and we can feign a Nanomite hit at an arbitrary address.
  3. SetThreadContext - After ploughing through that obfuscated code, the parent will have decided where execution should continue from, and enforces its will by setting the thread context. This last inside-element will help us determine the details of any given Nanomite.

From here the algorithm writes itself. We find all instances of the byte 0xCC (INT 3) in the code section, spoof an INT 3 exception at each of these points and watch how the parent responds. By setting the EFlags register to take different values for the same Nanomite address, we can determine under which circumstances the jump occurs and hence exactly which conditional jump is being emulated. A few switch-statements later and we have a complete Nanomite table, without having to step through a single instruction of Armadillo’s code.
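To make the EFlags trick concrete, here is a rough sketch of the two halves: collecting every 0xCC byte as a candidate, and naming the jump from how the parent reacts to different flag values. The parent is stood in for by a callback (`jump_taken`, a hypothetical name) that reports whether execution was sent to the jump target for a given EFlags value. In the real tool that answer comes out of the GetThreadContext and SetThreadContext hooks, which aren't modelled here. All other names are mine too.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// x86 EFlags bits consulted by the conditional jumps.
constexpr uint32_t kCarry = 0x001, kParity = 0x004, kZero = 0x040,
                   kSign = 0x080, kOverflow = 0x800;

inline bool on(uint32_t f, uint32_t bit) { return (f & bit) != 0; }

struct Condition {
    const char* mnemonic;
    bool (*taken)(uint32_t eflags);
};

// Truth functions for JMP and the sixteen flag-based conditional jumps.
const Condition kConditions[] = {
    {"JMP", [](uint32_t)   { return true; }},
    {"JO",  [](uint32_t f) { return on(f, kOverflow); }},
    {"JNO", [](uint32_t f) { return !on(f, kOverflow); }},
    {"JB",  [](uint32_t f) { return on(f, kCarry); }},
    {"JAE", [](uint32_t f) { return !on(f, kCarry); }},
    {"JZ",  [](uint32_t f) { return on(f, kZero); }},
    {"JNZ", [](uint32_t f) { return !on(f, kZero); }},
    {"JBE", [](uint32_t f) { return on(f, kCarry) || on(f, kZero); }},
    {"JA",  [](uint32_t f) { return !on(f, kCarry) && !on(f, kZero); }},
    {"JS",  [](uint32_t f) { return on(f, kSign); }},
    {"JNS", [](uint32_t f) { return !on(f, kSign); }},
    {"JP",  [](uint32_t f) { return on(f, kParity); }},
    {"JNP", [](uint32_t f) { return !on(f, kParity); }},
    {"JL",  [](uint32_t f) { return on(f, kSign) != on(f, kOverflow); }},
    {"JGE", [](uint32_t f) { return on(f, kSign) == on(f, kOverflow); }},
    {"JLE", [](uint32_t f) { return on(f, kZero) || (on(f, kSign) != on(f, kOverflow)); }},
    {"JG",  [](uint32_t f) { return !on(f, kZero) && (on(f, kSign) == on(f, kOverflow)); }},
};

// Offsets of every 0xCC byte: each one is a candidate Nanomite.
std::vector<size_t> find_int3_candidates(const std::vector<uint8_t>& code) {
    std::vector<size_t> hits;
    for (size_t i = 0; i < code.size(); ++i)
        if (code[i] == 0xCC) hits.push_back(i);
    return hits;
}

// Try all 32 combinations of the five flags and return the mnemonic whose
// truth table matches the observed behaviour, or "" if nothing matches.
std::string classify_jump(const std::function<bool(uint32_t)>& jump_taken) {
    const uint32_t bits[] = {kCarry, kParity, kZero, kSign, kOverflow};
    std::vector<uint32_t> flag_sets;
    std::vector<bool> observed;
    for (unsigned mask = 0; mask < 32; ++mask) {
        uint32_t f = 0;
        for (int i = 0; i < 5; ++i)
            if (mask & (1u << i)) f |= bits[i];
        flag_sets.push_back(f);
        observed.push_back(jump_taken(f));
    }
    for (const Condition& c : kConditions) {
        bool match = true;
        for (size_t i = 0; i < flag_sets.size() && match; ++i)
            match = (c.taken(flag_sets[i]) == observed[i]);
        if (match) return c.mnemonic;
    }
    return "";
}
```

Brute-forcing all 32 flag combinations is lazier than strictly necessary, but it keeps the table of mnemonics as the only place where x86 knowledge lives. Jumps that depend on ECX rather than the flags would come back as "" here.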

The Real Problem

After all that work, we can just assemble all the jumps from the database into place and dump the process. That’ll be sure to remove all the Nanomites, right? Well, yes, but it turns out that something far nastier happens in the process. See, when Armadillo creates the table in the first place, it doesn’t just store the addresses of the jumps but also creates some false entries at addresses that happen to legitimately contain a 0xCC byte. This means that a completely unrelated ‘CALL DWORD PTR DS:[0043CC7A]’, for instance, will produce a false entry in the table. This entry will never be needed, as the 0xCC is in the middle of an instruction and can’t trigger an exception under normal circumstances, but those clever developers have put us in a real dilly of a pickle.

There is simply no sure-fire way to weed out the ‘false Nanomites’ from the real ones. Without defeating the object of our endeavour and writing a purpose-built debugger to do exactly what we didn’t want the parent process doing, how can we fix this?

The Solution

It took a little bit of brainstorming, but this is where vectored exception-handling comes to the rescue. This little-used feature of the Win32 API allows for installation of a process-wide exception-handler that doesn’t depend on stack-frames. Such handlers are of limited use in the real world, but just perfect for our needs for the sole reason that the VEH chain is triggered before the SEH chain.

Suppose that we’ve managed to dump and patch the program (and fixed the imports, encrypted pages, code-splicing) so that it runs without the parent. Suppose further that the original program didn’t use any VEH. Then everything works great until a Nanomite triggers: a debug-break fires, promptly falls through all the structured exception-handlers and the process crashes and burns. But if we had a VEH installed, we’d be given a chance to deal with it.

So by adding a new section to the exe containing the Nanomite table along with some code, we can save the day:

  • Redirect the entry-point to our code, which installs the VEH and jumps straight to the original entry-point.
  • Have the VEH handle only INT 3 exceptions, searching the database and patching in the appropriate jump instruction when necessary.

That nearly takes care of everything. The only remaining problem is for programs that use VEHs of their own. It’s unlikely that anybody would implement their own exception handler to deal with breakpoints, but conceivable for a catch-all scenario to ruin our best-laid plans. So the last piece of the puzzle is to hook RtlAddVectoredExceptionHandler, telling it to remove our handler before installing the client’s, then replace it afterwards. In this way, the Nanomite-handler is guaranteed to be the first exception-handler on the scene (be it structured or vectored), and existing functionality is unaffected.
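The core of such a handler can be modelled without touching the Win32 API at all. The sketch below assumes a table keyed by address whose entries carry a condition code, an absolute target and the length of the original instruction. That layout is my guess for illustration, not Armadillo's actual format, and the names are hypothetical. A real handler would take the fault address from the exception record and return EXCEPTION_CONTINUE_EXECUTION or EXCEPTION_CONTINUE_SEARCH. Here an enum stands in for those, and a byte vector stands in for the code section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

struct Nanomite {
    uint8_t condition;  // x86 condition code, 0x0-0xF (0x4 = JZ, 0x5 = JNZ, ...)
    uint32_t target;    // absolute address the jump lands on when taken
    uint8_t length;     // size of the original instruction: 2 (short) or 6 (near)
};

enum class Disposition { ContinueExecution, ContinueSearch };

// On a breakpoint at fault_address, write the real jump back over the INT 3
// so that the instruction can simply be re-executed.
Disposition handle_breakpoint(std::vector<uint8_t>& image, uint32_t image_base,
                              uint32_t fault_address,
                              const std::map<uint32_t, Nanomite>& table) {
    auto it = table.find(fault_address);
    if (it == table.end() || fault_address < image_base)
        return Disposition::ContinueSearch;  // not one of ours

    const Nanomite& n = it->second;
    const size_t offset = fault_address - image_base;
    if (offset > image.size() || n.length > image.size() - offset)
        return Disposition::ContinueSearch;  // would write outside the image

    // Displacements are relative to the end of the jump instruction.
    const uint32_t rel = n.target - (fault_address + n.length);
    const uint8_t cc = n.condition & 0x0F;

    if (n.length == 2) {
        const int32_t signed_rel = static_cast<int32_t>(rel);
        if (signed_rel < -128 || signed_rel > 127)
            return Disposition::ContinueSearch;  // doesn't fit in rel8
        image[offset] = static_cast<uint8_t>(0x70 | cc);  // Jcc rel8
        image[offset + 1] = static_cast<uint8_t>(rel);
    } else if (n.length == 6) {
        image[offset] = 0x0F;                             // Jcc rel32
        image[offset + 1] = static_cast<uint8_t>(0x80 | cc);
        for (int i = 0; i < 4; ++i)
            image[offset + 2 + i] = static_cast<uint8_t>(rel >> (8 * i));
    } else {
        return Disposition::ContinueSearch;  // unknown encoding
    }
    return Disposition::ContinueExecution;
}
```

Because entries are only patched when a breakpoint actually fires at their address, the false Nanomites sitting in the middle of other instructions never get written back, which is the whole point of doing this lazily.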

How I cracked the iTunes 7 DRM, Pt V

December 9th, 2007 Greg

The story so far: Part 1, Part 2, Part 3, Part 4.
The remainder of this project consisted of developing the interface and injection DLL in parallel. This all went fairly smoothly, so I’ll present a summary of the workings.

  • Two programs are involved:
    • DLLBugger.dll - a C++ toolkit DLL designed for injection into iTunes. It sniffs out DRM keys as they are passed to the MP4-playing subroutine, exposes a variety of methods for inter-process communication, and invokes iTunes’s decrypter function when ordered to do so.
      The peculiar name is something of a relic from the DLL’s twin program, who unfortunately didn’t make it this far.
    • DisaRM.exe - a C# GUI responsible for locating the iTunes process (launching it, if necessary), injecting DLLBugger, parsing the database, asking the user which tracks to unlock, and overseeing the decryption process as performed by DLLBugger within the iTunes address-space.
  • When launched, DisaRM immediately loads DLLBugger into its own address-space. Next, it launches iTunes.exe and acquires a handle to the process. From here DLLBugger is injected into iTunes. Having the DLL present in both processes makes for an ‘easy’ way to communicate data back-and-forth (using a shared PE segment). As it turned out, there was no need to use the Win32 debugging API and so DRMBugger.exe outlived its usefulness.
  • Because I had never written anything involving inter-process communication before, I was quite unprepared for the volume of work required to make this shared-segment approach successful. It wasn’t until I had familiarised myself with semaphores, and planned out how everything could be made to work with exchange limited to POD, that a rudimentary communication state-machine was implemented.
  • DLLBugger exports twelve functions:
    bool CreateHooks(void* decrypt_call, void* decrypt_func, void* cfw_call);
     
    DWORD InjectMain(void *lpParam);
     
    void* GetRemoteProcAddress(LPCSTR lpModuleName, LPCSTR lpProcName);
     
    bool KillRemoteThread(HANDLE hRemoteThread);
     
    bool RemoteDecrypt(wchar_t* in_name, wchar_t* out_name, RijndaelKey key);
     
    RijndaelKey GetLastKey();
     
    void SetStoredKey(RijndaelKey key);
     
    void RemoveHooks();
     
    long GetLastTrackFirstLength();
     
    char* GetLastTrackFirstData(long* buffer_size);
     
    WCHAR* GetLastAudioFileName();
     
    bool PollNewFile();
  • Passing hard-coded addresses (I know, yuck), DisaRM invokes CreateHooks in the iTunes process. This installs hooks in Kernel32!CreateFileW, iTunes!Decrypt and iTunes!_PlayMP4+_CallToDecrypter (the point at which the previous function is called). Now any attempts to load a track or decrypt a chunk of AAC will be intercepted.
  • After DisaRM has loaded and displayed the protected subset of the iTunes library, the user chooses which tracks to unlock and hits the ‘Get Keys’ button. This triggers DisaRM to launch the first track into iTunes, causing a call to CreateFileW to be intercepted by the DLL. The arguments are stored and execution is allowed to continue. With this, DLLBugger has a good idea which track will be playing at any given time. Once iTunes has loaded the protected MP4 file, determined its decryption key and done whatever else it does, it necessarily makes a call to the Decrypt function. Naturally, this too is intercepted by our DLL and we begin to generate a mapping of file-names to DRM keys. Sanity checks exist in the form of GetLastTrackFirstLength, GetLastTrackFirstData, GetLastKey, and GetLastAudioFileName. Once the confidence level is high enough (as all this business is done asynchronously by iTunes and it isn’t safe to assume too much about the order of events) DLLBugger reports back to DisaRM, and the next track is launched. In this way, DisaRM learns the keys for each file it needs to decrypt.
  • Provided everything went smoothly, DisaRM displays the keys alongside the track name, artist, album and such (this was initially useful for debugging purposes, but I left it in because it looks kinda cool). The user gets a chance to reconsider before hitting the ‘Remove DRM’ button. Because what DRM-removal tool would be complete without one?
  • The decryption process itself is relatively straightforward. A single call to RemoteDecrypt from DisaRM creates a new thread in iTunes, which opens up the MP4 file and parses the data to find the stbl atom. This part of the file lists the offsets and sizes of each chunk of AAC data (’cause they come in chunks, you know) among other things. For each chunk, the thread calls the Decrypt function, passing the appropriate offset, size and key. With the stream decrypted, some offensive atoms are removed and the file is made to look like it never had any DRM in the first place. DLLBugger saves the result to disk and that’s that.
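For anyone who hasn't poked around inside an MP4 file, finding the stbl atom is just a matter of walking nested size-and-type headers. The following is a sketch of that walk and not the code from DLLBugger. It assumes the usual layout of a 4-byte big-endian size followed by a 4-character type, with the stbl atom nested under moov, trak, mdia and minf. The names `AtomRange`, `find_child` and `find_path` are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct AtomRange {
    size_t payload_offset;  // first byte after the atom header
    size_t payload_size;    // number of bytes in the atom body
};

static uint32_t read_be32(const std::vector<uint8_t>& d, size_t pos) {
    return (uint32_t(d[pos]) << 24) | (uint32_t(d[pos + 1]) << 16) |
           (uint32_t(d[pos + 2]) << 8) | uint32_t(d[pos + 3]);
}

// Scan the sibling atoms in [begin, end) for the first one of the given type.
bool find_child(const std::vector<uint8_t>& d, size_t begin, size_t end,
                const std::string& type, AtomRange& out) {
    size_t pos = begin;
    while (pos + 8 <= end) {
        uint64_t size = read_be32(d, pos);
        const std::string name(d.begin() + pos + 4, d.begin() + pos + 8);
        size_t header = 8;
        if (size == 1) {  // a 64-bit size follows the type field
            if (pos + 16 > end) return false;
            size = (uint64_t(read_be32(d, pos + 8)) << 32) | read_be32(d, pos + 12);
            header = 16;
        } else if (size == 0) {  // atom runs to the end of its container
            size = end - pos;
        }
        if (size < header || size > end - pos) return false;  // malformed
        if (name == type) {
            out.payload_offset = pos + header;
            out.payload_size = static_cast<size_t>(size - header);
            return true;
        }
        pos += static_cast<size_t>(size);
    }
    return false;
}

// Descend through nested container atoms, e.g. moov/trak/mdia/minf/stbl.
bool find_path(const std::vector<uint8_t>& d,
               const std::vector<std::string>& path, AtomRange& out) {
    if (path.empty()) return false;
    size_t begin = 0, end = d.size();
    for (const std::string& type : path) {
        if (!find_child(d, begin, end, type, out)) return false;
        begin = out.payload_offset;
        end = begin + out.payload_size;
    }
    return true;
}
```

Calling `find_path(file, {"moov", "trak", "mdia", "minf", "stbl"}, range)` leaves the body of the sample table in `range`, ready for the chunk offsets and sizes to be read out. This only follows the first trak it meets, which is an assumption that holds for a plain single-track audio file.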

I’ll take this opportunity to apologise to any Mac users who were hoping to learn something about the iTunes DRM from this series. Clearly, I didn’t reverse-engineer the protocol to any substantial degree and so none of the methods described port very far away from Windows XP. Maybe another time.

I learned a few things over the four weeks I spent on this. Here are some of them:

  • Writing an inter-process communication framework is not a task to be taken lightly, no matter how little of it you think you need.
  • C# is excellent for GUI development and awful at low-level hackery. But when you have a shiny new hammer, everything starts to look like a nail.
  • Over-engineering a solution is as bad as under-engineering it. I’m sure I could have saved myself a fortnight if I hadn’t bothered writing that debugger I didn’t need.
  • A profiler can be an excellent RCE tool. If I’d only thought to profile a few seconds of each of m4p and m4a playback, I could have isolated the decrypter function in minutes, rather than days.
  • QuickTime is horrible.

So that marks the end of this series of posts. I can assure you, though, that I haven’t nearly reached the end of the story.

Bypassing IsDebuggerPresent

December 5th, 2007 Greg

The Win32 API function IsDebuggerPresent is commonly used in rudimentary anti-hack techniques. It’s generally safe to conclude, if somebody is debugging your program, that there’s some foul play going on. Now, once you’ve convinced yourself that this really doesn’t matter, allow me to explain the guts of this Kernel32 function. Here’s a disassembly:

7C813093    MOV EAX, DWORD PTR FS:[18]
7C813099    MOV EAX, DWORD PTR DS:[EAX+30]
7C81309C    MOVZX EAX, BYTE PTR DS:[EAX+2]
7C8130A0    RETN

That’s really all there is to it. The first line gets a pointer to the thread environment block (often abbreviated to TEB). This is a lump of system-maintained memory that keeps track of per-thread data. At offset 0x30 into the TEB is a pointer to the process environment block, or PEB. The second line of the disassembly loads this address into the EAX register. Last of all, it reads and returns the third byte of the PEB (the ‘BeingDebugged’ member) as a boolean value.

This code is very simple to implement manually, and doing just that is a quick and easy way to thwart the attempts of those out-of-the-loop crackers who attempt to patch the IsDebuggerPresent function itself. But equivalently, the disassembly betrays a way to render IsDebuggerPresent truly useless:

void HideIsDebuggerPresent(bool hide) {
    unsigned char being_debugged = (hide ? 0 : 1);
    __asm {
        MOV EAX, FS:[0x18]
        MOV EAX, [EAX + 0x30]
        MOV CL, being_debugged
        MOV BYTE PTR [EAX + 2], CL
    }
}

So without the need for any messy code patches, we can hide the presence of a debugger from IsDebuggerPresent - or anything that reads from PEB->BeingDebugged - by executing this function in the process’s address-space. Now, that won’t fool any programs clever enough to read in the value of _EPROCESS->DebugPort (which can’t be overwritten from ring3) or that use CheckRemoteDebuggerPresent (which requires XP SP1 or later), but it’s nice to know.

How I cracked the iTunes 7 DRM, Pt IV

December 1st, 2007 Greg

Success was close enough to smell, but not to taste. Succeeding in a debugger with all your (razor-sharp) wits about you, and teaching a computer how to do the same are two very different things. DRMBugger and DLLBugger were still in a state of throwaway code and the project had almost nothing in the way of an interface.

This is where Visual C# came into play. While I’m no expert (and I certainly wasn’t at the time), anybody with conversational C++ can quite quickly pick it up and produce a convincing GUI in no time. But with any new toy comes the compulsion to wear it out, and I soon found myself wasting a week getting the DisaRM GUI perfect. I really have nobody to blame but myself, but the friend who suggested I mimic the iTunes GUI (mentioning no names, Dave) helped to send the project rolling in entirely the wrong direction. Unfortunately, I wasn’t quite prepared for the OOP-mania that is C# and so the controls I created are a little too interdependent for me to release their source code, and that’s a shame, because my iTunesListBox, iTunesScrollBar and iTunesProgressBar classes are true works of art.

With that distraction out of the way, I got to porting the DLL injection code from C++ to C#. If you’ve ever used the Win32 API from C#, you’ll know how much of a pain it is to translate all those function prototypes (somewhat reminiscent of VB 6) and you’ll have some sympathy for me having to do thirty of the bastards. If I had thought things through beforehand then I’d have left this close-to-the-metal business in a C++ DLL where it belongs, but we live and learn.

Getting the iTunes library track-listing and extracting the DRMed tracks was a lesson in elementary XML-parsing (take a look in %MyMusic%/iTunes/iTunes Music Library.xml if you don’t believe me). The next step is to extract the DRM keys.

If I’d taken more time to debug I’m sure a cleaner way to do this would have presented itself, but I settled for launching each of the encrypted files into the Windows shell (so that iTunes begins to play it) and extracting each key via a hook installed in the iTunes process’s decrypter function. The keys are piped back to DisaRM and everybody’s happy (with the possible exception of the user, who has just heard the first second of each protected track in their library while seeing the iTunes window frantically pop in and out of focus). It won’t mean too much without context, but here’s the rather confusing C++ source for the hook, from DLLBugger. The unnecessarily complex conditional statement at the start checks the stream contents to make sure that it is indeed drawing the key from the right track.

long __cdecl HookDecrypt(char* buffer, long length, RijndaelKey** key) {
    if (new_file) {
        // Track Changed
        long copy_length = min(length, LENGTH_AAC_DATA_TO_SAMPLE);
        std::fill(port_last_track_first_data, port_last_track_first_data + LENGTH_AAC_DATA_TO_SAMPLE, 0);
        std::copy(buffer, buffer + copy_length, port_last_track_first_data);
        port_last_track_first_decrypt_length = length;
        new_file = false;
        port_new_file = true;
    }
    port_last_key = **key;
    return (hook_decrypt_func(buffer, length, key));
}

With this done, all the ingredients are present. DisaRM knows the locations of the protected files and the keys needed to decrypt them. Next is to manipulate iTunes into doing the dirty on its own DRM.

Run-time determination of VC++ 2005 virtual member function addresses

November 29th, 2007 Greg

I was recently somewhat surprised to find that there is really no C++ way to resolve a virtual function to its address at run-time. Admittedly, there is no good reason why anybody would morally need to do this, but when you’ve already lowered yourself to patching another process’s own code without consent, it seems like a very small crime.

Pioneers of such hackery have already established concrete methods for calling virtual functions from inline assembly, but these methods don’t quite stretch to getting the address in pointer form. So, if for no reason other than to convince you that it’s a lot of hassle, I present a miserable bit-chop hack to do just this.


Drawing on another Direct3D program’s viewport

November 27th, 2007 Greg

Update: See the post for the new version.

The theme of the moment is DLL hooking, and so I thought I’d present an applied example. I already explained how Fraps works, and since I’ve recently been roped into writing a similar tool for a stranger, I thought I’d share the wealth. There isn’t much new material here, but people like examples with source code, so you can download the DLL source (C++) from the project page.

Bioshock Hook Screenshot

If you don’t know how to inject this DLL into a foreign process, then you’ll need to read my previous post or wait for the injection framework I’m working on. But once it’s injected, call the Initialise method, via CreateRemoteThread or otherwise, to install the hooks. It works with any program that uses IDirect3DDevice9::Present (or IDirect3DSwapChain9::Present) to render, which is probably all of the DirectX 9 games. Similarly, invoke Release to remove the hooks. The source is fairly self-explanatory, with a few exceptions.

  • It’s not safe for 64-bit consumption, though this should be obvious.
  • While there’s no reason it can’t be made to work with Unicode, I’ve written everything in ASCII, for simplicity.
  • By default, the DLL will increase its own reference count to prevent it being unloaded prior to termination of the host process. This is because there is a small risk of the DLL being unloaded by one thread, while a hooked function in another returns to the now dead memory. I figured that it’s better to waste a little bit of everybody’s memory than to crash unnecessarily.
  • The d3d9.dll function addresses (and prologues) are hard-coded, or at least their offsets are. While this may look very unprofessional and rather risky, I can assure you that it’s quite safe. The alternative would be to hack up some virtual-function tables and that’s a whole other story for a whole other post.
  • You may notice that the compiled DLL is dependent upon D3DX. This isn’t necessary for the hook itself, but I used ID3DXFont in my example for demonstrative purposes. The only reason I mention this is that there is no way to guarantee the existence of any D3DX DLLs on a DirectX 9 machine, and distributing them yourself is in violation of the DirectX Runtime EULA. So if you happen to need to distribute this code, you’ll either need to carry the huge runtime installer around, or avoid using D3DX altogether.

Update:

  • The soft-hooks used here will cause problems with PunkBuster if applied to any of its monitored functions. If you need to do this then you’ll have to be a bit cleverer.
  • The source assumes that the graphics device will never become invalid. If you suspect that this isn’t the case (which will be true for any full-screen game at a minimum) then you’ll need to add the appropriate sanity checks (see IDirect3DDevice9::TestCooperativeLevel) before attempting to render anything, unless you want to crash and burn.

RCE essentials: PEiD

November 24th, 2007 Greg

When I mention my reverse-engineering feats or failures to technically-minded friends, I tend to get one of a few responses. Not uncommon is ‘I wouldn’t know where to start.’ Well, I know it’s just a figure of speech, but I always start in the same place: PEiD.

PEiD

Many programs are built with third-party post-applied protection schemes, or are compressed with a packer to reduce the file size. The basic workings are the same - you run what you think is the program, but unknowingly execute the unpacker’s code, which decompresses or decrypts the original exe in memory and executes that once it’s done. The fact that most people are completely unaware of this process goes to show that these protectors and packers do at least half of their job well. While some protection schemes are better than others, any such packer will have the effect of turning a trivial hack, crack or patching job into a relative pain in the neck.

Rather distinctly, the odd occasion comes up where you’d like to know which compiler and/or linker was used to produce a binary, as the different options have their own quirks and particulars. Differentiating your Borland C++ Builder 5 from your Microsoft Visual C++ 6 can save you a little time and effort, if you need to fiddle with the ins and outs of stack-frame prologues or function indirection tables, for example.

Any tool that modifies a PE (exe or DLL) has to conform to strict standards, so as to keep the program functional, but will also have the effect of leaving behind a mark. These tell-tale marks are aptly known as PE fingerprints, and PEiD is designed to sniff out these fingerprints and give you the lowdown. So if I decide that I want to tweak the interface of my PostScript viewer, or to investigate how my anti-spyware tool enumerates processes, I only need to drag-drop the respective exe files into PEiD and I immediately know that GhostScript 4.7’s gsview32.exe was built in Microsoft Visual C++ 7.0 and that AdAware SE Personal 6.20 is compressed using ASPack 2.12. This tells me that the former will be very easy to analyse, whereas the latter will put up something of a fight, and that I’d perhaps be better off spending my time on Google.
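At heart, fingerprint matching of this kind is a byte comparison with wildcards. Below is a small sketch of the idea, assuming signatures written as space-separated hex bytes with `??` standing for "any byte". That notation is an assumption for the example and I'm not claiming it is PEiD's internal format. The signature used in the usage note is made up and doesn't identify any real packer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

struct PatternByte {
    uint8_t value;
    bool wildcard;  // true: matches any byte, value is ignored
};

// Parse a signature such as "60 E8 ?? ?? ?? ?? 5D" into pattern bytes.
std::vector<PatternByte> parse_signature(const std::string& text) {
    std::vector<PatternByte> out;
    std::istringstream in(text);
    std::string token;
    while (in >> token) {
        if (token == "??")
            out.push_back({0, true});
        else
            out.push_back({static_cast<uint8_t>(std::stoul(token, nullptr, 16)), false});
    }
    return out;
}

// Does the signature match the bytes starting at the given offset
// (typically the file offset of the entry point)?
bool matches_at(const std::vector<uint8_t>& bytes, size_t offset,
                const std::vector<PatternByte>& sig) {
    if (offset > bytes.size() || sig.size() > bytes.size() - offset)
        return false;  // signature would run off the end
    for (size_t i = 0; i < sig.size(); ++i)
        if (!sig[i].wildcard && bytes[offset + i] != sig[i].value)
            return false;
    return true;
}
```

With `parse_signature("60 E8 ?? ?? ?? ?? 5D")`, any entry point that opens with a PUSHAD, a CALL to anywhere and a POP EBP would match, whatever the call displacement happens to be. A real identifier would hold a whole database of these and report the best hit.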

So PEiD is something of an unsung hero, in that it only ever runs for five seconds at a time, perhaps once a week (at least on my computer), but yet when used properly it can have a profound effect on the development of any RCE project. And it is for this reason that I hereby sing its heroism for all to hear.

Case study: Fraps

November 22nd, 2007 Greg

One of the topics that I often find myself bluffing through on GameDev is Direct3D hooking. In particular, how to display an overlay of your own on the window of another Direct3D program, often a commercial game. It’s pretty clear that the simplest method would involve somehow hooking the call to IDirect3DDevice8/9/10::Present, but the details are a little sketchy, particularly when you throw anti-hack systems into the mix. To be quite honest, I wasn’t sure I’d have been able to write a scalable hook that wouldn’t cause any incompatibilities - at least not without doing some very immoral things. So when I found out that Fraps has been doing exactly this for years, and that it somehow manages to avoid angering PunkBuster and other such systems, I decided to investigate.

What is Fraps?

Simply put, it’s a profiling and video-capture tool for PC games. As well as providing the ability to capture a video stream from any Direct3D 8/9/10, DirectDraw or OpenGL program, it can display real-time performance statistics (frame-rate and such) by means of an overlay on the game’s window (or full-screen display).

Remarkably, Fraps seems to handle the hacky side of all this automatically, with no fuss. It will dig its claws into any compatible game, whether it was started before or after Fraps. Yet if you investigate the state of DirectX and OpenGL’s DLLs on disk (in system32), they remain untouched at all times.

So how does it work?

Well it could be a whole lot worse, but I wasn’t thrilled to find out that every process running on my system had an instance of Fraps.dll loaded. It’s not so much the 106kB footprint that bothers me, but the performance and stability concerns. Anyway, my suspicions that this DLL was ‘infecting’ all processes via a system-wide hook were soon confirmed when OllyDbg caught Fraps.exe making a call to SetWindowsHookEx.

SetWindowsHookEx(WH_CBT, &FrapsProcCBT, hModuleFrapsDLL, 0);

So in one fell swoop, Fraps guarantees that Windows will load a copy of Fraps.dll into every process currently running, as well as those created in future. Moreover, the function FrapsProcCBT will be called by that process each time it attempts to create, destroy, show, hide, move or resize a window. Now this may seem like the perfect solution to a difficult problem, but it’s rather wasteful considering that most processes run window-driven interfaces, and only very few involve DirectX or OpenGL.

How should it work?

Now I haven’t tested this, but a much cleaner way to achieve the same goal would be for Fraps.exe to periodically poll EnumProcesses and EnumProcessModules, so as to determine the processes that actually need hooking. Installing the hooks into these specific processes would require no more work but would save the OS some effort and limit the worst-case-scenario disaster-zone to DirectX and OpenGL applications, which is a considerably smaller domain than almost everything. In Fraps’s defence, the code makes extensive use of IsBadReadPtr and suchlike, and I’ve never heard of it causing any trouble, but nevertheless, the best way to prevent your DLL from crashing someone else’s program is to make sure it never gets loaded.
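The selection step of that polling approach can be sketched without touching the Windows API at all. This is a minimal sketch, assuming the process and module lists have already been gathered (on Windows they would come from EnumProcesses and EnumProcessModules). ProcessInfo, is_graphics_module and processes_to_hook are hypothetical names made up for illustration, and the DLL list is simply the set of modules that Fraps.dll is seen checking for.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical snapshot of one process. On Windows it would be filled in
// from EnumProcesses / EnumProcessModules; here it is plain data.
struct ProcessInfo {
    unsigned long pid;
    std::vector<std::string> modules;  // base names of the loaded modules
};

// Lower-case a copy of the string so that names compare case-insensitively
static std::string lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// True if the module is one of the graphics DLLs worth hooking
bool is_graphics_module(const std::string& name) {
    static const char* const wanted[] = {
        "opengl32.dll", "d3d8.dll", "d3d9.dll", "dxgi.dll", "ddraw.dll"};
    const std::string n = lower(name);
    return std::any_of(std::begin(wanted), std::end(wanted),
                       [&](const char* w) { return n == w; });
}

// Return the IDs of the processes that have loaded a graphics DLL
std::vector<unsigned long> processes_to_hook(const std::vector<ProcessInfo>& snapshot) {
    std::vector<unsigned long> pids;
    for (const ProcessInfo& p : snapshot)
        if (std::any_of(p.modules.begin(), p.modules.end(), is_graphics_module))
            pids.push_back(p.pid);
    return pids;
}
```

Everything not returned by processes_to_hook would simply be left alone, which is the whole point: a process that never loads one of these DLLs never gets a foreign DLL loaded into it.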

What does the hook do?

All events but window activation and focus-acquisition (HCBT_ACTIVATE, HCBT_SETFOCUS) fall through the hook chain (CallNextHookEx). But in either of these cases, Fraps.dll goes on to look for a supported graphical interface.

Rather achronologically, the first thing it does at this point is a bunch of string-processing to capitalise and isolate the executable image’s file name. Presumably, the writers would have used GetProcessImageFileName, but bringing psapi.dll along to the party for this reason alone would be borderline-criminal.
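That string-processing amounts to very little. Here is a rough portable equivalent, assuming the full image path is already in hand as a std::string. The name image_name_upper is invented for this sketch, and it is not a transcription of the Fraps code.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Strip the directory part of a path and upper-case what is left,
// e.g. "C:\Games\Quake3\quake3.exe" becomes "QUAKE3.EXE"
std::string image_name_upper(const std::string& path) {
    // Accept both kinds of slash, since Windows tolerates either
    const std::size_t slash = path.find_last_of("\\/");
    std::string name = (slash == std::string::npos) ? path : path.substr(slash + 1);
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return name;
}
```

Upper-casing the name first means that any later comparison against a list of known executables can be a plain, case-sensitive string compare.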

Next, GetModuleHandleA is called on opengl32.dll, d3d8.dll, d3d9.dll, dxgi.dll and ddraw.dll. If all return NULL, then there is no work to do and the function returns. But if any of these modules are found, Fraps.dll gets straight to installing its function hooks. The hooks are simply JMP operations assembled ad-hoc at the beginning of IDirect3DDevice9::Release and Present (and presumably the equivalent functions belonging to the other APIs). Now, I was rather surprised to find that PunkBuster has no problem with such crude, unsubtle behaviour, but it’s possible that there is some agreement between the two developers.

That’s almost everything, but one problem remains. Nothing will be drawn to the screen unless d3d9!Present is called, and installation of the patch renders the original functions useless. It is for this reason that IAT hooking is preferable to the patch method being used here, but from what I gather PunkBuster periodically ‘fixes’ the IAT of its client process, so that’s a no-go. Fraps gets around this little inconvenience in the messy but reliable way that you’d expect: each time the proxy Present function fires, it removes the patch, calls the original function, and restores the patch.

Here’s some untested C++ concept-code I threw together for the IDirect3DDevice9::Present case (32-bit x86 only). The unbraced snippet installs a patch at address_d3d9_Present to redirect it to PresentHook. Note that Present is a COM method rather than a DLL export, so GetProcAddress won’t find it; the address has to be read from the device’s vtable. I’ve omitted the patch-removal code, along with a whole load of sanity-checking that really shouldn’t be left out in such a risky situation. Don’t use my laziness as an excuse.

// Calculate offset (32-bit x86 only, so a pointer fits in a DWORD)
DWORD from_int = reinterpret_cast <DWORD> (address_d3d9_Present);
DWORD to_int = reinterpret_cast <DWORD> (&PresentHook);

// This version of the JMP instruction takes an address relative
// to the end of the instruction, and it is 5 bytes long
// So the relative offset is 'to - from - 5'
// Don't worry about the unsigned DWORD underflowing
DWORD offset = to_int - from_int - 5;

// Code pages are not normally writable, so unlock the 5 bytes first
unsigned char* ip = reinterpret_cast <unsigned char*> (address_d3d9_Present);
DWORD old_protect;
VirtualProtect(ip, 5, PAGE_EXECUTE_READWRITE, &old_protect);

// Assemble the patch at the beginning of Present
const unsigned char jmp = 0xE9; // The opcode for a 32-bit rel JMP
*ip = jmp;
*(reinterpret_cast <DWORD*> (ip + 1)) = offset;

// Put the old protection back and discard any stale cached instructions
VirtualProtect(ip, 5, old_protect, &old_protect);
FlushInstructionCache(GetCurrentProcess(), ip, 5);

// COM methods are __stdcall, with 'this' passed as the first stack argument
// (it does not arrive in ECX), so the hook takes the device explicitly
HRESULT __stdcall PresentHook(IDirect3DDevice9* device, const RECT* pSourceRect, const RECT* pDestRect, HWND hDestWindowOverride, const RGNDATA* pDirtyRegion) {
    // Do anything that needs to be done before Present gets called

    // Remove the Present hook
    HRESULT return_value = device->Present(pSourceRect, pDestRect, hDestWindowOverride, pDirtyRegion);
    // Reinstall the Present hook

    // Do anything that needs to be done after Present gets called
    return return_value;
}
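
The omitted patch-removal step could look something like the sketch below. It works on a plain byte buffer rather than live code, so it can be tried out safely, and it spells out the 'to - from - 5' arithmetic one byte at a time. The names Bytes5, JmpPatch, make_jmp, install_patch, remove_patch and reinstall_patch are made up for illustration. This is one plausible way to do the bookkeeping, not a reconstruction of what Fraps does, and a real version would still need the page-protection handling and sanity checks.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

using Bytes5 = std::array<unsigned char, 5>;

// Build the 5 bytes of a rel32 JMP located at address 'from' that lands on 'to'
Bytes5 make_jmp(std::uint32_t from, std::uint32_t to) {
    // Unsigned arithmetic wraps modulo 2^32, which is exactly the
    // two's-complement displacement the CPU expects for a backward jump
    const std::uint32_t rel = to - from - 5;
    Bytes5 b{};
    b[0] = 0xE9;  // opcode for JMP rel32
    for (int i = 0; i < 4; ++i)  // x86 is little-endian: low byte first
        b[1 + i] = static_cast<unsigned char>((rel >> (8 * i)) & 0xFF);
    return b;
}

// Everything needed to take the patch out and put it back again
struct JmpPatch {
    Bytes5 original;  // the bytes that the JMP overwrote
    Bytes5 jump;      // the JMP itself
};

// Save the first 5 bytes at 'code', then overwrite them with the JMP.
// 'from' is the address at which 'code' will actually execute.
JmpPatch install_patch(unsigned char* code, std::uint32_t from, std::uint32_t to) {
    JmpPatch p;
    std::memcpy(p.original.data(), code, p.original.size());
    p.jump = make_jmp(from, to);
    std::memcpy(code, p.jump.data(), p.jump.size());
    return p;
}

// Put the original bytes back so that the real function can be called
void remove_patch(unsigned char* code, const JmpPatch& p) {
    std::memcpy(code, p.original.data(), p.original.size());
}

// Re-apply the JMP once the real function has returned
void reinstall_patch(unsigned char* code, const JmpPatch& p) {
    std::memcpy(code, p.jump.data(), p.jump.size());
}
```

In the hook, remove_patch and reinstall_patch would sit either side of the call to the real Present. Bear in mind that the window between the two is not thread-safe: another thread calling Present at that moment would sail straight past the hook.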

How I cracked the iTunes 7 DRM, Pt III

November 20th, 2007 Greg

After last time’s failure, things started to become personal. I started exploring all kinds of new avenues and employing many techniques that aren’t so commonly used. In parallel, I drew up a map of the inner workings of iTunes 7.0.2.16 and began coding up a framework from which to launch a full-scale attack once I knew how.

The program map was an uninteresting flow-chart full of hex strings (mostly addresses of key points) and crossings-out, but the framework was quite a sight to behold. I had no idea that it was such heavy overkill at the time, but so as to keep all bases covered I ended up with two distinct but inseparably linked programs, codenamed DRMBugger and DLLBugger.

DLLBugger is a tool-kit DLL, designed for injection into iTunes so I could execute arbitrary code from within its address-space and hook functions to my heart’s content.

DRMBugger is a purpose-built debugger. When you fire it up, it locates and attaches to any running instance of iTunes as a user-mode debugger, keeping track of all internal goings-on. It wouldn’t take much effort to convert this into a full-blown ring3 debugger, as it has support for module and memory-page enumeration, hardware breakpoints and run-tracing, along with a rather unreliable disassembler.

With the combined power of these two, iTunes was truly at the mercy of my twisted will. If only I knew what I needed to make it do.

My First Stream

I can’t emphasise enough just how much code is executed in the seemingly idle state of iTunes’s AAC playback; for days on end I would see OllyDbg’s familiar disassembly window whenever I closed my eyes. After tracing backwards and forwards, using stochastic pausing to find the bottlenecks (like a primitive manual code-profiler), comparing run-traces and hit-traces, and analysing dead-listings of programs known to use QuickTime’s DRM code, I finally struck gold.

At 0x0062914D, iTunes.exe (7.0.2.16 for Windows XP) calls a function at 0x005C1B20, passing a pointer to a structure containing the address of an encrypted chunk of AAC, along with a pointer to the pseudo-1024-bit decryption key. It was pretty clear that the AAC data was an entire chunk as defined in the stbl atom directory. It was also evident that the key was retrieved from ‘Documents and Settings\All Users\Application Data\Apple Computer\iTunes\SC Info\SC Info.sidb’ using some of the udta atom’s values as an index. I could have investigated this further, but decided that the details were unimportant if DLLBugger could just use this function black-box-style.

So I got straight to writing an inline patch. The idea was simple: each time this ‘decrypt’ function is called, dump the decrypted buffer to a previously VirtualAlloc’d array. And sure enough, after waiting for my protected track to play beginning-to-end, I ended up with an encryption-free copy of the AAC stream. I was confident that this had worked, as the data chunks looked the part, according to my limited knowledge of the AAC format, but it wasn’t so straightforward to verify. Decrypting the stream is the hard part, sure, but this stream is useless without the remainder of the MP4 file to house it. I spent a little while in a hex-editor piecing things back together and wasn’t surprised when iTunes refused to play the new file. As far as iTunes was concerned, the MP4 file (even with its new m4a extension) still had all the descriptors of an encrypted file, and as you’d expect, decrypting the stream twice didn’t produce anything meaningful.

Fortunately, this frustrating period didn’t last too long. I had to remove all evidence of DRM-related atoms before things got up-and-running, but that means only overwriting a contiguous block of the file with zeros. As it happened, I had already written a fairly complete atom parsing engine into DRMBugger, so rather than let this go to waste, I took things a step further and removed and reshuffled some of these atoms so that the resulting file was indistinguishable from a true ‘m4a’. Reassuringly, the file played just fine in WMP, WinAmp and VLC, as well as iTunes, so things were really starting to look up. The only problem was that it had taken me two and a half hours to remove the DRM from this file (a little trivia: it was ‘The Rejection’ by Dangerous Muse). The next thing on the agenda, after a night of celebration, was to automate the process.
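
For anyone unfamiliar with atoms, the container layout is simple enough to walk in a few lines. What follows is a rough sketch, assuming the standard MP4 atom header: a 4-byte big-endian size followed by a 4-byte type, where a size of 1 means that a 64-bit size follows the type and a size of 0 means that the atom runs to the end of its parent. The names Atom, read_be and list_atoms are invented for this example. It only lists atoms, and it is not the parser from DRMBugger.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct Atom {
    std::string type;      // four-character code, e.g. "moov"
    std::uint64_t offset;  // where the atom's header starts in the buffer
    std::uint64_t size;    // total size in bytes, header included
};

// Read a big-endian unsigned integer of 'bytes' bytes starting at 'pos'
static std::uint64_t read_be(const std::vector<unsigned char>& d, std::size_t pos, int bytes) {
    std::uint64_t v = 0;
    for (int i = 0; i < bytes; ++i) v = (v << 8) | d[pos + i];
    return v;
}

// List the atoms laid end-to-end in data[begin, end).
// Parsing stops quietly at the first header that does not make sense.
std::vector<Atom> list_atoms(const std::vector<unsigned char>& data,
                             std::size_t begin, std::size_t end) {
    std::vector<Atom> atoms;
    if (end > data.size()) end = data.size();
    std::size_t pos = begin;
    while (pos < end && end - pos >= 8) {
        std::uint64_t size = read_be(data, pos, 4);
        const std::string type(reinterpret_cast<const char*>(&data[pos + 4]), 4);
        std::uint64_t header = 8;
        if (size == 1) {             // 64-bit size stored after the type
            if (end - pos < 16) break;
            size = read_be(data, pos + 8, 8);
            header = 16;
        } else if (size == 0) {      // atom extends to the end of the range
            size = end - pos;
        }
        if (size < header || size > end - pos) break;  // malformed atom
        atoms.push_back({type, pos, size});
        pos += static_cast<std::size_t>(size);
    }
    return atoms;
}
```

Container atoms such as moov simply hold more atoms as their payload, so calling list_atoms again on the range just past a container’s header lists its children.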