Ring3 Circus

Journal of a programmer, diary of a hacker

NAS Troubleshooting via the Back Door

I recently bought a NAS enclosure with 1TB of storage for my humble home LAN. While I eventually managed to identify the manufacturer as a firm called ‘Sumvision’, the packaging didn’t make this obvious and upon opening it up I got that distinct and familiar feeling of ‘you’re on your own from here’. Of course, none of this came as a surprise (given the price I paid) and I was pleasantly surprised to find a fresh set of firmware available on the website (version 2.4.4-Jul 8 2008). What did upset me, though, was the awful performance I experienced when using it over my wireless connection. A short Google later it became apparent that this is a common problem with NAS devices, caused by the typically higher packet-loss experienced on wireless connections. Briefly put, SAMBA (and NFS) transmission units often default to 8192 bytes or more, whereas IPv4 networks without the relatively new jumbo frame support will fragment anything above the 1500-or-so-byte MTU. The suggested fix, for SAMBA devices such as mine, is to tweak this value in the server’s smb.conf:

socket options = TCP_NODELAY IPTOS_LOWDELAY SO_KEEPALIVE SO_RCVBUF=1024 SO_SNDBUF=1024
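To put rough numbers on the fragmentation problem, here is a back-of-the-envelope sketch in C++. It assumes the 8192-byte block travels as a single UDP datagram (as NFS traditionally does; Samba over TCP will behave differently), a 20-byte IPv4 header with no options, and independent packet loss. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Number of IPv4 fragments needed to carry `payload` bytes of transport data
// (transport header included) across a link with the given MTU.
// Assumption: a 20-byte IPv4 header with no options.
int fragment_count(int payload, int mtu) {
    assert(payload > 0 && mtu >= 28);
    const int ip_header = 20;
    // Every fragment except the last must carry a multiple of 8 bytes.
    const int per_fragment = ((mtu - ip_header) / 8) * 8;
    return (payload + per_fragment - 1) / per_fragment;
}

// Probability that a datagram split into `fragments` pieces arrives intact
// when each packet is lost independently with probability `loss`.
double delivery_probability(int fragments, double loss) {
    return std::pow(1.0 - loss, fragments);
}
```

Under these assumptions, fragment_count(8192 + 8, 1500) comes out at 6 (the extra 8 bytes being the UDP header), and with a hypothetical 5% packet loss delivery_probability(6, 0.05) is roughly 0.735, so about one block in four would have to be sent again.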

Alas, I found no obvious way to modify these values in the enclosure’s web interface. Furthermore, extracting the hard-drive and mounting it locally proved that it is used for content storage only - the OS sits on a ROM, away from the user’s grubby fingers. And so I was out of luck and was at the point of giving up, until curiosity got the better of me. Taking a look at the ‘shares’ page of the admin interface it was pretty clear that this thing was running a Unix-based OS (disclosing local paths such as ‘/mnt/C/Media/Music’), and if their implementation was as flaky as their design there may be another way in.

Web interface 'shares' page

So I tried what any hacker would have done faced with this input field, and tried to traverse the device-local path using some ‘../’s.

Path-traversal attack

Now I won’t pretend that it was big, clever or original, but to my surprise it worked. Browsing to ‘\\nas\Root’ from my computer landed me right in the device’s root, with admin read/write privileges. For those wondering, the OS is BusyBox.

The resulting SAMBA root share
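For the curious, this is roughly the check the web interface skipped. It is only a sketch: the helper names are invented, the paths in the usage note are hypothetical, and none of it has anything to do with the firmware’s real code. The idea is to collapse the ‘..’ components first and only then test whether the result still sits under the share’s base directory.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Collapse "." and ".." components of an absolute Unix-style path.
std::string normalise(const std::string& path) {
    assert(!path.empty() && path[0] == '/');
    std::vector<std::string> parts;
    std::istringstream stream(path);
    std::string part;
    while (std::getline(stream, part, '/')) {
        if (part.empty() || part == ".") continue;
        if (part == "..") {
            if (!parts.empty()) parts.pop_back();  // ".." at the root stays at the root
        } else {
            parts.push_back(part);
        }
    }
    std::string result;
    for (const std::string& p : parts) result += "/" + p;
    return result.empty() ? "/" : result;
}

// True if `path`, once normalised, is `base` itself or lies beneath it.
bool stays_inside(const std::string& base, const std::string& path) {
    const std::string b = normalise(base);
    const std::string p = normalise(path);
    if (b == "/") return true;
    return p == b || p.compare(0, b.size() + 1, b + "/") == 0;
}
```

With a base of /mnt/C, a share path such as /mnt/C/Media/Music passes the test, whereas /mnt/C/../.. normalises to / and would be rejected.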

And the rest is history. A couple of words of warning to anybody planning on doing the same:

  • In order to access this root share, the device obviously needs to boot and mount successfully. As such I wouldn’t be the slightest bit surprised if any undue alterations to the OS or its configuration result in the device being rendered permanently unusable. I never did brick mine and so haven’t confirmed this, but be very careful what you touch.

  • /config/smb.conf is presumably kept open at all times, and so you’ll likely have trouble saving to it. I found that saving as a different file and then renaming them around works fine. Be aware that there is very little excess space to play with (I suspect it’s mounted as a ramdrive).

  • The smb.conf, at least, seems quite volatile, in that any changes via the web interface result in the [global] parameters being reset to their defaults. I have observed the alterations to persist across reboots but couldn’t offer a comprehensive list of actions that result in it being wiped clean, so don’t be surprised if you find that your fixes disappear when you’re not looking.

Strong-Name Signing, AdmiralDebilitate v0.1

Update: It has been pointed out to me that strong naming serves a somewhat more noble purpose than to act as a simple anti-patch mechanism (you can read about it here) and so the tone of this post is perhaps a little inappropriate. However, considering that it is so widely used for exactly this purpose, the tool’s usefulness stands.

It has taken a little while, but the application of asymmetric cryptography to binary signing is becoming popular. This is reflected in the intrinsic digital-signing mechanism offered by the .NET framework. If you aren’t familiar with strong name signing, the idea is fairly simple: The creator of a .NET assembly has their own RSA key pair, which they use to encrypt a combination of hashes of the file. The resulting strong-name signature is included in the binary along with the public key and the whole lot is distributed as normal. Now, whenever a user tries to load this assembly, the public key and signature are used to verify the integrity of the file, thus thwarting the attempts of any bad guys to patch the assembly. And of course, because the private key is never published, it is virtually impossible for the file to be re-signed with the same key after its release.

This method of digital signing is remarkably effective at preventing patching and, without the use of a hypercomputer, the chances of anyone cracking the key are as good as zero. Also of note is that the code responsible for verifying the signature is part of the system’s .NET assembly loader, and therein lies both a strength and the weakness of the technique. While this makes it easy for developers to sign their assemblies (very easy indeed using Visual Studio), it also unifies the implementation, allowing for automated removal of such signatures. The weak point in the protection is obviously the signature-checking algorithm, but it lies in system code and so for many reverse-engineers, bypassing it is undesirable or impossible. So we target the next best thing - the metadata in the assembly itself. If the assembly looks like it isn’t signed then that is exactly how it will be treated.

The exact structure of the relevant metadata is described in the .NET specification, and the steps necessary to remove the signing have been concisely detailed here. Even more generously, Andrea provides us in that article with a tool to remove the signatures from a single file. Similar programs can be found elsewhere on the web but nowhere did I see a tool that’s capable of handling complex projects with nested binary dependencies. The problem here is that the public key is stored both in a binary itself and each assembly that references it, so if you’re patching a DLL deep in a dependency structure, it is nontrivial to isolate and execute all the necessary patches. That’s why I created AdmiralDebilitate.
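To give a flavour of the kind of patch involved, here is a minimal sketch of the header half of the job only. It is not lifted from AdmiralDebilitate or from the article mentioned above. It assumes the standard 72-byte CLR header layout from the .NET specification (flags at offset 16, strong-name signature directory at offset 32) and that you have already located that header in the file; the function name is hypothetical. The public key stored in the metadata tables, and the copies held by referencing assemblies, are deliberately left out.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Make a CLR header look unsigned: clear the "strong name signed" flag (0x8)
// and zero the RVA/size pair of the StrongNameSignature directory.
// `clr_header` must hold the raw bytes of the header, starting at its first byte.
bool strip_strong_name_header(std::vector<std::uint8_t>& clr_header) {
    const std::size_t header_size = 72;       // size of the CLR header
    const std::size_t flags_offset = 16;      // little-endian DWORD
    const std::size_t signature_offset = 32;  // 4-byte RVA followed by 4-byte size
    if (clr_header.size() < header_size) return false;

    // The flag lives in the low byte of the little-endian Flags field.
    clr_header[flags_offset] = static_cast<std::uint8_t>(clr_header[flags_offset] & 0xF7);
    std::fill(clr_header.begin() + signature_offset,
              clr_header.begin() + signature_offset + 8, 0);
    return true;
}
```

On its own this is not enough to get a patched assembly loading again, which is exactly why chasing the references by hand through a large project gets tedious.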

AdmiralDebilitate Screenshot

If you load up any number of PE files, AdmiralDebilitate will enumerate all .NET dependencies and provide you with a tree, much like the Dependency Walker. From here, you can ‘mark’ any and all assemblies that you intend to patch, and all the work of identifying dependants and patching out the headers will be done for you. I provide no further instructions but it should be self-explanatory. As usual, this is a one-man project and the code is correspondingly immature so please report any bugs.

Get the source and binary here.

An Introduction to .NET Reversing

The first time I saw a .NET application, I was scared. I was scared of the unknown and this fear was only heightened after looking closer with OllyDbg, IDA and LordPE. I imagine that every seasoned reverser out there felt the same way. Well if that’s you, and you’re anything like me, then you’ll have avoided the .NET paradigm-shift for as long as possible. But after seeing new .NET articles and tools popping up left, right and centre, I decided it was time to face my demons and get comfortable with the inevitable future. This post is aimed at those familiar with native RCE but not with .NET, and forms a distillation of the essential facts you’ll need to get started.

You probably already know that .NET is a semi-compiled language - that is, a .NET binary (called an ‘assembly’) exists as bytecode that is compiled into native code just before execution. The bytecode is called Common Intermediate Language, or CIL, (formerly known as Microsoft Intermediate Language - MSIL) and it is the product of any C#, VB.NET or J# compilation. These exe and DLL assemblies are wrapped up inside regular PE files having only one import - mscoree.dll - which acts as the assembly’s just-in-time (JIT) compiler and its gateway to the .NET Framework and its platform API, the Common Language Runtime (CLR). The PE header’s data directory contains a .NET directory entry, which points to a new header in the file containing everything the operating system needs to run it.
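If you want to test for that directory entry yourself, a sketch along the following lines should do. It reads the raw file bytes rather than using the Windows headers, so the offsets are spelled out by hand: the .NET entry is index 14 in the data directory, which starts 96 bytes into a PE32 optional header and 112 bytes into a PE32+ one. The function names are my own invention.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a little-endian integer of `size` bytes; returns false if out of range.
static bool read_le(const std::vector<std::uint8_t>& data, std::size_t offset,
                    std::size_t size, std::uint32_t& value) {
    if (offset > data.size() || size > data.size() - offset) return false;
    value = 0;
    for (std::size_t i = 0; i < size; ++i)
        value |= static_cast<std::uint32_t>(data[offset + i]) << (8 * i);
    return true;
}

// True if the PE image has a non-empty .NET (CLR header) data directory entry.
bool has_clr_header(const std::vector<std::uint8_t>& image) {
    std::uint32_t v = 0, pe = 0, magic = 0, count = 0, rva = 0, size = 0;
    if (!read_le(image, 0, 2, v) || v != 0x5A4D) return false;       // "MZ"
    if (!read_le(image, 0x3C, 4, pe)) return false;                  // e_lfanew
    if (!read_le(image, pe, 4, v) || v != 0x00004550) return false;  // "PE\0\0"

    // The optional header follows the 4-byte signature and 20-byte COFF header.
    const std::size_t optional = static_cast<std::size_t>(pe) + 4 + 20;
    if (!read_le(image, optional, 2, magic)) return false;
    std::size_t dirs = 0;  // offset of the data directory inside the optional header
    if (magic == 0x10B) dirs = 96;        // PE32
    else if (magic == 0x20B) dirs = 112;  // PE32+
    else return false;

    // NumberOfRvaAndSizes sits just before the directory; we need entry 14.
    if (!read_le(image, optional + dirs - 4, 4, count) || count <= 14) return false;
    const std::size_t entry = optional + dirs + 14 * 8;
    if (!read_le(image, entry, 4, rva) || !read_le(image, entry + 4, 4, size)) return false;
    return rva != 0 && size != 0;
}
```

Feed it the contents of any exe or DLL and a true result means the file is worth opening in the .NET tools described below rather than in a native disassembler.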

There are a lot of changes from the familiar native situation, some of which work for us and some against. From a hacker’s perspective, there are some pros:

  • CIL is reflective, meaning that the structure of the code can be deterministically inferred from its assembly.

  • For JIT compilation, linking and introspection to work, a considerable amount of type information must remain in the assembly.

  • .NET assemblies are a few levels of abstraction away from the user-mode platform, so there is far less scope for obfuscation, and anti-debug tricks.

  • The CLR provides standardised interfaces for a tremendous amount of functionality for standard operations, and this is typically embraced by .NET programmers. The result is short, concise code that can reveal even very complex algorithms at a glance.

and, of course, some cons:

  • The assembly effectively runs in a virtual machine, meaning the operations-to-clock-cycles ratio is a few orders of magnitude higher. This is of immediate consequence if you are tracing through the program natively (i.e. debugging the JIT compiler).

  • The abstracted nature of the program’s entities means that old friends like Olly’s ‘registers’ and ‘stack’ windows are of very little use. In fact, the applicability of a low-level debugger is altogether questionable.

  • While the relatively small community is making tremendous efforts, the science of .NET RCE is still quite young and so the tools available are correspondingly immature and thin on the ground.

So put OllyDbg, LordPE and your hex-editor aside (you may want to keep IDA within reach) and meet your new arsenal:

  • Lutz Roeder’s .NET Reflector is perhaps the best-known and most advanced .NET reversing tool. It’s also an invaluable asset for every .NET job, giving you in-depth at-a-glance information about any assembly.

  • ILDAsm - This is Microsoft’s CIL disassembler and is far more useful than any native equivalent. You can download it from MSDN if necessary, but it comes as a part of Visual Studio (all versions) and can be found either at ($ProgramFiles)\Microsoft SDKs\Windows\[Version]\bin or ($VisualStudioDir)\SDK\2.0… depending on your version. Be sure to check your non-x86 Program Files directory if you’re running Windows x64.

  • ILAsm - As you guessed, this is the CIL assembler. The great thing about this is that in many circumstances, you can pipe ILDAsm’s output into ILAsm and end up with a fully working assembly.

  • Explorer Suite - A Swiss Army Knife of .NET-aware tools. This will be your new PE editor for all things .NET.

(I recommend you take a look around Dan Pistelli’s site once you’re comfortable with the terrain as he’s produced a lot of marvellous tools that can cut your workload down considerably.)

If you were wondering where the debugger is, then my answer is that you’re probably better off without one, at least for the moment. It would of course be a very handy addition, but I haven’t found any .NET debuggers that work too well. DILE is promising, but it’s very unstable on my 64-bit Windows, and WinDbg puts in a good effort with the SoS extension running, but you won’t find anything that’s a joy to use.

So with all the framework in place it’s remarkably easy to get going. The first step is to take a look at your target assembly in Reflector and get a feel for the program. If it isn’t obfuscated then you’ll probably be surprised how easy this is (in many cases it’s just like having the source code). If it is obfuscated then you may need to work a little harder, or perhaps skip straight ahead to the next step:

ildasm Target.exe /out=Target.il

The resulting file contains a complete CIL listing of the assembly, which you can edit at your leisure. You may want to Google ‘MSIL’ to get an idea of what all those alien-looking opcodes do, but it’s really quite straightforward to do the basics:

  • ‘Push’ instructions start with ld (load), ‘Pop’s with st (store).

  • There is no MOV instruction. In order to copy from one place to another, push then pop.

  • Most manipulation involves arguments and locals, which are referenced by index using a dot, e.g.

      ldarg.1   // Push the second argument
      stloc.0   // Pop it into the first local

  • Constants are pushed using ldc. The size and type of constant must also be specified, e.g.

      ldc.i4 12      // Push a 4-byte integer (of value 12)
      ldc.r4 33.33   // Push a 32-bit float

  • The values are returned by pushing them onto the stack prior to a return (ret)
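If the push/pop model still feels alien, it may help to see it emulated. The following is a toy C++ interpreter for a handful of the opcodes above, restricted to 32-bit integers. It is nothing like how the CLR really executes CIL (which is JIT-compiled), and the Instruction struct and run function are invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct Instruction {
    std::string op;        // "ldarg", "ldloc", "stloc", "ldc.i4", "add" or "ret"
    std::int32_t operand;  // index or constant; ignored by "add" and "ret"
};

// Execute a tiny subset of CIL on an explicit evaluation stack.
std::int32_t run(const std::vector<Instruction>& code,
                 const std::vector<std::int32_t>& args) {
    std::vector<std::int32_t> stack;
    std::vector<std::int32_t> locals(4, 0);
    for (const Instruction& in : code) {
        const std::size_t index = static_cast<std::size_t>(in.operand);
        if (in.op == "ldarg") {
            stack.push_back(args.at(index));        // push an argument
        } else if (in.op == "ldloc") {
            stack.push_back(locals.at(index));      // push a local
        } else if (in.op == "ldc.i4") {
            stack.push_back(in.operand);            // push a constant
        } else if (in.op == "stloc") {
            locals.at(index) = stack.back();        // pop into a local
            stack.pop_back();
        } else if (in.op == "add") {
            const std::int32_t b = stack.back(); stack.pop_back();
            const std::int32_t a = stack.back(); stack.pop_back();
            stack.push_back(a + b);                 // pop two, push the sum
        } else if (in.op == "ret") {
            return stack.back();                    // the return value is on top
        }
    }
    assert(false && "method fell off the end without ret");
    return 0;
}
```

Short-circuiting a function in this model is a two-instruction affair: a body of just ldc.i4 1 followed by ret returns 1 no matter what the original code did.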

Far more complete references exist on the web if you’re willing to search, but this is enough to short-circuit a function or two. Once you’re done playing with the CIL, compile it back up, being sure to specify any resource files that were created by the disassembler:

ilasm [/dll] Target.il [/res=Target.res] /out=TargetNew.dll

And you’re all ready to go. This is sufficient for the most basic cases, but it will only be a matter of time before you encounter an obfuscated assembly (which will cause you problems in Reflector) or one that’s signed using strong-names (which will refuse to run after compilation). I’ll discuss both of these situations and any relevant workarounds another day.

D3DLookingGlass v0.1

Update: Like all the other Direct3D hooks on the site, this doesn’t actually work anymore. Feel free to use it for reference or kindling or whatever, but don’t expect it to have any practical use. As soon as I get ’round to bringing it up to scratch I’ll put out another post. Sorry about that.

The topic of debugging full-screen Direct3D applications came up a little while ago. If you’ve ever tried it on a single-monitor setup (or even multi-monitor if the app wasn’t designed to handle it) then you’ll know how much of a pain it is. Windows just can’t handle focus being stolen from a suspended exclusive-mode program. The solution’s exactly what you’d expect - to intercept the relevant window- and device-creation calls and coax the debuggee into running in a window. This works, but fiddling with the calls manually each time you restart the process quickly gets boring. So here’s my first attempt at a generic solution.

D3DLookingGlass is a DLL which, if injected into a Direct3D process early enough, will make sure that all video devices are created in windowed mode, allowing the hosting process to coexist with a debugger without any bother. If you can inject this DLL into the target process before the first call to CreateWindow, then everything should go smoothly. I think. Any later than this and your mileage may vary.

I’ve also written a ‘loader’ program that installs the DLL as a system-wide CBT hook, so that you don’t need to inject it manually. This kind of worked for my limited set of test-cases, but there seems to be no Windows-hooks method of injecting a DLL globally and beating the call to CreateWindow. Windows installs the DLL containing the hook at the latest possible moment for its function, and I can find no type of hook that needs to be around before a window is created. I’d love for somebody to prove me wrong (or suggest another way to install the DLL system-wide), but by the looks of things, my loader is of limited use.

In particular, I recall a situation where the game (Call Of Duty 4 Demo, I think) creates a non-overlapped window, which works fine for full-screen mode, but causes problems when you try to force the device to bind as windowed. This will still be a problem unless the call to CreateWindow can be intercepted (and a well-formed window induced), which means that D3DLookingGlassLoader will struggle. Confirmation would be nice.

Here’s the source: D3DLookingGlass_Source.zip

Here’s the binary: D3DLookingGlass_Binary.zip

Here’s the small-print:

  • The DLL hooks CreateWindowExW and ShowWindow in its DLLMain. I think this is kosher in terms of loader-lock, but it’s obviously not too cool with regard to system stability. Especially if it’s being installed in every running process. If d3d9.dll isn’t found in the address-space then the hooks fall straight through, so that shouldn’t be too much of a problem. But if it is found then all attempts to create or show (or hide) a window will be overridden - possibly to the demise of the process if it’s doing anything but the basic behaviour. So in all cases, watch out, and make sure you aren’t running anything important in the background (in particular, I’ve noticed that it doesn’t play nice with Firefox).

  • The loader uses a system-wide hook, and you hate system-wide hooks. I trust that anybody who needs this tool has some degree of technical expertise and is aware of the stability concerns inherent in installing somebody else’s barely-tested system-wide hook.

  • This was harder to put together than I anticipated, and that’s probably evident from the slightly shabby code. Again, I intend for this only to be used for debugging purposes, so you’ll have to forgive me for the sub-production-quality code.

  • Despite the focus on Direct3D of this blog, I’m not really a gamer and I don’t actually have any commercial games installed on this machine. So I only got a chance to test this against my own programs. Obviously, there are several ways to skin the metaphorical Direct3D-initialisation cat, so please leave a comment when you find a game that this chokes on.

Run-time Determination of VC++ Virtual Member Function Addresses: Take II

I wrote about this tricky little problem a while ago and wasn’t too happy with the desperate methods that seemed necessary. Since then, I’ve been shown a much cleaner way to do the same thing, by manipulating the vTable manually. It seems that Microsoft haven’t changed their vTable implementation since Visual Studio 6 (at least) and so with a little modification, the following piece of inline-assembly will do the trick: no muss, no fuss.

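__declspec(naked) void* ResolveVirtualFunction(IDirect3DDevice9* pDevice, ...) {
    __asm {
        mov eax, dword ptr ss:[esp+0x08]    // eax = second argument: address of the thunk behind &Class::Method
        add eax, 0x8                        // step to the vTable offset encoded at the end of the thunk
        cmp byte ptr ds:[eax-1], 0xA0       // ModRM byte 0xA0 means the offset is a full 32-bit displacement
        mov eax, dword ptr ds:[eax]         // read four bytes of offset (mov leaves the flags untouched)
        je normal_index
        and eax, 0xFF                       // otherwise only the low byte is the (8-bit) offset
normal_index:
        mov ecx, eax                        // ecx = byte offset into the vTable
        mov eax, dword ptr ss:[esp+0x4]     // eax = pDevice
        mov eax, dword ptr ds:[eax]         // eax = pointer to the object's vTable
        mov eax, dword ptr ds:[eax+ecx]     // eax = address of the virtual function
        retn
    }
}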

// ...

// The function should be invoked like this:
void* address_device_present = ResolveVirtualFunction(device, &IDirect3DDevice9::Present);

Thanks go to Vuurvlieg for this function. The beauty (or horror), here, is the use of a variadic parameter-list to overcome C++’s strong-typing that would otherwise make this operation very difficult. Obviously, this implementation will only work for objects of type IDirect3DDevice9, but the method extends to any other class by simply replacing the class name in the function declaration. Don’t be tempted to generalise this function to IUnknown or some other common base-class, as you’ll quickly run into problems with object-slicing. A final warning to those still using Visual C++ 6 (not that you deserve any help for such a crime): you’ll need to drop the ampersand from the second argument in the function call, as VC++6 handles function pointers slightly differently.

Direct3D 9 Hook v1.1

Update: Since new DLLs were pushed out a while back, this doesn’t work any longer. The function offsets are wrong, and the hook injection method is a little too flaky to be relied upon. Feel free to use this code as a basis, but I’d recommend the use of Microsoft Detours for the hook injection. One day I’ll write a new version along these lines but until then you’re on your own.

By popular demand, I’ve updated the Direct3D 9 Hooking Sample to accommodate Windows Vista. The same binary should work on both Vista and XP. I’ve only tested it on Vista 64-bit, so it’d be nice to know if it works with Vista 32 or not. Other than this, most of the same caveats apply as last time.

Screenshot

The Collaborative RCE Tool Library

I had decided to unofficially shut up shop for what remains of the year, but I just can’t keep quiet about this. For those of you who don’t already know, dELTA over at Woodmann’s RCE forums has created what I’ll describe as the most important RCE development since IDA 4.9. It’s not a tool, but an interactive database of potentially every one you will ever need.

The Collaborative RCE Tool Library

It’s already rather complete, but if you feel anything is missing then it needn’t be for very long. Take a look, download what you need, learn what you can and maybe give something back.

See you in the new year.

A Framework to Take the Tedium Out of Code-injection in C++

Update: I’ve left this up for posterity, but unless you have a good reason not to, you should be using Microsoft Detours for this stuff. It’s just as easy to use and far more mature.

Calculator Hook

I know I’ve been banging on about injection a lot recently, but I figured a good way to pinch off would be to present some code. After searching and failing, I took it upon myself to write a reusable C++ class to do most of the leg-work for Windows XP/2000/Vista32 DLL injection and hooking. The source is available on the project page.

The process of remote function hooking via a DLL is notoriously messy, so I’ve tried to encapsulate as much of the mess as possible into a C++ class. Here’s an example of some client code that injects a DLL into Windows Calculator, then installs two hooks (one by name and another by address):

// Create the injection object
DLLInjection injection("E:/Temp/HookDLL.dll");

// Find Calc.exe by its window
DWORD process_id = injection.GetProcessIDFromWindow("SciCalc", "Calculator");

// Inject the DLL
HMODULE remote_module = injection.InjectDLL(process_id);

// Hook a DLL function (User32!SetWindowTextW)
HDLLHOOK swtw_hook = injection.InstallDLLHook("C:/Windows/System32/User32.dll", "SetWindowTextW", "SetWindowTextHookW");

// Hook a function manually (Calc!0100F3CF)
HDLLHOOK manual_hook = injection.InstallCodeHook(reinterpret_cast <void*> (0x0100F3CF), "SomeOtherHook");

// Remove the hooks
injection.RemoveHook(swtw_hook);
injection.RemoveHook(manual_hook);

Testing has been limited so don’t be surprised to find bugs. If you do find any, please report them via email or comment.

Armadillo, Nanomites and Vectored Exception-handling

Let me tell you about a problem I ran into a couple of years ago, and the solution I ended up with. If you’ve ever heard of ArmInline, then this is the story behind its Nanomites tool.

The Background

If you’re not already aware, Armadillo is a commercial anti-cracking software scheme for Windows: you buy a license, throw your exe (or DLL) at it, and you end up with a new, protected, file. This new program does just what the old one did, but it’s far harder to reverse-engineer. As the attacker, our goal is to remove the protection so that we can have our wicked way with the program inside.

Among other things, Armadillo employs a system known as Debug Blocker. Briefly put, this causes the program to create two instances whenever it is run - we call them the ‘parent’ and ‘child’ processes. The parent acts as a user-mode debugger, nannying the child (which does all the real work) to make sure that no bad guys can get too close. This system was fairly easy to defeat - all you needed to do was detach the parent process’s debugger at an appropriate moment and attach your own.

So to prevent this happening, the developers of Armadillo invented what they call Nanomites. When the protector is installed on the program, user-marked parts of the code section are scanned for jump instructions (JZ, JNZ, JBE and so on), and a database is created containing the address, type and offset of each. These jump instructions are patched over with ‘INT 3’s (user-mode breakpoint interrupt) and the database is put in the hands of the debugger. The idea is that the child process will raise a debug-break exception whenever one of these instructions fires, whence the parent steps in, grabs the thread context, looks up the appropriate jump in the database and sets the child process on its merry way.

This works very well. If the Nanomite-enabled code regions are chosen carefully then performance is virtually unaffected, and any attempts to sever the child-parent bond results in an immediate and unrecoverable crash. Even worse for the would-be cracker, the information needed to recover the code to a working state is locked up in this database, which is encrypted several times over and accessed only by heavily obfuscated, anti-debug-ridden routines. Reverse-engineering this would be a royal pain.

Getting the table

Many successful efforts had been made to reverse this encryption process and produce a working Nanomite table, but with each offence from the crackers came a counter-offence from the developers and pretty soon there were several variants of the Nanomite system floating around. It was time for a unified approach. Being lazy as I am, I insisted on making the computer do as much of the work as possible. So the plan was this:

Write a program to debug the parent process. That is, debug the debugger. With this level of control, it would be reasonably easy to fool the parent into processing Nanomites at our will. Three function hooks need to be created in the parent process:

  1. WaitForDebugEvent - This is the primary source of information for any debugger. With a hook in here, we could forge any conceivable exception and let the parent attempt to handle it.

  2. GetThreadContext - When alerted of an INT 3 exception, the parent calls this to find out where the Nanomite was struck. Another hook and we can feign a Nanomite hit at an arbitrary address.

  3. SetThreadContext - After ploughing through that obfuscated code, the parent will have decided where execution should continue from, and enforces its will by setting the thread context. This last inside-element will help us determine the details of any given Nanomite.

From here the algorithm writes itself. We find all instances of the byte 0xCC (INT 3) in the code section, spoof an INT 3 exception at each of these points and watch how the parent responds. By setting the EFlags register to take different values for the same Nanomite address, we can determine under which circumstances the jump occurs and hence exactly which conditional jump is being emulated. A few switch-statements later and we have a complete Nanomite table, without having to step through a single instruction of Armadillo’s code.
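The ‘few switch-statements’ boil down to something like the sketch below. It is a reconstruction for illustration, not ArmInline’s code: the parent is stood in for by a callback that answers ‘would the jump be taken with these EFlags?’, and classify simply eliminates every jump type whose condition disagrees with any of the 32 combinations of the five relevant flags. All of the names are hypothetical.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// The EFLAGS bits that x86 conditional jumps depend on.
constexpr std::uint32_t CF = 1u << 0, PF = 1u << 2, ZF = 1u << 6,
                        SF = 1u << 7, OF = 1u << 11;

struct JumpType {
    const char* name;
    bool (*taken)(std::uint32_t flags);  // the condition the real instruction tests
};

static bool less(std::uint32_t f) { return ((f & SF) != 0) != ((f & OF) != 0); }

static const std::vector<JumpType> kJumpTypes = {
    {"JMP", [](std::uint32_t)   { return true; }},
    {"JO",  [](std::uint32_t f) { return (f & OF) != 0; }},
    {"JNO", [](std::uint32_t f) { return (f & OF) == 0; }},
    {"JB",  [](std::uint32_t f) { return (f & CF) != 0; }},
    {"JAE", [](std::uint32_t f) { return (f & CF) == 0; }},
    {"JZ",  [](std::uint32_t f) { return (f & ZF) != 0; }},
    {"JNZ", [](std::uint32_t f) { return (f & ZF) == 0; }},
    {"JBE", [](std::uint32_t f) { return (f & (CF | ZF)) != 0; }},
    {"JA",  [](std::uint32_t f) { return (f & (CF | ZF)) == 0; }},
    {"JS",  [](std::uint32_t f) { return (f & SF) != 0; }},
    {"JNS", [](std::uint32_t f) { return (f & SF) == 0; }},
    {"JP",  [](std::uint32_t f) { return (f & PF) != 0; }},
    {"JNP", [](std::uint32_t f) { return (f & PF) == 0; }},
    {"JL",  [](std::uint32_t f) { return less(f); }},
    {"JGE", [](std::uint32_t f) { return !less(f); }},
    {"JLE", [](std::uint32_t f) { return (f & ZF) != 0 || less(f); }},
    {"JG",  [](std::uint32_t f) { return (f & ZF) == 0 && !less(f); }},
};

// Probe the parent with every combination of the five flags and return the
// name of the only jump type consistent with all of its answers.
std::string classify(const std::function<bool(std::uint32_t)>& parent_takes_jump) {
    const std::uint32_t bits[] = {CF, PF, ZF, SF, OF};
    std::vector<const JumpType*> candidates;
    for (const JumpType& type : kJumpTypes) candidates.push_back(&type);
    for (unsigned combo = 0; combo < 32; ++combo) {
        std::uint32_t flags = 0;
        for (unsigned bit = 0; bit < 5; ++bit)
            if (combo & (1u << bit)) flags |= bits[bit];
        const bool taken = parent_takes_jump(flags);
        std::vector<const JumpType*> remaining;
        for (const JumpType* c : candidates)
            if (c->taken(flags) == taken) remaining.push_back(c);
        candidates.swap(remaining);
    }
    return candidates.size() == 1 ? candidates.front()->name : "unknown";
}
```

In the real thing the callback would forge the exception, feed the chosen EFlags through the GetThreadContext hook and compare the EIP seen in the SetThreadContext hook against the fall-through address. Trying all 32 combinations is more than strictly necessary, but it keeps the logic obvious.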

The Real Problem

After all that work, we can just assemble all the jumps from the database into place and dump the process. That’ll be sure to remove all the Nanomites, right? Well, yes, but it turns out that something far nastier happens in the process. See, when Armadillo creates the table in the first place, it doesn’t just store the addresses of the jumps but also creates some false entries at addresses that happen to legitimately contain a 0xCC byte. This means that a completely unrelated ‘CALL DWORD PTR DS:[0043CC7A]’, for instance, will produce a false entry in the table. This entry will never be needed, as the 0xCC is in the middle of an instruction and can’t trigger an exception under normal circumstances, but those clever developers have put us in a real dilly of a pickle.

There is simply no sure-fire way to weed out the ‘false Nanomites’ from the real ones. Without defeating the object of our endeavour and writing a purpose-built debugger to do exactly what we didn’t want the parent process doing, how can we fix this?
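To see why a naive scan can’t tell the two apart, consider this small sketch (the function name is mine). It collects every offset holding the byte 0xCC, which is all the scanning step can do without disassembling the code properly.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Return the offset of every 0xCC byte in a block of code.
// Real Nanomites and 0xCC bytes buried inside other instructions look identical here.
std::vector<std::size_t> find_int3_candidates(const std::vector<std::uint8_t>& code) {
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < code.size(); ++i)
        if (code[i] == 0xCC) offsets.push_back(i);
    return offsets;
}
```

The CALL from the example above encodes as FF 15 7A CC 43 00. Run the scan over those six bytes followed by a genuine INT 3 and it reports offsets 3 and 6, with nothing to say that the first is just part of an address.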

The Solution

It took a little bit of brainstorming, but this is where vectored exception-handling comes to the rescue. This little-used feature of the Win32 API allows for installation of a process-wide exception-handler that doesn’t depend on stack-frames. Such handlers are of limited use in the real world, but just perfect for our needs for the sole reason that the VEH chain is triggered before the SEH chain.

Suppose that we’ve managed to dump and patch the program (and fixed the imports, encrypted pages, code-splicing) so that it runs without the parent. Suppose further that the original program didn’t use any VEH. Then everything works great until a Nanomite triggers: a debug-break fires, promptly falls through all the structured exception-handlers and the process crashes and burns. But if we had a VEH installed, we’d be given a chance to deal with it.

So by adding a new section to the exe containing the Nanomite table along with some code, we can save the day:

  1. Redirect the entry-point to our code, which installs the VEH and jumps straight to the original entry-point.

  2. Have the VEH handle only INT 3 exceptions, searching the database and patching in the appropriate jump instruction when necessary.

That nearly takes care of everything. The only remaining problem is for programs that use VEHs of their own. It’s unlikely that anybody would implement their own exception handler to deal with breakpoints, but conceivable for a catch-all scenario to ruin our best-laid plans. So the last piece of the puzzle is to hook RtlAddVectoredExceptionHandler, telling it to remove our handler before installing the client’s, then replace it afterwards. In this way, the Nanomite-handler is guaranteed to be the first exception-handler on the scene (be it structured or vectored), and existing functionality is unaffected.
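The re-ordering trick is easier to see in a toy model than in the real hook. The sketch below is plain C++ with no Win32 calls at all: VectoredChain stands in for the system’s list of vectored handlers (a non-zero ‘first’ argument puts a handler at the front, zero at the back) and hooked_add mimics what the hook on RtlAddVectoredExceptionHandler does. Both names are hypothetical.

```cpp
#include <algorithm>
#include <deque>
#include <string>
#include <vector>

// Toy stand-in for the process-wide list of vectored exception handlers.
class VectoredChain {
public:
    void add(bool first, const std::string& handler) {
        if (first) handlers_.push_front(handler);
        else handlers_.push_back(handler);
    }
    void remove(const std::string& handler) {
        handlers_.erase(std::remove(handlers_.begin(), handlers_.end(), handler),
                        handlers_.end());
    }
    // Handlers in the order they would be called.
    std::vector<std::string> order() const {
        return std::vector<std::string>(handlers_.begin(), handlers_.end());
    }
private:
    std::deque<std::string> handlers_;
};

// What the hook does when the program registers a handler of its own:
// take ours out, let the client's request go through untouched, put ours back first.
void hooked_add(VectoredChain& chain, const std::string& ours,
                bool first, const std::string& client) {
    chain.remove(ours);
    chain.add(first, client);
    chain.add(true, ours);
}
```

Whether the client asks to be first or last, its handlers keep the same order relative to one another and ours always ends up at the head of the queue.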

How I Cracked the iTunes 7 DRM, Pt V

The story so far: Part 1, Part 2, Part 3, Part 4. The remainder of this project consisted of developing the interface and injection DLL in parallel. This all went fairly smoothly, so I’ll present a summary of the workings.

  • Two programs are involved:

    • DLLBugger.dll - a C++ toolkit DLL designed for injection into iTunes. It sniffs out DRM keys as they are passed to the MP4-playing subroutine, exposes a variety of methods for inter-process communication, and invokes iTunes’s decrypter function when ordered to do so. The peculiar name is something of a relic from the DLL’s twin program, who unfortunately didn’t make it this far.

    • DisaRM.exe - a C# GUI responsible for locating the iTunes process (launching it, if necessary), injecting DLLBugger, parsing the database, asking the user which tracks to unlock, and overseeing the decryption process as performed by DLLBugger within the iTunes address-space.

    • When launched, DisaRM immediately loads DLLBugger into its own address-space. Next, it launches iTunes.exe and acquires a handle to the process. From here DLLBugger is injected into iTunes. Having the DLL present in both processes makes for an ‘easy’ way to communicate data back-and-forth (using a shared PE segment). As it turned out, there was no need to use the Win32 debugging API and so DRMBugger.exe outlived its usefulness.

    • Because I had never written anything involving inter-process communication before, I was quite unprepared for the volume of work required to make this shared-segment approach successful. Only after familiarising myself with semaphores, and planning out how everything could be made to work with exchange limited to POD, did I manage to implement a rudimentary communication state-machine.

    • DLLBugger exports twelve functions:

      bool CreateHooks(void* decrypt_call, void* decrypt_func, void* cfw_call);

      DWORD InjectMain(void *lpParam);

      void* GetRemoteProcAddress(LPCSTR lpModuleName, LPCSTR lpProcName);

      bool KillRemoteThread(HANDLE hRemoteThread);

      bool RemoteDecrypt(wchar_t* in_name, wchar_t* out_name, RijndaelKey key);

      RijndaelKey GetLastKey();

      void SetStoredKey(RijndaelKey key);

      void RemoveHooks();

      long GetLastTrackFirstLength();

      char GetLastTrackFirstData(long buffer_size);

      WCHAR* GetLastAudioFileName();

      bool PollNewFile();

    • Passing hard-coded addresses (I know, yuck), DisaRM invokes CreateHooks in the iTunes process. This installs hooks in Kernel32!CreateFileW, iTunes!Decrypt and iTunes!PlayMP4+CallToDecrypter (the point at which the previous function is called). Now any attempts to load a track or decrypt a chunk of AAC will be intercepted.

    • After DisaRM has loaded and displayed the protected subset of the iTunes library, the user chooses which tracks to unlock and hits the ‘Get Keys’ button. This triggers DisaRM to launch the first track into iTunes, causing a call to CreateFileW to be intercepted by the DLL. The arguments are stored and execution is allowed to continue. With this, DLLBugger has a good idea which track will be playing at any given time. Once iTunes has loaded the protected MP4 file, determined its decryption key and done whatever else it does, it necessarily makes a call to the Decrypt function. Naturally, this too is intercepted by our DLL and we begin to generate a mapping of file-names to DRM keys. Sanity checks exist in the form of GetLastTrackFirstLength, GetLastTrackFirstData, GetLastKey, and GetLastAudioFileName. Once the confidence level is high enough (as all this business is done asynchronously by iTunes and it isn’t safe to assume too much about the order of events) DLLBugger reports back to DisaRM, and the next track is launched. In this way, DisaRM learns the keys for each file it needs to decrypt.

    • Provided everything went smoothly, DisaRM displays the keys alongside the track name, artist, album and such (this was initially useful for debugging purposes, but I left it in because it looks kinda cool). The user gets a chance to reconsider before hitting the ‘Remove DRM’ button. Because what DRM-removal tool would be complete without one?

    • The decryption process itself is relatively straightforward. A single call to RemoteDecrypt from DisaRM creates a new thread in iTunes, which opens up the MP4 file and parses the data to find the stbl atom. This part of the file lists the offsets and sizes of each chunk of AAC data (‘cause they come in chunks, you know) among other things. For each chunk, the thread calls the Decrypt function, passing the appropriate offset, size and key. With the stream decrypted, some offensive atoms are removed and the file is made to look like it never had any DRM in the first place. DLLBugger saves the result to disk and that’s that.
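Finding the stbl atom is the only mildly fiddly part of that last step, so here is a sketch of how it can be done. This is not DLLBugger’s code. It relies only on the standard MP4 layout, in which every atom starts with a 4-byte big-endian size (header included) and a 4-character type, and stbl sits at moov / trak / mdia / minf / stbl. It follows the first trak it finds, and the Atom struct and function names are made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct Atom {
    std::size_t payload;  // offset of the first byte after the atom header
    std::size_t end;      // offset one past the atom's last byte
    bool found;
};

static std::uint64_t read_be(const std::vector<std::uint8_t>& d, std::size_t off, int bytes) {
    std::uint64_t v = 0;
    for (int i = 0; i < bytes; ++i) v = (v << 8) | d[off + i];
    return v;
}

// Find the first atom of the given type among the siblings in data[begin, end).
static Atom find_child(const std::vector<std::uint8_t>& data, std::size_t begin,
                       std::size_t end, const std::string& type) {
    std::size_t pos = begin;
    while (pos <= end && end - pos >= 8) {
        std::uint64_t size = read_be(data, pos, 4);
        std::size_t header = 8;
        if (size == 1) {                       // 64-bit size follows the type
            if (end - pos < 16) break;
            size = read_be(data, pos + 8, 8);
            header = 16;
        } else if (size == 0) {                // atom runs to the end of its container
            size = end - pos;
        }
        if (size < header || size > end - pos) break;  // malformed, give up
        if (type.size() == 4 && std::string(data.begin() + pos + 4, data.begin() + pos + 8) == type)
            return Atom{pos + header, pos + static_cast<std::size_t>(size), true};
        pos += static_cast<std::size_t>(size);
    }
    return Atom{0, 0, false};
}

// Descend through nested container atoms, e.g. {"moov","trak","mdia","minf","stbl"}.
Atom find_atom(const std::vector<std::uint8_t>& data, const std::vector<std::string>& path) {
    Atom current{0, data.size(), true};
    for (const std::string& type : path) {
        current = find_child(data, current.payload, current.end, type);
        if (!current.found) break;
    }
    return current;
}
```

The same find_child routine can then be pointed at the inside of stbl to pick out the tables of chunk offsets and sample sizes that drive the calls to the Decrypt function.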

I’ll take this opportunity to apologise to any Mac users who were hoping to learn something about the iTunes DRM from this series. Clearly, I didn’t reverse-engineer the protocol to any substantial degree and so none of the methods described port very far away from Windows XP. Maybe another time.

A few things were learned over the four weeks I spent. Here are just a few:

  • Writing an inter-process communication framework is not a task to be taken lightly, no matter how little of it you think you need.

  • C# is excellent for GUI development and awful at low-level hackery. But when you have a shiny new hammer, everything starts to look like a nail.

  • Over-engineering a solution is as bad as under-engineering it. I’m sure I could have saved myself a fortnight if I hadn’t bothered writing that debugger I didn’t need.

  • A profiler can be an excellent RCE tool. If I’d only thought to profile a few seconds of each of m4p and m4a playback, I could have isolated the decrypter function in minutes, rather than days.

  • QuickTime is horrible.

So that marks the end of this series of posts. I can assure you, though, that I haven’t nearly reached the end of the story.