Anatomy of a killer bug: How just 5 characters can murder iPhone, Mac apps
What evil lurks in the Unicode of Death ... oh, a buffer overrun
Analysis There has been much sniggering into sleeves after wags found they could upset iOS 6 iPhones and iPads, and Macs running OS X 10.8, by sending a simple rogue text message or email.
A bug is triggered when the CoreText component in vulnerable Apple operating systems tries to render on screen a particular sequence of Unicode characters: the kernel reacts by killing the running program, be it your web browser, message client, Twitter app or whatever tried to use CoreText to display the naughty string.
Much hilarity ensued as people tweeted the special characters, posted them in web article comments or texted them, and rejoiced in the howls of fanbois' frustration. (Facebook had to block the string from being submitted as a status update.)
But how did that bug work? After some examination, it appears to be a rather cute programming cock-up that's fairly easy to explain. The vulnerable code has probably been in the wild for yonks; some people noticed it six months ago and it appeared on some slides [PDF] in April for a Hack In The Box conference presentation. Barely anyone took any notice back then - but it started to spread around the web over the weekend after a trigger string appeared on a Russian website.
Too long, didn't read: A summary
Apple's CoreText rendering system uses signed integers to pass around array indexes and string lengths. A negative length, -1, is passed unchecked to a library function which uses it as an unsigned long integer to set the bounds of an array. This causes the library to attempt to read beyond the end of an array and into unallocated memory, triggering a fatal exception.
If you're au fait with disassembling software to debug it, what follows will be obvious to you. If you're interested in what goes on under your Mac or iThing's hood, then read on.
First of all, let's look at the crash. All these steps occurred on a 64-bit Mac running OS X 10.8.4. When the Terminal app is made to display a particular string of five 16-bit Unicode characters, the program is quickly killed by the kernel, producing this OS-generated fault report:
Grokking these crash logs can be a bit of a headache, but the first thing to spot is that the processor was running inside a library called libvDSP.dylib, specifically the instruction 117462 bytes into that library, when the fault occurred. This is right at the top of the stack backtrace, at line 30 in the report, which tries to describe the sequence of functions the code called before hitting the bug.
If we open libvDSP (located deep within the /System/Library/ filesystem hierarchy of your computer) in the rather handy reverse-engineering tool Hopper, we can look at the compiled machine code that blew up. See the screenshot below: the faulting instruction 117462 bytes in, or 1cad6 in hex, is highlighted (click to enlarge).
That instruction, addsd xmm1, qword [ds:rdi+rsi], tries to load a 64-bit value into the xmm1 register from the memory address calculated by adding the rdi and rsi registers together. The crash log states we were killed by an EXC_BAD_ACCESS fault at address 0x00007fa95cc00008; in other words, that instruction tried to read data from that memory address, but it was marked as inaccessible. We're not supposed to be touching that part of the memory map, so our program is killed by the kernel before any damage can be done.
In fact, the log handily tells us that, just before the crash, there was 2048KB of memory allocated for Terminal up to the address 0x00007fa95cc00000. It seems likely the program stepped beyond that limit and triggered the fatal exception. If we scroll through the crash log to the thread state, we can see the values that were in the CPU registers at the time of the crash:
Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0000000000000030   rbx: 0x00007fa95bcc5010   rcx: 0xfffffffffffc3e0a
  rdx: 0x00007fff5c862d60   rdi: 0x00007fa95cbffff8   rsi: 0x0000000000000010
  rbp: 0x00007fff5c862d70   rsp: 0x00007fff5c862d58
  r8:  0xffffffffffffffff   r9:  0x00007fa95c83e1e8
  [...]
  r15: 0x0000000000000002
Hopper highlights that there's a loop running between 0x1cad2 and 0x1cae7, reading in data and adding it to three running totals in xmm0, xmm1 and xmm2, and decrementing register rcx each time until it crosses zero, at which point we jump out of the loop - effectively using rcx as a loop countdown. In each iteration, rdi increases and rsi remains constant. As mentioned above, these two are added together to calculate the address we read from; you can see that adding rdi to rsi produces the address 0x7fa95cc00008, which (as seen above) triggers the fault.
So, rdi is growing too large, forcing us to read memory not allocated to us. The address of the data we attempt to fetch is only constrained by rcx, so this suggests rcx is too large - it doesn't hit zero before rdi pushes us into invalid memory - and yes, that's clearly the case: at 0xfffffffffffc3e0a by the time of the crash, rcx has been counting down from an unfeasibly large value, far greater than the memory allocated to the entire program. It should be fairly small. What's gone wrong?
By analysing the debugging information in the libvDSP binary, Hopper tells us we're in the vDSP_sveD() function of the library at the time of the fault. Apple has documented that function here:
void vDSP_sveD (double *__vDSP_A, vDSP_Stride __vDSP_I, double *__vDSP_C, vDSP_Length __vDSP_N);
It's used to sum an array of double-precision floating-point numbers. Let's map those input parameters for the function to the registers used in this compiled code, following the System V AMD64 ABI that Mac OS X uses:
rdi = double *__vDSP_A       Pointer to the array of input values
rsi = vDSP_Stride __vDSP_I   The stride, don't worry about this
rdx = double *__vDSP_C       Pointer to where we store the result
rcx = vDSP_Length __vDSP_N   The number of array elements to sum
The variable type vDSP_Length is defined as follows:
typedef unsigned long vDSP_Length;
So rcx is supposed to be given a positive-only integer: the number of array elements to process, which is why it counts down to zero in the loop.
Something inside the CoreText component is therefore calling vDSP_sveD() with a stupidly large __vDSP_N value and the library function does nothing to check the sanity of this number because it assumes the caller knows what it's doing. This large loop counter value causes rdi to exceed the bounds of the input array __vDSP_A and blow up our app.
So, what's calling this library function? Looking next down the stack trace, we find the processor was in the CoreText component's TRun::TRun function, specifically 850 bytes in. There, we find an instruction that calls another CoreText function labelled TStorageRange::SetStorageSubRange(), which is disassembled below (click to enlarge):
This SetStorageSubRange() function is internal to CoreText and isn't documented publicly. But we can see it calls vDSP_sveD() at 0x274f9, leading to our crash as described above.
Just before that doomed function call, rcx (which holds the dodgy array length value passed to libvDSP) takes its value from the r8 register at 0x274f6. The function vDSP_sveD() doesn't modify r8, which is very handy for us snooping around. From our aforementioned crash dump, we can see the contents of r8, which is used to initialise the loop countdown register rcx just before vDSP_sveD() is called.
And yes, its value is ridiculously big. In fact it's the largest possible value for an unsigned 64-bit register, so no wonder we exceed the array's bounds and crash:
r8: 0xffffffffffffffff
That unsigned integer, when expressed as a signed decimal integer, is -1, as per the rules of two's complement.
So vDSP_sveD() is called by CoreText's SetStorageSubRange() with a negative array length, which isn't what the library function expects: it's defined as taking a positive-only value. SetStorageSubRange() isn't calling libvDSP's summing function properly.
Where does this negative number come from? Back inside SetStorageSubRange(), we see that the r8 register (used to initialise rcx for vDSP_sveD()) is given the value of rdx near the start of the function at 0x2744e. By following all the possible code paths in SetStorageSubRange(), we can see that r8 keeps that initial value if it's either negative or an internal flag bit is clear. Therefore the -1 passed to vDSP_sveD() comes from rdx.
That rdx register is an input parameter of SetStorageSubRange(). Judging by the code, the first parameter to the function, stored in rdi, is a pointer to a 64-bit variable in which vDSP_sveD()'s sum should be stored. That leaves the two other entry parameters in rdx and rsi.
According to the debugging information, SetStorageSubRange() takes a CFRange structure as one of its inputs. This struct is defined publicly here:
struct CFRange {
    CFIndex location;
    CFIndex length;
};
This structure describes "a range of sequential items in a container, such as characters in a buffer". It contains two CFIndex variables, one labelled location, which defines the starting point in an array of things, and the other labelled length, which is the number of things in that array that we're interested in.
And CFIndex is defined as:
typedef signed long CFIndex;
So it appears the second parameter to the SetStorageSubRange() function is a CFRange structure, placing the location in rsi and the length value in rdx, which is a negative integer. Whatever called SetStorageSubRange() thus passed in -1 as a string length, which trickled down to the crash described above.
(Given that length is a signed long, -1 may be a valid number. Apple's Core Foundation framework uses signed CFIndex values "as an array index and for count, size, and length parameters and return values" throughout its software.)
Stepping back once more, we reach the non-trivial function TRun::TRun() in CoreText, which called SetStorageSubRange() at 0x25d57. A disassembly is below (click to enlarge):
Things get interesting at the 0x25d25 mark: the register rbx is incremented and its value copied into rdx. Then the value of the r15 register is subtracted from rdx. The resulting value in rdx is then passed to SetStorageSubRange(), which we've seen is -1. And we know the value of r15, because miraculously it has been preserved from that moment all the way through to the crash. So, in our register dump:
r15: 0x0000000000000002
Working backwards, rbx has to be zero just before it's incremented in order to land us with an rdx of -1 that knocks down the house of cards later on. Following the spaghetti assembly code up through TRun::TRun(), it seems that rbx is a previously calculated length, quite likely the number of characters or Unicode glyphs to process.
Filippo Bigarella, who is attempting to patch the bug for jailbroken devices, found that the special Unicode of Death sequence made the publicly documented CTRunGetGlyphCount() function in CoreText return a value of -1. That function is supposed to return the number of Unicode glyphs in a "glyph run, which is a set of consecutive glyphs sharing the same attributes and direction".
A negative value doesn't seem right at this point. It suggests that the killer Unicode string - a short nonsensical mix of Cyrillic and Arabic - is a sequence of characters that causes the operating system to determine that the string has a zero or negative length. For the uninitiated, Unicode is a way of storing and processing letters beyond the traditional ASCII set you'll no doubt be familiar with; it's capable of representing characters from Arabic and Asian languages to mathematical symbols.
Unicode can also set the direction from left to right and right to left, and combine multiple glyphs into a custom one. Call it a hunch, or a considered guess, but given your humble hack's experience of font renderers, it's not inconceivable that CoreText could get confused by quirky Unicode and end up computing a negative length. If anyone has any bright ideas, get in touch.
One bug-triggering character sequence involves a simple space (ASCII code 0x20), which may be related to whitespace-handling code seen in the stack trace.
Final thoughts
The code for CTRunGetGlyphCount() does not calculate the number of glyphs in a run: instead it pulls up a value from a data structure that must have been initialised earlier. Bigarella tells me the glyph count miscalculation may occur when CoreText creates a CTRunRef object that represents a line of text to render.
Here the trail goes cold; the quarry disappears into the long grass of TRun::TRun() and beyond. The next step would be to fire up a debugger and slowly step through the execution until it becomes clear how the invalid glyph count is calculated. Perhaps revealing a recipe for creating killer Unicode strings for an unpatched bug wouldn't be the best idea, anyway.
In the meantime, the flaw as it stands doesn't appear to be exploitable beyond crashing a user's program: it's mighty hard to leverage an end-of-array read fault into something more serious.
It can also be triggered on 32-bit ARM-powered iPhones, iPods and iPads running the latest publicly available version of iOS. This means the bug isn't specific to a particular architecture: the buffer overrun will work in much the same way except within 32 bits rather than 64 as seen above.
The app-slaying coding error is absent from iOS 7 and Mac OS X 10.9 (codenamed Mavericks), both due for a public release soon. El Reg contacted Apple to find out whether or not its older software will be fixed, but no one was available to comment.
Just as articles have typos, software has bugs. And Apple support forums have complaints about Unicode-triggered crashes. ®