HITB2014AMS – Day 1 – State of the ART: Exploring the New Android KitKat Runtime
Good afternoon and welcome back to Hack In the Box. I can’t think of anything better than a talk on ART, the new Android KitKat Runtime, to digest lunch :)
Intro
ART was introduced in Android 4.4 back in October 2013 and, although it is still in an experimental stage, it’s poised to replace Dalvik in the near future. ART features AOT (Ahead Of Time) compilation, which means apps should run faster than on Dalvik, which uses JIT compilation (no benchmark data is available at this point). As a side effect, battery life should improve too. On the downside, more storage space will be needed (about 10 times larger) and installing an app takes longer. To enable ART in KitKat, switch the runtime from Dalvik to ART in the corresponding developer setting.
To check which runtime is enabled, run “getprop persist.sys.dalvik.vm.lib.1”; if it returns “libart.so”, ART is enabled.
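The check above can be scripted from a workstation. The following is a minimal sketch, not something shown in the talk: it assumes adb is on the PATH with a single device attached, and the helper names (runtime_from_getprop, query_device_runtime) are made up for illustration. “libdvm.so” is the Dalvik library name.

```python
import subprocess


def runtime_from_getprop(value: str) -> str:
    """Map the raw property value printed by getprop to a runtime name."""
    lib = value.strip()
    if lib == "libart.so":
        return "ART"
    if lib == "libdvm.so":
        return "Dalvik"
    return "unknown"


def query_device_runtime(prop: str = "persist.sys.dalvik.vm.lib.1") -> str:
    """Ask the attached device which runtime library is selected.

    Assumption: `adb` is installed and exactly one device is connected.
    """
    result = subprocess.run(
        ["adb", "shell", "getprop", prop],
        capture_output=True,
        text=True,
        check=True,
    )
    return runtime_from_getprop(result.stdout)
```

Calling query_device_runtime() with a device attached returns “ART”, “Dalvik” or “unknown”; the parsing is kept in a separate function so it can be used without a device.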
Before continuing, Paul emphasizes that ART is still experimental, which means some of the contents of his talk are subject to change in the final version of ART.
Ahead of time compilation
The AOT compilation happens upon reboot after ART is enabled. It creates a boot.oat and a boot image. All installed apps will be compiled… and this may take a while, Paul says. When a new application is installed, it also gets compiled. The dex2oat utility is used to compile an app to OAT, and the resulting oat file is stored on the device. Paul explains that the boot.oat file contains absolute pointers to methods in the boot image. The boot.oat and boot image are loaded by zygote.
ART has 3 compiler back-ends:
- Quick (default): Converts the DEX bytecode to a medium-level IR, then to a low-level IR, which is in turn converted to native code. Some optimization is done at each stage of the compilation process.
- Optimized: Basically an optimized version of “Quick”
- Portable: Uses LLVM bitcode as its LIR. Optimization is done by the LLVM optimizer and code generation by the LLVM back-ends. Paul mentions that he has not been able to use the portable back-end yet, for unknown reasons.
By default, ART compiles all methods (except for some class initialization methods).
When you run an app, profiling data is generated (unless you disable it) and stored under /data/dalvik-cache. ART uses this profiling data to decide whether dex2oat must be run (and thus whether the application must be compiled). The methods to compile are the methods that together make up 90% of the called methods. If the number of methods in that set has changed by 10% compared with previous runs, the app is compiled (in other words, compilation is triggered as soon as the change reaches a threshold).
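To make the threshold idea concrete, here is a rough sketch of how such a decision could be modeled. It is an interpretation of the description above, not ART’s actual code: the function names, the way “change” is measured (share of the current top set that was not in the previous top set) and the input format (a dict of method name to call count) are all assumptions.

```python
def top_methods(call_counts, coverage=0.90):
    """Return the most-called methods that together reach `coverage` of all calls."""
    total = sum(call_counts.values())
    if total == 0:
        return set()
    selected = set()
    covered = 0
    # Most-called first; ties are broken by name so the result is deterministic.
    for name, count in sorted(call_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if covered >= coverage * total:
            break
        selected.add(name)
        covered += count
    return selected


def needs_recompile(previous_top, current_top, change_threshold=0.10):
    """Decide whether the top-method set changed enough to run the compiler."""
    if not current_top:
        return False
    if not previous_top:
        return True  # no earlier profile to compare with
    changed = len(current_top - previous_top) / len(current_top)
    return changed > change_threshold
```

For example, with call counts {"a": 80, "b": 15, "c": 5} the top set is {"a", "b"}; if a later run yields {"a", "c"}, half of the set is new and the sketch reports that a recompile is needed.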
OAT file format
The OAT file is an ELF dynamic object with an .oat file extension; the ELF file acts as a container for the oat data. The oat data starts with the “magic” bytes “oat\n”, and the ELF dynamic symbol table has symbols pointing to the oat data and code: oatdata (.rodata), oatexec (.text) and oatlastword.
The oatdata symbol points to the headers and the DEX files. The oatexec symbol points to the compiled code, and oatlastword is just an end marker (it marks the last 4 bytes of oatdata). OAT supports ARM, ARM64, Thumb2, x86, x86_64 and MIPS, and the target architecture is stored in the instruction_set header field.
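As a small illustration of the layout, the sketch below checks the start of the oat data, i.e. the bytes that the oatdata symbol points to. It assumes the header begins with the 4-byte magic “oat\n” followed by a 4-byte, NUL-padded ASCII version string (as in the AOSP sources of that period); locating oatdata inside the ELF file is left out, and parse_oat_prefix is a made-up helper name.

```python
OAT_MAGIC = b"oat\n"  # assumed magic at the start of the oat data


def parse_oat_prefix(oatdata: bytes) -> str:
    """Validate the magic and return the OAT version string that follows it.

    `oatdata` must be the bytes at the `oatdata` symbol, not the start of the
    ELF file (which begins with the ELF magic instead).
    """
    if len(oatdata) < 8:
        raise ValueError("need at least 8 bytes of oat data")
    if oatdata[:4] != OAT_MAGIC:
        raise ValueError("bad OAT magic: %r" % oatdata[:4])
    # The version is stored as ASCII digits padded with a trailing NUL byte.
    return oatdata[4:8].rstrip(b"\x00").decode("ascii")
```

Feeding it the first bytes of the ELF container raises a ValueError, which is a quick way to notice that the wrong offset was used.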
The DEX file header is placed right after the OAT header. It contains the dex size (length of the original input path), data (original path of the input file), checksum (of the path), a pointer to the embedded input DEX (apk) file, and a list of offsets to OATClassHeaders. Each of these headers has a few fields: status, type, bitmap_size, bitmap_pointer and methods_pointer. kOatClassAllCompiled, kOatClassSomeCompiled and kOatClassNoneCompiled are examples of type values. The bitmap indicates which methods are compiled: each bit represents one method in the class, starting with the direct methods.
Next, we find an OatMethodOffset (one per method) and an OatMethodHeader, which appears right before the method code.
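The bitmap can be decoded with a few lines of code. This is a sketch under stated assumptions: bit i of the bitmap corresponds to method i of the class (direct methods first), bits are numbered from the least significant bit of each byte, and a set bit means “compiled”. The bit ordering and the function name are assumptions, not taken from the talk.

```python
def compiled_method_indices(bitmap: bytes, method_count: int) -> list:
    """Return the indices of the methods whose bit is set in the bitmap.

    Assumption: method i maps to bit (i % 8) of byte (i // 8), LSB first.
    """
    needed = (method_count + 7) // 8
    if len(bitmap) < needed:
        raise ValueError("bitmap too short for %d methods" % method_count)
    compiled = []
    for index in range(method_count):
        if bitmap[index // 8] & (1 << (index % 8)):
            compiled.append(index)
    return compiled
```

For a class of type kOatClassAllCompiled or kOatClassNoneCompiled the answer follows from the type alone, so decoding like this would only be needed for kOatClassSomeCompiled.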
Security Implications
New technology means new code, Paul says. New code means potential mistakes. Paul decided to fuzz the compiler (using dumb fuzzing methods), generating DEX files with mutated method code and running them through dex2oat. He found several crashes but didn’t pursue them because he realized that, since ART is still evolving and under heavy development, they may get fixed in a later version of ART. It does prove that new code == flaws.
Exploiting ART would allow attackers to install user-mode rootkits. Because the boot image contains the addresses of methods, it could be parsed and the pointers used in local attacks.
Also, the base address of the boot image is fixed at 0x700000, which means it could be used to bypass ASLR. It’s a rich source of ROP gadgets: the boot.oat code section has 27 MB of code :)
Reverse Engineering
From a static analysis perspective, Paul says, it’s probably easier to read Dalvik bytecode disassembly. If you feel up to it, you can dump the native code disassembly using oatdump (which should be on your ART-enabled device). He explains that the absolute addresses of methods are put in the application just like that. In other words, it may be difficult to understand what is going on unless you create cross-reference names. Also, oatdump dumps the entire OAT file, which may be painful. Paul mentions that you can use gdb to debug native code: simply get the address of a method using oatdump, set a breakpoint and trace.
For dynamic instrumentation, you could use Cydia Substrate for Android (by saurik) or the Xposed framework (by rovo89). Unfortunately, neither will support ART until it has stabilized. So, for now, static instrumentation is the way to go. You’ll have to unpack, disassemble, etc., which can be painful.
Paul finished his talk by explaining that ART is definitely ripe for more security research and that more work on RE tools is necessary to make that research easier.
About the speaker
Paul Sabanal is a security researcher on IBM Security Systems’ X-Force Advanced Research Team. He has more than a decade of experience in the information security industry, mainly focusing on reverse engineering and vulnerability research. He has previously presented at several conferences on the topics of C++ reversing and various sandboxing technologies. His main research interests these days are in protection technologies, mobile security, and automated binary analysis tools. When not in front of a computer, he enjoys Disney movie nights with his daughter, playing weird instruments in a band, and pajama wrestling. He is currently based in Manila, Philippines.
© 2014, Peter Van Eeckhoutte (corelanc0d3r). All rights reserved.