HITBSecConf2008 - Kuala Lumpur
Ero Carrera - ero.carrera@gmail.com
Reverse Engineer at zynamics GmbH
Chief Research Officer at VirusTotal
Introduction
An historical perspective
Originally meant to save space by reducing the redundancy in executable file formats
Simply compressed parts or the whole of the executable
Created a new "envelope" around it that restored the original executable and then passed control to it
The decompressing envelope did little more than restore the executable
Compression provided a trivial degree of obfuscation, but obfuscation nonetheless
It was easy to add additional measures to the decompressing envelope
Anti-debug
Aimed at making tracing hard
Using SEHs triggered by hard-to-handle exceptions
Confusing debuggers by throwing the INTs they use
Calling hard-to-hook low-level APIs/syscalls
Checking for hooks
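As an illustration of the "checking for hooks" item, here is a minimal sketch that inspects the first bytes of a function for common x86 inline-hook patterns. The function name `looks_hooked` and the chosen byte patterns are assumptions made for this example; real protectors use their own, usually more elaborate, checks.

```python
def looks_hooked(prologue: bytes) -> bool:
    """Heuristically flag a function prologue that looks patched (x86)."""
    if not prologue:
        return False
    # 0xE9 = JMP rel32, 0xEB = JMP rel8, 0xCC = INT 3 (software breakpoint)
    if prologue[0] in (0xE9, 0xEB, 0xCC):
        return True
    # PUSH imm32 ; RET is another common way to redirect execution
    return len(prologue) >= 6 and prologue[0] == 0x68 and prologue[5] == 0xC3


clean = bytes.fromhex("8bff558bec")     # mov edi, edi ; push ebp ; mov ebp, esp
patched = bytes.fromhex("e9deadbeef")   # jmp rel32 written over the prologue
print(looks_hooked(clean), looks_hooked(patched))
```

A check like this can produce false positives, since some legitimate functions are plain jump thunks.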
Anti-environment
VM detection: VMWare, VirtualPC, etc.
Techniques aimed against specific tools: OllyDBG, IDA, Softice, etc.
Breaking tools
Tricks for detecting, confusing or crashing some of the most common tools
IDA
OllyDBG
Procdump
Softice
Anti-analysis
Code obfuscation
Adding junk code, using opaque predicates
Code transformation
Virtual machines
Flow obfuscation (SEH, Nanomites)
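To make the term concrete, an opaque predicate is a condition whose outcome is fixed but not obvious to a reader or a static analyzer. The sketch below uses the fact that the product of two consecutive integers is always even; the function names are made up for this example.

```python
def opaque_true(x: int) -> bool:
    # x * (x + 1) multiplies two consecutive integers, so the result is always even.
    return (x * (x + 1)) % 2 == 0


def protected_add(a: int, b: int, x: int) -> int:
    if opaque_true(x):
        return a + b           # real code, always taken
    return (a ^ b) * 0x1337    # junk branch, never executed


print(protected_add(2, 3, 7))
```

The junk branch only serves to add a bogus path to the control flow graph that an analyst has to rule out.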
Tools
Bochs
Provides a high-level view
No need to worry about most of the anti-* techniques
Windbg
Can do kernel-mode debugging, hook syscalls, look deeper than user-mode tools
Inspection of physical/virtual memory
Memoryze, for the real hardcore
[Figures: an Executable Image divided into Memory Pages; a Function shown as a sequence of address, instruction, (operand, ...) entries; Function Chunks of one function spread across several Memory Pages; Function A and Function B with Shared Blocks.]
Junk Code
pusha, popa
Non-standard branching
Junk JMP insertion
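One way to picture cleaning up this kind of junk is a peephole pass over a disassembly listing. The sketch below is an assumption-laden illustration, not a tool from the talk: it treats instructions as plain strings, and it assumes that nothing later reads the stack area below the stack pointer, so that an adjacent `pusha`/`popa` or `push eax`/`pop eax` pair has no net effect.

```python
# Adjacent instruction pairs assumed to cancel each other out.
JUNK_PAIRS = {("pusha", "popa"), ("push eax", "pop eax")}


def strip_junk_pairs(instructions):
    """Drop adjacent no-op pairs, including nested ones, from a listing."""
    out = []
    for ins in instructions:
        if out and (out[-1], ins) in JUNK_PAIRS:
            out.pop()          # the previous instruction and this one cancel out
        else:
            out.append(ins)
    return out


listing = ["mov eax, 1", "pusha", "push eax", "pop eax", "popa", "ret"]
print(strip_junk_pairs(listing))
```

Using the output list as a stack lets nested pairs collapse in a single pass.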
Virtual Machines
Visual Basic, Java, Python, Ruby, Perl, .NET
Starforce, VMProtect, x86 Virtualizer, Themida/CodeVirtualizer
At a high level it's a fetch, decode, handle algorithm
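The fetch, decode, handle loop can be sketched with a toy stack machine. The opcodes, the class name `TinyVM` and the bytecode layout below are invented for this example and do not correspond to any of the products listed.

```python
from typing import Callable, Dict, List


class TinyVM:
    def __init__(self, bytecode: List[int]) -> None:
        self.code = bytecode
        self.ip = 0                      # virtual instruction pointer
        self.stack: List[int] = []       # virtual stack
        self.running = True
        # Decode table: opcode byte -> handler
        self.handlers: Dict[int, Callable[[], None]] = {
            0x01: self.op_push,
            0x02: self.op_add,
            0x03: self.op_mul,
            0xFF: self.op_halt,
        }

    def fetch(self) -> int:
        byte = self.code[self.ip]
        self.ip += 1
        return byte

    def op_push(self) -> None:
        self.stack.append(self.fetch())  # the operand follows the opcode

    def op_add(self) -> None:
        b, a = self.stack.pop(), self.stack.pop()
        self.stack.append(a + b)

    def op_mul(self) -> None:
        b, a = self.stack.pop(), self.stack.pop()
        self.stack.append(a * b)

    def op_halt(self) -> None:
        self.running = False

    def run(self) -> int:
        while self.running:
            opcode = self.fetch()            # fetch
            handler = self.handlers[opcode]  # decode
            handler()                        # handle
        return self.stack[-1]


# (2 + 3) * 4
program = [0x01, 2, 0x01, 3, 0x02, 0x01, 4, 0x03, 0xFF]
print(TinyVM(program).run())
```

A protector built on this idea would typically randomize the opcode numbering per build, so the decode table itself has to be recovered before the bytecode can be read.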
Virtualized Code
[Figure: virtualized code runs on the virtual machine, which in turn runs on the real CPU. Labels: Registers (general purpose, instruction pointer, stack pointer), Decoder, Decode, handler for opB, Execute handler, Real CPU opcodes, Update registers.]
Advanced Packers
Some of the hardest current packers are VMProtect, Themida and Armadillo
They incorporate some complex, custom techniques
Usually commercial protection products
Armadillo
Double process debugging, debug blocker
Nanomites
Strategic Code Splicing
Armadillo's invalid instructions: LOCK prefix, invalid MOV
[Figure: the parent process debugs the child process. When the child process executes an INT 3, the parent catches it, looks up the address, finds the target, transfers control and resumes the child.]
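The breakpoint handling between the two processes can be simulated in a few lines. This is only a sketch under simplifying assumptions: the child is a tiny invented instruction set held in a dictionary, and the table `NANOMITES`, which maps the address of each INT 3 to a condition and two targets, is a hypothetical layout and not Armadillo's actual format.

```python
# Address of each INT 3 -> (condition on the child's registers,
#                           target if taken, target if not taken)
NANOMITES = {
    2: (lambda regs: regs["counter"] != 0, 1, 3),
}

CHILD_CODE = {
    0: ("set", "counter", 3),
    1: ("dec", "counter"),
    2: ("int3",),            # stands in for the removed branch "jnz 1"
    3: ("halt",),
}


def parent_handle_breakpoint(address, regs):
    condition, taken, not_taken = NANOMITES[address]   # look up address
    return taken if condition(regs) else not_taken     # find target


def run_child():
    regs = {"counter": 0}
    ip = 0
    breakpoints = 0
    while True:
        ins = CHILD_CODE[ip]
        if ins[0] == "set":
            regs[ins[1]] = ins[2]
            ip += 1
        elif ins[0] == "dec":
            regs[ins[1]] -= 1
            ip += 1
        elif ins[0] == "int3":
            breakpoints += 1
            # The parent catches the exception, then transfers control
            # and resumes the child at the computed target.
            ip = parent_handle_breakpoint(ip, regs)
        elif ins[0] == "halt":
            return regs, breakpoints


print(run_child())
```

Dumping the child alone is not enough in this model, because the branch targets live only in the parent's table.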
Themida
Standard Imports
[Figure: Executable and Imported DLL.]
Reconstruction
The algorithm has limitations
References to other functions within the DLL are kept
The same holds for the true branches of conditional branches
Those two points can allow us to do API discovery by studying their connectivity
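One possible reading of "studying their connectivity" is to compare the set of DLL-internal addresses a relocated function still references with the sets referenced by each export of the original DLL. The sketch below ranks candidates by Jaccard similarity. The function `guess_api`, the export names and the addresses are all hypothetical, and in practice the reference sets would have to come from a disassembler.

```python
def guess_api(stub_refs, export_refs):
    """Rank exports by how closely their reference sets match the stub's."""
    stub = frozenset(stub_refs)
    scored = []
    for name, refs in export_refs.items():
        refs = frozenset(refs)
        union = stub | refs
        score = len(stub & refs) / len(union) if union else 0.0
        scored.append((score, name))
    # Best score first, ties broken by name for a stable order
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored


# Made-up data: addresses of internal functions each export references
exports = {
    "ExportA": {0x1000, 0x2000},
    "ExportB": {0x1000, 0x3000},
}
print(guess_api({0x1000, 0x2000}, exports)[0])
```

Exports that reference no internal functions at all cannot be told apart this way and would need another signal.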
Themida's obfuscation
Adds lots of branching and junk
Keeps few "real" instructions per obfuscated block
IDA can easily deal with the branching
However, bogus calls break IDA's analysis and lead to broken obfuscated functions
Some scripting can make this look better
Current state
Packing vs unpacking
Packing is not always a symmetric process; sometimes it can't be undone perfectly
You won't get the original process back
Can it be done generically?
In some cases the answer is "mostly" yes
You will almost always be able to obtain code close to its original form
Recent techniques
skape documented an elegant trick in Uninformed 10 a few weeks ago
It attacks a basic heuristic used by most generic unpackers
Tracking execution transfer to dirty memory
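The heuristic under attack can be sketched as a tracker that remembers which pages were written and raises a flag when execution reaches one of them. The class `DirtyTracker` and its callbacks are assumptions for this example; a real unpacker would get these events from page protections, emulation or instrumentation.

```python
PAGE = 0x1000  # assumed page size


class DirtyTracker:
    def __init__(self):
        self.dirty = set()   # virtual page numbers that have been written

    def on_write(self, vaddr):
        self.dirty.add(vaddr // PAGE)

    def on_execute(self, vaddr):
        """Return True when execution reaches a page written earlier."""
        return vaddr // PAGE in self.dirty


tracker = DirtyTracker()
tracker.on_write(0x401000)            # unpacking stub writes the decoded code
print(tracker.on_execute(0x401010))   # execution lands in the written page
print(tracker.on_execute(0x900010))   # a different virtual page is not flagged
```

Because this tracker is keyed on virtual addresses, a write through one virtual range followed by execution through another range backed by the same physical memory would go unnoticed.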
[Figure: timeline (TIME) of WRITE and EXECUTE accesses passing through the MMU, which maps Virtual Address Range A onto Physical Memory.]
Countermeasures
Windbg can see the mappings from virtual to physical
Not too hard to spot double-mapped regions
Bochs and other low-level emulators can easily do it as well
Requires kernel-mode access or higher
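Once the virtual-to-physical mappings are available, spotting double-mapped regions amounts to grouping virtual pages by the physical frame behind them. The sketch below assumes the mappings have already been exported into a plain dictionary; the function name and the sample page numbers are made up.

```python
from collections import defaultdict


def find_double_mapped(page_table):
    """page_table maps virtual page number -> physical frame number.

    Returns the frames reachable through more than one virtual page.
    """
    by_frame = defaultdict(list)
    for vpn, pfn in page_table.items():
        by_frame[pfn].append(vpn)
    return {pfn: sorted(vpns) for pfn, vpns in by_frame.items() if len(vpns) > 1}


mappings = {0x400: 0x1A, 0x401: 0x1B, 0x900: 0x1A}
print(find_double_mapped(mappings))
```

Not every hit is malicious, since legitimately shared memory can also show up this way, so the result is a list of candidates to inspect.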
References
Reversing: Secrets of Reverse Engineering, Eldad Eilam
Déprotection semi-automatique de binaire, Yoann Guillot & Alexandre Gazet
http://metasm.cr0.org/SSTIC08-article-Guillot_GazetDeprotection_Semi_Automatique_Binaire.pdf
A Quick Survey on Automatic Unpacking Techniques, Daniel Reynaud
http://indefinitestudies.wordpress.com/2008/09/25/automatic-unpacking/
Rolf Rolles' blog on OpenRCE
https://www.openrce.org/blog/browse/RolfRolles
Oreans Themida/CodeVirtualizer
http://www.oreans.com
VMProtect
http://www.vmprotect.ru/
Memoryze, Mandiant
http://www.mandiant.com/software/memoryze.htm