The purpose of this post is to demonstrate how emulation can be used to quickly find solutions to simple keygenme-style programs.
It is not always necessary or efficient to rely on a disassembler or debugger alone when emulation can assist with the analysis:
by leveraging tools like angr and Cutter, one can save a significant amount of time when solving challenges like these.
Rather than post a separate write-up for each crackme, the solutions to five challenges are posted here together.
Crackme-style challenge programs often incorporate techniques designed to resist or slow down analysis. One such technique, quite familiar by now,
is corruption of the binary’s header, but many others exist as well. For example, a program may deliberately perform overly
complex operations that are difficult for a human to follow, increasing the time required
to comprehend its behavior. The keygenme binary analyzed here is an example of this. This post
demonstrates that a viable way to cope with the program’s rather opaque internal
operations is to use the angr binary analysis toolkit to automatically generate inputs that solve the binary.
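A minimal sketch of this kind of automated input generation, assuming the binary reads its key from stdin and prints a recognizable message on success or failure. The marker strings and the names printed() and find_key() are hypothetical and are not taken from the keygenme itself; angr is a third-party package and is imported only when find_key() is called.

```python
SUCCESS_MARKER = b"Correct"  # hypothetical success message
FAILURE_MARKER = b"Wrong"    # hypothetical failure message


def printed(marker):
    """Build a predicate that is true once `marker` appears on a state's stdout."""
    def check(state):
        return marker in state.posix.dumps(1)  # file descriptor 1 is stdout
    return check


def find_key(binary_path):
    """Explore the binary symbolically and return a stdin input that reaches success."""
    import angr  # third-party dependency, imported lazily

    project = angr.Project(binary_path, auto_load_libs=False)
    state = project.factory.entry_state()
    simgr = project.factory.simulation_manager(state)
    # Search for a state that printed the success marker, pruning states
    # that already printed the failure marker.
    simgr.explore(find=printed(SUCCESS_MARKER), avoid=printed(FAILURE_MARKER))
    if simgr.found:
        return simgr.found[0].posix.dumps(0)  # file descriptor 0 is stdin
    return None
```

Matching on program output rather than on hard-coded addresses keeps the sketch independent of any particular build of the binary, at the cost of exploring somewhat more states.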
In the previous post, the Unicorn emulation framework was used to examine two very small programs, each less than 100 bytes in size.
Some of the fields of these binaries’ ELF headers contained executable code, which had the effect of corrupting the ELF header. Tools like radare2 and gdb
could not be used to analyze the runtime behavior of these programs. Emulation via Unicorn was shown to be a useful alternative for this task.
Here, a slightly larger and more functional program with a malformed ELF header will be analyzed, this time with the new and very cool
Qiling emulation framework, which is built upon Unicorn. In addition, it will be shown how a simple control-flow graph can be built from disassembly,
as well as how to create a graph that maps the execution paths of a program when it is emulated.
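To illustrate the first of these ideas, here is a minimal sketch of how a control-flow graph can be derived from a linear disassembly listing. The (address, mnemonic, branch target) tuple format, the mnemonic sets and the build_cfg() name are assumptions made for this example; they are not the output format of Qiling or of any particular disassembler.

```python
CONDITIONAL = {"je", "jne", "jz", "jnz", "jl", "jg", "jle", "jge"}
TERMINATORS = {"ret", "hlt"}


def build_cfg(instructions):
    """Map each basic-block start address to a sorted list of successor blocks.

    `instructions` is a list of (address, mnemonic, branch_target_or_None)
    tuples in address order.
    """
    if not instructions:
        return {}
    addrs = [addr for addr, _, _ in instructions]
    known = set(addrs)

    # Leaders: the first instruction, every branch target, and every
    # instruction that follows a branch or a terminator.
    leaders = {addrs[0]}
    for i, (_, mnem, target) in enumerate(instructions):
        if mnem == "jmp" or mnem in CONDITIONAL or mnem in TERMINATORS:
            if target in known:
                leaders.add(target)
            if i + 1 < len(addrs):
                leaders.add(addrs[i + 1])

    cfg = {}
    current = None
    for i, (addr, mnem, target) in enumerate(instructions):
        if addr in leaders:
            current = addr
            cfg[current] = set()
        nxt = addrs[i + 1] if i + 1 < len(addrs) else None
        if nxt is not None and nxt not in leaders:
            continue  # still inside the current block
        if mnem == "jmp":
            successors = [target]
        elif mnem in CONDITIONAL:
            successors = [target, nxt]  # taken branch and fall-through
        elif mnem in TERMINATORS:
            successors = []
        else:
            successors = [nxt]  # plain fall-through into the next block
        cfg[current].update(s for s in successors if s in known)
    return {block: sorted(succ) for block, succ in cfg.items()}


# A small loop: 0x02 is the loop header, 0x0a is the exit block.
program = [
    (0x00, "mov", None),
    (0x02, "cmp", None),
    (0x04, "je", 0x0A),
    (0x06, "inc", None),
    (0x08, "jmp", 0x02),
    (0x0A, "ret", None),
]
cfg = build_cfg(program)
```

A graph of the paths actually executed can be built along the same lines by recording (previous block, current block) pairs from an emulator's block or instruction hooks instead of computing the edges statically.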
A simple but often effective method for complicating or preventing analysis of an ELF binary by many common tools (gdb, readelf, pyelftools, etc.)
is to mangle, damage or otherwise manipulate values in the ELF header so that a tool parsing the header does so incorrectly, perhaps
even failing or crashing. Common techniques include overlapping the ELF header with the program header table and writing
non-standard values to ELF header fields that are not needed for composing the process image of the binary in memory. In addition to some programs designed for criminal
purposes (e.g. the “mumblehard” family of malware), a few code-golf and proof-of-concept programs have been created that employ these techniques.
Examples include Brian Raiter’s “teensy” files and @netspooky’s “golfclub” programs. This post demonstrates how emulation can be used to trace the execution
of these types of binaries.
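As a rough illustration of the second technique, the sketch below writes bogus values into the section-header fields of a 64-bit little-endian ELF header. It assumes these are among the fields not needed to build the process image, since loading is driven by the program headers. The sample header, the bogus values and the function names are made up for the example and do not come from any of the binaries mentioned above.

```python
import struct
from collections import namedtuple

# Layout of the 64-byte ELF64 header, little-endian.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
Elf64Header = namedtuple(
    "Elf64Header",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff e_flags "
    "e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)


def parse_header(data):
    """Unpack the first 64 bytes of `data` into named ELF64 header fields."""
    return Elf64Header._make(ELF64_HEADER.unpack_from(data))


def mangle_section_fields(data):
    """Return a copy of `data` with arbitrary bogus section-header fields."""
    mangled = parse_header(data)._replace(
        e_shoff=0xFFFFFFFFFFFFFFFF,
        e_shentsize=0xFFFF,
        e_shnum=0xFFFF,
        e_shstrndx=0xFFFF,
    )
    return ELF64_HEADER.pack(*mangled) + data[ELF64_HEADER.size:]


# Hypothetical header: x86-64 executable with one program header and no sections.
sample = ELF64_HEADER.pack(
    b"\x7fELF\x02\x01\x01" + b"\x00" * 9,  # magic, 64-bit, little-endian, version 1
    2,         # e_type: ET_EXEC
    0x3E,      # e_machine: EM_X86_64
    1,         # e_version
    0x400078,  # e_entry
    64,        # e_phoff: program headers directly after the ELF header
    0,         # e_shoff
    0,         # e_flags
    64,        # e_ehsize
    56,        # e_phentsize
    1,         # e_phnum
    64,        # e_shentsize
    0,         # e_shnum
    0,         # e_shstrndx
)
mangled = mangle_section_fields(sample)
```

The entry point and the program-header fields are left untouched, so only tools that trust the section-header fields should be affected by the change.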
In the previous post, it was demonstrated how an internal function in a dynamically-linked
ELF executable can be hooked by redirecting execution to the PLT entry of a shared library function and
then overriding that shared library function via LD_PRELOAD. This technique
was used to completely replace the logic of an internal function of a toy program.
This time, rather than replacing the logic of a hooked internal function in its entirety
to override the function’s behavior, it will be demonstrated how debugging instrumentation can be
inserted into a hooked internal function to analyze and log its runtime behavior.
The internal function responsible for encoding a key via XOR operations in a crackme program
will be analyzed.
It is well known that LD_PRELOAD can be used to override shared library
functions loaded at runtime by the dynamic linker [1]. What is not so well known is
that internal functions - functions whose code lies within the .text section
of the binary - can also be hooked indirectly using a simple trick that relies on LD_PRELOAD, even though
these functions obviously are not imported from dynamically-linked libraries
(shared objects).