Day 14 - Ghidra - Headless Analyzer

Writer: Niklas Saari - OUSPG / University of Oulu

The CinCan Advent calendar continues, this time with a rather interesting tool: Ghidra, the NSA's open-source software reverse engineering (SRE) suite.

Release of Ghidra

It has been only about 8 months since the open-source release of Ghidra. Few tools offer similar capabilities. Among commercial products, IDA Pro and Binary Ninja come quite close; among open-source products, there was not a single truly competitive one before.

There are some great open-source tools such as radare2, but they lacked, for example, efficient decompilation into pseudo-C code. After Ghidra's release, its decompiler was integrated into radare2 as well, which illustrates quite well both the impact of Ghidra and the benefits of open source.

The overall response to Ghidra's release was quite positive; some of the discussion can be found on Hacker News.

According to an article published by Dark Reading, Ghidra had been in internal development for around 20 years. When the project started, nothing similar existed, and there was a need for a tool with teaming capabilities for analyzing malicious code.

Basically, Ghidra is a software analysis tool capable of disassembling, assembling, decompiling and graphing code, and it provides APIs for scripting, among many other features.

Officially, Ghidra was released publicly because this makes it possible to use the tool across different groups, which improves the ability to share (the tool was classified before). The creativity of the open-source community can also be leveraged to improve and maintain the tool further.

See the project homepage here!

The video of the Black Hat presentation can be found here.

Integrating Headless Analyzer into CinCan project

Ghidra is quite a large application, and it is heavily oriented toward use through its graphical user interface. The CinCan project aims to package CLI-based analysis tools as Docker containers, and a GUI is not really suitable for this.

However, Ghidra includes the Headless Analyzer, which makes it possible to use the decompiler, and many other features, through analysis scripts.

See more information about the Headless Analyzer here.

The way the analyzer works is not very user-friendly when it is used from a container. An analysis requires defining a Ghidra project, among other arguments, which is "meaningless" here, as containers are currently used to run an analysis and then discard its contents.

Another usability problem is that when processor or compiler configuration IDs are specified manually, their availability must be checked from the source code or from the Ghidra GUI.

To overcome some of these issues, a proof-of-concept script (yes, it might contain bugs) has been created. It provides a simple set of utilities for listing the supported processor architectures and available compilers, and it reduces the arguments required by the Headless Analyzer to a minimum. It is meant for single-run containers, but it works outside a container as well. The implementation can be found in our GitLab repository.

For example, when running the container with the cincan tool, the supported processor architectures can be listed with:

cincan run cincan/ghidra-decompiler list processors

Which should produce output like this:

ghidra_example1

Adding one more parameter, for example x86, shows the processor variants and their language IDs, which are the values required by the -processor argument:

cincan run cincan/ghidra-decompiler list processors x86

This gives:

ghidra_example2

From the image above we can see that all variants are little-endian and that there are different versions of the processor's instruction set.

The entries are queried from the source code, so they are always up to date with the current version, as long as the format in which they are defined remains the same.

To see more options, use the --help option.
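The language IDs listed above are colon-separated strings; the one used later in this post is x86:LE:64:default. As a minimal sketch, assuming the four fields are processor, endianness, address size and variant in that order, such an ID could be split into named parts like this. The names LanguageId and parse_language_id are hypothetical and are not part of Ghidra or the CinCan script.

```python
from typing import NamedTuple


class LanguageId(NamedTuple):
    """Hypothetical container for the four parts of a language ID."""
    processor: str
    endianness: str
    size: int
    variant: str


def parse_language_id(language_id: str) -> LanguageId:
    """Split an ID such as 'x86:LE:64:default' into its four fields.

    Assumption: the ID always has exactly four colon-separated fields
    and the third one is an integer size in bits.
    """
    parts = language_id.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {len(parts)}")
    processor, endianness, size, variant = parts
    return LanguageId(processor, endianness, int(size), variant)


if __name__ == "__main__":
    # The ID below is the one passed to -processor later in this post.
    print(parse_language_id("x86:LE:64:default"))
```

The endianness field is deliberately not validated here, since only the little-endian x86 values are shown in this post and other processors may use other markers.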

Decompiling

In the simplest case, we can let Ghidra automatically detect a suitable configuration based on the provided binary.

By default, the container uses a script named "DecompileHeadless.java", which prints C pseudocode to STDOUT.

By running:

cincan run cincan/ghidra-decompiler decompile hello_world

We can decompile a binary named 'hello_world', whose source is shown later.

This dumps the whole code to STDOUT, which can be redirected to a file. Exporting projects with a custom script might be possible in the future.

ghidra_example3
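Redirecting the dump to a file can also be scripted. The following is a rough sketch, assuming the cincan tool is installed and on the PATH, and that the container writes the pseudocode to STDOUT as described above. The helper names decompile_command and decompile_to_file are hypothetical; only the command line itself comes from this post.

```python
import subprocess
from typing import List, Sequence

# Image name as used in the commands of this post.
IMAGE = "cincan/ghidra-decompiler"


def decompile_command(binary: str, options: Sequence[str] = ()) -> List[str]:
    """Build the 'cincan run ... decompile' command line as a list.

    Extra options are placed between 'decompile' and the binary name,
    following the command forms shown in this post.
    """
    return ["cincan", "run", IMAGE, "decompile", *options, binary]


def decompile_to_file(binary: str, output_path: str,
                      options: Sequence[str] = ()) -> None:
    """Run the decompiler and write its STDOUT to output_path.

    Assumption: a non-zero exit code means the run failed, so
    check=True raises CalledProcessError in that case.
    """
    with open(output_path, "w") as out:
        subprocess.run(decompile_command(binary, options),
                       stdout=out, check=True)
```

With these helpers, calling decompile_to_file("hello_world", "hello_world.c") would have the same effect as redirecting the output of the command above to hello_world.c in a shell.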

From the output we can look for the main function:

ghidra_example4
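Since the dump contains every function of the binary, picking out a single one by hand can be tedious. Below is a small sketch of how one function could be cut out of the saved pseudocode by matching braces. It assumes the layout seen in this post's output, where the signature is followed by an opening brace, and it does not handle braces inside string literals or comments. The function name extract_function is hypothetical.

```python
import re
from typing import Optional


def extract_function(pseudo_c: str, name: str) -> Optional[str]:
    """Return the definition of function `name`, or None if not found.

    A definition is recognised as a line containing 'name(...)' that is
    followed, possibly after blank lines, by an opening brace.
    """
    pattern = re.compile(
        r"^[^\n;{}]*\b" + re.escape(name) + r"\s*\([^)]*\)\s*\{",
        re.MULTILINE,
    )
    match = pattern.search(pseudo_c)
    if match is None:
        return None

    # Walk forward from the opening brace until it is balanced again.
    depth = 0
    for index in range(match.end() - 1, len(pseudo_c)):
        char = pseudo_c[index]
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth == 0:
                return pseudo_c[match.start():index + 1]
    return None  # unbalanced braces, e.g. truncated output


if __name__ == "__main__":
    sample = 'undefined8 main(void)\n\n{\n  puts("Hello, World!");\n  return 0;\n}\n'
    print(extract_function(sample, "main"))
```

A proper C parser would be more robust, but for a quick look at one function in a long dump this kind of brace counting is often enough.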

The configuration that was detected automatically above can also be defined manually:

cincan run cincan/ghidra-decompiler decompile -processor x86:LE:64:default -cspec gcc hello_world

From the output we can again look for the main function of the program:

undefined8 main(void)

{
  puts("Hello, World!");
  puts("And Hello for Ghidra Headless Analyzer!");
  return 0;
}

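This can be compared to the complete original source. Note that the compiler has replaced the printf calls with puts, since the strings are constants ending in a newline: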

#include <stdio.h>
int main()
{
    printf("Hello, World!\n");
    printf("And Hello for Ghidra Headless Analyzer!\n");
    return 0;
}

For the Dockerfile, more instructions and many other tools, see our tools repository!