This page has been machine-translated from the original page.
These Ghidra Script samples were originally part of my article Solving a CTF Challenge with Your First Ghidra Script. As the number of examples grew, and because I plan to use Ghidra more seriously from now on, I split them out into a separate article.
My impression so far is that Java may be a better choice than Python (Jython) for writing Ghidra scripts, but for now I am writing them in Python.
I run them on Ghidra's built-in Jython runtime rather than Ghidrathon.
Table of Contents
- Get the Decompiler Output of a Function
- Get the Function and Data for a Specific Address
- Enumerate Call Instructions Within a Function
- Enumerate the Names of Functions Called by Call Instructions Executed Within a Specified Address Range
- Get a Sequence of Hard-Coded Byte Arrays from Disassembly
- Identify Structure Information at a Specific Address and Retrieve Its Values
- Assign a Structure to a Specific Address and Retrieve Its Values
- Save Byte Data from an Arbitrary Address Range as a File
- Summary
Get the Decompiler Output of a Function
The following script outputs the decompiler result for the function currently selected in the Listing window.
from ghidra.app.decompiler import DecompInterface
# Get the decompiler interface
decomp = DecompInterface()
decomp.openProgram(currentProgram)
# currentAddress automatically refers to the address of the line selected in the Listing,
# so select an address inside the target function before running the script
func = getFunctionContaining(currentAddress)
decomp_results = decomp.decompileFunction(func, 30, monitor)
if decomp_results.decompileCompleted():
    # getDecompiledFunction().getC() returns the decompiled C source as a string
    code = decomp_results.getDecompiledFunction().getC()
    print(code)
else:
    print("There was an error in decompilation!")
Get the Function and Data for a Specific Address
The following script lets you specify an offset to retrieve the function name for an arbitrary address, or retrieve the data at a specified address.
from ghidra.program.flatapi import FlatProgramAPI
from ghidra.program.model.address import AddressSet
# Get the Listing (ghidra.program.database.ListingDB)
listing = currentProgram.getListing()
# Get a GenericAddress object from an offset
fpapi = FlatProgramAPI(currentProgram)
addr = fpapi.toAddr(0x1024aa)
# Get the function that contains the address
func = fpapi.getFunctionContaining(addr)
print(func.getName())  # Print the function name
# Address of the data to retrieve
addr = fpapi.toAddr(0x102cd1)
# Get the data defined at that address
data = listing.getDataAt(addr)
print(data.getValue())  # ghidra.program.model.scalar.Scalar
# Build an AddressSet covering the range from that address to address + 0x100
addr_set = AddressSet()
addr_set.add(addr, addr.add(0x100))
# Iterate over the data in the range, starting from the beginning (DataIterator)
data_iterator = listing.getData(addr_set, True)
for data in data_iterator:
    print(data.getValue())
# Get the mnemonic and operands of the instruction at a specific address
addr = toAddr(0x1000000)
inst = getInstructionAt(addr)
print(inst.getMnemonicString())
print(inst.getDefaultOperandRepresentation(0))
print(inst.getDefaultOperandRepresentation(1))
# Move to the next instruction and read its operands the same way
inst = inst.getNext()
print(inst.getDefaultOperandRepresentation(0))
print(inst.getDefaultOperandRepresentation(1))
Enumerate Call Instructions Within a Function
The following code enumerates Call instructions within a function.
You need to use different retrieval methods depending on whether you want to preserve the call order.
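The difference between the two approaches comes down to set semantics versus a sequential walk: getCalledFunctions returns a set, which carries no ordering. The plain-Python sketch below runs outside Ghidra and uses made-up call names; `ordered_unique` is a hypothetical helper, not a Ghidra API. It shows how a list of call targets gathered by walking instructions in order could be de-duplicated without losing first-call order.

```python
def ordered_unique(items):
    """Return items with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

# Hypothetical call targets in the order they appear in a function body
calls_in_order = ["init", "read_input", "check", "read_input", "puts"]

print(ordered_unique(calls_in_order))
```

Converting the same list with `set()` would also drop the duplicate, but the resulting order would not be guaranteed.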
# currentAddress is a ghidra.program.model.address.GenericAddress that
# automatically refers to the line selected in the Listing
func_mgr = currentProgram.getFunctionManager()
func = func_mgr.getFunctionContaining(currentAddress)
# Call order is not preserved here (getCalledFunctions returns a set)
calls = func.getCalledFunctions(monitor)
for c in calls:
    print(c)
# Get the Listing (ghidra.program.database.ListingDB)
listing = currentProgram.getListing()
# Walk the function body in address order and print only the call instructions,
# which preserves the order in which the calls appear
for i in listing.getInstructions(func.body, True):
    if i.getFlowType().isCall():
        print(i)
Enumerate the Names of Functions Called by Call Instructions Executed Within a Specified Address Range
The following script enumerates the symbol names of functions called by Call instructions executed within the specified address range.
I used it in the following challenge.
Reference: Cake CTF 2023 Writeup - nande
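The script for this section takes the call target from the text form of each instruction. That parsing step can be tried outside Ghidra with the sketch below, which assumes disassembly strings shaped like `CALL 0x001043c9`; `parse_call_target` is a hypothetical helper name, and the sample strings are made up. It returns None for indirect calls such as `CALL RAX`, where the text contains no literal address.

```python
def parse_call_target(inst_text):
    """Return the target address of a direct CALL as an int, else None."""
    parts = inst_text.split()
    if len(parts) != 2 or parts[0] != "CALL":
        return None
    try:
        return int(parts[1], 16)
    except ValueError:
        # Register operand such as "CALL RAX": no literal address to parse
        return None

# Made-up instruction strings for illustration
for text in ["CALL 0x001043c9", "CALL RAX", "MOV EAX,0x1"]:
    print(text, parse_call_target(text))
```

Returning None instead of raising lets a loop over many instructions skip the calls it cannot resolve statically.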
from ghidra.program.flatapi import FlatProgramAPI
from ghidra.program.model.address import AddressSet
listing = currentProgram.getListing()
fpapi = FlatProgramAPI(currentProgram)
start_addr = fpapi.toAddr(0x1043c9)
end_addr = fpapi.toAddr(0x104825)
addr_set = AddressSet()
addr_set.add(start_addr, end_addr)
for p in listing.getInstructions(addr_set, True):
    code = p.toString()
    if "CALL" in code:
        # Assumes a direct call such as "CALL 0x001043c9"; an indirect call
        # (for example "CALL RAX") would fail in int() below
        func_addr = int(code.split(" ")[1], 16)
        func = fpapi.getFunctionContaining(fpapi.toAddr(func_addr))
        if func is not None:
            print(func.getName())
Get a Sequence of Hard-Coded Byte Arrays from Disassembly
The following code lets you extract operands sequentially from a series of assembly instructions.
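The operand texts collected this way are strings, so they usually need one more conversion step before they can be used as data. A minimal sketch of that step in plain Python, assuming each operand is a hex literal such as `0x41` that fits in one byte; `operands_to_bytes` is a hypothetical helper and the sample values are made up.

```python
def operands_to_bytes(operands):
    """Convert hex-literal operand strings to a bytearray, one byte each."""
    out = bytearray()
    for text in operands:
        value = int(text, 16)
        if not 0 <= value <= 0xFF:
            raise ValueError("operand does not fit in one byte: %s" % text)
        out.append(value)
    return out

# Made-up operand strings for illustration
sample = ["0x43", "0x54", "0x46"]
print(operands_to_bytes(sample).decode("ascii"))
```

Operands that are registers or wider immediates would raise here, which makes it obvious when the instruction sequence is not the simple byte-store pattern this sketch assumes.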
# Collect the second operand of 0x26 consecutive instructions, starting at a specific address
addr = toAddr(0x109011)
inst = getInstructionAt(addr)
result = []
for i in range(0x26):
    result.append(inst.getDefaultOperandRepresentation(1))
    inst = inst.getNext()
print(result)
Identify Structure Information at a Specific Address and Retrieve Its Values
The following code identifies structure information at a specific address and retrieves its values.
I used it in the following challenge.
Reference: AmateursCTF 2023 Writeup - CSCE221-Data Structures and Algorithms
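Conceptually, reading a structure component means taking `length` bytes at `base + offset` and decoding them. The sketch below mirrors that idea outside Ghidra with Python's `struct` module; the buffer contents, the field layout, and the `read_component` helper are all made up for illustration, and little-endian byte order is an assumption.

```python
import struct

def read_component(buf, offset, length):
    """Decode an unsigned little-endian integer of 1, 2, 4 or 8 bytes at buf[offset:]."""
    fmt = {1: "<B", 2: "<H", 4: "<I", 8: "<Q"}[length]
    return struct.unpack_from(fmt, buf, offset)[0]

# Hypothetical layout: a 4-byte int at offset 0 followed by an 8-byte pointer
raw = bytes(bytearray([0x2A, 0, 0, 0, 0xA0, 0x52, 0x40, 0, 0, 0, 0, 0]))

print(hex(read_component(raw, 0, 4)))
print(hex(read_component(raw, 4, 8)))
```

In Ghidra the offset and length come from the structure's component definition rather than being hard-coded as they are here.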
# Get the memory block that contains the .data section
start_address = toAddr("0x404000")
data_section = currentProgram.getMemory().getBlock(start_address)
# Address of the list
data_address = toAddr("0x404060")
data_object = getDataAt(data_address)
# The data is interpreted as the user-defined list structure
data_structure = data_object.dataType
data_component = data_structure.getComponent(0x0)
# Get the relative offset, length and data type of the component
offset = data_component.offset
length = data_component.length
data_type = data_component.dataType
# Read the int-sized value from the list structure
byte_array = getBytes(data_address.add(offset), length)
# Java bytes are signed, so mask to get an unsigned value
print(hex(byte_array[0] & 0xff))
Assign a Structure to a Specific Address and Retrieve Its Values
This script lets you define data in an arbitrary address range as a structure and inspect its values.
I used it in the following challenge.
Reference: AmateursCTF 2023 Writeup - CSCE221-Data Structures and Algorithms
# Get the listnode structure type
data_type_manager = currentProgram.getDataTypeManager()
my_structure = data_type_manager.getDataType("main.coredump/listnode")
start_address = toAddr("0x405000")
data_section = currentProgram.getMemory().getBlock(start_address)
flag = ""
listnode_addr = 0x4052a0
# Address of the listnode
data_address = toAddr(listnode_addr)
data_object = createData(data_address, my_structure)
# The data is interpreted as the user-defined listnode structure
data_structure = data_object.dataType
data_component = data_structure.getComponent(0x0)
# Get the relative offset, length and data type of the component
offset = data_component.offset
length = data_component.length
data_type = data_component.dataType
# Read the byte-sized value from the listnode structure
byte_array = getBytes(data_address.add(offset), length)
# Java bytes are signed, so mask before converting to a character
flag += chr(byte_array[0] & 0xff)
Save Byte Data from an Arbitrary Address Range as a File
This script is useful when a section contains embedded encrypted data or similar and copying it manually would be cumbersome.
It simply retrieves the data from an arbitrary address range one byte at a time and writes it to a file.
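One detail worth keeping in mind for this kind of loop: Java's `byte` type is signed, so values read through Ghidra's memory API arrive in Python as -128 to 127 and need masking before they are written out as unsigned bytes. The sketch below shows that masking step in plain Python with made-up input values; `signed_to_raw` is a hypothetical helper name.

```python
def signed_to_raw(signed_values):
    """Convert Java-style signed bytes (-128..127) to raw bytes (0..255)."""
    return bytearray(v & 0xFF for v in signed_values)

# Made-up values, shaped like what a signed byte read would return
signed = [-1, 0, 65, -128]
raw = signed_to_raw(signed)

print([hex(b) for b in raw])
```

Without the mask, packing a negative value with the unsigned `"B"` format of `struct.pack` raises an error instead of writing the byte.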
from ghidra.program.model.mem import MemoryAccessException
import struct
def save_bytes_to_file(start_address, end_address, filename):
    currentProgram = getCurrentProgram()
    memory = currentProgram.getMemory()
    start_addr = toAddr(start_address)
    end_addr = toAddr(end_address)
    with open(filename, "wb") as file:
        address = start_addr
        while address <= end_addr:
            try:
                # getByte returns a signed Java byte, so mask it to 0-255
                byte = memory.getByte(address) & 0xff
                file.write(struct.pack("B", byte))
                address = address.next()
            except MemoryAccessException as e:
                print("Error reading memory at address: %s (%s)" % (address, e))
                break
# Inclusive address range to dump
start_address = 0x403040
end_address = 0x403000 + 0x1ce00 - 1
filename = "C:\\Users\\Public\\output.bin"
save_bytes_to_file(start_address, end_address, filename)
Summary
I expect to be able to do much more as I get comfortable with Ghidra Script.
That said, the Jython bundled with Ghidra is old (it is based on Python 2.7), and the vast majority of published sample scripts are written in Java, so learning Java is probably the better path for serious use.