This page has been machine-translated from the original page.
The other day, I wrote an article on creating and analyzing ClamAV bytecode signatures, using the SECCON 2022 challenge Devil Hunter as the subject.
Reference: Learning ClamAV Signature Creation and Analysis Through a CTF
I did not cover it in that article, but there is also a brilliant analysis technique for bytecode signatures: patching libclamav so you can trace the bytecode being executed.
Reference: hxp | SECCON CTF 2022 Quals
In this article, I summarize a method for analyzing how bytecode signatures work by patching libclamav, based on the write-up above.
Note that when patching libclamav, you need to build ClamAV yourself from source.
I summarized how to build ClamAV in the following article.
Reference: Notes on Building ClamAV from Source and Setting Up OnAccessScan
Table of Contents
- Customizing libclamav to dump operands at conditional branches
- Enabling debug traces for bytecode signatures
- Summary
Customizing libclamav to dump operands at conditional branches
To use this technique, modify the line DEFINE_ICMPOP(OP_BC_ICMP_EQ, res = (op0 == op1)); in bytecode_vm.c in libclamav as follows.
// DEFINE_ICMPOP(OP_BC_ICMP_EQ, res = (op0 == op1));
DEFINE_ICMPOP(OP_BC_ICMP_EQ, printf("%d: %x == %x\n", bb_inst, op0, op1);res = (op0 == op1));
This change lets you dump the two values being compared when OP_BC_ICMP_EQ is called.
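To illustrate why a one-line change is enough, here is a minimal sketch of the same idea outside libclamav. It assumes a hypothetical macro, SKETCH_ICMPOP, that takes an arbitrary statement as its body, so a logging call can run before the comparison result is assigned. The macro, the function compare_and_log, and the fixed 32-bit operand width are all assumptions made for this example and are not libclamav's actual definitions.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the interpreter's comparison macro.
 * `body` is pasted in as a statement, so it may log before setting `res`. */
#define SKETCH_ICMPOP(lhs, rhs, result, body) \
    do {                                      \
        uint32_t op0 = (lhs);                 \
        uint32_t op1 = (rhs);                 \
        int res = 0;                          \
        body;                                 \
        (result) = res;                       \
    } while (0)

/* Compares two operands for equality and writes both values to `out`
 * first, in the same "index: op0 == op1" layout as the patched line. */
static int compare_and_log(unsigned inst_index, uint32_t a, uint32_t b, FILE *out)
{
    int result = 0;
    assert(out != NULL);
    SKETCH_ICMPOP(a, b, result,
                  fprintf(out, "%u: %" PRIx32 " == %" PRIx32 "\n", inst_index, op0, op1);
                  res = (op0 == op1));
    return result;
}
```

Calling compare_and_log(0, 0x6cbfdd9f, 0x6cbfdd9f, stdout) prints "0: 6cbfdd9f == 6cbfdd9f" and returns 1. The sketch uses the PRIx32 format macro because its operands have a fixed width; it makes no claim about the operand widths used inside libclamav.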
Let’s actually run a scan with clamscan in an environment where this change has been applied.
The following is the result when scanning a fake Flag.
The following is the result when scanning the correct Flag.
With just a simple patch to libclamav, you can easily determine that this bytecode signature transforms the scanned text and compares the result against hard-coded integer values.
Enabling debug traces for bytecode signatures
libclamav provides a TRACE_INST macro that traces each executed bytecode instruction by calling cli_byteinst_describe(inst, &bbnum), but it is disabled by default.
To enable it, define CL_DEBUG and change #if 0 to #if CL_DEBUG on the guard around the block that defines TRACE_INST.
Reference: clamav/libclamav/bytecode_vm.c at patch-libclamav · kash1064/clamav
[+] #define CL_DEBUG 1
***
[-] #if 0 /* too verbose, use #ifdef CL_DEBUG if needed */
[+] #if CL_DEBUG /* too verbose, use #ifdef CL_DEBUG if needed */
#define CHECK_UNREACHABLE \
do { \
cli_dbgmsg("bytecode: unreachable executed!\n"); \
return CL_EBYTECODE; \
} while (0)
#define TRACE_PTR(ptr, s) cli_dbgmsg("bytecode trace: ptr %llx, +%x\n", ptr, s);
#define TRACE_R(x) cli_dbgmsg("bytecode trace: %u, read %llx\n", pc, (long long)x);
#define TRACE_W(x, w, p) cli_dbgmsg("bytecode trace: %u, write%d @%u %llx\n", pc, p, w, (long long)(x));
#define TRACE_EXEC(id, dest, ty, stack) cli_dbgmsg("bytecode trace: executing %d, -> %u (%u); %u\n", id, dest, ty, stack)
#define TRACE_INST(inst) \
do { \
unsigned bbnum = 0; \
printf(""); \
cli_byteinst_describe(inst, &bbnum); \
printf("\n"); \
} while (0)
Once you scan using the rebuilt ClamAV, you can inspect the runtime bytecode trace as shown below.
However, as you can see from the following code, the trace output available through cli_byteinst_describe expresses operands as variables, just like the output disassembled with the clambc command, so you cannot see the actual values.
void cli_byteinst_describe(const struct cli_bc_inst *inst, unsigned *bbnum)
{
size_t j;
char inst_str[256];
const struct cli_apicall *api;
if (inst->opcode > OP_BC_INVALID) {
printf("opcode %u[%u] of type %u is not implemented yet!",
inst->opcode, inst->interp_op / 5, inst->interp_op % 5);
return;
}
snprintf(inst_str, sizeof(inst_str), "%-20s[%-3d/%3d/%3d]", bc_opstr[inst->opcode],
inst->opcode, inst->interp_op, inst->interp_op % inst->opcode);
printf("%-35s", inst_str);
switch (inst->opcode) {
// binary operations
case OP_BC_ADD:
printf("%d = %d + %d", inst->dest, inst->u.binop[0], inst->u.binop[1]);
break;
case OP_BC_SUB:
printf("%d = %d - %d", inst->dest, inst->u.binop[0], inst->u.binop[1]);
break;
case OP_BC_MUL:
printf("%d = %d * %d", inst->dest, inst->u.binop[0], inst->u.binop[1]);
break;
***
Reference: clamav/libclamav/bytecode.c at main · Cisco-Talos/clamav
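The limitation can be sketched in isolation. The example below is only an illustration: the struct sketch_add_inst, the slots array, and both functions are hypothetical names, not libclamav types. The first function prints operand slot numbers, which is all a static description can do. The second also looks up the value currently held in each slot, which is the information a runtime trace needs to add.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified instruction: operands are slot numbers, not values. */
struct sketch_add_inst {
    unsigned dest;
    unsigned binop[2];
};

/* Static view: prints slot numbers only, e.g. "2 = 0 + 1". */
static int describe_static(const struct sketch_add_inst *inst, char *buf, size_t len)
{
    assert(inst != NULL && buf != NULL);
    return snprintf(buf, len, "%u = %u + %u",
                    inst->dest, inst->binop[0], inst->binop[1]);
}

/* Runtime view: same text, followed by the values stored in the operand slots. */
static int describe_runtime(const struct sketch_add_inst *inst, const uint32_t *slots,
                            char *buf, size_t len)
{
    assert(inst != NULL && slots != NULL && buf != NULL);
    return snprintf(buf, len, "%u = %u + %u (values: %x + %x)",
                    inst->dest, inst->binop[0], inst->binop[1],
                    (unsigned)slots[inst->binop[0]], (unsigned)slots[inst->binop[1]]);
}
```

With an instruction whose operands are slots 0 and 1, describe_static always produces the same text, while describe_runtime changes with the contents of the slots.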
So, in addition to TRACE_INST, I modified the libclamav code so that the debug output from TRACE_PTR, TRACE_R, TRACE_W, TRACE_EXEC, and TRACE_API is written to standard output.
// #define TRACE_PTR(ptr, s) cli_dbgmsg("bytecode trace: ptr %llx, +%x\n", ptr, s);
// #define TRACE_R(x) cli_dbgmsg("bytecode trace: %u, read %llx\n", pc, (long long)x);
// #define TRACE_W(x, w, p) cli_dbgmsg("bytecode trace: %u, write%d @%u %llx\n", pc, p, w, (long long)(x));
// #define TRACE_EXEC(id, dest, ty, stack) cli_dbgmsg("bytecode trace: executing %d, -> %u (%u); %u\n", id, dest, ty, stack)
#define TRACE_PTR(ptr, s) printf("ptr %llx, +%x\n", ptr, s);
#define TRACE_R(x) printf("%u, read %llx\n", pc, (long long)x);
#define TRACE_W(x, w, p) printf("%u, write%d @%u %llx\n", pc, p, w, (long long)(x));
#define TRACE_EXEC(id, dest, ty, stack) printf("bytecode trace: executing %d, -> %u (%u); %u\n", id, dest, ty, stack)
#define TRACE_INST(inst) \
do { \
unsigned bbnum = 0; \
printf(""); \
cli_byteinst_describe(inst, &bbnum); \
printf("\n"); \
} while (0)
// #define TRACE_API(s, dest, ty, stack) cli_dbgmsg("bytecode trace: executing %s, -> %u (%u); %u\n", s, dest, ty, stack)
#define TRACE_API(s, dest, ty, stack) printf("bytecode trace: executing %s, -> %u (%u); %u\n", s, dest, ty, stack)
Reference: clamav/libclamav/bytecode_vm.c at patch-libclamav · kash1064/clamav
With this change enabled, you can trace memory reads and writes as shown below.
As you can see from the output above, information about memory reads and writes is printed immediately after the traced instruction.
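When a trace gets long, it can help to pull the write lines out mechanically. The following is a minimal sketch that parses one line produced by the printf version of TRACE_W. The struct and function names are hypothetical, and the field meanings are an assumption read off the format string "%u, write%d @%u %llx": the number after "write" is taken to be the width in bits, and the number after "@" the destination.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed "write" trace line; the field meanings are assumptions. */
struct trace_write {
    unsigned pc;              /* number printed before the comma */
    unsigned width_bits;      /* the NN in "writeNN" */
    unsigned dest;            /* number printed after '@' */
    unsigned long long value; /* hexadecimal value that was stored */
};

/* Parses a line such as "1444, write8 @644 1".
 * Returns 1 if all four fields were read, 0 for any other line. */
static int parse_trace_write(const char *line, struct trace_write *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line, "%u, write%u @%u %llx",
                  &out->pc, &out->width_bits, &out->dest, &out->value) == 4;
}
```

Lines that do not match the write format, such as the "read" lines, make the function return 0, so it can be used as a simple filter while reading the trace line by line.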
For example, in the following part, the values of v640 and v1240 are both read as 0x6cbfdd9f, compared with OP_BC_ICMP_EQ, and the result (1) is written to @644.
OP_BC_ICMP_EQ [21 /108/ 3] 644 = (640 == 1240)
1444, read 6cbfdd9f
1444, read 6cbfdd9f
1444, write8 @644 1
Also, in the following part, you can confirm that the 32-bit integer value 0x33547962 (whose bytes are the four ASCII characters 3Tyb) is read through the p.248 pointer, written to v256, and then passed to Func2 as its argument.
OP_BC_LOAD [39 /198/ 3] load 256 <- p.248
530, read fffffffe00000019
ptr fffffffe00000019, +4
530, read 617f7db2ab69
530, write32 @256 33547962
OP_BC_CALL_DIRECT [32 /163/ 3] 260 = call F.2 (256)
bytecode trace: executing 2, -> 260 (32); 2
In this way, enabling debug tracing makes it easy to determine that the Flag text is split into 4-character chunks and passed to Func2 as 32-bit integer values, without having to struggle through disassembled code as in the previous article.
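The chunking described here can be reproduced with a short sketch. It assumes the 4-byte load is little-endian, so the first character of each chunk ends up in the lowest byte. The function names pack_chunk_le and split_and_pack are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Packs up to four characters into a 32-bit value, first character in the
 * lowest byte (little-endian order, which is an assumption here). */
static uint32_t pack_chunk_le(const char *chunk, size_t len)
{
    uint32_t value = 0;
    assert(chunk != NULL && len <= 4);
    for (size_t i = 0; i < len; i++)
        value |= (uint32_t)(unsigned char)chunk[i] << (8 * i);
    return value;
}

/* Splits `text` into 4-character chunks and packs each one.
 * Writes at most `max` values to `out` and returns how many were written. */
static size_t split_and_pack(const char *text, uint32_t *out, size_t max)
{
    assert(text != NULL && out != NULL);
    size_t total = strlen(text);
    size_t count = 0;
    for (size_t off = 0; off < total && count < max; off += 4) {
        size_t len = total - off < 4 ? total - off : 4;
        out[count++] = pack_chunk_le(text + off, len);
    }
    return count;
}
```

Under that little-endian assumption, pack_chunk_le("byT3", 4) yields 0x33547962, the value seen in the trace. Reading the hexadecimal digits from left to right gives the same four characters in reverse order, 3Tyb.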
Summary
The debug tracing was extremely useful.
If another bytecode signature challenge comes up in the future, I feel like I’ll be able to solve it without too much trouble.