Panda supports 64-bit LAVA #1574

Open

AndrewQuijano wants to merge 3 commits into dev from the lava branch

Conversation

@AndrewQuijano (Collaborator) commented Feb 9, 2025

This pull request is needed to get LAVA 3.0 ready. The changes, summarized by commit, are:

  1. This PR closes #1547 (loaded plugin not supported for x86-64), because LAVA needs the loaded plugin on 64-bit targets. loaded should now also work on 32-bit and 64-bit ARM.
    https://github.com/HighW4y2H3ll/panda/blob/x64dev/panda/plugins/loaded/loaded.cpp
    loaded.cpp.txt

  2. Added more debug output and comments to make it easier to understand how dwarf2 and pri_taint work.

  3. Finishing touches on updating pri_taint and dwarf2:

  • pri_taint now uses the hypercaller library and should also work on ARM. Strictly speaking, full ARM support would also require updating the dwarf2 plugin's ELF handling for ARM. Ideally, pri_dwarf should be fully deprecated and dwarf2 should be updated to DWARF 5 as well.
  • Updated the hypercaller documentation and code. I worked with @jamcleod, who approved including these changes in this PR.
  • Roughly every 100 basic blocks, dwarf2 now attempts to load the debug symbols of the process that bugs will be injected into. This should fix the case where the main executable's dwarf2 symbols sometimes fail to load.
  • Updated the dwarf2 README, since dwarfdump.py is now part of the pandare Python package, and added indentation to the dwarf2 JSON dumps for easier reading and debugging.

@AndrewQuijano force-pushed the lava branch 3 times, most recently from 1c240fc to bbbed20, on February 22, 2025 02:56
@AndrewQuijano force-pushed the lava branch 7 times, most recently from 3bb1e81 to ead3e01, on March 20, 2025 22:22
@AndrewQuijano force-pushed the lava branch 20 times, most recently from c429ded to dffad1f, on March 30, 2025 00:14
@AndrewQuijano force-pushed the lava branch 12 times, most recently from 0ca00bb to 132b41e, on May 17, 2025 03:29
@AndrewQuijano force-pushed the lava branch 4 times, most recently from 3157e34 to 4aa000e, on May 25, 2025 03:17
@AndrewQuijano force-pushed the lava branch 2 times, most recently from fbfb92f to 4116b14, on May 25, 2025 13:28
@AndrewQuijano changed the title from "[WIP] Panda supports 64-bit LAVA" to "Panda supports 64-bit LAVA" on May 25, 2025
@AndrewQuijano force-pushed the lava branch 2 times, most recently from 06be0fe to 006126b, on June 9, 2025 19:10
Comment on lines +30 to +35
#define CPU_NB_REGS 16
#define REGS(env) ((env)->regs)
#elif defined(TARGET_ARM) && defined(TARGET_AARCH64)
#define NUM_REGS 31
#define CPU_NB_REGS 31
#define REGS(env) ((env)->xregs)

I needed this change to compile pri_taint for ARM. The goal is that we may someday be able to plant LAVA bugs in ARM targets as well.

Comment on lines +131 to +138
#if defined(TARGET_I386) && !defined(TARGET_X86_64) || defined(TARGET_ARM) && !defined(TARGET_AARCH64)
// technically for 32-bit it is pgoff, not offset! But I want to avoid code duplication!
void linux_mmap_return(CPUState *cpu, target_ulong pc, uint32_t addr, uint32_t len, uint32_t prot, uint32_t flags, uint32_t fd, uint32_t offset)
#elif defined(TARGET_X86_64)
void linux_mmap_return(CPUState *cpu, target_ulong pc, uint64_t addr, uint64_t len, uint64_t prot, uint64_t flags, uint64_t fd, uint64_t offset)
#elif defined(TARGET_AARCH64)
void linux_mmap_return(CPUState* cpu, target_ulong pc, uint64_t addr, uint32_t len, int32_t prot, int32_t flags, int32_t fd, uint64_t offset)
#endif

This is the real change: I updated the function to work on i386, x86_64, and ARM. ARM may be needed for future LAVA work.
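
The pgoff remark in the code comment can be made concrete. On 32-bit Linux, the mmap2 system call takes its last argument in 4096-byte units rather than in bytes, so a byte offset has to be derived from it. The helper below is a minimal sketch of that conversion; pgoff_to_byte_offset and MMAP2_UNIT_SHIFT are hypothetical names, and whether loaded actually needs the byte offset is not stated in this PR.

```cpp
#include <cstdint>

// Assumption: mmap2-style calls pass the file offset in 4096-byte units.
constexpr uint64_t MMAP2_UNIT_SHIFT = 12;

// Hypothetical helper, not part of the loaded plugin.
// Widens to 64 bits first so that large page offsets do not overflow.
constexpr uint64_t pgoff_to_byte_offset(uint32_t pgoff) {
    return static_cast<uint64_t>(pgoff) << MMAP2_UNIT_SHIFT;
}
```

Widening before the shift matters because a 32-bit pgoff can describe a file offset beyond 4 GiB.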

@@ -64,7 +64,7 @@ It's much easier to handle this from Python, but here's an example of how you mi
#include <panda/plugin.h>
#include <hypercaller/hypercaller.h>

-hypercall_t* register_hypercall;
+register_hypercall_t register_hypercall;

I worked with @jamcleod, and we decided to add this to the header. With that in place, I was able to use hypercaller.h correctly in pri_taint.

The documentation also matches the new code in pri_taint.

Comment on lines +180 to +198
uint64_t num_tainted = 0;
for (uint64_t offset = 0; offset < len; offset++) {
hwaddr pa = panda_virt_to_phys(cpu, buf + offset);
if ((int) pa != (hwaddr)-1) {
ram_addr_t RamOffset = RAM_ADDR_INVALID;
if (PandaPhysicalAddressToRamOffset(&RamOffset, pa, false) != MEMTX_OK) {
dprintf("[pri_taint] can't query va=0x%" PRIx64 " pa=0x" TARGET_FMT_plx ": physical map is not RAM.\n", buf + offset, pa);
continue;
}
else {
Addr a;
if (loc_t == LocMem) {
a = make_maddr(RamOffset);
}
else {
a = make_greg(buf, offset);
}
if (taint2_query(a)) {
num_tainted++;

This code was mostly taken from taint2_hypercalls.cpp; it also lines up well with what I saw in file_taint.
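
One detail in the hunk above may be worth a second look: the validity check casts pa to int before comparing it with (hwaddr)-1. The sketch below compares that shape of check with a full-width comparison. It assumes a 64-bit hwaddr, a 32-bit int, and the wrap-around narrowing that GCC and Clang perform. toy_hwaddr and both function names are hypothetical stand-ins, not PANDA code.

```cpp
#include <cstdint>

using toy_hwaddr = uint64_t;  // assumption: stands in for a 64-bit hwaddr

// Same shape as the check in the hunk: narrow to int, then compare.
// The int is converted back to 64 bits (sign-extended) for the comparison.
bool valid_after_int_cast(toy_hwaddr pa) {
    return static_cast<toy_hwaddr>(static_cast<int>(pa)) != static_cast<toy_hwaddr>(-1);
}

// Full-width comparison against the all-ones sentinel.
bool valid_full_width(toy_hwaddr pa) {
    return pa != static_cast<toy_hwaddr>(-1);
}
```

Under those assumptions the two checks differ only for addresses whose low 32 bits are all ones (for example 0x1FFFFFFFF), which can only occur in guests with more than 4 GiB of physical address space.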

Comment on lines +340 to +360
// Taken from taint2_hypercalls.cpp, not sure why but at the moment
// AttackPoints is never populated.
Panda__SrcInfo *pandalog_src_info_create(PandaHypercallStruct phs) {
Panda__SrcInfo *si = (Panda__SrcInfo *) malloc(sizeof(Panda__SrcInfo));
*si = PANDA__SRC_INFO__INIT;
si->filename = phs.src_filename;
si->astnodename = phs.src_ast_node_name;
si->linenum = phs.src_linenum;
si->has_insertionpoint = 0;
if (phs.insertion_point) {
si->has_insertionpoint = 1;
si->insertionpoint = phs.insertion_point;
}
si->has_ast_loc_id = 1;
si->ast_loc_id = phs.src_filename;
return si;
}
void on_fn_start(CPUState *cpu, target_ulong pc, const char *file_Name, const char *funct_name, unsigned long long lno){
struct args args = {cpu, file_Name, lno, 0};
dprintf("fn-start: %s() [%s], ln: %4lld, pc @ 0x" TARGET_FMT_lx "\n",funct_name,file_Name,lno,pc);
pri_funct_livevar_iter(cpu, pc, (liveVarCB) pfun, (void *)&args);

void lava_attack_point(PandaHypercallStruct phs) {
if (pandalog) {
Panda__AttackPoint *ap = (Panda__AttackPoint *)malloc(sizeof(Panda__AttackPoint));

This was taken from taint2_hypercalls.cpp. Judging from an old LAVA pandalog, this attack point implementation appears to be needed.

Comment on lines +488 to +493
panda_require("hypercaller");
void * hypercaller = panda_get_plugin_by_name("hypercaller");
register_hypercall_t register_hypercall = (register_hypercall_t) dlsym(hypercaller, "register_hypercall");
register_hypercall(LAVA_MAGIC, lava_hypercall);

printf("[pri_taint] This plugin is activated!\n");

I worked with @jamcleod on this, and it got pri_taint working with the new hypercall system.
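
The registration above ties a magic value to a handler. As a rough, self-contained model of that idea, a registry keyed by magic value might look like the sketch below. This is not hypercaller's actual implementation: every name prefixed with toy_ is hypothetical, and the handler signature is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins: none of these names are PANDA's real API.
using toy_hypercall_t = void (*)(void *cpu);

static std::unordered_map<uint32_t, toy_hypercall_t> toy_handlers;

// Registers a handler for a magic value.
// Returns false if that magic value already has a handler.
bool toy_register_hypercall(uint32_t magic, toy_hypercall_t handler) {
    assert(handler != nullptr);
    return toy_handlers.emplace(magic, handler).second;
}

// Looks up the handler for a magic value and invokes it.
// Returns true if a handler was found and called.
bool toy_dispatch_hypercall(uint32_t magic, void *cpu) {
    auto it = toy_handlers.find(magic);
    if (it == toy_handlers.end()) {
        return false;
    }
    it->second(cpu);
    return true;
}

// Example handler that only counts how often it runs.
static int toy_lava_calls = 0;
void toy_lava_hypercall(void *) { ++toy_lava_calls; }
```

In this model, a plugin calls toy_register_hypercall once at init time, and the dispatcher routes each guest hypercall to the matching handler by its magic value.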

Comment on lines +1057 to +1059
if (strstr(file_callee, "hypercall.h")) {
return;
}

Made sure that code coming from hypercall.h is ignored as well.

Comment on lines +2469 to 2476
// First attempt to load debug symbols for main process
bb_count++;
if ((bb_count % 100) == 0) {
if (correct_asid(cpu) && !main_exec_initialized) {
main_exec_initialized = ensure_main_exec_initialized(cpu);
}
}


The big change in dwarf2 addresses the fact that it did not reliably load debug symbols. I took this code from loaded_libs: essentially, every 100 blocks it attempts to load all debug symbols. Yes, it slows the recording, but it now works consistently.

All other changes are formatting, comments, added debug prints, or replacing auto with the actual type to make the code easier to read.
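
The retry-every-100-blocks idea can be sketched in isolation. The class below is a hypothetical illustration, not code from dwarf2: PeriodicInit and its members are made-up names, and the try_init callback stands in for whatever actually loads the symbols.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical helper: retries an initialization step every `period` calls
// until it succeeds, then stops retrying.
class PeriodicInit {
public:
    PeriodicInit(uint64_t period, std::function<bool()> try_init)
        : period_(period), try_init_(std::move(try_init)) {}

    // Call once per basic block; returns whether initialization has succeeded.
    bool on_block() {
        ++count_;
        if (!done_ && count_ % period_ == 0) {
            done_ = try_init_();
        }
        return done_;
    }

private:
    uint64_t period_;
    std::function<bool()> try_init_;
    uint64_t count_ = 0;
    bool done_ = false;
};
```

Once the callback succeeds, the per-block cost in this sketch drops to an increment and a flag check, so the slowdown is limited to the period before the symbols are found.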

Comment on lines +237 to +254
hwaddr pa = panda_virt_to_phys(cpu, buf + offset);
if ((int) pa != (hwaddr) -1) {
ram_addr_t RamOffset = RAM_ADDR_INVALID;
if (PandaPhysicalAddressToRamOffset(&RamOffset, pa, false) != MEMTX_OK) {
dprintf("[pri_taint] can't query va=0x%" PRIx64 " pa=0x%lx: physical map is not RAM.\n", (uint64_t) buf + offset, pa);
continue;
}
else {
Addr a;
if (loc_t == LocMem) {
a = make_maddr(RamOffset);
}
else {
a = make_greg(buf, offset);
}
if (taint2_query(a)) {
if (loc_t == LocMem) {
dprintf("\"%s\" @ 0x" TARGET_FMT_lx " is tainted\n", astnodename, buf + offset);

Once again, this was copied from file_taint and taint2_hypercalls: it obtains the RamOffsets needed to locate the tainted data.

@AndrewQuijano requested a review from jamcleod on June 23, 2025 15:42

Successfully merging this pull request may close these issues.

loaded plugin not supported for x86-64
2 participants