DataDog/go-libddwaf

go-libddwaf

The goal of this project is to provide a higher-level API for the Go bindings to libddwaf, DataDog's in-app WAF. It consists of two separate parts: the bindings for the calls to libddwaf, and the encoder, whose job is to convert any Go value to its libddwaf object representation.

An example usage would be:

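package main

import (
    _ "embed"
    "encoding/json"

    waf "github.com/DataDog/go-libddwaf/v4"
    // timer subpackage (import path assumed from the module layout)
    "github.com/DataDog/go-libddwaf/v4/timer"
)

// The embed directive needs the path of the ruleset file, which is left out here.
//go:embed
var ruleset []byte

func main() {
    // Parse the embedded JSON ruleset into a generic Go value.
    var parsedRuleset any

    if err := json.Unmarshal(ruleset, &parsedRuleset); err != nil {
        panic(err)
    }

    builder, err := waf.NewBuilder("", "")
    if err != nil {
        panic(err)
    }
    // Plain assignment: err is already declared above, so ":=" would not compile.
    _, err = builder.AddOrUpdateConfig(parsedRuleset)
    if err != nil {
        panic(err)
    }

    // Build the WAF handle from the configured builder.
    wafHandle := builder.Build()
    if wafHandle == nil {
        panic("WAF handle is nil")
    }
    defer wafHandle.Close()

    wafCtx := wafHandle.NewContext(timer.WithUnlimitedBudget(), timer.WithComponent("waf", "rasp"))
    defer wafCtx.Close()

    // Run the WAF against persistent address data.
    matches, actions := wafCtx.Run(waf.RunAddressData{
        Persistent: map[string]any{
            "server.request.path_params": "/rfiinc.txt",
        },
        TimerKey: "waf",
    })
    _, _ = matches, actions // inspect the results here
}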

The API documentation details can be found on pkg.go.dev.

Originally, this project only provided CGO wrappers for the calls to libddwaf. With the arrival of the tree-like ddwaf_object structure, and with the intention to build CGO-less bindings, the project has grown into a fully integrated brick of the DataDog tracer structure. This in turn made it necessary to document the project so that it can be maintained in an orderly fashion.

Supported platforms

This library currently supports the following OS/architecture pairs:

OS     Arch
Linux  amd64
Linux  aarch64
OSX    amd64
OSX    arm64

On any other platform, top-level functions return a WafDisabledError that states why the WAF is disabled.

Note that:

  • Linux support includes both the glibc and musl variants
  • OSX versions below 10.9 are not supported
  • A build tag named datadog.no_waf can be manually added to force the WAF to be disabled.
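
As an illustration of the table above, the supported pairs can be written with Go's own runtime.GOOS and runtime.GOARCH naming, where OSX is darwin and aarch64 is arm64. This is only a minimal sketch: isSupportedTarget is a hypothetical helper, not the check go-libddwaf actually performs, and it ignores both the OSX version and the datadog.no_waf build tag.

```go
package main

import (
	"fmt"
	"runtime"
)

// isSupportedTarget is a hypothetical helper that mirrors the platform table:
// Linux and OSX (darwin), each on amd64 and aarch64 (arm64 in Go's naming).
func isSupportedTarget(goos, goarch string) bool {
	switch goos {
	case "linux", "darwin":
		return goarch == "amd64" || goarch == "arm64"
	default:
		return false
	}
}

func main() {
	// Report whether the platform this program was compiled for is in the table.
	fmt.Printf("%s/%s in the supported table: %t\n",
		runtime.GOOS, runtime.GOARCH, isSupportedTarget(runtime.GOOS, runtime.GOARCH))
}
```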

Design

The WAF bindings have multiple moving parts that are necessary to understand:

  • Builder: an object wrapper over the pointer to the C WAF Builder
  • Handle: an object wrapper over the pointer to the C WAF Handle
  • Context: an object wrapper over a pointer to the C WAF Context
  • Encoder: constructs a tree of WAF objects to send to the WAF
  • Decoder: transforms WAF objects returned by the WAF into regular Go values (e.g. maps, arrays, ...)
  • Library: the low-level Go bindings to the C library, providing improved typing
flowchart LR
    START:::hidden -->|NewBuilder| Builder -->|Build| Handle

    Handle -->|NewContext| Context

    Context -->|Encode Inputs| Encoder

    Handle -->|Encode Ruleset| Encoder
    Handle -->|Init WAF| Library
    Context -->|Decode Result| Decoder

    Handle -->|Decode Init Errors| Decoder

    Context -->|Run| Library
    Encoder -->|Allocate Waf Objects| runtime.Pinner

    Library -->|Call C code| libddwaf

    classDef hidden display: none;

runtime.Pinner

When passing Go values to the WAF, the memory must remain valid and must not move until the WAF no longer holds any pointer to it. We ensure this with a runtime.Pinner. Persistent address data is added to a Context-associated runtime.Pinner, while ephemeral address data is managed by a transient runtime.Pinner that only exists for the duration of the call.
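
The two pinning lifetimes described above can be sketched with the standard runtime.Pinner type (available since Go 1.21). Everything else in this example is an assumption made for illustration: wafObject, pinnedContext, run and close are hypothetical stand-ins, not the library's types, and the address names in main are made up.

```go
package main

import (
	"fmt"
	"runtime"
)

// wafObject is a hypothetical stand-in for an encoded WAF object.
type wafObject struct {
	address string
	value   string
}

// pinnedContext is a hypothetical stand-in for a Context that owns a
// long-lived pinner for persistent data.
type pinnedContext struct {
	pinner runtime.Pinner
	kept   []*wafObject
}

// run pins persistent objects for the lifetime of the context and ephemeral
// objects only for the duration of this call. It returns the number of
// objects that stay pinned after the call.
func (c *pinnedContext) run(persistent, ephemeral []*wafObject) int {
	var transient runtime.Pinner
	defer transient.Unpin() // ephemeral data is released when the call ends

	for _, o := range ephemeral {
		transient.Pin(o)
	}
	for _, o := range persistent {
		c.pinner.Pin(o)
		c.kept = append(c.kept, o)
	}
	// A real implementation would call into the C library here.
	return len(c.kept)
}

// close releases everything pinned on behalf of the context.
func (c *pinnedContext) close() {
	c.pinner.Unpin()
	c.kept = nil
}

func main() {
	ctx := &pinnedContext{}
	defer ctx.close()

	persistent := []*wafObject{{address: "server.request.path_params", value: "/rfiinc.txt"}}
	ephemeral := []*wafObject{{address: "example.ephemeral", value: "x"}}
	fmt.Println("objects still pinned after run:", ctx.run(persistent, ephemeral))
}
```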

Typical call to Run()

Here is an example of the flow of operations on a simple call to Run():

  • Encode input data into WAF Objects and store references in the temporary pool
  • Lock the context mutex until the end of the call
  • Store references from the temporary pool into the context level pool
  • Call ddwaf_run
  • Decode the matches and actions
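
The ordering of these steps can be sketched as follows. This is a simplified model under stated assumptions, not the library's implementation: encoded, sketchContext and matchRFI are hypothetical names, and matchRFI merely stands in for ddwaf_run by reporting the addresses whose value contains "rfiinc".

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// encoded is a hypothetical stand-in for a WAF object built by the encoder.
type encoded struct{ address, value string }

// sketchContext is a hypothetical stand-in for the library's Context.
type sketchContext struct {
	mu   sync.Mutex
	pool []*encoded // context-level pool, kept until the context is closed
}

// matchRFI stands in for ddwaf_run in this sketch.
func matchRFI(objects []*encoded) []string {
	var matched []string
	for _, o := range objects {
		if strings.Contains(o.value, "rfiinc") {
			matched = append(matched, o.address)
		}
	}
	return matched
}

// run follows the five steps listed above.
func (c *sketchContext) run(inputs map[string]string, wafRun func([]*encoded) []string) []string {
	// 1. Encode the inputs, keeping references in a temporary pool.
	temp := make([]*encoded, 0, len(inputs))
	for address, value := range inputs {
		temp = append(temp, &encoded{address, value})
	}

	// 2. Lock the context until the end of the call.
	c.mu.Lock()
	defer c.mu.Unlock()

	// 3. Move the references into the context-level pool.
	c.pool = append(c.pool, temp...)

	// 4. Call the WAF (a stand-in here).
	raw := wafRun(temp)

	// 5. "Decode" the result; sorting makes the output deterministic.
	sort.Strings(raw)
	return raw
}

func main() {
	ctx := &sketchContext{}
	matches := ctx.run(map[string]string{
		"server.request.path_params": "/rfiinc.txt",
	}, matchRFI)
	fmt.Println(matches)
}
```

Encoding before taking the lock keeps the critical section short, which matches the order of the first two steps in the list.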

CGO-less C Bindings

This library uses purego to implement C bindings without requiring use of CGO at compilation time. The high-level workflow is to embed the C shared library using go:embed, dump it into a file, open the library using dlopen, load the symbols using dlsym, and finally call them. On Linux systems, using memfd_create(2) enables the library to be loaded without writing to the filesystem.

libddwaf also requires an FHS-compliant filesystem on your machine and, on Linux, the libc.so.6, libpthread.so.0, and libdl.so.2 dynamic libraries.

⚠️ Keep in mind that purego only works on linux/darwin for amd64/arm64 and so does go-libddwaf.

Contributing pitfalls

  • On OSX, the library cannot be opened with dlopen twice during the application's lifetime. Doing so messes with Thread Local Storage and usually ends with a std::bad_alloc()
  • keepAlive() calls are there to prevent the GC from collecting objects too early
  • Since there is a stack switch between the Go code and the C code, usually the only C stacktrace you will ever get is from GDB
  • If a segfault happens during a call to the C code, the stack trace of the goroutine that made the call is the one annotated with [syscall]
  • GoLand does not support CGO_ENABLED=0 (as of June 2023)
  • Keep in mind that we fully escape the type system. If you send the wrong data, a segfault is the best case, and it is not guaranteed!
  • The structs in ctypes.go are here to reproduce the memory layout of the structs in include/ddwaf.h because pointers to these structs will be passed directly
  • Do not use uintptr values obtained from unsafe.Pointer casts of Go values as function argument or result types: they escape the pointer analysis, which can produce wrongly optimized code and crashes. Pointer arithmetic is of course necessary in such a library, but it must stay within a single function scope.
  • GDB is available on arm64 but is not officially supported so it usually crashes pretty fast (as of June 2023)
  • Never send pointers to stack variables to the C code, because Go stacks can be moved during the C call. More on this here
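
The uintptr pitfall above can be illustrated with a small sketch. elementAt is a hypothetical helper, not code taken from ctypes.go. It keeps the pointer arithmetic inside one function, so only typed pointers cross the function boundary and no address is ever carried around as a uintptr.

```go
package main

import (
	"fmt"
	"unsafe"
)

// elementAt returns a pointer to the i-th element of a C-style array that
// starts at base. The offset is the only uintptr involved, and it never
// holds an address: the conversion from and back to a typed pointer happens
// in a single expression, within this function.
func elementAt(base *int32, i int) *int32 {
	return (*int32)(unsafe.Add(unsafe.Pointer(base), uintptr(i)*unsafe.Sizeof(*base)))
}

func main() {
	values := []int32{10, 20, 30}
	// Callers only ever see *int32 values, which the compiler and GC track.
	fmt.Println(*elementAt(&values[0], 2))
}
```

The caller must still guarantee that i stays within the bounds of the underlying array, since nothing in this helper checks it.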

Debugging

Debug logging can be enabled for the underlying C/C++ library by setting the DD_APPSEC_WAF_LOG_LEVEL environment variable to one of: trace, debug, info, warn (or warning), error, off (which is the default behavior and logs nothing).

The DD_APPSEC_WAF_LOG_FILTER environment variable can be set to a valid (per the regexp package) regular expression to limit logging to only messages that match the regular expression.
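
Before setting DD_APPSEC_WAF_LOG_FILTER, it can help to try the expression against a few sample lines with the regexp package. The sketch below assumes nothing about how the library applies the filter internally: filterMessages is a hypothetical helper and the sample messages are made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// filterMessages is a hypothetical helper that keeps only the messages
// matching pattern, and reports an error if pattern is not a valid
// expression for the regexp package.
func filterMessages(pattern string, messages []string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, fmt.Errorf("invalid filter %q: %w", pattern, err)
	}
	var kept []string
	for _, m := range messages {
		if re.MatchString(m) {
			kept = append(kept, m)
		}
	}
	return kept, nil
}

func main() {
	// Made-up sample messages, only used to try the expression.
	samples := []string{"rule loaded", "context created", "rule matched"}

	kept, err := filterMessages("^rule", samples)
	if err != nil {
		panic(err)
	}
	fmt.Println(kept)
}
```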