This project's goal is to produce a higher-level API for the Go bindings to libddwaf, the DataDog in-app WAF. It consists of two separate entities: the bindings for the calls to libddwaf, and the encoder, whose job is to convert any Go value to its libddwaf object representation.
An example usage would be:
package main

import (
	_ "embed"
	"encoding/json"

	waf "github.com/DataDog/go-libddwaf/v4"
	"github.com/DataDog/go-libddwaf/v4/timer"
)

//go:embed
var ruleset []byte

func main() {
	var parsedRuleset any
	if err := json.Unmarshal(ruleset, &parsedRuleset); err != nil {
		panic(err)
	}

	builder, err := waf.NewBuilder("", "")
	if err != nil {
		panic(err)
	}

	// err is already declared above, so assign with = rather than :=
	_, err = builder.AddOrUpdateConfig(parsedRuleset)
	if err != nil {
		panic(err)
	}

	wafHandle := builder.Build()
	if wafHandle == nil {
		panic("WAF handle is nil")
	}
	defer wafHandle.Close()

	wafCtx := wafHandle.NewContext(timer.WithUnlimitedBudget(), timer.WithComponent("waf", "rasp"))
	defer wafCtx.Close()

	matches, actions := wafCtx.Run(waf.RunAddressData{
		Persistent: map[string]any{
			"server.request.path_params": "/rfiinc.txt",
		},
		TimerKey: "waf",
	})
	_, _ = matches, actions // inspect the matches and actions here
}
The API documentation details can be found on pkg.go.dev.
Originally, this project only provided CGO wrappers for the calls to libddwaf.
With the introduction of the ddwaf_object tree-like structure, and with the intention to build CGO-less bindings, the project has grown into a fully integrated brick in the DataDog tracer structure.
This in turn made it necessary to document the project so that it can be maintained in an orderly fashion.
This library currently supports the following platform pairs:
OS | Arch |
---|---|
Linux | amd64 |
Linux | aarch64 |
OSX | amd64 |
OSX | arm64 |
This means that when the platform is not supported, top-level functions will return a WafDisabledError error that states why the WAF is disabled.
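A caller will typically want to treat a disabled WAF as a non-fatal condition. The sketch below shows one way to do that with `errors.As`. It does not import the library: `wafDisabledError`, its `Reason` field, and `newBuilder` are hypothetical stand-ins, so the real type may be shaped differently.

```go
package main

import (
	"errors"
	"fmt"
)

// wafDisabledError is a local stand-in for the library's WafDisabledError.
// Its Reason field is an assumption made for this sketch.
type wafDisabledError struct {
	Reason string
}

func (e *wafDisabledError) Error() string {
	return "WAF disabled: " + e.Reason
}

// newBuilder is a hypothetical top-level function that fails on
// unsupported platforms.
func newBuilder(supported bool) (string, error) {
	if !supported {
		return "", &wafDisabledError{Reason: "unsupported OS/architecture"}
	}
	return "builder", nil
}

// wafAvailable reports whether the WAF can be used. A disabled WAF is
// reported as (false, nil); any other error is passed through.
func wafAvailable(supported bool) (bool, error) {
	_, err := newBuilder(supported)
	var disabled *wafDisabledError
	if errors.As(err, &disabled) {
		return false, nil
	}
	return err == nil, err
}

func main() {
	ok, err := wafAvailable(false)
	fmt.Println(ok, err)
}
```

Matching on the error type rather than on its message keeps the check stable if the wording of the reason changes.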
Note that:
- Linux support includes both the glibc and musl variants
- OSX versions below 10.9 are not supported
- A build tag named datadog.no_waf can be manually added to force the WAF to be disabled.
The WAF bindings have multiple moving parts that are necessary to understand:
- Builder: an object wrapper over the pointer to the C WAF Builder
- Handle: an object wrapper over the pointer to the C WAF Handle
- Context: an object wrapper over a pointer to the C WAF Context
- Encoder: its goal is to construct a tree of Waf Objects to send to the WAF
- Decoder: Transforms Waf Objects returned from the WAF to usual go objects (e.g. maps, arrays, ...)
- Library: The low-level go bindings to the C library, providing improved typing
flowchart LR
START:::hidden -->|NewBuilder| Builder -->|Build| Handle
Handle -->|NewContext| Context
Context -->|Encode Inputs| Encoder
Handle -->|Encode Ruleset| Encoder
Handle -->|Init WAF| Library
Context -->|Decode Result| Decoder
Handle -->|Decode Init Errors| Decoder
Context -->|Run| Library
Encoder -->|Allocate Waf Objects| runtime.Pinner
Library -->|Call C code| libddwaf
classDef hidden display: none;
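To make the Encoder and Decoder roles more concrete, here is a minimal sketch of converting Go values into a tree of typed nodes and back. The `object` type and the set of supported kinds are assumptions for illustration only. This is not the real ddwaf_object layout and not the library's encoder.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

type kind int

const (
	kindString kind = iota
	kindInt
	kindArray
	kindMap
)

// object is a hypothetical tree node; it is NOT the real ddwaf_object layout.
type object struct {
	kind     kind
	key      string // set when the node is a map entry
	str      string
	num      int64
	children []object
}

// encode converts a Go value into a tree of objects.
func encode(v any) (object, error) {
	switch x := v.(type) {
	case string:
		return object{kind: kindString, str: x}, nil
	case int:
		return object{kind: kindInt, num: int64(x)}, nil
	case []any:
		out := object{kind: kindArray}
		for _, item := range x {
			child, err := encode(item)
			if err != nil {
				return object{}, err
			}
			out.children = append(out.children, child)
		}
		return out, nil
	case map[string]any:
		keys := make([]string, 0, len(x))
		for k := range x {
			keys = append(keys, k)
		}
		sort.Strings(keys) // deterministic child order
		out := object{kind: kindMap}
		for _, k := range keys {
			child, err := encode(x[k])
			if err != nil {
				return object{}, err
			}
			child.key = k
			out.children = append(out.children, child)
		}
		return out, nil
	default:
		return object{}, fmt.Errorf("unsupported type %T", v)
	}
}

// decode converts an object tree back into plain Go values.
func decode(o object) any {
	switch o.kind {
	case kindString:
		return o.str
	case kindInt:
		return int(o.num)
	case kindArray:
		out := make([]any, 0, len(o.children))
		for _, c := range o.children {
			out = append(out, decode(c))
		}
		return out
	default: // kindMap
		out := make(map[string]any, len(o.children))
		for _, c := range o.children {
			out[c.key] = decode(c)
		}
		return out
	}
}

// roundTrips reports whether v survives an encode/decode cycle unchanged.
func roundTrips(v any) bool {
	o, err := encode(v)
	return err == nil && reflect.DeepEqual(decode(o), v)
}

func main() {
	input := map[string]any{
		"server.request.path_params": "/rfiinc.txt",
		"ids":                        []any{1, 2},
	}
	fmt.Println(roundTrips(input))
}
```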
When passing Go values to the WAF, it is necessary to make sure that memory remains valid and does
not move until the WAF no longer has any pointers to it. We do this by using a runtime.Pinner.
Persistent address data is added to a Context-associated runtime.Pinner, while ephemeral address
data is managed by a transient runtime.Pinner that only exists for the duration of the call.
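The two pinning lifetimes can be illustrated with the standard library alone. In this sketch, `buffer`, `pinnedContext`, `withPinned`, and `firstByte` are hypothetical names, not the library's own types. It needs Go 1.21 or later, where `runtime.Pinner` was introduced.

```go
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

// buffer is a hypothetical payload that would be handed to C code.
type buffer struct {
	data [16]byte
}

// pinnedContext keeps objects pinned until Close is called, like
// persistent address data tied to a context.
type pinnedContext struct {
	pinner runtime.Pinner
}

func (c *pinnedContext) keep(buf *buffer) { c.pinner.Pin(buf) }

// Close releases every object pinned through this context.
func (c *pinnedContext) Close() { c.pinner.Unpin() }

// withPinned pins buf only for the duration of fn, like ephemeral
// address data handled by a transient pinner.
func withPinned(buf *buffer, fn func(p unsafe.Pointer) byte) byte {
	var pinner runtime.Pinner
	pinner.Pin(buf)
	defer pinner.Unpin()
	return fn(unsafe.Pointer(buf))
}

// firstByte reads the first byte behind p, standing in for C code
// that dereferences the pointer it was given.
func firstByte(p unsafe.Pointer) byte {
	return *(*byte)(p)
}

func main() {
	persistent := &buffer{}
	ctx := &pinnedContext{}
	ctx.keep(persistent)
	defer ctx.Close()

	ephemeral := &buffer{}
	ephemeral.data[0] = 42
	fmt.Println(withPinned(ephemeral, firstByte))
}
```

Every pinned object must eventually be unpinned, which is why the context-level pinner is released in `Close`.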
Here is an example of the flow of operations on a simple call to Run():
- Encode input data into WAF Objects and store references in the temporary pool
- Lock the context mutex until the end of the call
- Store references from the temporary pool into the context level pool
- Call ddwaf_run
- Decode the matches and actions
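The order of these steps can be sketched in plain Go. Everything here is a hypothetical stand-in: `wafContext`, `encodedInput`, and `fakeRun` (which replaces the real ddwaf_run call with a trivial substring check) only exist to show the sequencing, not how the library implements it.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// encodedInput is a hypothetical stand-in for an encoded WAF object.
type encodedInput struct {
	address string
	value   string
}

// wafContext is a hypothetical context whose pool keeps references
// alive across calls.
type wafContext struct {
	mu   sync.Mutex
	pool []encodedInput
}

// fakeRun stands in for ddwaf_run: it reports the address of every
// input whose value contains "rfiinc".
func fakeRun(inputs []encodedInput) []string {
	var raw []string
	for _, in := range inputs {
		if strings.Contains(in.value, "rfiinc") {
			raw = append(raw, in.address)
		}
	}
	return raw
}

func (c *wafContext) run(data map[string]string) []string {
	// 1. Encode the inputs, keeping references in a temporary pool.
	temp := make([]encodedInput, 0, len(data))
	for addr, v := range data {
		temp = append(temp, encodedInput{address: addr, value: v})
	}

	// 2. Lock the context mutex until the end of the call.
	c.mu.Lock()
	defer c.mu.Unlock()

	// 3. Move the references into the context-level pool.
	c.pool = append(c.pool, temp...)

	// 4. Call the (fake) WAF.
	raw := fakeRun(temp)

	// 5. Decode the result; here that only means sorting for a stable order.
	sort.Strings(raw)
	return raw
}

func main() {
	ctx := &wafContext{}
	matches := ctx.run(map[string]string{
		"server.request.path_params": "/rfiinc.txt",
		"server.request.query":       "page=1",
	})
	fmt.Println(matches, len(ctx.pool))
}
```

Encoding happens before the lock is taken, so concurrent callers only serialize on the part that touches shared context state.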
This library uses purego to implement C bindings without requiring use of CGO at compilation time. The high-level workflow
is to embed the C shared library using go:embed, dump it into a file, open the library using dlopen, load the
symbols using dlsym, and finally call them. On Linux systems, using memfd_create(2) enables the library to be loaded without
writing to the filesystem.
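The "dump it into a file" step can be sketched with the standard library alone. The dlopen and dlsym steps are left out because they need purego, and the memfd_create(2) variant is not shown. `embeddedLib`, `dumpToTempFile`, and `fileSize` are hypothetical names, and the placeholder bytes stand in for what go:embed would provide.

```go
package main

import (
	"fmt"
	"os"
)

// embeddedLib stands in for the bytes that go:embed would provide.
// It is a placeholder, not a loadable shared library.
var embeddedLib = []byte("\x7fELF placeholder")

// dumpToTempFile writes lib into a new temporary file. It returns the
// file path and a cleanup function that removes the file.
func dumpToTempFile(lib []byte) (path string, cleanup func(), err error) {
	f, err := os.CreateTemp("", "libddwaf-*.so")
	if err != nil {
		return "", nil, err
	}
	remove := func() { _ = os.Remove(f.Name()) }
	if _, err := f.Write(lib); err != nil {
		_ = f.Close()
		remove()
		return "", nil, err
	}
	if err := f.Close(); err != nil {
		remove()
		return "", nil, err
	}
	return f.Name(), remove, nil
}

// fileSize returns the size in bytes of the file at path.
func fileSize(path string) (int64, error) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

func main() {
	path, cleanup, err := dumpToTempFile(embeddedLib)
	if err != nil {
		panic(err)
	}
	defer cleanup()

	size, err := fileSize(path)
	if err != nil {
		panic(err)
	}
	// A real loader would now pass path to dlopen and resolve symbols with dlsym.
	fmt.Println("wrote", size, "bytes to", path)
}
```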
Another requirement of libddwaf is to have an FHS filesystem on your machine and, for Linux, to provide libc.so.6,
libpthread.so.0, and libdl.so.2 as dynamic libraries.
⚠️ Keep in mind that purego only works on linux/darwin for amd64/arm64 and so does go-libddwaf.
- Cannot dlopen twice in the app lifetime on OSX. It messes with Thread Local Storage and usually finishes with a std::bad_alloc()
- keepAlive() calls are here to prevent the GC from destroying objects too early
- Since there is a stack switch between the Go code and the C code, usually the only C stacktrace you will ever get is from GDB
- If a segfault happens during a call to the C code, the goroutine stacktrace which has done the call is the one annotated with [syscall]
- GoLand does not support CGO_ENABLED=0 (as of June 2023)
- Keep in mind that we fully escape the type system. If you send the wrong data it will segfault in the best cases but not always!
- The structs in ctypes.go are here to reproduce the memory layout of the structs in include/ddwaf.h because pointers to these structs will be passed directly
- Do not use uintptr as function arguments or results types, coming from unsafe.Pointer casts of Go values, because they escape the pointer analysis which can create wrongly optimized code and crash. Pointer arithmetic is of course necessary in such a library but must be kept in the same function scope.
- GDB is available on arm64 but is not officially supported so it usually crashes pretty fast (as of June 2023)
- No pointer to variables on the stack shall be sent to the C code because Go stacks can be moved during the C call. More on this here
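Two of the points above, mirroring a C struct layout and keeping pointer arithmetic inside one function, can be illustrated with a small sketch. The `pair` struct is hypothetical and does not correspond to any struct in ctypes.go or include/ddwaf.h.

```go
package main

import (
	"fmt"
	"unsafe"
)

// pair mimics a C struct made of two 64-bit fields. It is a hypothetical
// example, not one of the structs from ctypes.go.
type pair struct {
	first  uint64
	second uint64
}

// layoutMatches checks that the Go struct has the size and field offset
// that the equivalent C struct would have.
func layoutMatches() bool {
	var p pair
	return unsafe.Sizeof(p) == 16 && unsafe.Offsetof(p.second) == 8
}

// secondField reads p.second through pointer arithmetic. The offset is
// applied with unsafe.Add inside this function, so no uintptr holding an
// address is ever passed to or returned from another function.
func secondField(p *pair) uint64 {
	base := unsafe.Pointer(p)
	return *(*uint64)(unsafe.Add(base, unsafe.Offsetof(p.second)))
}

func main() {
	p := &pair{first: 1, second: 2}
	fmt.Println(layoutMatches(), secondField(p))
}
```

Because the function takes a typed `*pair` and returns a plain value, the garbage collector can track the pointer for the whole computation.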
Debug logging can be enabled for the underlying C/C++ library by setting the DD_APPSEC_WAF_LOG_LEVEL
environment variable (when building or testing) to one of: trace, debug, info, warn (or warning), error, off
(which is the default behavior and logs nothing).
The DD_APPSEC_WAF_LOG_FILTER environment variable can be set to a valid (per the regexp package)
regular expression to limit logging to only messages that match the regular expression.
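As an illustration of how the two variables combine, the sketch below parses a level and a filter and decides whether a message would be logged. It is not the library's implementation: `parseLevel`, `compileFilter`, and `shouldLog` are hypothetical helpers, and both the case-insensitive parsing and the fallback to off for unknown values are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
	"strings"
)

type level int

const (
	levelTrace level = iota
	levelDebug
	levelInfo
	levelWarn
	levelError
	levelOff
)

// parseLevel maps the documented values to a level. Unknown or empty
// values fall back to off (an assumption of this sketch).
func parseLevel(s string) level {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "trace":
		return levelTrace
	case "debug":
		return levelDebug
	case "info":
		return levelInfo
	case "warn", "warning":
		return levelWarn
	case "error":
		return levelError
	default:
		return levelOff
	}
}

// compileFilter compiles expr with the regexp package. An empty
// expression yields a nil filter, which matches every message.
func compileFilter(expr string) (*regexp.Regexp, error) {
	if expr == "" {
		return nil, nil
	}
	return regexp.Compile(expr)
}

// shouldLog reports whether msg at msgLevel passes the threshold and filter.
func shouldLog(threshold, msgLevel level, filter *regexp.Regexp, msg string) bool {
	if threshold == levelOff || msgLevel < threshold {
		return false
	}
	return filter == nil || filter.MatchString(msg)
}

func main() {
	threshold := parseLevel(os.Getenv("DD_APPSEC_WAF_LOG_LEVEL"))
	filter, err := compileFilter(os.Getenv("DD_APPSEC_WAF_LOG_FILTER"))
	if err != nil {
		panic(err)
	}
	fmt.Println(shouldLog(threshold, levelError, filter, "rule loaded"))
}
```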