[AGNTLOG-229] De-emphasize legacy auto multiline aggregation #29944

Open
wants to merge 3 commits into
base: master
5 changes: 0 additions & 5 deletions config/_default/menus/main.en.yaml
@@ -439,11 +439,6 @@ menu:
url: agent/logs/log_transport
parent: agent_logs
weight: 503
-  - name: Multi-Line Detection (Legacy)
-    identifier: multi_line_detection_legacy
-    url: agent/logs/auto_multiline_detection_legacy
-    parent: agent_logs
-    weight: 504
- name: Multi-Line Detection
identifier: multi_line_detection
url: agent/logs/auto_multiline_detection
34 changes: 17 additions & 17 deletions content/en/agent/logs/auto_multiline_detection.md
@@ -16,11 +16,11 @@ algolia:

## Overview

-Automatic multi-line detection allows the Agent to detect and aggregate common multi-line logs automatically.
+Automatic multi-line detection allows the Agent to detect and aggregate common multi-line logs automatically.

## Getting started

-To enable the Auto multi-line feature in your Agent configuration, set `auto_multi_line_detection` to `true` in your configuration file, or set the `DD_LOGS_CONFIG_AUTO_MULTI_LINE_DETECTION=true` environment variable:
+To enable the Auto multi-line feature in your Agent configuration, set `auto_multi_line_detection` to `true` in your configuration file, or set the `DD_LOGS_CONFIG_AUTO_MULTI_LINE_DETECTION=true` environment variable:

{{< tabs >}}
{{% tab "Configuration file" %}}
@@ -39,8 +39,8 @@ DD_LOGS_CONFIG_AUTO_MULTI_LINE_DETECTION=true
### Default settings
By default, the following features are enabled:

-- `enable_datetime_detection`: This configures automatic datetime aggregation. Logs beginning with a datetime format are used to aggregate logs.
-- `enable_json_detection`: This configures JSON detection and rejection. JSON-structured logs are never aggregated.
+- `enable_datetime_detection`: This configures automatic datetime aggregation. Logs beginning with a datetime format are used to aggregate logs.
+- `enable_json_detection`: This configures JSON detection and rejection. JSON-structured logs are never aggregated.

You can disable these features by setting the following to `false` in your configuration file or in your environment variable:

@@ -63,7 +63,7 @@ DD_LOGS_CONFIG_AUTO_MULTI_LINE_ENABLE_JSON_DETECTION=false
```
{{% /tab %}}
{{< /tabs >}}


### Enable multi-line aggregation per integration

@@ -99,17 +99,17 @@ logs:

### Supported datetime formats

-Auto multi-line detection uses an algorithm to detect *any* datetime format that occurs in the first 60 bytes of a log line. To prevent false positives, the algorithm requires enough context to consider a datetime format a match.
+Auto multi-line detection uses an algorithm to detect *any* datetime format that occurs in the first 60 bytes of a log line. To prevent false positives, the algorithm requires enough context to consider a datetime format a match.

-Your datetime format must include both a _date_ and _time_ component to be detected.
+Your datetime format must include both a _date_ and _time_ component to be detected.

Examples of valid formats that include enough context to be detected:
- `2021-03-28 13:45:30`
- `2023-03-28T14:33:53.743350Z`
- `Jun 14 15:16:01`
- `2024/05/16 19:46:15`
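
The requirement that a date and a time both appear within the first 60 bytes can be illustrated with a small Python sketch. This is not the Agent's implementation: the regular expressions, the `has_datetime_prefix` helper, and its `window` parameter are hypothetical and cover only formats shaped like the examples listed here.

```python
import re

# Assumption: illustrative patterns only. The Agent's real detector is not
# regex-based and recognizes many more datetime formats than these.
DATE = r"(?:\d{4}[-/]\d{2}[-/]\d{2}|[A-Z][a-z]{2} +\d{1,2})"
TIME = r"\d{2}:\d{2}:\d{2}"
DATETIME = re.compile(rf"{DATE}[T ]{TIME}")


def has_datetime_prefix(line: str, window: int = 60) -> bool:
    """Return True if a date followed by a time occurs in the first `window` bytes."""
    # Truncate on bytes, then drop any partially cut multi-byte character.
    head = line.encode("utf-8")[:window].decode("utf-8", errors="ignore")
    return DATETIME.search(head) is not None


if __name__ == "__main__":
    for candidate in ("2021-03-28 13:45:30 started", "Jun 14 15:16:01 host", "12:30:20"):
        print(candidate, "->", has_datetime_prefix(candidate))
```

A bare time or a bare date does not satisfy the combined pattern, which mirrors the "enough context" rule described in this section.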

-Examples of formats that do not have enough context to be detected:
+Examples of formats that do not have enough context to be detected:
- `12:30:2017`
- `12:30:20`
- `2024/05/16`
@@ -119,11 +119,11 @@ Examples of formats that do not have enough context to be detected:

If datetime aggregation is insufficient or your format is too short to be detected automatically, you can customize the feature in two ways:
- [Custom Samples](#custom-samples)
-- [Regex Patterns](#regex-patterns)
+- [Regex Patterns](#regex-patterns)

### Custom samples

-A custom sample is a sample of a log on which you want to aggregate. For example, if you want to aggregate a stack trace, the first line of the stack trace would be a good sample to provide. Custom samples are an easier way to aggregate logs than regex patterns.
+A custom sample is a sample of a log on which you want to aggregate. For example, if you want to aggregate a stack trace, the first line of the stack trace would be a good sample to provide. Custom samples are an easier way to aggregate logs than regex patterns.

To configure custom samples, you can use the `logs_config` in your `datadog.yaml` file or set an environment variable. In the following example, the multi-line detection is looking for the sample `"SEVERE Main main Exception occurred"`:

@@ -160,22 +160,22 @@ java.lang.Exception: Something bad happened!

#### How custom samples work

-Custom samples tokenize the first 60 bytes of a log line and also tokenize the provided sample.
-Tokens include
+Custom samples tokenize the first 60 bytes of a log line and also tokenize the provided sample.
+Tokens include
- words and their length
- whitespace
- numbers and their length
- special characters
-- datetime components.
+- datetime components.

Each log token is compared to each token in the sample. If 75% of the log’s tokens match the sample’s, the log is marked for aggregation.
-Datadog recommends using sample-based matching if your logs have a stable format. If you need more flexible matching, you can use regex.
+Datadog recommends using sample-based matching if your logs have a stable format. If you need more flexible matching, you can use regex.
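
The token comparison described above can be approximated with the following Python sketch. It is a simplified model, not the Agent's code: the token classes, the position-by-position comparison, the choice to divide by the shorter token sequence, and the names `tokenize` and `matches_sample` are all assumptions made for illustration, and datetime components are not modeled.

```python
import re

# Assumption: four simplified token classes (words, numbers, whitespace,
# single special characters). Datetime tokens are left out of this sketch.
TOKEN = re.compile(r"[A-Za-z]+|\d+|\s+|.", re.DOTALL)


def tokenize(line: str, window: int = 60) -> list[str]:
    """Turn the first `window` bytes of a line into coarse tokens."""
    head = line.encode("utf-8")[:window].decode("utf-8", errors="ignore")
    tokens = []
    for text in TOKEN.findall(head):
        if text.isalpha():
            tokens.append(f"word{len(text)}")  # word and its length
        elif text.isdigit():
            tokens.append(f"num{len(text)}")   # number and its length
        elif text.isspace():
            tokens.append("space")
        else:
            tokens.append(text)                # special character, kept as-is
    return tokens


def matches_sample(line: str, sample: str, threshold: float = 0.75) -> bool:
    """Return True if enough tokens line up, position by position."""
    log_tokens = tokenize(line)
    sample_tokens = tokenize(sample)
    compared = min(len(log_tokens), len(sample_tokens))
    if compared == 0:
        return False
    hits = sum(a == b for a, b in zip(log_tokens, sample_tokens))
    return hits / compared >= threshold
```

Because words are reduced to their length, a line does not need to be character-for-character identical to the sample, which is why sample-based matching suits logs with a stable shape.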

### Regex patterns

-Regex patterns work similarly to a `multi_line` rule. If the regex pattern matches the log, it is used for aggregation.
+Regex patterns work similarly to a `multi_line` rule. If the regex pattern matches the log, it is used for aggregation.
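
As a rough illustration of this behavior, the Python sketch below starts a new aggregated entry whenever a line matches the pattern and appends every other line to the current entry. The `aggregate` function and the sample lines are hypothetical, and the Agent's real aggregator also applies limits and labels that are not modeled here.

```python
import re
from typing import Iterable, Iterator


def aggregate(lines: Iterable[str], start_pattern: str) -> Iterator[str]:
    """Group lines into entries; a line matching `start_pattern` opens a new entry."""
    start = re.compile(start_pattern)
    buffer: list[str] = []
    for line in lines:
        # A matching line closes the entry collected so far.
        if start.match(line) and buffer:
            yield "\n".join(buffer)
            buffer = []
        buffer.append(line)
    if buffer:
        yield "\n".join(buffer)


if __name__ == "__main__":
    raw = [
        "SEVERE Main main Exception occurred",
        "java.lang.Exception: Something bad happened!",
        "    at Main.main(Main.java:5)",  # hypothetical stack frame
        "SEVERE Main main Exception occurred",
    ]
    for entry in aggregate(raw, r"SEVERE"):
        print(repr(entry))
```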

-To configure custom regex patterns, you can use the `logs_config` in your `datadog.yaml` file or set an environment variable.
+To configure custom regex patterns, you can use the `logs_config` in your `datadog.yaml` file or set an environment variable.

{{< tabs >}}
{{% tab "Configuration file" %}}
@@ -243,7 +243,7 @@ Auto multi-line detection uses a labeled aggregation system to aggregate logs. T

### Label configuration

-You can provide custom labels to each regex or sample to change the aggregation behavior based on the label rules. This is useful if you want to explicitly include or exclude certain log formats in multi-line aggregation.
+You can provide custom labels to each regex or sample to change the aggregation behavior based on the label rules. This is useful if you want to explicitly include or exclude certain log formats in multi-line aggregation.

{{< tabs >}}
{{% tab "Configuration file" %}}
16 changes: 6 additions & 10 deletions content/en/agent/logs/auto_multiline_detection_legacy.md
@@ -1,6 +1,6 @@
---
-title: Automatic Multi-line Detection and Aggregation (Legacy)
-description: Use the Datadog Agent to detect and aggregate multi-line logs automatically
+title: (Legacy) Automatic Multi-line Detection and Aggregation
+description: (Legacy) Use the Datadog Agent to detect and aggregate multi-line logs automatically
further_reading:
- link: "/logs/guide/getting-started-lwl/"
tag: "Documentation"
@@ -27,11 +27,7 @@ algolia:
tags: ['advanced log filter']
---

-<div class="alert alert-warning">This document applies to Agent versions earlier than <strong>v7.65.0</strong>, or when the legacy auto multi-line detection is explicitly enabled.
-
-For more recent Agent versions, we offer new auto-multiline implementation that improve multiline detection with support for arbitrary timestamps, JSON aggregation, and per-integration configuration. , see <a href="/agent/logs/auto_multiline_detection">Auto Multi-line Detection and Aggregation</a>.
-
-If you are sending lots of multi-line logs and you are unsure of their format or don't have the means to configure all sources individually, you should use automatic multi-line detection. If you know the specific format of your logs, it's recommended to use manual multi-line rules for more precise control. See <a href="/agent/logs/advanced_log_collection/#manually-aggregate-multi-line-logs">Manually aggregate multi-line logs</a> for details.</div>
+<div class="alert alert-warning">This document applies to Agent versions earlier than <strong>v7.65.0</strong>, or when the legacy auto multi-line detection is explicitly enabled. For newer Agent versions, please see <a href="/agent/logs/auto_multiline_detection">Auto Multi-line Detection and Aggregation</a>.</div>

## Global automatic multi-line aggregation
With Agent 7.37+, you can enable `auto_multi_line_detection` to automatically detect [common multi-line patterns][1] across **all** configured log integrations.
@@ -193,9 +189,9 @@ In a containerized Agent, add the environment variable `DD_LOGS_CONFIG_AUTO_MULT
### Custom threshold
The `auto_multi_line_default_match_threshold` parameter determines how closely logs have to match the patterns in order for the auto multi-line aggregation to work.

-If your multi-line logs are not getting aggregated as expected, you can change the sensitivity of the matching by setting the `auto_multi_line_default_match_threshold` parameter.
+If your multi-line logs are not getting aggregated as expected, you can change the sensitivity of the matching by setting the `auto_multi_line_default_match_threshold` parameter.

-Add the `auto_multi_line_default_match_threshold` parameter to your configuration file with a value lower (to increase matches) or higher (to decrease matches) than the current threshold value.
+Add the `auto_multi_line_default_match_threshold` parameter to your configuration file with a value lower (to increase matches) or higher (to decrease matches) than the current threshold value.

To find the current threshold value, run the [Agent `status` command][4].

@@ -263,7 +259,7 @@ datadog:
{{< /tabs >}}

## Detection process
-Automatic multi-line detection detects logs that begin and comply with the following date/time formats:
+Automatic multi-line detection detects logs that begin and comply with the following date/time formats:
- ANSIC
- RFC822
- RFC822Z