ticdc: add header line for CSV protocol #21417

Open: wants to merge 3 commits into base: master

20 changes: 20 additions & 0 deletions ticdc/ticdc-csv.md
@@ -28,6 +28,7 @@ quote = '"'
```
null = '\N'
include-commit-ts = true
output-old-value = false
output-field-header = false
```

## Transactional constraints
@@ -51,6 +52,12 @@ In the CSV file, each column is defined as follows:
- Column 5: The `is-update` column exists only when the value of `output-old-value` is true. It identifies whether the row data change comes from an UPDATE event (the value of the column is true) or an INSERT/DELETE event (the value is false).
- Column 6 to the last column: One or more columns with data changes.

When `output-field-header = true`, the CSV file includes a header row. The column names in the header row are as follows:

| Column 1 | Column 2 | Column 3 | Column 4 (optional) | Column 5 (optional) | Column 6 | ... | Last column |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `ticdc-meta$operation` | `ticdc-meta$table` | `ticdc-meta$schema` | `ticdc-meta$commit-ts` | `ticdc-meta$is-update` | The first column with data changes | ... | The last column with data changes |
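
The header layout in the table above can be sketched in a few lines of Python. This is an illustration only, not TiCDC's implementation: the function name `build_header` and its parameters are hypothetical, and the sketch assumes that column 4 is emitted when `include-commit-ts` is true and column 5 when `output-old-value` is true.

```python
def build_header(data_columns, include_commit_ts=True, output_old_value=False):
    """Return the header row names in the order given by the table above.

    Hypothetical helper: the flags mirror the `include-commit-ts` and
    `output-old-value` options and are assumed to control columns 4 and 5.
    """
    header = ["ticdc-meta$operation", "ticdc-meta$table", "ticdc-meta$schema"]
    if include_commit_ts:
        header.append("ticdc-meta$commit-ts")  # optional column 4
    if output_old_value:
        header.append("ticdc-meta$is-update")  # optional column 5
    header.extend(data_columns)  # the columns with data changes
    return header


# Data columns taken from the header row of the `hr.employee` example.
employee_columns = ["Id", "LastName", "FirstName", "HireDate", "OfficeLocation"]
full_header = build_header(employee_columns, include_commit_ts=True, output_old_value=True)
```

With both options enabled, `",".join(full_header)` should match the header line of the `hr.employee` example that has `output-field-header = true`.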

Assume that table `hr.employee` is defined as follows:

```sql
@@ -85,6 +92,19 @@ When `include-commit-ts = true` and `output-old-value = true`, the DML events of
"I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing"
```

When `include-commit-ts = true`, `output-old-value = true`, and `output-field-header = true`, the DML events of this table are stored in the CSV format as follows:

```
ticdc-meta$operation,ticdc-meta$table,ticdc-meta$schema,ticdc-meta$commit-ts,ticdc-meta$is-update,Id,LastName,FirstName,HireDate,OfficeLocation
"I","employee","hr",433305438660591626,false,101,"Smith","Bob","2014-06-04","New York"
"D","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Shanghai"
"I","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Los Angeles"
"D","employee","hr",433305438660591629,false,101,"Smith","Bob","2017-03-13","Dallas"
"I","employee","hr",433305438660591630,false,102,"Alex","Alice","2017-03-14","Shanghai"
"D","employee","hr",433305438660591630,true,102,"Alex","Alice","2017-03-14","Beijing"
"I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing"
```
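
A downstream consumer can use the header row to look up fields by name instead of by position. The following is a minimal sketch using Python's standard `csv` module on the first three data rows of the example above. The function name `read_events` is hypothetical, and the sketch assumes the default `,` delimiter and the `"` quote character from the example configuration. It does not handle the `null = '\N'` marker, because the `csv` module does not report whether a field was quoted.

```python
import csv
import io

# Header row plus the first three data rows from the example above.
SAMPLE = '''ticdc-meta$operation,ticdc-meta$table,ticdc-meta$schema,ticdc-meta$commit-ts,ticdc-meta$is-update,Id,LastName,FirstName,HireDate,OfficeLocation
"I","employee","hr",433305438660591626,false,101,"Smith","Bob","2014-06-04","New York"
"D","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Shanghai"
"I","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Los Angeles"
'''


def read_events(text):
    """Parse CSV text with a header row into a list of dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter=",", quotechar='"')
    events = []
    for row in reader:
        # Every field is read as a string; convert the two metadata fields.
        row["ticdc-meta$commit-ts"] = int(row["ticdc-meta$commit-ts"])
        row["ticdc-meta$is-update"] = row["ticdc-meta$is-update"] == "true"
        events.append(row)
    return events


events = read_events(SAMPLE)
for event in events:
    print(event["ticdc-meta$operation"], event["ticdc-meta$is-update"], event["OfficeLocation"])
```

Because the lookup is by name, the same code keeps working if `include-commit-ts` or `output-old-value` changes which optional columns are present, as long as it only reads columns that exist in the header.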

## Data type mapping

| MySQL type | CSV type | Example | Description |