docs/prefect integration #3037
djudjuu commented Aug 26, 2025
- prefect integration docs
- decomposition helper
- rewrite
- decomposition image
❌ Deploy Preview for dlt-hub-docs failed.
d566473 to 2ed2b15
For stability reasons, this actually runs one resource alone and then all others in parallel.
This is because otherwise, on the first run, all resources would try to create the same dlt-tables.
:::
:::warning
@ShreyasGS read this (under this comment)
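For readers of this thread, the ordering described in the quoted note (one resource alone, then the rest in parallel) can be sketched roughly as below. This is only an illustration with the standard library, not the helper added in this PR: `run_first_then_parallel`, `fake_run`, and the resource names are hypothetical, and a thread pool stands in for whatever concurrency the real helper uses.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_first_then_parallel(names, run_one, max_workers=4):
    """Run names[0] alone, then run the remaining names concurrently."""
    if not names:
        return []
    first, rest = names[0], names[1:]
    # Only this first, solitary run may need to create shared tables.
    results = [run_one(first)]
    if rest:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # pool.map returns results in input order.
            results.extend(pool.map(run_one, rest))
    return results


# Hypothetical stand-in for "run the pipeline for one resource".
events = []
events_lock = threading.Lock()


def fake_run(name):
    with events_lock:
        events.append(("start", name))
    with events_lock:
        events.append(("end", name))
    return f"{name}: done"


results = run_first_then_parallel(["customers", "orders", "payments"], fake_run)
```

Because the first call finishes before the pool is created, no other resource can race it while the shared tables are being set up.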
Looks very exciting!
General comments:
- I think in most cases dlt should be written without backticks (https://www.notion.so/dlthub/Documentation-Writing-Guide-87f4fcc32655460c83bcaf7787d11e67?source=copy_link#2288bedf0d7043e4bc1b4a0af7897b1f)
- Sentence case for any headers
- Please put it through a grammar checker; I'm really bad at detecting grammar errors
## Key features

- **Prefect Collector:** a dedicated way to do real-time [progress monitoring] and summary reports after each pipeline stage
Why is [progress monitoring] in brackets?
Thanks, I thought this could be a link but ultimately removed it: https://dlthub.com/docs/general-usage/pipeline#monitor-the-loading-progress
### Schema Change Reports

The `PrefectCollector` will also create artifacts when schema changes are detected.
Should we elaborate on the artifacts? What's inside them?
Prefect has built-in functionality to [include logs from other libraries](https://docs.prefect.io/v3/advanced/logging-customization#include-logs-from-other-libraries) and display them as part of their UI.

You can tell prefect to include `dlt`'s logs by setting the corresponding prefect environment variable, for example by adding this to your `.env` file:
```sh
# ...
```
nit: Could you add the `secrets.toml` version as well?
Interesting. It's actually a Prefect configuration, not a dlt one, but I can stress that.
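To make the "Prefect setting, not dlt config" point concrete, here is a minimal sketch of setting it from Python instead of a `.env` file. It assumes the variable in question is Prefect's `PREFECT_LOGGING_EXTRA_LOGGERS` (a comma-separated list of logger names) and that dlt logs under the logger name `dlt`; the excerpt above is cut off before the actual value, so treat both as assumptions. `add_extra_logger` is a hypothetical helper, not part of Prefect or dlt.

```python
import os

# Assumption: this is the Prefect setting the docs excerpt refers to.
EXTRA_LOGGERS_KEY = "PREFECT_LOGGING_EXTRA_LOGGERS"


def add_extra_logger(name, env=os.environ, key=EXTRA_LOGGERS_KEY):
    """Append `name` to a comma-separated logger list without duplicating it."""
    current = [part.strip() for part in env.get(key, "").split(",") if part.strip()]
    if name not in current:
        current.append(name)
    env[key] = ",".join(current)
    return env[key]


# Demo on a plain dict so the real process environment stays untouched.
demo_env = {EXTRA_LOGGERS_KEY: "dask"}
demo_value = add_extra_logger("dlt", env=demo_env)
```

In a real flow this would have to run before Prefect reads its settings, which is why a `.env` file or deployment-level environment variable is usually the simpler place for it.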
## Runner integration

The `PrefectCollector` integrates seamlessly with the [dlt+ runner](../production/pipeline-runner.md).
Suggested change:
- The `PrefectCollector` integrates seamlessly with the [dlt+ runner](../production/pipeline-runner.md).
+ The `PrefectCollector` integrates seamlessly with the [dlt+ Runner](../production/pipeline-runner.md).
### Pipeline Retries

Prefect retry-mechanism is not a perfect fit for dlt pipelines.
I suggest first mentioning that the dlt+ Runner fixes the problem and then explaining it :)
The dlt+ Runner provides a retry configuration that ensures pipeline state and intermediate results are preserved across retry attempts.
This is important because Prefect’s default retry mechanism is not optimized for dlt pipelines. During execution, dlt generates intermediate files in the pipeline’s working directory. If a run fails and Prefect retries it, those files may no longer be available, for example if the retry happens on a different worker node or inside an ephemeral Docker container.
To do so, ...
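A toy illustration of why retrying inside the same process helps, in case it is useful for the docs. This is not the dlt+ Runner API: `run_with_retries` and `flaky_step` are hypothetical, and the "intermediate file" is just a text file standing in for what dlt keeps in the working directory.

```python
import tempfile
from pathlib import Path


def run_with_retries(step, working_dir, max_attempts=3):
    """Retry `step` in-process so every attempt sees the same working_dir."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    working_dir = Path(working_dir)
    working_dir.mkdir(parents=True, exist_ok=True)
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return step(working_dir, attempt)
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = exc
    raise last_error


def flaky_step(working_dir, attempt):
    """Fail once after 'extracting', then resume from the leftover file."""
    marker = working_dir / "extracted.txt"
    if not marker.exists():
        marker.write_text("rows")
        raise RuntimeError("load failed after extract")
    return f"loaded {marker.read_text()} on attempt {attempt}"


with tempfile.TemporaryDirectory() as tmp:
    outcome = run_with_retries(flaky_step, tmp)
```

The second attempt succeeds only because `extracted.txt` is still on disk; a retry scheduled on a fresh worker or container would start from an empty directory and repeat the extract.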
Co-authored-by: Violetta Mishechkina
b18cd50 to 413b953