refactor: remove tracer dependencies to support dsm sqs -> lambda #612
Conversation
This stack of pull requests is managed by Graphite.
datadog_lambda/dsm.py
Outdated
)
except Exception as e:
    logger.error(format_err_with_traceback(e))
arn = record.get("eventSourceARN", "")
We removed the try/except here. Is there a reason for that? (Maybe there is and I don't see it.)
We want to make sure our instrumentation never prevents the Lambda from executing, even if there is an issue with the instrumentation.
Agreed, we should move this inside the try/except.
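A minimal sketch of what "inside the try/except" could look like, so that even the ARN lookup cannot break the invocation. The function name `set_dsm_context_safely` and the injected `set_checkpoint` callable are hypothetical and only keep the example self-contained; they are not the code in this PR.

```python
import logging

logger = logging.getLogger(__name__)


def set_dsm_context_safely(record, set_checkpoint):
    """Run DSM instrumentation for one record without ever raising.

    `set_checkpoint` is a hypothetical stand-in for the real checkpoint call.
    Returns True when the checkpoint was set, False when anything failed.
    """
    try:
        # Everything that touches the record lives inside the try block,
        # including the ARN lookup flagged in the review.
        arn = record.get("eventSourceARN", "")
        set_checkpoint("sqs", arn)
        return True
    except Exception as e:
        # Log and swallow: instrumentation must not stop the handler.
        logger.error("Unable to set dsm context: %s", e)
        return False
```

With this shape, a malformed record (for example `None`) is logged and skipped instead of propagating an exception into the Lambda handler.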
tests/test_dsm.py
Outdated
},
}

result = _get_dsm_context_from_lambda(message)
[nit] newline here
Wait for tracer version 3.9.2 before we can merge.
datadog_lambda/dsm.py
Outdated
context_json = None
message_attributes = message.get("messageAttributes")
if not message_attributes:
    logger.debug("DataStreams skipped lambda message: %r", message)
It looks like we're logging debug messages multiple times for the same record.
datadog_attr = message_attributes["_datadog"]

if "stringValue" in datadog_attr:
We should do a type check here to ensure this is a dict.
Just to clarify: we are checking that context_json is a dict, right? I think context_json is the only one we need to verify is a dict; the test you asked me to write also suggested that to me.
We should also make sure datadog_attr is a dict.
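For illustration, here is one way both checks discussed in this thread could be combined: `datadog_attr` is validated before it is indexed, and the decoded `context_json` is validated before it is returned. The function name `get_dsm_context_from_sqs_record` is hypothetical, and the record shape is assumed from the snippets quoted in this review.

```python
import json


def get_dsm_context_from_sqs_record(record):
    """Return the decoded DSM context dict, or None if the shape is unexpected."""
    message_attributes = record.get("messageAttributes")
    if not isinstance(message_attributes, dict):
        return None

    datadog_attr = message_attributes.get("_datadog")
    # First type check: `"stringValue" in datadog_attr` is only safe on a dict.
    if not isinstance(datadog_attr, dict):
        return None

    raw = datadog_attr.get("stringValue")
    if not isinstance(raw, str):
        return None

    try:
        context_json = json.loads(raw)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None

    # Second type check: valid JSON can still be a list, string, or number.
    if not isinstance(context_json, dict):
        return None
    return context_json
```

Returning `None` for every unexpected shape keeps the caller's logic to a single `if not context_json` check.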
datadog_lambda/dsm.py
Outdated
datadog_attr = message_attributes["_datadog"]

if "stringValue" in datadog_attr:
    # SQS -> lambda
We can use the event_type to avoid doing unnecessary work. We should already mostly know the shape of the event. Otherwise, this method is going to get insanely large.
I would recommend creating a separate _get_dsm_context for each event type.
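The suggestion could be sketched as a small dispatch table keyed by event type, with one extractor per type. The SQS extractor follows the record shape quoted in this review; the Kinesis extractor, its payload shape (base64-encoded JSON with a `_datadog` key), the `"sqs"`/`"kinesis"` keys, and the `get_dsm_context` entry point are assumptions made purely for illustration.

```python
import base64
import json


def _get_dsm_context_from_sqs_lambda(record):
    attr = (record.get("messageAttributes") or {}).get("_datadog")
    if not isinstance(attr, dict) or "stringValue" not in attr:
        return None
    context = json.loads(attr["stringValue"])
    return context if isinstance(context, dict) else None


def _get_dsm_context_from_kinesis_lambda(record):
    # ASSUMPTION: the payload is base64-encoded JSON carrying a "_datadog" key.
    data = (record.get("kinesis") or {}).get("data")
    if not data:
        return None
    payload = json.loads(base64.b64decode(data))
    context = payload.get("_datadog") if isinstance(payload, dict) else None
    return context if isinstance(context, dict) else None


# One extractor per event type; each only knows about its own record shape.
_EXTRACTORS = {
    "sqs": _get_dsm_context_from_sqs_lambda,
    "kinesis": _get_dsm_context_from_kinesis_lambda,
}


def get_dsm_context(event_type, record):
    """Pick the extractor for `event_type`; unknown types yield None."""
    extractor = _EXTRACTORS.get(event_type)
    return extractor(record) if extractor else None
```

Adding a new event source then means adding one small function and one dictionary entry, rather than another branch in a single growing method.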
},
{
    "eventSourceARN": "arn:aws:sqs:us-east-1:123456789012:queue3",
    "body": "Message 3",
    "messageAttributes": {
        "_datadog": {
            "stringValue": json.dumps(
We should add another test where the stringValue isn't a dict.
datadog_lambda/dsm.py
Outdated
context_json = _get_dsm_context_from_lambda(record)
if not context_json:
    logger.debug("DataStreams skipped lambda message: %r", record)
    return
This line is untested.
datadog_lambda/dsm.py
Outdated
    carrier_get = _create_carrier_get(context_json)
    set_consume_checkpoint(type, arn, carrier_get, manual_checkpoint=False)
except Exception as e:
    logger.error(f"Unable to set dsm context: {e}")
This line is untested. We should add a test verifying that if setting the checkpoint raises an exception, this method properly captures the error.
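A test along these lines might look like the sketch below. To stay self-contained it does not import ddtrace: the checkpoint function is injected as a parameter, which differs from the PR (where `set_consume_checkpoint` is imported inside the function). The body of `_create_carrier_get` and the logger name `dsm_sketch` are likewise assumptions.

```python
import logging
from unittest import mock

logger = logging.getLogger("dsm_sketch")


def _create_carrier_get(context_json):
    # Assumed shape: a getter that looks keys up in the decoded context.
    def carrier_get(key):
        return context_json.get(key)

    return carrier_get


def set_dsm_context_for_record(context_json, event_type, arn, set_consume_checkpoint):
    try:
        carrier_get = _create_carrier_get(context_json)
        set_consume_checkpoint(event_type, arn, carrier_get, manual_checkpoint=False)
    except Exception as e:
        logger.error("Unable to set dsm context: %s", e)


def test_checkpoint_error_is_logged_not_raised():
    # The checkpoint call blows up; the function under test must not.
    failing = mock.Mock(side_effect=RuntimeError("boom"))
    arn = "arn:aws:sqs:us-east-1:123456789012:queue1"
    with mock.patch.object(logger, "error") as mock_error:
        set_dsm_context_for_record({"k": "v"}, "sqs", arn, failing)
    failing.assert_called_once()
    mock_error.assert_called_once()
    assert "Unable to set dsm context" in mock_error.call_args[0][0]
```

In the real module the same effect could be had by patching the imported checkpoint function with `mock.patch` instead of injecting it.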
datadog_lambda/dsm.py
Outdated
payload_size = calculate_sqs_payload_size(record)
context_json = _get_dsm_context_from_sqs_lambda(record)
if not context_json:
    return
Should you continue instead of return?
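To make the difference concrete: inside a loop over records, `return` abandons every remaining record as soon as one lacks a context, while `continue` skips only that record. A minimal sketch, with the hypothetical helper `collect_contexts` and an injected `get_context` callable standing in for the real extractor:

```python
def collect_contexts(records, get_context):
    """Gather the DSM context of every record that has one."""
    contexts = []
    for record in records:
        context_json = get_context(record)
        if not context_json:
            # `continue` moves on to the next record; a `return` here would
            # silently drop all records after the first one without context.
            continue
        contexts.append(context_json)
    return contexts
```

In a batch of three records where only the second has no context, this version still processes the third.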
datadog_lambda/dsm.py
Outdated
    logger.debug("DataStreams did not handle lambda message: %r", message)
    return None
else:
    logger.debug("DataStreams did not handle lambda message: %r", message)
I would recommend making each of these log lines slightly different. That way when one is encountered, it is easy to find the exact line of code where it was produced. Otherwise, we don't know what the actual issue was.
datadog_lambda/dsm.py
Outdated
    return None
else:
    logger.debug(
        "DataStreams did not handle lambda message: %r, no dsm context", message
nit: put the %r at the end; the message itself could be quite long.
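Both logging suggestions (a distinct message per branch, and the `%r` payload last) could be combined roughly as follows. The function name `has_dsm_attribute`, the logger name, and the exact message wording are illustrative assumptions, not the text used in the PR.

```python
import logging

logger = logging.getLogger("dsm_sketch")


def has_dsm_attribute(message):
    """Report whether the message carries a `_datadog` attribute."""
    attrs = message.get("messageAttributes")
    if not attrs:
        # The reason comes first and is unique to this branch; the
        # potentially long payload goes last.
        logger.debug(
            "DataStreams skipped lambda message, no messageAttributes: %r", message
        )
        return False
    if "_datadog" not in attrs:
        logger.debug(
            "DataStreams skipped lambda message, no _datadog attribute: %r", message
        )
        return False
    return True
```

Because each reason string is unique, searching the codebase for the logged text leads to exactly one line.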
def _set_dsm_context_for_record(context_json, type, arn):
    from ddtrace.data_streams import set_consume_checkpoint
We need to be sure to set a minimum version for the ddtrace dependency. To do that, you'll want to find the first version of ddtrace that includes this set_consume_checkpoint. Then update pyproject.toml with this version.
Got it! Thanks for letting me know how this is done
What does this PR do?
Removes dependencies on internal tracer code. get_dsm_context() is moved into the Lambda layer, and the public-facing API for setting checkpoints is used instead of the tracer's internal DSM code.
Motivation
Helps decouple the Lambda layer code from the tracer code, in keeping with the best practice of not relying on internal implementation code.
Testing Guidelines
Function was properly unit tested
Additional Notes
IMPORTANT: This PR cannot be merged until the tracer releases a version that includes DataDog/dd-trace-py#13646, which adds the manual_checkpoint parameter to set_consume_checkpoint().