-
Notifications
You must be signed in to change notification settings - Fork 26
chore: introduce metrics change #307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
metrics. Customers can then ask for updates to the implementations | ||
CT provides or customers can go an implement their own interfaces that are fine-tuned | ||
to their use cases. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a middle ground? For example, logging information locally without uploading it anywhere.
will feel like; getting no choice on the matter and opting to not upgrade. | ||
Going from never emitting metrics to always emitting them says to customers | ||
that their application no matter its use case will always benefit from metrics. | ||
Without letting customers make that choice, CT looses hard earned customer trust. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without letting customers make that choice, CT looses hard earned customer trust. | |
Without letting customers make that choice, CT loses hard earned customer trust. |
### Issue 1: What will be the default behavior? | ||
|
||
As a client-side encryption library CT should be as cautious as possible. | ||
Customers of CT libraries should be on the driver seat and determine for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Customers of CT libraries should be on the driver seat and determine for | |
Customers of CT libraries should be in the driver's seat and determine for |
english is silly
become unsupported. | ||
|
||
Additionally, requiring customers to start emitting metrics | ||
almost certainly guarantees a breaking change across supported libraries. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How so? If we require that all customer facing APIs now require a metric agent, then you can't pick up this change in a minor version because you are required to make a code change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, not that I am advocating for this, but we could just provide a metric agent.
I am against it, but there is probably a way to do this in a non-breaking way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh sure, I apparently forgot this was in the context of specific API changes, I retract my objection
at customer facing API call sites. | ||
|
||
Currently, the ESDK client APIs models are defined [here](https://github.com/aws/aws-encryption-sdk/blob/mainline/AwsEncryptionSDK/dafny/AwsEncryptionSdk/Model/esdk.smithy#L60-L126). | ||
This change would see that the client APIs be changed as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can't be the only change. I like the detail, but we should explicitly add the CMM/Keyring inputs to this list.
non-breaking way as the Metrics Agent will be an optional parameter | ||
at customer facing API call sites. | ||
|
||
Currently, the ESDK client APIs models are defined [here](https://github.com/aws/aws-encryption-sdk/blob/mainline/AwsEncryptionSDK/dafny/AwsEncryptionSdk/Model/esdk.smithy#L60-L126). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that the DB-ESDK item encryptor would be similar.
What about the DDB client interface?
Also, are there plans for S3EC?
### label | ||
|
||
A label is a string that is used | ||
as a an attribute name to aggregate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as a an attribute name to aggregate | |
as an attribute name to aggregate |
to either a local application or to an observability service like AWS CloudWatch. | ||
|
||
As client side encryption libraries emitting metrics must be done carefully as | ||
to avoid accidentally [leaking](https://github.com/aws/aws-encryption-sdk-python/pull/105/files) any information related to the plaintext that could lead to a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we want to link to this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am planning on removing it before merging but wanted it here for folks to get context.
## Summary | ||
|
||
Existing users of Crypto Tools (CT) libraries do no have any insights as to | ||
how the librar(y/ies) behave(s) in their application. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how the librar(y/ies) behave(s) in their application. | |
how the libraries behave in their application. |
simplify, we know what you mean
@required | ||
label: String, | ||
@required | ||
duration: Long, // Duration in milliseconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like Ryan's Duration model in Lockbox that would allow users to specify the unit: https://github.com/aws/private-aws-encryption-sdk-dafny-staging/blob/seebees/lockbox-smithy/AwsLockbox/dafny/AwsLockbox/Model/structures.smithy#L17-L22
@josecorella I have a meta/macro question. I appreciate that the Background doc highlights issues and alternatives, but I feel like we a missing a "User Stories" document, that can be used to measure success criteria and what are the table stakes of this work. It is also possible I just missed such a proposal doc; but without it, it is difficult to work backwards. |
Additionally, requiring customers to start emitting metrics | ||
almost certainly guarantees a breaking change across supported libraries. | ||
|
||
### Issue 2: Should Data Plane APIs fail if metrics fail to publish? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about latency?
This is somewhat relevant to the discussion around availability, particularly in technical terms.
What I'm getting at is, are we planning to make blocking calls to CT or wherever, or have a separate thread/pool that does so periodically? Do we get this for free from the CT metrics agent (or whatever) or do we have to write this ourselves?
In any case we should specify what the bar is here for performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 . We should delegate the whole metric stuff to some known frame work. Maintaining this would be difficult and in any case we shouldn't be reinventing the wheel.
operation AddDate { | ||
input: AddDateInput, | ||
output: AddOutput, | ||
errors: [MetricsPutError] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: You say that the interface should not error, but you have errors here.
// Common output structure | ||
structure AddOutput {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get why you might optimize this. But is this really the best choice? why not have a output per operation?
## Issues and Alternatives | ||
|
||
Crypto Tools (CT) publishes software libraries. The latest | ||
versions of these libraries have no logging or metrics publishing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit/Issue: We have not defined metrics.
While the definition might seem obvious to some, metrics can either be a description of the library's performance in the customer's application OR they could be telementry on the usage of Crypto Tools products.
Having read the docs, it is clear that is NOT telementry, and maybe it is only Tony Knapp who gets the two confused, but we could clarify this.
|
||
This change will allow Crypto Tools to introduce a Metrics Agent in a | ||
non-breaking way as the Metrics Agent will be an optional parameter | ||
at customer facing API call sites. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we support providing a metrics agent to the client constructor? Such that the same instance would then be used for each API call.
I don't necessarily think we should but it might be worth discussing here.
.MaterialProvidersConfig(MaterialProvidersConfig.builder().build()) | ||
.build(); | ||
|
||
final IKeyring rawAesKeyring = matProv.CreateRawAesKeyring(keyringInput); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we implementing our own metric agent? I'm pretty sure there exists some framework which we can leverage. Customers can optionally provide the agent, or we do no-op (like logging frameworks such as slf4j etc).
algorithmSuiteId: aws.cryptography.materialProviders#ESDKAlgorithmSuiteId, | ||
|
||
frameLength: FrameLength, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this probably isn't possible in Smithy/Smithy-Dafny but we might want to consider having a more open-ended "request override" option for things like this. (This is the same way the SDK does it.)
Potential Issue/Alternative/User Story: Users of the MPL/ESDK/DB-ESDK are also users of the AWS SDKs. The AWS SDKs have established logging and metric interfaces. There likely is an implicit customer expectation that Crypto Tools products behave and appear to be consistent with the AWS SDKs. Therefore, I suggest we carefully evaluate if we can utilize the SDKs metric and logging tooling, and offer a customer experience that closely mimics the SDKs experience. The current collection of docs does not state this as a goal, but it does leave it open as an opportunity. i.e: the proposed metric interface could wrap an SDK metric class. |
For best reading and commenting experience, I suggest splitting your window in two; the review page and the rendered page.
Here are the rendered files:
Goals for 9-15-2025 Spec Review:
Issue #, if available:
Description of changes:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Check any applicable: