feat: new claimsService and new merge behavior #881

marcoreni · 2023-02-09T09:49:01Z

Closes/fixes

Checklist

This PR makes changes to the public API
I have included links for closing relevant issue numbers

First chunk of #877 splitting. This PR introduces the new merge behavior and moves all the claims related logic inside a dedicated service

codecov · 2023-02-09T10:03:19Z

Codecov Report

Base: 77.69% // Head: 77.92% // Increases project coverage by +0.23% 🎉

Coverage data is based on head (1d76800) compared to base (da8c80d).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #881      +/-   ##
==========================================
+ Coverage   77.69%   77.92%   +0.23%     
==========================================
  Files          44       45       +1     
  Lines        1690     1708      +18     
  Branches      331      335       +4     
==========================================
+ Hits         1313     1331      +18     
  Misses        340      340              
  Partials       37       37

Flag	Coverage Δ
unittests	`77.92% <100.00%> (+0.23%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
src/SigninRequest.ts	`100.00% <ø> (ø)`
src/TokenClient.ts	`96.15% <ø> (-0.08%)`	⬇️
src/ClaimsService.ts	`100.00% <100.00%> (ø)`
src/OidcClient.ts	`93.33% <100.00%> (+0.11%)`	⬆️
src/OidcClientSettings.ts	`92.53% <100.00%> (+0.47%)`	⬆️
src/ResponseValidator.ts	`88.23% <100.00%> (-2.39%)`	⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

src/OidcClientSettings.ts

src/ClaimsService.ts

src/OidcClientSettings.ts

pamapa · 2023-02-09T10:14:19Z

Nice work! Only some small things to fix...

src/OidcClientSettings.ts

brockallen · 2023-02-10T13:46:50Z

The intent is to merge the identity claims in the id_token and the userinfo endpoint. And it's difficult to merge those if you have duplicates/conflicts in the sets of values (which unfortunately some token servers do). But those values "merged" all in one place are useful.

The merge used to work fairly well, then at some point someone came along and submitted a PR (of course, for their specific scenario) that actually made things worse for most everyone else. I am pretty sure I reverted it, but I don't recall now, so the state of things might have not been ideal.

pamapa · 2023-02-10T16:51:17Z

At the end it might be better to not merge at all, but store both claims objects within the user object. Something like:

profile["id_token"] = {id token claims}
profile["user_info"] = {user info claims}
or
claims_id_token = {id token claims}
claims_user_info =  {user info claims}
or better naming...

Like this the user of the library can decide what is the best to choose and we could still calculate an additional claims object, which is the merged of both if mergeClaims: true. How to merge still needs to be defined first, as it is not trivial.

brockallen · 2023-02-10T17:16:03Z

Well, the original implementation was decent, so you might check the history and compare to what it does now. But what I'd suggest then is to leave how it used to work (pre-PR I mentioned) and then make that extensible so people can override if needed. I still suggest putting it all in one place is important, as it shields the client code from needing to understand the details of claim in the id_token vs claims in the user info endpoint. And that might change as well in the token server based on config, something which the client logic will benefit being shielded from.

pamapa · 2023-02-13T07:52:11Z

I still suggest putting it all in one place is important, as it shields the client code from needing to understand the details of claim in the id_token vs claims in the user info endpoint.

I came to the same conclusion. Lets keep this simple and stupid. In case of merging the only question is; what is happening with arrays. Currently we are aggregating them, which is not the best solution (e.g. role: ["admin"]+role: ["user"]). As such we allow the user of this library to choose what shall happen with arrays (aggregate or replace):

not exists -> add
array -> aggregate or replace
other -> replace
object (not array) -> merge

I am not so sure if the current aggregate array behavior makes sense in the long run...

Badisi · 2023-02-13T12:33:44Z

@pamapa, do not forget about the upcoming refreshUserInfo() feature.

We will have 2 use cases:

merge after sign-in (where we will receive and merge up-to-dates IdToken and UserInfo)
merge when refreshing user info (where we will receive an up-to-date UserInfo but merge it in an outdated UserProfile)

In case 2, we might have received something in the IdToken at the beginning that is no more true in the new UserInfo (like a role that was removed to the user). So in this case : array -> substract

pamapa · 2023-02-13T13:31:07Z

In case 2, we might have received something in the IdToken at the beginning that is no more true in the new UserInfo (like a role that was removed to the user). So in this case : array -> substract

The idea is work there only with array replace, either aggregate does not exist or we throw an error. At the end its probably easier to only support array replace. In my opinion that should be enough to make refreshUserInfo work!

Maybe it makes sense for the refreshUserInfo use case to start again with the old IdToken and then merge it with the fresh userInfo, instead of using the already merge object...

marcoreni · 2023-02-15T14:35:52Z

@pamapa I agree that "starting from scratch" while updating userinfo is better, but this line makes me wonder... What happens if isOpenId is false?

pamapa · 2023-02-16T11:17:18Z

but this line makes me wonder... What happens if isOpenId is false?

if openid is not true (no "openid" in scope), means there is no id_token. Thus userInfo will be the first to start the user.profile

brockallen · 2023-02-16T14:34:59Z

Thus userInfo will be the first to start the user.profile

Userinfo doesn't exist without OIDC, and thus there is no user profile without OIDC.

marcoreni · 2023-02-17T10:43:48Z

So, trying to recap:
1- "Refresh user info" will create profile from scratch, re-decoding id_token claims and merging userinfo claims. The updated data will be stored just like today.
2- merge function, when in new behavior mode, will perform the following:

a. not exists -> add
b. array -> replace (since userinfo should be a superset of idtoken claims, we should be alright)
c. object (not array) -> merge
d. other -> replace

3- merge function will be used in both "signin scenario" and "refreshuserinfo scenario", with the same behavior

Questions:
1- 2c. (object.merge) is currently (as of v2.2.1) controlled by setting mergeClaims flag. Are we saying that in the "new world" we want this as the standard behavior and not to be controlled by any flag?
2- Have I missed anything?

(I'll have to close both this PR and the other one since the desiderata is quite different from what I implemented. I hope to be able to work on this soon.)

Badisi · 2023-02-20T08:03:08Z

The current behavior of the mergeClaims settings is misleading to me as a merge still occur even when set to false (ie. 2a, 2b, 2d). But if mergeClaims: false equals not merge at all, then getting UserInfo would be completely useless, as the info would never go up to the user. So either we remove the option (my personal choice) or we rename it to reflect the fact that it is only doing 2c.
I suspect @pamapa's thumb up to go for complete removal too. @pamapa, can you confirm ?
I don't think so, grazie for the recap 😉

marcoreni · 2023-02-21T17:35:47Z

So!

The new "merge behavior" was not so different from what was originally implemented in this PR. I

implemented the merge function discussed
deprecated mergeClaims and not considered it in the new implementation

Hopefully, this is good to go.

I see there are some conflicts, let me know if you need me to rebase.

pamapa · 2023-02-27T12:38:49Z

please resolve the conflicts otherwise i can not start the unit-tests...

… merge behavior

marcoreni · 2023-02-27T13:07:20Z

@pamapa Rebased, squashed and ready to go.

pamapa · 2023-02-27T15:36:22Z

src/OidcClientSettings.ts

     * Otherwise, they are added to an array as distinct objects for the claim type. (default: false)
+     * @deprecated since version 2.3.0. It's not a deterministic way of handling claims.
     */
    mergeClaims?: boolean;


Maybe it makes sense to keep this settings in the long turn in the way it was supposed:

From the description:

Indicates if objects returned from the user info endpoint as claims (e.g. `address`) are merged into the claims from the id token as a single object. Otherwise, they are added to an array as distinct objects for the claim type. (default: false)

-> true = merged, false = added/appended as an array (the otherwise clause).

-> mergeClaims = true -> the new behavior of array handling
-> mergeClaims = false -> the legacy behavior

We keep the default as is and in a major release we change the default to true.
The mergeClaimsLegacyBehavior is then not needed

What do you think? Does that make sense?

If we agree on the above it makes sense to rename legacyMergeClaims to appendClaims, remove if (typeof value === "object" && this._settings.mergeClaims) {... and not call it from mergeClaims. Instead we make and if/else on the caller side depending on mergeClaims.
-> easier to understand and easier to follow the unit-tests.

iff we want to keep the if (typeof value === "object" && this._settings.mergeClaims) {... -> if (typeof value === "object" && this._settings.mergeClaimsLegacyBehavior) {...

@pamapa frankly I just want this implementation to go ahead.

The current behavior is wrong, wherever we spin it we will have some breaking changes unless we create new attributes to keep this wrong behavior.

Right now mergeClaims, when enabled, works as a deep merging mechanism, which we will have by default in the new implementation, and changing its meaning will break some usage somewhere.

Whichever solution makes it clearer for you makes sense to me, I'll stick to the new implementation and try to forget about all of this ASAP, but we have 40 comments on this PR and I think we need to land it somehow.

@Badisi Should we keep the setting mergeClaims=false in the long run? aka to allow the append mode?

I'll stick to the new implementation and try to forget about all of this ASAP

I will help you here. First step see #904

I agree with @marcoreni that this whole mergeClaims thing is wrong from the start and that it's taking us too much discussion. TBH I don't even recall why we were talking about rewriting it at the first place. I thought that @marcoreni requirement was only to be able to reload UserInfo ?

So do we have any issue today with this merge mechanism ? If not then I would suggest to leave it as is, as it seems to work well for everybody (else we would have had other complains). And if we really need to refactor it I would go with a mergeClaimsStrategy=['none','merge','replace'] (or something alike), not so disruptive and with all use cases covered.

requirement was only to be able to reload UserInfo

will not work with the current wrong merge behavior...

I'm confused..

The first time, IdToken will be merged with UserInfo data.

The second time (when asking to refresh user info), we will use the current IdToken and merge it with the newly received UserInfo data.

This should be exactly the same.. why it wouldn't work ?

@Badisi You are right lets stop this merge request for now. And implement the feature like you said.

Merging claims is not perfect, we can change that later...

pamapa · 2023-03-17T09:59:03Z

@marcoreni Are you still interested to land this? I am not planing a 3.0.0, as such we can make breaking changes, lets focus your engineered algorithm of merging and drop the current merging + drop the mergingClaims settigs....