Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stringify JSON deterministically #83

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

fenichelar
Copy link

See auth0/node-jsonwebtoken#404 for more details of why.

omsmith and others added 8 commits July 21, 2017 16:26
binary payloads would get mangled due to the unnecessary string
conversion, which should go the other way around

Fixes: auth0#50
sign: dont convert input buffers to utf8 strings
Just put some curly braces as JS doesn't have such syntax (yet).
deps: replace base64url with inline definition
@fenichelar
Copy link
Author

This only works one level deep which should not be a problem because the user can pre-stringify the payload. This is only needed for the header.

@fenichelar
Copy link
Author

Another option would be to use https://github.com/substack/json-stable-stringify but I didn't want to add another dependency.

@danihodovic
Copy link

danihodovic commented Aug 15, 2018

Do we need test cases to prove this is deterministic?

@ivn-cote
Copy link

Hey guys! Just a proposal from my side to consider an initial request on the same topic in the Node repository and the response there nodejs/node#15628 (comment) . I think we can rely on proper ECMAScript implementation in Node so no sense worrying and considering an additional dependency.

@omsmith
Copy link
Contributor

omsmith commented Feb 23, 2019

Interesting - and apologies for taking so long to follow up. I'm a little conflicted, since it's just a simple change it would be easy to just merge, and I don't expect it would break anything...

On the other hand, as noted in auth0/node-jsonwebtoken#404, this doesn't actually break any intended usage (for compact representation). Verifying takes the input string and checks the signature part against the payload part, doesn't involve re-generating the payload. If generating a JWT twice, I might expect fields like jti, iat, or exp to change, and thus change the whole signature anyway.

So then where does this come up?

I could imaging a testing scenario:

const header = { alg: 'HS256', kid: '1234' };
const payload = { foo: 'bar', baz: 'quux' };
const secret = crypto.randomBytes(...);

const sig1 = jws.sign({ header, payload, secret });
const sig2 = jws.sign({ header, payload, secret });
assert.strictEqual(sig1, sig2);

But then that's making an unnecessary assumption about the library internals.

Alternatively, I could imagine someone writing code that does something similar:

// Don't do this

const USER_TOKENS = {};

app.use((req, res, next) => {
  const inputToken = req.headers.Authorization.split(' ')[1];

  if (!inputToken) {
    return next(new Error('No token!'));
  }

  const userId = req.params.userId;
  if (inputToken !== USER_TOKENS[userId]) {
    return next(new Error('Invalid token!'));
  }

  next();
});

But that's very much the wrong way to be using JWTs. I am sure there's other ways, but I am struggling to image one that isn't against the spirit of the ecosystem. I would almost be inclined to go the exact opposite of this PR and shuffle the keys in order to make the exact signature unpredictable for any sort of usage.

If there's something I'm missing, please let me know, but otherwise I'm going to close this out for now.

@omsmith omsmith closed this Feb 23, 2019
@fenichelar
Copy link
Author

fenichelar commented Feb 23, 2019

@omsmith I completely agree with everything you are saying. Expecting JWTs to have the keys in a deterministic order is stupid. However, that is what the telephone industry will be requiring for signed telephone calls.

My PR doesn't even completely solve this because it only works on the first level. But because the payload can be manually stringified and then passed to this library, the only issue is that the header can't be manually stringified.

Another option would be to add a flag could be added to allow for true (muli level) deterministic handling of the header and payload for those who need it?

Alternatively, allowing a function to be passed in that does the stringification to override the default behavior of using JSON.stringify?

References:

[1] https://tools.ietf.org/html/draft-ietf-stir-passport-shaken-07#section-8 (references RFC 8225 must be followed)
[2] https://tools.ietf.org/html/rfc8225#section-9 (requires deterministic lexicographical order)
[3] https://access.atis.org/apps/group_public/download.php/35256/ATIS-1000080.pdf (in depth design of signed telephone calls)
[4] https://transnexus.com/whitepapers/understanding-stir-shaken/ (overview of signed telephone calls including use of JWT)

My code to generate a token because of this (basically I can't use the JWT library):

const jwa = require('jwa');
const stringifyStable = require('json-stable-stringify');

const stiSigner = jwa('ES256');

const stiHeader = stringifyStable({
  alg: stiAlg,
  ppt: stiPpt,
  typ: stiTyp,
  x5u: stiX5u,
});
const stiPayload = stringifyStable({
  attest: stiAttest,
  dest: {
    tn: [
      stiDestTn,
    ],
  },
  iat: stiIat,
  orig: {
    tn: stiOrigTn,
  },
  origid: stiOrigid,
});
const unsignedToken = `${base64url(stiHeader)}.${base64url(stiPayload)}`;
const token = `${unsignedToken}.${stiSigner.sign(unsignedToken, privateKey)}`;

@omsmith
Copy link
Contributor

omsmith commented Feb 23, 2019

@fenichelar I appreciate the extra information! This is the first time I've heard of RFC8225, and from a quick skim I guess it's another unfortunate example of "we use RFCfoo, except"? I'll have to have a better read through later on.

One thing I did wonder about, which seems to be confirmed by your other examples:

RFC8225 9.3
3. JSON value literals MUST be lowercase.

Except the JW* specs, clearly defines its members as being generally case-sensitive, such as the "alg" parameter, and while "typ" is not, it's "RECOMMENDED" that JWT be uppercase.

And the example page clearly shows alg being uppercase, and an "attest" field in the payload being uppercase.

2019-02-23t18 46 14 00 00

Anyway, like I said, I still have to actually read through RFC8225; however, in the mean time an rfc8225 opt that does the deeply-lexical ordering like you suggested might be something?

@omsmith omsmith reopened this Feb 23, 2019
@fenichelar
Copy link
Author

fenichelar commented Feb 23, 2019

  1. JSON value literals MUST be lowercase.

Haha I didn't even notice that. Literally right below that in the example (https://tools.ietf.org/html/rfc8225#section-9.1) there are capital letters in the value...

9.1.  Example PASSporT Deterministic JSON Form

   This section demonstrates the deterministic JSON serialization for
   the example PASSporT Payload shown in Section 5.2.1.4.

   The initial JSON object is shown here:

   {
     "dest":{"uri":["sip:[email protected]"]},
     "orig":{"tn":"12155551212"}
     "iat":1443208345,
     "mky":[
       {
         "alg":"sha-256",
         "dig":"021ACC5427ABEB9C533F3E4B652E7D463F5442CD54
           F17A03A27DF9B07F4619B2"
       },
       {
         "alg":"sha-256",
         "dig":"4AADB9B13F82183B540212DF3E5D496B19E57C
           AB3E4B652E7D463F5442CD54F1"
       }
     ],
   }

I'm thinking that this was written incorrectly and should have been:

  1. JSON keys MUST be lowercase.

I'll bring this up with the authors of the RFC for further clarification.

Anyway, like I said, I still have to actually read through RFC8225; however, in the mean time an rfc8225 opt that does the deeply-lexical ordering like you suggested might be something?

Of course that would work great for me :) Do you think that is the best thing?

@omsmith
Copy link
Contributor

omsmith commented Feb 23, 2019

It looks like the reason that rfc specifies a serialization process, is because the recommend not actually sending the header and payload, but just the signature?

..abcdefghijklmnopqrstuvwxyz123456789

And then the receiving side constructs the header and payload from other information it already has?

@fenichelar
Copy link
Author

@omsmith That is correct: https://tools.ietf.org/html/rfc8225#section-7

The reasoning behind this has been to try not to exceed MTU because SIP is generally done over UDP and UDP fragmentation can be problematic. The SIP standard actually says that TCP should be used instead of UDP packet fragmentation, but I only know of one telephone switch vendor that follows this. Behavior defined here: https://tools.ietf.org/html/rfc3261#section-18.1.1.

However, for SHAKEN (the framework to prevent telephone number spoofing) compact headers have been abandoned. SHAKEN will require the full form:

The use of the compact form of PASSporT is not specified in this document and is not specified for use in SHAKEN [ATIS-1000074].

from https://tools.ietf.org/html/draft-ietf-stir-passport-shaken-07#section-3

5.3.3 Use of the Full Form of PASSporT
Draft-ietf-stir-rfc4474bis supports the use of both full and compact forms of the PASSporT token in the Identity header. The full form of the PASSporT token shall be used to avoid any potential SIP network element interaction with headers, in particular the Date header field, which could lead to large numbers of 438 (‘Invalid Identity Header’) errors being generated.

from https://access.atis.org/apps/group_public/download.php/32237/ATIS-1000074.pdf

@fenichelar
Copy link
Author

To clarify, reconstructing the iat value from the SIP Date header seemed like it was going to work but in practice, it doesn't. The Date header is frequently removed or updated by devices between the source and destination. If this happens, the call would appear to have been tampered with and there would be know way of knowing if this was a Date tampering or a spoofed number.

SHAKEN adds attest and origid claims which are not defined anywhere but in the payload so reconstruction is not possible for the SHAKEN use case of PASSporT.

@omsmith
Copy link
Contributor

omsmith commented Feb 23, 2019

So to be clear SHAKEN doesn't try to do any truncation, and always sends the full header and payload (and not a subset of the header and payload in order to try to fit within a frame)?

Will PASSporT's "compact" form be used in practice, if SHAKEN refuses it, and STIR seemingly suggests against it?

@fenichelar
Copy link
Author

So to be clear SHAKEN doesn't try to do any truncation, and always sends the full header and payload (and not a subset of the header and payload in order to try to fit within a frame)?

Yes.

Will PASSporT's "compact" form be used in practice, if SHAKEN refuses it, and STIR seemingly suggests against it?

Good questions. I'm not knowledgeable enough about all of the PASSporT extensions outside of SHAKEN to answer that.

There are a bunch of related documents here: https://datatracker.ietf.org/wg/stir/documents/. It's probably safe to assume that the 3 listed extensions are currently all that is in the works.

@fenichelar
Copy link
Author

FYI, I doubt Rich Call Data (RCD) will ever be used in practice without SHAKEN. Even the RCD draft discusses how it would be combined with SHAKEN in a single PASSporT.

@omsmith
Copy link
Contributor

omsmith commented Feb 23, 2019

Well you know know more than me :P. I'm aware of SIP and STUN and ICE, but it's the first time I'm hearing if these telephony protocols.

What I'm interpreting is that PASSporT's "compact" form was perhaps a mistake, and the ecosystem is ignoring it, but I'm biased and have no real knowledge of that.

Do you have a way if verifying that? Perhaps the RFC authors will have a sense of it when you reach out about the lowercase situation.

I don't terribly want to support an unused part of a generally unrelated spec if that's the case.

I'm happy to work with you either way though and appreciate the information you've been able to provide.

@fenichelar
Copy link
Author

What I'm interpreting is that PASSporT's "compact" form was perhaps a mistake, and the ecosystem is ignoring it, but I'm biased and have no real knowledge of that.

Agreed.

Do you have a way if verifying that? Perhaps the RFC authors will have a sense of it when you reach out about the lowercase situation.

I will try and find out more about the compact form.

I don't terribly want to support an unused part of a generally unrelated spec if that's the case.

There is a SHAKEN testbed that is run by Neustar at the request of The Alliance for Telecommunications Industry Solutions (ATIS): https://www.home.neustar/atis-testbed/index.php. This is purely for SHAKEN and helps telephone service provides test their implementation to ensure it will work with other telephone service provides. This testbed checks that the order is lexicographical even though the full form is used. Hard to say what other implementations do but they could easily do the same thing.

I had to abandon our use of the JWT library and directly create the token to be able to sign telephone calls for the testbed, before that the tests failed. For verifying telephone calls, I do use the JWT library and therefore don't check the order.

So for SHAKEN, I think the order needs to be lexicographical even though the reasoning is no longer relevant because the compact form has seemingly been abandoned :(

@cyberphone
Copy link

You might be interested in this related standards effort:
https://tools.ietf.org/html/draft-rundgren-json-canonicalization-scheme-05
It could be interpreted as not being compliant with JWK Thumbprint but if you 1) read the recommendations in the thumbprint RFC 2) take all existing JWKs to date 3) consider the fact that nobody (in their right mind) uses keys outside of 7-bit ASCII you will see that it is compatible, making a JS implementation trivial:
https://tools.ietf.org/html/draft-rundgren-json-canonicalization-scheme-05#appendix-A

Dropping the compact mode of Passport seems like a good idea.

@panva
Copy link

panva commented Mar 16, 2019

FWIW If there's the need for canonical payload i believe this should be a JWS extension (similar to b64:false) using a "crit" member being registered somewhere in the specifications. You can then easily see if a library supports this or not. "crit" is there for exactly this reason.

@erdtman
Copy link

erdtman commented Mar 22, 2019

And we have a published implementation of JCS here https://www.npmjs.com/package/canonicalize

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants