BytesToNameCanonicalizer can mishandle leading NUL #148

rjmac · 2014-08-10T22:15:00Z

More from randomized testing: deserializing the document { "\u0000abc" : "a", "abc" : "b" } via the UTF8StreamJsonParser will produce the same text for the second field name as the first. This is because the BytesToNameCanonicalizer will left-pad "abc" with a 0 byte to fit it in an int, which is indistinguishable from a string which really has a leading zero char.

Here's a standalone test case for it:

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;

public class Tokens {
    public static void main(String[] args) throws Exception {
        String[] expectedTexts = { "{", "\u0000abc", "a", "abc", "b", "}" };
        JsonParser parser = new JsonFactory().createParser("{\"\\u0000abc\" : \"a\", \"abc\" : \"b\"}".getBytes());
        for(String expected : expectedTexts) {
            parser.nextToken();
            if(!parser.getText().equals(expected)) throw new Exception(parser.getText() + " != " + expected);
        }
    }
}

The text was updated successfully, but these errors were encountered:

cowtowncoder · 2014-08-11T00:53:28Z

Hmmh. I hate the fact that null Unicode character is technically valid. Almost enough to rather throw an exception if one is seen, and indicate Jackson will not allow it...

Now: I am guessing this is found via generated test combinations, and not with actual usage? Just want to figure out likelihood of this being something found in the wild.

rjmac · 2014-08-11T01:23:54Z

That guess is correct. I'm not actually deliberately testing Jackson directly; I'm testing my own code and coming up with places where the test fails because the property being checked is forall x. decode(encode(x)) == x, and finding places where that's not true in Jackson.

Of course, it's possible come up with a situation where it's a potential security issue, due to allowing a user to change the interpretation of a chunk of JSON depending on what kind of parser is used. I admit it'd be hard to come up with a real case, but a super-contrived one's like this: say there's a service processor that operates in two phases. Phase one checks the credentials stored in a request, and if they're good, passes it unmodified to a second service that assumes the creds are good (because it knows the first layer has already handled it) and uses the username to decide whether to allow or deny a certain action. A valid but malicious user called "unprivileged" sends {"usr":"privileged","\u0000usr":"unprivileged","psw":"whatever",...other stuff...}. Auth layer is using byte-decoding, service layer uses text-decoding. Then the first layer will check the password against "unprivileged" and the second layer will see the "privileged" username. Now, that is obviously a stupidly designed system for a host of reasons, but it also only took about 10 seconds to come up with.

0xff can't occur in well-formed UTF-8, so it will not cause false-positives. Might resolve FasterXML#148

rjmac · 2014-08-11T04:23:45Z

I'm not sure that patch is correct, but it seems to solve the immediate issue and pass the tests. I don't much like it; it feels to me like the padding should be handled in the BytesToNameCanonicalizer itself rather than in the user of the BTNC.

More FasterXML#148

cowtowncoder · 2014-08-12T00:25:37Z

Right, this is problematic, and good catch no matter how things end up.

I started thinking that there may actually be relatively easy and inexpensive fix by using different padding value: since UTF-8 encoded content can not have couple of highest-valued bytes (for example, 0xFF is illegal), it might be possible to just initialize padding to -1 to be able to distinguish valid (if unusual) values of zero unit.

cowtowncoder · 2014-08-12T05:23:12Z

Heh. I should have read your update before adding a comment. Great minds think alike etc. :-)

I see why you feel conflicted about patch: I agree that ideally it'd be hidden in symbol table. But then again, patch is delightfully small. One potential improvement could be to try to initialize quad value with padding, instead of applying patch in the end; but then again that'd actually do bit more work. And I suspect performance-wise difference probably couldn't even be measured.

I'll think about this for a bit; I checked in (failing) unit test. I am still wondering if there is any real risk in adding this in patch version, but your patch seems so elegantly small that it seems unlikely there could be any risk.

cowtowncoder · 2014-08-12T05:27:02Z

Actually looking at patch again, I think it is pretty optimal (from efficiency POV). No point in tweaking, at most could inline if jmh showed that to matter (I am guessing it should not, easy enough to verify).

cowtowncoder · 2014-08-13T04:00:59Z

@rjmac Do you think patch is complete, that is, can I just apply it as is?

rjmac · 2014-08-13T04:23:58Z

I think so. It passes Jackson's own test suite, and the snapshot jar I built also looks like it passes my tests that showed this up in the first place.

cowtowncoder · 2014-08-13T16:24:01Z

Ok thanks. My main concern really is with patch release; 2.4.2 is finally resolving couple of regressions 2.4 had (due to various low-level performance optimizations). One possibility would be to do this for 2.5.0, just to eliminate even the small risk.

cowtowncoder · 2014-08-13T19:45:07Z

I switched master branch to be 2.5.0-SNAPSHOT, and will target fix for 2.5, just to be extra sure.

rjmac added a commit to rjmac/jackson-core that referenced this issue Aug 11, 2014

When caching names from bytes, pad with 0xff instead of 0

ce25ff4

0xff can't occur in well-formed UTF-8, so it will not cause false-positives. Might resolve FasterXML#148

rjmac added a commit to rjmac/jackson-core that referenced this issue Aug 11, 2014

Add test for FasterXML#148

4c6da5f

rjmac added a commit to rjmac/jackson-core that referenced this issue Aug 11, 2014

Padding the last quad in apos names needs to come before looking them up

aa83e9d

More FasterXML#148

cowtowncoder added a commit that referenced this issue Aug 12, 2014

Add a (failing) unit test for #148

fde2133

cowtowncoder closed this as completed in e997287 Aug 13, 2014

cowtowncoder added this to the 2.5. milestone Aug 13, 2014

cowtowncoder mentioned this issue Feb 26, 2022

Short NUL-only keys incorrectly detected as duplicates in CBOR and SMILE FasterXML/jackson-dataformats-binary#312

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BytesToNameCanonicalizer can mishandle leading NUL #148

BytesToNameCanonicalizer can mishandle leading NUL #148

rjmac commented Aug 10, 2014

cowtowncoder commented Aug 11, 2014

rjmac commented Aug 11, 2014

rjmac commented Aug 11, 2014

cowtowncoder commented Aug 12, 2014

cowtowncoder commented Aug 12, 2014

cowtowncoder commented Aug 12, 2014

cowtowncoder commented Aug 13, 2014

rjmac commented Aug 13, 2014

cowtowncoder commented Aug 13, 2014

cowtowncoder commented Aug 13, 2014

BytesToNameCanonicalizer can mishandle leading NUL #148

BytesToNameCanonicalizer can mishandle leading NUL #148

Comments

rjmac commented Aug 10, 2014

cowtowncoder commented Aug 11, 2014

rjmac commented Aug 11, 2014

rjmac commented Aug 11, 2014

cowtowncoder commented Aug 12, 2014

cowtowncoder commented Aug 12, 2014

cowtowncoder commented Aug 12, 2014

cowtowncoder commented Aug 13, 2014

rjmac commented Aug 13, 2014

cowtowncoder commented Aug 13, 2014

cowtowncoder commented Aug 13, 2014