-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Full serialization using iwoplaza/typed-binary
package
#25
Conversation
I think that's actually exactly what c2pa-rs does as well. |
f438af4
to
e0e4fc4
Compare
This branch now experimentally enables the typed-binary serialization using a switch in Some question marks remain:
|
Short notes after call:
|
72e92ec
to
cf533d2
Compare
This should now be ready for review and merge, there shouldn't be any regressions from the existing code. Implementing this, one question came up though, concerning the |
iwoplaza/typed-binary
package (WORK IN PROGRESS)iwoplaza/typed-binary
package
According to the spec, every box in a C2PA manifest needs to have a label:
Either way, you're right that using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall!
I've aded some inline comments.
General remarks:
- Mocha didn't seem to run the full asset decoding test suite here, shouldn't it?
- The only thing I'm not really happy about is all the individual
readByte()
/writeByte()
calls. Is there no way to read/write full byte arrays using typed binary? - We should also add typed binary to the list of acknowledgments in the README
- About the URIs see my previous comment
src/jumbf/C2PASaltBox.ts
Outdated
measure(value: C2PASaltBox, measurer: bin.IMeasurer = new bin.Measurer()): bin.IMeasurer { | ||
return measurer.add( | ||
4 + // length | ||
4 + // type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this common header length be moved to the generic box schema instead?
Actually, the same goes for reading and writing the common header (length, type):
- Is it possible to have the generic schema handle the header and have the individual schemas only contain the specific box type's content?
- If not, have common methods for reading/writing/measuring the header that are called by each schema?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What we can absolutely do is this:
return measurer.add(
this.length.measure(0).size +
this.type.measure(value.typeCode).size +
(value.salt ? value.salt.length : 0),
);
I find that a bit less readable and the performance will probably suffer, too.
Concerning the repeated code, let's consider this:
class BoxSchema extends bin.Schema<Box> {
read(input: bin.ISerialInput): Box {
const length = this.length.read(input);
const type = this.type.read(input);
return this.read_internal(input, length, type);
}
... // ditto for write() and measure()
}
My problem here is that this returns and consumes Box
objects, not e.g. C2PASaltBox
. Doing that with correct typing on top could possibly be done using mixins. I'd have to do some research first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That could possibly be made to work with generics – but how about this as a less complicated approach:
write(output: bin.ISerialOutput, value: C2PASaltBox): void {
this.writeHeader();
value.salt?.forEach(byte => output.writeByte(byte));
}
measure(value: C2PASaltBox, measurer: bin.IMeasurer = new bin.Measurer()): bin.IMeasurer {
return this.measureHeader(measurer).add(this.value?.length ?? 0);
}
And then have writeHeader()
and measureHeader()
in the base class? I might be missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just extracted the length/type handling into an abstract base template for the concrete schemata. This was medium complicated, mainly to ensure correct typing, but I think it's a good compromise.
readonly id = bin.u32; | ||
readonly hash = bin.arrayOf(bin.byte, 32); | ||
// Note: This doesn't work due to a circular import. | ||
// readonly privateBoxes = new GenericBoxSchema(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually circular imports can be prevented by introducing a common base interface. I haven't checked the exact issue here though and whether that might help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The GenericBoxSchema
class (the name sucks and I'd happily welcome a better one) needs to know all box classes and their schemata while some box classes' schemata (super, description) need to access to a generic schema for the nested boxes. I honestly don't fully understand why it seems to work in some cases but not others, so I basically gave up.
Writing this, I was wondering if the GenericBoxSchema
couldn't remain ignorant of any but the fallback schema. All other schemata would instead actively register themselves. I'm not sure if this is worth the effort at this point or if there are more urgent things to do.
Answering my own question:
This seems to be WIP iwoplaza/typed-binary#11 |
2fa4e67
to
f05f75f
Compare
Yes, bug in the GitHub Actions setup, already fixed in
Done. I also pushed ack's for cbor-x and mocha.
I'll look into that this week. I wonder if this should be handled during deserialization or during manifest creation/validation. This is similar to a JSON/CBOR field that doesn't properly decode. |
I wouldn't spend too much time on it – nobody would write a C2PA manifest that doesn't have each SuperBox labeled as it simply would be completely invalid manifest. So it's fine to just not apply the URI if there's no label and then have validation fail later when it tries to resolve URIs. Speaking of, is there a reason why |
It doesn't need any |
Well, it does take a box as its first parameter so I thought it could simply be turned into an instance method. Not a big deal either way 😉 |
f05f75f
to
adf62dc
Compare
These are used by multiple box-specific schemata. This implements: - length: Just a simple 32-bit integer. - type: Four letter word, representing the type of a box. - uuid: UUID made of 16 raw bytes. - fallback: fallback schema for unknown box types. This may be temporary, but for now it avoids having too many loose ends in the code.
The schema is required to be given as parameter to the constructor, along with the type string. Typically, they are provided by one of the derived classes.
This type template eases writing box schemata. It contains an abstract base class that manages the length and type header parts of the schema. It then delegates the other parts to the derived class.
This schema looks at either the input stream or the box and the dispatches to the according schema. Note that this presents some issues with circular dependencies, because this class needs to know about box classes that nest others and vice versa.
- Delete `parse()` functions from all box classes. - Re-implement `BoxReader` based on the typed-binary code. Maybe this class could even be removed entirely, it is only used by some tests currently.
adf62dc
to
7d3d7ab
Compare
Just as an example, here's one of the box classes implementing reading and writing. Notes:
getSchema()
needs to become part of the baseclass/interface.static typeCode
could be merged with the schema.