Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

isSet or isNull flag for optional values? #21

Closed
denizzzka opened this issue Jun 11, 2019 · 19 comments
Closed

isSet or isNull flag for optional values? #21

denizzzka opened this issue Jun 11, 2019 · 19 comments

Comments

@denizzzka
Copy link
Contributor

In proto3 all fields are "optional".
How to test what field is set?

Please add example

@dcarp
Copy link
Owner

dcarp commented Jun 11, 2019

There is no way to differentiate between a field that was not set and a field that was set to the default value. Moreover the application should not differentiate between this two cases.
For details please see https://developers.google.com/protocol-buffers/docs/proto3#default

@denizzzka
Copy link
Contributor Author

denizzzka commented Jun 11, 2019

I think what:

There is no way to differentiate between a field that was not set and a field that was set to the default value. Moreover the application should not differentiate between this two cases.

and

For details please see https://developers.google.com/protocol-buffers/docs/proto3#default

are unrelated things.

proto3 deprecates specifying of default values, that's all. Optional values is not deprecated by itself.
Also this link says:

For message fields, the field is not set. Its exact value is language-dependent.

But here is no way to check this.

Related discussion: msoucy/dproto#63 (comment)

@denizzzka
Copy link
Contributor Author

I'll take a short pause to sort out more. Very strange it actually looks for me

@denizzzka
Copy link
Contributor Author

denizzzka commented Jun 11, 2019

Yep, my case is unability to check that some message is set or not.

Description of C++ version of proto3:
https://developers.google.com/protocol-buffers/docs/reference/cpp-generated

Singular Embedded Message Fields
Given the message type:

message Bar {}
For any of these field definitions:

//proto3
Bar foo = 1;
The compiler will generate the following accessor methods:

bool has_foo() const: Returns true if the field is set.

@dcarp
Copy link
Owner

dcarp commented Jun 11, 2019

There is a difference between scalar (single) fields and message (complex) fields. protoc3 removes the has_<field> for scalar fields.

For messages, when you use generate the D definition as class, you can still use foo !is null for checking whether it was set of not. If you use structs then they must be treated similar as the scalars.

In general, I think that the application should not use has_<field>, otherwise the back-/forward compatibility of protobuf will be compromised. You either consider the default value as the not-set flag, or consider the field always set (default value inclusive).

@denizzzka
Copy link
Contributor Author

denizzzka commented Jun 11, 2019

If you use structs then they must be treated similar as the scalars.

Mmm.. why? Google disagreed with it (above)

(I use structs)

In general, I think that the application should not use has_, otherwise the back-/forward compatibility

This is also my case. I move from proto2 (provided by dproto) because it contains unresolved significant bug (msoucy/dproto#96)

@dcarp
Copy link
Owner

dcarp commented Jun 11, 2019

If you use structs then they must be treated similar as the scalars.

Mmm.. why? Google disagreed with it (above)

Google does not define the implementation details. I think that it will be a mistake to just port the C++ implementation. protobuf-d implementation has the constraints that I described above and if I understand you correctly, you agree that has_<field> is not a good style.

Should I/you close this issue?

@denizzzka
Copy link
Contributor Author

denizzzka commented Jun 12, 2019

if I understand you correctly, you agree that has_ is not a good style.

No, no. I not agreed.
And I do not think that a certain style is important. I would agree to Nullable's isNull (or maybe Option!T). Although, frequent checks what user can't get around will hit performance?

Google does not define the implementation details.

Yes. But protocol provides information about set or not for any element and API can provide this too (maybe only for messages as in Google's C++ API to avoid creation of schemes that other libraries do not understand).

I know widely used .proto files built for use this feature. Simpliest example is OSM PBF: https://wiki.openstreetmap.org/wiki/PBF_Format

message Node {
  required sint64 id = 1;
  // Parallel arrays.
  repeated uint32 keys = 2 [packed = true]; // String IDs.
  repeated uint32 vals = 3 [packed = true]; // String IDs.
  optional Info info = 4; // May be omitted in omitmeta
  required sint64 lat = 8;
  required sint64 lon = 9;
}

Currently it can not be converted into proto3 to use with D because info is optional and has_<field> is unavailable.

@dcarp
Copy link
Owner

dcarp commented Jun 12, 2019

Yes. But protocol provides information about set or not for any element and API can provide this too

Not any element, scalar elements have no has_<field> method by design. Embedded messages have a has_ method because it is possible, but I consider it bad style. Why not treat the embedded message fields similar to scalar fields?

message Node {
  required sint64 id = 1;
  // Parallel arrays.
  repeated uint32 keys = 2 [packed = true]; // String IDs.
  repeated uint32 vals = 3 [packed = true]; // String IDs.
  optional Info info = 4; // May be omitted in omitmeta
  required sint64 lat = 8;
  required sint64 lon = 9;
}

Currently it can not be converted into proto3 to use with D because info is optional and has_ is unavailable.

Instead of has_info use info != defaultValue!Info. Will this work?

@denizzzka
Copy link
Contributor Author

denizzzka commented Jun 12, 2019

Why not treat the embedded message fields similar to scalar fields?

Instead of has_info use info != defaultValue!Info. Will this work

Maybe or maybe not - I just will never know that the data was read incorrectly by discussed package: if we remember that defaults in proto3 deprecated and zeroes is a very common meaning of something, then this is a problem

@denizzzka
Copy link
Contributor Author

denizzzka commented Jun 12, 2019

Google does not define the implementation details. I think that it will be a mistake to just port the C++ implementation.

I checked - that also is implemented in Dart and maybe another lanugages which I don't know.

@dcarp
Copy link
Owner

dcarp commented Jun 12, 2019

Maybe or maybe not - I just will never know that the data was read incorrectly by discussed package: if we remember that defaults in proto3 deprecated and zeroes is a very common meaning of something, then this is a problem

I don't quite get what you mean by that. By design Protobuf requires that the application should cope with default values. Otherwise in case of removing a field from the message definition will break the backward compatibility contract.

@dcarp
Copy link
Owner

dcarp commented Jun 12, 2019

I checked - that also is implemented in Dart and maybe another lanugages which I don't know.

Maybe they want to be API compatible with proto2, but protobuf-d does not have such a constraint.

@denizzzka
Copy link
Contributor Author

denizzzka commented Jun 13, 2019

By design Protobuf requires that the application should cope with default values. Otherwise in case of removing a field from the message definition will break the backward compatibility contract.

Custom default values is deprecated in proto3. And this package is not provides ability to check has_<field>. With both of these factors it is impossible to distinguish what some message is setted up by this pbf library from wire binary or just it used default value.

has_<field> checking is not prohibited by the format and therefore must be able to check. Other APIs provides has_<field> info for messages (But do not provide for scalar values. I think it is because there were problems with defaults in the previous version of the protocol - it can be silently changed. So, Google considered to drop custom defaults support and checking of set or not.)

That is, we also need to be able to check message fields for their set or not.

Maybe they want to be API compatible with proto2, but protobuf-d does not have such a constraint.

I think what this is issue, not Google's recommended design.

@dcarp
Copy link
Owner

dcarp commented Jun 13, 2019

Custom default values is deprecated in proto3. And this package is not provides ability to check has_. With both of these factors it is impossible to distinguish what some message is setted up by this pbf library from wire binary or just it used default value.

This is by design as described in here. Quote:

Note that for scalar message fields, once a message is parsed there's no way of telling whether a field was explicitly set to the default value (for example whether a boolean was set to false) or just not set at all: you should bear this in mind when defining your message types. For example, don't have a boolean that switches on some behaviour when set to false if you don't want that behaviour to also happen by default. Also note that if a scalar message field is set to its default, the value will not be serialized on the wire.

The highlighted part is the contract that the application must fulfill. I see no reason why this should not apply for complex types as well.
Please provided a compelling argument why info != defaultValue!Info does not work.

@denizzzka
Copy link
Contributor Author

denizzzka commented Jun 13, 2019

Please provided a compelling argument why info != defaultValue!Info does not work.

(Perhaps I misunderstand how defaultValue!Info works.)

Message with all zeroed scalars: scalars should be treated as valid zeroes or it means what message was not passed by wire at all?

@dcarp
Copy link
Owner

dcarp commented Jun 13, 2019

As I sad the application should cope with this uncertainty. An old application should still function even if it receives data where a field is missing because in meantime it was deprecated. The protobuf contract is that the application should consider that any field can be missing (deprecated) and it is applications job to sensible react on that and decide whether it can continue or not. This is the so called forward compatibility.

If you still insist on the possibility that a field is missing on propose, then the best solution is to generate that field type as class and not as struct. Having it as struct means that memory is wasted when the field is missing. Your proposal translates to fighting the wasted memory with even more wasted memory: the has_<field> flag. I think that the definition as class is a good choice in this case.

@denizzzka
Copy link
Contributor Author

I do not think that memory is more important than logical consistency of processed data.

Ok, maybe someone will find this discussion sometime and be able to say something another for support of my position about "set flag".
For the time being, I will use classes.

Thank!

@denizzzka
Copy link
Contributor Author

To future archaeologists: I asked Google Protobuf team about this topic and they confirmed point of view that additional functions such as has_* for primitive types are not needed:

https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/protobuf/BCpk_02wrl0/kg4T8Ly3AQAJ

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants