-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Incomplete Support for Union or | Type Deserialization in Common Python Types (e.g., int
, Enum
, Literal
)
#67
Comments
I don't think this has anything to do with Literals - anything that needs to be deserialized in a union is not being deserialized. >>> from dataclasses import dataclass
>>> from dataclass_wizard import JSONWizard
>>> @dataclass
... class B:
... thing: str
...
>>>
>>> @dataclass
... class A(JSONWizard):
... one_or_more: B | list[B]
...
>>>
>>> result = A.from_dict({
... 'one_or_more': {
... 'thing': 'value'
... }
... })
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.10/site-packages/dataclass_wizard/loaders.py", line 536, in fromdict
return load(d)
File "/usr/local/lib/python3.10/site-packages/dataclass_wizard/loaders.py", line 619, in cls_fromdict
cls_kwargs[field_name] = field_to_parser[field_name](
File "/usr/local/lib/python3.10/site-packages/dataclass_wizard/parsers.py", line 250, in __call__
raise ParseError(
dataclass_wizard.errors.ParseError: Failure parsing field `one_or_more` in class `A`. Expected a type [<class '__main__.B'>, <class 'list'>], got dict.
value: {'thing': 'value'}
error: Object was not in any of Union types
tag_key: '__tag__'
json_object: '{"one_or_more": {"thing": "value"}}'
>>> |
Got the same issue here. Mixing primitives with dataclasses in unions seems to be the problem: from dataclass_wizard import fromdict
from dataclasses import dataclass
@dataclass
class MoreDetails:
arg: str
@dataclass
class Config:
# Unions with primitives and dataclasses
entry: str | MoreDetails | int
# Works
fromdict(Config, {
"entry": "a"
})
# Works too
fromdict(Config, {
"entry": 1
})
# Fails, fails to interpret the dict as a dataclass
fromdict(Config, {
"entry": {"arg": "str"}
}) Traceback (most recent call last):
File "/mnt/c/Users/fakui/Seafile/Programming/projects/cc_crapton/./ha_modbus_mqtt/dataclasswz_bug.py", line 29, in <module>
fromdict(Config2, {
File "/home/fakui/.local/lib/python3.10/site-packages/dataclass_wizard/loaders.py", line 536, in fromdict
return load(d)
File "/home/fakui/.local/lib/python3.10/site-packages/dataclass_wizard/loaders.py", line 619, in cls_fromdict
cls_kwargs[field_name] = field_to_parser[field_name](
File "/home/fakui/.local/lib/python3.10/site-packages/dataclass_wizard/parsers.py", line 250, in __call__
raise ParseError(
dataclass_wizard.errors.ParseError: Failure parsing field `entry` in class `Config2`. Expected a type [<class '__main__.MoreDetails'>, <class '__main__.Config'>], got dict.
value: {'arg': '1'}
error: Object was not in any of Union types
tag_key: '__tag__'
json_object: '{"entry": {"arg": "1"}}' dataclass-wizard version: 0.22.2 |
It appears you might be unfamiliar with the GitHub workflow. In order to propose your changes you must fork this repository and submit those changes to your fork. Once you have done so there will appear a button on your fork that asks if you want to open a PR. TL;Dr you need to fork, submit your changes to the fork, then create a PR. It's not normal for you to have the ability to commit to a repository you are not a maintainer of. |
@StripedMonkey thanks for the quick reply! Indeed I was unfamiliar with this way of working. I forked the repo and created #111. |
I was incorrect in thinking #111 would fix the issue. After looking into this a bit deeper, it seems I figured out the issue, but it's a tough fix. The problem is that if you annotate like The failure is happening because you get float value Here's a sample code to demonstrate: from dataclasses import dataclass
from typing import Literal
from dataclass_wizard import JSONWizard
@dataclass
class MyClass(JSONWizard):
literal_or_float: Literal['abc'] | float
# OK
c = MyClass.from_dict({"literal_or_float": "abc"})
print(repr(c))
# OK
c = MyClass.from_dict({"literal_or_float": 1.23})
print(repr(c))
# NOK (Raises error)
c = MyClass.from_dict({"literal_or_float": "1.23"}) This results in error on the last line:
As you can see, it works for first 2 cases, but it fails because a string value of BUT, when you add a I want to address this in a way that doesn't impact performance (too much), so I'll have to think about this. The thing is if a string is coming in as I'll have to look into this some more, but I'm re-opening this issue because it's a known/current problem. |
@fakuivan That is actually not the root issue. Dataclass in Union Types is possible with the Dataclass Wizard, it just requires correct usage of
@mia-n Similar approach can be used with the example you had. That code runs fine with some modification. The root issue is that Dataclass Wizard does not attempt a "naive" approach when it comes to That is why "tag" key is required when dataclasses are in Unions. Since each Union type is not attempted in a "try-catch" fashion, the "tag" key helps identify exactly which dataclass to parse as. This is more performant, and also helps in cases when two dataclasses in Union share the same fields and types, for example. |
Adding my thoughts here in the form of pseudo-code. I think one approach can be to create a class Again, pseudo-code of how that could look:
Then, in parsers.py where we call
We can replace that check with something like:
This should give a good compromise b/w efficiency and also fix the parsing support for Some BenchmarksAlso, just in case someone tells me I'm worried about micro-optimizations which is possible, I made a small benchmark to test validating vs blind parsing for common types like from enum import Enum
from timeit import timeit
class MyEnum(Enum):
VAL_1 = 'ONE PLEASE'
VAL_2 = 'TWO IS GREAT'
VAL_3 = "THREE'S AWESOME"
VAL_4 = 97
def is_enum1(o):
try:
_ = MyEnum(o)
return True
except ValueError:
return False
def is_enum2(o):
return o in MyEnum._value2member_map_
def is_int1(o):
try:
_ = int(o)
return True
except ValueError:
return False
def is_int2(o: str):
return o.removeprefix('-').isdigit()
def is_float1(o):
try:
_ = float(o)
return True
except ValueError:
return False
def is_float2(o: str):
return o.removeprefix('-').replace('.', '', 1).isdigit()
n = 1_000
print(timeit('is_enum1("TWO IS GREAT")', globals=globals(), number=n))
print(timeit('is_enum2("TWO IS GREAT")', globals=globals(), number=n))
print(timeit('is_float1("54.321")', globals=globals(), number=n))
print(timeit('is_float2("54.321")', globals=globals(), number=n))
print(timeit("is_int1('123')", globals=globals(), number=n))
print(timeit("is_int2('123')", globals=globals(), number=n))
# 0.00019675004296004772
# 4.5042019337415695e-05
# 6.279093213379383e-05
# 8.941697888076305e-05
# 8.983299994724803e-05
# 4.8417001380585134e-05
print()
print(timeit('is_enum1("OOPS!!")', globals=globals(), number=n))
print(timeit('is_enum2("OOPS!!")', globals=globals(), number=n))
print(timeit('is_float1("OOPS!!")', globals=globals(), number=n))
print(timeit('is_float2("OOPS!!")', globals=globals(), number=n))
print(timeit("is_int1('1.abc')", globals=globals(), number=n))
print(timeit("is_int2('1.abc')", globals=globals(), number=n))
# 0.0014168331399559975
# 3.700004890561104e-05
# 0.00046787504106760025
# 6.270897574722767e-05
# 0.000634250001894543
# 4.758400245918892e-05
assert is_enum1("OOPS!!") is is_enum2("OOPS!!") is False
assert is_enum1("TWO IS GREAT") is is_enum2("TWO IS GREAT") is True
assert is_float1("OOPS!!") is is_float2("OOPS!!") is False
assert is_float1("54.321") is is_float2("54.321") is True
assert is_int1('123') is is_int2('123') is True
assert is_int1('1.abc') is is_int2('1.abc') is False Results show that validating values for types like Enum and int is an overall better approach. There are some types like Hope this helps - I currently don't have time to implement this myself, which is why I'm jotting my thoughts down, in case anyone wants to take a stab at it. I'm also going to update the issue with "help wanted" label. For now, I'm going to focus on addressing other issues with the Dataclass Wizard library, but I'll keep a tab on this issue and will hopefully be back to look at it with fresh eyes sometime soon. |
int
, Enum
, Literal
)
Description
Using the dataclass-wizard to deserialize a YAML file fails when trying to use string literals as field options.
In my particular case, a value in the config can be set manually or derived automatically depending on user preference.
In this instance, the only valid string value for
x
should beAuto
. Any other value should fail to deserialize. However, since it appears that the parser is unaware of the possible equality of str/num/bytes to a Literal this fails.What I Did
The text was updated successfully, but these errors were encountered: