Grammatical Features #871
Replies: 13 comments 16 replies
-
IMHO, grammars are too diverse to normalize and too complex to code. Just when you come up with solutions to those issues, along comes the bigger problem of user-generated content (i.e. entity names), for which you would also have to handle gender, case, etc. And I really don't believe that most users know enough elementary grammar to provide an alias for each case, to state the gender of each alias, and so on. Plus, not all of them use the same case or number for entity names (nouns). E.g. "Kitchen lights", "Fan in the bathroom", "Yard's gate" and "Curtains - left". In some cases and in some languages, entity names could themselves be compound structures (like "kitchen window LED in the lower left"). Figuring out which word is the object in that construct is difficult even for a human. I believe that, rather than having to code extremely extensive grammar rules, we would be better off instructing language leaders to come up with permissive sentences and with responses that require as little declension as possible, like @davefx suggested here.
-
I think it will be too hard to cover all languages, and I personally don't see the need for it.
-
Besides grammatically correct responses, there is another possible problem. One of our goals with Assist's sentences is to train better intent matching models, as well as speech to text systems. The plan is to generate training sentences from the templates. If the generated sentences aren't grammatical, I worry it will degrade performance of the trained system. Like this: Template: The last sentence isn't correct. Maybe not a big deal, but I could see some STT systems wrongly transcribing "turn on Anna's light" as "turn on an light" if they've been trained with this bad data.
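To make the concern concrete, here is a minimal sketch of naive template expansion. The template, the slot values, and the `expand` and `article_ok` helpers are all hypothetical and only illustrate the idea; this is not how the actual sentence generator works.

```python
from itertools import product


def expand(template_parts):
    """Expand a template given as a list of alternatives per slot."""
    return [" ".join(words) for words in product(*template_parts)]


# Hypothetical template, roughly "turn on (a | an) {name}"
template = [["turn on"], ["a", "an"], ["light", "oven"]]
sentences = expand(template)


def article_ok(sentence):
    """Crude English check: 'an' before a vowel letter, 'a' otherwise."""
    words = sentence.split()
    for article, following in zip(words, words[1:]):
        if article in ("a", "an"):
            starts_with_vowel = following[0] in "aeiou"
            if (article == "an") != starts_with_vowel:
                return False
    return True


# Keep only the sentences that pass the crude article check
grammatical = [s for s in sentences if article_ok(s)]
```

Half of the expanded sentences ("turn on a oven", "turn on an light") are ungrammatical, and those are the ones that would end up as bad training data unless something filters them out.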
-
In Czech, entities and areas are named in the nominative. In most cases we trigger an action using the accusative, but not always; it depends on how the entity or area is named and on which type of sentence the user uses. The response, on the other hand, is almost always in the accusative, so it depends on how the intent was triggered. Sometimes we need to answer with an adjective, as in this example: I have a lamp named in HA So for now in the parser we have entities like: The situation is more dramatic for brightness, as I mentioned in #755: the intents and answers are completely mismatched when it comes to cases and adverbs. I can also see other languages having further problems with gender and so on.
-
It may be interesting to make this solution pluggable. Start with the basic rule engine that HA is building now for basic intents/responses, then add the option to post-process the responses with a grammar engine. Given the variety of grammar across different languages, I don't think any one solution is going to be "perfect", so a pluggable design makes sense. For example, an English/GPT-based engine might do the following:
-
Related to replies from both @schizza and @peter-dolkens, we could have a pluggable post-processing step for each language that adds "hints" used when selecting the response. I'm thinking of something trained from the Universal Dependencies corpus. Basically a classifier that guesses part of speech, gender, case, etc. And then the response selection could have a set of rules for deciding how to generate the response based on those hints.
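A rough sketch of what hint-based response selection could look like. Everything here is an assumption for illustration: `guess_hints` is a tiny hand-written lookup standing in for a trained classifier, and the `Hints` fields, the French response variants, and the fallback wording are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hints:
    gender: str = "unknown"  # e.g. "masc", "fem"
    number: str = "sing"     # "sing" or "plur"


def guess_hints(name: str) -> Hints:
    """Stand-in for a trained classifier: a tiny hypothetical French lexicon."""
    lexicon = {
        "lampe": Hints("fem", "sing"),
        "ventilateur": Hints("masc", "sing"),
        "lumières": Hints("fem", "plur"),
    }
    head = name.lower().split()[0]  # naive: assume the first word is the head noun
    return lexicon.get(head, Hints())


# Response variants keyed by (gender, number), plus a neutral fallback
RESPONSES = {
    ("fem", "sing"): "La {name} est allumée",
    ("masc", "sing"): "Le {name} est allumé",
    ("fem", "plur"): "Les {name} sont allumées",
}
FALLBACK = "{name} : allumé"


def select_response(name: str) -> str:
    """Pick the response variant that matches the guessed hints."""
    hints = guess_hints(name)
    template = RESPONSES.get((hints.gender, hints.number), FALLBACK)
    return template.format(name=name)
```

With this, `select_response("lampe du salon")` picks the feminine singular variant, while a name the classifier knows nothing about falls back to the neutral wording. The point is only that the rules live in the response selection and the classifier is swappable per language.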
-
What do you think about my proposal here? https://community.home-assistant.io/t/entity-user-defined-metadata-like-noun-gender-or-number-for-localization-or-user-generated-names/535963 This would be a core feature that the intents could leverage to fix some of the issues we're having. Concerning noun cases, we could provide another entity metadata field (this time, completely user-generated text) where users could provide the different case forms for their entity names. If they don't, then they will get a slightly worse experience in terms of responses, because only the base form will be used.
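As a minimal sketch of that fallback behaviour (the `EntityMeta` class, its field names, and the `"acc"` key are hypothetical, not an existing Home Assistant API):

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class EntityMeta:
    name: str                      # base (nominative) form of the entity name
    gender: Optional[str] = None   # optional, user-provided
    case_forms: Dict[str, str] = field(default_factory=dict)  # optional, user-provided

    def form(self, case: str) -> str:
        """Return the user-provided form for a case, or the base form if missing."""
        return self.case_forms.get(case, self.name)


# A user who filled in the accusative form (Czech "lampa" -> "lampu")
lamp = EntityMeta("lampa", gender="fem", case_forms={"acc": "lampu"})
# A user who provided nothing beyond the name
plain = EntityMeta("lampa")
```

Here `lamp.form("acc")` gives "lampu", while `plain.form("acc")` quietly falls back to "lampa", which is the slightly worse experience described above.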
-
There must be some kind of grammar database/engine (at least for the most widely used languages) that detects the grammatical characteristics of a given word and that can also transform any other given word to the required case/number/gender. Using such a grammar engine, we could somehow provide hints in the sentences and in the generated responses so:
Unfortunately, I'm not an expert in this, so I don't know of any grammar engines, but IMHO this should be the path to follow.
-
In French, gender is important, and we would like to reflect it in the answers. For example:
I tried to implement it, but it doesn't seem possible to access a list defined in _common from the response templates, for example in the HassGetState responses.
What do you think?
-
On the use of tenses: At the moment Assist says "Turned on light". To me, "turned on light" implies success. Is the intent platform actually confirming success before issuing a TTS response? If not, then the past tense seems inappropriate.
-
On the use of articles: An alternative solution would be "Lights turned on", which would be correct, although it is unclear whether it is an exclamation like "Lights turned on!?" (as an HA user with a broken automation might say) or an incorrect contraction of the passive-voice statement "The lights were turned on by (me/the voice assistant)". I suggest that articles be used where possible, at least in English, although it will add plural complexity in other languages (e.g. le vs. les).
-
Can you please suggest how to treat articles in non-English environments?
-
My understanding is that there are 3 different purposes for sentences that we are trying to represent in this repo:
I don't believe that each of these would benefit from a universal requirement of strict grammatical accuracy.

TTS responses
I believe these make the most sense to be strictly grammatically correct. In the case of Home Assistant, this will have to be a "best guess" because we can't know every entity name. Someone may have a plural entity where we generally assume singular. Short of having an attribute associated with entities to indicate plurality, this is something we should more or less ignore.

Intent capture
In my opinion, the only absolute requirement for intent capture is that it be unambiguous. It should lean towards not being overly permissive, tolerating grammatical errors and nothing more. I believe this should be the target because not all speakers will have the same command of the target language. For example, if my toddler were to say "Is all the lights on?", while clearly not proper, it's unambiguous and can be matched to an intent.

STT training
I understand the desire to use proper grammar to train STT recognition models, since we'd want to assume proper grammar first; however, the fact is that some people still speak with grammatical errors. This, at the very least, should not impede the ability to match intents. I saw a suggestion of running permuted sentences through a grammar checker before training, which makes sense to me and would be a good way to meet both use cases without harm.

My suggestion can be summarized as:
I believe this is the same thing suggested in [this thread], however I wanted to try to write it up in a single post to make it clearer what the consensus is.
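For the intent capture part, a small sketch of what "permissive but unambiguous" could mean in practice. The regular expression and the `match_intent` helper are assumptions for illustration only, and the intent name is used here just as a label; the real matcher works from sentence templates, not from a hand-written regex.

```python
import re

# Hypothetical permissive pattern: accepts "is" as well as "are"
ALL_LIGHTS_ON = re.compile(
    r"^(is|are) all (of )?the lights on\??$", re.IGNORECASE
)


def match_intent(text):
    """Return an intent label if the text matches, else None."""
    if ALL_LIGHTS_ON.match(text.strip()):
        return "HassGetState"
    return None
```

Both "Are all the lights on?" and the toddler's "Is all the lights on?" match the same intent, while unrelated text such as "turn on the lights" does not match at all.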
-
Let's discuss how to add things like grammatical case, gender, etc. to the intents/responses format. For example, see: #755
Also, let me know if you would be interested in joining a Zoom meeting in the future so we can discuss this in person 🙂