Grammatical Features #871

synesthesiam · 2023-02-02T15:04:03Z

synesthesiam
Feb 2, 2023
Maintainer

Let's discuss how to add things like grammatical case, gender, etc. to the intents/responses format. For example, see: #755

Also, let me know if you would be interested in joining a Zoom meeting in the future so we can discuss this in person 🙂

tetele · 2023-02-02T15:20:00Z

tetele
Feb 2, 2023
Collaborator

IMHO, grammars are too diverse to normalize and too complex to code.

Just when you come up with solutions to those issues, along comes the bigger problem of user generated content (i.e. entity names), for which you would also have to use genders, cases etc. And I really don't believe that most users know enough elementary grammar to provide an alias for each case, to state genders of aliases etc. Plus, not all of them use the same case or number for entity names (nouns). E.g. "Kitchen lights", "Fan in the bathroom", "Yard's gate" and "Curtains - left".

In some cases and in some languages, entity names could, themselves, be compound structures (like "kitchen window LED in the lower left"). Figuring out which is the object in that construct is difficult even for a human.

I believe that, rather than having to code extremely extensive grammar rules, we would be better off instructing language leaders to come up with permissive sentences and with responses that require as little declension as possible, like @davefx suggested here.

1 reply

synesthesiam Feb 2, 2023
Maintainer Author

I agree that this may ultimately be out of scope for the project. I do hope that we can find ways of automatically generating different forms of nouns (entity/area names) for some languages where it could be cumbersome to create many different aliases.

BossNeo · 2023-02-02T19:53:26Z

BossNeo
Feb 2, 2023

I think it will be too hard to cover all languages, and I personally don't see the need for it.
The goal is to be a voice assistant, not a chatbot. For me it would be annoying: if I am in the living room and ask the device to turn on the living room lights, I can see that it turned them on, so I don't need a grammatically perfect answer back. It would be even worse if I need to fire off two commands in a row. If my command triggers something in another room, then a pleasant beep is enough.
I only want an answer back on failure, and even then a different beep would work. If someone wants every word they say repeated back, they can get a parrot or a wife 😅.
For me it is better to focus on other tasks in the beginning.
Or let ChatGPT give positive feedback, and negative feedback on failure. At least it would be fun.

3 replies

spuljko Feb 2, 2023
Collaborator

@BossNeo it all depends on the language. Some languages are more complex, and if you say something the wrong way it just doesn't make sense; it is not just a grammar issue.

BossNeo Feb 2, 2023

If you say it wrong, the response will be "Sorry, I didn't understand you", and a single failure beep gives you the same information. I just wanted to point out that for a lot of people a full answer back is annoying. I have never seen a person annoyed by short answers.

D-side Aug 1, 2023

Hard disagree.

Just doing what you told it to is a very early stage of VAs. They can eventually start answering questions about certain things, possibly including information you may not know because it was put in by somebody else (someone other than you, some automation, or just yourself from a few days ago).

Things I've tried with Rhasspy 2 at certain points of my usage were a simplistic inventory system (a glorified KV-store, where K is an item and V its location) and a shopping list (which could further evolve to ERP-grade with Grocy). This was in Russian, which features practically random grammatical genders for nouns and commonly uses various word formations to link words together inside complete sentences.

I had to adapt the thing to the closed-vocabulary training of Kaldi and so I used specific language structures to avoid variations, but usability suffers from it. Now that we have Whisper recognizing arbitrary speech so well so quickly, handling grammar in a similar way no longer seems insurmountable.

And these are just the things I actually tried, off the top of my head; they're a small fraction of the use cases I considered.

synesthesiam · 2023-02-02T21:48:48Z

synesthesiam
Feb 2, 2023
Maintainer Author

Besides grammatically correct responses, there is another possible problem. One of our goals with Assist's sentences is to train better intent matching models, as well as speech to text systems.

The plan is to generate training sentences from the templates. If the generated sentences aren't grammatical, I worry it will degrade performance of the trained system. Like this:

Template: turn on [the|a|an] light
Matches:
- turn on the light
- turn on a light
- turn on an light

The last sentence isn't correct. Maybe not a big deal, but I could see some STT systems wrongly transcribing "turn on Anna's light" as "turn on an light" if they've been trained with this bad data.
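To make the over-generation concrete, here is a minimal sketch of template expansion. It is not the project's actual generator: it assumes that square brackets mark an optional group and parentheses a required one, handles only non-nested groups, and the `expand` helper is a hypothetical name. Under that assumption the template also yields "turn on light" in addition to the three sentences listed above.

```python
import itertools
import re

# One non-nested group: [a|b] is treated as optional, (a|b) as required.
GROUP = re.compile(r"\[([^\[\]]*)\]|\(([^()]*)\)")


def expand(template: str) -> list[str]:
    """Return every sentence that a flat (non-nested) template can produce."""
    parts: list[list[str]] = []
    pos = 0
    for match in GROUP.finditer(template):
        parts.append([template[pos:match.start()]])  # literal text before the group
        if match.group(1) is not None:
            # Optional group: any one alternative, or nothing at all.
            parts.append(match.group(1).split("|") + [""])
        else:
            parts.append(match.group(2).split("|"))
        pos = match.end()
    parts.append([template[pos:]])  # trailing literal text
    # Join each combination and collapse the gap left by a skipped optional group.
    return [" ".join("".join(combo).split()) for combo in itertools.product(*parts)]


for sentence in expand("turn on [the|a|an] light"):
    print(sentence)
```

The expansion is a plain Cartesian product, so it has no way of knowing that "an" and "light" do not go together; any grammatical filtering has to happen in a separate step.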

3 replies

tetele Feb 3, 2023
Collaborator

At the moment, we are allowing (or, at least, I am) most phrases that bear any resemblance to a valid request in order to be able to do what the user wants. If we should be extra careful about all possible combinations in order to prevent those which are not grammatically correct, that should probably be in the docs.

Robin-St Feb 5, 2023

Could this be helped a bit by using some grammar checking tool as an intermediate step? To help clean the dataset of generated sentences from grammatically bad ones before being used to train the STT?

synesthesiam Feb 6, 2023
Maintainer Author

This is the approach I'm planning to take. As @tetele said, we would have to be extremely careful with our sentences if we want all possible sentences to be grammatically correct. But if we have some kind of checker that filters out bad sentences as they're generated, then we can still get a good STT training set.
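As a rough illustration of such a filter, the sketch below drops generated sentences whose indefinite article disagrees with the following word. It is only a toy stand-in for a real grammar checker: the rule is spelling-based, English-only, and the function names are hypothetical.

```python
import re

# Spelling-based stand-in for "starts with a vowel sound". Real English has
# exceptions ("an hour", "a user") that this toy rule gets wrong.
STARTS_WITH_VOWEL = re.compile(r"[aeiou]", re.IGNORECASE)


def articles_agree(sentence: str) -> bool:
    """Reject 'a' before a vowel letter and 'an' before a consonant letter."""
    words = sentence.split()
    for word, following in zip(words, words[1:]):
        vowel_next = bool(STARTS_WITH_VOWEL.match(following))
        if word.lower() == "a" and vowel_next:
            return False
        if word.lower() == "an" and not vowel_next:
            return False
    return True


def filter_for_training(sentences: list[str]) -> list[str]:
    """Keep only the sentences that pass the article check."""
    return [s for s in sentences if articles_agree(s)]


generated = ["turn on the light", "turn on a light", "turn on an light"]
print(filter_for_training(generated))
```

The intent matcher could still accept all three sentences; only the STT training set would be restricted to the filtered list.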

schizza · 2023-02-03T09:42:53Z

schizza
Feb 3, 2023
Collaborator

In Czech, we have entities and areas in the nominative. In most cases we trigger an action using the accusative, but not always. It depends on how the entity or area is named, and on which type of sentence the user uses. On the other hand, the response will almost always be in the accusative, so it depends on how the intent was triggered. Sometimes we need to answer with an adjective.
So even if we use a grammatically correct intent, we can get a completely garbled answer, because the Czech language is so complex.
And we cannot use generic rules; we have too many rules and exceptions.
In the end, we would be talking to HA like robots and getting even more robotic answers.

For example:

I have a lamp named lampa na nočním stolku in HA. It is an entity name, so it is a single lamp.
To turn it on I need to say rozsviť lampu na nočním stolku, and the answer would be lampa na nočním stolku byla zapnuta.

So for now the parser must know that lampu is the same name as lampa, only in the accusative, and that the answer must come back in the nominative.
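One way to picture what the parser would need is a lookup from every known surface form back to the entity. The sketch below is only an illustration: the entity id `light.bedside_lamp` and the `NAME_FORMS` structure are hypothetical, and the two name forms are the ones from the example above.

```python
# Hypothetical entity id; the two name forms come from the Czech example.
NAME_FORMS = {
    "light.bedside_lamp": {
        "nominative": "lampa na nočním stolku",
        "accusative": "lampu na nočním stolku",
    },
}


def build_index(name_forms):
    """Map every known surface form to (entity_id, case)."""
    return {
        text: (entity_id, case)
        for entity_id, forms in name_forms.items()
        for case, text in forms.items()
    }


def find_entity(utterance, index):
    """Return (entity_id, case) for the longest known form found in the utterance."""
    for text in sorted(index, key=len, reverse=True):
        if text in utterance:
            return index[text]
    return None


index = build_index(NAME_FORMS)
entity_id, case = find_entity("rozsviť lampu na nočním stolku", index)
# The command used the accusative, but the response is built from the nominative.
response = f"{NAME_FORMS[entity_id]['nominative']} byla zapnuta"
print(entity_id, case, response)
```

Where the extra forms would come from (user input, a morphological tool, or per-language rules) is exactly the open question in this thread; the sketch simply assumes they are available.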

We have entities like malé světlo v koupelně and velké světlo v koupelně grouped in the area koupelna, all in the nominative case, all nouns.
To turn on the lights in the area we say rozsviť světla v koupelně. Now the area is an adverbial, so the parser needs to know that the noun koupelna has the adverbial form koupelně in order to parse the intent correctly. And the answer has to be Světla v koupelně byla zapnuta.
On the other hand, if I want to turn on just the single light malé světlo v koupelně, I would say zapni malé světlo v koupelně and the answer will be malé světlo v koupelně bylo zapnuto, and voilà, all sentences are in the nominative.

The situation is more dramatic for brightness, as I mentioned in #755: there, intents and answers are completely mismatched when it comes to cases and adverbs.

And I can see similar problems in other languages with gender and so on.
So a little help with parsing sentences and producing correct answers would be welcome.
Maybe it would be nice to be able to set adverbs, nominatives, adjectives, and genders for generic locations and generic entity names, and to provide a hint to the parser about the form in which the response should be rendered.

0 replies

peter-dolkens · 2023-02-03T20:21:44Z

peter-dolkens
Feb 3, 2023

It may be interesting to allow a pluggable nature to this solution.

Start with the basic rule-engine that HA is building now for basic intent/responses, then add the option to post-process the responses with a grammar-engine.

Given the variety of grammar across different languages, I don't think any one solution is going to be "perfect", so a pluggable nature makes sense.

For example, an English/GPT-based engine might do the following:

Please write the following with better grammar

Lounge light 40% bright
"Set the brightness of the lounge light to 40%."

Lounge light made 40%
"The lounge light was adjusted to 40% brightness."

Amber is in away
"Amber is away."

Amber is in Bedroom
"Amber is in the bedroom."

0 replies

synesthesiam · 2023-02-06T02:46:04Z

synesthesiam
Feb 6, 2023
Maintainer Author

Related to replies from both @schizza and @peter-dolkens, we could have a pluggable post-processing step for each language that adds "hints" used when selecting the response.

I'm thinking of something trained from the Universal Dependencies corpus. Basically a classifier that guesses part of speech, gender, case, etc. And then the response selection could have a set of rules for deciding how to generate the response based on those hints.
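A minimal sketch of the rule side of this idea, using French as the example language. The classifier is not implemented here: the `hints` dictionary is a hypothetical format that stands in for its output, and the fallback wording is only an assumption about what a neutral response could look like.

```python
# Response fragments keyed by (gender, number). The hint format is hypothetical.
ARTICLE = {("m", "sing"): "le", ("f", "sing"): "la", ("m", "plur"): "les", ("f", "plur"): "les"}
PARTICIPLE = {
    ("m", "sing"): "allumé",
    ("f", "sing"): "allumée",
    ("m", "plur"): "allumés",
    ("f", "plur"): "allumées",
}
VERB = {"sing": "est", "plur": "sont"}


def render_turned_on(name: str, hints: dict) -> str:
    """Pick article, verb and participle from the hints, with a neutral fallback."""
    key = (hints.get("gender"), hints.get("number"))
    if key not in PARTICIPLE:
        # No usable hints: avoid agreement with the name by talking about "the device".
        return f"L'appareil {name} est allumé"
    article = ARTICLE[key].capitalize()
    return f"{article} {name} {VERB[key[1]]} {PARTICIPLE[key]}"


print(render_turned_on("lampe", {"gender": "f", "number": "sing"}))
print(render_turned_on("spots", {"gender": "m", "number": "plur"}))
print(render_turned_on("LED", {}))
```

The sketch ignores elision (l' before a vowel) and case, so a real rule set per language would need more dimensions than the two shown here.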

2 replies

skynetua Feb 11, 2023
Collaborator

@peter-dolkens's suggestion is good for responses. Regarding the sentence recognition issues mentioned by @schizza, the easiest way is probably to add some per-language rules that make the matcher less strict for names and areas, like skipping word endings (1-2 characters).
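A rough sketch of what ignoring the last one or two characters could look like. The function names and the `max_trim` / `min_stem` parameters are hypothetical, and the thresholds are guesses that would need tuning per language.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Length of the shared prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def words_match(spoken: str, stored: str, max_trim: int = 2, min_stem: int = 3) -> bool:
    """True if the words are equal or differ only in their last `max_trim` characters."""
    spoken, stored = spoken.lower(), stored.lower()
    if spoken == stored:
        return True  # short words such as "na" must still match themselves
    stem = common_prefix_len(spoken, stored)
    return (
        stem >= min_stem
        and len(spoken) - stem <= max_trim
        and len(stored) - stem <= max_trim
    )


def names_match(spoken: str, stored: str) -> bool:
    """Compare two names word by word with the relaxed ending rule."""
    a, b = spoken.split(), stored.split()
    return len(a) == len(b) and all(words_match(x, y) for x, y in zip(a, b))


print(names_match("lampu na nočním stolku", "lampa na nočním stolku"))
print(names_match("хола", "хол"))
print(names_match("lampu", "lustr"))
```

The trade-off is false positives: two different names that share a long stem would be treated as the same, and words whose root changes when inflected would still be missed.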

skynetua Feb 11, 2023
Collaborator

We have a related WIP PR to eliminate this: #164

tetele · 2023-02-16T11:29:24Z

tetele
Feb 16, 2023
Collaborator

What do you think about my proposal here? https://community.home-assistant.io/t/entity-user-defined-metadata-like-noun-gender-or-number-for-localization-or-user-generated-names/535963

This would be a core feature that the intents could leverage to fix some of the issues we're having.

Concerning noun cases, we could provide another entity metadata field (this time, completely user-generated text) where users could provide the different case forms for their entity names. If they don't, then they will get a slightly worse experience in terms of responses, using only the base form.
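As a sketch of the fallback behaviour described here, the snippet below stores optional case forms next to the base name and falls back to the base form when a case is missing. `EntityName` and its fields are hypothetical and not an existing Home Assistant structure; the Czech forms are borrowed from the example earlier in the thread.

```python
from dataclasses import dataclass, field


@dataclass
class EntityName:
    base: str  # the name as the user typed it
    forms: dict[str, str] = field(default_factory=dict)  # optional, user-provided

    def in_case(self, case: str) -> str:
        """Return the requested case form, or the base form if none was provided."""
        return self.forms.get(case, self.base)


lamp = EntityName("lampa na nočním stolku", {"accusative": "lampu na nočním stolku"})
plain = EntityName("Kitchen lights")

print(lamp.in_case("accusative"))   # a form the user provided
print(lamp.in_case("genitive"))     # not provided, so the base form is used
print(plain.in_case("accusative"))  # no forms at all, so the base form is used
```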

1 reply

davefx Feb 24, 2023
Collaborator

This would be overkill for the user.
I would expect these advanced grammar transformations to be done automatically.

davefx · 2023-02-24T14:52:47Z

davefx
Feb 24, 2023
Collaborator

There must be some kind of grammar database/engine (at least, for most popular and used languages) that detects the grammatical characteristics of a given word, and that is also able to transform any other given word to the provided case/number/gender.

Using such a grammar engine, we could somehow provide hints in the sentences and in the generated responses so:

Only words with the correct case/number/gender match are recognized.
We can generate responses using the correct case/number/gender (for example, forcing the device name to use a given case, or forcing the status name to match with the device gender and number...)

Unfortunately, I'm not an expert in this, so I don't know of any grammar engines, but IMHO this should be the path to follow.

0 replies

flexy2dd · 2023-02-27T09:29:16Z

flexy2dd
Feb 27, 2023

In French, gender is important and we would like to include it in the answers.
Indeed, the answer will be more precise and clear if we say
"Yes, the front door is unlocked" rather than "Yes, the front door device is unlocked",
but for that you have to determine the gender of the entity.
I was thinking of adding a list in _common, as this seems to me the simplest to manage later in the response templates.

ex:

lists:
  gender_entities:
    values:
      - in: "(porte[s])"
        out: "f"
      - in: "(fenêtre[s])"
        out: "f"
      - in: "portail[s]"
        out: "m"
      - in: "(lumière[s]|lampe[s]|ampoule[s])"
        out: "f"
      - in: "(spot[s]|lustre[s])"
        out: "m"

I tried to implement it, but it doesn't seem possible to access a list defined in _common from the response templates.

Example in the HassGetState responses:

any: |
  {% if query.matched %}
    {% set match = query.matched | map(attribute="name") | sort | list %}
    {% set count_match = no_match | length | int %}
    {% if match | length > 4 %}
      Oui, {{ match[:3] | join(", ") }} et {{ (match | length - 3) }} autres
    {% elif no_match | length == 0 %}
      Oui, 
      {% if match | length > 4 %}
      {% for name in match -%}
        {% if not loop.first and not loop.last %}, {% elif loop.last and not loop.first %} et {% endif -%}

        {# gender determination #}
        {% set gender = gender_entities:name | default('NEUTRAL') | upper %}
        {% if gender == 'F' %}
          la
        {% elif gender == 'M' %}
          le
        {% else %}
          l'appareil
        {% endif %}

        {{ name }}
      {% endfor %}
      est {{ state.state_with_unit }}
    {%- else -%}
      Oui, les appareils
      {% for name in match -%}
        {% if not loop.first and not loop.last %}, {% elif loop.last and not loop.first %} et {% endif -%}
        {{ name }}
      {% endfor %}
      sont {{ state.state_with_unit }}s
    {% endif %}
  {% else %}
    Non
  {% endif %}

what do you think?

5 replies

tetele Feb 27, 2023
Collaborator

Your proposal assumes that users do not name their entities anything other than the short list you have above, right? Case in point, in your proposal all lights are feminine, whereas if I had a light called "[le] LED du salle de bain", it would fail. Even if you would have defined LED as masculine in your list, I don't know if you can parse well enough to determine which of the objects is the subject, in nominative form (e.g. "la luminosité du LED" should be feminine).

User generated content is the killer here (entity names and aliases).

flexy2dd Feb 27, 2023

Yes, that's true indeed; I hadn't seen it from that perspective.
Perhaps the ideal would be to be able to define gender at the entity level (like aliases).
This part of the conversation assistant is really complicated.

tetele Feb 27, 2023
Collaborator

See my post above on exactly this proposal (it's modeled in French, btw).

davefx Feb 27, 2023
Collaborator

I'm afraid that your solution would be only valid for languages that have this problem only with gender & number concordance (French, Italian, Spanish, Portuguese...). These are the easy case here, as we'll only have to worry about which adjective (from a set of four) will have the correct gender & number concordance. But many languages will have to decline the device name in order to transform it into the correct case.

Because of this, IMHO Home Assistant shouldn't rely on having users that fill in, for every device name, if its grammatical gender is masculine or feminine, or if its number is singular or plural, or how to decline it in all the needed cases. This is something that should be done automatically.

I know this is difficult, but until then, I think we should just try to use sentences that are correct no matter the gender/number of the device names.

tetele Feb 27, 2023
Collaborator

> I'm afraid that your solution would be only valid for languages that have this problem only with gender & number concordance (French, Italian, Spanish, Portuguese...).

Which is why my proposal was not limited to number of genders, but rather to "entity metadata (like number and gender)", as the title suggests. Further on, "each language defines a set of taxonomies for entity names." - I haven't restricted the number of taxonomies. Forms for definite/indefinite articles are an example. Noun cases are another, which brings us to...

> But many languages will have to decline the device name in order to transform it into the correct case.

...or simply define aliases for other cases and specify the cases as alias-associated metadata, manually or automatically (see below). Then, in the intent matcher, you could simply ask HA for the correct gender/number/case/other-grammatical-feature for the response you want to offer. Grammatically incorrect fallbacks would be required, of course (i.e. if no extra aliases are provided for an entity name, use the default, which is believed to be in nominative, singular etc.). Domain-specific nouns with many declensions are another idea (e.g. "light", "the light", "lights", "the lights", "the light's", "the lights'" just to mention combinations of article, number and nominative/genitive case in English).

English could also be a beneficiary, although its grammar rules are far simpler and don't actually need this treatment:

prepend "the" for definite article, "a/an" (depending on whether the word starts with a vowel) for indefinite article
for 98% of nouns, append "s" for plural
append "'s" for genitive singular and "'" for genitive plural
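The three English rules above can be sketched in a few lines. This is only an illustration of the rules as stated: `english_forms` is a hypothetical helper, the vowel test is spelling-based, and irregular plurals (the remaining nouns outside the "98%") are not handled.

```python
def english_forms(noun: str) -> dict[str, str]:
    """Derive article, number and genitive forms with the simple rules listed above."""
    starts_with_vowel = noun[0].lower() in "aeiou"
    plural = noun + "s"  # regular plural only
    return {
        "definite": "the " + noun,
        "indefinite": ("an " if starts_with_vowel else "a ") + noun,
        "plural": plural,
        "genitive_singular": noun + "'s",
        "genitive_plural": plural + "'",
    }


for label, form in english_forms("light").items():
    print(f"{label}: {form}")
```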

> This is something that should be done automatically.

100% agreed. Even when discussing the automation of this process, there still needs to be underlying architectural support for such concepts.

Like I said, "In the future, I would like to see HA try to tap into entity renaming events to pull the new name and see if it can figure out a gender/number/whatever for it. Users can override these auto-detected values, but with a good detector, we would have 90% of the issue solved.". You (the user) write in an entity name or alias and HA could do any of the following automatically:

determine taxonomy values for the current language, given the entity name
extract the base word, which should be the principal object in the entity name/alias
propose and/or store declensions (i.e. additional taxonomy values) for the extracted base word based on grammatical engines for that language
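A small sketch of "auto-detect, but let the user override", limited to gender. The lexicon reuses the French words from the list proposed earlier in the thread; the detector (first known word wins) and the function names are hypothetical and far cruder than a real grammatical engine.

```python
# Tiny lexicon taken from the French gender list proposed earlier in the thread.
GENDER_LEXICON = {
    "porte": "f", "fenêtre": "f", "portail": "m", "lumière": "f",
    "lampe": "f", "ampoule": "f", "spot": "m", "lustre": "m",
}


def detect_gender(name: str):
    """Guess the gender from the first word found in the lexicon, else None."""
    for word in name.lower().split():
        singular = word[:-1] if word.endswith("s") else word
        for candidate in (word, singular):
            if candidate in GENDER_LEXICON:
                return GENDER_LEXICON[candidate]
    return None


def gender_for(name: str, overrides: dict):
    """A user-provided value always wins over the auto-detected one."""
    return overrides.get(name, detect_gender(name))


overrides = {"LED du salle de bain": "m"}
print(detect_gender("porte du garage"))
print(detect_gender("LED du salle de bain"))
print(gender_for("LED du salle de bain", overrides))
```

Taking the first known word is a naive way to pick the principal object; names where the head noun is not the first known word would need real parsing.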

> I know this is difficult, but until then, I think we should just try to use sentences that are correct no matter the gender/number of the device names.

In some languages, that's either very hard or not possible without having some very long constructs that defy the purpose of having brief answers.

We're on the same page here, but just like no actual voice support was added to HA in February, Year-of-the-voice A.D., it's the same with this - I'm merely trying to suggest a way of addressing this issue properly and in an extensible manner. It's by no means exhaustive, it's just what I believe to be a good enough foundation.

HarvsG · 2023-12-20T16:56:14Z

HarvsG
Dec 20, 2023

On the use of tenses:

At the moment assist says "Turned on light"
Ok google says "Sure, turning on the hallway light"

To me, "turned on light" implies success - is the intent platform actually confirming success before issuing a TTS response? If not then the past tense seems inappropriate.
If the commands are optimistic, using a similar present-tense approach seems most appropriate.
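A minimal sketch of choosing the tense from what is actually known. The `confirmed` flag is a hypothetical input, since whether the intent platform can report success before the TTS response is exactly the open question here, and the failure wording is made up for illustration.

```python
from typing import Optional


def turn_on_response(name: str, confirmed: Optional[bool]) -> str:
    """Use the past tense only when success is known; None means 'not known yet'."""
    if confirmed is True:
        return f"Turned on the {name}"
    if confirmed is False:
        return f"Sorry, I couldn't turn on the {name}"
    return f"Turning on the {name}"


print(turn_on_response("hallway light", None))
print(turn_on_response("hallway light", True))
```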

0 replies

HarvsG · 2023-12-20T17:03:56Z

HarvsG
Dec 20, 2023

On the use of articles:
#1539 Adds the word "the" to the response "Turned on lights"
To me, the definite article should be used in declarative sentences like this.
At least one grammar checker agrees:

An alternative solution would be "Lights turned on", which would be correct, although it is unclear whether it is an exclamation like "Lights turned on!?" (as a HA user with a broken automation might say) or an incorrect contraction of the passive-voice statement "The lights were turned on by (me/the voice assistant)".

I suggest that articles be used where possible, at least in English, although it will add plural complexity in other languages (e.g. le vs. les).

0 replies

ddppddpp · 2024-01-17T09:25:50Z

ddppddpp
Jan 17, 2024

Can you please suggest how to treat articles in non-English environments?
In English, articles are separate words (a/an/the), so we can define skip words; in Bulgarian, however, they are suffixes attached to the word itself.
Example - living room can be хол, but 'the livingroom' should be 'холът' or 'хола'.
I think that defining multiple aliases per entity would be too much to ask for.

1 reply

v1k70rk4 Jan 23, 2024
Collaborator

> I think that defining multiple aliases per entity would be too much to ask for.

This can be resolved in the common.yaml file. In the Hungarian language, there are prefixes and suffixes as well:

area: "[<area_prefix>] {area}[(<area_suffix>| <area_filler>)]"
area_prefix: "(в|на|във)"
area_suffix: "(ът|а)"
area_filler: "стаята"

Thus, it recognizes the text "в хола" as well.
This assumes that the root of the word doesn't change, which it sometimes does in Hungarian.
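To show what the rule above accepts, here is a rough regular-expression equivalent. It is only a sketch: the real matcher does not work this way, `area_regex` is a hypothetical helper, and the prefix, suffix and filler values are copied from the snippet above.

```python
import re

AREA_PREFIX = ["в", "на", "във"]
AREA_SUFFIX = ["ът", "а"]
AREA_FILLER = "стаята"


def area_regex(area: str) -> re.Pattern:
    """Optional prefix word, the area name, then an optional suffix or filler word."""
    prefix = "|".join(map(re.escape, AREA_PREFIX))
    suffix = "|".join(map(re.escape, AREA_SUFFIX))
    return re.compile(
        rf"(?:(?:{prefix}) )?{re.escape(area)}(?:{suffix}| {re.escape(AREA_FILLER)})?"
    )


pattern = area_regex("хол")
for text in ["хол", "холът", "в хола", "хол стаята", "в кухнята"]:
    print(text, bool(pattern.fullmatch(text)))
```

Only the last phrase is rejected, because it names a different area; the suffix is glued directly to the area name, so a word whose root changes when inflected would not match.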

If the root of the word changes as well, then there is a problem. In Hungarian, this can be solved with a filler word, for example, in the case of 'háló' (bedroom), you can say 'turn on the light in the háló helyiségben/szobában' (in the bedroom) so that you don't have to inflect the word 'háló'.

Generating a response is a more difficult question.

ViViDboarder · 2024-06-07T16:24:29Z

ViViDboarder
Jun 7, 2024
Collaborator

My understanding is that there are 3 different purposes for sentences that we are trying to represent in this repo:

TTS responses
Intent capture
STT training

I don't believe that each of these would benefit from having a universal strict grammatical accuracy requirement.

TTS responses

I believe these make the most sense to be strictly grammatically correct. In the case of Home Assistant, this will have to be "best guess" because we can't quite know every entity name. Someone may have a plural entity where we generally assume singular. Short of having an attribute associated with entities to indicate plurality, this is something we should more or less ignore.

Intent capture

In my opinion, the only absolute requirement for intent capture is that it be unambiguous. We should lean towards not being overly permissive, and limit ourselves to including common grammatical errors. I believe this should be the target because not all speakers will have the same command of the target language. For example, if my toddler were to say "Is all the lights on?", while clearly not proper, it's unambiguous and can be matched to an intent.

STT training

I understand the desire to use proper grammar to train STT recognition models since we'd want to assume proper grammar first, however the fact is that some people still speak with grammatical errors. This, at the very least, should not impede the ability to match intents. I saw a suggestion of running permuted sentences through a grammar checker before training, which makes sense to me and would be a good way to meet both use cases without harm.

My suggestion can be summarized as:

TTS responses - Must be strictly correct (where possible)
Intent capture - Must be unambiguous, should be grammatically correct or a common grammatical error
STT training - Should be grammatically correct, can be filtered from list of intent sentences if adding a grammar checker to the sentence generation system (does this exist today?)

I believe this is the same thing suggested in [this thread], however I wanted to try to write it up in a single post to make it clearer what the consensus is.

0 replies
