Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Echo engine endpoint coverage #80

Merged
merged 11 commits into from
Aug 22, 2023
Merged

Conversation

Enkidu93
Copy link
Collaborator

@Enkidu93 Enkidu93 commented Aug 10, 2023

Slightly more detailed Echo documentation as well as added endpoints (train-segment and get-word-graph). Also, fixed the error checking so that matching target and source languages for echo engines and their corpora is allowed as per Damien's concern.


This change is Reviewable

@johnml1135
Copy link
Collaborator

src/Serval.Client/Client.g.cs line 811 at r2 (raw file):

        /// <br/>return empty responses. Endpoints like translate and get-word-graph echo the sent content back to the user
        /// <br/>in a format that mocks Nmt or Smt. For example, translating a segment "test" with the Echo engine would
        /// <br/>yield a translation response with translation "test". This engine is useful for debugging and testing purposes.

We probably should add a comment that Echo engines allow the same language for source and target, unlike other translation engines. Or inversely, for NMT and SMT engines that the languages need to be different.

@johnml1135
Copy link
Collaborator

:lgtm:

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r2, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit and @Enkidu93)

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 5 files at r2, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @Enkidu93 and @johnml1135)


samples/EchoTranslationEngine/TranslationEngineServiceV1.cs line 110 at r2 (raw file):

                                    string[] targetLines = await File.ReadAllLinesAsync(targetPath, cancellationToken);
                                    bool isTabSeparated =
                                        (sourceLines.Length > 0) && (sourceLines[0].Split('\t').Length > 1);

It would be better to use Contains().


samples/EchoTranslationEngine/TranslationEngineServiceV1.cs line 120 at r2 (raw file):

                                        )
                                        {
                                            if (sourceLine.Length > 0)

Pretranslations should only be generated for verses that do not have a target translation.


samples/EchoTranslationEngine/TranslationEngineServiceV1.cs line 265 at r2 (raw file):

                    Arcs =
                    {
                        from index in Enumerable.Range(0, tokens.Length - 1)

To make the codebase consistent, we do not use the query syntax for LINQ. We always use the functional syntax, i.e. chained function calls.


src/Serval.Client/Client.g.cs line 811 at r2 (raw file):

Previously, johnml1135 (John Lambert) wrote…

We probably should add a comment that Echo engines allow the same language for source and target, unlike other translation engines. Or inversely, for NMT and SMT engines that the languages need to be different.

In fact, the echo engine requires that the source and target language are the same.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, please try to avoid merging the main branch into feature branches. This can make it a lot harder to review the changes. Also, rebasing is preferred, since it results in a more straightforward history.

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @Enkidu93 and @johnml1135)

@Enkidu93
Copy link
Collaborator Author

samples/EchoTranslationEngine/TranslationEngineServiceV1.cs line 120 at r2 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Pretranslations should only be generated for verses that do not have a target translation.

Is this the behavior of NMT? Notice too that coupled with `targetLine.Length > 0 ? ...", non-empty lines are just getting the targetLine as a pretranslation. The only reason I altered this was because Peter Chapman told me about/showed me output from NMT, and he wanted Echo to be consistent with it: #66 (comment). Let me know.

@Enkidu93
Copy link
Collaborator Author

src/Serval.Client/Client.g.cs line 811 at r2 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

In fact, the echo engine requires that the source and target language are the same.

Do you want this to be a "requires" that has the force of a validation check and an error thrown? As it is, I've had to add specific "is it Echo?" checks to get around this rule on the positive (i.e., yes, you may have the same source and target language) side which seems semantically out of place in the Serval code. I'm happy to do so, but if we think that long-term there may be many different kinds of engines with different behaviors, it may be worth rethinking how this is done and maybe make some-kind of engine-specific call to check validity or push the error-checking elsewhere. (I could think of something that's maybe a 'stylistic' translation that translates from one register to another within the same language that would require the same source and target languages...maybe that's a bad example, but you get the point).

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @Enkidu93 and @johnml1135)


samples/EchoTranslationEngine/TranslationEngineServiceV1.cs line 120 at r2 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Is this the behavior of NMT? Notice too that coupled with `targetLine.Length > 0 ? ...", non-empty lines are just getting the targetLine as a pretranslation. The only reason I altered this was because Peter Chapman told me about/showed me output from NMT, and he wanted Echo to be consistent with it: #66 (comment). Let me know.

I checked the NMT code and you are correct. I wasn't aware that the behavior of NMT was changed. It doesn't really make sense to generate a pretranslation for a verse that already has one, so we should change the NMT behavior back to the way it was. Go ahead and revert this back to the original behavior.


src/Serval.Client/Client.g.cs line 811 at r2 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Do you want this to be a "requires" that has the force of a validation check and an error thrown? As it is, I've had to add specific "is it Echo?" checks to get around this rule on the positive (i.e., yes, you may have the same source and target language) side which seems semantically out of place in the Serval code. I'm happy to do so, but if we think that long-term there may be many different kinds of engines with different behaviors, it may be worth rethinking how this is done and maybe make some-kind of engine-specific call to check validity or push the error-checking elsewhere. (I could think of something that's maybe a 'stylistic' translation that translates from one register to another within the same language that would require the same source and target languages...maybe that's a bad example, but you get the point).

We don't need a validation check, since it is only used for testing purposes. I would say that engine-specific validation like this should be done by the Grpc Create request, so in the case of the echo engine, we could validate that the source and target language are the same in the TranslationEngineServiceV1.Create() method. Serval shouldn't do any specific validation for the engines, since we want to keep Serval and the engines loosely coupled.

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @Enkidu93)

@Enkidu93
Copy link
Collaborator Author

@ddaspit I think everything you mentioned has been fixed.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r3, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @Enkidu93)


samples/EchoTranslationEngine/TranslationEngineServiceV1.cs line 16 at r3 (raw file):

    {
        if (request.SourceLanguage != request.TargetLanguage)
            throw new ArgumentException("Source and target languages must be the same");

I believe that you have to throw a RpcException in a gRPC service.


src/Serval.Client/ServalClientHelper.cs line 104 at r3 (raw file):

            // Throttle requests to only 2 x second
            await Task.Delay(500);
            // cRevision = result.Revision + 1;

Why was this commented out?

@Enkidu93
Copy link
Collaborator Author

src/Serval.Client/ServalClientHelper.cs line 104 at r3 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

Why was this commented out?

I didn't mean to check that in. I apologize. It was a quick fix to get around a timeout problem that John fixed elsewhere but I'm not sure has been merged??? I'll investigate. I'll remove that; thank you.

@Enkidu93
Copy link
Collaborator Author

samples/EchoTranslationEngine/TranslationEngineServiceV1.cs line 16 at r3 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I believe that you have to throw a RpcException in a gRPC service.

OK, I'll go ahead and fix that.

@johnml1135
Copy link
Collaborator

src/Serval.Client/ServalClientHelper.cs line 104 at r3 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

I didn't mean to check that in. I apologize. It was a quick fix to get around a timeout problem that John fixed elsewhere but I'm not sure has been merged??? I'll investigate. I'll remove that; thank you.

It is a hack solution - I ripped out all of cRevision and told Eli that by not incrementing revision, it would pass the tests (it was failing).
Here is an idea - I wouldn't want SF and other Clients pinging Serval every 500 ms but we could do this:

  • Have cRevision be default not used (with a new parameter called "incrementRequestedRevisionPerCall"
  • Make a new parameter ""requestInterval" and default it to 5 seconds. Override it when calling from a test environment.
    @ddaspit , what do you think?

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ddaspit)

@Enkidu93
Copy link
Collaborator Author

@ddaspit @johnml1135 Let me know if there are any other outstanding issues.

@johnml1135
Copy link
Collaborator

src/Serval.Client/ServalClientHelper.cs line 104 at r3 (raw file):

Previously, johnml1135 (John Lambert) wrote…

It is a hack solution - I ripped out all of cRevision and told Eli that by not incrementing revision, it would pass the tests (it was failing).
Here is an idea - I wouldn't want SF and other Clients pinging Serval every 500 ms but we could do this:

  • Have cRevision be default not used (with a new parameter called "incrementRequestedRevisionPerCall"
  • Make a new parameter ""requestInterval" and default it to 5 seconds. Override it when calling from a test environment.
    @ddaspit , what do you think?

Reverting for now is ok.

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r4, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ddaspit)

@johnml1135
Copy link
Collaborator

docker-compose.yml line 140 at r4 (raw file):

      - ClearMLNmtEngine__ApiServer=https://api.sil.hosted.allegro.ai
      - ClearMLNmtEngine__Queue=lambert_24gb
      - ClearMLNmtEngine__DockerImage=mpy.local

This is a fine change. The use of mpy.local was for local development (a locally made image) and can still be manually modified when doing development. However for E2E tests, it would be better to use the standard production queue and github images.

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 5 files at r3.
Dismissed @ddaspit from a discussion.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ddaspit and @johnml1135)

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit)

@johnml1135
Copy link
Collaborator

src/Serval.Client/ServalClientHelper.cs line 104 at r3 (raw file):

Previously, johnml1135 (John Lambert) wrote…

Reverting for now is ok.

It fixes the test. It's not good for production, but no one is using it now. #98

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r5, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ddaspit)

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dismissed @ddaspit from a discussion.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Enkidu93)

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)

@johnml1135 johnml1135 dismissed ddaspit’s stale review August 22, 2023 14:19

The issues have been addressed

@johnml1135 johnml1135 merged commit 74305c3 into main Aug 22, 2023
1 check passed
Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 3 files at r4, 2 of 2 files at r5, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@Enkidu93 Enkidu93 deleted the echo_engine_endpoint_coverage branch October 19, 2023 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants