Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Worker code hangs when message passing with sync-send and receiver terminates with a non-error value #42476

Closed
lochana-chathura opened this issue Apr 4, 2024 · 6 comments · Fixed by #42478
Assignees
Labels
Lang/Actions/WorkerMessagePassing Priority/High Reason/EngineeringMistake The issue occurred due to a mistake made in the past. Team/jBallerina All the issues related to BIR, JVM backend code generation and runtime Type/Bug
Milestone

Comments

@lochana-chathura
Copy link
Member

lochana-chathura commented Apr 4, 2024

Description

Consider the below code. It hangs when we run it. However, this code works for the asyc-send

public function main() {
    worker w1 {
        1 ->> w2;
        2 ->> w2;
    }

    worker w2 {
        boolean b = true;

        int _ = <- w1;

        if b {
            return;
        }

        int _ = <- w1;
    }

    wait w1; // code hangs with this wait statement
}
@ballerina-bot ballerina-bot added the Team/jBallerina All the issues related to BIR, JVM backend code generation and runtime label Apr 4, 2024
@lochana-chathura lochana-chathura added this to the 2201.9.0 milestone Apr 4, 2024
@HindujaB HindujaB self-assigned this Apr 4, 2024
@lochana-chathura
Copy link
Member Author

lochana-chathura commented Apr 4, 2024

@HindujaB and I were discussing this. This scenario is not really handled in the implementation.
We need to confirm the expected behavior for the above code.

When we have an error return, we propagate that error to the sender.

public function main() {
    worker w1 {
        1 ->> w2;
        error? unionResult = 2 ->> w2; // we enforce assignment here and get possible error assigned to a variable
    }

    worker w2 returns error? {
        boolean b = true;

        int _ = <- w1;

        if b {
            return error("error"); // error return
        }

        int _ = <- w1;
    }

    wait w1;
}

Option 1:
Similarly, for the non-error return case, we may need to propagate the info to the sender, that the receiver is not going to receive.

Option 2:
Fix this by just ignoring the send value. However, the sender wouldn't know if the sending was received by the other end. If we ship this behavior with this release we may not be able to change the behavior later.

Option 3:
We can make this an error like before. When we have a possible non-error return before a receive-action we give an error to the receive. (If we have this for now, we can change the behavior in a later release). I couldn't recall the reason why we removed previous error in the first place. I guess it is because we were going to have conditional-receive.

@hasithaa @MaryamZi

@HindujaB
Copy link
Contributor

HindujaB commented Apr 4, 2024

Adding to @lochana-chathura's comment, the current spec does not cover this scenario. But, it mentions like this.

If the receive-action was not executed and its worker terminated normally, then the termination value of the worker will belong to the failure type. The failure type will be a (possibly empty) subtype of error.

And, through the sync-send action or async-send with a corresponding flush, the user ensures whether the message passing within the respected channels is completed or not. This implies that the send action is waiting for some state to know about the message passing.

And for receive action, if the corresponding worker channel is closed which indicates the send action will not complete (for conditional send actions), we return a NoMessage error.

Similarly, wouldn't it be more consistent to return an error in the above scenario as well? only for the sync-send or flush actions.
@hasithaa @MaryamZi

@lochana-chathura
Copy link
Member Author

I think the best thing to do here is option 3. (Additionally, we need to disallow receive-action inside the worker-on-fail-clasue).

This would make sure the following line from the spec is not violated.

"It is a compile time error if it is possible for a worker to terminate with success before it has executed all its receive actions."

Plus, we can always go for option 1 and 2 later on top of this after careful evaluation.

@hasithaa
Copy link
Contributor

hasithaa commented Apr 5, 2024

Isn't this the same as a send stmt in an alternative receive? Where the send value is ignored. IIRC, that is what was discussed offline.

@lochana-chathura
Copy link
Member Author

I see your point. In these scenarios, though we have sync-send, not receiving the message is not propagated back to the sender. They are completed with a nil type. However, since we pair each send with a receive, on the receiving end it is known that there is "no message".

So if we consider overall communication which comprises a sending part and a receiving part, either the sender or receiver knows what happened in the communication. However, in the original code sample in this issue, none of the end knows what happened.

[1]

public function main() {

    worker w1 {
        boolean b = false;
        if b {
            () _ = 1 ->> function ;
        }

        () _ = 2 ->> function ;
    }

    int|error:NoMessage x = <- w1;
    io:println(x); // error NoMessage
    int y = <- w1;
    io:println(y); // 2
}

[2]

public function main() {
    worker w1 {
        boolean b = false;
        if b {
            () _ = 1 ->> function ;
        }

        () _ = 2 ->> function ;
    }

    int|error:NoMessage x = <- w1|w1;
    io:println(x); // 2
}

Copy link

This issue is NOT closed with a proper Reason/ label. Make sure to add proper reason label before closing. Please add or leave a comment with the proper reason label now.

      - Reason/EngineeringMistake - The issue occurred due to a mistake made in the past.
      - Reason/Regression - The issue has introduced a regression.
      - Reason/MultipleComponentInteraction - Issue occured due to interactions in multiple components.
      - Reason/Complex - Issue occurred due to complex scenario.
      - Reason/Invalid - Issue is invalid.
      - Reason/Other - None of the above cases.

@lochana-chathura lochana-chathura added the Reason/EngineeringMistake The issue occurred due to a mistake made in the past. label Apr 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Lang/Actions/WorkerMessagePassing Priority/High Reason/EngineeringMistake The issue occurred due to a mistake made in the past. Team/jBallerina All the issues related to BIR, JVM backend code generation and runtime Type/Bug
Projects
None yet
4 participants