Reasoning for Reconsideration of Current Specification #931
-
The current specification in the new alpha version has certain limitations that can be addressed for improved functionality and maintainability. The current design defines task sequences as an array of named tasks. This approach can lead to several issues, and the suggested alternative below brings additional benefits:

1. Redundancy and Confusion: Duplicated naming of tasks with different operations is redundant and can create confusion, especially when dealing with autocompletion in development environments.
2. Complexity in Task Definition: Each operation needs to be defined within a named task in the do array, which can become cumbersome and difficult to manage, especially for larger workflows.
3. Function Reusability: While reusable components (functions) are defined, their usage within the task sequence can be simplified and made more intuitive.
4. Simplified Serialization and Deserialization: The suggested approach simplifies task serialization and deserialization, making it easier to structure model definitions in code.
5. Improved Autocompletion: With named functions, duplicate names are already eliminated at the functions level, so the operations in the do sequence will not generate multiple possible autocomplete variants, enhancing the development experience.

Suggested Improvement

To address these issues, the task sequence should be defined using direct calls to named function operations. This approach reduces redundancy and enhances clarity. The functions section can be used to define reusable operations, which can then be called directly in the do array. It is important that both the top-level do and the do inside functions are always arrays, to maintain consistency and readability.

Improved Code Example

Here is an example of how this improved specification could be structured:

document:
  dsl: '1.0.0-alpha1'
  namespace: test
  name: advanced-workflow
  version: '0.1.0'
use:
  errors:
    errorMessage:
      status: 400
      detail: ${ .error.message }
      type: ${ .error.type }
      instance: ${ .error.instance }
      code: ${ .error.code }
  authentications:
    fake-booking-agency-oauth2:
      oauth2:
        authority: https://fake-booking-agency.com
        grant: client_credentials
        client:
          id: serverless-workflow-runtime
          secret: secret0123456789
  functions:
    findPet:
      do:
        - call:
            use: asyncapi
            with:
              document: https://fake.com/docs/asyncapi.json
              operationRef: findPetsByStatus
              server: staging
              message: getPetByStatusQuery
              binding: http
              payload:
                petId: ${ .pet.id }
    grpcGreet:
      do:
        - call:
            use: grpc
            with:
              proto: file://app/greet.proto
              service:
                name: GreeterApi.Greeter
                host: localhost
                port: 5011
              method: SayHello
              arguments:
                name: ${ .user.preferredDisplayName }
do:
  - call:
      use: findPet
      with:
        document: https://fake.com/docs/asyncapi.json
        operationRef: findPetsByStatus
        server: staging
        message: getPetByStatusQuery
        binding: http
        payload:
          petId: ${ .pet.id }
  - for:
      each: pet
      in: .pets
    do:
      - call:
          use: grpcGreet
          with:
            proto: file://app/greet.proto
            service:
              name: GreeterApi.Greeter
              host: localhost
              port: 5011
            method: SayHello
            arguments:
              name: ${ .user.preferredDisplayName }

Note: This is just an example flow to illustrate the suggested improvements and is not an actual working flow.

Comparison with Current Alpha Version

Current Alpha Version:

use:
  errors:
    errorMessage:
      status: 400
      detail: ${ .error.message }
      type: ${ .error.type }
      instance: ${ .error.instance }
      code: ${ .error.code }
  authentications:
    fake-booking-agency-oauth2:
      oauth2:
        authority: https://fake-booking-agency.com
        grant: client_credentials
        client:
          id: serverless-workflow-runtime
          secret: secret0123456789
do:
  - findPet:
      call: asyncapi
      with:
        document: https://fake.com/docs/asyncapi.json
        operationRef: findPetsByStatus
        server: staging
        message: getPetByStatusQuery
        binding: http
        payload:
          petId: ${ .pet.id }
  - grpcGreet:
      call: grpc
      with:
        proto: file://app/greet.proto
        service:
          name: GreeterApi.Greeter
          host: localhost
          port: 5011
        method: SayHello
        arguments:
          name: ${ .user.preferredDisplayName }
  - httpGetPet:
      call: http
      with:
        method: get
        endpoint: https://petstore.swagger.io/v2/pet/{petId}
  - openapiFindPet:
      call: openapi
      with:
        document: https://petstore.swagger.io/v2/swagger.json
        operationId: findPetsByStatus
        parameters:
          status: available
  - emitOrderEvent:
      emit:
        event:
          source: https://petstore.com
          type: com.petstore.order.placed.v1
          data:
            client:
              firstName: Cruella
              lastName: de Vil
            items:
              - breed: dalmatian
                quantity: 101
  - checkPets:
      for:
        each: pet
        in: .pets
      do:
        - waitForCheckup:
            listen:
              to:
                one:
                  with:
                    type: com.fake.petclinic.pets.checkup.completed.v2
            output:
              as: '.pets + [{ "id": $pet.id }]'

Note: This is just an example flow to illustrate the current specification and is not an actual working flow.

By adopting the suggested approach, the task sequence becomes more streamlined, reusable functions are better utilized, and the overall workflow definition is more intuitive and less prone to naming conflicts or redundancies. It also simplifies task serialization and deserialization and the structuring of model definitions in code. Because named functions are deduplicated in the YAML itself, the operations in the do sequence will not have multiple possible autocomplete variants, making the development process smoother and more efficient.

Other Discussions About This Topic

You can have a read at this discussion, which is probably the most extensive discussion on the subject.

Other Related Discussions:
-
I'm gonna withhold my comments about naming until the start of next week, hopefully after @JBBianchi, @matthias-pichler-warrify, @ricardozanini and @fjtirado can kick in. In the meanwhile, I don't understand the following:
How so? If anything, your sample is repeating/duplicating information that in the current version is defined only once, making it more cumbersome and less readable IMO. Additionally, the translation of your suggestion into the current version of the DSL is incorrect afaik regarding that point. As a matter of fact, a more accurate translation (adorned with some comments) would be:

use:
  functions:
    findPet:
      input:
        schema: # useful for autocompletion: when calling findPet, your IDE can suggest/validate the `with` of the call to perform
          document:
            type: object
            properties:
              petId:
                type: integer
      output:
        schema: {}
        as: # you cannot ignore that keyword, or you'd kill the schema
          petDetails: ${ .pet.details }
      call: asyncapi # don't see the need to put a `do` here, as you are defining a single call
      with:
        document: https://fake.com/docs/asyncapi.json
        operationRef: findPetsByStatus
        server: staging
        message: getPetByStatusQuery
        binding: http
        payload:
          petId: ${ .petId }
do:
  - callFindPet:
      call: findPet # don't know why you'd think re-introducing the term function here - which is IMO confusing, opinionated and complexifies the syntax - instead of just `call: asyncapi | openapi | ... | customFunctionName` would be more intuitive or would change anything in auto-completion or serdes
      with: # autocompletion can kick in here, and suggest/validate the input defined in your reusable "function"
        petId: ${ .pet.id }

On a side note, you can avoid typing
-
I don't think forcing the user to define functions to later use them in tasks is an improvement. It splits the information into two parts and makes writing and reading less fluent. When writing, you need to know ahead of time exactly how you'll split your workflow and which pieces of information go where, and you'll have to go back and forth between your function and task definitions. The same applies when reading the workflow; the information will be scattered, and you'll have to go up and down, not forgetting which task you were looking at when you asked yourself which endpoint is called.
Even though it's not (and cannot be?) enforced by the JSON Schema, each task name MUST be unique within its task list and SHOULD be unique throughout the entire workflow.
This is a valid point, discussed in the links you provided. For me, the main argument for keeping the names vs. making them optional via a property is when using flow directives. First off, because it speaks directly to the user when a directive targets a task by name rather than by position or index.
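To make that concrete, here is a minimal hypothetical snippet (task names and endpoints invented for illustration): the named directive reads almost like plain language, whereas a nameless list could only point at a position such as `then: 5`.

```yaml
do:
  - getPet:
      call: http
      with:
        method: get
        endpoint: https://petstore.swagger.io/v2/pet/{petId}
      then: notifyOwner # "then notify the owner" - skips auditLog on this path
  - auditLog:
      call: http
      with:
        method: post
        endpoint: https://fake.com/audit
  - notifyOwner:
      emit:
        event:
          source: https://petstore.com
          type: com.petstore.owner.notified.v1
```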
If I understood the discussion with @cdavernas correctly, this is a topic worth discussing. In the DSL, we describe how to use custom functions as tasks, but if I'm not mistaken, the JSON Schema doesn't allow it. I think that's what you have in mind, isn't it?
Maybe it does, but it shouldn't really be a consideration. As both ways are possible (with or without names), it's not blocking. The most important factor is the author experience.
It's a valid point, but I think it can be mitigated with the VS Code extension.
-
It cannot be enforced using just the JSON Schema. There is (unhappily) no way to rely only on schema validation: it is in fact but a first step, which should be followed by a DSL one. DSL validation is IMHO non-optional, as there are many non-structural checks that should be performed to ensure a workflow has been properly defined (e.g. check that a referenced task or function actually exists).
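As a hypothetical illustration of such a check: the snippet below would pass JSON Schema validation, yet a DSL validator should reject it, because the `then` directive targets a task that is never defined.

```yaml
do:
  - getPet:
      call: http
      with:
        method: get
        endpoint: https://petstore.swagger.io/v2/pet/{petId}
      then: notifyOwner # schema-valid, but no task named notifyOwner exists
```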
IMHO, this is not just a strong argument, but THE reason why names MUST be used and enforced. Additionally, names are essential for user friendliness, especially in the context of a schema, for example.
Not sure I understand what you mean here. Custom functions exist and are allowed by the JSON Schema.
I totally agree, especially as it's easily achievable, as demonstrated by the updated SDKs.
Autocompletion can be achieved using what is in place now; it's just a bit more complex than relying on enums, like we did in the past (and which made us hit several walls). On a side note, enums are IMO your worst enemy: they are usually the sign of a code smell and should be avoided whenever possible (I used to love them to begin with, but quickly learned to hate them, as they were just bottlenecking my code and forbidding extensibility without a partial refactor).
-
I don't think I have much more to add to this discussion but I agree with the points raised here, specifically:
Yeah the ability to "quickly" script something together and then extract reusable components later should definitely be allowed by the spec. Writing every function beforehand seems cumbersome.
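For instance (a hypothetical sketch, names invented), one could start with everything inline and only promote the call into `use.functions` once a second task actually needs it:

```yaml
# quick first draft: everything inline
do:
  - getPet:
      call: http
      with:
        method: get
        endpoint: https://petstore.swagger.io/v2/pet/{petId}
```

```yaml
# later: extracted into a reusable function once it is reused
use:
  functions:
    getPet:
      call: http
      with:
        method: get
        endpoint: https://petstore.swagger.io/v2/pet/{petId}
do:
  - fetchPet:
      call: getPet
  - fetchOtherPet:
      call: getPet
```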
Yeah JSON schema is great for structural validation but a "semantic" one should definitely be used by any runtime to avoid runtime errors.
I am also strongly in favor of mandatory names. Another benefit of the current spec that would be lost is that when I fold all tasks in my editor, I can still read the flow:

document: {}
do:
  - getDeviceForContact: ...
  - checkPlatform: ...
  - unknownPlatform: ...
  - createApplePass: ...
  - createGooglePass: ...
  - sendLink: ...

The equivalent would look like this with the proposed changes, or I would have to define everything beforehand even though nothing is reused:

document: {}
do:
  - call: http
    with: ...
    export: ...
  - switch: ...
  - raise: ...
  - call: http
    with: ...
    export: ...
    then: 5
  - call: http
    with: ...
    export: ...
    then: 5
  - call: http
    with: ...
    then: end
Yeah, as defined in #906, authors' concerns should come first. The JSON Schema shows that structural validation is still easily possible, and there is no strong evidence that semantic validation should prove any harder given the current schema. I absolutely agree that it makes things harder to parse, and I can imagine that it makes it harder to build an AST. Even for runtime parsing it makes me do things like const [[taskName, taskDefinition]] = Object.entries(task), but I think the authoring experience is worth it. I am having decent success using the YAML extension for VS Code and starting my files with:

# yaml-language-server: $schema=https://serverlessworkflow.io/schemas/1.0.0-alpha2/workflow.yaml
document: {}
...
I haven't played around with the current spec of reusable tasks yet, so I can't say much, but at least I don't see any immediate problems with it.
-
@zolero do you have any other concerns to be addressed?
-
Currently, I'm monitoring the progress of the new @neuroglia-io/sdk-typescript to see if we run into the issues and potential cons I've previously mentioned. Regarding the task sequence specification, I believe it still requires refinement. Ideally (though this may be personal preference), the task sequence should mimic typical coding practices, where tasks are executed in a straightforward, linear order without referencing previous tasks in the sequence. Why?
Not sure if this perspective would result in a completely different view of how the task flow is executed right now, and I apologize if this causes any confusion. I've also been examining GitLab's CI/CD pipeline step implementation, which uses a type definition, making it easier to understand which task operation should be executed. The Step Definition for GitLab Steps:
Why?
Here is an example to illustrate how this would look:

document:
  dsl: '1.0.0-alpha2'
  namespace: runtime
  name: test-flow
  version: '1.0.0'
use:
  errors:
    petNotFoundError:
      title: Undefined Priority
      status: 400
      type: 'https://fake.com/errors/undefined-priority'
      instance: /handleError
  authentications:
    clinicOauth2:
      type: oauth2
      oauth2:
        authority: https://auth.petclinic.com
        grant: client_credentials
        client:
          id: petclinic-app
          secret: secret0123456789
  functions:
    grpcGreet: # This would be the reference name of the function task
      type: grpc
      with:
        proto: file://app/greet.proto
        service:
          name: GreeterApi.Greeter
          host: localhost
          port: 5011
        method: SayHello
        arguments:
          name: ${ .user.preferredDisplayName }
do:
  - type: http
    # name: findPet *Optional
    with:
      method: get
      endpoint: https://petstore.swagger.io/v2/pet/{petId}
      auth: clinicOauth2
    then: continue # Default flow direction
  - type: use
    ref: grpcGreet # Reference to a reusable function
    # then: continue
  - type: use
    ref: petNotFoundError # Reference to a reusable error
    # then: continue
  - type: switch
    switch:
      - case:
          when: ${.orderType == "electronic"}
          then: exit
      - case:
          when: ${.orderType == "physical"}
          do:
            - type: http
      - default:
          then: continue
  - type: run
    name: runScript # Naming the operation
    script:
      language: js
      code: |
        console.log("Running script task");
  - type: raise
    error:
      type: 'https://fake.com/errors/unknown-order-type'
      status: 400
      title: Unknown Order Type
  - type: sequence # Or some other sequence naming like 'tasks' | 'steps'
    do:
      - type: http
        with:
          method: get
          endpoint: https://fake-inventory-service.com/inventory
      - type: set
      - type: http
        with:
          method: post
          endpoint: https://fake-packaging-service.com/pack
      - type: http
        with:
          method: post
          endpoint: https://fake-shipping-service.com/schedule
  - type: listen
    to:
      any:
        - with:
            type: com.fake-hospital.vitals.measurements.temperature
            data:
              temperature: ${ .temperature > 38 }
        - with:
            type: com.fake-hospital.vitals.measurements.bpm
            data:
              bpm: ${ .bpm < 60 or .bpm > 100 }
    then: exit

Note: This is just an example flow to illustrate the suggested improvements and is not an actual working flow.

I would like to hear other people's opinions on this approach. I hope it is not too far away from the current topic we're discussing.
-
@zolero I think you have the conception of a DAG, which can be implemented using the specification. But the spec supports a cyclic graph, where tasks can reference each other in different directions (or form loops).

I believe that continuing this discussion to force the specification to conform to a DAG is counter-productive, since that's not the concept the community aims for. The control flow in this case is free to come and go in any direction the DSL specifies. Surely, the basic graph can be directed acyclic, and one can even skip defining then. But we aim for a flow that can refer to tasks in any direction.

A CI/CD pipeline usually is a DAG, so the comparison you are making is different and won't fit the goals of this specification. We can't force a DAG implementation, but a DAG can be defined using the spec. You're free to use it this way.

EDIT: if we were to be a DAG specification,
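To illustrate with a small hypothetical snippet (names and endpoint invented): a `then` directive may legally point backwards to an earlier named task, which creates a cycle and is by definition impossible in a DAG.

```yaml
do:
  - pollStatus:
      call: http
      with:
        method: get
        endpoint: https://fake.com/orders/status
  - checkDone:
      switch:
        - pending:
            when: ${ .status != "done" }
            then: pollStatus # backwards edge: the task graph is cyclic
        - done:
            when: ${ .status == "done" }
            then: end
```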
-
Thank you for sharing your perspective on this. I understand your intention to align the task sequence with typical coding practices. However, I have some concerns. The DSL is intended for use by non-technical people and, ideally, should be closer to human language than to a computer's. Let's consider an if/else block as an example. With your proposal, this would need to be factored into a dedicated task with higher cyclomatic complexity, instead of just relying on straightforward flow control.
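For illustration, here is a rough hypothetical sketch (names, conditions and endpoints invented): in the current DSL, an if/else stays flat, as a switch simply jumps to named tasks via flow directives, with no extra nesting or dedicated task type.

```yaml
do:
  - checkOrderType:
      switch:
        - electronic:
            when: ${ .orderType == "electronic" }
            then: processElectronic
        - physical:
            when: ${ .orderType == "physical" }
            then: processPhysical
  - processElectronic:
      call: http
      with:
        method: post
        endpoint: https://fake.com/orders/electronic
      then: end
  - processPhysical:
      call: http
      with:
        method: post
        endpoint: https://fake.com/orders/physical
```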
While I see the value in preventing unexpected behavior, such an approach would eliminate essential features like the `then` directive targeting a named task.
This approach seems to remove critical flow control mechanisms, which are essential for real-world scenarios. A simple, finite sequence of tasks is too restrictive for most use cases I've encountered.
Our current DSL also achieves explicitness, but uses keywords instead of a `type` discriminator.
Referencing previous comments, particularly from @matthias-pichler-warrify, there are strong arguments for enforcing task naming. Naming and typing serve different purposes and should both be maintained. To conclude, I believe the current DSL can meet your use case without significant alterations. You might consider implementing an auto-naming feature for simplicity. However, incorporating your proposal might make it challenging to author even basic processing tasks, such as a circuit breaker.
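As a rough sketch of that last point (hypothetical endpoint; the try/catch and retry shape is an approximation of the current spec's examples, not a verbatim copy), a basic retrying circuit is short to author with the existing keywords and named tasks:

```yaml
do:
  - callFlakyService:
      try:
        - getQuote:
            call: http
            with:
              method: get
              endpoint: https://fake.com/quotes
      catch:
        errors:
          with:
            type: https://serverlessworkflow.io/spec/1.0.0/errors/communication
        retry:
          delay:
            seconds: 2
          backoff:
            exponential: {}
          limit:
            attempt:
              count: 3
```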
No need to apologize. Your input is valuable, and I appreciate the discussion. While I hold a different view, I hope this response conveys my points clearly and respectfully. 😊
-
Wouldn't adding a standalone event listener to the specification to trigger its flow already create a cyclic graph? The task operation sequence would follow a DAG structure, with inputs and outputs, but the overall workflow execution and processing would function as a cyclic graph. This would reduce complexity significantly, wouldn't it? Not only for creating an engine, but also for creating tooling and state and event management.