Harmonize condition syntax across workflows and policies #122

Open
lauwers opened this issue Jul 2, 2022 · 13 comments

Comments

@lauwers
Contributor

lauwers commented Jul 2, 2022

Condition clauses are used in policy definitions as well as in workflow definitions. While the condition syntax is well thought-out and supports arbitrary boolean expressions, in practice it is very limited since conditions can only apply to properties or attributes of
one single node, capability, or relationship. With current syntax it is not possible to create complex condition expressions that consider multiple nodes and relationships in a service topology.

To make matters worse, workflow definitions and policy definitions use different syntax for specifying the node or relationship that has the property or attribute for which the condition needs to be evaluated. Workflows use target and target_relationship keywords to specify the entity with the property or attribute for which the condition needs to be evaluated. There is no way to identify
properties or attributes in capabilities in a condition clause in a workflow. Policies, on the other hand, use a target_filter keyword
to specify the entity with the property or attribute for which the condition needs to be evaluated. This target filter allows you to
specify a capability.

The following snippet shows a workflow that includes a condition that checks the 'state' attribute of a relationship. It also shows a policy trigger that includes a condition that checks the 'level' attribute of a capability of a node.

tosca_definitions_version: tosca_simple_yaml_1_3

policy_types:
  ServiceMonitor:
    targets: [ Server ]
capability_types:
  Service:
    properties:
      level:
        type: string
        constraints:
          - valid_values: [ excellent, mediocre, bad ]
relationship_types:
  ServedBy:
    properties:
      state:
        type: string
        constraints:
          - valid_values: [ UP, DOWN ]
node_types:
  Server:
    capabilities:
      service:
        type: Service
  Client:
    requirements:
      - server:
          capability: Service
          relationship: ServedBy
          
topology_template:

  node_templates:
    server:
      type: Server
      capabilities:
        service:
          properties:
            level: excellent
    client:
      type: Client
      requirements:
        - server:
            node: server
            relationship:
              properties:
                state: DOWN

  workflows:
    start_service:
      preconditions:
        - target: client
          target_relationship: server
          condition:
            - assert:
              - state: [{equal: DOWN}]

  policies:   
    - level_monitor:
        type: ServiceMonitor
        triggers:
          degraded_service:
            target_filter:
              node: server
              capability: service
            condition:
              method: average
              constraint:
                - not:
                  - level: [ { equal: excellent } ]

At the very least, we should consider harmonizing the condition syntax so it is consistent across policy definitions and workflow
definitions. To support conditions that apply to entire topologies, we should consider supporting ToscaPath expressions in condition
clauses. I'll submit an example separately of what that could look like.

@lauwers
Contributor Author

lauwers commented Jul 2, 2022

The following is an example of what a harmonized syntax could look like. It eliminates the target and target_relationship keynames in workflow conditions, and it eliminates the target_filter keyname in policy conditions. Instead, it uses ToscaPath syntax to identify, inline, the entity to which the condition applies. This has the added benefit of enabling conditions that take properties from multiple nodes or relationships into account.

tosca_definitions_version: tosca_2_0

policy_types:
  ServiceMonitor:
    targets: [ Server ]
capability_types:
  Service:
    properties:
      level:
        type: string
        constraints:
          - valid_values: [ excellent, mediocre, bad ]
relationship_types:
  ServedBy:
    properties:
      state:
        type: string
        constraints:
          - valid_values: [ UP, DOWN ]
node_types:
  Server:
    capabilities:
      service:
        type: Service
  Client:
    requirements:
      - server:
          capability: Service
          relationship: ServedBy
          
service_template:

  node_templates:
    server:
      type: Server
      capabilities:
        service:
          properties:
            level: excellent
    client:
      type: Client
      requirements:
        - server:
            node: server
            relationship:
              properties:
                state: DOWN

  workflows:
    start_service:
      preconditions:
        - condition:
            - assert:
              - [ client, RELATIONSHIP, server, state ]: [{equal: DOWN}]

  policies:   
    - level_monitor:
        type: ServiceMonitor
        triggers:
          degraded_service:
            condition:
              method: average
              constraint:
                - not:
                  - [ server, CAPABILITY, service, level ]: [ { equal: excellent } ]

@pmbruun

pmbruun commented Jul 12, 2022

We should be able to use functions, and particularly the new custom functions #123 in these conditions.

As discussed, we could allow a function both in the map-key side of a condition and after equal:.

Custom functions can return a boolean type result and of course that could be compared with equal: true.

This suggestion is related to #80 and #119. A regexp match condition (#71) would also be an obvious candidate, and perhaps easier to express as a function.

The argument against this was that expressions with functions in TOSCA could create very cluttered syntax, but compare with #69. If we adopt the $-syntax for calling functions proposed in #123 I don't think it would be bad at all - let me make a few examples.

Each existing TOSCA constraint, like equal or not could be defined as semantically equivalent to calling an $equal or $not function. That would not require any semantics in a TOSCA orchestrator that wasn't already there.

I am assuming that the $ syntax should also be usable for the existing TOSCA functions, and that we find a good solution for escaping $ in keys where we do not want it to be interpreted as a function call.

Examples

The proposal above has:

         ...
            - assert:
              - [ server, CAPABILITY, service, level ]: [ { equal: excellent } ]

Using functions that would become:

         ...
            - assert:
              - $equal: [ [ server, CAPABILITY, service, level ], excellent ]

I don't think that reads any worse than the original proposal, and it can be easily distinguished from the legacy condition syntax of TOSCA by the (unescaped) $ starting the key.
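
As an aside, the rewrite from the legacy clause to the functional form looks mechanical. Here is a minimal Python sketch, assuming a legacy clause is a one-entry mapping from a target (a property name or a path) to a list of single-operator constraints, and assuming that several constraints on one target are implicitly AND-ed. The helper name `to_functional` is hypothetical and not part of any TOSCA processor.

```python
def to_functional(clause):
    """Rewrite {target: [{op: value}, ...]} into $-function form."""
    [(target, constraints)] = clause.items()
    # A YAML sequence used as a key is modelled as a tuple here, because
    # Python dict keys must be hashable (modelling assumption).
    if isinstance(target, tuple):
        target = list(target)
    calls = [
        {"$" + op: [target, value]}
        for constraint in constraints
        for op, value in constraint.items()
    ]
    # Assumption: multiple constraints on the same target mean "all must hold".
    return calls[0] if len(calls) == 1 else {"$and": calls}


legacy = {("server", "CAPABILITY", "service", "level"): [{"equal": "excellent"}]}
print(to_functional(legacy))
# {'$equal': [['server', 'CAPABILITY', 'service', 'level'], 'excellent']}
```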

The negated constraint would read:

         ...
            - assert:
              - $not: [$equal: [ [ server, CAPABILITY, service, level ], excellent ] ]

Or we could compare two paths:

         ...
            - assert:
              - $equal: [ [ server, CAPABILITY, service, level ], [ client, RELATIONSHIP, server, state ] ]

Of course orchestrators could get creative and introduce $or as in:

         ...
            - assert:
              - $or:
                - $equal: [ [ server, CAPABILITY, service, level ], excellent ]
                - $equal: [ [ client, RELATIONSHIP, server, state ], DOWN ]

Of course there are contexts in which you wouldn't be able to call some functions - for example $get_attribute.

A $size function would be easy to add for the situation we discussed where a condition would need to compare the lengths of two lists.

Ok - this will be prefix syntax for conditions instead of the current sort-of infix, but people have lived with prefix function calls in spreadsheets for decades, so I think they can live with it.
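
To make the intended semantics concrete, here is a small Python sketch of how such expressions could be evaluated. It assumes that a list argument is always a path and anything else is a literal, and it models the topology from this issue as a nested dictionary. `TOPOLOGY`, `resolve` and `evaluate` are hypothetical names, and `$and` is included only for symmetry with `$or`.

```python
# Toy topology using the values from the example in this issue.
TOPOLOGY = {
    "server": {"CAPABILITY": {"service": {"level": "excellent"}}},
    "client": {"RELATIONSHIP": {"server": {"state": "DOWN"}}},
}


def resolve(arg):
    """Treat a list argument as a path into TOPOLOGY; anything else is a literal."""
    if isinstance(arg, list):
        value = TOPOLOGY
        for step in arg:
            value = value[step]
        return value
    return arg


def evaluate(expr):
    """Evaluate a one-entry {"$name": args} mapping to a boolean."""
    [(name, args)] = expr.items()
    if name == "$equal":
        left, right = args
        return resolve(left) == resolve(right)
    if name == "$not":
        [inner] = args
        return not evaluate(inner)
    if name == "$or":
        return any(evaluate(e) for e in args)
    if name == "$and":
        return all(evaluate(e) for e in args)
    raise ValueError(f"unknown function: {name}")


LEVEL = ["server", "CAPABILITY", "service", "level"]
STATE = ["client", "RELATIONSHIP", "server", "state"]

cond = {"$or": [{"$equal": [LEVEL, "excellent"]}, {"$equal": [STATE, "DOWN"]}]}
print(evaluate(cond))  # True
```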

@lauwers
Copy link
Contributor Author

lauwers commented Aug 4, 2022

I like this proposal a lot. It suggests we should use Boolean functions (rather than constraint clauses) as the building blocks for condition statements. Can we start with this proposal and investigate if we can retrofit filters to use similar syntax as well?

@calincurescu

calincurescu commented Aug 8, 2022

Agree, I think we should use the boolean functions, and also use the automatic functional equivalents of the constraint operators. An advantage of the functional form is that we don't need the assert keyname (it was just positional sugar anyway). Also, in line with Tal's suggestion, we should keep the get_property function to access the respective property, i.e.

   condition:
       - $or:
          - $equal: [ {get_property: [server, CAPABILITY, service, level]}, excellent ]
          - $equal: [ {get_attribute: [client, RELATIONSHIP, server, state]}, DOWN ]

We should use the same syntax for the node_filter.

In the node_filter case we need a specific keyword(s) to represent the initial context that the tosca_path for that filter starts from. Now, when using a node_filter, the TOSCA processor needs to: a) create a list with all node candidates, b) apply the filter on each of them, c) choose one of the filtered candidates. Thus, the specific tosca_path keyword will always point to the current filter candidate as initial context.
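
Steps a) to c) can be sketched in a few lines of Python. This is only an illustration under assumptions: nodes are held in an in-memory mapping, the filter is a callable that receives the current candidate as its initial context, and step c) simply takes the first match (a real processor is free to choose differently). `select_target` is a hypothetical name.

```python
def select_target(nodes, node_filter):
    """Return the name of the first node that satisfies node_filter, or None."""
    candidates = list(nodes.items())                                # a) all candidates
    matching = [(n, d) for n, d in candidates if node_filter(d)]    # b) apply the filter
    return matching[0][0] if matching else None                     # c) choose one


nodes = {
    "basic_server": {"level": "mediocre"},
    "premium_server": {"level": "excellent"},
}
print(select_target(nodes, lambda node: node["level"] == "excellent"))
# premium_server
```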

I think we need two keywords: NODE_FILTER and CAPABILITY_FILTER. The first one is used when we want to filter on the properties of the node when we know the node type. The second should be used when we want to filter on the properties of the target capability in a target node (for which we might not even know the node type). Note that nothing stops us from using both starting filtering contexts within the same requirement node_filter:

   node_filter:
       - $or:
          - $equal: [ {get_property: [CAPABILITY_FILTER, level]}, excellent ]
          - $equal: [ {get_property: [NODE_FILTER, id]}, 37 ]
          - $equal: [ {get_property: [NODE_FILTER, CAPABILITY, transcoding, codec]}, mpeg ]

Note that in the first get_property (in the example above) we point to the target capability of the requirement (without even knowing the symbolic name of that capability in the node type), while in the last get_property we point to a specific named capability of the node that may or may not be the target capability of the requirement.

The advantage of defining the NODE_FILTER separate from the CAPABILITY_FILTER is that we can use the NODE_FILTER form unchanged also within a node_filter in a node definition that uses the [select] directive. In such a node_filter we cannot use the CAPABILITY_FILTER as there is no relationship construction.

Regarding constraints I think we should add the (recursive) operators: and, not, or, xor. Also we should add more operators for string/list/map constraints such as:
is_suffix, has_suffix, is_prefix, has_prefix, is_contained, contains, values_of, has_values, keys_of, has_keys

@calincurescu

calincurescu commented Aug 9, 2022

Potential use of a filter in a "select" directive node, with SELF as the initial context keyword:

  mysql_compute:
      type: Compute
      directives: [ select ]
      node_filter:
        - $equal: [ {$get_property: [SELF, CAPABILITY, host, num_cpus]}, 2 ]
        - $greater_or_equal: [ {$get_property: [SELF, CAPABILITY, host, mem_size]}, 2 GB ]
        - $equal: [ {$get_property: [SELF, CAPABILITY, os, architecture]}, x86_64 ]
        - $equal: [ {$get_property: [SELF, CAPABILITY, os, type]}, linux ]
        - $equal: [ {$get_property: [SELF, CAPABILITY, os, distribution]}, ubuntu ]

@pmjordan
Contributor

pmjordan commented Aug 9, 2022

We acknowledge that the output of node_filter conditions is a set of nodes which match the given criteria. Compared with SQL, the 'where' clause is implicit here. At present the selection of a single node from within that set is delegated to the orchestrator, which corresponds to the SQL 'order by' clause.
Perhaps the 'where' clause should be made explicit to allow for a later addition of a clause similar to 'order by', e.g.

  mysql_compute:
      type: Compute
      directives: [ select ]
      node_filter:
        - where:
              - $equal: [ {$get_property: [SELF, CAPABILITY, host, num_cpus]}, 2 ]
              - $greater_or_equal: [ {$get_property: [SELF, CAPABILITY, host, mem_size]}, 2 GB ]
              - $equal: [ {$get_property: [SELF, CAPABILITY, os, architecture]}, x86_64 ]
              - $equal: [ {$get_property: [SELF, CAPABILITY, os, type]}, linux ]
              - $equal: [ {$get_property: [NODE_FILTER, CAPABILITY, os, distribution]}, ubuntu ]

@pmbruun

pmbruun commented Aug 9, 2022

@pmjordan Not everyone would be using an SQL database, or a database at all, so this would be a lot to require.

On the other hand, we easily run topologies with 1M nodes and clearly it would not be useful to have to load everything into memory and execute the filtering there. On top, we are able to do node filtering on our equivalent of get_attribute, which is of course not in scope for TOSCA. To handle that, I am doing static evaluation of those parts of the node filter that can be mapped to SQL and only do in-memory filtering on remaining conditions.

This becomes relevant when we consider a generic "objective function" because a generic objective function would be impossible to optimize for database queries.

Instead, I would prefer the ability to sort on properties - and any objective function that can be expressed in a filter could also be "crystalized" into one or more objective properties. That would allow optimization by database sorting.
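
To illustrate the split, here is a rough Python sketch using the standard `sqlite3` module. Everything in it is an assumption for illustration: the table layout, the `score` column standing in for a "crystalized" objective property, and the in-memory `LOAD` values standing in for runtime attributes. It is not how our orchestrator is implemented.

```python
import sqlite3

# Hypothetical storage: one row per node, property values as columns.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE nodes (name TEXT, num_cpus INTEGER, score INTEGER)")
db.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [("n1", 2, 10), ("n2", 4, 30), ("n3", 2, 20)],
)

# Runtime attribute values that exist only in memory (assumption).
LOAD = {"n1": 0.9, "n2": 0.1, "n3": 0.2}


def select_node(num_cpus, max_load):
    """Filter and sort on properties in the database, then filter the rest in memory."""
    rows = db.execute(
        "SELECT name FROM nodes WHERE num_cpus = ? ORDER BY score DESC",
        (num_cpus,),
    )
    # Rows arrive best-first, so the first one passing the in-memory
    # condition is the preferred candidate.
    for (name,) in rows:
        if LOAD[name] <= max_load:
            return name
    return None


print(select_node(2, 0.5))  # n3
```

Because the objective is an ordinary column, the database can do the ordering and only the residual condition needs per-row evaluation.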

@pmjordan
Contributor

pmjordan commented Aug 9, 2022

@pmbruun I was not intending to actually use SQL nor a database. I was just comparing the syntax, noting that there are separate clauses for a) deriving the set of matching nodes and b) selecting from that set.
Part b) is currently delegated to the orchestrator but TOSCA may introduce a clause to influence it. If so then the syntax adopted for a) may need to take account of the future possibility of b)

@calincurescu

calincurescu commented Aug 15, 2022

To summarize the discussion in the last Language WG: as an alternative to using the keywords CAPABILITY_FILTER (as initial context for capabilities in requirement filters) and NODE_FILTER (as initial context for nodes in both requirement filters and filters in nodes with the "select" directive), we could use:

  • SELF for the initial context in filters in nodes with the "select" directive:
  mysql_compute:
      type: Compute
      directives: [ select ]
      node_filter:
        - where:
              - $equal: [ {$get_property: [SELF, id]}, 37 ]
              - $equal: [ {$get_property: [SELF, CAPABILITY, host, num_cpus]}, 2 ]
              - $greater_or_equal: [ {$get_property: [SELF, CAPABILITY, host, mem_size]}, 2 GB ]
  • REQ_FILTER for the initial context in requirement filters. To be able to access both the target capability (without knowing its symbolic name in the node) and also the target node properties or other capability properties, we can define the initial context as if starting from the requirement relationship, then use CAPABILITY or TARGET further along the TOSCA path. This context is applied sequentially to all potential node candidates as requirement target until one matches the filter. The syntax will be:
   node_filter:
       - $or:
          - $equal: [ {get_property: [REQ_FILTER, CAPABILITY, level]}, excellent ]
          - $equal: [ {get_property: [REQ_FILTER, TARGET, id]}, 37 ]
          - $equal: [ {get_property: [REQ_FILTER, TARGET, CAPABILITY, transcoding, codec]}, mpeg ]

@lauwers
Contributor Author

lauwers commented Aug 15, 2022

In my opinion, there is no need for new keywords. The existing SELF keyword works just fine in both cases. For node filters defined in node templates, SELF refers to the node template context. For node filters defined in requirements, SELF refers to the relationship context that is the result of fulfilling the requirement.
@calincurescu mentioned earlier that in requirements, SELF must refer to the node that contains the requirement, since we may need to retrieve property values from that node. However, that is what the SOURCE keyword is for.
The following shows a complete example of using SELF and SOURCE keywords in requirement definitions and requirement assignments, including the use of the SOURCE keyword in node filters to retrieve property values from the containing node. It supports the idea that inside a requirement, SELF refers to the relationship, not the containing node.

tosca_definitions_version: tosca_simple_yaml_1_3

data_types:
  Content:
    derived_from: string
    constraints:
      - valid_values: [ basic, premium ]
  Quality:
    derived_from: string
    constraints:
      - valid_values: [ 720p, 1080p, 4k ]
    
capability_types:
  Service:
    properties:
      content:
        type: Content
        
relationship_types:
  StreamedFrom:
    derived_from: tosca.relationships.Root
    properties:
      subscriber_name:
        type: string
      quality:
        type: Quality
    interfaces:
      Configure:
        inputs:
          subscriber_name: { get_property: [ SELF, subscriber_name ] }

node_types:
  StreamingServer:
    derived_from: tosca.nodes.Root
    capabilities:
      service:
        type: Service

  Subscriber:
    derived_from: tosca.nodes.Root
    properties:
      name:
        type: string
      content:
        type: Content
    requirements:
      - stream:
          capability: Service
          relationship:
            type: StreamedFrom
            interfaces:
              Configure:
                operations:
                  add_source:
                    inputs:
                      quality: { get_property: [ SELF, quality ] }

topology_template:
  inputs:
    quality:
      type: Quality
    content:
      type: Content
    name:
      type: string

  node_templates:
    basic_server:
      type: StreamingServer
      capabilities:
        service:
          properties:
            content: basic
    premium_server:
      type: StreamingServer
      capabilities:
        service:
          properties:
            content: premium
    subscriber:
      type: Subscriber
      properties:
        name: { get_input: name }
        content: { get_input: content }
      requirements:
        - stream:
            relationship:
              properties:
                subscriber_name: { get_property: [ SOURCE, name ] }
                quality: { get_input: quality }
            node: StreamingServer
            node_filter:
              capabilities:
                - service:
                    properties:
                      - content: { equal: { get_property: [ SOURCE, content] } } 

@calincurescu

calincurescu commented Aug 16, 2022

The proposal from @lauwers above seems quite elegant: within the requirements: section, SELF represents the specific requirement relationship to be established. Note also that outside the requirements: section, SELF represents that particular node, as usual.

Updating the above example to the current tosca_path and filter syntax:

 node_templates:
   basic_server:
     type: StreamingServer
     capabilities:
       service:
         properties:
           content: basic
   premium_server:
     type: StreamingServer
     capabilities:
       service:
         properties:
           content: premium
   subscriber:
     type: Subscriber
     properties:
       name: { $get_input: name }
       content: { $get_input: content }
     requirements:
       - stream:
           relationship:
             properties:
               subscriber_name: { $get_property: [ SELF, SOURCE, name ] }
               quality: { $get_input: quality }
           node: StreamingServer
           node_filter:
             - $equal: 
               - $get_property: [SELF, CAPABILITY, content]
               - $get_property: [SELF, SOURCE, content]

Note, that since in this case we know the node type, the filter above can be equivalently written as:

           node_filter:
             - $equal: 
               - $get_property: [SELF, TARGET, CAPABILITY, service, content]
               - $get_property: [SELF, SOURCE, content]
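
The equivalence of the two forms can be checked with a small Python sketch. It assumes an in-memory model in which the relationship created for the requirement records its SOURCE node, its TARGET node, and the name of the capability it binds to. The dictionary layout, the `target_capability` key, the value `alice`, and the `get_property` helper are all hypothetical.

```python
servers = {
    "premium_server": {
        "properties": {},
        "capabilities": {"service": {"properties": {"content": "premium"}}},
    }
}
subscriber = {"properties": {"name": "alice", "content": "premium"}}

# The relationship that results from fulfilling the 'stream' requirement.
relationship = {
    "properties": {"subscriber_name": "alice", "quality": "4k"},
    "SOURCE": subscriber,
    "TARGET": servers["premium_server"],
    "target_capability": "service",  # capability the requirement binds to
}


def get_property(context, path):
    """Resolve a path whose leading SELF is bound to `context`."""
    if path[0] != "SELF":
        raise ValueError("this sketch only handles paths that start with SELF")
    current, steps = context, list(path[1:])
    while steps:
        step = steps.pop(0)
        if step in ("SOURCE", "TARGET"):
            current = current[step]
        elif step == "CAPABILITY":
            if "capabilities" in current:
                # On a node, CAPABILITY is followed by a capability name.
                current = current["capabilities"][steps.pop(0)]
            else:
                # On a relationship, CAPABILITY means its target capability.
                target = current["TARGET"]
                current = target["capabilities"][current["target_capability"]]
        else:
            # Any other step is a property name on the current entity.
            return current["properties"][step]
    raise ValueError("path does not end in a property name")


# Inside the requirement, SELF is the relationship:
print(get_property(relationship, ["SELF", "CAPABILITY", "content"]))  # premium
print(get_property(relationship, ["SELF", "TARGET", "CAPABILITY", "service", "content"]))  # premium
# Outside the requirements section, SELF is the node itself:
print(get_property(subscriber, ["SELF", "name"]))  # alice
```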

@lauwers
Contributor Author

lauwers commented Sep 16, 2022

In Calin's example in #122 (comment) the syntax suggests that a node filter is a list of Boolean functions. Why not just use a single Boolean function? If multiple functions need to be evaluated, we should use explicit $and and $or functions.
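
For existing list-style filters, the conversion to a single function is trivial. A minimal Python sketch, assuming that a list of Boolean functions was meant as an implicit conjunction (`as_single_condition` is a hypothetical helper):

```python
def as_single_condition(node_filter):
    """Fold a list of Boolean function calls into one explicit $and call."""
    if isinstance(node_filter, dict):
        return node_filter  # already a single function call
    if len(node_filter) == 1:
        return node_filter[0]
    return {"$and": list(node_filter)}


listed = [
    {"$equal": [{"$get_property": ["SELF", "CAPABILITY", "host", "num_cpus"]}, 2]},
    {"$equal": [{"$get_property": ["SELF", "CAPABILITY", "os", "type"]}, "linux"]},
]
single = as_single_condition(listed)
print(list(single))  # ['$and']
```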

@lauwers
Contributor Author

lauwers commented Feb 20, 2023

Workflow preconditions are now expressed using Boolean expressions as documented in https://docs.oasis-open.org/tosca/TOSCA/v2.0/csd05/TOSCA-v2.0-csd05.html#_Toc125468883. Similarly, conditions in policy triggers are now expressed using Boolean expressions as well, as documented in https://docs.oasis-open.org/tosca/TOSCA/v2.0/csd05/TOSCA-v2.0-csd05.html#_Toc125468856. Using the new harmonized syntax, the original example in this issue is expressed as follows:

tosca_definitions_version: tosca_2_0

interface_types:
  Heal:
    operations:
      restart: 
        description: restart

policy_types:
  ServiceMonitor:
    targets: [ Server ]

capability_types:
  Service:
    properties:
      level:
        type: string
        validation:
          $valid_values: [ $value, [ excellent, mediocre, bad ] ]

relationship_types:
  ServedBy:
    properties:
      state:
        type: string
        validation:
          $valid_values: [ $value, [ UP, DOWN ] ]
          
node_types:
  Server:
    capabilities:
      service:
        type: Service
  Client:
    requirements:
      - server:
          capability: Service
          relationship: ServedBy
          count_range: [1, 1]
    interfaces:
      heal:
        type: Heal
          
service_template:
  node_templates:
    server:
      type: Server
      capabilities:
        service:
          properties:
            level: excellent
    client:
      type: Client
      requirements:
        - server:
            node: server
            relationship:
              properties:
                state: DOWN

  workflows:
    start_service:
      preconditions:
        $equal:
          - $get_property: [ client, RELATIONSHIP, server, state ]
          - DOWN
      steps:
        to_up:
          target: client
          activities:
            - call_operation: heal.restart

  policies:   
    - level_monitor:
        type: ServiceMonitor
        triggers:
          degraded_service:
            condition:
              $not:
                - $equal:
                    - $get_property: [ server, CAPABILITY, service, level ]
                    - excellent


4 participants