
Filtering in source definition

Aref Shafaei edited this page Jan 12, 2022 · 31 revisions

Filter in source

Since the discussion for #685 was lengthy, I moved all of its content to here.


Filter syntax

The source path syntax we are using in ERMrestJS was copied from the ERMrest dynamic ACL binding notation, so we can use the same notion of "filter" that exists in ERMrest dynamic ACLs here as well. Therefore, in any part of the source path array (except the last element, which is reserved for the column name), we should be able to define filters using the following syntax:

<terminal_filter_element>: {"filter": , "operand": , "operator": , "negate": }

<filter>: {"and": [<filter>, ...], "negate": } || 
          {"or": [<filter>, ...], "negate": } || 
          <terminal_filter_element>

(This is just a summary of the syntax. For the complete rules of how these attributes work, please refer to the dynamic ACL documentation.)

  • Following the dynamic ACL syntax, the "filter" value could be any of the following formats:

    • [<left alias>, <column>]: to allow usage of columns from other contexts that are already defined.
    • [null, <column>]: to refer to the columns of the current path context.
    • [<column>]: same as above.
    • <column>: same as above.
  • The value of operator can be any of the binary filter predicates that ERMrest understands. We are not going to introduce a mapping, and data-modelers should just use the raw ERMrest syntax here.

  • ERMrestJS must validate the filters, including the values of each property (operator, operand, etc.). If we mark a filter as invalid, the whole source definition will be marked as invalid.

  • The filter is applied from left to right following the source path, starting from the table of interest (e.g., the table that has the annotation).
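The validation mentioned above could look roughly like the following sketch. This is illustrative, not the actual ERMrestJS implementation; KNOWN_OPERATORS is a hypothetical subset of the predicates ERMrest accepts, and a real validator would also check the column against the model.

```python
# Hypothetical subset of ERMrest's predicates (the real list is whatever ERMrest accepts).
KNOWN_OPERATORS = {"=", "::gt::", "::lt::", "::geq::", "::leq::", "::regexp::", "::null::"}

def is_valid_filter(f):
    """Recursively check a <filter> object against the grammar above."""
    if not isinstance(f, dict):
        return False
    if "and" in f or "or" in f:
        children = f.get("and", f.get("or"))
        # a junction must be a non-empty list of valid sub-filters
        return isinstance(children, list) and len(children) > 0 \
            and all(is_valid_filter(c) for c in children)
    if "filter" not in f:          # a terminal element needs a column
        return False
    op = f.get("operator", "=")    # "=" assumed as the default operator
    if op not in KNOWN_OPERATORS:
        return False
    # unary ::null:: needs no operand; binary predicates do
    return op == "::null::" or "operand" in f
```

If any node fails a check like this, the whole source definition would be marked invalid, matching the behavior described above.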

Based on this, the following are some examples of how this works:

// example 1
{
  "source": [
    {"inbound": ["schema", "constraint"]},
    {"filter": "col", "operand": "123"},
    "RID"
  ]
}

// example 2
{
  "source": [
    {"inbound": ["schema", "constraint"]},
    {
      "and": [
        {
          "or": [
            {"filter": "col1", "operand": "val1", "operator": "="},
            {"filter": "col2", "operand": "val2", "operator": "="}
          ],
          "negate": true
        },
        {"filter": "col3", "operand": "val3", "operator": "="}
      ]
    },
    "RID"
  ]
}
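To make the semantics concrete, here is a minimal sketch (not part of ERMrestJS) that renders such a filter object into an ERMrest predicate string, assuming ERMrest's `&` (conjunction), `;` (disjunction), and `!` (negation) syntax and that "filter" holds a plain column name:

```python
from urllib.parse import quote

def compile_filter(f):
    """Render a filter JSON object as an ERMrest predicate string (sketch only)."""
    neg = "!" if f.get("negate") else ""
    if "and" in f:
        return neg + "(" + "&".join(compile_filter(c) for c in f["and"]) + ")"
    if "or" in f:
        return neg + "(" + ";".join(compile_filter(c) for c in f["or"]) + ")"
    # terminal element: column, operator, url-encoded operand
    return neg + quote(f["filter"]) + f["operator"] + quote(str(f["operand"]))

# example 2 from above:
example2 = {"and": [
    {"or": [{"filter": "col1", "operand": "val1", "operator": "="},
            {"filter": "col2", "operand": "val2", "operator": "="}],
     "negate": True},
    {"filter": "col3", "operand": "val3", "operator": "="}]}
print(compile_filter(example2))  # (!(col1=val1;col2=val2)&col3=val3)
```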

Impact on chaise

In the following we'll go over the impact of adding filter support for each source type that we have.

In ERMrest, the filters are always applied at the end. Even if a filter is defined in the middle of the path, it will be applied after the joins based on the correct context. If we properly add an alias to every foreign key node in the path, we could technically parse the filters at the end as well. This might simplify the logic of adding alias/context support later.

So let's assume the database model used throughout the examples below (main with an all-outbound path to A and B; inbound tables C, E, F, and G; and an association table main_D_assoc linking main to D).

And we're looking at the record page of table main with RID=v1. In the case of recordset, assume we're just looking at the recordset of table main without any filters.

All-outbound paths (compact)

This includes one or more hops. For example:

{
  "source": [
    {"outbound": ["s", "main_a_fk"]},
    {"outbound": ["s", "a_b_fk"]},
    "RID"
  ]
}

These columns do not produce a secondary request; we include them in the main entity request, which in this case would be:

M:=s:main
         /left(fk_to_A)=(A:RID)/F1:=left(fk_to_B)=(B:RID)/$M
         /RID;F1:=array(F1:*)

But if we add a filter to any part of this path, the result will be incorrect, as the filters would be applied to the main table instead of filtering the rows before the left join (i.e., in a sub-query). Therefore we have to send an extra query.

{
  "source": [
    {"outbound": ["s", "main_a_fk"]},
    {"filter": "col", "operand": "3"},
    {"outbound": ["s", "a_b_fk"]},
    "RID"
  ]
}
T:=s:main/RID=v1
         /(fk_to_A)=(A:RID)/col=3/M:=(fk_to_B)=(B:RID)
-- expected behavior on recordset page   
SELECT 
  *,  
  ( -- each pseudo-column
    SELECT count(*)
    FROM A 
    JOIN B ON ( A.fk_to_B = B.RID)
    WHERE A.RID = M.fk_to_A 
      AND SOURCE_PATH_FILTERS -- the new filter extension
  )
FROM M 
JOIN FACET_TABLE -- facet tables 
WHERE FACET_FILTERS


-- if we apply the filters in the same main request, it won't work, especially when combining all relevant pseudo-columns
SELECT * 
FROM M JOIN FACET_TABLES
  LEFT JOIN A 
  LEFT JOIN B
WHERE FACET_FILTERS
  AND SOURCE_PATH_FILTERS 

Note: All multi-hop outbound pseudo-columns are currently fetched within the main search request. We might want to consider splitting multi-hop pseudo-columns from the main request for performance.

Aggregate (compact)

The aggregate function or path shouldn't affect how we process the filter, so let's assume the following multi-hop case:

{
  "source": [
    {"inbound": ["s", "e_main_fk"]},
    {"inbound": ["s", "f_e_fk"]},
    "f_col"
  ],
  "aggregate": "array_d"
}

We're sending secondary requests to fetch the aggregates. For the secondary request we're using an inner join:

M:=S:main/RID=v1/(RID)=(E:fk_to_main)/T:=(RID)=(F:fk_to_E)/c:=M:RID;v:=array_d(T:f_col)

Which is in the format of

<main table url>/<key value pairs of all the shortestkey values>/<join path based on the given source>/<shortest key>;<aggregate column>

Adding a filter anywhere on the path should not make any difference:

{
  "source": [
    {"inbound": ["s", "e_main_fk"]},
    {"filter": "e_col", "operator": "::gt::", "operand": "3"},
    {"inbound": ["s", "f_e_fk"]},
    "f_col"
  ]
}
M:=S:main/RID=v1/(RID)=(E:fk_to_main)/e_col::gt::3/T:=(RID)=(F:fk_to_E)/c:=M:RID;v:=array_d(T:f_col)

@HT Note: prior to Nov 30, 2021, we used a left join from the domain table to the main table. Because the RIDs of the main table were added as filters, this syntax behaved the same way as the inner join mentioned above, but was more expensive in the backend.

One-hop related entity (detailed context)

For example:

{
  "source": [
    {"inbound": ["s", "c_main_fk"]},
    "RID"
  ]
}

Display (and explore link)

Request to fetch the entities (uses the reverse path as above) as well as the "explore" link:

s:C/<facet-blob>

Where <facet-blob> is the reverse path filtered by the shortestkey value of main (@HT: assuming that the current record is main.RID=v1):

{
  "and": [
    {
      "source": [
        {"outbound": ["s", "c_main_fk"]},
        "RID"
      ],
      "choices": ["v1"]
    }
  ]
}

Which would be the following request to ermrest:

M:=S:C/(fk_to_main)=(main:RID)/RID=v1/$M
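The facet blob above is just the source path reversed, with each inbound/outbound flipped. That reversal can be sketched as follows; `reverse_path` is a hypothetical helper, and real sources would also need filters and aliases re-anchored when reversing:

```python
def reverse_path(source, key_col="RID"):
    """Flip a source path for use in a reverse facet blob (sketch).
    Walks the foreign-key nodes in reverse order, swapping
    inbound <-> outbound, and ends on the main table's key column."""
    out = []
    for node in reversed(source[:-1]):   # drop the leaf column
        if "inbound" in node:
            out.append({"outbound": node["inbound"]})
        else:
            out.append({"inbound": node["outbound"]})
    return out + [key_col]

src = [{"inbound": ["s", "c_main_fk"]}, "RID"]
print(reverse_path(src))   # [{'outbound': ['s', 'c_main_fk']}, 'RID']
```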

A request to fetch the entities where a filter is used should just take the filter into account, regardless of where the filter is:

{
  "source": [
    {"filter": "main_col", "operator": "::gt::", "operand": "3"},
    {"inbound": ["s", "c_main_fk"]},
    "f_col"
  ]
}

the facet blob would be

{
  "and": [
    {
      "source": [
        {"outbound": ["s", "c_main_fk"]},
        {"filter": "main_col", "operator": "::gt::", "operand": "3"},
        "RID"
      ],
      "choices": ["v1"]
    }
  ]
}

And the ermrest request:

M:=S:C/(fk_to_main)=(main:RID)/main_col::gt::3/RID=v1/$M

Filtering on the main record @HT

Although the location of the filter doesn't affect the type of request we generate, we might want to change the display heuristics depending on the filter position. If the filter is applied to the record page's main entity (e.g., if the filter is defined prior to the foreign key path) and the initial request didn't return any values, we should consider:

  • completely hiding the related table (not even visible in "show all related tables") -- preferred.
  • treating it the same as other empty tables.

Add

Currently, by clicking on "add", chaise will create a cookie so we can "pre-fill" and disable inputs. For this, we pre-fill foreign keys whose columns are a superset of the "related table foreign key" columns and whose extra columns are not-null. More info here and here.

Depending on the complexity of the filter and where in the path it is defined, we might or might not be able to properly do the "pre-fill" logic (e.g., when the filter is not of the simple form <column> = "string"). Therefore, we should provide the following options:

@HT: I am adding the following two topics:

  1. Whether to show the add button and whether to disable the prefilled values
  • (discuss) Data-modelers should be able to choose the level of checks that we're doing in recordedit:

    • Restricted: If we can properly pre-fill values, we should (and disable the inputs). Otherwise the button will be disabled. @HT This is preferred as it is less confusing e.g. users can't create a record that will not show up in the record page.
    • Guidance: If we can properly pre-fill values, we should just add them as placeholder and allow users to change.
    • Unrestricted: Don't try to pre-fill or limit users.
  • Data-modelers should be able to disable the "add" button. To allow this, we might want to add general "acl" support for hiding/showing the add button on pseudo-columns. But what does that exactly mean? Can they also hide a column/related table? (refer to the example below)

  • Apart from the pre-filled foreign key data, we should send more information to allow us to enforce the filters in recordedit as well. discuss We should have some heuristics to support simpler cases and for the more general cases we're just going to hide "add" button.

  • (discuss) In GitHub, if you create an issue in a filtered view that the newly created issue doesn't match, it tosses a little dismissible alert saying

    The item was created but is hidden because it does not match the current view
    

    We might be able to do the same in this scenario. We should know the RIDs of created/mutated records, and so might be able to formulate a query to test whether each is still visible in the current filtered set. @HT: and provide a link for the user to edit the newly created entry. I like this idea, but it should only be applied when filters are specified.

  2. Allow the DBA to provide guidance on how to properly fill in the form (refer to the example below)
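The "can we properly pre-fill" decision from the restricted mode above could be sketched as a simple predicate. `is_prefill_friendly` is a hypothetical helper, not actual chaise code:

```python
def is_prefill_friendly(node):
    """True when a path node is a simple <column> = <constant> filter
    whose value could be pre-filled (and disabled) in recordedit.
    Junctions (and/or), negation, non-equality operators, and patterns
    are all considered too complex to pre-fill."""
    if not isinstance(node, dict) or "filter" not in node:
        return False
    if node.get("negate") or "operand_pattern" in node:
        return False
    return node.get("operator", "=") == "=" and "operand" in node
```

Under the restricted mode, a source whose filters all pass this check would get pre-filled, disabled inputs; anything else would disable or hide the "add" button.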

The following is how a visible-column/visible-foreign-key could look like:

{
  "source": [...],
  "markdown_name": "related table name",
  "comment": "",
  "comment_display": "inline",

  // proposed properties:
  "filter_checking": "restricted",

  "addRelatedAcls": {
    "show": ["*"],
    "enable": ["*"]
  },
  "acls": {
    "insert": ["*"],
    "select": ["*"],
    "update": ["*"],
    "delete": ["*"]
  },

  "entry_instructions": "... Only fill in observed diseases.",
  "contextualized": {
    "entry": {
      "comment": "... Only fill in observed diseases."
    }
  }
}
  • We discussed the possibility of allowing a different filter for "add". But given that it requires a different syntax, we decided not to pursue it for now.

Edit

Currently we're not restricting edit at all. So even in the current workflow, users can change the foreign key value, which would cause the row to disappear from the list of related entities. Does that matter?

  • @HT: I find this feature useful, as sometimes there is a need to reassign related records to other main entities. I think warning that the entries will no longer show up on the page is sufficient.

Pure and binary association

For example:

{
  "source": [
    {"inbound": ["s", "assoc_main_fk"]},
    {"outbound": ["s", "assoc_d_fk"]},
    "RID"
  ]
}

Display

It's the same discussion as for one-hop related tables. We should hide the section completely if the filter is based on the main entity.

Add button

It opens a popup where we're sending two requests for each page of results:

  • One request is unfiltered domain table:

     M:=s:D
    
  • The other request is with the reverse facet:

    {
      "and": [
        {
          "source": [
            {"inbound": ["s", "assoc_d_fk"]},
            {"outbound": ["s", "assoc_main_fk"]},
            "RID"
          ],
          "choices": ["v1"]
        }
      ]
    }

    Which results in

    M:=S:D/(RID)=(main_D_assoc:fk_to_D)/(fk_to_main)=(main:RID)/RID=v1/$M
    

And if we add facet filters, they will be added to both requests:

Let's assume we choose d_col=3 in a facet:

  • domain request:

    M:=S:D/d_col=3/$M
    
  • existing value request:

    {
      "and": [
        {
          "source": [
            {"inbound": ["s", "assoc_d_fk"]},
            {"outbound": ["s", "assoc_main_fk"]},
            "RID"
          ],
          "choices": ["v1"]
        },
        {
          "source": "d_col",
          "choices": ["3"]
        }
      ]
    }

    Which results in

    M:=S:D
          /(RID)=(main_D_assoc:fk_to_D)/(fk_to_main)=(main:RID)/RID=v1/$M
          /d_col=3/$M
    

Adding a filter to the source definition will affect these two requests. Depending on where the filter is defined, we might or might not be able to properly handle it:

  1. Filter in between: It's impossible to offer an intuitive add behavior. We cannot add the filter without the path, and adding the path assumes the relationship between the related table and the association table already exists for some other main table key. This causes a correctness issue, as fewer rows will show up in the popup.

    {
      "source": [
        {"inbound": ["s", "assoc_main_fk"]},
        {"filter": "assoc_col", "operator": "::gt::", "operand": "3"},
        {"outbound": ["s", "assoc_d_fk"]},
        "RID"
      ]
    }
  • So we shouldn't allow the "add" feature, but we should still consider it as pure and binary (unlink).
  2. Filter on related table (leaf):

    {
      "source": [
        {"inbound": ["s", "assoc_main_fk"]},
        {"outbound": ["s", "assoc_d_fk"]},
        {"filter": "d_col", "operator": "::gt::", "operand": "3"},
        "RID"
      ]
    }

    Where the read request is with the following facet:

    {
      "and": [
        {
          "source": [
            {"filter": "d_col", "operator": "::gt::", "operand": "3"},
            {"inbound": ["s", "assoc_d_fk"]},
            {"outbound": ["s", "assoc_main_fk"]},
            "RID"
          ],
          "choices": ["v1"]
        }
      ]
    }

    In this case we might be able to process the filter and add it as a hidden filter to the popup.

Notes:

  • (discuss) In all cases, we should offer settings to control how the add should behave:

    • Restricted mode: If we can process the filter in the popup then we should; otherwise the "add" button should be hidden.
    • Unrestricted: Allow add. Don't apply the filters and just show the whole domain table.
  • (discuss) There should be a way to show the filter to users, the same as domain_filter.

  3. Filter on main table: It will just change whether we should show the whole related table or not. So if we decide to show the values, then we should allow users to add without any additional filters. @HT: instead of this, it should refer the filter to the leaf table (second case). Note: I swapped 2 and 3.
    {
      "source": [
        {"filter": "main_col", "operator": "::gt::", "operand": "3"},
        {"inbound": ["s", "assoc_main_fk"]},
        {"outbound": ["s", "assoc_d_fk"]},
        "RID"
      ]
    }

Free-form related entity

For example:

{
  "source": [
    {"inbound": ["s", "e_main_fk"]},
    {"inbound": ["s", "f_e_fk"]},
    {"outbound": ["s", "f_g_fk"]},
    "RID"
  ]
}
  • Request to fetch the entities: same discussion as above (the filter should change the request, and if it's based on the main entity, it should completely hide the section).

  • Add button: not offered so shouldn't change.

Facet

Let's assume the following facets are defined for the main table:

{
  "and": [
    {
      "source": [
        {"inbound": ["s", "e_main_fk"]},
        {"inbound": ["s", "f_e_fk"]},
        "RID"
      ]
    },
    {
      "source": [
        {"inbound": ["s", "assoc_main_fk"]},
        {"inbound": ["s", "assoc_d_fk"]},
        "RID"
      ]
    }
  ]
}

In the following, I'm going over different iterations of how we process facets:

  1. Without filter in the path and without any null values:
{
  "and": [
    {
      "source": [
        {"inbound": ["s", "e_main_fk"]},
        {"inbound": ["s", "f_e_fk"]},
        "f_col"
      ],
      "choices": ["v1", "v2"]
    },
    {
      "source": [
        {"inbound": ["s", "assoc_main_fk"]},
        {"inbound": ["s", "assoc_d_fk"]},
        "d_col"
      ],
      "choices": ["v3", "v4"]
    }
  ]
}

main request

M:=s:main/
         (RID)=(E:fk_to_main)/(RID)=(F:fk_to_E)/f_col=v1;f_col=v2/$M
         /(RID)=(main_D_assoc:fk_to_main)/(fk_to_D)=(D:RID)/d_col=v3;d_col=v4/$M

facet request (D)

T:=s:main/
        (RID)=(E:fk_to_main)/(RID)=(F:fk_to_E)/f_col=v1;f_col=v2/$T
        /(RID)=(main_D_assoc:fk_to_main)/M:=(fk_to_D)=(D:RID)
  2. With filter in the path and without any null values:
{
  "and": [
    {
      "source": [
        {"inbound": ["s", "e_main_fk"]},
        {"filter": "e_col", "operand": "3"},
        {"inbound": ["s", "f_e_fk"]},
        "f_col"
      ],
      "choices": ["v1", "v2"]
    },
    {
      "source": [
        {"inbound": ["s", "assoc_main_fk"]},
        {"inbound": ["s", "assoc_d_fk"]},
        "RID"
      ],
      "choices": ["v3", "v4"]
    }
  ]
}

main request:

M:=s:main/
         (RID)=(E:fk_to_main)/e_col=3/(RID)=(F:fk_to_E)/f_col=v1;f_col=v2/$M
         /(RID)=(main_D_assoc:fk_to_main)/(fk_to_D)=(D:RID)/d_col=v3;d_col=v4/$M

facet request (D):

T:=s:main/
        (RID)=(E:fk_to_main)/e_col=3/(RID)=(F:fk_to_E)/f_col=v1;f_col=v2/$T
        /(RID)=(main_D_assoc:fk_to_main)/M:=(fk_to_D)=(D:RID)
  3. Without filter in the path and with a null value:
s:F/RID::null::/(fk_to_E)=(E:RID)/M:=right(fk_to_main)=(main:RID)/
   /(RID)=(main_D_assoc:fk_to_main)/(fk_to_D)=(D:RID)/RID=v3;RID=v4/$M

Note: As I mentioned in the beginning, the filters are technically applied at the end, so a more accurate representation of the filter would be:

s:F/(fk_to_E)=(E:RID)/M:=right(fk_to_main)=(main:RID)/
   /(RID)=(main_D_assoc:fk_to_main)/(fk_to_D)=(D:RID)/$M/F:RID::null::&RID=v3;RID=v4  

facet request (D):

s:F/RID::null::/(fk_to_E)=(E:RID)/T:=right(fk_to_main)=(main:RID)/
  /(RID)=(main_D_assoc:fk_to_main)/M:=(fk_to_D)=(D:RID)

which technically is:

s:F/(fk_to_E)=(E:RID)/T:=right(fk_to_main)=(main:RID)/
  /(RID)=(main_D_assoc:fk_to_main)/M:=(fk_to_D)=(D:RID)/F:RID::null::
  4. With filter in the path and with a null value:
{
  "and": [
    {
      "source": [
        {"inbound": ["s", "e_main_fk"]},
        {"filter": "e_col", "operand": "3"},
        {"inbound": ["s", "f_e_fk"]},
        "f_col"
      ],
      "choices": [null]
    },
    {
      "source": [
        {"inbound": ["s", "assoc_main_fk"]},
        {"inbound": ["s", "assoc_d_fk"]},
        "RID"
      ],
      "choices": ["v3", "v4"]
    }
  ]
}

main request (not connected to a record):

s:F/(fk_to_E)=(E:RID)/M:=right(fk_to_main)=(main:RID)/
   /(RID)=(main_D_assoc:fk_to_main)/(fk_to_D)=(D:RID)/$M/F:RID::null::&E:e_col=3&D:RID=v3;D:RID=v4  

Each filter translates to:

  • The F:RID::null:: filter ensures that we return rows that are not part of the path from F to E to main.
  • The E:e_col=3 filter looks for rows that are part of the path and have that value.

And as you can see, they are opposites of each other, and therefore the request will always return an empty result.

If we could use inner join for null filters then we could support filter in source in combination with null filter.

Click here for more information as to why we're doing outer join.

There are three types of null/not-null that we might be interested in:

  1. Connected to non null.
  2. Connected to null.
  3. Not connected.

But:

  • We would have to implement 1 with inner joins.
  • We cannot implement 3 on its own.
  • We could implement 2 with inner joins, but nobody cares about the distinction between 2 and 3.
  • We implement 2 union 3 with an outer join (which can't distinguish between 2 and 3).

That's why in the past we decided to always do an outer join, and in the cases where both linkability and the scalar value can be null, we simply don't offer the null option. @HT: I don't understand the last sentence.

To find more info about the existing hide_null_choice logic refer to here and here.
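A tiny self-contained illustration (toy Python data, not ERMrest) of why the outer join collapses categories 2 and 3:

```python
# main rows and their F-side scalar values; m3 has no F row at all
main = ["m1", "m2", "m3"]
f_rows = {"m1": "x", "m2": None}   # m2 is connected, but to a null value

def outer_join_value(m):
    # left outer join: a missing row (category 3) and a null value
    # (category 2) both surface as None on the main side
    return f_rows.get(m)

null_side = [m for m in main if outer_join_value(m) is None]
print(null_side)   # ['m2', 'm3'] -- categories 2 and 3, indistinguishable
```

An inner join would drop m3 entirely and isolate category 1, but then categories 2 and 3 could no longer be offered together as a single "No value" choice.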


Therefore, if there's a filter anywhere along the path other than on the main table, we cannot offer the "No value" option.

We discussed the possibility of allowing data-modelers to change the restriction level. We were thinking about:

  • Restricted: ensure "No value" is not offered for facets with filter in their source.
  • Unrestricted: Ignore filters while applying "No value" facet and therefore offer the "No value" option.

But the "unrestricted" mode is going to be confusing in most cases, and if data-modelers really care about the "No value" option, they could add an extra check_presence facet without the filters in the path.

Error handling

Currently, if there's an issue in the source definition we simply ignore it. This has mainly been done to avoid throwing raw ERMrest errors. With the addition of filters in the source definition, catching every invalid filter up front might not be easily achievable. So apart from making sure there is some bare-minimum validation of the syntax, we should also improve the error handling in chaise.

For instance, if we define a filter like {"filter": "boolean_col", "operand_pattern": "3"} where boolean_col is of boolean type, ERMrest will complain and chaise just displays the raw 409 error. Therefore:

  • We should not show the raw error message to users and instead display the generic terminal error message (the error details should still show the raw message).

  • We should add context to the errors of the GET requests that we have. The simplest way to do this is adding an extra attribute to the error object in the catch clause of Reference.read instances in chaise, and then showing the context as a separate line in the error popup. For instance, for faceting requests we could say

    Unable to fetch data for the facet `<facet-name>`.
    
  • We might be able to find a pattern in the error messages that we see, in order to repaint them as a more readable message. For instance, in the case of a filter like {"filter": "boolean_col", "operand_pattern": "3"}, the error message looks like the following:

    409 Conflict
    The request conflicts with the state of the server. invalid input syntax for type boolean: "3"
    LINE 8: WHERE (((t0."boolean_col" = '3'::boolean))) ...
                                            ^
    

    We could look for invalid input syntax for type and then reword the error message. In general, the following are the issues that a filter might have:

    • The column used in the filter doesn't exist.
    • The operator is not appropriate for the column type.
    • The operand_pattern is not appropriate for the column type.

We should be able to find a repeatable pattern for each error message. I should also mention that in some cases we might be able to avoid this by validating the given filter in the source. For instance, we can simply detect the case where the column doesn't exist in the table.
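The "repaint the message" idea could be prototyped as below. The regex and the replacement wording are assumptions based only on the sample message above, not an exhaustive mapping:

```python
import re

def reword_conflict(message):
    """Return a friendlier message for known 409 patterns, or None to
    fall back to the generic terminal error."""
    m = re.search(r'invalid input syntax for type (\w+): "([^"]*)"', message)
    if m:
        return 'The value "{}" is not a valid {}.'.format(m.group(2), m.group(1))
    return None

raw = ('The request conflicts with the state of the server. '
       'invalid input syntax for type boolean: "3"')
print(reword_conflict(raw))   # The value "3" is not a valid boolean.
```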

Pseudo-column name logic

Currently, ERMrestJS assigns a unique hash name to each pseudo-column based on the structure of its source. The name is used in different places; for example:

  • We use the generated name instead of checking the whole structure again to detect duplicates in visible-columns and visible-foreignkeys annotations.
  • It's used to match the filters in the facet blob with facet columns.
  • When users click on "sort" icon in a tabular view for a pseudo-column, the name will be used as a sort modifier.

Since we're creating the hash based on the given source property, we have to ensure the generated hash is deterministic and only changes when the source is structurally different. Therefore,

  • The order of properties in an object shouldn't matter. So the following should generate the same hash:

    {
      "source": [
        {"inbound": ["schema", "constraint"]},
        {"filter": "col1", "operand_pattern": "val", "operator": "::gt::"},
        "RID"
      ]
    }
    {
      "source": [
        {"inbound": ["schema", "constraint"]},
        {"operator": "::gt::", "operand_pattern": "val", "filter": "col1"},
        "RID"
      ]
    }
  • The order of filters under and/or shouldn't matter (requires more thought). So the following should generate the same hash:

    {
      "source": [
        {"inbound": ["schema", "constraint"]},
        {
          "and": [
            {"filter": "col2", "operand": "val2", "operator": "="},
            {"filter": "col1", "operand": "val1", "operator": "="}
          ]
        },
        "RID"
      ]
    }
    {
      "source": [
        {"inbound": ["schema", "constraint"]},
        {
          "and": [
            {"filter": "col1", "operand": "val1", "operator": "="},
            {"filter": "col2", "operand": "val2", "operator": "="}
          ]
        },
        "RID"
      ]
    }
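One way to get such a deterministic name is to canonicalize the source before hashing: sort object keys, and sort the members of and/or arrays. This is an illustrative sketch, not the naming scheme ERMrestJS actually uses:

```python
import hashlib
import json

def source_hash(source):
    """Hash a source array so that property order and and/or member
    order don't change the result (illustrative sketch)."""
    def canonical(node):
        if isinstance(node, dict):
            out = {k: canonical(v) for k, v in node.items()}
            for k in ("and", "or"):
                if k in out:
                    # order of junction members shouldn't matter
                    out[k] = sorted(out[k], key=lambda c: json.dumps(c, sort_keys=True))
            return out
        if isinstance(node, list):
            return [canonical(v) for v in node]
        return node
    # sort_keys makes property order irrelevant
    blob = json.dumps(canonical(source), sort_keys=True)
    return hashlib.md5(blob.encode()).hexdigest()

a = [{"inbound": ["schema", "constraint"]},
     {"and": [{"filter": "col2", "operand": "val2", "operator": "="},
              {"filter": "col1", "operand": "val1", "operator": "="}]}, "RID"]
b = [{"inbound": ["schema", "constraint"]},
     {"and": [{"filter": "col1", "operand": "val1", "operator": "="},
              {"filter": "col2", "operand": "val2", "operator": "="}]}, "RID"]
assert source_hash(a) == source_hash(b)
```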

Other extensions

Dynamic filter that uses pattern

In #854 we talked about the possibility of supporting domain_filter in add pure and binary. But apart from the challenges of making that change, usage of domain_filter requires ERMrest knowledge: you have to be familiar with how ERMrest filters and joins work to write a proper domain_filter (although the current usage in different deployments is just limited to value checks). Therefore we decided to think about the possibility of extending the filters to allow patterns. This means the addition of operand_pattern to the filters of the source syntax.

<terminal_filter_element>:  {"filter": , "operand_pattern": , "operator": , "negate": } || {"filter": , "operand": , "operator": , "negate": }

<filter>: {"and": [<filter>, ...], "negate": } || 
          {"or": [<filter>, ...], "negate": } || 
          <terminal_filter_element>

The main problem with allowing patterns in a source definition is the templating environment. Currently, the templating environment is determined based on the "active list": the list of columns, pseudo-columns, and related entities that are visible on the page and whose values users can access. To generate this list, chaise processes the list of visible columns and foreign keys (including the wait_fors) and determines when a pattern needs to be executed. Given that the structure of visible columns and foreign keys is needed for the active list, we cannot simply support the whole active list and wait_for here; we would have to limit it. This is doable in the record and recordedit pages, where we first fetch the main entity and then create the structure of the page based on its result.

Comparison with domain_filter

domain_filter advantages:

  • You can conditionally add the filter; this is not easily possible with filter_pattern and would require complicating the filter syntax (although this depends on how we want to treat computed empty/null values in filter_pattern).
  • We can define a display_markdown_pattern to show the filter to users.
  • Allows filtering based on other tables (using joins) as long as the path is properly reset.

domain_filter disadvantages:

  • Requires ERMrest knowledge
  • Limited to recordedit and requires changes to be able to have it in other apps.
  • Limited to the pickers and cannot be used for showing the values.

Recordedit

Given that recordedit is simple and doesn't have any flow-control system, it cannot support any of the more complicated pseudo-columns (you can use source syntax to define a local column or an outbound foreign key, but nothing more complicated).

If we want to allow these types of pseudo-columns,

  • In the case of edit/copy, we should make sure only values that match the filters are displayed. To do so, we would need to send multiple requests and therefore would require a proper flow-control system.
  • For the foreign key pickers, the reference used in the popup should use the filtered path which is simple.

Record

Allowing patterns in the filter definition shouldn't add too much complexity to the record page. We just have to make sure that we're processing the patterns before creating the list of visible foreign keys and columns.

Click here for an example of how this would work.
"source_definitions": {
  "sources": {
    "S_collection": {
      "source": [
        {"inbound": ["CFDE", "personal_collection_file_file_fkey"]},
        {"outbound": ["CFDE", "personal_collection_file_collection_fkey"]},
        "RID"
      ]
    }
  }
},
"visible_foreign_keys": {
   "detailed": [
     {"sourcekey": "S_collection"}
   ],
   "detailed/link": [
     {
       "source": [
         {"sourcekey": "S_collection"},
         {"filter": "RCB", "operand_pattern": "{{{$session.id}}}"},
         "RID"
       ]
     }
   ]
}

Questions/notes:

  • What if the pattern results in null/empty string? With domain_filter you could conditionally add filters, what about patterns?

    "foreign-key": {
        "domain_filter": {
             "ermrest_path_pattern": "RID={{{$fkey_schema_const.values.RID}}}"
        }
     }
    
    • Per our discussions, when a pattern results in an empty string, we have the following options:
      • Invalidate the whole source definition.
      • Ignore the filter
      • Treat the operand as empty string.

    But to give more freedom to data-modelers, we might want to introduce an extra condition_pattern. If this pattern evaluates to any non-empty string, we will compute the operand_pattern and use it as is. Otherwise, the filter clause should be ignored.

    {
      "filter": "RID",
      "operand": "",
      "operand_pattern": "{{{$fkey_schema_const.values.RID}}}", 
      "operator": "=",
      // whether we should compute operand_pattern
      "condition_pattern": ""
    },
  • What if both operand and operand_pattern are used? Do we even need operand if we're going to add operand_pattern?

    • Based on what we discussed, this depends on whether handlebars allows escaping the expressions, which it does.
  • What about other apps? Should we just invalidate these column definitions in other apps?
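The proposed condition_pattern/operand_pattern interaction could be summarized by a sketch like the following; `resolve_operand` and the `render` callback (a stand-in for the template engine) are hypothetical:

```python
def resolve_operand(node, render):
    """Decide the effective operand for a terminal filter.
    Follows the proposal above: a non-empty condition_pattern gates
    whether operand_pattern is computed at all."""
    if "condition_pattern" in node:
        if render(node["condition_pattern"]) == "":
            return None                        # ignore this filter clause
    if "operand_pattern" in node:
        value = render(node["operand_pattern"])
        # empty result: ignore here; invalidating instead is another option
        return value if value != "" else None
    return node.get("operand")

# trivial "template engine" for the sketch: patterns are already resolved
assert resolve_operand({"filter": "RID", "operand": "1"}, str) == "1"
```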

Alias and context

The source definition syntax is based on dynamic ACLs. That's why, when we originally discussed adding filter, we followed the syntax described here. After adding filter, the only things that are missing are alias and context support. For the existing use cases these two attributes are not needed, but we should think about the possibility of supporting them in the future.

The complexity of adding alias is not related to the feature discussed in this issue, and I think we should separate the two. This came up before in #846, so I'm going to add more information there.