On February 2020 we redesigned the whole logging mechanism in chaise. If you want to analysis a data prior to this date, please refer to the old documentation.
This document will describe how chaise is logging server requests as well as client actions.
By providing Deriva-Client-Context
header in ermrset requests we can log extra objects alongside the request. ERMrest will log the provided object in the dcctx
attribute of logs. For example the following is a line from /var/log/messages
file in dev.isrd that is for the request to getting one of the facet options.
{
"elapsed": 0.037084000000000006,
"req": "Le-1bpTwSfSxddRT8Y0T5g",
"scheme": "https",
"host": "dev.derivacloud.org",
"status": "200 OK",
"method": "GET",
"path": "/ermrest/catalog/1/entity/T:=isa:dataset/(id)=(isa:dataset_organism:dataset_id)/M:=(organism)=(vocab:species:id)@sort(name,RID)?limit=11",
"type": "application/json",
"client": "128.9.180.218",
"referrer": "https://dev.derivacloud.org/chaise/recordset/",
"agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36",
"track": "916d434.5aa5c5cf6b595",
"dcctx": {
"catalog": "1",
"schema_table": "vocab:species",
"stack": [
{
"type": "set",
"s_t": "isa:dataset"
},
{
"type": "facet",
"s_t": "vocab:species",
"source": [
{
"i": ["isa", "dataset_organism_dataset_id_fkey"]
},
{
"o": ["isa", "dataset_organism_organism_fkey"]
},
"id"
],
"entity": true
}
],
"action": ":set/facet,choice;load",
"cid": "recordset",
"pid": "1sy623td1kht2cyu2bv01w3e",
"wid": "1kcg21wn27ah1hr52djl269e",
"elapsed_ms": 4578
}
}
In the following, we're going to summarize what are the attributes that are being logged in the dcctx
attribute alongside each request.
The following are the default attributes that you can find on all the requests:
cid
: The app name (record, recordset, recordedit, etc).wid
: The window id (randomly generated). This value is stable across page loads to allow tracking of chaise app flows within one window.pid
: The page id. Randomly generated on each page load.catalog
: The catalog id.schema_table
: Theschema:table
combination. This captures the table that the current action is performed on.action
: A pre-defined string that implies what the request was for. Please refer to Action definition section for more information.elapsed_ms
: A value set to determine the elapsed time since the ermrestJS http service has been available. This will always be in milliseconds
Depending on the request, we might log extra attributes that we are gong to list in the following. You can find more information about each one in the next section.
-
ppid
: The parentpid
. This attributes will be available only on a number of requests. It will indicate which app and page led to this current request. These request are:- recordset: First read of the main entity.
- record: First read of the main entity.
- recordedit: The create/update request (the request generated when user clicks on submit).
- viewer: First read of the main entity.
-
pcid
: The parentcid
. Please refer toppid
to find more information on where you mind find this attribute. The following is the list ofpcid
's that will be present on main entity requests in the chaise apps.cid
will always represent what app the request was made from,pcid
will represent what app the user navigated from when the main entity request was triggered.record
recordset
recordedit
viewer
navbar
navbar/record
navbar/recordset
navbar/recordedit
If the user clicked on a link in the navbar, the
PCID
will properly denote what app the user came from that had the navbar present. A static page that uses the navbar app, will set thePCID
asnavbar
. Otherwise the appname will be appended (i.e.navbar/<appname>
). This is true for the deriva-webapps as well. A full list ofPCID
values (including applications that are not chaise) can be found here. -
paction
: The action in the parent page that fired the current request. Acceptable values are:view
: Available on the first read of the main entity in record page. Indicates that user clicked on "view" button in tabular displays.apply-sq
: Available on the first read of the main set in recordset page. Indicates that the user clicked "Apply search criteria" button in tabular displays.explore
: Available on the first read of the main set in recordset page. Indicates that the user clicked the "Explore" button on record app.
-
stack
: This attribute can be found on almost all the requests. It will capture the path that user took to get to the performed action. For example, if the logged request is for when a user interacts with a add pure and binary picker, using this stack you can figure out which main table and related (or inline table) user is interacting with.stack
is an array of objects that each node can have the following attributes:-
Required attributes:
-
s_t
: The end table of this node in the format ofschema:table
.- As an exception, in viewer app, if an image annotation is derived from file (not database), this value will not be available on the stack object.
-
type
: The type of the node request. It can be any of:entity
(row based),set
(rowset based),col
(column),pcol
(pseudo-column),fk
(foreign key),related
(inline or related table),saved_query
(saved query dropdown functionality),annotation
(image annotation in viewer app).
-
-
Optional attributes:
-
filters
: The facet object using the compressed syntax. -
source
, andentity
: The source path that defines this node using the compressed syntax. -
agg
: The aggregate function. -
picker
: If the request is happening on a picker,"picker":1
is added to the last node of thestack
. -
cfacet
: Only applicable to the first node. If url contains custom-facets (*::cfacets::
), this attribute will be equal to one. In this case one of the following attributes will be available:cfacet_str
: the displayname of custom-facet (if provided in url).cfacet_path
: the ERMrest path that was sent with the custom-facet.
-
custom_filters
: Only applicable to the first node, it captures the usage of "ermrest filters". If we could turn an "ermrest filter" into facet, this will returntrue
. Otherwise it will return the given "ermrest filters". -
causes
: The "reload" requests are using this attribute to give more clue of why this request was generated. It is an array of unique values. The possible values are different based on the "reload" request and you should refer to each request for more information. The following is all the possible causes values:clear-all
: Clear all button clicked.clear-cfacet
: Clear "cfacet" button in breadcrumbs clicked.clear-custom-filter
: Clear "custom filter" button in breadcrumbs clicked.entity-create
: New rows have been created for the current table.entity-delete
: A row in the current table has been deleted.entity-update
: Some rows in the table have been updated.facet-clear
: Clear facet button in breadcrumb clicked.facet-deselect
: Facet checkbox unchecked.facet-select
: Facet checkbox checked.facet-modified
: Facet changed in the modal.facet-search-box
: Facet search box changed.facet-plot-relayout
: Users interact with plot and we need to get new info for it.facet-retry
: Users click on retry for a facet that timed out.page-limit
: Change displayed number of rows per page.page-next
: Go to next page,page-prev
: Go to previous page.related-create
: New rows have been created for a related table.related-delete
: A row in one of related tables has been deleted.related-update
: Some rows in one of related table have been updated.related-inline-create
: New rows have been created for an inline related table.related-inline-delete
: A row in one of inline related tables has been deleted.related-inline-update
: Some rows in one of inline related table have been updated.sort
: sort changed.search-box
: search-box changed.
-
start_ms
: Used only on "reload" requests to indicate the time that the page become dirty first. -
end_ms
: Used only on "reload" requests to indicate the time that we're sending the request. If you join all the "reload" requests by theirstart_ms
, the longestend_ms
will return the request that the user actually sees. The other requests were generated by flow-control as user kept interacting with the page without actually showing their result. -
search-str
: Available only on client actions that are based on interacting with a search box. It returns the new search string. -
num_created
: The number of rows that have been asked to be created in that request. -
num_updated
: The number of rows that have been asked to be updated in that request. -
updated_keys
: Available only on "update" request, will return the key columns and submitted values. It's an object that and thecols
attribute returns an array of key column names, whilevals
returns an array of values that corresponds to the givencols
array (So it's an array of arrays). -
template
: Used only for the "export" request and will return an object withdisplayname
andtype
attributes to give more information about the used export template. -
old_thickness
,new_thickness
: Available only on "line thickness adjustment" client log in viewer app. It will capture the old and new value after user interacted with the UI to change the line thickness. -
file
: If the displayed image annotation in viewer app is derived from a while (and not database),"file": 1
will be added to the stack (s_t
will not be available.)
-
-
-
rid
: Available on the "go to RID" client action, to indicate the RID value that users searched for. -
sq_rid
: Available on the first read of the main set in recordset page. Indicates that the user navigated to this recordset page from a saved query link. -
cqp
(chaise query parameter): When a user uses a link that includes the?
instead of the#
. These urls are only used to help with google indexing and should be used only for navigating users from search engines to chaise apps. -
names
: Used in "navbar" request to capture the path that user took to end up in a particular menu option. It is an array of navbar option "name"s. The last item in the array if the name of the navbar option that user is acting on, and the rest are the name of its ancestors.
As it is mentioned in the previous section, action
is one of the required attributes in the logs. action
is a string that is created using the following format:
[<app-mode>]*:[<stack-path>]*,[<ui-context>]*;<verb>
Where,
-
app-mode
(optional) is used when for apps that have different modes. The following are apps that are using this and the possible values:- recordedit:
edit
: Edit mode of the app.create-copy
: When users end up in this app by clicking on "copy" button in record app.create-preselect
: When users end up in this app by clicking on "Add" button of related entities (In this case, we're preselecting the foreignkey relationship between related entity and the main).create
: Create mode of the app that is not the other more specific versions.
- viewer:
edit
: When users are editing an annotation.create
: When users are creating an annotation.create-preselect
: When users click on "edit" button of an annotation that was imported from a file. In this case, technically they are going to create a new annotation record in the database and the annotated term is preselected.
- recordedit:
-
stack-path
(optional) is available only whenstack
is used and summarizes thestack
. To distinguish between each stack nodes in this string, we're using/
. The values used for each node are:entity
: Based onentity
stack nodetype
.set
: Based onset
stack nodetype
.col
: Based oncol
stack nodetype
.pcol
: Based onpcol
stack nodetype
.related
,related-inline
: Based onrelated
stack nodetype
.related-link-picker
: Used for the association link picker.related-unlink-picker
: Used for the association unlink picker.facet
: Based onfacet
stack nodetype
.facet-picker
: Used for facet picker.fk
: Based onfk
stack nodetype
.fk-picker
: Used for foreign key picker.fk-bulk-picker
: Used for foreign key picker when the selections fill the same foreign key field in multiple recordedit formsfk-dropdown
: Used for foreign key inputs when they are a dropdown.saved-query-entity
: Used for saved query create popup.saved-query-picker
: Used for saved query apply picker.annotation-set
: Annotation list displayed on the viewer app.annotation-entity
: Each individual annotation displayed in annotation list of viewer app.
Based on this,
entity/related-inline
is a possiblestack-path
. -
ui-context
(optional) is used to give more clue as to where in the UI this request belongs to. -
verb
is the actual user action.
Extra rules followed in the action string:
/
is used in each of these sections, to indicate another level. For exampleentity/related
instack-path
means that the wholestack-path
of page isentity
, and this request is part ofentity/related
subsection.stack-path
has the same number of levels asstack
and each level corresponds to each stack node. But the string used to represent a stack node can be different from thetype
used in the stack. This has been done to add more information in the action string. For example, theload
request for both facet, and facet picker are using an stack that hasset
, andfacet
nodes. But theirstack-path
s areset/facet
andset/facet-picker
.-
is used to separate the words at the same level. The-
could imply hierarchy or just a separator for words, and we're not going to distinguish the two cases by using different delimiters to avoid confusion.
You can find the full list of log requests in this google sheet. The following sections have been added to provide more information about the content of this google sheet.
When users open a page, or interact with the page, we might send some requests to server based on the action. You can gather more information about the request just by looking at the url associated withe request. We call these server logs. On the other hand, some user interactions won't necessarily generate a server request and therefore we are logging these by sending a HEAD request to a predefined ermrest path. Currently we're using /ermrest/client_action
path for these client logs.
As we previously explained, stack
is used to capture the user path. It's an array of "stack nodes", where each node summarizes the user path. This has been mainly done to distinguish between the same actions that users might have taken in different paths. For instance, we show the same controls that we have in recordset app, in all of our modal pickers. Without the stack, the user path (context) would be lost. We tried to summarize the stack structure associated with each request in the google sheet. Each cell represent a node in the stack with attributes that will be available on it.
To give you a better idea of how to read the table, let's consider the case that the user is interacting with a facet picker, that was opened from an association link picker on an inline related entity. So,
-
first stack node should capture the record page app and the filters that it had. So it should be of
"type": "entity"
, and should havefilters
of main page ands_t
should be the main table.{ "type": "entity", "s_t": "schema:main_table", "filters": {"and": [{"src": "RID", "ch": ["RID_VALUE"]}]} }
-
second stack node should capture the association link picker and the filters that it had before opening the facet picker. So the
s_t
should show the leaf (related) table, thesource
should capture the path from the main table to this related table, a"picker": 1
should be added to signal that this was a picker and not the related section, and if the picker had any filters (facets),filters
should capture those (filters
might not be available if the user didn't select any filters).{ "type": "related", "s_t": "schema:related_table", "source": [{"i": ["schema", "constraint"]}, "RID"], "entity": true, "picker": 1, "filters": {"and": [{"src": "SOME_COLUMN", "ch": ["SOME_VALUE"]}]} }
-
third stack node should capture the state of facet picker. So the
s_t
should show the facet table,source
should be the path from the related table to the facet table,"picker": 1
should be added to signal that it was a picker, and if the picker had any filters (search-box filter is the only applicable case here),filters
should capture those (filters
might not be available if the user didn't change the search-box value).{ "type": "facet", "s_t": "schema:facet_table", "source": [{"o": ["schema", "constraint2"]}, "RID"], "entity": false, "picker": 1, "filters": {"and": [{"key": "search-box", "s": ["TERM"]}]} }
And if we put these three together, this is how the stack should look like:
{
"stack": [
{
"type": "entity",
"s_t": "schema:main_table",
"filters": {"and": [{"src": "RID", "ch": ["RID_VALUE"]}]}
},
{
"type": "related",
"s_t": "schema:related_table",
"source": [{"i": ["schema", "constraint"]}, "RID"],
"entity": true,
"picker": 1,
"filters": {"and": [{"src": "SOME_COLUMN", "ch": ["SOME_VALUE"]}]}
},
{
"type": "facet",
"s_t": "schema:facet_table",
"source": [{"o": ["schema", "constraint2"]}, "RID"],
"entity": false,
"picker": 1,
"filters": {"and": [{"key": "search-box", "s": ["TERM"]}]}
}
]
}
This stack would be added to any logs that we generate from that picker, and will be reflected in the action string. For example if the user opens the page-limit dropdown menu, the action would be :entity/related-link-picker/facet-picker,page-size;open
.
Since the facet object can be lengthy, we decide to modify it for the log purposes. The structure is the same, we are just going to compress some of the attribute names. These are the compressed version of each attribute used in the facet syntax:
i
forinbound
o
foroutbound
src
forsource
key
forsourcekey
ch
forchoices
r
forranges
s
forsearch
So for example if the facet object is
"and": [
{"source":"search-box", "search":["test"]},
{"source": [{"inbound": ["s", "cons1"]}, {"outbound": ["s", "cons2"]}, "RID"], "choices":["1"]},
{"sourcekey":"some-key", "ranges":[{"min": 1}]}
]
We're going to log it as the following:
"and": [
{"src":"search-box", "s":["test"]},
{"src": [{"u": ["s", "cons1"]}, {"o": ["s", "cons2"]}, "RID"], "ch":["1"]},
{"key":"some-key", "r":[{"min": 1}]}
]
To find the errors in log, you can search for /ermrest/terminal_error
path. This is what a error log would look like:
{
"e": 1,
"message": "ERROR MESSAGE",
"name": "ERROR TITLE"
}
As the name suggests, we currently are only logging terminal errors.
The object that we want to log might be lengthy. So we should truncate this object if it's exceeding the maximum length (we're currently limiting it to 6500 characters after encoding). In truncation logic we perform each of the described steps below to shorten the length of the object. If one step was not enough, we would perform the next step and so on until the length goes below the limit. The steps are:
- Replace all foreign key constraints that are in
source
orfilters
properties with their ermrest-provided model RID. - Replace values (
choices
,ranges
,search
) in thefilters
with the number of values. - Replace all
filters.and
with the number of filters. - Replace all source paths with the number of path nodes.
- Replace
stack
value with the number of stack nodes.
As you might have noticed, we are not adding any extra flag to signal truncation, and the structure itself should indicate that the request header has been truncated. The following summarizes how structure is changed based on the defined steps above:
- Instead of array of two elements ([
schema
,constraint
]) for the constraints, there will be a string. - Instead of arrays of values for
choices
,ranges
, andsearch
attributes infilters
, there will be a number. - Instead of array of filters in the
filters.and
, there will be just a number. - Instead of an array, the
stack
value will be just a number.
In this section we're going to mention some of the patterns that you can use to process the logs.
Since the asset download and also the default CSV export requests are simple redirects to the location, we cannot pass dcctx
to the header. Instead, we're sending these query parameters:
uinit=1
: To signal that this is chaise/ermrestjs that has generated the link.cid
: Thecid
of the app that user found the link from.
The following url patterns are what's unique about each of these requests and you can use for your analysis:
-
Asset download:
uinit=1&cid=
-
CSV default export:
?limit=none&accept=csv&uinit=1&cid=
In recordset app, or any of the other places that use the recordset view, e.g, modal pickers, chaise will communicate with server as soon as user interacts with the page as long as we have enough flow-control slots to send the request. So there might be some requests that we generate while the user is interacting with the page that will be discarded. To find the actual request that the user sees on the page, you can use start_ms
, and end_ms
. All these reload requests will have these two attributes. If you join all the reload request by start_ms
, you can find all the requests that we sent when the user started interacting at start_ms
. So the request with the longest end_ms
would give you the actual request that users will see on the page.
Clicking on "view" button will navigate users to record page with a specific paction
value. So you would need to find requests with the following values:
cid=record
pcid=recordset
paction=view
If you're interested in doing this for each specific table, you can choose to do so by further filtering this with the value of schema_table
.
- Added a new UI context,
saved-query
, and stack path,saved-query-entity
. Together these create 5 new actions,:set,saved-query;open
,create:set/saved-query-entity,;preload
,create:set/saved-query-entity,;create
,create:set/saved-query-entity,;cancel
, and:set/saved-query-entity,;update
- Added stack path,
saved-query-picker
. This includes 15 actions as well for a recordset modal selector further documented in the action list spreadsheet.
- Renamed
:entity/channel-set,z-default;load
to:entity/channel-set,;load
since it has nothing to do withdefault-z
anymore. It was used as a backup for getting the URL information of the default-z image. But now it's the request to get all the available channel information. - Renamed
:entity/channel-set,;load
to:entity/z-plane-entity,;load
. The previous request included the channel information alongside the processed image information. But the new request is only fetching the processed image information. Since this is currently only used for default z-index, adefault_z
attribute is also logged in the stack. Later this request can be used for other z-indexes as well.
- Added
z_index
attribute to the:entity/annotation-set,;load
request. - Added
:entity,;update
,:entity/channel-set,;update
,:entity/z-plane-set,;count
,:entity/z-plane-set,;reload
,:entity/z-plane-set,[search-box|slider|default-z];load-before
, and:entity/z-plane-set,[search-box|slider|default-z];load-after
requests.
- Removed the anatomy, comment, and annotation_comment requests of viewer app and therefore their log actions.
- Changed the annotation request in viewer app to be aligned with the rest of the apps (Action changed from
:entity,annotation;read
to:entity/annotation-set,;load
and stack structure modified.)
-
Added proper log support to viewer app. This includes properly logging the requests that viewer app was already making and adding client logs. Please refer to the PR and documentation for more information.
-
Added client log action for "go to RID" feature.
-
Added
paction=view
to first request in record app, to indicates user clicked on "view" button in tabular displays.
- Completely changed how we're logging in chaise. This is the starting date of logging as described in this documentation.