From 9ecd6ef9040d3ef4ada46bfc76bcd3ec453066f0 Mon Sep 17 00:00:00 2001 From: Katerina Hronikova Date: Wed, 27 Nov 2024 13:46:08 +0100 Subject: [PATCH 01/30] docs: add dataset schema validation --- .../index.md} | 2 +- .../dataset_schema/validation.md | 312 ++++++++++++++++++ sources/platform/monitoring/index.md | 12 +- 3 files changed, 324 insertions(+), 2 deletions(-) rename sources/platform/actors/development/actor_definition/{output_schema.md => dataset_schema/index.md} (99%) create mode 100644 sources/platform/actors/development/actor_definition/dataset_schema/validation.md diff --git a/sources/platform/actors/development/actor_definition/output_schema.md b/sources/platform/actors/development/actor_definition/dataset_schema/index.md similarity index 99% rename from sources/platform/actors/development/actor_definition/output_schema.md rename to sources/platform/actors/development/actor_definition/dataset_schema/index.md index 6834c907f..d4299ed9a 100644 --- a/sources/platform/actors/development/actor_definition/output_schema.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/index.md @@ -118,7 +118,7 @@ The template above defines the configuration for the default dataset output view The default behavior of the Output tab UI table is to display all fields from `transformation.fields` in the specified order. You can customize the display properties for specific formats or column labels if needed. 
-![Output tab UI](./images/output-schema-example.png) +![Output tab UI](../images/output-schema-example.png) ## Structure diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md new file mode 100644 index 000000000..0de23b14b --- /dev/null +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -0,0 +1,312 @@ +--- +title: Dataset validation +description: Specify the dataset schema within the Actors so you can add monitoring and validation down to the field level. +slug: /actors/development/actor-definition/dataset-schema/validation +--- + +**Specify the dataset schema within the Actors so you can add monitoring and validation down to the field level.** + +--- + +To define a schema for a default dataset of an actor run, you need to set `fields` property in the dataset schema. It’s currently impossible to set a schema for a named dataset (same as for dataset views). + +:::info + +The schema defines a single item in the dataset. Be careful not to define the schema as an array, it always needs to be a schema of an object. 
+ +::: + +You can either do that directly through `actor.json` like this: + +```json title=".actor.json" +{ + "actorSpecification": 1, + "storages": { + "dataset": { + "actorSpecification": 1, + "fields": { + "$schema": "http://json-schema.org/draft-07/schema#", + "type": "object", + "properties": { + "name": { + "type": "string" + } + }, + "required": ["name"] + }, + "views": {} + } + } +} +``` + +Or in a separate separate file like this: + +```json title=".actor.json" +{ + "actorSpecification": 1, + "storages": { + "dataset": "./dataset_schema.json" + } +} +``` + +```json title="dataset_schema.json" +{ + "actorSpecification": 1, + "fields": { + "$schema": "http://json-schema.org/draft-07/schema#", + "type": "object", + "properties": { + "name": { + "type": "string" + } + }, + "required": ["name"] + }, + "views": {} +} +``` + +:::important + +The `$schema` line is important and must be exactly this value or it must be omitted: + +`"$schema": "http://json-schema.org/draft-07/schema#"` + +::: + +## Dataset validation + +When you define a schema of your default dataset, the schema is then always used when you insert data into the dataset to perform validation (we use [AJV](https://ajv.js.org/)). + +If the validation succeeds, nothing changes from the current behavior, data is stored and an empty response with status code 201 is returned. 
+ +**If the data you attempt to store in the dataset is invalid** (meaning any of the items received by the API fails the validation), **the whole request is discarded** and the API will return a response with status code 400 and the following JSON response: + +```json +{ + "error": { + "type": "schema-validation-error", + "message": "Schema validation failed", + "data": { + "invalidItems": [{ + "itemPosition": "", + "validationErrors": "" + }] + } + } +} +``` + +The type of the AJV validation error object is [here](https://github.com/ajv-validator/ajv/blob/master/lib/types/index.ts#L86) + +If you use the Apify JS client or Apify SDK and call `pushData` function you can access the validation errors in a `try catch` block like this: + +```javascript +try { + const response = await Actor.pushData(items); +} catch (error) { + if (!error.data?.invalidItems) throw error; + error.data.invalidItems.forEach((item) => { + const { itemPosition, validationErrors } = item; + }); +} +``` + +## Examples + +Optional field (price is optional in this case): + +```json +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "type": "object", + "properties": { + "name": { + "type": "string" + }, + "price": { + "type": "number" + } + }, + "required": ["name"] +} +``` + +Field with multiple types: + +```json +{ + "price": { + "type": ["string", "number"] + } +} +``` + +Field with type `any`: + +```json +{ + "price": { + "type": ["string", "number", "object", "array", "boolean"] + } +} +``` + +Enabling fields to be `null` : + +```json +{ + "name": { + "type": "string", + "nullable": true + } +} +``` + +Define type of objects in array: + +```json +{ + "comments": { + "type": "array", + "items": { + "type": "object", + "properties": { + "author_name": { + "type": "string" + } + } + } + } +} +``` + +Define specific fields, but allow anything else to be added to the item: + +```json +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "type": "object", + "properties": { + 
"name": { + "type": "string" + } + }, + "additionalProperties": true +} +``` + +See [json schema reference](https://json-schema.org/understanding-json-schema/reference) for additional options. + +Example of schema generator [here](https://www.liquid-technologies.com/online-json-to-schema-converter). + +# Dataset field statistics + +When you have the dataset fields schema set up, we then use the schema to generate a list of fields and measure statistics for these fields. + +The measured statistics are following: + +- **Null count:** how many items in the dataset have the field set to null +- **Empty count:** how many items in the dataset are `undefined` , meaning that for example empty string is not considered empty +- **Minimum and maximum** + - For numbers, this is calculated directly + - For strings, this field tracks string length + - For arrays, this field tracks the number of items in the array + - For objects, this tracks the number of keys + +:::note + +Currently, you cannot view these statistics. We will add API endpoint soon. But you can already use them in monitoring. 
+ +::: + +## Examples + +For this schema: + +```json +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "type": "object", + "properties": { + "name": { + "type": "string" + }, + "description": { + "type": "string" + }, + "dimensions": { + "type": "object", + "nullable": true, + "properties": { + "width": { + "type": "number" + }, + "height": { + "type": "number" + } + }, + "required": ["width", "height"] + }, + "price": { + "type": ["string", "number"] + } + }, + "required": ["name", "price"] +} +``` + +The stored statistics and fields in the database look like this: + +```json +{ + "_id" : "1lVGVBkWIhSYPY1dD", + "fields" : [ + "name", + "description", + "dimensions", + "dimensions/width", + "dimensions/height", + "price" + ], + "stats": { + "description": { + "emptyCount": 105, + "max": 19, + "min": 19 + }, + "dimensions": { + "emptyCount": 144, + "max": 2, + "min": 2, + "nullCount": 86 + }, + "dimensions/height": { + "emptyCount": 230, + "max": 992, + "min": 18 + }, + "dimensions/width": { + "emptyCount": 230, + "max": 977, + "min": 4 + }, + "name": { + "max": 13, + "min": 11 + }, + "price": { + "max": 999, + "min": 1 + } + } +} +``` + +:::note + +If you want to see for yourself, check `datasetStatistics` collection. The ids correspond to the ids of datasets. + +::: diff --git a/sources/platform/monitoring/index.md b/sources/platform/monitoring/index.md index d308d6fce..35d0a370b 100644 --- a/sources/platform/monitoring/index.md +++ b/sources/platform/monitoring/index.md @@ -41,12 +41,22 @@ Currently, the monitoring option offers the following features: ### Alert configuration -When you set up an alert, you have two choices for how you want the metrics to be evaluated. And depending on your choices, the alerting system will behave differently: +When you set up an alert, you have three choices for how you want the metrics to be evaluated. And depending on your choices, the alerting system will behave differently: 1. 
**Alert, when the metric is lower than** - This type of alert is checked after the run finishes. If the metric is lower than the value you set, the alert will be triggered and you will receive a notification. 2. **Alert, when the metric is higher than** - This type of alert is checked both during the run and after the run finishes. During the run, we do periodic checks (approximately every 5 minutes) so that we can notify you as soon as possible if the metric is higher than the value you set. After the run finishes, we do a final check to make sure that the metric does not go over the limit in the last few minutes of the run. +3. **Alert, when run status is one of following** - This type of alert is checked only after the run finishes. It makes possible to track the status of your finished runs and send an alert if the run finishes in a state you do not expect. If your actor runs very often and suddenly starts failing, you will receive a single alert after the first failed run in 1 minute, and then aggregated alert every 15 minutes. + +4. **Alert for dataset field statistics** - If you have a [dataset schema](../actors/development/actor_definition/dataset_schema/validation.md) set up, then you can use the field statistics to set up an alert. You can use field statistics for example to track if some field is filled in in all records, if some numeric value is too low/high (for example when tracking the price of a product over multiple sources), if the number of items in an array is too low/high (for example alert on Instagram actor if post has a lot of comments) and many other tasks like these. + + :::important + + Available dataset fields are taken from the last successful build of the monitored actor. If different versions have different fields, currently the solution will always display only those from the default version. + + ::: + ![Metric condition configuration](./images/metric-options.png) You can get notified by email, Slack, or in Apify Console. 
If you use Slack, we suggest using Slack notifications instead of email because they are more reliable, and you can also get notified quicker. From b96183fe4780e33fe000236e46a7ea7bb3cad965 Mon Sep 17 00:00:00 2001 From: Katerina Hronikova Date: Wed, 27 Nov 2024 13:54:51 +0100 Subject: [PATCH 02/30] fix lint --- .../actor_definition/dataset_schema/validation.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 0de23b14b..de2177543 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -211,10 +211,10 @@ The measured statistics are following: - **Null count:** how many items in the dataset have the field set to null - **Empty count:** how many items in the dataset are `undefined` , meaning that for example empty string is not considered empty - **Minimum and maximum** - - For numbers, this is calculated directly - - For strings, this field tracks string length - - For arrays, this field tracks the number of items in the array - - For objects, this tracks the number of keys + - For numbers, this is calculated directly + - For strings, this field tracks string length + - For arrays, this field tracks the number of items in the array + - For objects, this tracks the number of keys :::note From c4a48ec6267575ae94c215b55b553850e2f24613 Mon Sep 17 00:00:00 2001 From: Katerina Hronikova Date: Wed, 27 Nov 2024 13:56:44 +0100 Subject: [PATCH 03/30] fix lint --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 
de2177543..9c728b998 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -8,7 +8,7 @@ slug: /actors/development/actor-definition/dataset-schema/validation --- -To define a schema for a default dataset of an actor run, you need to set `fields` property in the dataset schema. It’s currently impossible to set a schema for a named dataset (same as for dataset views). +To define a schema for a default dataset of an Actor run, you need to set `fields` property in the dataset schema. It’s currently impossible to set a schema for a named dataset (same as for dataset views). :::info From d1b2db60141942c23231df248927fa8072293992 Mon Sep 17 00:00:00 2001 From: Katerina Hronikova Date: Wed, 27 Nov 2024 13:59:29 +0100 Subject: [PATCH 04/30] capitalized Actor --- sources/platform/monitoring/index.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/sources/platform/monitoring/index.md b/sources/platform/monitoring/index.md index 35d0a370b..c5d8bbac5 100644 --- a/sources/platform/monitoring/index.md +++ b/sources/platform/monitoring/index.md @@ -47,13 +47,13 @@ When you set up an alert, you have three choices for how you want the metrics to 2. **Alert, when the metric is higher than** - This type of alert is checked both during the run and after the run finishes. During the run, we do periodic checks (approximately every 5 minutes) so that we can notify you as soon as possible if the metric is higher than the value you set. After the run finishes, we do a final check to make sure that the metric does not go over the limit in the last few minutes of the run. -3. **Alert, when run status is one of following** - This type of alert is checked only after the run finishes. It makes possible to track the status of your finished runs and send an alert if the run finishes in a state you do not expect. 
If your actor runs very often and suddenly starts failing, you will receive a single alert after the first failed run in 1 minute, and then aggregated alert every 15 minutes. +3. **Alert, when run status is one of following** - This type of alert is checked only after the run finishes. It makes possible to track the status of your finished runs and send an alert if the run finishes in a state you do not expect. If your Actor runs very often and suddenly starts failing, you will receive a single alert after the first failed run in 1 minute, and then aggregated alert every 15 minutes. -4. **Alert for dataset field statistics** - If you have a [dataset schema](../actors/development/actor_definition/dataset_schema/validation.md) set up, then you can use the field statistics to set up an alert. You can use field statistics for example to track if some field is filled in in all records, if some numeric value is too low/high (for example when tracking the price of a product over multiple sources), if the number of items in an array is too low/high (for example alert on Instagram actor if post has a lot of comments) and many other tasks like these. +4. **Alert for dataset field statistics** - If you have a [dataset schema](../actors/development/actor_definition/dataset_schema/validation.md) set up, then you can use the field statistics to set up an alert. You can use field statistics for example to track if some field is filled in in all records, if some numeric value is too low/high (for example when tracking the price of a product over multiple sources), if the number of items in an array is too low/high (for example alert on Instagram Actor if post has a lot of comments) and many other tasks like these. :::important - Available dataset fields are taken from the last successful build of the monitored actor. If different versions have different fields, currently the solution will always display only those from the default version. 
+ Available dataset fields are taken from the last successful build of the monitored Actor. If different versions have different fields, currently the solution will always display only those from the default version. ::: From 087f1643d2c7e3284b5ee0efbf34aa943cd6757b Mon Sep 17 00:00:00 2001 From: Katerina Hronikova Date: Wed, 27 Nov 2024 14:01:22 +0100 Subject: [PATCH 05/30] one two three four --- sources/platform/monitoring/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/monitoring/index.md b/sources/platform/monitoring/index.md index c5d8bbac5..5d208cc7c 100644 --- a/sources/platform/monitoring/index.md +++ b/sources/platform/monitoring/index.md @@ -41,7 +41,7 @@ Currently, the monitoring option offers the following features: ### Alert configuration -When you set up an alert, you have three choices for how you want the metrics to be evaluated. And depending on your choices, the alerting system will behave differently: +When you set up an alert, you have four choices for how you want the metrics to be evaluated. And depending on your choices, the alerting system will behave differently: 1. **Alert, when the metric is lower than** - This type of alert is checked after the run finishes. If the metric is lower than the value you set, the alert will be triggered and you will receive a notification. 
From 2372e0b95fe45b7175a97059381342b991edf2e4 Mon Sep 17 00:00:00 2001 From: Katerina Hronikova Date: Fri, 29 Nov 2024 12:49:40 +0100 Subject: [PATCH 06/30] remove examples of dataset field statistics --- .../dataset_schema/validation.md | 89 ------------------- 1 file changed, 89 deletions(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 9c728b998..9d11c4c37 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -221,92 +221,3 @@ The measured statistics are following: Currently, you cannot view these statistics. We will add API endpoint soon. But you can already use them in monitoring. ::: - -## Examples - -For this schema: - -```json -{ - "$schema": "http://json-schema.org/draft-07/schema#", - "type": "object", - "properties": { - "name": { - "type": "string" - }, - "description": { - "type": "string" - }, - "dimensions": { - "type": "object", - "nullable": true, - "properties": { - "width": { - "type": "number" - }, - "height": { - "type": "number" - } - }, - "required": ["width", "height"] - }, - "price": { - "type": ["string", "number"] - } - }, - "required": ["name", "price"] -} -``` - -The stored statistics and fields in the database look like this: - -```json -{ - "_id" : "1lVGVBkWIhSYPY1dD", - "fields" : [ - "name", - "description", - "dimensions", - "dimensions/width", - "dimensions/height", - "price" - ], - "stats": { - "description": { - "emptyCount": 105, - "max": 19, - "min": 19 - }, - "dimensions": { - "emptyCount": 144, - "max": 2, - "min": 2, - "nullCount": 86 - }, - "dimensions/height": { - "emptyCount": 230, - "max": 992, - "min": 18 - }, - "dimensions/width": { - "emptyCount": 230, - "max": 977, - "min": 4 - }, - "name": { - "max": 13, - "min": 11 - }, - "price": { - "max": 999, - 
"min": 1 - } - } -} -``` - -:::note - -If you want to see for yourself, check `datasetStatistics` collection. The ids correspond to the ids of datasets. - -::: From b5b20326d8b4b6cd0a13289f819d721fe79aedb9 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Tue, 3 Dec 2024 08:41:15 +0100 Subject: [PATCH 07/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md Co-authored-by: Jaroslav Hejlek --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 9d11c4c37..9e48576c6 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -70,7 +70,7 @@ Or in a separate separate file like this: :::important -The `$schema` line is important and must be exactly this value or it must be omitted: +Dataset schema needs to be a valid JSON schema draft-07, so the `$schema` line is important and must be exactly this value or it must be omitted: `"$schema": "http://json-schema.org/draft-07/schema#"` From f1d989a5f590cd7bfcab6510244a3df8dcf37fd8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Tue, 3 Dec 2024 08:41:22 +0100 Subject: [PATCH 08/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md Co-authored-by: Jaroslav Hejlek --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md 
b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 9e48576c6..1db273458 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -114,7 +114,7 @@ try { } ``` -## Examples +## Examples of common types of validation Optional field (price is optional in this case): From 73e0f6cc0715d86e57e8120a61567159454f64a9 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Tue, 3 Dec 2024 08:41:39 +0100 Subject: [PATCH 09/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md Co-authored-by: Jaroslav Hejlek --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 1db273458..c81f722c3 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -200,7 +200,7 @@ Define specific fields, but allow anything else to be added to the item: See [json schema reference](https://json-schema.org/understanding-json-schema/reference) for additional options. -Example of schema generator [here](https://www.liquid-technologies.com/online-json-to-schema-converter). +You can also use [conversion tools](https://www.liquid-technologies.com/online-json-to-schema-converter) to convert an existing JSON document into its JSON schema. 
# Dataset field statistics From 4bdf857471c4a47025ce20b5ecbb43787f0078f8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Tue, 3 Dec 2024 08:41:56 +0100 Subject: [PATCH 10/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md Co-authored-by: Jaroslav Hejlek --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index c81f722c3..e29197bc7 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -215,7 +215,7 @@ The measured statistics are following: - For strings, this field tracks string length - For arrays, this field tracks the number of items in the array - For objects, this tracks the number of keys - +- For booleans, this tracks whether the boolean was set to true. So minimum is always 0, but maximum can be either 1 or 0 based on whether at least on item in the dataset has the boolean field set to true. :::note Currently, you cannot view these statistics. We will add API endpoint soon. But you can already use them in monitoring. 
From 48441920cea2e1cb93e6cc71439289caa1dd51b8 Mon Sep 17 00:00:00 2001 From: Katerina Hronikova Date: Tue, 3 Dec 2024 08:51:32 +0100 Subject: [PATCH 11/30] add link, format fixes --- .../actor_definition/dataset_schema/validation.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index e29197bc7..1f6c2795b 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -215,9 +215,10 @@ The measured statistics are following: - For strings, this field tracks string length - For arrays, this field tracks the number of items in the array - For objects, this tracks the number of keys -- For booleans, this tracks whether the boolean was set to true. So minimum is always 0, but maximum can be either 1 or 0 based on whether at least on item in the dataset has the boolean field set to true. + - For booleans, this tracks whether the boolean was set to true. The minimum is always 0; the maximum is 1 if at least one item in the dataset has the boolean field set to true, and 0 otherwise. + :::note -Currently, you cannot view these statistics. We will add API endpoint soon. But you can already use them in monitoring. +Currently, you cannot view these statistics. We will add an API endpoint soon. But you can already use them in [monitoring](../../../../monitoring#alert-configuration). 
::: From 4aca8a8ba541318e800f53eeb4abd32e4c00901f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Wed, 4 Dec 2024 14:51:15 +0100 Subject: [PATCH 12/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Michał Olender <92638966+TC-MO@users.noreply.github.com> --- .../actor_definition/dataset_schema/validation.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 1f6c2795b..32e6bf0d4 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -8,7 +8,13 @@ slug: /actors/development/actor-definition/dataset-schema/validation --- -To define a schema for a default dataset of an Actor run, you need to set `fields` property in the dataset schema. It’s currently impossible to set a schema for a named dataset (same as for dataset views). +To define a schema for a default dataset of an Actor run, you need to set the `fields` property in the dataset schema. + +:::note Schema limitations + +Schema configuration is not available for named datasets or dataset views. 
+ +::: :::info From 7356b35e27592ac1e42986bf70bb6aaa79921062 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Wed, 4 Dec 2024 14:51:25 +0100 Subject: [PATCH 13/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Marek Trunkát --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 32e6bf0d4..d78632292 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -22,7 +22,7 @@ The schema defines a single item in the dataset. 
Be careful not to define the sc ::: -You can either do that directly through `actor.json` like this: +You can either do that directly through `actor.json`: ```json title=".actor.json" { From d484ad0124079a5ec24e91431e19c0c17a4f23ba Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Wed, 4 Dec 2024 14:51:39 +0100 Subject: [PATCH 14/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Marek Trunkát --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index d78632292..3dcc037be 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -46,7 +46,7 @@ You can either do that directly through `actor.json`: } ``` -Or in a separate separate file like this: +Or in a separate file linked from the `.actor.json`: ```json title=".actor.json" { From 1adaf5b426d17fd84eb21e27432019c51a9f2d9a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Wed, 4 Dec 2024 14:51:52 +0100 Subject: [PATCH 15/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Marek Trunkát --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md 
b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 3dcc037be..60b92ac75 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -133,7 +133,7 @@ Optional field (price is optional in this case): "type": "string" }, "price": { - "type": "number" + "type": "number" } }, "required": ["name"] From dc6329e5d59344a9ecc72037d47e9315b47f023d Mon Sep 17 00:00:00 2001 From: Katerina Hronikova Date: Wed, 4 Dec 2024 16:55:19 +0100 Subject: [PATCH 16/30] merge master plus open api --- .../datasets/CreateDatasetResponseError.yaml | 9 +++++++++ .../schemas/datasets/errorDataset.yaml | 12 ++++++++++++ .../openapi/paths/datasets/datasets.yaml | 19 +++++++++++++++++++ package-lock.json | 2 +- 4 files changed, 41 insertions(+), 1 deletion(-) create mode 100644 apify-api/openapi/components/schemas/datasets/CreateDatasetResponseError.yaml create mode 100644 apify-api/openapi/components/schemas/datasets/errorDataset.yaml diff --git a/apify-api/openapi/components/schemas/datasets/CreateDatasetResponseError.yaml b/apify-api/openapi/components/schemas/datasets/CreateDatasetResponseError.yaml new file mode 100644 index 000000000..c9b8dd9c1 --- /dev/null +++ b/apify-api/openapi/components/schemas/datasets/CreateDatasetResponseError.yaml @@ -0,0 +1,9 @@ +title: Createdatasetresponseerror +required: + - error +type: object +properties: + error: + allOf: + - $ref: ./errorDataset.yaml + - {} diff --git a/apify-api/openapi/components/schemas/datasets/errorDataset.yaml b/apify-api/openapi/components/schemas/datasets/errorDataset.yaml new file mode 100644 index 000000000..d71c6e4b4 --- /dev/null +++ b/apify-api/openapi/components/schemas/datasets/errorDataset.yaml @@ -0,0 +1,12 @@ +title: error +required: + - type + - message +type: object +properties: + type: + type: string + example: schema-validation-error + message: + type: 
string + example: 'Schema validation failed' diff --git a/apify-api/openapi/paths/datasets/datasets.yaml b/apify-api/openapi/paths/datasets/datasets.yaml index e22976e89..8a372625f 100644 --- a/apify-api/openapi/paths/datasets/datasets.yaml +++ b/apify-api/openapi/paths/datasets/datasets.yaml @@ -143,6 +143,25 @@ post: actId: null actRunId: null fields: [] + '400': + description: '' + headers: {} + content: + application/json: + schema: + allOf: + - $ref: >- + ../../components/schemas/datasets/Createdatasetresponseerror.yaml + - example: + error: + type: schema-validation-error + message: >- + Schema validation failed + example: + error: + type: schema-validation-error + message: >- + Schema validation failed deprecated: false x-legacy-doc-urls: - https://docs.apify.com/api/v2#/reference/datasets/dataset-collection/create-dataset diff --git a/package-lock.json b/package-lock.json index 84c00b433..30cf10e6f 100644 --- a/package-lock.json +++ b/package-lock.json @@ -72,7 +72,7 @@ }, "apify-docs-theme": { "name": "@apify/docs-theme", - "version": "1.0.144", + "version": "1.0.145", "license": "ISC", "dependencies": { "@apify/docs-search-modal": "^1.1.1", From 4fb5eaf9cd5f51d7841f5acc16229dee12815833 Mon Sep 17 00:00:00 2001 From: Katerina Hronikova Date: Wed, 4 Dec 2024 17:49:01 +0100 Subject: [PATCH 17/30] add info to api docs --- ...seError.yaml => PutItemResponseError.yaml} | 2 +- .../schemas/datasets/errorDataset.yaml | 61 ++++++++++++++++--- .../openapi/paths/datasets/datasets.yaml | 20 +----- .../datasets/datasets@{datasetId}@items.yaml | 33 ++++++++++ 4 files changed, 86 insertions(+), 30 deletions(-) rename apify-api/openapi/components/schemas/datasets/{CreateDatasetResponseError.yaml => PutItemResponseError.yaml} (76%) diff --git a/apify-api/openapi/components/schemas/datasets/CreateDatasetResponseError.yaml b/apify-api/openapi/components/schemas/datasets/PutItemResponseError.yaml similarity index 76% rename from 
apify-api/openapi/components/schemas/datasets/CreateDatasetResponseError.yaml rename to apify-api/openapi/components/schemas/datasets/PutItemResponseError.yaml index c9b8dd9c1..d3baffb32 100644 --- a/apify-api/openapi/components/schemas/datasets/CreateDatasetResponseError.yaml +++ b/apify-api/openapi/components/schemas/datasets/PutItemResponseError.yaml @@ -1,4 +1,4 @@ -title: Createdatasetresponseerror +title: PutItemResponseError required: - error type: object diff --git a/apify-api/openapi/components/schemas/datasets/errorDataset.yaml b/apify-api/openapi/components/schemas/datasets/errorDataset.yaml index d71c6e4b4..0121b9993 100644 --- a/apify-api/openapi/components/schemas/datasets/errorDataset.yaml +++ b/apify-api/openapi/components/schemas/datasets/errorDataset.yaml @@ -1,12 +1,53 @@ -title: error -required: - - type - - message type: object properties: - type: - type: string - example: schema-validation-error - message: - type: string - example: 'Schema validation failed' + error: + type: object + properties: + type: + type: string + description: The type of the error. + example: "schema-validation-error" + message: + type: string + description: A human-readable message describing the error. + example: "Schema validation failed" + data: + type: object + properties: + invalidItems: + type: array + description: A list of invalid items in the received array of items. + items: + type: object + properties: + itemPosition: + type: number + description: The position of the invalid item in the array. + example: 2 + validationErrors: + type: array + description: A complete list of AJV validation error objects for the invalid item. + items: + type: object + properties: + instancePath: + type: string + description: The path to the instance being validated. + schemaPath: + type: string + description: The path to the schema that failed the validation. + keyword: + type: string + description: The validation keyword that caused the error. 
+ message: + type: string + description: A message describing the validation error. + params: + type: object + description: Additional parameters specific to the validation error. + required: + - invalidItems + required: + - type + - message + - data diff --git a/apify-api/openapi/paths/datasets/datasets.yaml b/apify-api/openapi/paths/datasets/datasets.yaml index 8a372625f..b9f734fe5 100644 --- a/apify-api/openapi/paths/datasets/datasets.yaml +++ b/apify-api/openapi/paths/datasets/datasets.yaml @@ -106,6 +106,7 @@ post: Keep in mind that data stored under unnamed dataset follows [data retention period](https://docs.apify.com/platform/storage#data-retention). It creates a dataset with the given name if the parameter name is used. If a dataset with the given name already exists then returns its object. + operationId: datasets_post parameters: - name: name @@ -143,25 +144,6 @@ post: actId: null actRunId: null fields: [] - '400': - description: '' - headers: {} - content: - application/json: - schema: - allOf: - - $ref: >- - ../../components/schemas/datasets/Createdatasetresponseerror.yaml - - example: - error: - type: schema-validation-error - message: >- - Schema validation failed - example: - error: - type: schema-validation-error - message: >- - Schema validation failed deprecated: false x-legacy-doc-urls: - https://docs.apify.com/api/v2#/reference/datasets/dataset-collection/create-dataset diff --git a/apify-api/openapi/paths/datasets/datasets@{datasetId}@items.yaml b/apify-api/openapi/paths/datasets/datasets@{datasetId}@items.yaml index 1bd4312e9..e073b6e26 100644 --- a/apify-api/openapi/paths/datasets/datasets@{datasetId}@items.yaml +++ b/apify-api/openapi/paths/datasets/datasets@{datasetId}@items.yaml @@ -478,6 +478,10 @@ post: The POST payload is a JSON object or a JSON array of objects to save into the dataset. 
+ If the data you attempt to store in the dataset is invalid (meaning any of the items received by the API fails the validation), the whole request is discarded and the API will return a response with status code 400. + For more information about dataset schema validation, see [Dataset schema](https://docs.apify.com/platform/actors/development/actor-definition/dataset-schema/validation). + + **IMPORTANT:** The limit of request payload size for the dataset is 5 MB. If the array exceeds the size, you'll need to split it into a number of smaller arrays. operationId: dataset_items_post parameters: @@ -523,6 +527,35 @@ post: type: object example: {} example: {} + '400': + description: '' + headers: {} + content: + application/json: + schema: + allOf: + - $ref: >- + ../../components/schemas/datasets/PutItemResponseError.yaml + - example: + error: + type: schema-validation-error + message: >- + Schema validation failed + example: + error: + type: schema-validation-error + message: >- + Schema validation failed + data: + invalidItems: + - itemPosition: 2 + - validationErrors: + instancePath: /1/stringField + schemaPath: /items/properties/stringField/type + keyword: type + params: + type: string + message: 'must be string' deprecated: false x-legacy-doc-urls: - https://docs.apify.com/api/v2#/reference/datasets/item-collection/put-items From 9c6c99cbbc0a18877bbe9c3d996196e27162144a Mon Sep 17 00:00:00 2001 From: Katerina Hronikova Date: Thu, 5 Dec 2024 12:11:14 +0100 Subject: [PATCH 18/30] api docs part final --- ...yaml => DatasetSchemaValidationError.yaml} | 0 .../datasets/PutItemResponseError.yaml | 2 +- .../datasets/datasets@{datasetId}@items.yaml | 28 +++++++++---------- 3 files changed, 14 insertions(+), 16 deletions(-) rename apify-api/openapi/components/schemas/datasets/{errorDataset.yaml => DatasetSchemaValidationError.yaml} (100%) diff --git a/apify-api/openapi/components/schemas/datasets/errorDataset.yaml 
b/apify-api/openapi/components/schemas/datasets/DatasetSchemaValidationError.yaml similarity index 100% rename from apify-api/openapi/components/schemas/datasets/errorDataset.yaml rename to apify-api/openapi/components/schemas/datasets/DatasetSchemaValidationError.yaml diff --git a/apify-api/openapi/components/schemas/datasets/PutItemResponseError.yaml b/apify-api/openapi/components/schemas/datasets/PutItemResponseError.yaml index d3baffb32..2f6f30ceb 100644 --- a/apify-api/openapi/components/schemas/datasets/PutItemResponseError.yaml +++ b/apify-api/openapi/components/schemas/datasets/PutItemResponseError.yaml @@ -5,5 +5,5 @@ type: object properties: error: allOf: - - $ref: ./errorDataset.yaml + - $ref: ./DatasetSchemaValidationError.yaml - {} diff --git a/apify-api/openapi/paths/datasets/datasets@{datasetId}@items.yaml b/apify-api/openapi/paths/datasets/datasets@{datasetId}@items.yaml index e073b6e26..4cad700ad 100644 --- a/apify-api/openapi/paths/datasets/datasets@{datasetId}@items.yaml +++ b/apify-api/openapi/paths/datasets/datasets@{datasetId}@items.yaml @@ -539,23 +539,21 @@ post: - example: error: type: schema-validation-error - message: >- - Schema validation failed + message: Schema validation failed example: error: - type: schema-validation-error - message: >- - Schema validation failed - data: - invalidItems: - - itemPosition: 2 - - validationErrors: - instancePath: /1/stringField - schemaPath: /items/properties/stringField/type - keyword: type - params: - type: string - message: 'must be string' + type: schema-validation-error + message: Schema validation failed + data: + invalidItems: + - itemPosition: 2 + validationErrors: + - instancePath: /1/stringField + schemaPath: /items/properties/stringField/type + keyword: type + params: + type: string + message: 'must be string' deprecated: false x-legacy-doc-urls: - https://docs.apify.com/api/v2#/reference/datasets/item-collection/put-items From 0f9872b42de44d8d7e8042407e090ec7ca9c7be3 Mon Sep 17 00:00:00 
2001 From: Katerina Hronikova Date: Thu, 5 Dec 2024 12:21:37 +0100 Subject: [PATCH 19/30] no subsequent admonitions --- .../actor_definition/dataset_schema/validation.md | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 60b92ac75..2ebd989f4 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -10,16 +10,12 @@ slug: /actors/development/actor-definition/dataset-schema/validation To define a schema for a default dataset of an Actor run, you need to set `fields` property in the dataset schema. -:::note Schema limitations - -Schema configuration is not available for named datasets or dataset views. - -::: - :::info The schema defines a single item in the dataset. Be careful not to define the schema as an array, it always needs to be a schema of an object. +Schema configuration is not available for named datasets or dataset views. 
+ ::: You can either do that directly through `actor.json`: From 12142db153a6aefa25ce52bdba72ce828456f7ce Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Thu, 5 Dec 2024 12:22:00 +0100 Subject: [PATCH 20/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Michał Olender <92638966+TC-MO@users.noreply.github.com> --- .../development/actor_definition/dataset_schema/validation.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 2ebd989f4..8d887335d 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -219,8 +219,6 @@ The measured statistics are following: - For objects, this tracks the number of keys - For booleans, this tracks whether the boolean was set to true. Minimum is always 0, but maximum can be either 1 or 0 based on whether at least on item in the dataset has the boolean field set to true. -:::note -Currently, you cannot view these statistics. We will add API endpoint soon. But you can already use them in [monitoring](../../../../monitoring#alert-configuration). +You can use them in [monitoring](../../../../monitoring#alert-configuration). 
-::: From df063982b51c1e4a8a5cbc4cc83a1428b83e573e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Thu, 5 Dec 2024 12:22:13 +0100 Subject: [PATCH 21/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Michał Olender <92638966+TC-MO@users.noreply.github.com> --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 8d887335d..8f6052d1c 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -101,7 +101,7 @@ If the validation succeeds, nothing changes from the current behavior, data is s } ``` -The type of the AJV validation error object is [here](https://github.com/ajv-validator/ajv/blob/master/lib/types/index.ts#L86) +The type of the AJV validation error object is [here](https://github.com/ajv-validator/ajv/blob/master/lib/types/index.ts#L86). 
If you use the Apify JS client or Apify SDK and call `pushData` function you can access the validation errors in a `try catch` block like this: From 6d26ea018ac5f9ebb785ead9bb1b8fe0f4397418 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Thu, 5 Dec 2024 12:22:38 +0100 Subject: [PATCH 22/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Michał Olender <92638966+TC-MO@users.noreply.github.com> --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 8f6052d1c..529e67807 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -82,7 +82,7 @@ Dataset schema needs to be a valid JSON schema draft-07, so the `$schema` line i When you define a schema of your default dataset, the schema is then always used when you insert data into the dataset to perform validation (we use [AJV](https://ajv.js.org/)). -If the validation succeeds, nothing changes from the current behavior, data is stored and an empty response with status code 201 is returned. +If the validation succeeds, nothing changes from the current behavior, data is stored and an empty response with status code `201` is returned. 
**If the data you attempt to store in the dataset is invalid** (meaning any of the items received by the API fails the validation), **the whole request is discarded** and the API will return a response with status code 400 and the following JSON response: From 976dd4fc7d66b20bb3660d0e90b42f6c246ed8a2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Thu, 5 Dec 2024 12:22:56 +0100 Subject: [PATCH 23/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Michał Olender <92638966+TC-MO@users.noreply.github.com> --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 529e67807..cc8663c41 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -217,7 +217,7 @@ The measured statistics are following: - For strings, this field tracks string length - For arrays, this field tracks the number of items in the array - For objects, this tracks the number of keys - - For booleans, this tracks whether the boolean was set to true. Minimum is always 0, but maximum can be either 1 or 0 based on whether at least on item in the dataset has the boolean field set to true. + - For booleans, this tracks whether the boolean was set to true. Minimum is always 0, but maximum can be either 1 or 0 based on whether at least one item in the dataset has the boolean field set to true. You can use them in [monitoring](../../../../monitoring#alert-configuration). 
From fbb6dcdb7e7761597e050f412cb14ebeadcd609f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Thu, 5 Dec 2024 12:23:20 +0100 Subject: [PATCH 24/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Michał Olender <92638966+TC-MO@users.noreply.github.com> --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index cc8663c41..0ff6bb4d6 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -84,7 +84,7 @@ When you define a schema of your default dataset, the schema is then always used If the validation succeeds, nothing changes from the current behavior, data is stored and an empty response with status code `201` is returned. 
-**If the data you attempt to store in the dataset is invalid** (meaning any of the items received by the API fails the validation), **the whole request is discarded** and the API will return a response with status code 400 and the following JSON response: +If the data you attempt to store in the dataset is _invalid_ (meaning any of the items received by the API fails validation), _the entire request will be discarded_. The API will return a response with status code `400` and the following JSON response: ```json { From 64b51e0d76315428bcbdcc03843ad02f35826104 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Thu, 5 Dec 2024 12:23:34 +0100 Subject: [PATCH 25/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Michał Olender <92638966+TC-MO@users.noreply.github.com> --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 0ff6bb4d6..89acd64be 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -1,6 +1,6 @@ --- title: Dataset validation -description: Specify the dataset schema within the Actors so you can add monitoring and validation down to the field level. +description: Specify the dataset schema within the Actors so you can add monitoring and validation at the field level. 
slug: /actors/development/actor-definition/dataset-schema/validation --- From b07d03d8b5e262e7b1a2c9b87ea8c58ba2e2bee6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Thu, 5 Dec 2024 12:23:46 +0100 Subject: [PATCH 26/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Michał Olender <92638966+TC-MO@users.noreply.github.com> --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 89acd64be..6acedfdfc 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -204,7 +204,7 @@ See [json schema reference](https://json-schema.org/understanding-json-schema/re You can also use [conversion tools](https://www.liquid-technologies.com/online-json-to-schema-converter) to convert an existing JSON document into it's JSON schema. -# Dataset field statistics +## Dataset field statistics When you have the dataset fields schema set up, we then use the schema to generate a list of fields and measure statistics for these fields. 
From 310d614c06f8119b298b19fe45c9e83e8a9df7c2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Thu, 5 Dec 2024 12:24:03 +0100 Subject: [PATCH 27/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Michał Olender <92638966+TC-MO@users.noreply.github.com> --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 6acedfdfc..cb3f363da 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -4,7 +4,7 @@ description: Specify the dataset schema within the Actors so you can add monito slug: /actors/development/actor-definition/dataset-schema/validation --- -**Specify the dataset schema within the Actors so you can add monitoring and validation down to the field level.** +**Specify the dataset schema within the Actors so you can add monitoring and validation at the field level.** --- From c518b6b23c28fc32ac058df13b4344bd7d608f06 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Thu, 5 Dec 2024 12:24:17 +0100 Subject: [PATCH 28/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Michał Olender <92638966+TC-MO@users.noreply.github.com> --- .../development/actor_definition/dataset_schema/validation.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff 
--git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index cb3f363da..b7d54b62b 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -206,9 +206,7 @@ You can also use [conversion tools](https://www.liquid-technologies.com/online-j ## Dataset field statistics -When you have the dataset fields schema set up, we then use the schema to generate a list of fields and measure statistics for these fields. - -The measured statistics are following: +When you configure the dataset fields schema, we generates a field list and measure the following statistics: - **Null count:** how many items in the dataset have the field set to null - **Empty count:** how many items in the dataset are `undefined` , meaning that for example empty string is not considered empty From 8e0de86060de8d9bcae98da0d2aab4f9ae6f5a47 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Fri, 6 Dec 2024 11:02:23 +0100 Subject: [PATCH 29/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Michał Olender <92638966+TC-MO@users.noreply.github.com> --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index b7d54b62b..94547bfc6 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ 
b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -42,7 +42,7 @@ You can either do that directly through `actor.json`: } ``` -Or in a separate separate file linked from the `.actor.json`: +Or in a separate file linked from the `.actor.json`: ```json title=".actor.json" { From 330dfe220e891edc15e6e2f21ef3706dab250a3f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kate=C5=99ina=20Hron=C3=ADkov=C3=A1?= <56041262+katacek@users.noreply.github.com> Date: Fri, 6 Dec 2024 11:02:30 +0100 Subject: [PATCH 30/30] Update sources/platform/actors/development/actor_definition/dataset_schema/validation.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Michał Olender <92638966+TC-MO@users.noreply.github.com> --- .../development/actor_definition/dataset_schema/validation.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md index 94547bfc6..882b614e9 100644 --- a/sources/platform/actors/development/actor_definition/dataset_schema/validation.md +++ b/sources/platform/actors/development/actor_definition/dataset_schema/validation.md @@ -206,7 +206,7 @@ You can also use [conversion tools](https://www.liquid-technologies.com/online-j ## Dataset field statistics -When you configure the dataset fields schema, we generates a field list and measure the following statistics: +When you configure the dataset fields schema, we generate a field list and measure the following statistics: - **Null count:** how many items in the dataset have the field set to null - **Empty count:** how many items in the dataset are `undefined` , meaning that for example empty string is not considered empty
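
Note on the `400` response that this series documents in `DatasetSchemaValidationError.yaml`: a client may want to flatten the nested `invalidItems` / `validationErrors` structure into readable lines. The following is a minimal sketch only, not part of the patches. It assumes the body has the shape of the OpenAPI example (`error.type`, `error.message`, `error.data.invalidItems[].validationErrors[]`); the function name `summarize_validation_error` and the variable `example_body` are hypothetical, and the sample values are copied from the example in `datasets@{datasetId}@items.yaml`.

```python
from typing import Any


def summarize_validation_error(body: dict[str, Any]) -> list[str]:
    """Return one readable line per AJV validation error in a 400 response body.

    Assumes the body shape shown in the OpenAPI example of this series.
    Bodies of any other error type yield an empty list.
    """
    error = body.get("error", {})
    if error.get("type") != "schema-validation-error":
        return []

    lines = []
    for item in error.get("data", {}).get("invalidItems", []):
        position = item.get("itemPosition")
        for err in item.get("validationErrors", []):
            path = err.get("instancePath", "")
            message = err.get("message", "")
            lines.append(f"item {position}: {path} {message}")
    return lines


# Sample body mirroring the example in the OpenAPI path file.
example_body = {
    "error": {
        "type": "schema-validation-error",
        "message": "Schema validation failed",
        "data": {
            "invalidItems": [
                {
                    "itemPosition": 2,
                    "validationErrors": [
                        {
                            "instancePath": "/1/stringField",
                            "schemaPath": "/items/properties/stringField/type",
                            "keyword": "type",
                            "params": {"type": "string"},
                            "message": "must be string",
                        }
                    ],
                }
            ]
        },
    }
}

for line in summarize_validation_error(example_body):
    print(line)
```

Because the whole request is discarded when any item fails, a caller would typically log these lines, correct or drop the offending items, and resend the batch.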