diff --git a/docs/source/redact/index.rst b/docs/source/redact/index.rst index 72c9971..9bf3254 100644 --- a/docs/source/redact/index.rst +++ b/docs/source/redact/index.rst @@ -1,33 +1,33 @@ Redact ============= -The Textual redact functionality allows you to identify entities in files, and then optionally tokenize/synthesize these entities to create a safe version of your unstructured text. This functionality works on both raw strings and files, including PDF, DOCX, XLSX, and other formats. +The Textual redact functionality allows you to identify entities in files, and then optionally tokenize or synthesize these entities to create a safe version of your unstructured text. This functionality works on both raw strings and files, including PDF, DOCX, XLSX, and other formats. Before you can use these functions, read the :doc:`Getting started <../quickstart/getting_started>` guide and create an API key. -Redacting Text +Redacting text ----------------- -You can redact text directly in a variety of formats such as plain text, json, xml, and html. All redaction requests return a response which includes the original text, redacted text, a list of found entities and their locations. Additionally all redact functions allow you to specify which entities are tokenized and which are synthesized. +You can redact text directly in a variety of formats, such as plain text, JSON, XML, and HTML. All redaction requests return a response that includes the original text, redacted text, a list of found entities, and the entity locations. All redact functions also allow you to specify which entities to tokenize and which to synthesize. -The common set of inputs to are redact functions are: +The common set of inputs to redact functions are: * **generator_default** - The default operation performed on an entity. The options are 'Redact', 'Synthesis', and 'Off' + The default operation to perform on an entity. The options are 'Redact', 'Synthesis', and 'Off'. 
* **generator_config** - A dictionary whose keys are entity labels and values are how to redact the entity. The options are 'Redact', 'Synthesis', and 'Off'. + A dictionary where the keys are entity labels and the values are how to redact the entity. The options are 'Redact', 'Synthesis', and 'Off'. Example: {'NAME_GIVEN': 'Synthesis'} * **label_allow_lists** - A dictionary whose keys are entity labels and values are lists of regexes. If a piece of text matches a regex it is flagged as that entity type. + A dictionary where the keys are entity labels and the values are lists of regular expressions. If a piece of text matches a regular expression, it is flagged as that entity type. Example: {'HEALTHCARE_ID': [r'[a-zA-zZ]{3}\\d{6,}'] * **label_block_lists** - A dictionary whose keys are entity labels and values are lists of regexes. If a piece of text matches a regex it is ignored for that entity type. + A dictionary where the keys are entity labels and the values are lists of regular expressions. If a piece of text matches a regular expression, it is ignored for that entity type. Example: {'NUMERIC_VALUE': [r'\\d{3}'] -The JSON and XML redact functions also have additional inputs which you can read about in their respective sections. +The JSON and XML redact functions also have additional inputs, which you can read about in their respective sections. .. toctree:: :hidden: @@ -42,7 +42,7 @@ Textual can also identify entities within files, including PDF, DOCX, XLSX, CSV, Textual can then recreate these files with entities that are redacted or synthesized. -To generated redacted/synthesized files: +To generate redacted and synthesized files: .. code-block:: python @@ -71,9 +71,9 @@ To learn more about how to generate redacted and synthesized files, go to :doc:` Working with datasets --------------------- -A dataset is a feature in the Textual UI. It is a collection of files that all share the same redaction/synthesis configuration. 
+A dataset is a feature in the Textual application. It is a collection of files that all share the same redaction and synthesis configuration. -To help automate workflows, you can work with datasets directly from the SDK. To learn more about how you can use the SDK to work with datasets, go to :doc:`Datasets `. +To help automate workflows, you can work with datasets directly from the SDK. To learn more about how to use the SDK to work with datasets, go to :doc:`Datasets `. .. toctree::