Merge pull request #1263 from facebookresearch/add-provider-doc-pages
Added doc pages for In-House and Mock providers
meta-paul authored Nov 20, 2024
2 parents 598c8b3 + aa76dc0 commit 4902b6d
Showing 51 changed files with 611 additions and 211 deletions.
12 changes: 6 additions & 6 deletions .github/workflows/cypress-end-to-end-tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -209,7 +209,7 @@ jobs:
- name: 📂 Set the data directory
run: mephisto config core.main_data_directory ~/mephisto/data

- name: 🚚 Create Inhouse provider
- name: 🚚 Create In-House provider
run: mephisto register inhouse

- name: 📦 Setting up mephisto-core package
Expand Down Expand Up @@ -259,7 +259,7 @@ jobs:
- name: 📂 Set the data directory
run: mephisto config core.main_data_directory ~/mephisto/data

- name: 🚚 Create Inhouse provider
- name: 🚚 Create In-House provider
run: mephisto register inhouse

- name: 📦 Setting up mephisto-core package
Expand Down Expand Up @@ -309,7 +309,7 @@ jobs:
- name: 📂 Set the data directory
run: mephisto config core.main_data_directory ~/mephisto/data

- name: 🚚 Create Inhouse provider
- name: 🚚 Create In-House provider
run: mephisto register inhouse

- name: 📦 Setting up mephisto-core package
Expand Down Expand Up @@ -414,7 +414,7 @@ jobs:
- name: 📂 Set the data directory
run: mephisto config core.main_data_directory ~/mephisto/data

- name: 🚚 Create Inhouse provider
- name: 🚚 Create In-House provider
run: mephisto register inhouse

- name: 📦 Setting up mephisto-core package
Expand Down Expand Up @@ -465,7 +465,7 @@ jobs:
- name: 📂 Set the data directory
run: mephisto config core.main_data_directory ~/mephisto/data

- name: 🚚 Create Inhouse provider
- name: 🚚 Create In-House provider
run: mephisto register inhouse

- name: 📦 Setting up mephisto-core package
Expand Down Expand Up @@ -516,7 +516,7 @@ jobs:
- name: 📂 Set the data directory
run: mephisto config core.main_data_directory ~/mephisto/data

- name: 🚚 Create Inhouse provider
- name: 🚚 Create In-House provider
run: mephisto register inhouse

- name: 📦 Setting up mephisto-core package
Expand Down
43 changes: 43 additions & 0 deletions docs/web/docs/guides/how_to_use/providers/inhouse.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
---

# Copyright (c) Meta Platforms and its affiliates.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

sidebar_position: 3
---

# In-House

The In-House CrowdProvider lets you run live Tasks without integrating a third-party worker platform.


## Simple In-House authorization

> Note that this feature is available only for the In-House provider.

To prevent unauthorized access to your Task, you can enable simple authorization.
Providing a one-column CSV file of allowed Worker usernames limits access to those specific workers.


### Enable CSV authorization

1. Create a CSV file with one column containing allowed Worker usernames
    1. The Worker will need to enter this username on the Task's Welcome page
    2. Alternatively, you can include the username as a parameter in the Task URL that you send to the Worker to invite them to the Task
2. Place this file in the data directory of your Task (e.g. `examples/static_react_task/data/authorization.csv`)
3. In your Task config, set the `provider.authorization_csv` parameter to this file path
```yaml
mephisto:
  ...

  provider:
    authorization_csv: ${task_dir}/data/authorization.csv
    ...
```
4. Run your Task and try authorizing yourself with one of the allowed usernames from the CSV file.
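
The first two steps above could be scripted roughly as follows. This is a minimal sketch using only the Python standard library; the helper names `write_authorization_csv` and `read_authorization_csv` are hypothetical and not part of Mephisto, and the sketch assumes the file holds one username per row with no header row, which these docs do not confirm.

```python
import csv
from pathlib import Path


def write_authorization_csv(path, usernames):
    """Write one allowed Worker username per row to a one-column CSV file.

    Hypothetical helper (not a Mephisto API). Assumes no header row.
    """
    # Trim whitespace, then drop blanks and duplicates while keeping order
    unique = list(dict.fromkeys(u.strip() for u in usernames if u.strip()))
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for username in unique:
            writer.writerow([username])
    return unique


def read_authorization_csv(path):
    """Read the allowed usernames back from the first column."""
    with Path(path).open(newline="", encoding="utf-8") as f:
        return [row[0] for row in csv.reader(f) if row]
```

For example, `write_authorization_csv("data/authorization.csv", ["alice", "bob"])` would produce a file that `provider.authorization_csv` can point to.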


## Example

To see how it works, you can run our [In-House authorization example](https://github.com/facebookresearch/Mephisto/blob/main/examples/static_react_task/run_task_with_authorization__local__inhouse.py).
14 changes: 14 additions & 0 deletions docs/web/docs/guides/how_to_use/providers/mock.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
---

# Copyright (c) Meta Platforms and its affiliates.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

sidebar_position: 4
---

# Mock

The Mock provider is used solely for running tests.
It has a simplified implementation and cannot be used with any workers.
If you wish to run a live Task without third-party providers, consider the [In-House provider](/docs/guides/how_to_use/providers/inhouse/) instead.
Original file line number Diff line number Diff line change
Expand Up @@ -10,13 +10,15 @@ sidebar_position: 1
# Using qualifications to improve worker quality
Qualification control is a powerful component of Mephisto, allowing you to filter out workers with both manual and automatic controls. Within this are typical allowlists and blocklists, setting up value-based qualifications, making automatic qualifications for onboarding, and also utilizing the qualifications that various crowdsourcing providers have to offer. This document seeks to describe some common use cases for qualifications, and how we currently go about using them.


### Blocking qualifications
When you set a `block_qualification` during a launch, calling `Worker.grant_qualification(<block_qualification>)` will prevent that worker from working on any tasks that you have set the same `block_qualification` for. You can use this to set up blocklists for specific tasks, or for groups of tasks.


### Onboarding qualifications
Mephisto has an automatic setup for assigning workers qualifications for particular tasks that they've worked on, such that it's possible to specify a qualification that a worker can be granted on the first time they take out a particular task. This qualification is given the name `onboarding_qualification`, and is compatible with any blueprints that have onboarding tasks.

When a worker accepts your task for the first time, they will have neither the passing nor the failing version of the onboarding qualification, and will be put into a test version of the task that determines if they are qualified. Then only those that qualify the first time will be able to continue working on that task.
When a worker accepts your task for the first time, they will have neither the passing nor the failing version of the onboarding qualification, and will be put into a trial version of the task that determines if they are qualified. Then only those that qualify the first time will be able to continue working on that task.

The `onboarding_qualification` is shared between all task runs that use the same qualification name, and as such you can ensure that a worker need not repeatedly qualify for the same or similar tasks by sharing the same lists.

Expand All @@ -29,7 +31,7 @@ from mephisto.utils.qualifications import make_qualification_dict

ONBOARDING_QUALIFICATION_NAME = "TEST_ONBOARDING_QUAL_NAME"

# Making a qualification that requires a worker has
# Making a qualification that requires a worker has
# passed an onboarding from a different task
shared_state.qualifications = [
make_qualification_dict(
Expand All @@ -39,7 +41,7 @@ shared_state.qualifications = [
)
]

# Making a qualification that requires that a worker
# Making a qualification that requires that a worker
# has not failed a particular onboarding from a different task
shared_state.qualifications = [
make_qualification_dict(
Expand Down Expand Up @@ -82,6 +84,19 @@ shared_state.qualifications = [
]
```

### Admitting Workers with no prior qualification

Let's say your Task requires certain qualifications and you wish to expand your pool of Workers, but you do not wish to use Onboarding Qualifications. In this case, you can give Task access to all workers that lack any failing qualification by using the `admit_workers_with_no_prior_qualification` Task parameter:

```yaml
mephisto:
  ...

  provider:
    admit_workers_with_no_prior_qualification: true
    ...
```
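
One way to read this rule is sketched below as a toy model. It is not Mephisto's implementation: the function `is_admitted`, its arguments, and the representation of qualifications as plain dictionaries are all assumptions made for illustration.

```python
def is_admitted(worker_quals, required, admit_no_prior=False):
    """Toy admission check (hypothetical, not Mephisto code).

    worker_quals: dict mapping qualification name -> value the worker holds.
    required: dict mapping qualification name -> predicate on that value.
    admit_no_prior: if True, a worker who was never granted a required
        qualification is let through instead of being rejected.
    """
    for name, passes in required.items():
        if name not in worker_quals:
            # Worker has no record for this qualification at all
            if not admit_no_prior:
                return False
            continue
        if not passes(worker_quals[name]):
            # Worker holds a failing value, so they are always rejected
            return False
    return True
```

Under this reading, a worker with an explicit failing value is rejected either way; the flag only changes how workers with no record are treated.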

### Adding custom qualifications to SharedTaskState
You should be able to specify a qualification in Mephisto using the following:
```python
Expand All @@ -103,11 +118,13 @@ where `QUAL_COMPARATOR` is any of the comparators available [here](https://githu

You can directly grant that qualification to Mephisto `Worker`s using `Worker.grant_qualification("QUALIFICATION_NAME", qualification_value)`.


### What if I want to block a worker that hasn't connected before?
For this you'll want to use the interface that a `CrowdProvider` has set up to do the granting process directly. An example for this can be found in `abstractions.providers.mturk.utils.script_utils`.
For this you'll want to use the interface that a `CrowdProvider` has set up to do the granting process directly. An example for this can be found in `abstractions.providers.mturk.utils.script_utils`.

Note, while you're able to grant these qualifications to a worker that isn't tracked by Mephisto, it will not be possible for Mephisto to help in bookkeeping qualifications granted to workers in this manner.


### What if I want to use qualifications only set by a provider?
For the special case of provider-specific qualifications, `SharedTaskState` has fields for `<provider>_specific_qualifications` wherein you can put qualifications in the expected format for that crowd provider. For instance, you can do the following for using an [MTurk-specific qualification](https://docs.aws.amazon.com/AWSMechTurk/latest/AWSMturkAPI/ApiReference_QualificationRequirementDataStructureArticle.html#ApiReference_QualificationType-IDs) on a task:
```python
Expand Down
12 changes: 4 additions & 8 deletions docs/web/docs/guides/how_to_use/worker_quality/other_methods.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,30 +7,26 @@
sidebar_position: 6
---

# Other methods for quality control
# Future features

While not yet implemented in Mephisto's core codebase, there are a few additional methods of quality control that may be successful. This doc lists a few that we've considered for Mephisto thus far.

## Worker Agreement

A fairly common method for ensuring that data is of high quality is to check for inter-annotator agreement. Putting the same work out for different annotators to complete is currently supported by mephisto using the `mephisto.blueprint.units_per_assignment` argument on static and remote query tasks. This ensures that the specified number of different workers will complete each task.
A fairly common method for ensuring that data is of high quality is to check for inter-annotator agreement. Putting the same work out for different annotators to complete is currently supported by mephisto using the `mephisto.blueprint.units_per_assignment` argument on static and remote query tasks. This ensures that the specified number of different workers will complete each Assignment from the Task.

Once you have multiple completions, you can write your own review script to parse the results for all `Assignment` of your `TaskRun` to see if the `Unit`s within each `Assignment` have similar enough submitted data.

Partial worker agreement may be a more efficient method of determining whether a worker is performing to your expectations, wherein you sample the tasks from a given worker and relaunch for others to complete and validate.
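
The agreement check in such a review script might look like the sketch below. It assumes you have already extracted one comparable label per `Unit` and grouped the labels by `Assignment`; the function names, the input shape, and the default threshold are hypothetical, and the extraction from Mephisto's stored results is left out.

```python
from collections import Counter


def agreement_rate(labels):
    """Fraction of labels that match the most common label."""
    if not labels:
        raise ValueError("need at least one label")
    _, top_count = Counter(labels).most_common(1)[0]
    return top_count / len(labels)


def flag_low_agreement(assignments, threshold=0.66):
    """Return ids of assignments whose units agree less than `threshold`.

    assignments: dict mapping an assignment id to the list of labels
    submitted by the different workers for that assignment.
    """
    return [
        assignment_id
        for assignment_id, labels in assignments.items()
        if agreement_rate(labels) < threshold
    ]
```

Majority-label agreement is only one possible measure; free-text or structured submissions would need a task-specific similarity function instead.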

## Review tasks as tasks
## Tasks for reviewing other Tasks

As an extension of the above, it may be preferable to create tasks to review the data submitted by other workers. You can then use the results to cut your own review time from checking every sample to checking only the borderline cases flagged by your metareviewers.

A review project like this almost certainly would require creating a specific allowlist of workers who are qualified to review the work of others, generally some of your higher performing workers on other tasks or during pilots.
A review project like this almost certainly would require creating a specific allowlist of workers who are qualified to review the work of others, generally some of your higher performing workers on other tasks or during pilots.

There's certainly a lot of lift to implement this type of workflow, so we're looking to support this type of functionality within Mephisto in our 1.1 release.

## Multi-tier worker qualification

Some have found it effective to keep local ratings on worker quality such that allowlist and blocklist can be created on the fly for specific tasks. You can certainly extend any review script you use to allow categorizing workers, and then may find that your higher-tiered workers are more appropriate for sensitive tasks, or those that require a quality comparison.
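
A review script extended in this way could derive the lists from locally stored review outcomes, as in the sketch below. The function `tier_workers`, its thresholds, and the input shape are assumptions for illustration; nothing here is a Mephisto API.

```python
def tier_workers(review_history, allow_at=0.9, block_below=0.5, min_reviews=5):
    """Split workers into an allowlist and a blocklist by approval rate.

    review_history: dict mapping a worker name to a list of booleans,
    where True means the reviewed unit was approved.
    """
    allowlist, blocklist = [], []
    for worker, outcomes in review_history.items():
        if len(outcomes) < min_reviews:
            continue  # too few reviews to judge this worker
        rate = sum(outcomes) / len(outcomes)
        if rate >= allow_at:
            allowlist.append(worker)
        elif rate < block_below:
            blocklist.append(worker)
    return allowlist, blocklist
```

Workers who land in neither list stay in a middle tier, which you could reserve for less sensitive tasks.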

## Contributing

While none of the above methods are codified yet, they should all be able to hook into Mephisto primitives in some form or other. We'd be excited to review contributions for any of the above.
6 changes: 3 additions & 3 deletions examples/form_composer_demo/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,8 @@ These form-based questionnaires are example of FormComposer task generator.
2. SSH into running container to run server: `docker exec -it mephisto_dc bash`
3. Inside the container, go to FormComposer examples directory: `cd /mephisto/examples/form_composer_demo`
4. Inside the `examples` directory, run a desired example with one of these commands:
- Simple form with Inhouse provider: `python ./run_task__local__inhouse.py`
- Dynamic form with Inhouse provider: `python ./run_task_dynamic__local__inhouse.py`
- Simple form with In-House provider: `python ./run_task__local__inhouse.py`
- Dynamic form with In-House provider: `python ./run_task_dynamic__local__inhouse.py`
- Dynamic form with Mturk on EC2: `python ./run_task_dynamic__ec2__mturk_sandbox.py`
- Dynamic form with Prolific on EC2: `python ./run_task_dynamic__ec2__prolific.py`
- Dynamic form with Presigned URLs: `python ./run_task_dynamic_presigned_urls__ec2__prolific.py`
Expand Down Expand Up @@ -56,6 +56,6 @@ mephisto form_composer config --directory /mephisto/examples/form_composer_demo/
mephisto form_composer config --directory /mephisto/examples/form_composer_demo/data/dynamic_presigned_urls/ --extrapolate-token-sets

# 2b. Run the Task
cd /mephisto/examples/form_composer_demo
cd /mephisto/examples/form_composer_demo
python ./run_task_dynamic_presigned_urls__ec2__prolific.py
```
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,9 @@ mephisto:
link_task_source: false
extra_source_dir: ${task_dir}/webapp/src/static
units_per_assignment: 2
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
task_name: "Sample Questionnaire"
task_title: "Example how to easily create dynamic form-based Tasks"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ mephisto:
link_task_source: false
extra_source_dir: ${task_dir}/webapp/src/static
units_per_assignment: 2
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,9 @@ mephisto:
min_golds: 1 # Required for Gold Units
max_incorrect_golds: 1 # Required for Gold Units
max_gold_units: 1 # Required for Gold Units
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
allowed_concurrent: 1 # Required for Gold Units
task_name: "Sample Questionnaire"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,9 @@ mephisto:
extra_source_dir: ${task_dir}/webapp/src/static
units_per_assignment: 1
onboarding_qualification: onboarding-qualification # Required for Onboarding
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
task_name: "Sample Questionnaire"
task_title: "Example how to easily create simple form-based Tasks with onboarding"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,9 @@ mephisto:
block_qualification: blocked-qualification # Required for Screening
use_screening_task: true # Required for Screening
max_screening_units: 1 # Required for Screening
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
allowed_concurrent: 1 # Required for Screening
task_name: "Sample Questionnaire"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ defaults:
- /mephisto/blueprint: parlai_chat
- /mephisto/architect: local
- /mephisto/provider: inhouse

mephisto:
blueprint:
world_file: ${task_dir}/demo_worlds.py
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,9 @@ defaults:
mephisto:
blueprint:
custom_source_bundle: ${task_dir}/webapp/build/bundle.js
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
task_name: parlai-chat-example
task_title: "Test ParlAI Prebuilt Chat Task"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,9 @@ defaults:
mephisto:
blueprint:
custom_source_dir: ${task_dir}/custom_simple
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
task_name: parlai-chat-example
task_title: "Test ParlAI Simply Built Chat Task"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ mephisto:
world_file: ${task_dir}/demo_worlds.py
task_description_file: ${task_dir}/task_description.html
num_conversations: 1
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,9 @@ defaults:
mephisto:
blueprint:
onboarding_qualification: test-parlai-chat-qualification
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
task_name: parlai-chat-example
task_title: "Test ParlAI Chat Task"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,9 @@ mephisto:
# NOTE pick something based on your task
block_qualification: test_qual_block
units_per_assignment: 1
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
allowed_concurrent: 1
task_name: remote-procedure-mnist
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,9 @@ mephisto:
block_qualification: "test-mnist-blocked-qualification"
use_screening_task: true
max_screening_units: 3
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
task_name: remote-procedure-mnist
task_title: "Provide feedback on our MNIST model"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,9 @@ mephisto:
preview_source: ${task_dir}/server_files/demo_preview.html
extra_source_dir: ${task_dir}/server_files/extra_refs
units_per_assignment: 2
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
task_name: html-static-task-example
task_title: "Test static task"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,9 @@ mephisto:
onboarding_source: ${task_dir}/server_files/demo_onboarding.html
onboarding_qualification: static-test-onboarding-qual
units_per_assignment: 2
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
task_name: html-static-task-example
task_title: "Test static task"
Expand Down
6 changes: 6 additions & 0 deletions examples/static_react_task/data/authorization.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
WORKER_USERNAME
WORKER_USERNAME2
WORKER_USERNAME3
x
y
z
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,9 @@ mephisto:
link_task_source: false
extra_source_dir: ${task_dir}/webapp/src/static
units_per_assignment: 1
log_level: "debug"
provider:
ui_base_url: "http://localhost:3001"
task:
task_name: react-static-task-example
task_title: "Rating a sentence as good or bad"
Expand Down