Merge pull request #146 from microsoft/pre-release
Pre release
Showing 18 changed files with 1,084 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
# Copyright (c) Microsoft Corporation. | ||
# Licensed under the MIT License. |
@@ -0,0 +1,75 @@
# Execution

The instantiated plans are executed by an execute task. After execution, an evaluation agent evaluates the quality of the entire execution process.

In this phase, given the task-action data, the execution process matches the real controller based on the Word environment and executes the plan step by step.

## ExecuteFlow

The `ExecuteFlow` class is designed to facilitate the execution and evaluation of tasks in a Windows application environment. It provides functionality to interact with the application's UI, execute predefined tasks, capture screenshots, and evaluate the results of the execution. The class also handles logging and error management for the tasks.

### Task Execution

The **task execution** in the `ExecuteFlow` class follows a structured sequence to ensure accurate and traceable task performance:

1. **Initialization**:
- Load configuration settings and log paths.
- Find the application window matching the task.
- Retrieve or create an `ExecuteAgent` for executing the task.

2. **Plan Execution**:
- Loop through each step in the `instantiated_plan`.
- Parse the step to extract information like subtasks, control text, and the required operation.

3. **Action Execution**:
- Find the control in the application window that matches the specified control text.
- If no matching control is found, raise an error.
- Perform the specified action (e.g., click, input text) using the agent's Puppeteer framework.
- Capture screenshots of the application window and selected controls for logging and debugging.

4. **Result Logging**:
- Log details of the step execution, including control information, performed action, and results.

5. **Finalization**:
- Save the final state of the application window.
- Quit the application client gracefully.
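
The plan-execution and action-execution steps above can be sketched roughly as follows. This is a minimal illustration, not the `ExecuteFlow` implementation: `Step`, `FakeWindow`, `find_control`, and `execute_plan` are hypothetical names, the window is an in-memory stand-in, and no real UI automation or screenshot capture takes place.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One entry of the instantiated plan (field names are assumptions)."""
    subtask: str
    control_text: str
    operation: str
    args: dict = field(default_factory=dict)


class FakeWindow:
    """Hypothetical stand-in for an application window: control text -> control id."""

    def __init__(self, controls):
        self.controls = dict(controls)

    def find_control(self, control_text):
        # Mirrors "if no matching control is found, raise an error".
        if control_text not in self.controls:
            raise LookupError(f"no control matches {control_text!r}")
        return self.controls[control_text]


def execute_plan(window, plan):
    """Walk the plan in order and return one log record per step."""
    log = []
    for index, step in enumerate(plan, start=1):
        control = window.find_control(step.control_text)
        # A real flow would perform the operation on the UI and capture
        # screenshots here; this sketch only records what would happen.
        log.append({
            "step": index,
            "subtask": step.subtask,
            "control": control,
            "operation": step.operation,
            "args": step.args,
        })
    return log


plan = [
    Step("Open the Home tab", "Home", "click"),
    Step("Type a title", "Document", "input_text", {"text": "Hello"}),
]
window = FakeWindow({"Home": "tab-1", "Document": "edit-1"})
records = execute_plan(window, plan)
```

Raising on a missing control stops the run at the first step that cannot be matched, so the log never contains steps that were silently skipped.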

Input of `ExecuteAgent`

| **Parameter**     | **Type** | **Description**                                                               |
|-------------------|----------|-------------------------------------------------------------------------------|
| `name`            | `str`    | The name of the agent. Used for identification and logging purposes.         |
| `process_name`    | `str`    | The name of the application process that the agent interacts with.           |
| `app_root_name`   | `str`    | The name of the root application window or main UI component being targeted. |

---

### Evaluation

The **evaluation** process in the `ExecuteFlow` class is designed to assess the performance of the executed task based on predefined prompts:

1. **Start Evaluation**:
- Evaluation begins immediately after task execution.
- It uses an `ExecuteEvalAgent` initialized during class construction.

2. **Perform Evaluation**:
- The `ExecuteEvalAgent` evaluates the task using a combination of input prompts (e.g., main prompt and API prompt) and logs generated during task execution.
- The evaluation process outputs a result summary (e.g., quality flag, comments, and task type).

3. **Log and Output Results**:
- Display the evaluation results in the console.
- Return the evaluation summary alongside the executed plan for further analysis or reporting.
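
To make the shape of the result summary concrete, here is a small sketch of parsing an evaluation response into a typed record. It assumes the agent returns JSON; the key names `quality_flag`, `comment`, and `task_type` and the `EvalSummary` and `parse_eval_response` helpers are hypothetical and are not taken from `ExecuteEvalAgent`.

```python
import json
from dataclasses import dataclass


@dataclass
class EvalSummary:
    """Hypothetical container for the evaluation result summary."""
    quality_flag: bool
    comment: str
    task_type: str


def parse_eval_response(raw):
    """Parse a JSON string into an EvalSummary, rejecting incomplete responses."""
    data = json.loads(raw)
    required = ("quality_flag", "comment", "task_type")
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"evaluation response is missing: {missing}")
    if not isinstance(data["quality_flag"], bool):
        raise ValueError("quality_flag must be a JSON boolean")
    return EvalSummary(data["quality_flag"], str(data["comment"]), str(data["task_type"]))


raw_response = '{"quality_flag": true, "comment": "All steps completed.", "task_type": "formatting"}'
summary = parse_eval_response(raw_response)
```

Validating the response before returning it keeps malformed evaluations out of the data that is handed on for analysis or reporting.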

# Reference

### ExecuteFlow

::: execution.workflow.execute_flow.ExecuteFlow

### ExecuteAgent

::: execution.agent.execute_agent.ExecuteAgent

### ExecuteEvalAgent

::: execution.agent.execute_eval_agent.ExecuteEvalAgent

@@ -0,0 +1,53 @@
# Instantiation

There are three key steps in the instantiation process:

1. `Choose a template` file according to the specified app and instruction.
2. `Prefill` the task using the current screenshot.
3. `Filter` the established task.

Given the initial task, the dataflow first chooses a template (`Phase 1`), then prefills the initial task based on the Word environment to obtain task-action data (`Phase 2`). Finally, it filters the established task to evaluate the quality of the task-action data.
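
As a rough picture of how the three phases chain together, the sketch below passes each phase in as a plain function. The `instantiate` helper, its argument names, and the stub phases are all hypothetical; the real flows are the classes documented in the sections that follow.

```python
def instantiate(task, choose_template, prefill, filter_task):
    """Run the three phases in order and collect their outputs."""
    template = choose_template(task)        # Phase 1: pick a template file
    task_action = prefill(task, template)   # Phase 2: produce task-action data
    accepted = filter_task(task_action)     # Phase 3: judge the result's quality
    return {"template": template, "task_action": task_action, "accepted": accepted}


# Stub phases, for illustration only.
result = instantiate(
    "Make the title bold",
    choose_template=lambda task: "1.docx",
    prefill=lambda task, template: {"task": task, "template": template, "plan": ["select title", "click Bold"]},
    filter_task=lambda task_action: len(task_action["plan"]) > 0,
)
```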

<h1 align="center">
<img src="../../img/instantiation.png"/>
</h1>

## 1. Choose Template File

Templates for your app must be defined and described in `dataflow/templates/app`. For instance, if you want to instantiate tasks for the Word application, place the relevant `.docx` files in `dataflow/templates/word`, along with a `description.json` file. The appropriate template will be selected based on how well its description matches the instruction.

The `ChooseTemplateFlow` uses semantic matching, where task descriptions are compared with template descriptions using embeddings and FAISS for efficient nearest neighbor search. If semantic matching fails, a random template is chosen from the available files.
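
The selection logic can be illustrated with a dependency-free sketch. Note the substitutions: word-count vectors and a brute-force cosine comparison stand in for the model embeddings and FAISS index that `ChooseTemplateFlow` is described as using, and treating "no overlap at all" as a failed match is an assumption. The function names and the mapping from file name to description are hypothetical, not the actual `description.json` schema.

```python
import math
import random
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts. A real flow would use an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def choose_template(instruction, descriptions, rng=random):
    """Return the file name whose description best matches the instruction."""
    if not descriptions:
        raise ValueError("no templates available")
    query = embed(instruction)
    scores = {name: cosine(query, embed(text)) for name, text in descriptions.items()}
    best = max(scores, key=scores.get)
    if scores[best] == 0.0:
        # Assumed failure condition: nothing matched, so fall back to a random file.
        return rng.choice(sorted(descriptions))
    return best


descriptions = {
    "1.docx": "A document with a table of quarterly sales",
    "2.docx": "A document with a bulleted list and a heading",
}
chosen = choose_template("Make the heading bold", descriptions)
```

With only a handful of templates a brute-force scan like this is adequate; an approximate nearest-neighbor index pays off as the template collection grows.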

#### ChooseTemplateFlow

::: instantiation.workflow.choose_template_flow.ChooseTemplateFlow

<br>

## 2. Prefill the Task

The `PrefillFlow` class orchestrates the refinement of task plans and UI interactions by leveraging `PrefillAgent` for task planning and action generation. It automates UI control updates, captures screenshots, and manages logs for messages and responses during execution.

#### PrefillFlow

::: instantiation.workflow.prefill_flow.PrefillFlow

#### PrefillAgent

The `PrefillAgent` class facilitates task instantiation and action sequence generation by constructing tailored prompt messages using the `PrefillPrompter`. It integrates system, user, and dynamic context to generate actionable inputs for automation workflows.

::: instantiation.agent.prefill_agent.PrefillAgent

<br>

## 3. Filter Task

The `FilterFlow` class is designed to process and refine task plans by leveraging a `FilterAgent`.

#### FilterFlow

::: instantiation.workflow.filter_flow.FilterFlow

#### FilterAgent

::: instantiation.agent.filter_agent.FilterAgent