Merge pull request #146 from microsoft/pre-release
Pre release
vyokky authored Dec 13, 2024
2 parents 3e9730a + cef7de7 commit d2a3ab6
Showing 18 changed files with 1,084 additions and 1 deletion.
2 changes: 2 additions & 0 deletions dataflow/execution/agent/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
2 changes: 2 additions & 0 deletions dataflow/execution/workflow/__init__.py
@@ -0,0 +1,2 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
2 changes: 2 additions & 0 deletions dataflow/instantiation/__init__.py
@@ -0,0 +1,2 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
2 changes: 2 additions & 0 deletions dataflow/instantiation/agent/__init__.py
@@ -0,0 +1,2 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
2 changes: 2 additions & 0 deletions dataflow/instantiation/workflow/__init__.py
@@ -0,0 +1,2 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
2 changes: 2 additions & 0 deletions dataflow/prompter/__init__.py
@@ -0,0 +1,2 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
2 changes: 2 additions & 0 deletions dataflow/prompter/execution/__init__.py
@@ -0,0 +1,2 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
2 changes: 2 additions & 0 deletions dataflow/prompter/instantiation/__init__.py
@@ -0,0 +1,2 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
75 changes: 75 additions & 0 deletions documents/docs/dataflow/execution.md
@@ -0,0 +1,75 @@
# Execution

The instantiated plans are executed by an execution task. After execution, an evaluation agent evaluates the quality of the entire execution process.

In this phase, given the task-action data, the execution process matches each step to the real control in the Word environment and executes the plan step by step.

## ExecuteFlow

The `ExecuteFlow` class is designed to facilitate the execution and evaluation of tasks in a Windows application environment. It provides functionality to interact with the application's UI, execute predefined tasks, capture screenshots, and evaluate the results of the execution. The class also handles logging and error management for the tasks.


### Task Execution

The **task execution** in the `ExecuteFlow` class follows a structured sequence to ensure accurate and traceable task performance:

1. **Initialization**:
- Load configuration settings and log paths.
- Find the application window matching the task.
- Retrieve or create an `ExecuteAgent` for executing the task.

2. **Plan Execution**:
- Loop through each step in the `instantiated_plan`.
- Parse the step to extract information like subtasks, control text, and the required operation.

3. **Action Execution**:
- Find the control in the application window that matches the specified control text.
- If no matching control is found, raise an error.
- Perform the specified action (e.g., click, input text) using the agent's Puppeteer framework.
- Capture screenshots of the application window and selected controls for logging and debugging.

4. **Result Logging**:
- Log details of the step execution, including control information, performed action, and results.

5. **Finalization**:
- Save the final state of the application window.
- Quit the application client gracefully.
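
The plan and action steps above can be sketched as a simple loop. This is a minimal illustration and not the actual `ExecuteFlow` code: `FakeControl`, `execute_plan`, and the step keys (`control_text`, `operation`, `args`, `subtask`) are hypothetical names chosen for the example, and the control only records the action instead of driving a real UI.

```python
from dataclasses import dataclass, field


@dataclass
class FakeControl:
    """Hypothetical stand-in for a UI control in the application window."""
    text: str
    actions: list = field(default_factory=list)

    def perform(self, operation, args):
        # A real agent would drive the UI here; this sketch only records the call.
        self.actions.append((operation, args))
        return f"{operation} on '{self.text}'"


def execute_plan(instantiated_plan, controls):
    """Run each plan step against the matching control and collect a log."""
    log = []
    for index, step in enumerate(instantiated_plan, start=1):
        # Parse the step (these field names are assumptions, not the real schema).
        control_text = step["control_text"]
        operation = step["operation"]
        args = step.get("args", {})

        # Find the control whose text matches; raise an error if none does.
        control = next((c for c in controls if c.text == control_text), None)
        if control is None:
            raise RuntimeError(f"Step {index}: no control matches '{control_text}'")

        # Perform the action and log the details of this step.
        result = control.perform(operation, args)
        log.append({
            "step": index,
            "subtask": step.get("subtask"),
            "control": control_text,
            "operation": operation,
            "result": result,
        })
    return log
```

Raising as soon as a control cannot be matched keeps the log consistent with what was actually performed, which matters because the log is later used for evaluation.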

Inputs to `ExecuteAgent`:

| **Parameter** | **Type** | **Description** |
|-------------------|----------|-------------------------------------------------------------------------------|
| `name` | `str` | The name of the agent. Used for identification and logging purposes. |
| `process_name` | `str` | The name of the application process that the agent interacts with. |
| `app_root_name` | `str` | The name of the root application window or main UI component being targeted. |
---

### Evaluation

The **evaluation** process in the `ExecuteFlow` class is designed to assess the performance of the executed task based on predefined prompts:

1. **Start Evaluation**:
- Evaluation begins immediately after task execution.
- It uses an `ExecuteEvalAgent` initialized during class construction.

2. **Perform Evaluation**:
- The `ExecuteEvalAgent` evaluates the task using a combination of input prompts (e.g., main prompt and API prompt) and logs generated during task execution.
- The evaluation process outputs a result summary (e.g., quality flag, comments, and task type).

3. **Log and Output Results**:
- Display the evaluation results in the console.
- Return the evaluation summary alongside the executed plan for further analysis or reporting.
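
The evaluation steps above can be sketched as follows. This is an illustrative outline under stated assumptions, not the `ExecuteEvalAgent` implementation: the real agent evaluates with prompts, whereas `toy_evaluator` below is a hypothetical rule-based stand-in, and the summary keys (`quality_flag`, `comment`, `task_type`) are assumed names for the fields mentioned in the text.

```python
def toy_evaluator(execution_log):
    """Hypothetical stand-in for a prompt-based evaluator."""
    # Assumed rule: the run is good if every logged step produced a result.
    complete = bool(execution_log) and all(
        entry.get("result") for entry in execution_log
    )
    return {
        "quality_flag": "good" if complete else "bad",
        "comment": "all steps produced a result" if complete else "missing results",
        "task_type": "toy",
    }


def run_evaluation(instantiated_plan, execution_log, evaluate=toy_evaluator):
    """Evaluate an executed plan and return the plan together with the summary."""
    summary = evaluate(execution_log)
    # Display the evaluation result in the console.
    print(f"Evaluation result: {summary}")
    return instantiated_plan, summary
```

Passing the evaluator in as a parameter keeps the flow independent of how the judgment is produced, so a prompt-based agent could be swapped in without changing the caller.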

# Reference

### ExecuteFlow

::: execution.workflow.execute_flow.ExecuteFlow

### ExecuteAgent

::: execution.agent.execute_agent.ExecuteAgent

### ExecuteEvalAgent

::: execution.agent.execute_eval_agent.ExecuteEvalAgent
53 changes: 53 additions & 0 deletions documents/docs/dataflow/instantiation.md
@@ -0,0 +1,53 @@
# Instantiation

There are three key steps in the instantiation process:

1. `Choose a template` file according to the specified app and instruction.
2. `Prefill` the task using the current screenshot.
3. `Filter` the established task.

Given the initial task, the dataflow first chooses a template (`Phase 1`), then prefills the initial task based on the Word environment to obtain task-action data (`Phase 2`). Finally, it filters the established task to evaluate the quality of the task-action data.

<h1 align="center">
<img src="../../img/instantiation.png"/>
</h1>

## 1. Choose Template File

Templates for your app must be defined and described in `dataflow/templates/app`. For instance, if you want to instantiate tasks for the Word application, place the relevant `.docx` files in `dataflow/templates/word`, along with a `description.json` file. The appropriate template will be selected based on how well its description matches the instruction.

The `ChooseTemplateFlow` uses semantic matching, where task descriptions are compared with template descriptions using embeddings and FAISS for efficient nearest neighbor search. If semantic matching fails, a random template is chosen from the available files.
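
The selection logic described above can be sketched with the standard library alone. This is not the `ChooseTemplateFlow` implementation: a toy bag-of-words vector replaces the learned embeddings, a brute-force cosine comparison replaces the FAISS index, and treating "matching fails" as "no template has any similarity to the instruction" is an assumption. The names `embed`, `cosine`, and `choose_template`, and the shape of the `descriptions` mapping (file name to description), are hypothetical.

```python
import math
import random


def embed(text, vocabulary):
    """Toy bag-of-words embedding; a real system would use a learned model."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def choose_template(instruction, descriptions, rng=random):
    """Return the template whose description best matches the instruction."""
    # Build a shared vocabulary from all template descriptions.
    vocabulary = sorted(
        {word for text in descriptions.values() for word in text.lower().split()}
    )
    query = embed(instruction, vocabulary)

    # Brute-force nearest neighbor search (stand-in for a FAISS index).
    scored = [
        (cosine(query, embed(text, vocabulary)), name)
        for name, text in descriptions.items()
    ]
    best_score, best_name = max(scored)

    if best_score == 0.0:
        # Assumed failure case: nothing matched, so pick a random template.
        return rng.choice(sorted(descriptions))
    return best_name
```

For example, with descriptions `{"1.docx": "a document with a table", "2.docx": "a document with a picture"}`, the instruction `"insert a row into the table"` selects `1.docx`, because it shares the word "table" with that description.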

#### ChooseTemplateFlow

::: instantiation.workflow.choose_template_flow.ChooseTemplateFlow

<br>

## 2. Prefill the Task

The `PrefillFlow` class orchestrates the refinement of task plans and UI interactions by leveraging `PrefillAgent` for task planning and action generation. It automates UI control updates, captures screenshots, and manages logs for messages and responses during execution.

#### PrefillFlow

::: instantiation.workflow.prefill_flow.PrefillFlow

#### PrefillAgent

The `PrefillAgent` class facilitates task instantiation and action sequence generation by constructing tailored prompt messages using the `PrefillPrompter`. It integrates system, user, and dynamic context to generate actionable inputs for automation workflows.

::: instantiation.agent.prefill_agent.PrefillAgent

<br>

## 3. Filter Task

The `FilterFlow` class is designed to process and refine task plans by leveraging a `FilterAgent`.

#### FilterFlow

::: instantiation.workflow.filter_flow.FilterFlow

#### FilterAgent

::: instantiation.agent.filter_agent.FilterAgent