Minor: Update README.md with system diagram (#148)
alamb authored Mar 1, 2024
1 parent 0b73c15 commit 1f53e25
Showing 2 changed files with 8 additions and 1 deletion.
9 changes: 8 additions & 1 deletion README.md
@@ -22,13 +22,20 @@ under the License.
 Comet is an Apache Spark plugin that uses [Apache Arrow DataFusion](https://arrow.apache.org/datafusion/)
 as native runtime to achieve improvement in terms of query efficiency and query runtime.

-On a high level, Comet aims to support:
+Comet runs Spark SQL queries using the native DataFusion runtime, which is
+typically faster and more resource efficient than JVM based runtimes.
+
+<a href="doc/comet-overview.png"><img src="doc/comet-system-diagram.png" align="center" width="500" ></a>
+
+Comet aims to support:
 - a native Parquet implementation, including both reader and writer
 - full implementation of Spark operators, including
 Filter/Project/Aggregation/Join/Exchange etc.
 - full implementation of Spark built-in expressions
 - a UDF framework for users to migrate their existing UDF to native

 ## Architecture

 The following diagram illustrates the architecture of Comet:

 <a href="doc/comet-overview.png"><img src="doc/comet-overview.png" align="center" height="600" width="750" ></a>
Binary file added doc/comet-system-diagram.png