Commit

Merge pull request #515 from milvus-io/v2.0.0-cy0827
About Milvus docs update
claireyuw authored Aug 27, 2021
2 parents 609d944 + b4d2777 commit 768c816
Showing 4 changed files with 18 additions and 11 deletions.
16 changes: 8 additions & 8 deletions site/en/about/comparison.md
@@ -10,9 +10,9 @@ We recommend you trying out Milvus 2.0. Here is why:
## Design concepts
As our next-generation cloud-native vector database, Milvus 2.0 is built around the following three principles:

-**Cloud-native first:** We believe that only architectures supporting storage and computing separation can scale on demand and take the full advantage of cloud's elasticity. We'd also like to bring your attention to the microservice design of Milvus 2.0, which features read and write separation, incremental and historical data separation, and CPU-intensive, memory-intensive, and IO-intensive task separation. Microservices help optimize allocation of resources for the ever-changing heterogeneous workload.
+**Cloud-native first:** We believe that only architectures supporting storage and computing separation can scale on demand and take full advantage of the cloud's elasticity. We'd also like to bring your attention to the microservice design of Milvus 2.0, which features read and write separation, incremental and historical data separation, and CPU-intensive, memory-intensive, and IO-intensive task separation. Microservices help optimize allocation of resources for the ever-changing heterogeneous workload.

-**Logs as data:** In Milvus 2.0, the log broker serves as the system' backbone: All data insert and update operations must go through the log broker, and worker nodes execute CRUD operations by subscribing to and consuming logs. This design reduces system complexity by moving core functions such as data persistence and flashback down to the storage layer, and log pub-sub make the system even more flexible and better positioned for future scaling.
+**Logs as data:** In Milvus 2.0, the log broker serves as the system's backbone: All data insert and update operations must go through the log broker, and worker nodes execute CRUD operations by subscribing to and consuming logs. This design reduces system complexity by moving core functions such as data persistence and flashback down to the storage layer, and log pub-sub make the system even more flexible and better positioned for future scaling.
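
The "logs as data" principle in this hunk can be illustrated with a toy sketch. This is not Milvus code: `LogBroker` and `WorkerNode` are hypothetical names, and the broker here is an in-memory list instead of a real message queue. It only shows the idea that writes go through a log and that workers rebuild their state by consuming it.

```python
class LogBroker:
    """Toy append-only log with pull-based subscribers (hypothetical, not the Milvus API)."""

    def __init__(self):
        self._log = []       # ordered entries: (op, key, value)
        self._offsets = {}   # subscriber name -> next offset to read

    def append(self, op, key, value=None):
        # Every write is recorded as a log entry; nothing mutates worker state directly.
        self._log.append((op, key, value))

    def subscribe(self, name, from_offset=0):
        self._offsets[name] = from_offset

    def poll(self, name):
        # Return all entries the subscriber has not consumed yet and advance its offset.
        start = self._offsets[name]
        self._offsets[name] = len(self._log)
        return self._log[start:]


class WorkerNode:
    """Builds its local state purely by consuming the log."""

    def __init__(self, name, broker):
        self.name = name
        self.broker = broker
        self.state = {}
        broker.subscribe(name)

    def catch_up(self):
        for op, key, value in self.broker.poll(self.name):
            if op == "insert":
                self.state[key] = value
            elif op == "delete":
                self.state.pop(key, None)


broker = LogBroker()
worker_a = WorkerNode("a", broker)
broker.append("insert", 1, [0.1, 0.2])
broker.append("insert", 2, [0.3, 0.4])
broker.append("delete", 1)
worker_a.catch_up()

# A worker that joins later replays the same log from offset 0
# and converges to the same state.
worker_b = WorkerNode("b", broker)
worker_b.catch_up()
```

Because the log is the only source of truth in this sketch, adding a worker requires no coordination with existing workers, which is the flexibility the paragraph attributes to log pub-sub.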

**Unified batch and stream processing:** Milvus 2.0 implements the unified Lambda architecture, which integrates the processing of the incremental and historical data. Compared with the Kappa architecture, Milvus 2.0 introduces log backfill, which stores log snapshots and indexes in the object storage to improve failure recovery efficiency and query performance. To break unbounded (stream) data down into bounded windows, Milvus embraces a new watermark mechanism, which slices the stream data into multiple message packs according to write time or event time, and maintains a timeline for users to query by time.

@@ -26,23 +26,23 @@ Data reliability and service sustainability are the basic requirements for a dat
- "Fail often" refers to the introduction of chaos testing, which uses fault injection in a testing environment to simulate situations such as hardware failures and dependency failures and accelerate bug discovery.

#### Hybrid search between scalar and vector data
-To leverage synergy between structured and unstructured data, Milvus 2.0 supports both scalar and vector data and enables hybrid search between them. Hybrid search helps users find the approximate nearest neighbors that match a filter criteria. Currently, Milvus supports relational operations such as EQUAL, GREATER THAN, and LESS THAN, and logical operations such as NOT, AND, OR, and IN.
+To leverage the synergy between structured and unstructured data, Milvus 2.0 supports both scalar and vector data and enables hybrid search between them. Hybrid search helps users find the approximate nearest neighbors that match filter criteria. Currently, Milvus supports relational operations such as EQUAL, GREATER THAN, and LESS THAN, and logical operations such as NOT, AND, OR, and IN.
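
To make the hybrid search description concrete, here is a minimal sketch that filters rows on scalar fields and then ranks the survivors by vector distance. It is an assumption-laden illustration, not the Milvus API: `hybrid_search`, the field names, and the sample rows are all hypothetical, and the search is an exact brute-force scan, whereas Milvus returns approximate nearest neighbors from an index.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hybrid_search(rows, query, predicate, top_k):
    """Keep rows whose scalar fields satisfy `predicate`, then rank them by distance."""
    candidates = [row for row in rows if predicate(row)]
    candidates.sort(key=lambda row: l2(row["embedding"], query))
    return [row["id"] for row in candidates[:top_k]]


# Hypothetical sample data: two scalar fields plus one vector field per row.
rows = [
    {"id": 1, "price": 10, "category": "book", "embedding": [0.0, 0.0]},
    {"id": 2, "price": 50, "category": "toy", "embedding": [0.1, 0.1]},
    {"id": 3, "price": 30, "category": "book", "embedding": [0.9, 0.9]},
    {"id": 4, "price": 20, "category": "game", "embedding": [0.2, 0.2]},
]


def predicate(row):
    # Combines a relational operation (LESS THAN) with logical ones (AND, IN).
    return row["price"] < 40 and row["category"] in ("book", "game")


result = hybrid_search(rows, [0.15, 0.15], predicate, top_k=2)
print(result)
```

Row 2 is the second-closest vector to the query but is excluded by the scalar filter, so only rows that pass the filter compete on distance.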

#### Tunable consistency
-As a distributed database abiding by the PACELC theorem, Milvus 2.0 has to trade off between consistency and availability & latency. In most scenarios, overemphasizing data consistency in production can overkill because allowing a small portion of data to be invisible has little impact on the overall recall but can significantly improve the query performance. Still, we believe that consistency levels, such as strong, bounded staleness, and session, have their own unique application. Therefore, Milvus supports tunable consistency at the request level. Taking testing as an example, users may require strong consistence to ensure test results are absolutely correct.
+As a distributed database abiding by the PACELC theorem, Milvus 2.0 has to trade off between consistency and availability & latency. In most scenarios, overemphasizing data consistency in production can be overkill because allowing a small portion of data to be invisible has little impact on the overall recall but can significantly improve the query performance. Still, we believe that consistency levels, such as strong, bounded staleness, and session, have their own unique application. Therefore, Milvus supports tunable consistency at the request level. Taking testing as an example, users may require strong consistency to ensure test results are absolutely correct.

#### Time travel
Data engineers often need to do data rollback to fix dirty data and code bugs. Traditional databases usually implement data rollback through snapshots or even data retrain. This could bring excessive overhead and maintenance costs. Milvus maintains a timeline for all data insert and delete operations, and users can specify a timestamp in a query to retrieve a data view at a specified point in time. With time travel, Milvus can also implement a lightweight data backup or data clone.
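
The time travel idea described above (a timeline of insert and delete operations, queried at a chosen timestamp) can be sketched as follows. `TimelineStore` and its methods are hypothetical names used only for illustration, and the sketch assumes operations are appended in non-decreasing timestamp order. It does not reflect how Milvus stores or indexes its timeline.

```python
class TimelineStore:
    """Toy key-value store that keeps every operation so past views can be rebuilt."""

    def __init__(self):
        # (timestamp, op, key, value), assumed to be appended in timestamp order.
        self._ops = []

    def insert(self, ts, key, value):
        self._ops.append((ts, "insert", key, value))

    def delete(self, ts, key):
        self._ops.append((ts, "delete", key, None))

    def view_at(self, ts):
        """Replay all operations with timestamp <= ts and return the resulting view."""
        view = {}
        for op_ts, op, key, value in self._ops:
            if op_ts > ts:
                break  # later operations are invisible at this point in time
            if op == "insert":
                view[key] = value
            else:
                view.pop(key, None)
        return view


store = TimelineStore()
store.insert(1, "a", "v1")
store.insert(2, "b", "v2")
store.delete(3, "a")

before_delete = store.view_at(2)   # state as it was before the delete at ts=3
after_delete = store.view_at(3)
```

Since no operation is ever overwritten, reading an older view is a replay up to a timestamp instead of a restore from a snapshot, which is why the paragraph calls the resulting backup or clone lightweight.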

#### ORM Python SDK:
-Object relational mapping (ORM) allows users to focus more on the upper-level business model than on the underlying data model, making it easier for developers to manage relations between collections, fields, and programs. To close the gap between proof of concept (PoC) for AI algorithms and production deployment, we engineered PyMilvus ORM APIs, which can work with an embedded library, a standalone deployment, a distributed cluster , or even a cloud service. With a unified set of APIs, we provide users with a consistent user experience and reduce code migration or adaptation costs.
+Object-relational mapping (ORM) allows users to focus more on the upper-level business model than on the underlying data model, making it easier for developers to manage relations between collections, fields, and programs. To close the gap between proof of concept (PoC) for AI algorithms and production deployment, we engineered PyMilvus ORM APIs, which can work with an embedded library, a standalone deployment, a distributed cluster, or even a cloud service. With a unified set of APIs, we provide users with a consistent user experience and reduce code migration or adaptation costs.

![ORM_Python_SDK](../../../assets/python_orm.png)

#### Support tools
-- **Milvus Insight** is Milvus' graphical user interface offering practical functionalities such as cluster state management, meta management, and data query. The source code of Milvus Insight will also be open sourced as an independent project. We are looking for more contributors to join this effort.
+- [**Milvus Insight**](insight_overview.md) is Milvus' graphical user interface offering practical functionalities such as cluster state management, meta management, and data query. The source code of Milvus Insight will also be open sourced as an independent project. We are looking for more contributors to join this effort.

-- **Milvus CLI** is Milvus' command-line interface based on [Milvus Python SDK](https://github.com/milvus-io/pymilvus), supporting database connection, data operations, and data export/import.
+- [**Milvus CLI**](https://github.com/milvus-io/milvus_cli#overview) is Milvus' command-line interface based on [Milvus Python SDK](https://github.com/milvus-io/pymilvus), supporting database connection, data operations, and data export/import.

- **Out-of-box experience (OOBE), faster deployment:** Milvus 2.0 can be deployed using helm or Docker Compose.

@@ -106,7 +106,7 @@ Object relational mapping (ORM) allows users to focus more on the upper-level bu
</tr>
<tr>
<th>SDKs</th>
-<td><li>Python</li><li>Go (in planning)</li><li>Java (in planning)</li><li>RESTful (in planning)</li><li>C++ (in planning)</li></td>
+<td><li>Python</li><li>Node.js</li><li>Go (in planning)</li><li>Java (in planning)</li><li>C++ (in planning)</li></td>
<td><li>Python</li><li>Java</li><li>Go</li><li>RESTful</li><li>C++</li></td>
</tr>
<tr>
3 changes: 3 additions & 0 deletions site/en/about/overview.md
@@ -157,6 +157,9 @@ Similarity search is the process of comparing a target to a database to find obj
#### Milvus Insight
[Milvus Insight](https://github.com/milvus-io/milvus-insight) is a graphical management system for Milvus. It features visualization of cluster states, meta management, data queries and more. Milvus Insight will eventually be open sourced.

+#### Milvus CLI
+[Milvus CLI](https://github.com/milvus-io/milvus_cli#overview) is Milvus' command-line interface based on [Milvus Python SDK](https://github.com/milvus-io/pymilvus), supporting database connection, data operations, and data export/import.
+
#### Milvus DM
[Data migration tool](migrate_overview.md) for Milvus 2.0 is now available.

6 changes: 3 additions & 3 deletions site/zh-CN/about/comparison.md
@@ -8,7 +8,7 @@ id: comparison.md
## Design concepts
We redefined the next-generation cloud-native vector database around the following three principles:

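-**Cloud-native first:** We believe that only an architecture that separates storage from computing can exploit the elasticity of the cloud and scale on demand. Also worth noting is that Milvus 2.0 adopts a microservice design that separates reads from writes, real-time from offline data, and compute-bound, memory-bound, and IO-bound tasks, which helps us choose the best resource ratio for complex workloads.
+**Cloud-native first:** We believe that only an architecture that separates storage from computing can exploit the elasticity of the cloud and scale on demand. Also worth noting is that Milvus 2.0 adopts a microservice design that separates reads from writes, real-time from offline data, and compute-bound, memory-bound, and IO-bound tasks, which helps us choose the best resource ratio for complex workloads.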

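**Log as data:** Milvus introduces message storage as the backbone of the system. Data inserts and updates interact only through message storage, and worker nodes execute the database's create, read, update, and delete operations by subscribing to the message stream (publish/subscribe). The advantage of this design is that it reduces system complexity by pushing key database capabilities such as persistence and flashback down to the storage layer. In addition, the log subscription mechanism provides great flexibility and lays the foundation for the system's future expansion.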

@@ -102,12 +102,12 @@ Milvus 2.0 is a distributed database built on message storage, following the PACELC
</tr>
<tr>
<th>SDK</th>
-<td><li>Python</li><li>Go (in development)</li><li>Java (in development)</li><li>RESTful (in development)</li><li>C++ (in development)</li></td>
+<td><li>Python</li><li>Node.js</li><li>Go (in development)</li><li>Java (in development)</li><li>C++ (in development)</li></td>
<td><li>Python</li><li>Java</li><li>Go</li><li>RESTful</li><li>C++</li></td>
</tr>
<tr>
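<th>Current status</th>
-<td>Preview version. A stable release is expected in August 2021.</td>
+<td>Preview version. A stable release is expected by the end of 2021.</td>
<td>Long-term support (LTS) version</td>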
</tr>
</tbody>
4 changes: 4 additions & 0 deletions site/zh-CN/about/overview.md
@@ -147,6 +147,10 @@ The Milvus project has more than 6,000 stars on GitHub and over 1,000 enterprise users,

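[Milvus Insight](https://github.com/milvus-io/milvus-insight) is the graphical management tool for Milvus, offering practical features such as cluster state visualization, metadata management, and data queries. The Milvus Insight source code will also be open sourced as an independent project in the future.

+#### Milvus CLI
+
+[Milvus CLI](https://github.com/milvus-io/milvus_cli#overview) is a command-line interface for Milvus based on [PyMilvus](https://github.com/milvus-io/pymilvus), supporting server connection, data operations, and data export/import.
+
#### Milvus DM data migration tool
[Milvus data migration tool](migrate_overview.md) is now available.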

