[Bug]: GPU_CAGRA OOM and slower than GPU_IVF_PQ #38101

Open
1 task
yongxin3344520 opened this issue Nov 29, 2024 · 3 comments
Labels: kind/bug, triage/accepted

yongxin3344520 commented Nov 29, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.5.0-beta
- Deployment mode: standalone
- SDK version: pymilvus 2.5.0
- OS: Ubuntu 20.04

CPU: Intel(R) Xeon(R) Platinu [email protected] cores
RAM: 48 GB
GPU: 4090, 24 GB
Disk: HDD, 120 GB

Current Behavior

I searched 6,000,000 vectors of dimension 256 with 5000 query vectors and topk=100.

With the GPU_CAGRA index the search runs out of memory (OOM), and it is slower than GPU_IVF_PQ.

The settings in the configuration file milvus.yaml do not seem to take effect.

The gpu maxMemSize is set to 20480, but at least 22144 MiB of GPU memory is used.
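
For orientation, here is a rough back-of-envelope estimate of the data sizes involved. It is only a sketch: the function name `estimate_footprint_mib` is hypothetical, it assumes float32 vectors and one 32-bit neighbor id per graph edge, it takes `graph_degree` 32 from the reproduction script, and it ignores allocator overhead and any search workspace, so it does not explain the 22144 MiB figure by itself.

```python
MIB = 1024 * 1024

def estimate_footprint_mib(n_vectors, dim, graph_degree, n_queries):
    """Back-of-envelope sizes in MiB (assumptions: float32 data, 32-bit neighbor ids)."""
    raw_vectors = n_vectors * dim * 4       # float32 components of the stored vectors
    graph = n_vectors * graph_degree * 4    # assumed: one 32-bit neighbor id per edge
    queries = n_queries * dim * 4           # float32 components of the query batch
    return {
        "raw_vectors": raw_vectors / MIB,
        "graph": graph / MIB,
        "queries": queries / MIB,
    }

# Values taken from the report: 6,000,000 x 256 vectors, graph_degree 32, 5000 queries.
sizes = estimate_footprint_mib(6_000_000, 256, 32, 5000)
for name, mib in sizes.items():
    print(f"{name}: {mib:.1f} MiB")
```

Under these assumptions the raw vectors and the graph together come to well under 24 GiB, which suggests the remainder is taken by memory pools and temporary buffers rather than by the indexed data alone.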


Expected Behavior

Traceback (most recent call last):
File "/soft/test_gpu2.py", line 115, in <module>
results = collection.search(data=[
File "/root/miniconda3/lib/python3.9/site-packages/pymilvus/orm/collection.py", line 801, in search
resp = conn.search(
File "/root/miniconda3/lib/python3.9/site-packages/pymilvus/decorators.py", line 141, in handler
raise e from e
File "/root/miniconda3/lib/python3.9/site-packages/pymilvus/decorators.py", line 137, in handler
return func(*args, **kwargs)
File "/root/miniconda3/lib/python3.9/site-packages/pymilvus/decorators.py", line 176, in handler
return func(self, *args, **kwargs)
File "/root/miniconda3/lib/python3.9/site-packages/pymilvus/decorators.py", line 116, in handler
raise e from e
File "/root/miniconda3/lib/python3.9/site-packages/pymilvus/decorators.py", line 86, in handler
return func(*args, **kwargs)
File "/root/miniconda3/lib/python3.9/site-packages/pymilvus/client/grpc_handler.py", line 805, in search
return self._execute_search(request, timeout, round_decimal=round_decimal, **kwargs)
File "/root/miniconda3/lib/python3.9/site-packages/pymilvus/client/grpc_handler.py", line 746, in _execute_search
raise e from e
File "/root/miniconda3/lib/python3.9/site-packages/pymilvus/client/grpc_handler.py", line 735, in _execute_search
check_status(response.status)
File "/root/miniconda3/lib/python3.9/site-packages/pymilvus/client/utils.py", line 63, in check_status
raise MilvusException(status.code, status.reason, status.error_code)
pymilvus.exceptions.MilvusException: <MilvusException: (code=65535, message=fail to search on QueryNode 3: worker(3) query failed: Operator::GetOutput faide id: 197] : => failed to search: config={{"itopk_size":128,"k":100,"metric_type":"IP","span_id":"8b80dac94bbe6d43","trace_flags":0,"trace_id":"7edcf859 raft inner error: std::bad_alloc: out_of_memory: RMM failure at:/workspace/source/cmake_build/3rdparty_download/rmm-src/include/rmm/mr/device/pool_memorypool size exceeded at /workspace/source/internal/core/src/index/VectorMemIndex.cpp:411

Steps To Reproduce

1. Insert 6,000,000 randomly generated vectors of dimension 256.
2. Randomly select 10,000 of the 6,000,000 vectors.
3. Search the 6,000,000 vectors with 10,000 query vectors of dimension 256.

Milvus Log

..............

Anything else?

......

@yongxin3344520 yongxin3344520 added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 29, 2024

yongxin3344520 commented Nov 29, 2024

This is my Python code:

```python
from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection
import random, time
import uuid

if __name__ == '__main__':
    dim = 256
    collection_name = "example_collection00000"
    # Connect to the Milvus server
    connections.connect(host='127.0.0.1', port='19530')
    conn = connections._fetch_handler()
    res = conn.list_collections()
    # Release and drop every collection except the test collection
    for r in res:
        print(r)
        if r == collection_name:
            continue
        conn.release_collection(r)
        conn.drop_collection(r)

    # exit(0)

    if not conn.has_collection(collection_name):

        # Define the fields
        field1 = FieldSchema(name="embedding",
                             dtype=DataType.FLOAT_VECTOR,
                             description="float vector field",
                             is_primary=False,
                             dim=dim
                             )
        field2 = FieldSchema(name="id",
                             dtype=DataType.INT64,
                             description="id field",
                             is_primary=True)

        # Define the collection schema
        schema = CollectionSchema(fields=[field1, field2], description="collection description")

        # Create the collection
        collection = Collection(name=collection_name, schema=schema)

        # Configure the index parameters
        index_params = {
            "metric_type": "IP",
            "index_type": "GPU_CAGRA",
            "params": {
                "intermediate_graph_degree": 64,
                "graph_degree": 32,
                "build_algo": "IVF_PQ",
                "cache_dataset_on_device": "false"
            }
        }
        # Create the index
        collection.create_index(field_name="embedding", index_params=index_params)
    else:
        collection = Collection(name=collection_name)

    max_id = 0

    # 3. Insert randomly generated vectors (6000 batches of 1000)
    ts = time.time()
    eps = 6000
    for j in range(eps):
        t0 = time.time()
        data = []
        for i in range(1000):
            data.append({
                "id": max_id + 1 + i + j * 1000,
                # "project_id": str(random.randint(0, 1000)),
                # "sponsorProjectId": str(random.randint(0, 1000)),
                # "totalPackageProjectId": str(random.randint(0, 1000)),
                # "projectType": random.choice(["a", "b", "c", "d", "e"]),
                # "url": str(uuid.uuid4()),
                # "sponsorId": str(random.randint(0, 1000)),
                # "produceName": str(uuid.uuid4()),
                # "imageId": str(random.randint(0, 1000)),
                # "submitTime": str(random.randint(0, 10000)),
                "embedding": [random.uniform(-1, 1) for _ in range(dim)]
            })
        t1 = time.time()
        res = collection.insert(
            # partition_name=random.choice(partition_names),
            data=data
        )
        t2 = time.time()
        speed = (j + 1) * len(data) / (t2 - ts)
        eta = (eps - 1 - j) * len(data) / speed
        print(
            f"\rSize:{len(data)}, Generate Cost: {'%.3f' % (t1 - t0)}, Insert Cost: {'%.3f' % (t2 - t1)},  Speed:{'%.3f' % speed}it/s, Eta:{int(eta)}s             ",
            end="")

    collection.load()

    # Run a count query to get the total number of entities
    res = collection.query(expr=" ", output_fields=["count(*)"])
    total = list(res[0].values())[0]
    print(f"The total number of entities in the collection '{collection_name}' is: {total}")

    search_params = {
        "metric_type": "IP",
        "params": {
            "itopk_size": 128
        }
    }

    ids = [i for i in range(total)]
    select_ids = random.sample(ids, 10000)

    t2 = time.time()
    query_vector = collection.query(expr=f"id in [{','.join([str(i) for i in select_ids])}] ", output_fields=["id", "embedding"])

    t3 = time.time()

    # Search with 10,000 random query vectors (the vectors fetched above are not used here)
    results = collection.search(data=[
        [random.uniform(-1, 1) for _ in range(dim)] for i in range(10000)
    ], anns_field="embedding", param=search_params, limit=100)
    t4 = time.time()
    print(f"Success: total:{total} query cost: {t3-t2}, search cost{t4 - t3}")
    # print(res)
    print(len(res), len(res[0]),
          # res ,
          "\n",
          # json.dumps(res, indent=4)
          )
```

yanliang567 (Contributor) commented

/assign @Presburger
/unassign

@yanliang567 yanliang567 added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 29, 2024
@yanliang567 yanliang567 added this to the 2.5.0 milestone Nov 29, 2024
Presburger (Member) commented

@yongxin3344520 Hello, you don't need to modify the default initsize and maxsize parameters in milvus.yaml, because the memory pool is only used for query vectors. Setting them too large may prevent the vectors and indexes in the database from loading properly.
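
If the memory pool is indeed consumed by query vectors, one thing that might be worth trying is sending the queries in smaller batches instead of all at once. The sketch below is an assumption, not a confirmed fix from the maintainers: `chunked` and `search_in_batches` are hypothetical helper names, and the batch size of 1000 is arbitrary. It only relies on the search function accepting a `data` keyword and returning one result per query, as `collection.search` does in the script above.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]

def search_in_batches(search_fn, queries, batch_size, **search_kwargs):
    """Call `search_fn` once per batch and concatenate the per-query results."""
    results = []
    for batch in chunked(queries, batch_size):
        results.extend(search_fn(data=batch, **search_kwargs))
    return results

# Hypothetical usage with the objects from the reproduction script:
# results = search_in_batches(collection.search, query_vectors, 1000,
#                             anns_field="embedding", param=search_params, limit=100)
```

Smaller batches trade throughput for a lower peak per-request footprint, so the batch size would need to be tuned against the observed GPU memory usage.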

@yanliang567 yanliang567 modified the milestones: 2.5.0, 2.5.1, 2.5.2 Dec 24, 2024