[Bug]: Milvus cannot load collection #38457
Comments
@gavinshark I think Milvus is not running in a healthy state. Please refer to this doc to export the complete Milvus logs for investigation. |
Hi yanglliang, I have sent the related log file to [email protected]. Please help check it. |
After checking the logs, I suggest you add more memory to the query nodes. @gavinshark |
memUsage = 41928.635314941406 MB is not correct. The real memUsage is about 8000MB |
By the way, the collection is the 768-dim, 10M-vector Cohere dataset; query node memory is 60 GB, the segment number is 3, and the index is HNSW with M=16. |
There are 3 query nodes in the cluster. |
I ran the same test on Milvus 2.4.15 and the problem does not occur. Memory usage is about 10 GB per node and 30 GB in total. Each node has 35 GB of free memory. |
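As a rough sanity check on the "about 30GB total" figure, here is a back-of-the-envelope estimate for a 10M x 768 HNSW index with M=16. It is only a sketch under stated assumptions, none of which come from this thread: vectors are stored as float32, each vector keeps up to 2*M level-0 neighbor links of 4 bytes each, and upper graph layers and other overhead are ignored. The function name `hnsw_estimate_gib` is hypothetical.

```python
def hnsw_estimate_gib(num_vectors, dim, m, bytes_per_float=4, bytes_per_link=4):
    """Rough HNSW memory estimate in GiB (assumptions: float32, 2*M level-0 links)."""
    raw_bytes = num_vectors * dim * bytes_per_float    # raw vector data
    link_bytes = num_vectors * 2 * m * bytes_per_link  # level-0 graph links only
    gib = 2 ** 30
    return raw_bytes / gib, link_bytes / gib


raw_gib, link_gib = hnsw_estimate_gib(num_vectors=10_000_000, dim=768, m=16)
print(f"raw vectors: {raw_gib:.1f} GiB, graph links: {link_gib:.1f} GiB, "
      f"total: {raw_gib + link_gib:.1f} GiB")
```

Under these assumptions the total lands close to the 30 GB observed on 2.4.15, which would be roughly 10 GB per node if the 3 segments are spread evenly across the 3 query nodes.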
Hmm... could you please refer to this doc, https://github.com/milvus-io/birdwatcher, and back up etcd with birdwatcher? |
Sent an email to you. |
I hit the same issue on version 2.5.0. By the way, the dataset was imported with VDBBench, and the index was compacted into 3 segments by changing the segment max size, to improve performance. The compaction completed, but loading failed (from both VDBBench and Attu). |
[2024/12/24 05:26:57.041 +00:00] [INFO] [task/executor.go:228] ["load segments..."] [taskID=1735007751540] [collectionID=454803907019330956] [replicaID=454821872001089539] [segmentID=454803907070007119] [node=146] [source=segment_checker] [shardLeader=146] |
[2024/12/24 05:26:57.062 +00:00] [WARN] [task/executor.go:232] ["failed to load segment"] [taskID=1735007751541] [collectionID=454803907019330956] [replicaID=454821872001089539] [segmentID=454803907070084622] [node=146] [source=segment_checker] [shardLeader=146] [error="load segment failed, OOM if load, maxSegmentSize = 18597.378524780273 MB, memUsage = 37306.86931705475 MB, predictMemUsage = 55904.24784183502 MB, totalMem = 61440 MB thresholdFactor = 0.900000"] |
It seems all 3 segments are loaded by a single query node. So I have 2 questions:
|
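The numbers in the WARN line above can be checked by hand. The sketch below assumes, based only on the fields in that log message and not on the Milvus source, that the predicted usage is `memUsage + maxSegmentSize` and that the load is rejected when it exceeds `totalMem * thresholdFactor`. The function name `would_oom` is hypothetical.

```python
def would_oom(max_segment_size_mb, mem_usage_mb, total_mem_mb, threshold_factor):
    """Reproduce the arithmetic implied by the log line (assumed, not Milvus code)."""
    predicted_mb = mem_usage_mb + max_segment_size_mb  # logged as predictMemUsage
    limit_mb = total_mem_mb * threshold_factor         # usable share of totalMem
    return predicted_mb, limit_mb, predicted_mb > limit_mb


# Values copied from the "failed to load segment" log entry.
predicted, limit, oom = would_oom(
    max_segment_size_mb=18597.378524780273,
    mem_usage_mb=37306.86931705475,
    total_mem_mb=61440,
    threshold_factor=0.9,
)
print(f"predicted={predicted:.2f} MB, limit={limit:.2f} MB, rejected={oom}")
```

With these inputs the predicted usage matches the logged `predictMemUsage` and sits about 600 MB above the 90% limit, so the rejection follows from the reported `memUsage` of roughly 37 GB. Whether that `memUsage` figure is itself accurate is the open question in this thread.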
I configured the shard number as 3. |
/assign @XuanYang-cn |
I guess this is the reason:
To verify my guess:
Suggestions:
|
I'd like to sync up some information:
|
Is there an existing issue for this?
Environment
Current Behavior
An HNSW collection cannot be loaded.
Expected Behavior
The HNSW collection can be loaded.
Steps To Reproduce
Milvus Log
No response
Anything else?
No response