[Bug]: [benchmark][retrieve] After writing data serially, load raises an error collection not loaded #38710

Open
1 task done
wangting0128 opened this issue Dec 24, 2024 · 0 comments
Labels: kind/bug, needs-triage

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:master-20241223-f499ca47-amd64
- Deployment mode(standalone or cluster):standalone
- MQ type(rocksmq, pulsar or kafka):rocksmq    
- SDK version(e.g. pymilvus v2.0.0rc2):2.5.0rc124
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

AWS EC2

server:

NAME                                               READY   STATUS      RESTARTS   AGE     IP             NODE                                         NOMINATED NODE   READINESS GATES
retrieve-perf-etcd-0                               1/1     Running     0          25h     10.15.43.23    ip-10-15-33-8.us-west-2.compute.internal     <none>           <none>
retrieve-perf-etcd-1                               1/1     Running     0          25h     10.15.30.59    ip-10-15-25-170.us-west-2.compute.internal   <none>           <none>
retrieve-perf-etcd-2                               1/1     Running     0          25h     10.15.33.203   ip-10-15-43-90.us-west-2.compute.internal    <none>           <none>
retrieve-perf-milvus-standalone-7c66958b6b-ssbfd   1/1     Running     0          25h     10.15.32.53    ip-10-15-40-67.us-west-2.compute.internal    <none>           <none>
[Screenshot 2024-12-24 17 12 07]

client log:
scene_sparse_float_vector.log

[2024-12-24 06:43:31,283 -  INFO - fouram]: [Time] Collection.insert run in 2.3159s (api_request.py:49)
[2024-12-24 06:43:31,285 -  INFO - fouram]: [Base] Number of vectors in the collection(scene_sparse_float_vector): 39960000 (base.py:535)
[2024-12-24 06:43:34,286 -  INFO - fouram]: [Base] Start inserting, ids: 39980000 - 39999999, data size: 40,000,000 (base.py:366)
[2024-12-24 06:43:36,549 -  INFO - fouram]: [Time] Collection.insert run in 2.2619s (api_request.py:49)
[2024-12-24 06:43:36,550 -  INFO - fouram]: [Base] Number of vectors in the collection(scene_sparse_float_vector): 39980000 (base.py:535)
[2024-12-24 06:43:36,596 -  INFO - fouram]: [Base] Total time of insert: 4694.3311s, average number of vector bars inserted per second: 8520.9158, average time to insert 20000 vectors per time: 2.3472s (base.py:422)
[2024-12-24 06:43:36,596 -  INFO - fouram]: [Base] Start flush collection scene_sparse_float_vector (base.py:313)
[2024-12-24 06:43:39,618 -  INFO - fouram]: [Time] Collection.flush run in 3.0219s (api_request.py:49)
[2024-12-24 06:43:39,622 -  INFO - fouram]: [Base] Collection:scene_sparse_float_vector is not building index (base.py:494)
[2024-12-24 06:43:39,622 -  INFO - fouram]: [Base] Start build index of SPARSE_INVERTED_INDEX for field:sparse_float_vector collection:scene_sparse_float_vector, params:{'index_type': 'SPARSE_INVERTED_INDEX', 'metric_type': 'IP', 'params': {'drop_ratio_build': 0.2}}, kwargs:{} (base.py:472)
[2024-12-24 07:20:49,085 -  INFO - fouram]: [Time] Index run in 2229.4619s (api_request.py:49)
[2024-12-24 07:20:49,086 -  INFO - fouram]: [CommonCases] RT of build index SPARSE_INVERTED_INDEX: 2229.4619s (common_cases.py:168)
[2024-12-24 07:20:49,086 -  INFO - fouram]: [CommonCases] Prepare index SPARSE_INVERTED_INDEX done. (common_cases.py:170)
[2024-12-24 07:20:49,087 -  INFO - fouram]: [CommonCases] No scalar and vector fields need to be indexed. (common_cases.py:189)
[2024-12-24 07:20:49,088 -  INFO - fouram]: [Base] Index params of scene_sparse_float_vector:[{'sparse_float_vector': {'index_type': 'SPARSE_INVERTED_INDEX', 'metric_type': 'IP', 'params': {'drop_ratio_build': 0.2}}}] (base.py:491)
[2024-12-24 07:20:49,090 -  INFO - fouram]: [Base] Number of vectors in the collection(scene_sparse_float_vector): 40000000 (base.py:535)
[2024-12-24 07:20:49,090 -  INFO - fouram]: [PerfTemplate] Actual parameters used: {'dataset_params': {'metric_type': 'IP', 'vector_field_name': 'sparse_float_vector', 'dim': 30000, 'sparse_range': [100, 150], 'dataset_name': 'local', 'dataset_size': '40m', 'ni_per': 20000}, 'collection_params': {'shards_num': 1, 'collection_name': 'scene_sparse_float_vector'}, 'index_params': {'index_type': 'SPARSE_INVERTED_INDEX', 'index_param': {'drop_ratio_build': 0.2}}} (performance_template.py:67)
[2024-12-24 07:20:49,090 -  INFO - fouram]: [Base] Start load collection scene_sparse_float_vector,replica_number:1,kwargs:{} (base.py:319)
[2024-12-24 07:32:27,323 - ERROR - fouram]: RPC error: [get_loading_progress], <MilvusException: (code=101, message=collection not loaded[collection=454804082068633373])>, <Time:{'RPC start': '2024-12-24 07:32:27.315088', 'RPC error': '2024-12-24 07:32:27.323789'}> (decorators.py:140)
[2024-12-24 07:32:27,334 - ERROR - fouram]: RPC error: [wait_for_loading_collection], <MilvusException: (code=101, message=collection not loaded[collection=454804082068633373])>, <Time:{'RPC start': '2024-12-24 07:20:49.112322', 'RPC error': '2024-12-24 07:32:27.333979'}> (decorators.py:140)
[2024-12-24 07:32:27,334 - ERROR - fouram]: RPC error: [load_collection], <MilvusException: (code=101, message=collection not loaded[collection=454804082068633373])>, <Time:{'RPC start': '2024-12-24 07:20:49.090621', 'RPC error': '2024-12-24 07:32:27.334118'}> (decorators.py:140)
[2024-12-24 07:32:27,340 - ERROR - fouram]: (api_response) : [Collection.load] <MilvusException: (code=101, message=collection not loaded[collection=454804082068633373])>, [requestId: 9991efba-c1c7-11ef-8ec3-0eb43a01be00] (api_request.py:57)
[2024-12-24 07:32:27,340 - ERROR - fouram]: [CheckFunc] load request check failed, response:<MilvusException: (code=101, message=collection not loaded[collection=454804082068633373])> (func_check.py:106)
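
To make the timing of the failure explicit, the sketch below parses the `RPC start` and `RPC error` timestamps from the `load_collection` error line in the log above and computes how long the load ran before failing. The helper name `rpc_duration_seconds` is hypothetical and not part of fouram or pymilvus; the log line itself is copied verbatim.

```python
import re
from datetime import datetime

# Copied verbatim from the client log: the load_collection RPC error line.
LOG_LINE = (
    "[2024-12-24 07:32:27,334 - ERROR - fouram]: RPC error: [load_collection], "
    "<MilvusException: (code=101, message=collection not loaded"
    "[collection=454804082068633373])>, "
    "<Time:{'RPC start': '2024-12-24 07:20:49.090621', "
    "'RPC error': '2024-12-24 07:32:27.334118'}> (decorators.py:140)"
)

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def rpc_duration_seconds(line: str) -> float:
    """Return the seconds between 'RPC start' and 'RPC error' in one log line."""
    start = re.search(r"'RPC start': '([^']+)'", line)
    error = re.search(r"'RPC error': '([^']+)'", line)
    if start is None or error is None:
        raise ValueError("line has no RPC start/error timestamps")
    t0 = datetime.strptime(start.group(1), TS_FORMAT)
    t1 = datetime.strptime(error.group(1), TS_FORMAT)
    return (t1 - t0).total_seconds()


elapsed = rpc_duration_seconds(LOG_LINE)
print(f"load_collection failed after {elapsed:.1f}s")
```

By this calculation the load request ran for roughly 11 minutes 38 seconds before the server returned code=101.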

Expected Behavior

No response

Steps To Reproduce

1. Create a collection with fields 'id' (INT64, primary key) and 'sparse_float_vector'.
2. Build a SPARSE_INVERTED_INDEX index on field sparse_float_vector.
3. Insert 40m (40,000,000) vectors.
4. Flush the collection.
5. Rebuild the index.
6. Load the collection <- raises error
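
The steps above can be sketched with the pymilvus ORM API roughly as follows. This is a minimal sketch under assumptions, not the benchmark's actual code: the helper names (`random_sparse_vector`, `batch_ranges`, `reproduce`), the `localhost:19530` address, and the randomly generated data (standing in for the benchmark's "local" dataset) are all hypothetical, and step 5 is assumed to be a second `create_index` call with the same parameters. Field names, index parameters, `shards_num`, and the batch size come from the client config and log.

```python
import random


def random_sparse_vector(dim=30000, nnz_range=(100, 150), rng=random):
    """One sparse row as {dimension_index: value}; non-zero count drawn from nnz_range."""
    nnz = rng.randint(nnz_range[0], nnz_range[1])
    return {i: rng.random() for i in rng.sample(range(dim), nnz)}


def batch_ranges(total, ni_per):
    """Yield (first_id, last_id) for each insert batch of at most ni_per rows."""
    for start in range(0, total, ni_per):
        yield start, min(start + ni_per, total) - 1


def reproduce(host="localhost", port="19530", total=40_000_000, ni_per=20_000):
    """Run steps 1-6 against a live Milvus; host and port are placeholders."""
    # Imported lazily so the helpers above work without pymilvus installed.
    from pymilvus import (Collection, CollectionSchema, DataType,
                          FieldSchema, connections)

    connections.connect(host=host, port=port)
    field = "sparse_float_vector"
    schema = CollectionSchema([
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
        FieldSchema(name=field, dtype=DataType.SPARSE_FLOAT_VECTOR),
    ])
    index_params = {
        "index_type": "SPARSE_INVERTED_INDEX",
        "metric_type": "IP",
        "params": {"drop_ratio_build": 0.2},
    }
    # Step 1: create the collection.
    collection = Collection("scene_sparse_float_vector", schema, shards_num=1)
    # Step 2: build the index before inserting.
    collection.create_index(field, index_params)
    # Step 3: insert serially in batches of ni_per rows.
    for first, last in batch_ranges(total, ni_per):
        ids = list(range(first, last + 1))
        collection.insert([ids, [random_sparse_vector() for _ in ids]])
    # Step 4: flush.
    collection.flush()
    # Step 5: rebuild the index (assumed to be a repeated create_index call).
    collection.create_index(field, index_params)
    # Step 6: load; this is the call that failed in the report.
    collection.load(replica_number=1)
```

Calling `reproduce()` needs a running Milvus instance and inserts the full 40m rows, so it may take hours; passing a smaller `total` gives a quick smoke test but may not trigger the error.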

Milvus Log

No response

Anything else?

server config:

cluster:
  enabled: false
etcd:
  image:
    registry: harbor-us-vdc.zilliz.cc
    repository: milvus/etcd
    tag: 3.5.16-r1
  metrics:
    enabled: true
    podMonitor:
      enabled: true
  replicaCount: 3
externalS3:
  accessKey: ***
  bucketName: ***
  enabled: true
  host: ***
  port: "443"
  secretKey: ***
  useSSL: true
extraConfigFiles:
  user.yaml: |
    localStorage:
      path: /milvus-data
    dataCoord:
      segment:
        sealProportion: 1
    indexCoord:
      scheduler:
        interval: 1
    queryNode:
      cache:
        warmup: sync
      mmap:
        vectorField: true
        vectorIndex: true
        scalarField: true
        scalarIndex: true
image:
  all:
    repository: milvusdb/milvus
    tag: master-20241223-f499ca47-amd64
log:
  level: info
metrics:
  serviceMonitor:
    enabled: true
minio:
  enabled: false
nodeSelector:
  kubernetes.io/hostname: ***
  wt: "true"
pulsarv3:
  enabled: false
standalone:
  resources:
    limits:
      cpu: "2.0"
      memory: 16Gi
    requests:
      cpu: "1.0"
      memory: 10Gi
tolerations:
- effect: NoSchedule
  key: wt
  operator: Equal
  value: "true"
volumeMounts:
- mountPath: /milvus-data
  name: milvus-data-volume
volumes:
- hostPath:
    path: /mnt/data
    type: Directory
  name: milvus-data-volume

client config:

{
     "dataset_params": {
          "metric_type": "IP",
          "vector_field_name": "sparse_float_vector",
          "dim": 30000,
          "sparse_range": [
               100,
               150
          ],
          "dataset_name": "local",
          "dataset_size": "40m",
          "ni_per": 20000
     },
     "collection_params": {
          "shards_num": 1,
          "collection_name": "scene_sparse_float_vector"
     },
     "index_params": {
          "index_type": "SPARSE_INVERTED_INDEX",
          "index_param": {
               "drop_ratio_build": 0.2
          }
     },
     "concurrent_params": {
          "concurrent_number": [
               10,
               100
          ],
          "during_time": "30m",
          "interval": 20
     },
     "concurrent_tasks": [
          {
               "type": "query",
               "weight": 1,
               "params": {
                    "expr": "",
                    "output_fields": [
                         "sparse_float_vector"
                    ],
                    "limit": 10,
                    "timeout": 600,
                    "random_data": true,
                    "random_count": 10,
                    "random_range": [
                         0,
                         40000000
                    ],
                    "field_name": "id",
                    "field_type": "int64"
               }
          }
     ]
}
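
For context when reading the two configs together, here is a back-of-envelope estimate of the raw size of the sparse field compared with the standalone memory limit. It assumes 8 bytes per non-zero entry (a 4-byte dimension index plus a 4-byte float value); that figure is an assumption, not something stated in this report. The estimate ignores index overhead, the effect of `drop_ratio_build`, and the fact that mmap is enabled, so it does not by itself establish the cause of the load failure.

```python
def raw_sparse_bytes(rows, nnz_range, bytes_per_entry=8):
    """Return (low, high) raw payload size in bytes for the sparse field.

    bytes_per_entry=8 is an assumption: 4-byte index + 4-byte float value.
    """
    low_nnz, high_nnz = nnz_range
    return rows * low_nnz * bytes_per_entry, rows * high_nnz * bytes_per_entry


GIB = 1024 ** 3
# dataset_size "40m" and sparse_range [100, 150] come from the client config.
low, high = raw_sparse_bytes(rows=40_000_000, nnz_range=(100, 150))
print(f"raw sparse data: {low / GIB:.1f} to {high / GIB:.1f} GiB")
print("standalone memory limit from server config: 16 GiB")
```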
@wangting0128 added the kind/bug and needs-triage labels on Dec 24, 2024