[Bug]: [benchmark][cluster] Build index raises error context deadline exceeded in concurrent DQL & DDL scene #37258

Closed
1 task done
wangting0128 opened this issue Oct 29, 2024 · 7 comments
Assignees
Labels
kind/bug (issues or changes related to a bug), test/benchmark (benchmark test), triage/accepted (indicates an issue or PR is ready to be actively worked on)
Milestone

Comments

wangting0128 commented Oct 29, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version:master-20241029-0f59bfdf-amd64
- Deployment mode(standalone or cluster):cluster
- MQ type(rocksmq, pulsar or kafka):pulsar    
- SDK version(e.g. pymilvus v2.0.0rc2):2.5.0rc97
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

argo task: fouramf-bitmap-mmap-9shm7
test case name: test_bitmap_locust_dql_ddl_cluster

server: enabled mmap

NAME                                                            READY   STATUS      RESTARTS         AGE     IP              NODE         NOMINATED NODE   READINESS GATES
fouramf-bitmap-mmap-9shm7-2-etcd-0                              1/1     Running     0                4h26m   10.104.27.138   4am-node31   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-etcd-1                              1/1     Running     0                4h26m   10.104.19.7     4am-node28   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-etcd-2                              1/1     Running     0                4h26m   10.104.17.3     4am-node23   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-milvus-datanode-f85569c7f-x4wp8     1/1     Running     3 (4h22m ago)    4h26m   10.104.1.201    4am-node10   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-milvus-indexnode-95b46f588-6h58z    1/1     Running     3 (4h25m ago)    4h26m   10.104.25.80    4am-node30   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-milvus-indexnode-95b46f588-kn45x    1/1     Running     2 (4h26m ago)    4h26m   10.104.15.80    4am-node20   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-milvus-mixcoord-58c8bc9f89-nd2jp    1/1     Running     3 (4h25m ago)    4h26m   10.104.15.81    4am-node20   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-milvus-proxy-7f87654597-qkcsv       1/1     Running     3 (4h22m ago)    4h26m   10.104.14.68    4am-node18   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-milvus-querynode-66567d8858-snfc5   1/1     Running     2 (4h26m ago)    4h26m   10.104.34.93    4am-node37   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-milvus-querynode-66567d8858-tls98   1/1     Running     2 (4h26m ago)    4h26m   10.104.14.69    4am-node18   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-minio-0                             1/1     Running     0                4h26m   10.104.19.5     4am-node28   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-minio-1                             1/1     Running     0                4h26m   10.104.27.139   4am-node31   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-minio-2                             1/1     Running     0                4h26m   10.104.32.131   4am-node39   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-minio-3                             1/1     Running     0                4h26m   10.104.17.4     4am-node23   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-pulsar-bookie-0                     1/1     Running     0                4h26m   10.104.16.16    4am-node21   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-pulsar-bookie-1                     1/1     Running     0                4h26m   10.104.23.27    4am-node27   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-pulsar-bookie-2                     1/1     Running     0                4h26m   10.104.27.142   4am-node31   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-pulsar-bookie-init-kt9qb            0/1     Completed   0                4h26m   10.104.16.11    4am-node21   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-pulsar-broker-0                     1/1     Running     0                4h26m   10.104.30.231   4am-node38   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-pulsar-proxy-0                      1/1     Running     0                4h26m   10.104.27.127   4am-node31   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-pulsar-pulsar-init-dswf4            0/1     Completed   0                4h26m   10.104.16.10    4am-node21   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-pulsar-recovery-0                   1/1     Running     0                4h26m   10.104.4.116    4am-node11   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-pulsar-zookeeper-0                  1/1     Running     0                4h26m   10.104.16.15    4am-node21   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-pulsar-zookeeper-1                  1/1     Running     0                4h26m   10.104.27.145   4am-node31   <none>           <none>
fouramf-bitmap-mmap-9shm7-2-pulsar-zookeeper-2                  1/1     Running     0                4h25m   10.104.21.109   4am-node24   <none>           <none>

build_index_failed.log

{pod=~"fouramf-bitmap-mmap-9shm7-2-.*"} |~ "context deadline exceeded|scene_hybrid_search_test_J194MWfE|f4b54b6f4eeadd983b760a3e54b8f334"

client logs:

[2024-10-29 06:38:29,256 - DEBUG - fouram]: [Base] Create collection scene_hybrid_search_test_J194MWfE (base.py:273)
[2024-10-29 06:38:29,256 - DEBUG - fouram]: (api_request)  : [Collection] args: ['scene_hybrid_search_test_J194MWfE', {'auto_id': False, 'description': '', 'fields': [{'name': 'id', 'description': '', 'type': <DataType.INT64: 5>, 'is_primary': True, 'auto_id': False}, {'name': 'float_vector', 'description': '', 'type': <DataType.FLOAT_VECTOR: 101>, 'params': {'dim': 128}}, {'name': 'binary_vector_scene_hybrid_search_test_1', 'description': '', 'type': <DataType.BINARY_VECTOR: 100>, 'params': {'dim': 512}}, {'name': 'float16_vector_scene_hybrid_search_test_2', 'description': '', 'type': <DataType.FLOAT16_VECTOR: 102>, 'params': {'dim': 64}}, {'name': 'sparse_float_vector_scene_hybrid_search_test_3', 'description': '', 'type': <DataType.SPARSE_FLOAT_VECTOR: 104>}, {'name': 'int64_1', 'description': '', 'type': <DataType.INT64: 5>}, {'name': 'bool_1', 'description': '', 'type': <DataType.BOOL: 1>}, {'name': 'varchar_1', 'description': '', 'type': <DataType.VARCHAR: 21>, 'params': {'max_length': 256}}], 'enable_dynamic_field': False}, 'default'], kwargs: {'shards_num': 2}, [requestId: 68940e44-95c0-11ef-bffa-62953db63396] (api_request.py:77)
<name>: scene_hybrid_search_test_J194MWfE
[2024-10-29 06:38:30,294 - DEBUG - fouram]: [Base] Start inserting 3000 vectors to collection scene_hybrid_search_test_J194MWfE (base.py:383)
[2024-10-29 06:38:30,765 - DEBUG - fouram]: [Base] Number of vectors in the collection(scene_hybrid_search_test_J194MWfE): 0 (base.py:532)
[2024-10-29 06:38:30,796 - DEBUG - fouram]: [Base] Start flush collection scene_hybrid_search_test_J194MWfE (base.py:313)
[2024-10-29 06:38:33,827 - DEBUG - fouram]: [Base] Number of vectors in the collection(scene_hybrid_search_test_J194MWfE): 3000 (base.py:532)
[2024-10-29 06:38:33,827 - DEBUG - fouram]: [Base] Start build index of IVF_SQ8 for field:float_vector collection:scene_hybrid_search_test_J194MWfE, params:{'index_type': 'IVF_SQ8', 'metric_type': 'L2', 'params': {'nlist': 2048}}, kwargs:{} (base.py:469)
<name>: scene_hybrid_search_test_J194MWfE
[2024-10-29 06:38:54,262 - DEBUG - fouram]: [Base] Start build index of BIN_IVF_FLAT for field:binary_vector_scene_hybrid_search_test_1 collection:scene_hybrid_search_test_J194MWfE, params:{'index_type': 'BIN_IVF_FLAT', 'metric_type': 'JACCARD', 'params': {'nlist': 2048}}, kwargs:{} (base.py:469)
<name>: scene_hybrid_search_test_J194MWfE
[2024-10-29 06:39:00,851 - DEBUG - fouram]: [Base] Start build index of DISKANN for field:float16_vector_scene_hybrid_search_test_2 collection:scene_hybrid_search_test_J194MWfE, params:{'index_type': 'DISKANN', 'metric_type': 'IP', 'params': {}}, kwargs:{} (base.py:469)
<name>: scene_hybrid_search_test_J194MWfE
[2024-10-29 06:39:14,027 - DEBUG - fouram]: [Base] Start build index of SPARSE_WAND for field:sparse_float_vector_scene_hybrid_search_test_3 collection:scene_hybrid_search_test_J194MWfE, params:{'index_type': 'SPARSE_WAND', 'metric_type': 'IP', 'params': {'drop_ratio_build': 0.2}}, kwargs:{} (base.py:469)
<name>: scene_hybrid_search_test_J194MWfE
[2024-10-29 06:39:25,397 - DEBUG - fouram]: [Base] Start build scalar index of scene_hybrid_search_test_J194MWfE for field:int64_1, index_params:{}, kwargs: {} (base.py:478)
<name>: scene_hybrid_search_test_J194MWfE
[2024-10-29 06:39:38,541 - DEBUG - fouram]: [Base] Start build scalar index of scene_hybrid_search_test_J194MWfE for field:bool_1, index_params:{'index_type': 'BITMAP'}, kwargs: {} (base.py:478)
<name>: scene_hybrid_search_test_J194MWfE
[2024-10-29 06:40:03,470 - DEBUG - fouram]: [Base] Start build scalar index of scene_hybrid_search_test_J194MWfE for field:varchar_1, index_params:{'index_type': 'BITMAP'}, kwargs: {} (base.py:478)
<name>: scene_hybrid_search_test_J194MWfE
[2024-10-29 06:40:13,476 - ERROR - fouram]: RPC error: [create_index], <MilvusException: (code=10001, message=context deadline exceeded)>, <Time:{'RPC start': '2024-10-29 06:40:03.470398', 'RPC error': '2024-10-29 06:40:13.476455'}> (decorators.py:140)
[2024-10-29 06:40:13,478 - ERROR - fouram]: (api_response) : [Index] <MilvusException: (code=10001, message=context deadline exceeded)>, [requestId: a0bbe350-95c0-11ef-bffa-62953db63396] (api_request.py:57)
[2024-10-29 06:40:13,478 - ERROR - fouram]: [CheckFunc] init_index request check failed, response:<MilvusException: (code=10001, message=context deadline exceeded)> (func_check.py:106)
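
The timestamps in the client log above are enough to estimate how long each index step ran and how long the failing `create_index` call waited before returning. The following is a minimal sketch using only the Python standard library; the timestamps are copied from the log, while the labels and variable names are illustrative assumptions. The gap between two consecutive "Start build" lines only approximates the duration of the earlier step.

```python
from datetime import datetime

LOG_FMT = "%Y-%m-%d %H:%M:%S,%f"  # client log prefix, e.g. 2024-10-29 06:38:33,827
RPC_FMT = "%Y-%m-%d %H:%M:%S.%f"  # 'RPC start' / 'RPC error' fields

# "Start build ..." timestamps copied from the client log, in order.
starts = [
    ("IVF_SQ8 on float_vector", "2024-10-29 06:38:33,827"),
    ("BIN_IVF_FLAT on binary_vector", "2024-10-29 06:38:54,262"),
    ("DISKANN on float16_vector", "2024-10-29 06:39:00,851"),
    ("SPARSE_WAND on sparse_float_vector", "2024-10-29 06:39:14,027"),
    ("default scalar index on int64_1", "2024-10-29 06:39:25,397"),
    ("BITMAP on bool_1", "2024-10-29 06:39:38,541"),
    ("BITMAP on varchar_1", "2024-10-29 06:40:03,470"),
]

parsed = [(label, datetime.strptime(ts, LOG_FMT)) for label, ts in starts]

# Seconds between each step's start and the next step's start.
gaps = {
    label: (nxt - cur).total_seconds()
    for (label, cur), (_, nxt) in zip(parsed, parsed[1:])
}

# Elapsed time of the failing create_index RPC on varchar_1.
rpc_start = datetime.strptime("2024-10-29 06:40:03.470398", RPC_FMT)
rpc_error = datetime.strptime("2024-10-29 06:40:13.476455", RPC_FMT)
rpc_elapsed = (rpc_error - rpc_start).total_seconds()

for label, seconds in gaps.items():
    print(f"{label}: {seconds:.3f} s until next step")
print(f"BITMAP on varchar_1: failed after {rpc_elapsed:.3f} s")
```

With these values, the failing call returns about 10 seconds after it starts, while the preceding steps are spaced between roughly 6.6 and 24.9 seconds apart.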

Expected Behavior

No response

Steps To Reproduce

concurrent test and calculation of RT and QPS

        :purpose:  `primary key: INT64`
            1. building `BITMAP` index on all supported 12 scalar fields, `INVERTED` index on pk field
            2. 2 fields of different vector types
            3. verify DQL & DML requests

        :test steps:
            1. create collection with fields:
                'float_vector': 128dim
                'float_vector_1': 768dim
                'id': primary key type is INT64

                all scalar fields: varchar max_length=100, array max_capacity=13
            2. build indexes:
                HNSW: 'float_vector'
                IVF_SQ8: 'float_vector_1'

                BITMAP: all scalar fields
                INVERTED: 'id' primary key field
            3. insert 10 million data
            4. flush collection
            5. build indexes again using the same params
            6. load collection
            7. concurrent request:
                - search
                - query
                - hybrid_search
                - scene_test
                    (collection: create->insert->flush->index->drop)
                - scene_search_test
                    (collection: create->insert->flush->index->load->search->drop)
                - scene_hybrid_search_test: 4 vector fields, 3 scalar fields
                    (collection: create->insert->flush->index->load->hybrid_search->drop)
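
The scene flow listed in the test steps (create->insert->flush->index->load->hybrid_search->drop) can be sketched as a small driver that times each step and reports the first one that fails. This is a hypothetical illustration, not the fouram test code: `run_scene`, `ok`, and `failing_index` are assumed names, and the step callables are stand-ins for real SDK calls. Whether the real test still drops the collection after a failure is not stated in this issue; the sketch simply stops at the first error.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# Step names follow the scene_hybrid_search_test flow in the test steps.
SCENE_STEPS = ["create", "insert", "flush", "index", "load", "hybrid_search", "drop"]


def run_scene(
    actions: Dict[str, Callable[[], None]],
) -> Tuple[List[Tuple[str, float]], Optional[Tuple[str, Exception]]]:
    """Run the steps in order, timing each; stop at the first one that raises."""
    timings: List[Tuple[str, float]] = []
    for name in SCENE_STEPS:
        started = time.perf_counter()
        try:
            actions[name]()
        except Exception as exc:
            timings.append((name, time.perf_counter() - started))
            return timings, (name, exc)
        timings.append((name, time.perf_counter() - started))
    return timings, None


# Hypothetical stand-ins for SDK calls; a real run would call the client here.
def ok() -> None:
    pass


def failing_index() -> None:
    # Mimics the error message reported in this issue.
    raise TimeoutError("context deadline exceeded")


actions = {name: ok for name in SCENE_STEPS}
actions["index"] = failing_index

timings, failure = run_scene(actions)
print("steps run:", [name for name, _ in timings])
print("first failure:", failure)
```

Recording the failing step name next to the timings makes it easier to tell which part of the flow a single locust failure came from.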

Milvus Log

No response

Anything else?

test result:

[2024-10-29 08:24:57,197 -  INFO - fouram]: Print locust final stats. (locust_runner.py:56)
[2024-10-29 08:24:57,198 -  INFO - fouram]: Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s (stats.py:789)
[2024-10-29 08:24:57,198 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-10-29 08:24:57,198 -  INFO - fouram]: grpc     hybrid_search                                                                   1466     0(0.00%) |   6731    3438   46410   5800 |    0.14        0.00 (stats.py:789)
[2024-10-29 08:24:57,198 -  INFO - fouram]: grpc     query                                                                           1467     0(0.00%) |    322     105   24122    130 |    0.14        0.00 (stats.py:789)
[2024-10-29 08:24:57,199 -  INFO - fouram]: grpc     scene_hybrid_search_test                                                        1450     1(0.07%) |  80328   14212  208073  77000 |    0.13        0.00 (stats.py:789)
[2024-10-29 08:24:57,199 -  INFO - fouram]: grpc     scene_search_test                                                               1394     0(0.00%) |  51675   10159  147462  50000 |    0.13        0.00 (stats.py:789)
[2024-10-29 08:24:57,199 -  INFO - fouram]: grpc     scene_test                                                                      1506     0(0.00%) |  80601   63888  143937  80000 |    0.14        0.00 (stats.py:789)
[2024-10-29 08:24:57,199 -  INFO - fouram]: grpc     search                                                                          1515     0(0.00%) |   1633     941   41809   1200 |    0.14        0.00 (stats.py:789)
[2024-10-29 08:24:57,199 -  INFO - fouram]: --------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|----------- (stats.py:789)
[2024-10-29 08:24:57,199 -  INFO - fouram]:          Aggregated                                                                      8798     1(0.01%) |  36680     105  208073  14000 |    0.81        0.00 (stats.py:789)
[2024-10-29 08:24:57,199 -  INFO - fouram]:  (stats.py:790)
[2024-10-29 08:24:57,209 -  INFO - fouram]: [PerfTemplate] Report data: 
{'server': {'deploy_tool': 'helm',
            'deploy_mode': 'cluster',
            'config_name': 'cluster_8c16m',
            'config': {'queryNode': {'resources': {'limits': {'cpu': '16', 'memory': '8Gi'}, 'requests': {'cpu': '5.0', 'memory': '8Gi'}}, 'replicas': 2},
                       'indexNode': {'resources': {'limits': {'cpu': '8.0', 'memory': '8Gi'}, 'requests': {'cpu': '5.0', 'memory': '5Gi'}}, 'replicas': 2},
                       'dataNode': {'resources': {'limits': {'cpu': '8.0', 'memory': '16Gi'}, 'requests': {'cpu': '5.0', 'memory': '9Gi'}}},
                       'cluster': {'enabled': True},
                       'pulsar': {},
                       'kafka': {},
                       'minio': {'metrics': {'podMonitor': {'enabled': True}}},
                       'etcd': {'metrics': {'enabled': True, 'podMonitor': {'enabled': True}}},
                       'metrics': {'serviceMonitor': {'enabled': True}},
                       'log': {'level': 'debug'},
                       'extraConfigFiles': {'user.yaml': 'queryNode:\n'
                                                         '  mmap:\n'
                                                         '    mmapEnabled: true\n'
                                                         '    vectorField: true\n'
                                                         '    vectorIndex: true\n'
                                                         '    scalarField: true\n'
                                                         '    scalarIndex: true\n'},
                       'image': {'all': {'repository': 'harbor.milvus.io/milvus/milvus', 'tag': 'master-20241029-0f59bfdf-amd64'}}},
            'host': 'fouramf-bitmap-mmap-9shm7-2-milvus.qa-milvus.svc.cluster.local',
            'port': '19530',
            'uri': ''},
 'client': {'test_case_type': 'ConcurrentClientBase',
            'test_case_name': 'test_bitmap_locust_dql_ddl_cluster',
            'test_case_params': {'dataset_params': {'metric_type': 'L2',
                                                    'dim': 128,
                                                    'max_length': 100,
                                                    'scalars_index': {'id': {'index_type': 'INVERTED'},
                                                                      'int8_1': {'index_type': 'BITMAP'},
                                                                      'int16_1': {'index_type': 'BITMAP'},
                                                                      'int32_1': {'index_type': 'BITMAP'},
                                                                      'int64_1': {'index_type': 'BITMAP'},
                                                                      'varchar_1': {'index_type': 'BITMAP'},
                                                                      'bool_1': {'index_type': 'BITMAP'},
                                                                      'array_int8_1': {'index_type': 'BITMAP'},
                                                                      'array_int16_1': {'index_type': 'BITMAP'},
                                                                      'array_int32_1': {'index_type': 'BITMAP'},
                                                                      'array_int64_1': {'index_type': 'BITMAP'},
                                                                      'array_varchar_1': {'index_type': 'BITMAP'},
                                                                      'array_bool_1': {'index_type': 'BITMAP'}},
                                                    'vectors_index': {'float_vector_1': {'index_type': 'IVF_SQ8',
                                                                                         'index_param': {'nlist': 1024},
                                                                                         'metric_type': 'L2'}},
                                                    'scalars_params': {'array_int8_1': {'params': {'max_capacity': 13},
                                                                                        'other_params': {'dataset': 'random_algorithm',
                                                                                                         'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                              'specify_range': [-128, 128],
                                                                                                                              'max_capacity': 13}}},
                                                                       'array_int16_1': {'params': {'max_capacity': 13},
                                                                                         'other_params': {'dataset': 'random_algorithm',
                                                                                                          'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                               'specify_range': [-200, 200],
                                                                                                                               'max_capacity': 13}}},
                                                                       'array_int32_1': {'params': {'max_capacity': 13},
                                                                                         'other_params': {'dataset': 'random_algorithm',
                                                                                                          'algorithm_params': {'algorithm_name': 'specify_scope',
                                                                                                                               'specify_range': [-300, 300],
                                                                                                                               'max_capacity': 13}}},
                                                                       'array_int64_1': {'params': {'max_capacity': 13},
                                                                                         'other_params': {'dataset': 'random_algorithm',
                                                                                                          'algorithm_params': {'algorithm_name': 'fixed_value_range',
                                                                                                                               'specify_range': [-400, 432],
                                                                                                                               'batch': 50,
                                                                                                                               'max_capacity': 13}}},
                                                                       'array_varchar_1': {'params': {'max_capacity': 13},
                                                                                           'other_params': {'dataset': 'random_algorithm',
                                                                                                            'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                                 'specify_range': [-1500, 1500],
                                                                                                                                 'max_capacity': 13}}},
                                                                       'array_bool_1': {'params': {'max_capacity': 13}},
                                                                       'int8_1': {'other_params': {'dataset': 'random_algorithm',
                                                                                                   'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                        'specify_range': [-128, 128],
                                                                                                                        'max_capacity': 13}}},
                                                                       'int16_1': {'other_params': {'dataset': 'random_algorithm',
                                                                                                    'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                         'specify_range': [-200, 200],
                                                                                                                         'max_capacity': 13}}},
                                                                       'int32_1': {'other_params': {'dataset': 'random_algorithm',
                                                                                                    'algorithm_params': {'algorithm_name': 'specify_scope',
                                                                                                                         'specify_range': [-300, 300],
                                                                                                                         'max_capacity': 13}}},
                                                                       'int64_1': {'other_params': {'dataset': 'random_algorithm',
                                                                                                    'algorithm_params': {'algorithm_name': 'fixed_value_range',
                                                                                                                         'specify_range': [-400, 432],
                                                                                                                         'batch': 50,
                                                                                                                         'max_capacity': 13}}},
                                                                       'varchar_1': {'other_params': {'dataset': 'random_algorithm',
                                                                                                      'algorithm_params': {'algorithm_name': 'random_range',
                                                                                                                           'specify_range': [-1500, 1500],
                                                                                                                           'max_capacity': 13}}}},
                                                    'dataset_name': 'sift',
                                                    'dataset_size': 10000000,
                                                    'ni_per': 5000},
                                 'collection_params': {'other_fields': ['float_vector_1', 'int8_1', 'int16_1', 'int32_1', 'int64_1', 'varchar_1', 'bool_1',
                                                                        'array_int8_1', 'array_int16_1', 'array_int32_1', 'array_int64_1', 'array_varchar_1',
                                                                        'array_bool_1'],
                                                       'shards_num': 2},
                                 'resource_groups_params': {'reset': False},
                                 'database_user_params': {'reset_rbac': False, 'reset_db': False},
                                 'index_params': {'index_type': 'HNSW', 'index_param': {'M': 8, 'efConstruction': 200}},
                                 'concurrent_params': {'concurrent_number': 30, 'during_time': '3h', 'interval': 20, 'spawn_rate': None},
                                 'concurrent_tasks': [{'type': 'search',
                                                       'weight': 1,
                                                       'params': {'nq': 1000,
                                                                  'top_k': 10,
                                                                  'search_param': {'nprobe': 16},
                                                                  'expr': 'int8_1 == 100',
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'output_fields': ['id', 'float_vector', 'int64_1'],
                                                                  'ignore_growing': False,
                                                                  'group_by_field': None,
                                                                  'timeout': 60,
                                                                  'random_data': True,
                                                                  'check_task': 'check_search_output',
                                                                  'check_items': {'nq': 1000}}},
                                                      {'type': 'query',
                                                       'weight': 1,
                                                       'params': {'ids': None,
                                                                  'expr': 'int64_1 > -1',
                                                                  'output_fields': ['*'],
                                                                  'offset': None,
                                                                  'limit': 10,
                                                                  'ignore_growing': False,
                                                                  'partition_names': None,
                                                                  'timeout': 60,
                                                                  'consistency_level': None,
                                                                  'random_data': False,
                                                                  'random_count': 0,
                                                                  'random_range': [0, 1],
                                                                  'field_name': 'id',
                                                                  'field_type': 'int64',
                                                                  'check_task': 'check_query_output',
                                                                  'check_items': {'expect_length': 10}}},
                                                      {'type': 'hybrid_search',
                                                       'weight': 1,
                                                       'params': {'nq': 10,
                                                                  'top_k': 10,
                                                                  'reqs': [{'search_param': {'ef': 32},
                                                                            'anns_field': 'float_vector',
                                                                            'expr': '(array_contains_any(array_int32_1, [0]) || array_contains(array_int64_1, '
                                                                                    '1)) || ((varchar_1 like "1%") and (bool_1 == True))',
                                                                            'top_k': 30},
                                                                           {'search_param': {'nprobe': 64},
                                                                            'anns_field': 'float_vector_1',
                                                                            'expr': 'not (int16_1 == int8_1) && ARRAY_CONTAINS_ANY(array_int64_1, [-1, 0, '
                                                                                    '1])'}],
                                                                  'rerank': {'RRFRanker': []},
                                                                  'output_fields': ['*'],
                                                                  'ignore_growing': False,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'timeout': 60,
                                                                  'random_data': True,
                                                                  'check_task': 'check_search_output',
                                                                  'check_items': {'output_fields': ['float_vector_1', 'int8_1', 'int16_1', 'int32_1', 'int64_1',
                                                                                                    'varchar_1', 'bool_1', 'array_int8_1', 'array_int16_1',
                                                                                                    'array_int32_1', 'array_int64_1', 'array_varchar_1',
                                                                                                    'array_bool_1', 'id', 'float_vector'],
                                                                                  'nq': 10}}},
                                                      {'type': 'scene_test',
                                                       'weight': 1,
                                                       'params': {'dim': 128,
                                                                  'data_size': 3000,
                                                                  'nb': 3000,
                                                                  'index_type': 'IVF_SQ8',
                                                                  'index_param': {'nlist': 2048},
                                                                  'metric_type': 'L2',
                                                                  'other_fields': [],
                                                                  'scalars_params': {},
                                                                  'scalars_index': {},
                                                                  'vectors_index': {}}},
                                                      {'type': 'scene_search_test',
                                                       'weight': 1,
                                                       'params': {'dataset': 'local',
                                                                  'dim': 128,
                                                                  'shards_num': 2,
                                                                  'data_size': 3000,
                                                                  'nb': 3000,
                                                                  'index_type': 'IVF_SQ8',
                                                                  'index_param': {'nlist': 2048},
                                                                  'metric_type': 'L2',
                                                                  'other_fields': ['array_int64_1', 'array_bool_1', 'array_varchar_1'],
                                                                  'replica_number': 1,
                                                                  'nq': 1,
                                                                  'top_k': 10,
                                                                  'search_param': {'nprobe': 16},
                                                                  'search_counts': 10,
                                                                  'scalars_params': {},
                                                                  'scalars_index': {'array_int64_1': {'index_type': 'BITMAP'},
                                                                                    'array_bool_1': {'index_type': 'BITMAP'},
                                                                                    'array_varchar_1': {'index_type': 'BITMAP'}},
                                                                  'vectors_index': {},
                                                                  'prepare_before_insert': False,
                                                                  'new_connect': False,
                                                                  'new_user': False}},
                                                      {'type': 'scene_hybrid_search_test',
                                                       'weight': 1,
                                                       'params': {'nq': 1,
                                                                  'top_k': 1,
                                                                  'reqs': [{'search_param': {'nprobe': 128},
                                                                            'anns_field': 'float_vector',
                                                                            'expr': 'bool_1 == True',
                                                                            'top_k': 100},
                                                                           {'search_param': {'nprobe': 32},
                                                                            'anns_field': 'binary_vector_scene_hybrid_search_test_1',
                                                                            'expr': 'bool_1 != True',
                                                                            'top_k': 10},
                                                                           {'search_param': {'search_list': 30},
                                                                            'anns_field': 'float16_vector_scene_hybrid_search_test_2',
                                                                            'expr': 'int64_1 >= 1500',
                                                                            'top_k': 5},
                                                                           {'search_param': {'drop_ratio_search': 0.1},
                                                                            'anns_field': 'sparse_float_vector_scene_hybrid_search_test_3',
                                                                            'expr': 'varchar_1 like "1%"',
                                                                            'top_k': 10}],
                                                                  'rerank': {'RRFRanker': []},
                                                                  'output_fields': ['*'],
                                                                  'ignore_growing': False,
                                                                  'guarantee_timestamp': None,
                                                                  'partition_names': None,
                                                                  'timeout': 600,
                                                                  'random_data': True,
                                                                  'dataset': 'local',
                                                                  'dim': 128,
                                                                  'shards_num': 2,
                                                                  'data_size': 3000,
                                                                  'nb': 3000,
                                                                  'index_type': 'IVF_SQ8',
                                                                  'index_param': {'nlist': 2048},
                                                                  'metric_type': 'L2',
                                                                  'other_fields': ['binary_vector_scene_hybrid_search_test_1',
                                                                                   'float16_vector_scene_hybrid_search_test_2',
                                                                                   'sparse_float_vector_scene_hybrid_search_test_3', 'int64_1', 'bool_1',
                                                                                   'varchar_1'],
                                                                  'replica_number': 1,
                                                                  'scalars_params': {'binary_vector_scene_hybrid_search_test_1': {'params': {'dim': 512},
                                                                                                                                  'other_params': {'dataset': 'binary'}},
                                                                                     'float16_vector_scene_hybrid_search_test_2': {'params': {'dim': 64}}},
                                                                  'scalars_index': {'int64_1': {},
                                                                                    'bool_1': {'index_type': 'BITMAP'},
                                                                                    'varchar_1': {'index_type': 'BITMAP'}},
                                                                  'vectors_index': {'binary_vector_scene_hybrid_search_test_1': {'index_type': 'BIN_IVF_FLAT',
                                                                                                                                 'index_param': {'nlist': 2048},
                                                                                                                                 'metric_type': 'JACCARD'},
                                                                                    'float16_vector_scene_hybrid_search_test_2': {'index_type': 'DISKANN',
                                                                                                                                  'index_param': {},
                                                                                                                                  'metric_type': 'IP'},
                                                                                    'sparse_float_vector_scene_hybrid_search_test_3': {'index_type': 'SPARSE_WAND',
                                                                                                                                       'index_param': {'drop_ratio_build': 0.2},
                                                                                                                                       'metric_type': 'IP'}},
                                                                  'prepare_before_insert': False,
                                                                  'hybrid_search_counts': 10,
                                                                  'new_connect': False,
                                                                  'new_user': False}}]},
            'run_id': 2024102943739782,
            'datetime': '2024-10-29 03:59:33.760618',
            'client_version': '2.5.0'},
 'result': {'test_result': {'index': {'RT': 2055.6764,
                                      'float_vector_1': {'RT': 509.9911},
                                      'id': {'RT': 315.1771},
                                      'int8_1': {'RT': 92.8785},
                                      'int16_1': {'RT': 0.5275},
                                      'int32_1': {'RT': 0.5257},
                                      'int64_1': {'RT': 0.5269},
                                      'varchar_1': {'RT': 0.5245},
                                      'bool_1': {'RT': 0.5225},
                                      'array_int8_1': {'RT': 0.5226},
                                      'array_int16_1': {'RT': 0.5255},
                                      'array_int32_1': {'RT': 0.523},
                                      'array_int64_1': {'RT': 0.523},
                                      'array_varchar_1': {'RT': 0.5251},
                                      'array_bool_1': {'RT': 0.528}},
                            'insert': {'total_time': 1352.8655, 'VPS': 7391.7178, 'batch_time': 0.6764, 'batch': 5000},
                            'flush': {'RT': 3.1945},
                            'load': {'RT': 27.1894},
                            'Locust': {'Aggregated': {'Requests': 8798,
                                                      'Fails': 1,
                                                      'RPS': 0.81,
                                                      'fail_s': 0.0,
                                                      'RT_max': 208073.29,
                                                      'RT_avg': 36680.3,
                                                      'TP50': 14000.0,
                                                      'TP99': 127000.0},
                                       'hybrid_search': {'Requests': 1466,
                                                         'Fails': 0,
                                                         'RPS': 0.14,
                                                         'fail_s': 0.0,
                                                         'RT_max': 46410.16,
                                                         'RT_avg': 6731.3,
                                                         'TP50': 5800.0,
                                                         'TP99': 26000.0},
                                       'query': {'Requests': 1467,
                                                 'Fails': 0,
                                                 'RPS': 0.14,
                                                 'fail_s': 0.0,
                                                 'RT_max': 24122.84,
                                                 'RT_avg': 322.05,
                                                 'TP50': 130.0,
                                                 'TP99': 2600.0},
                                       'scene_hybrid_search_test': {'Requests': 1450,
                                                                    'Fails': 1,
                                                                    'RPS': 0.13,
                                                                    'fail_s': 0.0,
                                                                    'RT_max': 208073.29,
                                                                    'RT_avg': 80328.3,
                                                                    'TP50': 77000.0,
                                                                    'TP99': 173000.0},
                                       'scene_search_test': {'Requests': 1394,
                                                             'Fails': 0,
                                                             'RPS': 0.13,
                                                             'fail_s': 0.0,
                                                             'RT_max': 147462.28,
                                                             'RT_avg': 51675.86,
                                                             'TP50': 50000.0,
                                                             'TP99': 118000.0},
                                       'scene_test': {'Requests': 1506,
                                                      'Fails': 0,
                                                      'RPS': 0.14,
                                                      'fail_s': 0.0,
                                                      'RT_max': 143937.36,
                                                      'RT_avg': 80601.79,
                                                      'TP50': 80000.0,
                                                      'TP99': 112000.0},
                                       'search': {'Requests': 1515,
                                                  'Fails': 0,
                                                  'RPS': 0.14,
                                                  'fail_s': 0.0,
                                                  'RT_max': 41809.84,
                                                  'RT_avg': 1633.19,
                                                  'TP50': 1200.0,
                                                  'TP99': 9400.0}}}}}
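As a quick sanity check on the Locust summary above, the sketch below copies only the `Requests` and `Fails` counters from the result dict and confirms which task owns the single failure and that the per-task counts add up to the `Aggregated` row. The helper names `failing_tasks` and `totals` are hypothetical and not part of the benchmark framework.

```python
# Per-task Locust counters copied from the result dict above.
locust = {
    "Aggregated": {"Requests": 8798, "Fails": 1},
    "hybrid_search": {"Requests": 1466, "Fails": 0},
    "query": {"Requests": 1467, "Fails": 0},
    "scene_hybrid_search_test": {"Requests": 1450, "Fails": 1},
    "scene_search_test": {"Requests": 1394, "Fails": 0},
    "scene_test": {"Requests": 1506, "Fails": 0},
    "search": {"Requests": 1515, "Fails": 0},
}


def failing_tasks(stats):
    """Return {task: fails} for every per-task entry with at least one failure."""
    return {
        name: s["Fails"]
        for name, s in stats.items()
        if name != "Aggregated" and s["Fails"] > 0
    }


def totals(stats):
    """Sum requests and fails over the per-task entries, excluding 'Aggregated'."""
    tasks = [s for name, s in stats.items() if name != "Aggregated"]
    return sum(s["Requests"] for s in tasks), sum(s["Fails"] for s in tasks)


print(failing_tasks(locust))
print(totals(locust))
```

The only failing task is `scene_hybrid_search_test`, which is the scene that builds indexes concurrently with the DQL load.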
@wangting0128 wangting0128 added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. test/benchmark benchmark test labels Oct 29, 2024
@wangting0128 wangting0128 added this to the 2.5.0 milestone Oct 29, 2024
@yanliang567
Contributor

/assign @xiaocai2333
/unassign

@yanliang567 yanliang567 added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 29, 2024
@xiaocai2333
Contributor

It seems the request timed out while writing the index meta.

@xiaocai2333
Contributor

The etcd operations are too slow.

[WARN] [etcd/etcd_kv.go:656] ["Slow etcd operation save"] ["time spent"=10.000362989s] [key=by-dev/meta/field-index/453554856282597876/453554856282679727]
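
To find every occurrence of this warning in a log file, a small parser along these lines may help. It is a minimal sketch: the pattern is inferred from the single line quoted above, so other Milvus log lines may be shaped differently, and it only handles durations printed in `s` or `ms`. The function name `parse_slow_etcd` is hypothetical.

```python
import re

# Pattern inferred from the one warning line quoted above (an assumption,
# not a documented Milvus log format).
SLOW_ETCD = re.compile(
    r'\["Slow etcd operation (?P<op>\w+)"\]\s+'
    r'\["time spent"=(?P<spent>[0-9.]+)(?P<unit>ms|s)\]\s+'
    r'\[key=(?P<key>[^\]]+)\]'
)


def parse_slow_etcd(line):
    """Return op, duration in seconds, and key for a slow-etcd warning, else None."""
    m = SLOW_ETCD.search(line)
    if m is None:
        return None
    seconds = float(m["spent"])
    if m["unit"] == "ms":
        seconds /= 1000.0
    return {"op": m["op"], "seconds": seconds, "key": m["key"]}


line = (
    '[WARN] [etcd/etcd_kv.go:656] ["Slow etcd operation save"] '
    '["time spent"=10.000362989s] '
    '[key=by-dev/meta/field-index/453554856282597876/453554856282679727]'
)
print(parse_slow_etcd(line))
```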

@xiaocai2333
Contributor

/assign @wangting0128

@wangting0128
Contributor Author

> The etcd operations are too slow.
>
> [WARN] [etcd/etcd_kv.go:656] ["Slow etcd operation save"] ["time spent"=10.000362989s] [key=by-dev/meta/field-index/453554856282597876/453554856282679727]

@LoveEachDay Please help check why etcd operations are slow on the 4am cluster machines, thanks.

@LoveEachDay
Contributor

Here is the etcd cluster event timeline:

06:39:43 etcd-1 (leader): disk overloaded, failed to send heartbeat
06:40:03 Milvus sent a put request to the etcd cluster (etcd-0)
06:40:05 etcd-0 (follower) triggered an election; etcd-1 lost leadership and etcd-2 became the new leader
etcd-2 (leader) has no logs between 06:40:06~06:40:26
06:40:13 etcd-0 (follower) reported that the put of by-dev/meta/field-index/453554856282597876/453554856282679727 timed out after 10s

This is most likely an etcd cluster issue caused by a slow disk.
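
The timeline is easier to read as offsets from the moment Milvus issued the put. The sketch below does that arithmetic on the timestamps listed above; the event labels are paraphrased and the helper name `offsets_from` is hypothetical.

```python
from datetime import datetime

# Timestamps taken from the timeline above; labels are paraphrased.
events = [
    ("06:39:43", "etcd-1 (leader) disk overloaded, heartbeat not sent"),
    ("06:40:03", "Milvus sends put request to etcd-0"),
    ("06:40:05", "etcd-0 triggers election, etcd-2 becomes leader"),
    ("06:40:13", "etcd-0 reports the put timing out"),
]


def offsets_from(events, reference):
    """Return (offset in whole seconds, label) for each event relative to `reference`."""
    ref = datetime.strptime(reference, "%H:%M:%S")
    return [
        (int((datetime.strptime(ts, "%H:%M:%S") - ref).total_seconds()), label)
        for ts, label in events
    ]


for delta, label in offsets_from(events, "06:40:03"):
    print(f"{delta:+4d}s  {label}")
```

The put was issued 20 seconds after the leader's disk became overloaded and 2 seconds before the election, and the timeout was reported 10 seconds after the request, which agrees with the `"time spent"=10.000362989s` in the warning.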

@xiaofan-luan
Collaborator

Ignore all slow-etcd issues, since there is nothing we can do about them in the short term.
