[Bug]: pymilvus.exceptions.MilvusException: <MilvusException: (code=5, message=the length (1748) of 0th string exceeds max length (1746): expected=valid length string, actual=string length exceeds max length: invalid parameter)> #27470

Closed
1 task done
parth-patel2023 opened this issue Oct 3, 2023 · 8 comments
Assignees
Labels
help wanted Extra attention is needed kind/user-doc Issues or changes related to the user document stale indicates no udpates for 30 days triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@parth-patel2023

parth-patel2023 commented Oct 3, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.2.0
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): Ubuntu
- CPU/Memory: 16 GB RAM
- GPU: No
- Others:

Current Behavior

pymilvus.exceptions.MilvusException: <MilvusException: (code=5, message=the length (1748) of 0th string exceeds max length (1746): expected=valid length string, actual=string length exceeds max length: invalid parameter)>

Expected Behavior

It should execute and add the data to the Milvus database.

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

@parth-patel2023 parth-patel2023 added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 3, 2023
@xiaofan-luan
Collaborator

You have to specify a larger max length.
Currently your VARCHAR max length is 1746, which is less than the length of the string you are inserting (1748).
I would recommend at least 4096.
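
To choose a value for max_length, it can help to measure the longest string you plan to insert first. The following is a minimal sketch, not part of pymilvus: `longest_utf8_length`, `suggest_max_length`, and the `headroom` and `floor` parameters are hypothetical names used for illustration, and it assumes the limit is counted in UTF-8 bytes.

```python
from typing import Iterable


def longest_utf8_length(texts: Iterable[str]) -> int:
    """Return the largest UTF-8 byte length among the given strings (0 if empty)."""
    return max((len(t.encode("utf-8")) for t in texts), default=0)


def suggest_max_length(texts: Iterable[str], headroom: float = 1.5, floor: int = 4096) -> int:
    """Suggest a max_length: the longest string times `headroom`, but at least `floor`."""
    longest = longest_utf8_length(texts)
    return max(floor, int(longest * headroom))


if __name__ == "__main__":
    samples = ["short text", "x" * 1748]
    print(longest_utf8_length(samples))  # 1748
    print(suggest_max_length(samples))   # 4096
```

The suggested value would then be passed as max_length on the VARCHAR field when the collection is created.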

@xiaofan-luan xiaofan-luan added help wanted Extra attention is needed and removed kind/bug Issues or changes related a bug labels Oct 3, 2023
@parth-patel2023
Author

@xiaofan-luan
Thanks for your reply.
Could you please help me? I cannot find where to set max_length.

I am using this code to store the embeddings of the text so that I can ask questions about that content:
vector_store = Milvus.from_documents(
    docs,
    embedding=embeddings,
    collection_name="document",
    text_field="document_intro",
    connection_args={"host": MILVUS_HOST, "port": MILVUS_PORT},
)

@xiaofan-luan
Collaborator

It is defined when you create the collection.
Which document did you refer to?

@yanliang567 yanliang567 added triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 7, 2023
@mmgitub

mmgitub commented Oct 18, 2023

I also got exactly this error while trying https://milvus.io/docs/integrate_with_langchain.md

Here I changed chunk_size=1024 to chunk_size=2048, but I am having the issues below:

  1. Why is it creating an empty collection like the one below, and why is it creating a collection name of its own even though I gave a proper collection_name?

image

When I go to the data preview in Attu, it gives this error: failed to search: attempt #0: fail to get shard leaders from QueryCoord: collection=445025252223672443: collection not loaded: unrecoverable error

embeddings = OpenAIEmbeddings()
vector_store = Milvus.from_documents(
    docs,
    embedding=embeddings,
    collection_name="my_collection",
    connection_args={"host": _MILVUS_HOST, "port": _MILVUS_PORT},
)

@yanliang567
Contributor

/assign @AnthonyTsu1984
@AnthonyTsu1984 is trying to fix the doc issue.
/unassign

@yanliang567 yanliang567 added the kind/user-doc Issues or changes related to the user document label Oct 19, 2023

stale bot commented Nov 18, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label Nov 18, 2023
@stale stale bot closed this as completed Nov 25, 2023
@gsantopaolo

[insert_rows], <MilvusException: (code=0, message=the length (68362) of 10250th string exceeds max length (65536): expected=valid length string, actual=string length exceeds max length: invalid parameter)>, <Time:{'RPC start': '2024-07-08 23:21:27.634777', 'RPC error': '2024-07-08 23:21:28.962498'}>

def store_chunk_list(self, chunk_list: List[ChunkedItem], collection_name: str, model_name: str,
                     model_dimension: int):
    entities = []

    connections.connect(
        alias=milvus_alias,
        host=milvus_host,
        port=milvus_port,
        user=milvus_user,
        password=milvus_pass
    )

    fields = [
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
        FieldSchema(name="document_id", dtype=DataType.INT64),
        FieldSchema(name="parent_id", dtype=DataType.INT64),
        FieldSchema(name="content", dtype=DataType.VARCHAR, max_length=65535),
        FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=model_dimension),
    ]

    schema = CollectionSchema(fields=fields, enable_dynamic_field=True)
    collection = Collection(name=collection_name, schema=schema)

    index_params = {
        "index_type": milvus_index_type,
        "metric_type": milvus_metric_type,
    }

    collection.create_index(field_name="vector", index_params=index_params)
    collection.load()

    for item in chunk_list:
        content_length = len(item.content.encode('utf-8'))
        self.logger.debug(f"Original content length: {content_length}")

        # Check if the content exceeds milvus limit
        if content_length > 65535:
            truncated_content = item.content.encode('utf-8')[:65535].decode('utf-8', 'ignore')
        else:
            truncated_content = item.content

        truncated_length = len(truncated_content.encode('utf-8'))
        self.logger.debug(f"Truncated content length: {truncated_length}")

        embedding = self.embedd(truncated_content, model_name)
        json_content = {"content": truncated_content}

        json_content_length = len(str(json_content).encode('utf-8'))
        self.logger.debug(f"JSON content length: {json_content_length}")

        entities.append({
            "document_id": item.document_id,
            "parent_id": item.parent_id,
            "content": json_content,
            "vector": embedding
        })

    collection.insert(entities)
    collection.flush()
    success = True
    self.logger.debug(f"Elements successfully inserted in collection")

pymilvus==2.4.3
milvus:v2.3.0

Any suggestions?

@yanliang567
Contributor

@gsantopaolo It looks like you defined a VARCHAR field, but you are trying to insert a JSON value into it?
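
A small standard-library sketch of why this can defeat the length check in the snippet above. It assumes the dict ends up serialized to a JSON string before the length is validated, which this thread does not confirm; `MAX_LEN` and `truncate_utf8` are illustrative names, not pymilvus APIs.

```python
import json

MAX_LEN = 65535  # max_length declared on the VARCHAR "content" field


def truncate_utf8(text: str, limit: int = MAX_LEN) -> str:
    """Cut text so its UTF-8 encoding fits in `limit` bytes, dropping any split character."""
    return text.encode("utf-8")[:limit].decode("utf-8", "ignore")


# Sample content that is longer than the limit before truncation.
content = truncate_utf8('line with "quotes"\n' * 5000)

# The plain string respects the limit after truncation.
plain_len = len(content.encode("utf-8"))

# Wrapping the string in a dict and serializing it adds the key, the braces,
# and escape characters for quotes and newlines, so the result can be longer
# than the limit even though the text itself was truncated.
wrapped_len = len(json.dumps({"content": content}).encode("utf-8"))

print(plain_len <= MAX_LEN)   # True
print(wrapped_len > MAX_LEN)  # True
```

Under that assumption, passing the truncated string itself as the "content" value, rather than the dict, keeps the inserted value at the length that was actually checked.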
