[BUG] BWC failure on backward compatibility test #3184

Open
andy-k-improving opened this issue Dec 3, 2024 · 4 comments
Labels
backwards-compatibility, bug (Something isn't working), testing (Related to improving software testing)

Comments

@andy-k-improving
Contributor

andy-k-improving commented Dec 3, 2024

What is the bug?

This is a follow-up ticket to #3168, gathering all progress and information related to the backward compatibility test failure in the CI pipeline.

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Trigger "Run backward compatibility tests".

What is the expected behavior?
The backward compatibility test should pass on the main branch on the GitHub CI runner.

What is your host/environment?
N/A

Do you have any screenshots?
https://github.com/opensearch-project/sql/actions/runs/12144751620/job/33864844886


Do you have any additional context?
The test runs against the opensearch-min distribution, which is available only for Linux (not Mac or Windows).
Contributors are encouraged to use an Ubuntu machine or an equivalent virtual environment.

@RyanL1997

Hi @andy-k-improving, I took a look at the logs in https://github.com/opensearch-project/sql/actions/runs/11621890097/job/32366453716 and https://github.com/opensearch-project/sql/actions/runs/12144751620/job/33864844886. The BWC test is failing because the cluster cannot reach a stable state (cluster health does not reach "yellow" within 40 seconds):

cluster-manager not discovered or elected yet, an election requires at least 2 nodes

The cluster is unable to elect a cluster manager due to a lack of quorum or communication issues between nodes.
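To make the quorum point concrete, here is a minimal sketch of the majority rule behind that log message. It assumes elections need a strict majority of the voting (cluster-manager-eligible) nodes; the function names are hypothetical and are not part of OpenSearch or the test code.

```python
def election_quorum(voting_nodes: int) -> int:
    """Smallest strict majority of the voting (cluster-manager-eligible) nodes."""
    if voting_nodes < 1:
        raise ValueError("need at least one voting node")
    return voting_nodes // 2 + 1


def can_elect(voting_nodes: int, reachable_nodes: int) -> bool:
    """True when enough voting nodes can see each other to win an election."""
    return reachable_nodes >= election_quorum(voting_nodes)


if __name__ == "__main__":
    # Assumed example: a 3-node test cluster where only one node is up or reachable.
    print(election_quorum(3), can_elect(3, 1), can_elect(3, 2))
```

Under this assumption, "requires at least 2 nodes" is consistent with a cluster of two or three voting nodes in which a node cannot discover any of its peers.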

Right now, I'm trying to reproduce this on my own fork to see whether this is a flaky failure caused by an unstable connection on our GitHub runner, or whether we need to extend the timeout for the cluster health wait.
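For anyone probing this by hand, the sketch below polls the cluster health API with the `wait_for_status` and `timeout` query parameters and retries a few times, which may help separate a slow start from a cluster that never forms. This is not how the BWC test itself waits. The base URL `http://localhost:9200`, the retry count, and all function names are assumptions for illustration.

```python
import json
import time
import urllib.request
from urllib.parse import urlencode

# Health levels ordered from worst to best.
_RANK = {"red": 0, "yellow": 1, "green": 2}


def health_url(base_url: str, wait_for: str = "yellow", timeout_s: int = 40) -> str:
    """Build a _cluster/health URL that blocks server-side until the status is reached."""
    query = urlencode({"wait_for_status": wait_for, "timeout": f"{timeout_s}s"})
    return f"{base_url.rstrip('/')}/_cluster/health?{query}"


def fetch_health(url: str, timeout_s: int) -> dict:
    """Fetch and decode the health response; allow slack over the server-side timeout."""
    with urllib.request.urlopen(url, timeout=timeout_s + 5) as resp:
        return json.load(resp)


def wait_for_cluster(base_url, wait_for="yellow", timeout_s=40, attempts=3,
                     fetch=fetch_health, sleep=time.sleep) -> bool:
    """Return True once the cluster reports at least `wait_for`, else False."""
    url = health_url(base_url, wait_for, timeout_s)
    for attempt in range(attempts):
        try:
            body = fetch(url, timeout_s)
        except OSError:
            # Connection refused, HTTP errors, and socket timeouts all land here.
            body = {}
        status = body.get("status")
        reached = status in _RANK and _RANK[status] >= _RANK[wait_for]
        if reached and not body.get("timed_out", False):
            return True
        if attempt < attempts - 1:
            sleep(1)
    return False


if __name__ == "__main__":
    # Assumed local endpoint; adjust to wherever the test cluster listens.
    print(wait_for_cluster("http://localhost:9200", timeout_s=40))
```

If the call only succeeds with a larger `timeout_s`, that points toward extending the wait; if it never succeeds, the election problem is more likely the root cause.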

@Swiddis
Collaborator

Swiddis commented Dec 19, 2024

Possibly related to the OpenSearch Lucene updates in opensearch-project/OpenSearch#15333. This k-NN PR might also be relevant: opensearch-project/k-NN#2195. (Copied from #3089)

@Swiddis added the testing label (Related to improving software testing) on Dec 19, 2024
@andy-k-improving
Contributor Author

@Swiddis Would you mind elaborating on why the Lucene change is related to the BWC failure?
I looked at the log, and the failure seems to be in inter-node communication within the cluster rather than in anything search-related.
I would expect a Lucene failure to affect search instead?

@andy-k-improving
Contributor Author

Locally, I tried bumping the base version to v2.18, but got the same error.
