Issue in SimpleCatalog when accessing in a concurrent environment #138

Open
masterlittle opened this issue Apr 29, 2023 · 2 comments
Comments

@masterlittle

We are using the ZetaSQL Java parser behind an API on Cloud Run. Each container handles multiple requests, and the catalog is declared in global scope as a static final variable.
Each request carries a query. We scan the query for tables and add them to the catalog at runtime, so the catalog grows as more queries arrive and does not need to be prefilled.

The issue is that I'm seeing random "Table not found" errors as requests come in. Somehow the parser does not see the tables in the catalog while it parses the query. This only happens in a concurrent environment; with concurrency set to 1, everything works perfectly.

My hypothesis is that SimpleCatalog uses a HashMap to store tables and functions. Since HashMap is not thread safe, when multiple requests arrive in quick succession the catalog runs into problems or is not updated correctly.

Can someone help in finding why this might be happening?

@matthewcbrown
Collaborator

SimpleCatalog is not thread safe (HashMap is one problem, but there are others). It's really intended to be constructed once per query (or once per prepared query).
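
A minimal sketch of the once-per-query pattern, assuming each request can rebuild its own catalog from the tables its query references. `PerQueryCatalog` and `QueryHandler` are hypothetical stand-ins written for illustration; they are not the ZetaSQL API, and the column list is a placeholder.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a per-query catalog; NOT the ZetaSQL SimpleCatalog API.
// A plain HashMap is fine here because each instance is used by one thread only.
final class PerQueryCatalog {
    private final Map<String, List<String>> tables = new HashMap<>();

    void addTable(String name, List<String> columns) {
        tables.put(name, columns);
    }

    boolean hasTable(String name) {
        return tables.containsKey(name);
    }
}

final class QueryHandler {
    // Builds a fresh catalog for one request, registers the tables the query
    // references, and reports whether every one of them resolves.
    static boolean handle(List<String> referencedTables) {
        PerQueryCatalog catalog = new PerQueryCatalog(); // local, never shared
        for (String table : referencedTables) {
            catalog.addTable(table, List.of("id", "value")); // placeholder schema
        }
        // In a real service, analysis of the query would run here, against
        // this request-local catalog.
        return referencedTables.stream().allMatch(catalog::hasTable);
    }
}
```

Because the catalog never leaves the method, concurrent requests cannot observe each other's half-finished updates. The cost is rebuilding the catalog on every request.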

@masterlittle
Author

Got it. Any suggestions on how to write a thread-safe one? Or would it be too complex?
A simple fix I could do is to use ConcurrentHashMap.
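
A rough sketch of the ConcurrentHashMap idea, assuming the shared state is limited to a map of table names to column names. `SharedTableRegistry` is a hypothetical class, not part of ZetaSQL. It covers only the map storage, so it would not address the other thread-safety problems mentioned in the reply above.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical shared registry of table schemas; not part of ZetaSQL.
final class SharedTableRegistry {
    private final Map<String, List<String>> schemas = new ConcurrentHashMap<>();

    // Registers the table if it is absent and returns the stored schema.
    // computeIfAbsent is atomic per key, so concurrent callers registering
    // the same table all end up with the same stored column list.
    List<String> register(String name, List<String> columns) {
        return schemas.computeIfAbsent(name, key -> List.copyOf(columns));
    }

    boolean contains(String name) {
        return schemas.containsKey(name);
    }

    int size() {
        return schemas.size();
    }
}
```

One way to combine this with the once-per-query advice would be to keep only the schemas in the shared registry and build a fresh catalog from them for each request.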


2 participants