Use pool schema/catalog for original values, lazy initialise originalSchema if required #105

rbygrave · 2024-10-30T09:17:50Z

Make originalSchema and originalCatalog non final and default from the pool configuration
Initialise originalSchema and originalCatalog lazily if required and when they are not set from the pool configuration

This improves the performance of initialising connections for the cases where schema and catalog are not changed, by skipping the reading of the schema and catalog from the connection [which is frequently a query/network call].

@rPraml, can I get you to review this.

Note: The currentSchema is null when using the originalSchema, I think this is ok (with key for cached PreparedStatements etc).

…Schema if required

rPraml

Hello Rob,

I think there could be a problematic edge case, when the default schema or catalog is null. Please check my comments.

The rest looks good to me.

Roland

/edit:

Note: The currentSchema is null when using the originalSchema, I think this is ok (with key for cached PreparedStatements etc).

Important information: At least for SqlServer (maybe others too), you must not use a prepared statement accross schemas / catalogs. (Therefore, the schema and catalaog must be taken into account in the cache key, which was already fixed in last PR)

ebean-datasource/src/main/java/io/ebean/datasource/pool/PooledConnection.java

…a/catalog case

rPraml

Important information: At least for SqlServer (maybe others too), you must not use a prepared statement accross schemas / catalogs. (Therefore, the schema and catalaog must be taken into account in the cache key, which was already fixed in last PR)

sorry I didn't think of that yesterday

String key = sql + ':' + currentSchema + ':' + currentCatalog + ':' + returnKeysFlag;

I think, that we can get a (theoretical?) problem here.

Imagine the connection is "fresh", and the connection-string defines something like schema=S1

So the current schema of the connection would be S1, but currentSchema is still null. So we prepare a statement for S1, but the cache key contains null

If we then invoke setSchema(null) the connection is in schema-less state. Preparing the same statement would retrieve the one from above, which is prepared for S1

I know this is probably (!) a case that will not arise in practice, but we should still think about it.

What do you think, if we use an int schemaState and int catalogState instead of booleans?

0 = UNKNOWN - so we do not know the current schema and catalog. We do not need to add it in cache key computation (additional performance benefit) and we do not need to reset ist.
1 = CHANGED - when setSchema/setCatalog is invoked and we come from UNKNOWN state, we read the state once, call setSchema/setCatalog on the delegate and go to CHANGED state.
2 = KNOWN - when we close the connection and come from the CHANGED state, we reset it and go to KNOWN state. From then on we only switch between KNOWN and CHANGED

It might happen, that we might cache the same statement for UNKNONW and KNOWN state twice (or even triple, when you call setSchema("S1") and connection is already in that state) - but then, there is no risk, as described above. (And I prefer to waste some cache memory rather than mixing the schemes)

Don't get me wrong, I think that there is no practical use case (at the moment) and the code will work as it is. But I believe it is our mutual quality standard to be 100.0% correct here 😉

ebean-datasource/src/main/java/io/ebean/datasource/pool/PooledConnection.java

@default

… null). The problematic case that this change addresses is: Imagine the connection is "fresh", and the connection-string defines something like schema=S1 So the current schema of the connection would be S1, but currentSchema is still null. So we prepare a statement for S1, but the cache key contains null. If we then invoke setSchema(null) the connection is in schema-less state. Preparing the same statement would retrieve the one from above, which is prepared for S1. --- The FIX for this case, use "@default" instead of null, such that the cache key isn't problematic for the above case.

rPraml · 2024-11-05T11:01:33Z

Hello @rbygrave
I've reviewed this PR. See #106

Problems I've identified:

It seems that DataSourceConfig.catalog() is never applied. Please check #106 (comment)

I think, we must not reset the currentSchema to null, as it may be used as cache key then... Please check #106 (comment)

These two should be fixed here.

Interesting thing:

setSchema/getSchema is not always symmetric.
#106 (comment)
(So we cannot use asserts)

Play it safe

Do you think it is a good idea, to enforce schema/catalog, when taking a connection from the pool?
#106 (comment)
I know, this may send 1-2 commands to db, which increases latency

Feature/change schema reset review

The resetForUse() is called "on borrow" just before the connection is returned to the application. As such we can't do anything that could throw any Exception here. All connection reset actions must be performed when the connection is returned to the pool.

rbygrave · 2024-11-05T19:02:07Z

Do you think it is a good idea, to enforce schema/catalog, when taking a connection from the pool?

No I don't think we should rely on performing those tasks at "connection borrow time" just before the connection is returned to the application. This is because when anything fails here it's problematic. Instead, the design is that all "reset tasks" are performed when the connection is returned and if for any reason that fails then the connection doesn't actually make it back into the pool [its "force closed" and not added back into the pool].

rPraml

Looks good now, but maybe you can change the order of restoring schema/catalog in close() to be on the safe side.

Thank you for your work.

rPraml · 2024-11-06T06:38:21Z

ebean-datasource/src/main/java/io/ebean/datasource/pool/ConnectionPool.java

@@ -388,6 +398,9 @@ private Connection initConnection(Connection conn) throws SQLException {
    if (readOnly) {
      conn.setReadOnly(true);
    }
+    if (catalog != null) {
+      conn.setCatalog(catalog);
+    }


Every day I think of something new. 🤣 Do you think the order is important? I could imagine that you have to change the catalog first before you can change the schema, because the schema does not exist in the current catalog, for example. We did it right here. But not in close(). From my gut feeling, I would say change the catalog first, then the schema

rPraml · 2024-11-06T06:42:04Z

ebean-datasource/src/main/java/io/ebean/datasource/pool/PooledConnection.java

+      schemaState = SCHEMA_CATALOG_KNOWN;
+      this.cacheKeySchema = originalSchema;
+      // if schema & catalog is defined, we can be sure, that connection is initialized properly
+      assert originalSchema.equals(connection.getSchema()) : "Connection is in the wrong schema: " + connection.getSchema() + ", expected: " + originalSchema;


What do you think about these asserts? On the one hand, I think they are good because you can see, for example, if you specify a non-existent schema in the PoolConfiguration (at least pg returns null in that case). On the other hand, they could cause problems if the schema is returned in uppercase, for example.
They definitely help to find potential programming errors.

I would keep them in until someone provides a use case. (and the user can disable assertions anyway)

What do you think about these asserts?

I've honestly changed my mind a couple of times on these asserts. On one hand they look useful, but on the other they can throw SQLException and fail on case but only with asserts enabled, and if the driver allows setting unknown schema etc maybe its best to just allow that.

So I'm thinking of removing them. That really means that we are leaving it up to developers to set original schema and catalog appropriately and to test that themselves as needed.

rPraml · 2024-11-06T06:56:04Z

ebean-datasource/src/main/java/io/ebean/datasource/pool/PooledConnection.java

      }

-      if (resetCatalog) {
+      if (catalogState == SCHEMA_CATALOG_CHANGED) {
        connection.setCatalog(originalCatalog);


Important: We should switch the order here. First restore catalog, then restore schema. See other comment

By the way, I can imagine an edge case here again:

Postgres fails silently, if I call setSchema("doesnotexist") and getSchema() returns null

So if you set up your pool, that

has originalSchema = "TENANT_MASTER"

and a thread takes a connection from the pool and changes the schema to "TENANT_123"

the thread does some long running tasks...

in the meantime, the DBA deletes "TENANT_MASTER" (yes, a stupid idea)

the thread above puts back the connection to the pool

we try to switch back to originalSchema, but PG does not fail in this case

in what state is that connection? 😉

Still in "TEANT_123"

or in no man's land? (at least getSchema() would return null for Postgres)

is there a danger, that the next one, that pulls the connection will start in "TENANT_123"

I think we would be overdoing it here if we also covered this special case, because it is really a stupid idea to delete the main schema

First restore catalog, then restore schema

Yes, I'm good with that. I'll make that change.

I think we would be overdoing it here if we also covered this special case

I agree. We can leave that sort of case with developers to own.

…hema/catalog

rbygrave added 2 commits October 30, 2024 22:13

Use pool schema/catalog for original values, lazy initialise original…

74b6d65

…Schema if required

Reset the currentSchema and currentCatalog back to null on reset

f9466b1

rPraml suggested changes Oct 30, 2024

View reviewed changes

ebean-datasource/src/main/java/io/ebean/datasource/pool/PooledConnection.java Outdated Show resolved Hide resolved

ebean-datasource/src/main/java/io/ebean/datasource/pool/PooledConnection.java Show resolved Hide resolved

Add initialisedSchema and initialisedCatalog to handle the null schem…

b3077ee

…a/catalog case

rPraml reviewed Oct 31, 2024

View reviewed changes

ebean-datasource/src/main/java/io/ebean/datasource/pool/PooledConnection.java Outdated Show resolved Hide resolved

rbygrave and others added 3 commits November 1, 2024 16:11

Fix: Missing catalog initialization

b427700

make suggestion with state variables

921b428

rPraml mentioned this pull request Nov 5, 2024

Feature/change schema reset review #106

Merged

rbygrave added 2 commits November 6, 2024 07:34

Merge pull request #106 from FOCONIS/feature/change-schema-reset-review

b053076

Feature/change schema reset review

rPraml approved these changes Nov 6, 2024

View reviewed changes

Reset catalog before schema and remove asserts on setting original sc…

a8d736f

…hema/catalog

rbygrave assigned rbygrave and rPraml Nov 7, 2024

rbygrave added this to the 9.2 milestone Nov 7, 2024

rbygrave merged commit 825b0bc into master Nov 7, 2024
2 checks passed

rbygrave deleted the feature/change-schema-reset branch November 7, 2024 08:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use pool schema/catalog for original values, lazy initialise originalSchema if required #105

Use pool schema/catalog for original values, lazy initialise originalSchema if required #105

rbygrave commented Oct 30, 2024 •

edited

Loading

rPraml left a comment •

edited

Loading

rPraml left a comment

rPraml commented Nov 5, 2024 •

edited

Loading

rbygrave commented Nov 5, 2024

rPraml left a comment

rPraml Nov 6, 2024

rPraml Nov 6, 2024

rbygrave Nov 7, 2024

rPraml Nov 6, 2024

rbygrave Nov 7, 2024

Use pool schema/catalog for original values, lazy initialise originalSchema if required #105

Use pool schema/catalog for original values, lazy initialise originalSchema if required #105

Conversation

rbygrave commented Oct 30, 2024 • edited Loading

rPraml left a comment • edited Loading

Choose a reason for hiding this comment

rPraml left a comment

Choose a reason for hiding this comment

rPraml commented Nov 5, 2024 • edited Loading

Problems I've identified:

Interesting thing:

Play it safe

rbygrave commented Nov 5, 2024

rPraml left a comment

Choose a reason for hiding this comment

rPraml Nov 6, 2024

Choose a reason for hiding this comment

rPraml Nov 6, 2024

Choose a reason for hiding this comment

rbygrave Nov 7, 2024

Choose a reason for hiding this comment

rPraml Nov 6, 2024

Choose a reason for hiding this comment

rbygrave Nov 7, 2024

Choose a reason for hiding this comment

rbygrave commented Oct 30, 2024 •

edited

Loading

rPraml left a comment •

edited

Loading

rPraml commented Nov 5, 2024 •

edited

Loading