@Test
def testComplexReturnStatementNoValues(): Unit = {
  val df = ss.read.format(classOf[DataSource].getName)
    .option("url", SparkConnectorScalaSuiteIT.server.getBoltUrl)
    .option("query",
      """MATCH (p:Person)-[b:BOUGHT]->(pr:Product)
        |RETURN id(p) AS personId, id(pr) AS productId, {quantity: b.quantity, when: b.when} AS map, "some string" as someString, {anotherField: "201", and: 1} as map2""".stripMargin)
    .option("schema.strategy", "string")
    .load()

  // Only the column names are inspected here; no Spark action (such as count()) runs.
  assertEquals(Seq("personId", "productId", "map", "someString", "map2"), df.columns.toSeq)
}
Even though I'm 101% sure that the assertEquals is green, executing this test fails with the following timeout error:
java.lang.AssertionError: Timeout hit (30 seconds) while waiting for condition to match:
Expected: <true>
but: was <false>
Expected :<true>
Actual :<false>
Connection log is:
For test testComplexReturnStatementNoValues => connections before: 2, after: 3
Including an action in the test (like df.count()) makes the whole thing work: no more error, and the test is green.
Investigate whether a useless connection is left hanging and causes the problem, or whether it's a test configuration issue.
Within the Neo4j driver object it's possible to configure the size of the connection pool that it opens when you initialize it. If you don't configure this, I think you get something like 3-5 connections, since the driver assumes that you'll issue multiple queries and so on.
If Neo4j operations are always single-threaded within a worker node, it might make sense to explicitly configure max connections to be 1 for all driver instances.
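As a rough illustration of that suggestion, here is a minimal sketch of capping the pool at one connection. It assumes the 4.x Neo4j Java driver (package org.neo4j.driver); the object name SingleConnectionConfig is made up for this example, and this is not necessarily how the connector builds its driver.

```scala
import org.neo4j.driver.Config

// Hypothetical holder object, used only for this illustration.
object SingleConnectionConfig {
  // Limit the driver's pool to a single connection.
  // Assumption: each worker task issues its queries sequentially.
  val config: Config = Config.builder()
    .withMaxConnectionPoolSize(1)
    .build()
}
```

The resulting config would then be passed as the third argument to GraphDatabase.driver(url, authToken, config). With a pool of one, a connection that is never released would presumably show up as an acquisition timeout instead of an extra open connection.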
Related to this: I can get some very weird driver errors (not connector errors) when playing around with connection schemes.
For example, imagine any simple read query to the database, doesn't matter what.
1. Using the notebook repo, do a simple read (which uses bolt:// by default).
2. Now do the same thing, but switch the connection URL in the example to neo4j://.
3. Now do the same thing, switching back to bolt://.
The strange errors I'm seeing may be related to connection reuse in the worker node? I'm guessing. Not reporting this as a separate issue right now because I can't reliably reproduce it myself. But related to the ticket, some questions arise for me:
What should the strategy be for connection pooling within a worker node?
After some action on the worker node is complete, should the driver instance stay open? Keeping it open reduces startup time for the next operation, but what would happen if I wanted to create a new driver with different settings on top of that? How would it be handled?
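One possible answer to the last question, sketched purely as an assumption: keep drivers open in a per-worker cache keyed by their settings, so a request with different settings (for example bolt:// versus neo4j://) gets its own driver instead of reusing a stale one. DriverSettings, FakeDriver, and DriverCache are hypothetical names; FakeDriver stands in for the real driver so the sketch runs without a database.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.function.{Function => JFunction}

// Hypothetical key: the options that distinguish one driver from another.
final case class DriverSettings(url: String, maxPoolSize: Int)

// Stand-in for a real driver, so the sketch needs no database.
final class FakeDriver(val settings: DriverSettings) extends AutoCloseable {
  @volatile var closed: Boolean = false
  override def close(): Unit = { closed = true }
}

// Keeps at most one open driver per distinct settings value.
final class DriverCache(create: DriverSettings => FakeDriver) {
  private val drivers = new ConcurrentHashMap[DriverSettings, FakeDriver]()

  // Reuse the driver for these settings, creating it on first use.
  def getOrCreate(settings: DriverSettings): FakeDriver =
    drivers.computeIfAbsent(settings, new JFunction[DriverSettings, FakeDriver] {
      override def apply(s: DriverSettings): FakeDriver = create(s)
    })

  // Close and forget the driver for these settings, if one exists.
  def close(settings: DriverSettings): Unit = {
    val driver = drivers.remove(settings)
    if (driver != null) driver.close()
  }

  def size: Int = drivers.size()
}
```

Under this design, switching from bolt:// to neo4j:// and back would reuse the first bolt:// driver on the third step instead of opening another one, and closing is explicit per settings value. Whether idle drivers should also be closed after a timeout is left open here.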