Investigate on different timezone returned by the connector #295
Comments
This Java test works:

```java
String localDateTime = "2007-12-03T10:15:30";
Dataset<Row> df = initTest("CREATE (p:Person {aTime: localdatetime('" + localDateTime + "')})");
Timestamp result = df.select("aTime").collectAsList().get(0).getTimestamp(0);
assertEquals(Timestamp.from(LocalDateTime.parse(localDateTime).toInstant(ZoneOffset.UTC)), result);
```

Behind the scenes we convert the DateTime to UTC both when reading from Neo4j (https://github.com/neo4j-contrib/neo4j-spark-connector/blob/4.0/common/src/main/scala/org/neo4j/spark/util/Neo4jUtil.scala#L110) and when writing to it (https://github.com/neo4j-contrib/neo4j-spark-connector/blob/4.0/common/src/main/scala/org/neo4j/spark/util/Neo4jUtil.scala#L159).

Specifying timezones in the Python test makes it green:

```python
dtString = "2015-06-24T12:50:35+00:00"
df = init_test(
    "CREATE (p:Person {datetime: datetime('" + dtString + "')})")
dt = datetime.datetime(
    2015, 6, 24, 12, 50, 35, 0, datetime.timezone.utc)
dtResult = df.select("datetime").collect()[
    0].datetime.astimezone(datetime.timezone.utc)
print(dt)
print(dtResult)
assert dt == dtResult
```

I think we should improve the documentation on how to use timezones, but for me the Spark connector works as expected.
This is a tricky one, but based on what you've said, this doesn't seem surprising to me. Whenever you use the cypher function localdatetime(), the value is computed on the Neo4j server, not in Spark. Now, what would be surprising is if you created two timestamps with the same value in this way, took the explicit step of converting both to UTC, and they still disagreed. That might be a simple test to put in place, but I bet it passes.

I think the confusion arises from expectations (maybe). You call localdatetime in Cypher from the Spark connector and it feels like you're doing this on Spark, but of course you're not: that conversion is a computation happening on another machine (Neo4j), so there's no reason to expect the two to agree. Even system clocks can be wrong. If I'm right about this, I'm not even sure what to put in the documentation. localdatetime() may function exactly as Neo4j documents it, and datetime in pyspark may function exactly as Python documents it.
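A minimal sketch of the test suggested above, reusing the `init_test` helper from the earlier comment (a test fixture from this thread, not part of the connector API), with a non-UTC offset so that both sides have to be normalised explicitly:

```python
import datetime

# Create the value on the Neo4j side with an explicit +02:00 offset.
dt_string = "2015-06-24T12:50:35+02:00"
df = init_test("CREATE (p:Person {datetime: datetime('" + dt_string + "')})")

# Normalise both the expectation and the value read back through Spark to UTC.
expected = datetime.datetime.fromisoformat(dt_string).astimezone(datetime.timezone.utc)
actual = df.select("datetime").collect()[0].datetime.astimezone(datetime.timezone.utc)

# If the connector round-trips the instant correctly, the two must agree.
assert expected == actual
```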
I think this will be fixed by #358.
@utnaf I use Neo4j server 4.1.6, Neo4j Browser 4.2.1, and neo4j-connector-apache-spark_2.12-4.0.2_for_spark_3.jar. According to the issue above, this error was fixed in version 4.0.1, but I still run into it: when I write to Neo4j, the time zone is automatically converted to GMT/UTC (2021-08-19T03:02:28.569000000Z).
@utnaf I tried neo4j-connector-apache-spark_2.12-4.0.3-pre_for_spark_3.jar, but it still does not work.
Hi @AnhQuanTran, can you give me the code you are trying to execute, the error that is happening, and your expected result?
@utnaf This is my source code: df.write.format("org.neo4j.spark.DataSource"). The datetime attribute on Neo4j shows 2021-08-19T03:02:28.569000000Z. No error is raised, but the datetime attribute is automatically converted to UTC+0; I expect it to match the Spark dataframe, where the datetime column is 2021-08-19 10:02:28.569. Where am I wrong? Thank you.
Yes, it's expected, because we internally convert Spark timestamps to UTC.
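As a worked example of that conversion, using the values reported above (the +07:00 offset is an assumption inferred from the seven-hour gap between the two timestamps, not something stated in the thread):

```python
import datetime

# The value as it appears in the Spark dataframe, interpreted in a UTC+7 session.
local_value = datetime.datetime(
    2021, 8, 19, 10, 2, 28, 569000,
    tzinfo=datetime.timezone(datetime.timedelta(hours=7)))

# Normalising to UTC gives exactly the value stored in Neo4j.
print(local_value.astimezone(datetime.timezone.utc))
# 2021-08-19 03:02:28.569000+00:00
```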
@conker84 So how can I keep the timezone in Neo4j the same as in the Spark dataframe? Can I change some option or configuration of the neo4j-spark-connector?
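One possible client-side approach, not an official connector option and not something proposed in this thread, is to convert the UTC value back to the desired zone after reading. The `labels` option and the Asia/Ho_Chi_Minh zone below are assumptions used only for illustration:

```python
from pyspark.sql import functions as F

# Read the nodes back through the connector (connection options omitted).
df = (spark.read.format("org.neo4j.spark.DataSource")
      .option("labels", "Person")
      .load())

# Re-express the UTC timestamp in the local zone as an extra column.
df_local = df.withColumn(
    "datetime_local",
    F.from_utc_timestamp(F.col("datetime"), "Asia/Ho_Chi_Minh"))
```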
Given this pyspark script
the assertion fails; the two printed dates are:
Investigate whether the two-hour difference is a Spark connector writing/reading issue or some sort of server/client misconfiguration.
/cc @conker84
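The failing script and its printed output are not included above, so here is a purely illustrative sketch of how such a gap can arise (the UTC+2 client offset is an assumption, not taken from the report): an expectation built with the same wall-clock time in a UTC+2 session points to an instant two hours earlier than the UTC instant the connector returns.

```python
import datetime

utc = datetime.timezone.utc
plus_two = datetime.timezone(datetime.timedelta(hours=2))  # assumed client offset

# The connector hands back the instant normalised to UTC ...
from_connector = datetime.datetime(2015, 6, 24, 12, 50, 35, tzinfo=utc)

# ... while the same wall-clock time interpreted in a UTC+2 zone is an
# instant two hours earlier.
expected_local = datetime.datetime(2015, 6, 24, 12, 50, 35, tzinfo=plus_two)

print(from_connector - expected_local)  # 2:00:00
```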