
Read Data after enabling TTL on container level #457

Open
kiranvsk1 opened this issue Jul 19, 2021 · 5 comments

Comments

@kiranvsk1

kiranvsk1 commented Jul 19, 2021

Hi Team,

I am trying to read data from a Cosmos DB (SQL API) container using the custom query option, and it results in errors.

Setup of the container - Enable TTL with a default value of 1 week (7\*24\*60\*60 seconds)
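For reference, the one-week default TTL works out as follows (pure arithmetic, no Cosmos DB call; the variable names are only illustrative):

```python
# One week expressed in seconds, as used for the container's default TTL.
SECONDS_PER_MINUTE = 60
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24
DAYS_PER_WEEK = 7

default_ttl_seconds = (
    DAYS_PER_WEEK * HOURS_PER_DAY * MINUTES_PER_HOUR * SECONDS_PER_MINUTE
)
print(default_ttl_seconds)  # 604800
```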

What works -

  1. Can write data to the container using the Azure Cosmos DB Spark connector
  2. Able to read stats for the container in the portal (count(1), etc.)

What does not work-

  1. Reads from the container using the Spark connector fail with a 500 error.

But if I remove the TTL setting on the container, I am able to read data using the Azure Cosmos DB Spark connector.

Is this expected behavior with TTL turned on?
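For context, the failing read looks roughly like the following sketch. The endpoint, key, database, container, and query values are placeholders (not the reporter's actual settings), and `spark` is assumed to be an existing SparkSession with the azure-cosmos-spark connector on the classpath:

```python
# Connector read options for azure-cosmos-spark (OLTP path).
# All values below are placeholders for illustration only.
read_config = {
    "spark.cosmos.accountEndpoint": "https://<account>.documents.azure.com:443/",
    "spark.cosmos.accountKey": "<account-key>",
    "spark.cosmos.database": "<database>",
    "spark.cosmos.container": "<container>",
    # The "custom query" option mentioned in the report:
    "spark.cosmos.read.customQuery": "SELECT * FROM c",
}

# With a live SparkSession this would be:
# df = spark.read.format("cosmos.oltp").options(**read_config).load()
# df.show()
print(sorted(read_config))
```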

@sajins2005

I am facing the same issue. It looks like the Cosmos OLTP connector does not work with TTL. I hit it both with TTL set to "On (no default)" and with TTL on and a value in seconds set in the edit field.

@FabianMeiswinkel
Member

Hi, can you tell us which version of the Spark connector you are using? It would also be great to see the error details (error message with call stack) for the failure.

Thanks,
Fabian

@sajins2005

sajins2005 commented Jul 22, 2021

@FabianMeiswinkel ,
I am using com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.2.0. I faced the same issue with com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.1.0 as well. The Azure Databricks runtime is 7.3 LTS (includes Apache Spark 3.0.1, Scala 2.12).

Stack trace
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, 10.139.64.4, executor 0): {"ClassName":"InternalServerErrorException","userAgent":"azsdk-java-cosmos/4.17.0-beta.1 Linux/5.4.0-1051-azure JRE/1.8.0_282","statusCode":500,"resourceAddress":"rntbd://cdb-ms-prod-eastus1-fd32.documents.azure.com:14054/apps/274509a2-d536-4a09-b0a3-f4fd526feb25/services/57940846-7939-4603-bc4e-e0297e4bd3b6/partitions/6c48721d-54ff-4bcb-8827-26bfca38bfe5/replicas/132707874448886293s/","error":"{"Errors":["An unknown error occurred while processing this request. If the issue persists, please contact Azure Support: http://aka.ms/azure-support"]}","innerErrorMessage":"["An unknown error occurred while processing this request. If the issue persists, please contact Azure Support: http://aka.ms/azure-support"]","causeInfo":null,"responseHeaders":"{x-ms-last-state-change-utc=Thu, 15 Jul 2021 01:51:45.529 GMT, x-ms-request-duration-ms=1.523, x-ms-session-token=0:-1#2302125, lsn=2302125, x-ms-request-charge=1.00, x-ms-schemaversion=1.12, x-ms-transport-request-id=4, x-ms-number-of-read-regions=0, x-ms-activity-id=dc18ad1d-eaef-11eb-ae6b-a915eade79fd, x-ms-xp-role=1, x-ms-global-Committed-lsn=2302124, x-ms-cosmos-llsn=2302125, x-ms-serviceversion= version=2.14.0.0}","cosmosDiagnostics":{"userAgent":"azsdk-java-cosmos/4.17.0-beta.1 Linux/5.4.0-1051-azure 
JRE/1.8.0_282","requestLatencyInMs":7,"requestStartTimeUTC":"2021-07-22T13:22:27.562Z","requestEndTimeUTC":"2021-07-22T13:22:27.569Z","responseStatisticsList":[{"storeResult":{"storePhysicalAddress":"rntbd://cdb-ms-prod-eastus1-fd32.documents.azure.com:14054/apps/274509a2-d536-4a09-b0a3-f4fd526feb25/services/57940846-7939-4603-bc4e-e0297e4bd3b6/partitions/6c48721d-54ff-4bcb-8827-26bfca38bfe5/replicas/132707874448886293s/","lsn":2302125,"globalCommittedLsn":2302124,"partitionKeyRangeId":"0","isValid":true,"statusCode":500,"subStatusCode":0,"isGone":false,"isNotFound":false,"isInvalidPartition":false,"isThroughputControlRequestRateTooLarge":false,"requestCharge":1.0,"itemLSN":-1,"sessionToken":"-1#2302125","backendLatencyInMs":1.523,"exception":"["An unknown error occurred while processing this request. If the issue persists, please contact Azure Support: http://aka.ms/azure-support"]","transportRequestTimeline":[{"eventName":"created","startTimeUTC":"2021-07-22T13:22:27.563Z","durationInMicroSec":0},{"eventName":"queued","startTimeUTC":"2021-07-22T13:22:27.563Z","durationInMicroSec":0},{"eventName":"channelAcquisitionStarted","startTimeUTC":"2021-07-22T13:22:27.563Z","durationInMicroSec":1000},{"eventName":"pipelined","startTimeUTC":"2021-07-22T13:22:27.564Z","durationInMicroSec":1000},{"eventName":"transitTime","startTimeUTC":"2021-07-22T13:22:27.565Z","durationInMicroSec":4000},{"eventName":"received","startTimeUTC":"2021-07-22T13:22:27.569Z","durationInMicroSec":0},{"eventName":"completed","startTimeUTC":"2021-07-22T13:22:27.569Z","durationInMicroSec":0}],"rntbdRequestLengthInBytes":498,"rntbdResponseLengthInBytes":326,"requestPayloadLengthInBytes":55,"responsePayloadLengthInBytes":null,"channelTaskQueueSize":1,"pendingRequestsCount":1,"serviceEndpointStatistics":{"availableChannels":1,"acquiredChannels":0,"executorTaskQueueSize":0,"inflightRequests":1,"lastSuccessfulRequestTime":"2021-07-22T13:22:26.781Z","lastRequestTime":"2021-07-22T13:22:27.411Z","createdTime"
:"2021-07-22T13:22:26.765Z","isClosed":false}},"requestResponseTimeUTC":"2021-07-22T13:22:27.569Z","requestResourceType":"Document","requestOperationType":"Query"}],"supplementalResponseStatisticsList":[],"addressResolutionStatistics":{},"regionsContacted":["[REDACTED]"],"retryContext":{"statusAndSubStatusCodes":null,"retryCount":0,"retryLatency":0},"metadataDiagnosticsContext":{"metadataDiagnosticList":null},"serializationDiagnosticsContext":{"serializationDiagnosticsList":null},"gatewayStatistics":null,"systemInformation":{"usedMemory":"202493 KB","availableMemory":"2670339 KB","systemCpuLoad":"empty","availableProcessors":4},"clientCfgs":{"id":0,"connectionMode":"DIRECT","numberOfClients":1,"connCfg":{"rntbd":"(cto:PT5S, rto:PT5S, icto:PT0S, ieto:PT1H, mcpe:130, mrpc:30, cer:false)","gw":"(cps:1000, rto:PT5S, icto:null, p:false)","other":"(ed: true, cs: false)"},"consistencyCfg":"(consistency: Eventual, mm: true, prgns: [])"}}}
at azure_cosmos_spark.com.azure.cosmos.implementation.directconnectivity.rntbd.RntbdRequestManager.messageReceived(RntbdRequestManager.java:807)
at azure_cosmos_spark.com.azure.cosmos.implementation.directconnectivity.rntbd.RntbdRequestManager.channelRead(RntbdRequestManager.java:181)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at azure_cosmos_spark.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
at azure_cosmos_spark.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at azure_cosmos_spark.io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
at azure_cosmos_spark.io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:253)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at azure_cosmos_spark.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at azure_cosmos_spark.io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1368)
at azure_cosmos_spark.io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1234)
at azure_cosmos_spark.io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1280)
at azure_cosmos_spark.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507)
at azure_cosmos_spark.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446)
at azure_cosmos_spark.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
at azure_cosmos_spark.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
at azure_cosmos_spark.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
at azure_cosmos_spark.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at azure_cosmos_spark.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
at azure_cosmos_spark.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
at azure_cosmos_spark.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
at azure_cosmos_spark.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
at azure_cosmos_spark.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at azure_cosmos_spark.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at azure_cosmos_spark.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at azure_cosmos_spark.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2519)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2466)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2460)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2460)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1152)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1152)
at scala.Option.foreach(Option.scala:407)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1152)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2721)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2668)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2656)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:938)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2339)
at org.apache.spark.sql.execution.collect.Collector.runSparkJobs(Collector.scala:298)
at org.apache.spark.sql.execution.collect.Collector.collect(Collector.scala:308)
at org.apache.spark.sql.execution.collect.Collector$.collect(Collector.scala:82)
at org.apache.spark.sql.execution.collect.Collector$.collect(Collector.scala:88)
at org.apache.spark.sql.execution.ResultCacheManager.getOrComputeResult(ResultCacheManager.scala:508)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollectResult(limit.scala:58)
at org.apache.spark.sql.Dataset.collectResult(Dataset.scala:2994)
at org.apache.spark.sql.Dataset.$anonfun$collectResult$1(Dataset.scala:2985)
at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3709)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$5(SQLExecution.scala:116)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:249)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:101)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:845)
at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:77)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:199)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3707)
at org.apache.spark.sql.Dataset.collectResult(Dataset.scala:2984)
at com.databricks.backend.daemon.driver.OutputAggregator$.withOutputAggregation0(OutputAggregator.scala:194)
at com.databricks.backend.daemon.driver.OutputAggregator$.withOutputAggregation(OutputAggregator.scala:57)
at com.databricks.backend.daemon.driver.PythonDriverLocal.generateTableResult(PythonDriverLocal.scala:1157)
at com.databricks.backend.daemon.driver.PythonDriverLocal.$anonfun$getResultBufferInternal$1(PythonDriverLocal.scala:1069)
at com.databricks.backend.daemon.driver.PythonDriverLocal.withInterpLock(PythonDriverLocal.scala:856)
at com.databricks.backend.daemon.driver.PythonDriverLocal.getResultBufferInternal(PythonDriverLocal.scala:938)
at com.databricks.backend.daemon.driver.DriverLocal.getResultBuffer(DriverLocal.scala:538)
at com.databricks.backend.daemon.driver.PythonDriverLocal.outputSuccess(PythonDriverLocal.scala:898)
at com.databricks.backend.daemon.driver.PythonDriverLocal.$anonfun$repl$8(PythonDriverLocal.scala:383)
at com.databricks.backend.daemon.driver.PythonDriverLocal.withInterpLock(PythonDriverLocal.scala:856)
at com.databricks.backend.daemon.driver.PythonDriverLocal.repl(PythonDriverLocal.scala:370)
at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$10(DriverLocal.scala:431)
at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:239)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:234)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:231)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:48)
at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:276)
at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:269)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:48)
at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:408)
at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:653)
at scala.util.Try$.apply(Try.scala:213)
at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:645)
at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:486)
at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:598)
at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:391)
at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337)
at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219)
at java.lang.Thread.run(Thread.java:748)

@samuelramos

samuelramos commented Jul 27, 2021

Hi,
I'm facing exactly the same error here.

I am using:

Databricks Runtime Version: 8.3 (Apache Spark 3.1.1, Scala 2.12) and/or 8.4 (Apache Spark 3.1.2, Scala 2.12)
Cosmos DB Spark Connector: com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.2.0

Stacktrace attached.
stacktrace.txt

Thanks,
SR

@sajins2005

@FabianMeiswinkel
Is there any update on this issue?

Regards,
Sajin
