As discussed on Discord, some community members, including me, have been facing inconsistent timeouts and errors during the snapshot expiry process.
There appears to be a bug in Athena, and I have raised a support case for it. In parallel, to work around the timeouts, I experimented with reducing both the schedule interval and the Step Functions timeout from 1 hour to 30 minutes, and it has worked well for me. With this setting, expiry runs for about 16 minutes on average before failing with ICEBERG_VACUUM_MORE_RUNS_NEEDED, and the subsequent query then completes successfully in about 30 seconds on average for my volume of logs.
It would be helpful if the schedule and the Step Functions timeout were exposed in the config, so that users can find the sweet spot where expiry works as expected for the volume of logs they ingest. This would also help in managing the Athena issue until it is resolved.
@rams3sh I am facing the same issue, but I did not follow what you are suggesting. Can you please explain how to avoid the timeout errors when running VACUUM from the Step Functions state machine? Please post a code snippet showing how to add the timeout variable.
@B161851 The timeouts are caused by an inherent issue in Athena.
As a workaround, I manually updated the EventBridge schedule that runs the VACUUM command, since there is no parameter in the matano config to change it from the CLI. Running VACUUM more frequently means less data accumulates for cleanup between runs. Please note that this is only a temporary fix. A permanent fix can only come from the Athena team at AWS.
As you add more data sources, you may start to see timeouts again even with the shorter interval.
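Since a snippet was requested, here is a minimal sketch of the manual workaround using boto3. It is not part of matano. The rule name and state machine ARN are placeholders that you would need to look up in your own account, and the sketch assumes the timeout is the top-level `TimeoutSeconds` field of the state machine definition (it may instead be set on an individual state in your deployment). `PutRule` replaces the whole rule, so the existing settings are read first and carried over. Manual changes like these may be reverted by a later deploy.

```python
import json


def rate_expression(minutes: int) -> str:
    """Build an EventBridge rate() expression, e.g. 30 -> 'rate(30 minutes)'."""
    if minutes < 1:
        raise ValueError("minutes must be >= 1")
    unit = "minute" if minutes == 1 else "minutes"
    return f"rate({minutes} {unit})"


def set_rule_schedule(events, rule_name: str, minutes: int) -> str:
    """Change the schedule of an existing EventBridge rule."""
    current = events.describe_rule(Name=rule_name)
    # PutRule replaces the rule, so carry over the fields we are not changing.
    keep = ("Description", "State", "RoleArn", "EventBusName")
    params = {key: current[key] for key in keep if key in current}
    params["Name"] = rule_name
    params["ScheduleExpression"] = rate_expression(minutes)
    events.put_rule(**params)
    return params["ScheduleExpression"]


def set_state_machine_timeout(sfn, state_machine_arn: str, minutes: int) -> int:
    """Set the top-level TimeoutSeconds of a Step Functions state machine."""
    current = sfn.describe_state_machine(stateMachineArn=state_machine_arn)
    definition = json.loads(current["definition"])
    # Assumption: the timeout lives at the top level of the definition.
    definition["TimeoutSeconds"] = minutes * 60
    sfn.update_state_machine(
        stateMachineArn=state_machine_arn,
        definition=json.dumps(definition),
    )
    return definition["TimeoutSeconds"]


if __name__ == "__main__":
    import boto3

    # Placeholders: replace with the rule and state machine from your account.
    RULE_NAME = "my-snapshot-expiry-rule"
    STATE_MACHINE_ARN = "arn:aws:states:REGION:ACCOUNT_ID:stateMachine:my-expiry"

    print(set_rule_schedule(boto3.client("events"), RULE_NAME, 30))
    print(set_state_machine_timeout(boto3.client("stepfunctions"), STATE_MACHINE_ARN, 30))
```

Both helpers take the client as an argument, so you can try them against a single rule first and adjust the interval until expiry completes reliably for your log volume.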