
RDS disk space issue #106

Open
kxyne opened this issue Apr 11, 2017 · 4 comments
kxyne commented Apr 11, 2017

It looks like we're short on space in the current RDS instance. Are there other DBs in it, @zelima, or do we need to reinstantiate it with more disk?

```
Traceback (most recent call last):
  File "/home/cybergreen/etl2/venv/lib/python3.4/site-packages/luigi/worker.py", line 328, in check_complete
    is_complete = task.complete()
  File "/home/cybergreen/etl2/venv/lib/python3.4/site-packages/luigi/task.py", line 795, in complete
    return all(r.complete() for r in flatten(self.requires()))
  File "/home/cybergreen/etl2/cybergreen_data_pipeline.py", line 126, in requires
    aggregator.run()
  File "/home/cybergreen/etl2/aggregator/main.py", line 74, in run
    self.aggregate()
  File "/home/cybergreen/etl2/aggregator/main.py", line 206, in aggregate
    conn.execute(query)
  File "/home/cybergreen/etl2/venv/lib/python3.4/site-packages/sqlalchemy/engine/base.py", line 939, in execute
    return self._execute_text(object, multiparams, params)
  File "/home/cybergreen/etl2/venv/lib/python3.4/site-packages/sqlalchemy/engine/base.py", line 1097, in _execute_text
    statement, parameters
  File "/home/cybergreen/etl2/venv/lib/python3.4/site-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context
    context)
  File "/home/cybergreen/etl2/venv/lib/python3.4/site-packages/sqlalchemy/engine/base.py", line 1394, in _handle_dbapi_exception
    exc_info
  File "/home/cybergreen/etl2/venv/lib/python3.4/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/home/cybergreen/etl2/venv/lib/python3.4/site-packages/sqlalchemy/util/compat.py", line 186, in reraise
    raise value.with_traceback(tb)
  File "/home/cybergreen/etl2/venv/lib/python3.4/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
    context)
  File "/home/cybergreen/etl2/venv/lib/python3.4/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.InternalError: (psycopg2.InternalError) Disk Full
DETAIL:
  -----------------------------------------------
  error:    Disk Full
  code:     1016
  context:  node: 0
  query:    882976
  location: fdisk_api.cpp:398
  process:  query0_20 [pid=16549]
  -----------------------------------------------
```

The statement that failed:

```sql
INSERT INTO count
(SELECT
    date, risk, country, asn, count(*) as count, 0 as count_amplified
FROM (
    SELECT DISTINCT (ip), date_trunc('day', date) AS date, risk, asn, country
    FROM logentry) AS foo
GROUP BY date, asn, risk, country
ORDER BY date DESC, country ASC, asn ASC, risk ASC)
```

INFO: Informed scheduler that task RedShiftAggregation__99914b932b has status UNKNOWN


zelima commented Apr 11, 2017

@kxyne This is not an RDS error but a Redshift one, and yes, we may be short on disk space there, since we currently have the default (smallest) node type:

Capacity Details
Current Node Type - dc1.large
CPU - 7 EC2 Compute Units (2 virtual cores) per node
Memory - 15 GiB per node
Storage - 160 GB SSD per node
I/O Performance - Moderate
Platform - 64-bit

This is the screenshot after today's run:

[screenshot: Redshift cluster disk space usage after today's run]
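
For what it's worth, you can also confirm how full the cluster is from SQL rather than the console. A minimal sketch, assuming the same sqlalchemy/psycopg2 stack the pipeline already uses (the connection string here is a placeholder, not our real DSN):

```python
# Rough sketch: per-node disk usage on a Redshift cluster.
# stv_partitions reports used/capacity as counts of 1 MB blocks;
# querying it requires superuser access.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql+psycopg2://user:pass@cluster-host:5439/dev")  # placeholder DSN

DISK_USAGE = text("""
    SELECT owner AS node,
           SUM(used)     AS used_mb,
           SUM(capacity) AS capacity_mb
    FROM stv_partitions
    GROUP BY owner
    ORDER BY owner
""")

with engine.connect() as conn:
    for node, used_mb, capacity_mb in conn.execute(DISK_USAGE):
        pct = 100.0 * used_mb / capacity_mb
        print("node %s: %.1f%% full (%s of %s MB)" % (node, pct, used_mb, capacity_mb))
```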


kxyne commented Apr 11, 2017

Definitely the cause; however, it seems to sit at 45% full all the time.

Is there another DB on it? I'll spin up a new Redshift cluster for this run, but we need to clean up any old datasets on it too.

[screenshot: diskfill graph showing usage holding at ~45%]
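
One way to answer the "is there another DB on it?" question before spinning anything up is to list what is actually occupying the disk. A minimal sketch, again assuming the pipeline's sqlalchemy/psycopg2 stack (placeholder connection string; note svv_table_info only covers the database you are connected to):

```python
# Rough sketch: find the largest tables in the connected Redshift database.
# svv_table_info reports size as a count of 1 MB blocks.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql+psycopg2://user:pass@cluster-host:5439/dev")  # placeholder DSN

LARGEST_TABLES = text("""
    SELECT "schema", "table", size AS size_mb, tbl_rows
    FROM svv_table_info
    ORDER BY size DESC
    LIMIT 20
""")

with engine.connect() as conn:
    for schema, table, size_mb, rows in conn.execute(LARGEST_TABLES):
        print("%s.%s: %s MB, %s rows" % (schema, table, size_mb, rows))
```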


zelima commented Apr 11, 2017

@kxyne My guess here is that the first query, which loads the scanned data into the logentry table, executed fine, and the second query ran out of space mid-execution and exited with the error above. That left the first table full. I just checked the dev database and the logentry table is indeed full.

The aggregator script drops all tables before it starts the load, and again after the aggregated data has been successfully unloaded to S3.
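
To make that ordering explicit, the lifecycle reads roughly like the sketch below; load_scan_data, aggregate, and unload_to_s3 are hypothetical stand-ins, not the aggregator's actual function names:

```python
# Hypothetical sketch of the aggregator lifecycle described above: staging
# tables are dropped before the load starts and again once the aggregated
# data has been unloaded to S3, so a run that dies in between (as happened
# here) leaves logentry occupying disk until the next run's initial drop.
TABLES = ["logentry", "count"]

def drop_tables(conn):
    for table in TABLES:
        conn.execute('DROP TABLE IF EXISTS "%s"' % table)

def run(conn):
    drop_tables(conn)      # clean slate before loading
    load_scan_data(conn)   # stand-in: COPY scanned data into logentry
    aggregate(conn)        # stand-in: the INSERT INTO count ... query that failed
    unload_to_s3(conn)     # stand-in: UNLOAD aggregated rows to S3
    drop_tables(conn)      # free the disk once the unload succeeds
```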

I dropped the logentry table manually and the used disk space went back down towards 0:

[screenshot: disk usage graph after dropping logentry]


kxyne commented Apr 11, 2017

Ah, so it's full because of the previous run. That makes sense. Sorry, a bit tired :)
