RAM fills up and UPDATE statements fail #5

Open
avpco opened this issue Dec 26, 2022 · 10 comments

avpco commented Dec 26, 2022

Hello.
I tested the new TOAST mechanism (https://github.com/postgrespro/postgres/tree/jsonb_toaster) for the jsonb data type. Version: 15.1 (Debian 15.1-1.pgdg110+1), server 15beta1. The test table has 6 columns of type jsonb.
I was inserting a large amount of data (over 1 million rows) and noticed that the session eats up all available RAM (observed in the Zabbix monitoring system).

Also, each time I run an UPDATE statement I get the following error:

WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
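
For context, here is a minimal sketch of the kind of workload described above. The table, column names, and document shapes are hypothetical (the actual DDL and queries are attached later in this thread), and it assumes the jsonb_toaster contrib module from the branch above is built and attached to the columns as described in its documentation:

    -- Hypothetical schema: six jsonb columns (the real DDL is in dayInfos_ddl.sql.zip).
    CREATE TABLE day_infos (
        id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        j1 jsonb, j2 jsonb, j3 jsonb, j4 jsonb, j5 jsonb, j6 jsonb
    );

    -- Bulk insert of over 1 million rows of moderately large documents.
    INSERT INTO day_infos (j1, j2, j3, j4, j5, j6)
    SELECT jsonb_build_object('n', g, 'payload', repeat('x', 2000)),
           jsonb_build_object('n', g), jsonb_build_object('n', g),
           jsonb_build_object('n', g), jsonb_build_object('n', g),
           jsonb_build_object('n', g)
    FROM generate_series(1, 1000000) AS g;

    -- An UPDATE of the kind that triggers the memory growth and the crash.
    UPDATE day_infos
    SET j1 = jsonb_set(j1, '{updated}', to_jsonb(now()::text));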


obartunov commented Dec 26, 2022 via email


avpco commented Dec 26, 2022

So I built it from source (https://github.com/postgrespro/postgres/tree/jsonb_toaster) and tested it.


nik1tam commented Dec 26, 2022

Hi!
Please provide your test scripts/queries, and we'll try to reproduce your problem.


avpco commented Dec 26, 2022

queries.zip


avpco commented Dec 26, 2022

dayInfos_ddl.sql.zip


nik1tam commented Dec 28, 2022

Thanks! Could you please also provide the Postgres log file containing the crash? And did you try to get a backtrace for it?


avpco commented Dec 28, 2022

Hi. I hadn't turned on logging in PostgreSQL. I have now re-run the INSERT and UPDATE statements with logging enabled. Unexpectedly, the INSERT statement (2,000,000 rows) executed successfully, but the UPDATE statement still produces the error.

postgresql-Wed.log.zip
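
(Not part of the original comment: a minimal sketch of one way to enable logging for such a run, assuming superuser access; logging_collector only takes effect after a server restart.)

    ALTER SYSTEM SET logging_collector = on;        -- takes effect only after a server restart
    ALTER SYSTEM SET log_min_messages = 'warning';  -- the default; warnings and crash messages are logged
    SELECT pg_reload_conf();                        -- applies the reloadable settings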


nik1tam commented Dec 28, 2022

I've reproduced it and am looking into what's going on.


nik1tam commented Dec 28, 2022

Iterators in the jsonb toaster need to be reworked. Here's the output from Heaptrack (an update of 1,000,000 records):

709 calls to allocation functions with 5.68G peak consumption from
AllocSetAlloc
at /home/user/ppg15toast/src/backend/utils/mmgr/aset.c:920
in /usr/local/pgsql/bin/postgres
230 calls with 1.90G peak consumption from:
palloc
at /home/user/ppg15toast/src/backend/utils/mmgr/mcxt.c:1082
in /usr/local/pgsql/bin/postgres
create_toast_buffer
at /home/user/ppg15toast/src/backend/access/common/toast_internals.c:1356
in /usr/local/pgsql/bin/postgres
jsonx_create_detoast_iterator
at /home/user/ppg15toast/contrib/jsonb_toaster/jsonb_toast_internals.c:1739
in /usr/local/pgsql/lib/jsonb_toaster.so
jsonxzInitWithHeader
at /home/user/ppg15toast/contrib/jsonb_toaster/jsonb_toaster.c:2113
in /usr/local/pgsql/lib/jsonb_toaster.so
jsonxzInitContainerFromDatum
at /home/user/ppg15toast/contrib/jsonb_toaster/jsonb_toaster.c:2402
in /usr/local/pgsql/lib/jsonb_toaster.so
jsonb_toaster_save_object
at /home/user/ppg15toast/contrib/jsonb_toaster/jsonb_toaster.c:2644
in /usr/local/pgsql/lib/jsonb_toaster.so
jsonb_toaster_save
at /home/user/ppg15toast/contrib/jsonb_toaster/jsonb_toaster.c:2762
in /usr/local/pgsql/lib/jsonb_toaster.so
jsonb_toaster_toast
at /home/user/ppg15toast/contrib/jsonb_toaster/jsonb_toaster.c:3191
in /usr/local/pgsql/lib/jsonb_toaster.so
toast_tuple_externalize
at /home/user/ppg15toast/src/backend/access/table/toast_helper.c:343
in /usr/local/pgsql/bin/postgres
heap_toast_tuple_externalize
at /home/user/ppg15toast/src/backend/access/heap/heaptoast.c:119
in /usr/local/pgsql/bin/postgres
heap_toast_insert_or_update
at /home/user/ppg15toast/src/backend/access/heap/heaptoast.c:262
in /usr/local/pgsql/bin/postgres
heap_update
at /home/user/ppg15toast/src/backend/access/heap/heapam.c:3727
in /usr/local/pgsql/bin/postgres

The same happens with bulk insert (with generate_series, not in a loop).
I'll rebase the jsonb toaster onto the latest master, since there were some changes
that may affect this behavior (the bulk insert and update patch), and check
how we can deal with such memory consumption.
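
(Not part of the original comment: besides Heaptrack, the per-backend memory growth can also be watched from SQL on PostgreSQL 14 and later; a hedged sketch:)

    -- Top memory contexts of the current backend (PostgreSQL 14+).
    SELECT name, parent, pg_size_pretty(total_bytes) AS total
    FROM pg_backend_memory_contexts
    ORDER BY total_bytes DESC
    LIMIT 10;

    -- Ask the backend running the large UPDATE to dump its memory contexts
    -- into the server log (superuser; pid found via pg_stat_activity).
    SELECT pg_log_backend_memory_contexts(pid)
    FROM pg_stat_activity
    WHERE state = 'active' AND query ILIKE 'update %';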


nik1tam commented Jan 12, 2023

Hi! We're working on the memory consumption; there are some improvements already, please check them out. Bulk insert (with generate_series) now also works fine for a large number of records.

danolivo pushed a commit that referenced this issue Jul 2, 2024
1. TruncateMultiXact() performs the SLRU truncations in a critical
section. Deleting the SLRU segments calls ForwardSyncRequest(), which
will try to compact the request queue if it's full
(CompactCheckpointerRequestQueue()). That in turn allocates memory,
which is not allowed in a critical section. Backtrace:

    TRAP: failed Assert("CritSectionCount == 0 || (context)->allowInCritSection"), File: "../src/backend/utils/mmgr/mcxt.c", Line: 1353, PID: 920981
    postgres: autovacuum worker template0(ExceptionalCondition+0x6e)[0x560a501e866e]
    postgres: autovacuum worker template0(+0x5dce3d)[0x560a50217e3d]
    postgres: autovacuum worker template0(ForwardSyncRequest+0x8e)[0x560a4ffec95e]
    postgres: autovacuum worker template0(RegisterSyncRequest+0x2b)[0x560a50091eeb]
    postgres: autovacuum worker template0(+0x187b0a)[0x560a4fdc2b0a]
    postgres: autovacuum worker template0(SlruDeleteSegment+0x101)[0x560a4fdc2ab1]
    postgres: autovacuum worker template0(TruncateMultiXact+0x2fb)[0x560a4fdbde1b]
    postgres: autovacuum worker template0(vac_update_datfrozenxid+0x4b3)[0x560a4febd2f3]
    postgres: autovacuum worker template0(+0x3adf66)[0x560a4ffe8f66]
    postgres: autovacuum worker template0(AutoVacWorkerMain+0x3ed)[0x560a4ffe7c2d]
    postgres: autovacuum worker template0(+0x3b1ead)[0x560a4ffecead]
    postgres: autovacuum worker template0(+0x3b620e)[0x560a4fff120e]
    postgres: autovacuum worker template0(+0x3b3fbb)[0x560a4ffeefbb]
    postgres: autovacuum worker template0(+0x2f724e)[0x560a4ff3224e]
    /lib/x86_64-linux-gnu/libc.so.6(+0x27c8a)[0x7f62cc642c8a]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x7f62cc642d45]
    postgres: autovacuum worker template0(_start+0x21)[0x560a4fd16f31]

To fix, bail out in CompactCheckpointerRequestQueue() without doing
anything, if it's called in a critical section. That covers the above
call path, as well as any other similar cases where
RegisterSyncRequest might be called in a critical section.

2. After fixing that, another problem became apparent: Autovacuum
process doing that truncation can deadlock with the checkpointer
process. TruncateMultiXact() sets "MyProc->delayChkptFlags |=
DELAY_CHKPT_START". If the sync request queue is full and cannot be
compacted, the process will repeatedly sleep and retry, until there is
room in the queue. However, if the checkpointer is trying to start a
checkpoint at the same time, and is waiting for the DELAY_CHKPT_START
processes to finish, the queue will never shrink.

More concretely, the autovacuum process is stuck here:

    #0  0x00007fc934926dc3 in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6
    #1  0x000056220b24348b in WaitEventSetWaitBlock (set=0x56220c2e4b50, occurred_events=0x7ffe7856d040, nevents=1, cur_timeout=<optimized out>) at ../src/backend/storage/ipc/latch.c:1570
    #2  WaitEventSetWait (set=0x56220c2e4b50, timeout=timeout@entry=10, occurred_events=<optimized out>, occurred_events@entry=0x7ffe7856d040, nevents=nevents@entry=1,
        wait_event_info=wait_event_info@entry=150994949) at ../src/backend/storage/ipc/latch.c:1516
    #3  0x000056220b243224 in WaitLatch (latch=<optimized out>, latch@entry=0x0, wakeEvents=wakeEvents@entry=40, timeout=timeout@entry=10, wait_event_info=wait_event_info@entry=150994949)
        at ../src/backend/storage/ipc/latch.c:538
    #4  0x000056220b26cf46 in RegisterSyncRequest (ftag=ftag@entry=0x7ffe7856d0a0, type=type@entry=SYNC_FORGET_REQUEST, retryOnError=true) at ../src/backend/storage/sync/sync.c:614
    #5  0x000056220af9db0a in SlruInternalDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1495
    #6  0x000056220af9dab1 in SlruDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1566
    #7  0x000056220af98e1b in PerformMembersTruncation (oldestOffset=<optimized out>, newOldestOffset=<optimized out>) at ../src/backend/access/transam/multixact.c:3006
    #8  TruncateMultiXact (newOldestMulti=newOldestMulti@entry=3221225472, newOldestMultiDB=newOldestMultiDB@entry=4) at ../src/backend/access/transam/multixact.c:3201
    #9  0x000056220b098303 in vac_truncate_clog (frozenXID=749, minMulti=<optimized out>, lastSaneFrozenXid=749, lastSaneMinMulti=3221225472) at ../src/backend/commands/vacuum.c:1917
    #10 vac_update_datfrozenxid () at ../src/backend/commands/vacuum.c:1760
    #11 0x000056220b1c3f76 in do_autovacuum () at ../src/backend/postmaster/autovacuum.c:2550
    #12 0x000056220b1c2c3d in AutoVacWorkerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/autovacuum.c:1569

and the checkpointer is stuck here:

    #0  0x00007fc9348ebf93 in clock_nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
    #1  0x00007fc9348fe353 in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
    #2  0x000056220b40ecb4 in pg_usleep (microsec=microsec@entry=10000) at ../src/port/pgsleep.c:50
    #3  0x000056220afb43c3 in CreateCheckPoint (flags=flags@entry=108) at ../src/backend/access/transam/xlog.c:7098
    #4  0x000056220b1c6e86 in CheckpointerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/checkpointer.c:464

To fix, add AbsorbSyncRequests() to the loops where the checkpointer
waits for DELAY_CHKPT_START or DELAY_CHKPT_COMPLETE operations to
finish.
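
(An aside, not part of the commit message: a hedged pg_stat_activity sketch that could help spot the stuck state described above, relying on the documented RegisterSyncRequest wait event.)

    -- An autovacuum worker stuck retrying because the sync request queue is full,
    -- alongside the checkpointer it is deadlocked with.
    SELECT pid, backend_type, wait_event_type, wait_event, state
    FROM pg_stat_activity
    WHERE wait_event = 'RegisterSyncRequest'
       OR backend_type = 'checkpointer';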

Backpatch to v14. Before that, SLRU deletion didn't call
RegisterSyncRequest, which avoided this failure. I'm not sure if there
are other similar scenarios on older versions, but we haven't had
any such reports.

Discussion: https://www.postgresql.org/message-id/[email protected]