Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to copy target table data to source when target partitioned #700

Open
jnels124 opened this issue Mar 11, 2024 · 7 comments
Open

Unable to copy target table data to source when target partitioned #700

jnels124 opened this issue Mar 11, 2024 · 7 comments

Comments

@jnels124
Copy link

I am attempting to use pgcopydb to migrate data from our current schema to our new schema which uses citus. When executing pgcopydb copy table-data --table-jobs 8 --index-jobs 2 errors are logged indicating that truncate only was called on the parent table for a partitioned table. Truncate only is only valid to execute on specific partitions but not the parent table.

17:32:30.001 784336 ERROR  [TARGET 1674604687] ERROR:  cannot truncate only a partitioned table
17:32:30.001 784336 ERROR  [TARGET 1674604687] HINT:  Do not specify the ONLY keyword, or use TRUNCATE ONLY on the partitions directly.
17:32:30.003 784336 ERROR  [TARGET 1674604687] ERROR:  cannot truncate only a partitioned table
17:32:30.003 784336 ERROR  [TARGET 1674604687] SQL query: TRUNCATE ONLY public.transaction
17:32:30.003 784336 ERROR  [TARGET 1674604687] SQL params:
17:32:30.003 784336 ERROR  Failed to copy data for table with oid 16742 and part number 0, see above for details
@dimitri
Copy link
Owner

dimitri commented Mar 12, 2024

Is my understanding correct that the table (here public.transaction) is a plain table on the source database, but a partitionned-table (and the parent table) on the target database?

@arajkumar
Copy link
Contributor

The same problem we have encountered while using pgcopydb for Postgres to Timescale migration as well. The source would have plain Postgres table, but the same table on target will be partitioned using Postgres inheritance.

@kbarber
Copy link
Contributor

kbarber commented Apr 19, 2024

I've seen this also, very rare but it happens. I'm having trouble reproducing it with my own smaller example however to share.

@dimitri
Copy link
Owner

dimitri commented Apr 20, 2024

I think you need to split the schema and data parts of the processing, and then tweak the schema on the target database to introduce partitioning, in order to repro. See https://pgcopydb.readthedocs.io/en/latest/tutorial.html#how-to-edit-the-schema-when-copying-a-database for details.

Now if you can reproduce, the way to fix it might be to decide if we can use TRUNCATE or TRUNCATE ONLY by having a look at the TARGET schema to see if the table is partitioned there, whatever it might be on the source. At the moment we only have catalog entries for tables on the sourceDB, we need to grab them at the right point in time for the targetDB too now.

@arajkumar
Copy link
Contributor

@dimitri If I understand correctly, we should TRUNCATE if the table on the target is partitioned but not on the source. If the table remains partitioned on both the source and target, we must stick to TRUNCATE ONLY, correct?

@dimitri
Copy link
Owner

dimitri commented May 13, 2024

We have many cases to consider, but today only consider a few. The deal with TRUNCATE is how we use it in the same transaction as the COPY statement, in order to be able to use the COPY FREEZE option. In the case when either the table is not partitioned, or partitioned exactly the same way on the source and the target, that's fine. In any other case, we should disable this optimization.

Here some details of the reasoning:

  • when the table is not partitioned on the source, not partitioned on the table, we're good (truncate)
  • when the table is partitioned on the source and the target using the same scheme, we COPY DATA each partition and TRUNCATE ONLY partitions, but do not COPY nor TRUNCATE the parent table
  • when the table is partitioned on the source and not on the target, I don't think we support that use-case at all today, which requires routing the COPY DATA statements from every partition all to the same target table, omitting TRUNCATE at the partition level
  • when the table is not partitioned on the source, and partitioned on the target, we should route the COPY DATA statements all to the same target table so that the target server implements the partitioning rules, and then we can't truncate for each partition, we should truncate the parent table, only once, and have Postgres TRUNCATE the whole hierarchy for us.

Then we might want to also have support for a change of partitioning scheme between source and target, including more partitioning levels (sub-partitions). In which case we want to route everything to the new parent table of course. And again, implement TRUNCATE on the parent table on the target, allowing Postgres to descend in the hierarchy.

The simplest approach for now would consist of assuming we target the same partitioning schema on source and target when given a full clone command with --drop-if-exists. A second step may include checking for the same partitioning scheme on both source and target in cases when the schema might have been tampered with on the target, but waiting for that we should certainly revisit our approach and have the target Postgres instance route the data in the partitions, and have pgcopydb target the parent table only.

@mldisibio
Copy link

mldisibio commented Jul 4, 2024

I am also getting ERROR: cannot truncate only a partitioned table for the case where source and target schemas are exactly the same. (In fact, target was empty of any schemas). The table partitions are created correctly.
However, I think the issue is that the partitions are nested. For example:
tbl_sales partitioned by category,
tbl_sales_retail partitioned by year,
tbl_sales_retail_2023 partitioned by month, and
tbl_sales_retail_202312 has the data, not partitioned.
I see the error when TRUNCATE ONLY is applied to the two middle nested partition levels.

UPDATE:
I was finally able to work around this issue by adding all the parent and nested parent partition tables to the [exclude-table-data] section of the filter file. This allows the partition hierarchy to be copied correctly from source to target, but avoids any TRUNCATE ONLY on the intermediate parent tables.

Side note: Probably documented somewhere, but learned you cannot have blank lines within any one of the ini filter sections. (Was adding blank lines for readability since the exclude list was lengthy).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants