Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: copy_dm_to() uses dm_sql(). Unique keys (#1887) and autoincrement primary keys (#1725) are created on the database. Data models with cyclic references are supported on databases that allow adding constraints in ALTER TABLE statements (at this time, all except DuckDB and SQLite, #664) #2022

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -118,7 +118,7 @@ VignetteBuilder:
Config/autostyle/scope: line_breaks
Config/autostyle/strict: true
Config/testthat/edition: 3
Config/testthat/parallel: true
Config/testthat/parallel: false
Config/testthat/start-first: zzx-deprecated, flatten, dplyr, filter-dm,
draw-dm, bind, rows-dm, learn
Encoding: UTF-8
Expand Down
154 changes: 76 additions & 78 deletions R/db-interface.R
Original file line number Diff line number Diff line change
Expand Up @@ -5,17 +5,20 @@
#' and a [`dm`] object as its second argument.
#' The latter is copied to the former.
#' The default is to create temporary tables, set `temporary = FALSE` to create permanent tables.
#' Unless `set_key_constraints` is `FALSE`, primary key constraints are set on all databases,
#' and in addition foreign key constraints are set on MSSQL and Postgres databases.
#' Unless `set_key_constraints` is `FALSE`, primary key, foreign key, and unique constraints
#' are set, and indexes for foreign keys are created, on all databases.
#'
#' @inheritParams dm_examine_constraints
#'
#' @param dest An object of class `"src"` or `"DBIConnection"`.
#' @param dm A `dm` object.
#' @inheritParams rlang::args_dots_empty
#' @param set_key_constraints If `TRUE` will mirror `dm` primary and foreign key constraints on a database
#' and create indexes for foreign key constraints.
#' Set to `FALSE` if your data model currently does not satisfy primary or foreign key constraints.
#' @param set_key_constraints If `TRUE` will mirror
#' the primary, foreign, and unique key constraints
#' and create indexes for foreign key constraints
#' for the primary and foreign keys in the `dm` object.
#' Set to `FALSE` if your data model currently does not satisfy
#' primary or foreign key constraints.
#' @param temporary If `TRUE`, only temporary tables will be created.
#' These tables will vanish when disconnecting from the database.
#' @param schema Name of schema to copy the `dm` to.
Expand Down Expand Up @@ -93,21 +96,13 @@
progress = NA,
unique_table_names = NULL,
copy_to = NULL) {
# for the time being, we will be focusing on MSSQL
# we want to
# 1. change `dm_get_src_impl(dm)` to `dest`
# 2. copy the tables to `dest`
# 3. implement the key situation within our `dm` on the DB
#

if (!is.null(unique_table_names)) {
deprecate_warn(
deprecate_stop(

Check warning on line 102 in R/db-interface.R

View check run for this annotation

Codecov / codecov/patch

R/db-interface.R#L102

Added line #L102 was not covered by tests
"0.1.4", "dm::copy_dm_to(unique_table_names = )",
details = "Use `table_names = identity` to use unchanged names for temporary tables."
details = "Use `table_names = set_names(names(dm))` to use unchanged names for temporary tables."

Check warning on line 104 in R/db-interface.R

View check run for this annotation

Codecov / codecov/patch

R/db-interface.R#L104

Added line #L104 was not covered by tests
)

if (is.null(table_names) && temporary && !unique_table_names) {
table_names <- identity
}
}

if (!is.null(copy_to)) {
Expand All @@ -124,113 +119,116 @@
check_suggested("dbplyr", "copy_dm_to")

dest <- src_from_src_or_con(dest)
src_names <- src_tbls_impl(dm)

if (is_db(dest)) {
dest_con <- con_from_src_or_con(dest)

# in case `table_names` was chosen by the user, check if the input makes sense:
# 1. is there one name per dm-table?
# 2. are there any duplicated table names?
# 3. is it a named character or ident_q vector with the correct names?
if (is.null(table_names)) {
table_names_out <- repair_table_names_for_db(src_names, temporary, dest_con, schema)
# https://github.com/tidyverse/dbplyr/issues/487
if (is_mssql(dest)) {
temporary <- FALSE
}
} else {
if (!is.null(schema)) abort_one_of_schema_table_names()
if (is_function(table_names) || is_bare_formula(table_names)) {
table_name_fun <- as_function(table_names)
table_names_out <- set_names(table_name_fun(src_names), src_names)
} else {
table_names_out <- table_names
}
check_naming(names(table_names_out), src_names)

if (anyDuplicated(table_names_out)) {
problem <- table_names_out[duplicated(table_names_out)][[1]]
abort_copy_dm_to_table_names_duplicated(problem)
}

names(table_names_out) <- src_names
}
} else {
# FIXME: Other data sources than local and database possible
deprecate_warn(
"0.1.6", "dm::copy_dm_to(dest = 'must refer to a remote data source')",
if (!is_db(dest)) {
deprecate_stop(
"0.1.6", "dm::copy_dm_to(dest = 'must refer to a DBI connection')",
"dm::collect.dm()"
)
table_names_out <- set_names(src_names)
}

# FIXME: if same_src(), can use compute() but need to set NOT NULL and other
# constraints
src_names <- src_tbls_impl(dm)
dest_con <- con_from_src_or_con(dest)

# in case `table_names` was chosen by the user, check if the input makes sense:
# 1. is there one name per dm-table?
# 2. are there any duplicated table names?
# 3. is it a named character or ident_q vector with the correct names?
if (is.null(table_names)) {
table_names_out <- repair_table_names_for_db(src_names, temporary, dest_con, schema)
# https://github.com/tidyverse/dbplyr/issues/487
if (is_mssql(dest)) {
temporary <- FALSE

Check warning on line 141 in R/db-interface.R

View check run for this annotation

Codecov / codecov/patch

R/db-interface.R#L141

Added line #L141 was not covered by tests
}
} else {
if (!is.null(schema)) abort_one_of_schema_table_names()

Check warning on line 144 in R/db-interface.R

View check run for this annotation

Codecov / codecov/patch

R/db-interface.R#L144

Added line #L144 was not covered by tests
if (is_function(table_names) || is_bare_formula(table_names)) {
table_name_fun <- as_function(table_names)
table_names_out <- set_names(table_name_fun(src_names), src_names)
} else {
table_names_out <- table_names
}
check_naming(names(table_names_out), src_names)

# Shortcut necessary to avoid copying into .GlobalEnv
if (!is_db(dest)) {
return(dm)
if (anyDuplicated(table_names_out)) {
problem <- table_names_out[duplicated(table_names_out)][[1]]
abort_copy_dm_to_table_names_duplicated(problem)
}

names(table_names_out) <- src_names
}

# Must be done here because table types may depend on string length, #2066
dm <- collect(dm, progress = progress)

queries <- build_copy_queries(dest_con, dm, set_key_constraints, temporary, table_names_out)
if (isTRUE(set_key_constraints)) {
dm_for_sql <- dm
} else {
def_no_keys <- dm_get_def(dm)

Check warning on line 167 in R/db-interface.R

View check run for this annotation

Codecov / codecov/patch

R/db-interface.R#L167

Added line #L167 was not covered by tests
# FIXME: Add test
def_no_keys$pks[] <- list(new_pk())
def_no_keys$uks[] <- list(new_uk())
def_no_keys$fks[] <- list(new_fk())

Check warning on line 171 in R/db-interface.R

View check run for this annotation

Codecov / codecov/patch

R/db-interface.R#L169-L171

Added lines #L169 - L171 were not covered by tests
# Must keep primary keys
dm_for_sql <- dm_from_def(def_no_keys)

Check warning on line 173 in R/db-interface.R

View check run for this annotation

Codecov / codecov/patch

R/db-interface.R#L173

Added line #L173 was not covered by tests
}

ticker_create <- new_ticker(
sql <- dm_sql(dm_for_sql, dest_con, table_names_out, temporary)

# FIXME: Extract function
# FIXME: Make descriptions part of the dm_sql() output

pre <- unlist(sql$pre)
load <- unlist(sql$load)
post <- unlist(sql$post)

ticker_pre <- new_ticker(
"creating tables",
n = length(queries$sql_table),
n = length(pre),
progress = progress,
top_level_fun = "copy_dm_to"
)

# create tables
walk(queries$sql_table, ticker_create(~ {
walk(pre, ticker_pre(~ {
DBI::dbExecute(dest_con, .x, immediate = TRUE)
}))

ticker_populate <- new_ticker(
ticker_load <- new_ticker(
"populating tables",
n = length(queries$name),
n = length(load),
progress = progress,
top_level_fun = "copy_dm_to"
)

# populate tables
pwalk(
queries[c("name", "remote_name")],
ticker_populate(~ db_append_table(
con = dest_con,
remote_table = .y,
table = dm[[.x]],
progress = progress,
autoinc = dm_get_all_pks(dm, table = !!.x)$autoincrement
))
)
walk(load, ticker_load(~ {
DBI::dbExecute(dest_con, .x, immediate = TRUE)
}))

ticker_index <- new_ticker(
ticker_post <- new_ticker(
"creating indexes",
n = sum(lengths(queries$sql_index)),
n = length(post),
progress = progress,
top_level_fun = "copy_dm_to"
)

# create indexes
walk(unlist(queries$sql_index), ticker_index(~ {
walk(post, ticker_post(~ {
DBI::dbExecute(dest_con, .x, immediate = TRUE)
}))

# remote dm is same as source dm with replaced data
# FIXME: Extract function
def <- dm_get_def(dm)

remote_tables <- map2(
table_names_out,
map(def$data, colnames),
~ tbl(dest_con, ..1, vars = ..2)
~ tbl(dest_con, .x, vars = .y)
)

def$data <- unname(remote_tables[names(dm)])
def$data <- unname(remote_tables)
remote_dm <- dm_from_def(def)

invisible(debug_dm_validate(remote_dm))
Expand Down
13 changes: 8 additions & 5 deletions man/copy_dm_to.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

9 changes: 9 additions & 0 deletions tests/testthat/_snaps/db-interface.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,12 @@
# copy_dm_to() copies data frames from any source

Code
copy_dm_to(default_local_src(), dm_for_filter())
Condition
Error:
! The `dest` argument of `copy_dm_to()` must refer to a DBI connection as of dm 0.1.6.
i Please use `collect.dm()` instead.

# copy_dm_to() rejects overwrite and types arguments

Code
Expand Down
20 changes: 10 additions & 10 deletions tests/testthat/_snaps/filter-dm.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
Condition
Warning:
The `table` argument of `dm_filter()` is deprecated as of dm 1.0.0.
`dm_filter()` now takes named filter expressions, the names correspond to the tables to be filtered. You no longer need to call `dm_apply_filters()` to materialize the filters.
i `dm_filter()` now takes named filter expressions, the names correspond to the tables to be filtered. You no longer need to call `dm_apply_filters()` to materialize the filters.
Output
-- Metadata --------------------------------------------------------------------
Tables: `tf_1`, `tf_2`, `tf_3`, `tf_4`, `tf_5`, `tf_6`
Expand All @@ -19,10 +19,10 @@
Condition
Warning:
The `dm` argument of `dm_filter()` is deprecated as of dm 1.0.0.
Please use the `.dm` argument instead.
i Please use the `.dm` argument instead.
Warning:
The `table` argument of `dm_filter()` is deprecated as of dm 1.0.0.
`dm_filter()` now takes named filter expressions, the names correspond to the tables to be filtered. You no longer need to call `dm_apply_filters()` to materialize the filters.
i `dm_filter()` now takes named filter expressions, the names correspond to the tables to be filtered. You no longer need to call `dm_apply_filters()` to materialize the filters.
Output
-- Metadata --------------------------------------------------------------------
Tables: `tf_1`, `tf_2`, `tf_3`, `tf_4`, `tf_5`, `tf_6`
Expand All @@ -36,7 +36,7 @@
Condition
Warning:
The `table` argument of `dm_filter()` is deprecated as of dm 1.0.0.
`dm_filter()` now takes named filter expressions, the names correspond to the tables to be filtered. You no longer need to call `dm_apply_filters()` to materialize the filters.
i `dm_filter()` now takes named filter expressions, the names correspond to the tables to be filtered. You no longer need to call `dm_apply_filters()` to materialize the filters.
Output
-- Metadata --------------------------------------------------------------------
Tables: `tf_1`, `tf_2`, `tf_3`, `tf_4`, `tf_5`, `tf_6`
Expand All @@ -48,7 +48,7 @@
Condition
Warning:
`dm_apply_filters()` was deprecated in dm 1.0.0.
Calling `dm_apply_filters()` after `dm_filter()` is no longer necessary.
i Calling `dm_apply_filters()` after `dm_filter()` is no longer necessary.
Output
-- Metadata --------------------------------------------------------------------
Tables: `tf_1`, `tf_2`, `tf_3`, `tf_4`, `tf_5`, `tf_6`
Expand All @@ -60,7 +60,7 @@
Condition
Warning:
The `table` argument of `dm_filter()` is deprecated as of dm 1.0.0.
`dm_filter()` now takes named filter expressions, the names correspond to the tables to be filtered. You no longer need to call `dm_apply_filters()` to materialize the filters.
i `dm_filter()` now takes named filter expressions, the names correspond to the tables to be filtered. You no longer need to call `dm_apply_filters()` to materialize the filters.
Output
# A tibble: 3 x 4
c d e e1
Expand All @@ -73,7 +73,7 @@
Condition
Warning:
`dm_apply_filters_to_tbl()` was deprecated in dm 1.0.0.
Access tables directly after `dm_filter()`.
i Access tables directly after `dm_filter()`.
Output
# A tibble: 3 x 4
c d e e1
Expand All @@ -86,7 +86,7 @@
Condition
Warning:
The `table` argument of `dm_filter()` is deprecated as of dm 1.0.0.
`dm_filter()` now takes named filter expressions, the names correspond to the tables to be filtered. You no longer need to call `dm_apply_filters()` to materialize the filters.
i `dm_filter()` now takes named filter expressions, the names correspond to the tables to be filtered. You no longer need to call `dm_apply_filters()` to materialize the filters.
Output
# A tibble: 1 x 3
table filter zoomed
Expand All @@ -97,10 +97,10 @@
Condition
Warning:
`dm_get_filters()` was deprecated in dm 1.0.0.
Filter conditions are no longer stored with the dm object.
i Filter conditions are no longer stored with the dm object.
Output
# A tibble: 0 x 3
# ... with 3 variables: table <chr>, filter <list>, zoomed <lgl>
# i 3 variables: table <chr>, filter <list>, zoomed <lgl>

# data structure

Expand Down
Loading
Loading