Commit

Backport PR pandas-dev#57311 on branch 2.2.x (Fixing multi method for to_sql for non-oracle databases) (pandas-dev#57466)

Backport PR pandas-dev#57311: Fixing multi method for to_sql for non-oracle databases

Co-authored-by: Samuel Chai <[email protected]>
meeseeksmachine and kassett authored Feb 17, 2024
1 parent 5550bdb commit 32d2b99
Showing 3 changed files with 8 additions and 7 deletions.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.2.1.rst
@@ -31,6 +31,7 @@ Fixed regressions
 - Fixed regression in :meth:`DataFrame.sort_index` not producing a stable sort for a index with duplicates (:issue:`57151`)
 - Fixed regression in :meth:`DataFrame.to_dict` with ``orient='list'`` and datetime or timedelta types returning integers (:issue:`54824`)
 - Fixed regression in :meth:`DataFrame.to_json` converting nullable integers to floats (:issue:`57224`)
+- Fixed regression in :meth:`DataFrame.to_sql` when ``method="multi"`` is passed and the dialect type is not Oracle (:issue:`57310`)
 - Fixed regression in :meth:`DataFrameGroupBy.idxmin`, :meth:`DataFrameGroupBy.idxmax`, :meth:`SeriesGroupBy.idxmin`, :meth:`SeriesGroupBy.idxmax` ignoring the ``skipna`` argument (:issue:`57040`)
 - Fixed regression in :meth:`DataFrameGroupBy.idxmin`, :meth:`DataFrameGroupBy.idxmax`, :meth:`SeriesGroupBy.idxmin`, :meth:`SeriesGroupBy.idxmax` where values containing the minimum or maximum value for the dtype could produce incorrect results (:issue:`57040`)
 - Fixed regression in :meth:`ExtensionArray.to_numpy` raising for non-numeric masked dtypes (:issue:`56991`)
3 changes: 3 additions & 0 deletions pandas/core/generic.py
@@ -2969,6 +2969,9 @@ def to_sql(
         database. Otherwise, the datetimes will be stored as timezone unaware
         timestamps local to the original timezone.
+        Not all datastores support ``method="multi"``. Oracle, for example,
+        does not support multi-value insert.
         References
         ----------
         .. [1] https://docs.sqlalchemy.org
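
For context, here is a minimal sketch of the call this docstring note is about. It assumes an in-memory SQLite database reached through SQLAlchemy, a backend that accepts multi-value inserts. The table name `demo` and the sample frame are made up for illustration and do not come from the commit.

```python
import pandas as pd
from sqlalchemy import create_engine, text

# Assumption: an in-memory SQLite database stands in for any backend
# that supports multi-value INSERT statements.
engine = create_engine("sqlite:///:memory:")

# Hypothetical sample data, not taken from the commit.
df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})

# method="multi" asks pandas to pass several rows in a single INSERT
# statement instead of inserting them row by row.
df.to_sql("demo", engine, index=False, method="multi")

# Read the row count back to confirm that every row was written.
with engine.connect() as conn:
    count = conn.execute(text("SELECT COUNT(*) FROM demo")).scalar_one()

print(count)
```

On a backend without multi-value insert support, such as Oracle according to the note above, leaving `method` at its default avoids the multi-row statement.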
11 changes: 4 additions & 7 deletions pandas/io/sql.py
@@ -1012,22 +1012,19 @@ def _execute_insert(self, conn, keys: list[str], data_iter) -> int:

     def _execute_insert_multi(self, conn, keys: list[str], data_iter) -> int:
         """
-        Alternative to _execute_insert for DBs support multivalue INSERT.
+        Alternative to _execute_insert for DBs support multi-value INSERT.
         Note: multi-value insert is usually faster for analytics DBs
         and tables containing a few columns
         but performance degrades quickly with increase of columns.
         """

         from sqlalchemy import insert

         data = [dict(zip(keys, row)) for row in data_iter]
-        stmt = insert(self.table)
-        # conn.execute is used here to ensure compatibility with Oracle.
-        # Using stmt.values(data) would produce a multi row insert that
-        # isn't supported by Oracle.
-        # see: https://docs.sqlalchemy.org/en/20/core/dml.html#sqlalchemy.sql.expression.Insert.values
-        result = conn.execute(stmt, data)
+        stmt = insert(self.table).values(data)
+        result = conn.execute(stmt)
         return result.rowcount

     def insert_data(self) -> tuple[list[str], list[np.ndarray]]:
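
The two SQLAlchemy forms that this hunk switches between can be compared by compiling them, as in the sketch below. The `demo` table and its rows are hypothetical, and the SQLite dialect is chosen only as an example of a dialect that supports multi-value inserts. Compiling only renders SQL text and does not touch a database.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, insert
from sqlalchemy.dialects import sqlite

# Hypothetical table and rows, used only to render the statements.
metadata = MetaData()
demo = Table("demo", metadata, Column("id", Integer), Column("name", String))
data = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# Form restored by this commit: the rows are embedded in the statement,
# so it renders as one INSERT with one VALUES group per row.
multi_stmt = insert(demo).values(data)
multi_sql = str(multi_stmt.compile(dialect=sqlite.dialect()))

# Form removed by this commit: a plain INSERT with a single VALUES group.
# Passing the rows separately, as in conn.execute(insert(demo), data),
# runs this statement once per row ("executemany") rather than as one
# multi-row statement.
plain_sql = str(insert(demo).compile(dialect=sqlite.dialect()))

print(multi_sql)
print(plain_sql)
```

The printed statements should differ only in the number of `VALUES` groups, which is the behavior that `method="multi"` is meant to select.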