Add support for batch operations (R2DBC) #27229

MarkMarkyMarkus · 2021-07-30T09:17:27Z

Add support for batch operations to the DefaultDatabaseClient:

databaseClient.sql("INSERT INTO legoset (id, name, manual) VALUES(:id, :name, :manual)")
				.bind("id", 42055)
				.bind("name", "SCHAUFELRADBAGGER")
				.bindNull("manual", Integer.class)
				.add()
				.bind("id", 2021)
				.bind("name", "TOM")
				.bindNull("manual", Integer.class)
				.fetch().rowsUpdated()

Closes #259

Demo app: springR2dbcBatchExample.tar.gz

patrickhuy · 2021-10-06T22:19:50Z

@sbrannen @mp911de this looks very intresting for me, could you share some insights what the next steps for this PR should be?

mp911de · 2021-10-07T07:13:19Z

Batching through .add() would only properly work if the SQL doesn't change. DatabaseClient uses named parameters and expands named parameter markers using collection arguments into native bind markers. The number of bind markers depends on the actual collection size. This is useful when using IN clauses. I'm not quite sure about the direction.

We could follow the idea of NamedParameter of expanding the SQL using the first bindings and raising an exception later on if the expanded parameter count doesn't match the number of parameters found in a collection argument.

Other than that, please check out the contribution guide to learn how we accept contributions.

MarkMarkyMarkus · 2021-10-08T16:00:41Z

Thanks for the response.

We could follow the idea of NamedParameter of expanding the SQL using the first bindings and raising an exception later on if the expanded parameter count doesn't match the number of parameters found in a collection argument.

Do you mean that we should check the expanded parameters count in NamedParameterUtils.substituteNamedParameters() or NamedParameter.addPlaceholder()?

Andro999b · 2022-05-14T11:21:01Z

Is there any plans to add batching?

gabfssilva · 2022-10-25T22:15:48Z

Batching through .add() would only properly work if the SQL doesn't change.

@mp911de, correct me if I'm wrong, but the SPI has two ways of batching, right? One that allows you to send a single command with multiple parameters (via parameter binding) & another that allows you to pass multiple commands and, although both are sent in a single roundtrip, the latter one is slower since the database will have to parse every single command, so if your command won't change, you should stick with the first approach.

I'm asking because the implementation of this PR is probably the most optimized (since it uses the first approach), although less flexible. Isn't there a way to add both ways to the DatabaseClient, so the developer would choose which way fits better for them?

mp911de · 2022-10-26T09:37:40Z

@gabfssilva your perception is correct. Parametrized batching parses the command once and sends a execute command to the database for each binding set. I think it would be possible to bridge the .add(…)-based mechanism into DatabaseClient. The other type of batching, where commands are concatenated with ; can be used already because you're in control of the SQL statement.

bwhyman · 2023-01-22T17:56:23Z

It works for me now.
spring-data-r2dbc: 6.0.1; jasync-r2dbc-mysql: 2.1.12.

@Autowired
private io.r2dbc.spi.ConnectionFactory connectionFactory;
@Transactional
    public Mono<Void> update(List<User> users) {
        String sql = "update user u set u.address=? where s.id=?";
        return DatabaseClient.create(connectionFactory).sql(sql).filter(statement -> {
            for (User u : users) {
                statement.bind(0, u.getAddress()).bind(1, s.getId()).add();
            }
            return statement;
        }).then();
    }

snicoll · 2023-08-26T15:14:32Z

@mp911de I am happy helping the merge process but I could use a review on your side first.

mp911de · 2023-08-28T12:41:50Z

From the spec, we've learned that statement.….add().….add().….execute() (no trailing .add() before .execute()) is a pretty sharp edge leading easily to quite some confusion when using that API.

Considering batching with DatabaseClient would allow us to come up with a smoother API and an approach, that doesn't create DefaultGenericExecuteSpec instances for each bound parameter.

As a starting point, how about a bind (or params as in JdbcClient) method lambda, something along the lines of:

databaseClient.sql("INSERT INTO legoset (id, name, manual) VALUES(:id, :name, :manual)")
				.bind(params -> params.bind("name", name1).bind("manual", manual1))
				.bind(params -> params.bind("name", name2).bind("manual", manual2))
				.fetch();

The lambda parameter would be a simple interface carrying bind methods only to avoid confusion with Statement:

interface Params {
  Params bind(String name, Object value);
  Params bind(int index, Object value);
  Params bindNull(String name, Class<?> type);
  Params bindNull(int index, Class<?> type);
}

snicoll · 2023-09-11T08:42:53Z

@MarkMarkyMarkus would you have some cycle to amend your PR with the review of Mark? If not, that's cool we can take over when time permits.

MarkMarkyMarkus · 2023-09-11T14:24:58Z

Yes, I will take a look.

jhoeller · 2023-11-23T10:42:33Z

Note that we intentionally designed our new JdbcClient without batch operations, so it is not a priority for R2DBC either. That said, if we can agree on an implementation style for multiple parameter bindings that is not at odds with the existing API design and that is reasonable to use with named parameters as well as indexed parameters, we may revisit this for both JDBC and R2DBC.

hantsy · 2024-05-05T06:50:31Z

Hope there is a bind method to accept a list/array as params directly to avoid bind object one by one.

.bind(
  List.of(
    Tuple.of(param1, param2, param3...), 
    Tuple.of(param1, param2, param3...),
    ... 
  )
)

sdeleuze · 2024-10-17T14:24:14Z

I did some test with https://github.com/TechEmpower/FrameworkBenchmarks/tree/master/frameworks/Java/spring-webflux in order to use batch update instead of individual requests. Under high latency, batch update from this PR allows to increase the throughput by 50%, so I would be in favor of trying to move forward on this PR if we can find the right API.

Speaking about the API and related to @hantsy comment, I think we should evolve the API to make it easier to use a collection of data to bind, because if I am not mistaken, the current one is not practical to use for this very popular (likely the most popular) use case. I had to do something like that:

// Wrapper class is needed because variable used in lambda must be final or effectively final
var wrapper = new Object(){ GenericExecuteSpec spec =
        databaseClient.sql("UPDATE world SET randomnumber=:randomnumber WHERE id = :id"); };
worlds.forEach(world -> wrapper.spec = wrapper.spec.bind(params ->
        params.bind("id", world.id).bind("randomnumber", world.randomnumber)));
return wrapper.spec.fetch().rowsUpdated();

Maybe we should just have a GenericExecuteSpec method that takes a list/collection for the values to bind and a lambda to do binding logic, as a modernized version of JdbcTemplate#batchUpdate. I am not sure we should keep the variant proposed in this PR (I am not fully convinced we need batch updates for specifying the binding via the fluent API), but happy to get feedback on this.

I was not able to make it work with UPDATE world SET randomnumber=$2 WHERE id = $1 (which works for individual update) and had to use UPDATE world SET randomnumber=:randomnumber WHERE id = :id for the batch update variant. I am not sure why.

@bwhyman I was not able to make your filter() based proposal working.

@jhoeller @mp911de Any thoughts?

bwhyman · 2024-10-19T14:36:41Z

databaseClient.inConnection(conn -> {
            Statement statement = conn.createStatement(sql);
            for (int i = 0; i < timetables.size(); i++) {
                var tb = timetables.get(i);
                statement.bind(0, snowflake.nextId())
                        .bind(1, collid)
                        .bind(2, tb.getStartweek())
                        .bind(3, tb.getEndweek());
                // The add() method cannot be called for the last time
                if (i < timetables.size() - 1) {
                    statement.add();
                }
            }
            return Flux.from(statement.execute()).collectList();
        });

R2DBC:3.3.1
Statement add() method cannot be called for the last time, it is very wired.

sdeleuze · 2024-10-28T21:25:06Z

@bwhyman I was able to use batch updates with what you suggested but using inConnectionMany instead of inConnection. Except the if (i < timetables.size() - 1) part, it is pretty straightforward.

@mp911de @jhoeller Do you think we could relax the requirement of skipping the last add() to make it easier to use? If we do, I am not sure we need a dedicated API, we can maybe just create a related issue, close this PR, and document something like:

databaseClient.inConnectionMany(conn -> {
            Statement statement = conn.createStatement(sql);
            for (World world : worlds) {
                statement.bind(0, world.randomnumber).bind(1, world.id).add();
            }
            return Flux.from(statement.execute()).collectList();
        });

I am also wondering if dedicated batch update support would provide even more performance benefits (the improvement I get in Techempower are visible here).

mp911de · 2024-10-29T09:06:13Z

Implicit batching with the current API opens pathways for a new class of bugs.

DatabaseClient.sql(…).bind(0, foo).bind(0, bar) uses a single statement while DatabaseClient.sql(…).bind(params -> …).bind(params -> …) runs the same statement with two parameter binding sets. Ideally, after the new method's first invocation, the calling code should never get a chance to call bind(position|name, …). Also, we should have a method that indicates that we're entering batch mode to make it obvious (andBind(params -> …)?). Probably someone can come up with a better name, though.

I like bind(params -> …) as a general addition.

sdeleuze · 2024-10-29T11:22:15Z

I like bind(params -> …) as a general addition.

Ok so if nobody objects, I will close this PR unmerged, create a related issue for bind(params -> …) and work on that tentatively targeting Spring Framework 6.2.0.

sdeleuze · 2024-10-29T14:06:41Z

Closing this (old) PR unmerged in favor of #33812.

spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Jul 30, 2021

sbrannen added in: data Issues in data modules (jdbc, orm, oxm, tx) type: enhancement A general enhancement labels Jul 30, 2021

cristina-britez approved these changes Jan 30, 2023

View reviewed changes

lilianalhindawi approved these changes Jan 31, 2023

View reviewed changes

patrickhuy approved these changes Jan 31, 2023

View reviewed changes

snicoll added the status: waiting-for-feedback We need additional information before we can continue label Aug 26, 2023

spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Aug 28, 2023

marcusdacoregio mentioned this pull request Sep 1, 2023

Implement a ReactiveSessionRepository backed by R2DBC spring-projects/spring-session#1748

Open

snicoll added status: waiting-for-feedback We need additional information before we can continue and removed status: feedback-provided Feedback has been provided labels Sep 11, 2023

spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Sep 11, 2023

snicoll added status: waiting-for-feedback We need additional information before we can continue and removed status: feedback-provided Feedback has been provided labels Sep 15, 2023

MarkMarkyMarkus added 2 commits September 17, 2023 19:49

Add support for batch operations

5dc0388

Add Params interface to carry binding

9bf751a

MarkMarkyMarkus force-pushed the feature/add-batch-operations branch from 063786f to 9bf751a Compare September 20, 2023 15:09

snicoll removed the status: waiting-for-feedback We need additional information before we can continue label Sep 27, 2023

jhoeller removed the status: waiting-for-triage An issue we've not yet triaged or decided on label Nov 23, 2023

jhoeller added this to the 6.x Backlog milestone Nov 23, 2023

jhoeller modified the milestones: 6.x Backlog, General Backlog Jun 18, 2024

haiderRizvii approved these changes Aug 29, 2024

View reviewed changes

sdeleuze self-assigned this Oct 17, 2024

sdeleuze added the status: waiting-for-internal-feedback An issue that needs input from a member or another Spring Team label Oct 21, 2024

sdeleuze mentioned this pull request Oct 29, 2024

Add batch operations support to DatabaseClient #33812

Open

sdeleuze closed this Oct 29, 2024

sdeleuze added status: superseded An issue that has been superseded by another and removed status: waiting-for-internal-feedback An issue that needs input from a member or another Spring Team labels Oct 29, 2024

jhoeller removed this from the General Backlog milestone Oct 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for batch operations (R2DBC) #27229

Add support for batch operations (R2DBC) #27229

MarkMarkyMarkus commented Jul 30, 2021

patrickhuy commented Oct 6, 2021

mp911de commented Oct 7, 2021

MarkMarkyMarkus commented Oct 8, 2021

Andro999b commented May 14, 2022

gabfssilva commented Oct 25, 2022

mp911de commented Oct 26, 2022 •

edited

Loading

bwhyman commented Jan 22, 2023

snicoll commented Aug 26, 2023

mp911de commented Aug 28, 2023

snicoll commented Sep 11, 2023

MarkMarkyMarkus commented Sep 11, 2023

jhoeller commented Nov 23, 2023

hantsy commented May 5, 2024 •

edited

Loading

sdeleuze commented Oct 17, 2024 •

edited

Loading

bwhyman commented Oct 19, 2024

sdeleuze commented Oct 28, 2024 •

edited

Loading

mp911de commented Oct 29, 2024

sdeleuze commented Oct 29, 2024

sdeleuze commented Oct 29, 2024 •

edited

Loading

Add support for batch operations (R2DBC) #27229

Add support for batch operations (R2DBC) #27229

Conversation

MarkMarkyMarkus commented Jul 30, 2021

patrickhuy commented Oct 6, 2021

mp911de commented Oct 7, 2021

MarkMarkyMarkus commented Oct 8, 2021

Andro999b commented May 14, 2022

gabfssilva commented Oct 25, 2022

mp911de commented Oct 26, 2022 • edited Loading

bwhyman commented Jan 22, 2023

snicoll commented Aug 26, 2023

mp911de commented Aug 28, 2023

snicoll commented Sep 11, 2023

MarkMarkyMarkus commented Sep 11, 2023

jhoeller commented Nov 23, 2023

hantsy commented May 5, 2024 • edited Loading

sdeleuze commented Oct 17, 2024 • edited Loading

bwhyman commented Oct 19, 2024

sdeleuze commented Oct 28, 2024 • edited Loading

mp911de commented Oct 29, 2024

sdeleuze commented Oct 29, 2024

sdeleuze commented Oct 29, 2024 • edited Loading

mp911de commented Oct 26, 2022 •

edited

Loading

hantsy commented May 5, 2024 •

edited

Loading

sdeleuze commented Oct 17, 2024 •

edited

Loading

sdeleuze commented Oct 28, 2024 •

edited

Loading

sdeleuze commented Oct 29, 2024 •

edited

Loading