Bulk insert in Dapper

Dapper is a simple object mapper, a nuget package that extends the IDbConnection interface. This powerful package come in handy when writing simple CRUD operations. The thing I struggle from time to time is handling big data with Dapper. When handling hundreds of thousands of objects at once brings a whole variety of performance problems you might run into. I’ll show you today how to handle many inserts with Dapper.

The problem

Let’s have a simple repository, that inserts users into DB. Table in DB will look like this:

Now let’s have a look at the code:

Very simple code, that takes user names and passes a collection of objects to Dapper extension method ExecuteAsync. This is a wonderful shortcut, that instead of one object, you can pass a collection and have this sql run for every object. No need to write a loop for that! But how this is done in Dapper? Lucky for us, Dapper code is open and available on GitHub. In SqlMapper.Async.cs on line 590 you will see:

There is a loop inside the code. Fine, nothing wrong with that… as long as you don’t need to work with big data. With this approach, you end up having a call to DB for every object in the list. We can do it better.

What if we could…

What if we could merge multiple insert sqls into one big sql? This brilliant idea gave me my colleague, Miron. Thanks, bro!:) So instead of having:

We can have:

The limit here is 1000, cause SQL server does not allow to set more values in one insert command. Code gets a bit more complicated, cause we need to create separate sqls for every 1000 users.

Lets compare!

Code is nice and tidy, but is it faster? To check it I uesd a local database and a simple users name generator. It’s just a random, 10 character string.

I tested this code for 100, 1000, 10k and 100k. Results surprised me.

The more users I added, the best performance gain I got. For 10k users it 42x and for 100k users it’s 48x improvement in performance. This is awesome!

It’s not safe

Immediately after posting this article, I got comments from you, that this code is not safe. Joining raw strings like that in a SQL statement is a major security flaw, cause it’s exposed for SQL injection. And that is something we need to take care of. So I came up with the code, that Nicholas Paldino suggested in his comment. I used DynamicParameters to pass values to my sql statement.

 

This code works fine, however it’s performance is comparable to regular approach. So it is not really a way to insert big amounts of data. An ideal way to go here is to use SQL Bulk Copy and forget about Dapper.

  All code posted here you can find on my GitHub: https://github.com/mikuam/Blog

I know that there is a commercial Dapper extension, that helps with bulk operations. You can have a look here. But wouldn’t it be nice, to have a free nuget package for it? What do you think?

8 thoughts on “Bulk insert in Dapper

  1. There are several problems with the approach your suggesting here. First off, your approach of building the sql statement puts the raw values directly into the statement, rather than using SQL parameters. This opens you up for SQL injection attacks.

    Second, while inserting 1000 rows at a time is better than one at a time, it’s still not as effective as using SQL’s native Bulk Copy functionality (1). The performance is extremely good with SQL Bulk Copy and is absolutely the fastest way to get massive amounts of data into the database. AFAIK, Dapper doesn’t expose anything over bulk copy, so you’ll just have to use a standard ADO.net approach (2).

    1 – https://docs.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlbulkcopy
    2 – https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/sql/single-bulk-copy-operations

  2. This is subject to SQL injection attacks.

    Instead of using string concatenation, look at using DyanmicParameters in dapper and manufacturing a parameter name (p1, p2, p3, etc) and assign values to that.

  3. Using String.Format() against String inputs… seems like this is a large opening for a SQL Injection attack.

    In our projects, if we’re doing bulk data with Dapper we use Stored Procedures and TVP’s.

  4. May I suggest using transaction (works fine with sqlite, haven’t tried SQL Server). Something like this:

    class PersonDto { public string Name { get; set; } public int Age { get; set; } }

    var records = new List();

    // loop to create list of records …

    var sql = @”insert into Person values (@name, @age)”;
    using (var transaction = connection.BeginTransaction())
    {
    connection.Execute(sql, records);
    transaction.Commit();
    }

    1. Good idea,
      Having a transaction is always a good idea when doing multiple SQL statements together. Thanks!

Leave a Reply

Your email address will not be published. Required fields are marked *