Tuesday, May 24, 2016

SuperBulkCopy (SBCP) Walk-through

Recently, We've introduced a free new tool called SuperBulkCopy (short name: sbcp)



What it does is simple to explain, and yet pretty challenging to implement - it does everything we wanted BULK INSERT, BCP, or any existing flat file importer to do: Get the job done, as automatic as possible.

In this post I'll provide a walk-through, with many images. Video will also become available in the future but hopefully this segment can get you up to speed as to: "Why do I REALLY need this!"

Again, as mentioned above, this tool is absolutely free. So just get it and start playing!

Let's begin.

Part 1: The struggle


Here's the setup - I got a fairly big file from a client, and need to turn it into a SQL Table for further analysis.

I look at the file:


It's about 1.7gb big. Thinking - already let's have a look with Notepad:


Nah! too big. How about Notepad++?


Nope. The same thing.

OK, next step -- I do know (and we'll see this in a moment) that this file has mixed data types, and everything is wrapped with "quotes". Also - it has headers, which eliminates the ability for either BULK INSERT or BCP to handle it correctly.
So we'll do the next natural thing and jump right into the "Text Import Wizard", or use SSIS directly. (Same engine anyway)



Loaded the file, with the correct delimiters, as well as specified the Text Qualifier - A quick preview indicates we're almost good to go:



Mapping the data into a new table, after also using the "Suggest Data Types" feature




But - OH NO! The process has failed.
This can be caused by so many reasons, from data truncation, into type mismatch, headers not 100% matching the field structure and more.


Looking at the friendly logs:


Fun Times!

Now, I'm not saying that everything fails, and even the ones that do are in some cases solvable after a few iterations of trial & error.

But the question is - WHY BOTHER? Where's there is a free tool today that will do all the hard work for you?

Part 2: The Easy Way Out


Let's try to do that same thing with SuperBulkCopy.
It's very fast and easy to use, so I'll take the time showing a few different methods of getting the job done.

Let's go! (again)

Okay, after installing the tool, let's run it. I'm using the DoccoLabs launcher but you might as well simply run it directly from the start menu.


Okay - now it's loaded.
On Input Settings, I choose that same file. 



Now, even before going further - there's a quick "Preview" option, regardless of file size - it will show you a sample


Yep, Looks great!
Now, there's not a lot to do: Simply defining the Destination SQL Server connection. I'll even ignore the fine-tuning options and simply let it run.
(Note: for the demo I've only imported part of the file, which is another feature, but one can easily import the entire content)


So - completed. Let's go check the table in SSMS:



It's right there. Looking at the table definitions even shows that the best data-types were used:


Basically, we're done.
But I want to spend another minute of showing you how SuperBulkCopy can run just as easily as a console-app from command line.

After setting all the relevant input-output configuration for this demo, we can save the configuration for later use.

In our case, the configuration looks like this:

<?xml version="1.0" encoding="UTF-8" ?>
<config>
    <input>
    <Filename>c:\temp\text_samples\big_file_with_quoted_identifiers_20160301.tsv</Filename>
    <AutoDetectHeaders>True</AutoDetectHeaders>
    <AutoGuessDelimiters>True</AutoGuessDelimiters>
    </input>
    <output>
        <TableName>test1</TableName>
        <OptimizeDataTypes>True</OptimizeDataTypes>
        <ConnectionString>Data Source=127.0.0.1;Initial Catalog=tempdb;Integrated Security=True;</ConnectionString>
        <DropTableIfExists>True</DropTableIfExists>
    </output>
    <AllowAsyncDataWrite>True</AllowAsyncDataWrite>
</config>


And I can save it using the main menu.




Now - I can start using sbcp.exe
Of course, instead of generating the XML file, I could write it myself. Typing sbcp.exe /? provides all the essential information of how to build a configuration file.
Alternatively, everything *can* be done with simple command-line arguments but I find XML option to be more clear and easy to understand.

So, now let's open a command window and direct sbcp.exe to the configuration file:


And here are the results:


Basically - the exact same result as the UI version produces. Only here - the process can be fully automated. (for example, if such file arrives every day and needs to be turned into a table)

Furthermore - this can run directly from SSMS by using the SQLCMD mode. Here's how

Switch into SQLCMD Mode on the SSMS main menu:



Now, run the same as you just did from the command line. don't forget to add the !! indicator.


Part 3: Conclusions

In short, SuperBulkCopy can easily get into flat-file input into your SQL Server. It does a lot such as detecting headers, delimiters, data-types and more.
It's a free tool that can potentially turn this annoying task into a rather easy one.

You can get it from this line: http://www.doccolabs.com/products/superbulkcopy

---

Have anything to add? Let us know! (comments, twitter mentions, emails or whatever channel you like best)