Tuesday, December 9, 2014

Introducing LABS, our app playground.

A bit about the "LABS" part in Docco labs


So, in addition to our well-matured products, we keep creating new apps, usually built around needs we come across in our daily tasks.
While these apps are usually not shared publicly, I've decided to introduce (some of) them to the public - in order to expose them to a wider audience (such as yourself), let you use them freely and, hopefully, get some feedback.

While far from complete (consider that a one-sentence disclaimer), all of these apps will be available for download upon request (in the first stage) and widely available later.

OK, so, before we lose our TL;DR readers, let's jump right to it, shall we?
(You can start by reading the taglines, and then drill down to whatever interests you)



*** Validate SSDT Publish Script

Tagline: Make sure your SSDT deployment scripts are not locking down your databases

For those of you using SQL Server Data Tools (a.k.a. SSDT) for continuous delivery, you may have encountered cases where a generated deploy script tries to drop and recreate multiple tables.
Why does this happen? Well, a typical scenario would be a developer trying to add a new column to an existing table, but instead of adding it last, he or she adds the column in the middle of the table.
What happens then?
Well, since there is no T-SQL syntax for inserting a new column in the middle of a table, the generated script will do the following (a simplified sketch follows the list below):

  • create a tmp_xxxx table with the new structure
  • copy over all content from the original table
  • drop the original table
  • sp_rename the tmp table back to the original table name
  • restore all FKs, indexes, etc.
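For illustration, here's a rough sketch of the pattern such a generated script follows (a simplified, hand-written example with made-up object names - real SSDT output adds transactions, error checks and data-motion guards):

BEGIN TRANSACTION;

-- 1. Create a temporary table with the new column order
CREATE TABLE [dbo].[tmp_xxxx_Orders] (
    [OrderId]    INT NOT NULL,
    [NewColumn]  INT NULL,      -- the column that was added "in the middle"
    [CustomerId] INT NOT NULL
);

-- 2. Copy over all content from the original table
INSERT INTO [dbo].[tmp_xxxx_Orders] ([OrderId], [CustomerId])
SELECT [OrderId], [CustomerId] FROM [dbo].[Orders];

-- 3. Drop the original table
DROP TABLE [dbo].[Orders];

-- 4. Rename the temporary table to the original name
EXECUTE sp_rename N'[dbo].[tmp_xxxx_Orders]', N'Orders';

-- 5. Restore constraints, indexes, etc.
ALTER TABLE [dbo].[Orders]
    ADD CONSTRAINT [PK_Orders] PRIMARY KEY CLUSTERED ([OrderId]);

COMMIT TRANSACTION;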
And that's the *SIMPLE* case. When the table has a lot of references, other tables may be affected as well! And if these are large tables and you're in prod, you can only imagine the potential downtime that deployment could cause.

So, that's the main reason the validation app was born. Here's what it does:


  • Analyze the SSDT-generated .sql script of any build
  • Detect table re-creations
  • (Now comes the cool part)
  • Connect to the destination database and measure the size of each dropped/re-created table (see the sketch below)
  • If the total size is over a predefined threshold - raise a flag and return a 'fail' notice on the build.
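The size check itself boils down to a simple query against the destination database. A minimal sketch of the idea (the app's actual query and threshold logic may differ, and the table name here is just an example):

-- Approximate reserved size (in MB) of a table the deploy script is about to drop/re-create
SELECT SUM(ps.reserved_page_count) * 8 / 1024.0 AS ReservedMB
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID(N'dbo.Orders');  -- table name detected from the script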
So, to summarize, this 'lab' app will save you from unintentionally running deployments that could lock your database for quite a long time.


This is a console app with powerful configuration options, allowing you to customize many aspects such as script locations, the destination database connection string, the size threshold and more.

Here's a sample screenshot:


*** Splunk to SQL Table


Tagline: Turn (almost) any Splunk query into a well-formed SQL Table


If you are unfamiliar with Splunk, it is (wiki quote):
"Splunk (the product) captures, indexes and correlates real-time data in a searchable repository from which it can generate graphs, reports, alerts, dashboards and visualizations"

While Splunk does have export abilities, getting a Splunk query result into a SQL Table is not an easy task. At least if you're not using "Splunk to SQL Table"! :)
So, what this app does is basically the following (assume everything is heavily configurable):

  • Connects to a Splunk server and runs a pre-defined Splunk query
  • Analyzes the result and converts it to a SQL Table with optimal data types (based on content)
    • (So, for example, that OrderId will be "INT" rather than "NVARCHAR(MAX)" - see the example below)
  • Connects to a configured SQL instance
  • Creates the table and pushes the content in.
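To make the data-type point concrete, here is a hypothetical end result for a Splunk query returning OrderId, Status and EventTime fields (actual table/column names and types depend on your query and configuration):

-- Hypothetical table generated from a Splunk query result
CREATE TABLE [dbo].[SplunkOrders] (
    [OrderId]   INT           NOT NULL,  -- values are numeric, so INT rather than NVARCHAR(MAX)
    [Status]    NVARCHAR(50)  NULL,      -- short text values, sized accordingly
    [EventTime] DATETIME2     NULL       -- recognized as a date/time value
);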
Some screenshots? Sure!

Splunk search screen:

+

The App (with relevant configuration)

=

A new SQL Table



*** Text/CSV file slicer


Tagline: A powerful slicer for large text/csv/tsv files

Whenever you need to split a big "something"-delimited file, this powerful tool will do the job, allowing many different customizations such as:
- Control the naming patterns of the output files (identities, GUIDs, random strings and more)
- Repeat headers in each file
- Control the end-of-line characters (Windows/Unix compatibility)

While you may find other tools out there, this one was written to provide a more robust set of options. Oh, and it's also *REALLY* fast!


The screenshot shows the simple UI implementation; the tool itself is basically a console app.



Monday, October 20, 2014

Turn JSON into a SQL Table

One of the new features introduced in the latest version of ClipTable is the ability to parse (most) JSON data (as long as it can be converted to tabular format).

While ClipTable's main purpose is to turn any textual data into a SQL Table (either from your clipboard or a flat file), we've added two 'other' formats it can read. The first was XML, and now, quite naturally - JSON.

Want to see how it works? Here's a quick example:

Run ClipTable

Copy the following (JSON) text to your clipboard:

[
 {
  "color": "red",
  "value": "#f00"
 },
 {
  "color": "green",
  "value": "#0f0"
 },
 {
  "color": "blue",
  "value": "#00f"
 },
 {
  "color": "cyan",
  "value": "#0ff"
 },
 {
  "color": "magenta",
  "value": "#f0f"
 },
 {
  "color": "yellow",
  "value": "#ff0"
 },
 {
  "color": "black",
  "value": "#000"
 }
]

Now, Paste!

You should be getting this:



And if you don't know ClipTable yet - the distance from this point to actually having a SQL Table with the above data is about two clicks.
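For reference, the end result should be roughly equivalent to the following (the exact table name and column sizes depend on your settings):

-- Approximate result of importing the JSON sample above
CREATE TABLE [dbo].[Colors] (
    [color] NVARCHAR(10) NULL,
    [value] NVARCHAR(10) NULL
);

INSERT INTO [dbo].[Colors] ([color], [value])
VALUES (N'red', N'#f00'), (N'green', N'#0f0'), (N'blue', N'#00f'), (N'cyan', N'#0ff'),
       (N'magenta', N'#f0f'), (N'yellow', N'#ff0'), (N'black', N'#000');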


Feel free to try it, and challenge it with other tabular clipboard content.

ClipTable - Data import made easy! (Real easy!)

Tuesday, September 9, 2014

Your new data Import-Export power tools


If you're working with databases, you are obviously no stranger to the endless need for getting data in and out of the database. And not only the traditional inserts and selects, but rather importing from/exporting to "foreign" sources.

Two of our tools cover exactly that area and can pretty much solve any import/export scenario you may come across!

How?

Well, the first tool, SqlTableZip, can export data from any query (a table, a stored procedure execution result, free T-SQL) into a compressed file - which includes the query text, headers, field types and, of course, the data - highly compressed!





This (.sqz) file can be moved around and restored anywhere, supporting multiple SQL Server versions, including Azure (so, in fact, it is also a very easy way to get data in & out between Azure and local SQL instances).
DBAs sending production data to developers (who don't have access), field engineers bringing offline data for analysis/support purposes, or even a simple table-level backup -- these are all valid scenarios that SqlTableZip can take care of; easily, quickly and with zero hassle.
There are no size limits. You can back up huge amounts of data and everything runs super fast.

So, getting data in & out of SQL Server is covered, but what about data that comes from foreign sources?

If your source is a table coming from anything like HTML-formatted tables, XML, Excel, PDF, Word or pretty much any other tabular format that needs to be turned into a relational-database table: ClipTable is here to help!
Instead of supporting each of the formats above (and many more), we took a different approach and built a smart algorithm that can analyze the content you copy into your clipboard.



So the process becomes very simple - you select the desired content, paste it into ClipTable and get a new table ready to be created.
ClipTable will determine the correct format, the best field data types, column names and the correct row/column delimiters. All of this is done automatically, and all that is left for you, the user, is to decide where you would like the new table to be created.

As a bonus, ClipTable supports most of the popular relational databases out there today - so you can create your new table(s) in SQL Server, Oracle, MySQL, PostgreSQL and Firebird.

So, to summarize: if data import/export is part of your routine, this is definitely worth trying.
To learn more and of course get the software, visit our website @ www.doccolabs.com



Tuesday, September 2, 2014

ClipTable to support Firebird Database


Calling out all Firebird Database users!


ClipTable, our revolutionary *anything* to SQL Table importer, is now offering Firebird support!


What does this mean?

If you are new to ClipTable, it is a robust application which lets you copy any tabular data to your clipboard (yes, Ctrl-C) and paste it into ClipTable, which analyzes the input and creates an optimal SQL Table with all the data.
This replaces the need to manually modify the data in order to insert it into a table.
ClipTable's engine can detect the most suitable data types, sizes, headers, delimiters and everything else involved in the table creation.
Firebird now joins the already-supported SQL Server, Oracle, MySQL and PostgreSQL.

What's required?

In order to work with Firebird, you'll need to install Firebird's ODBC drivers from this link: http://www.firebirdsql.org/en/odbc-driver/
Also - ClipTable runs on Windows and requires .Net Framework 4.5 (included in Windows Update, but just in case - http://www.microsoft.com/en-us/download/details.aspx?id=30653)

Known Issues

There are currently no known issues reported. If you encounter one, please contact us at support@doccolabs.com


Get the latest version here - http://doccolabs.com/products/cliptable

Monday, August 25, 2014

ClipTable on Steroids!


A new version of ClipTable was released earlier today, and among other changes it contains a very important improvement - the clipboard/file parsing engine is now 10x faster (!)

This, along with the native 64-bit support, now enables you to import large objects of any kind into your SQL Server.

For those of you who are still not familiar with ClipTable - it is an application that lets you copy any tabular data from any application (for example - Word, Excel, Acrobat, web sites and more) and turn it into an optimized SQL Table in seconds!
There is no need to deal with field types, sizes, delimiters or any other hassle along the way.

You can download, learn more and get to know our other products at www.doccolabs.com


Thursday, August 7, 2014

SQL Server - Coding guidelines

(Note - this is the actual article from the last post, rather than a link to my LinkedIn article)

Let's talk about best practices around writing SQL Server code, specifically in a project (which means - not just an ad-hoc T-SQL script).

This post goes through various SQL (Server) code design guidelines and best-practices.

Following these guidelines, we are able to achieve:
- Cleaner, more readable code
- Getting the most out of Database Projects (SSDT)
- Keeping a standard, which makes our SQL Code clear and easily readable across different teams/organizations (by following a similar pattern)

Background:

Existing resources on database design usually focus on relational tables, keys, relationships, indexes, performance and other topics which share one thing in common: schema & data.
While this is probably the most important aspect of a relational database, most of today's SQL databases also contain code which, in many cases, lacks decent design and best-practice rules, unlike their data-related objects.
Over the years, in various SQL-related roles such as developer, dev manager, architect and consultant, I have come across numerous SQL projects and was often surprised by the lack of a common interface and by the different variations each company/team used to write its SQL code.
These are the main reasons for forming this document.

Using:

We are using Visual Studio SQL Database Project (a.k.a. SSDT).
Although many of the guidelines can be used in a different environment, SSDT (at the time of writing this document) is the most complete SQL project container, compiler and publisher.

Samples:

Many of the examples given in this article are taken from our sample SQL Server Database Project: UserManagementSampleDatabase
You can download the sample project HERE
Project structure:
There are two different methods you can use for your project:
- Schema First
- Object Type First
When creating a new empty database project and choosing to import from an existing database, Visual Studio will use the Schema-First structure (see screenshot).
Visual Studio Project Structure
Our sample project uses the Object-Type-First structure, as we believe it is easier and more readable to have all tables under the same branch (for example).
Having said that, this is not a strong recommendation.

Our Project Structure

Coding guidelines:

General concept:
Your SQL Server stored procedure code is not a set of ad-hoc scripting workarounds; it is code, part of your enterprise applications, and should be treated as such - rather than as an ad-hoc scripting language.

Covered objects:

All coding guidelines can be applied to SQL stored procedures, functions, triggers or any other SQL language script.
This excludes .Net CLR objects, which are written in different languages (C# / VB.net) and to which that language's coding guidelines should be applied.

Headers

The reason for a descriptive header is to provide some generic information which can be easily viewed on the server itself.
While the true source of record for all changes is usually your source control, it is important to understand that the database code is often accessed on premise, by users who do not have quick access to the source control.
Seeing a brief explanation and change log may help these users quickly understand the purpose of the procedure.
A typical example is when DBAs connect to your production SQL Server in order to troubleshoot a recent issue. They typically do not have access to the developers' source control; however, while going through the procedures involved in the incident and seeing a quick change log, they are usually able to easily pinpoint the problem (a classic "what's different" investigation approach).
A suggested header structure is:
  1. Object drop/recreate, which cannot run within SSDT but is good to keep for whenever the procedure is executed manually,
    followed by a simple one-line procedure overview.
  2. A change log which contains the "when", the "who" and the "what" of the change
  3. A sample, which may be the same test script you later use for the unit-testing project
Here is a sample header for a stored procedure:
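(The following is a minimal sketch that follows the three-part structure above; the procedure and parameter names are illustrative, and the exact format in the sample project may differ.)

/*
    Drop/recreate (cannot run within SSDT; kept for manual executions):
        IF OBJECT_ID(N'UserManagement.UserDetailsSave') IS NOT NULL
            DROP PROCEDURE UserManagement.UserDetailsSave;

    Overview: Inserts a new user or updates an existing one.

    Change log:
        2014-08-07  John Doe     Initial version
        2014-09-01  Jane Smith   Added phone number handling

    Sample (same script used by the unit-testing project):
        EXEC UserManagement.UserDetailsSave @UserId = 1, @UserName = N'Demo User';
*/
CREATE PROCEDURE UserManagement.UserDetailsSave
    @UserId   INT,
    @UserName NVARCHAR(100)
AS
BEGIN
    SET NOCOUNT ON;
    -- ... procedure body ...
END;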
Note: More samples can be found in the sample database project.

(Over) Sizing

As a generic guideline, a stored procedure should not exceed ~50 lines (excluding the header).
Long procedures are typically harder to read & understand and in most cases can, and should, be refactored into multiple smaller procedures. This is also relevant from a usability point of view.
Try to keep your code as small and atomic as possible. When a procedure exceeds the recommended size, consider refactoring and splitting logic into child procedures.

Naming Conventions - Do’s

Consider using schemas as a limited (yet better than nothing) solution for namespaces.
Although SQL Server can't handle nested namespaces such as App.UserManagement.Users.Save(), we can still create a UserManagement schema under which we place all the user-management-related objects.
This is both more readable and, in addition, easier for IntelliSense interpretation.
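A minimal sketch of the idea (object names follow the sample project's UserManagement schema; the table and procedure body are illustrative):

-- User-management objects grouped under a dedicated schema
CREATE SCHEMA UserManagement;
GO

CREATE TABLE UserManagement.Users (
    UserId   INT           NOT NULL PRIMARY KEY,
    UserName NVARCHAR(100) NOT NULL
);
GO

CREATE PROCEDURE UserManagement.UserDetailsSave
    @UserId   INT,
    @UserName NVARCHAR(100)
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE UserManagement.Users SET UserName = @UserName WHERE UserId = @UserId;
    IF @@ROWCOUNT = 0
        INSERT INTO UserManagement.Users (UserId, UserName) VALUES (@UserId, @UserName);
END;
GO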

Naming Conventions - Don’ts

Following the above guidelines, do not name your procedures with a prefix 'indicating' that they are indeed procedures (or functions).
Being more specific, do not use [sp_*] and/or [fn_*] which, in addition to being a poor naming convention, can also affect performance (sp_, in particular, is a reserved prefix which causes SQL Server to look for a matching object in the master database on every execution call).

User-Defined table types (UDTT)

We recommend using a UDTT as an input parameter in the following cases (a minimal sketch follows this list):
- Input may contain multiple rows
  • (use instead of either XML or a delimited string)
- Procedures with a large number of input variables
- When a set of the input variables is passed to another procedure
  • In which case it is better to pass the entire @udtt rather than calling a child procedure with multiple input variables (again, mostly from a readability point of view)
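A minimal sketch of a UDTT input parameter (the type and procedure names are illustrative, and the UserManagement.Users table is assumed to exist):

-- A user-defined table type used as a multi-row input parameter
CREATE TYPE UserManagement.UserIdList AS TABLE (
    UserId INT NOT NULL PRIMARY KEY
);
GO

CREATE PROCEDURE UserManagement.UsersGetDetails
    @Users UserManagement.UserIdList READONLY  -- UDTT parameters must be READONLY
AS
BEGIN
    SET NOCOUNT ON;

    SELECT u.UserId, u.UserName
    FROM UserManagement.Users AS u
    JOIN @Users AS ids ON ids.UserId = u.UserId;
END;
GO

-- Caller side: fill the UDTT and pass it as a whole (instead of XML or a delimited string)
DECLARE @Users UserManagement.UserIdList;
INSERT INTO @Users (UserId) VALUES (1), (2), (3);
EXEC UserManagement.UsersGetDetails @Users = @Users;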

#Temp Tables

Temp tables can be used to pass data between stored procedures executed within the same session.
While this can be a rather simple way to pass data, the general recommendation is to pass the data as a UDTT rather than *assume* a certain temporary table exists and reference it in a stored procedure which was not responsible for creating it.
Note: this guideline does not refer to the usage of UDTTs vs. #temp tables in queries, where in many cases the choice leans towards temp tables, which may carry better execution plans.

Private/Public (Accessibility Levels)

The concept is to let the user (i.e. the procedure executor) know which stored procedures should be called, vs. which stored procedures are considered 'child' or 'internal' procedures that are executed through the 'public' procedures.
In common programming languages we use the accessibility levels of a class, which are designed for exactly that.
However, the closest thing to accessibility levels in SQL Server would be roles/permissions, which is somewhat problematic to use, since when a procedure is executed by a user, the permissions for any 'child' execution are inherited.
With that in mind, there are two recommended approaches:
- Begin 'public' procedures/functions with a capital letter (PascalCase), and 'private' ones with a lowercase letter (camelCase).
(ex: UserManagement.UserDetailsSave for public & UserManagement.userPhoneNumberSave for private)
(This example appears in our sample database)
- Create two schemas for public/private. (ex: UserManagement.UserDetailsSave for public & userManagementPrivate.userPhoneNumberSave for private)
It is important to note that both options do not serve as restrictions (i.e. they do not prevent the user from executing anything) but rather display a clear picture of what should be executed.
To physically prevent a user from executing a procedure, refer to roles/permissions in SQL Server.

Backward Compatibility / Versioning

Stored procedure versioning (i.e. creating multiple instances of the same procedure whenever the signature changes, to ensure backward compatibility) is not recommended.
When we need to change a method we do not typically clone it, unless absolutely required, and the same logic should apply here.
In many cases, writing correct code against our database prevents backward compatibility issues in the first place. This may include:
- Calling a procedure using @variable=value instead of relying on variable position within the procedure (see the example below)
- Allowing a more flexible returned DataTable - meaning that if a new/additional column is returned, the code doesn't break
- In case of @UDTT input variables, consider using a function that returns the UDTT structure. (An example is provided within the sample project)
And from the procedure side:
- Avoiding "SELECT *" as a return value (which is affected whenever one of the participating tables changes)
- Avoiding OUTPUT @variables as return values
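For instance, a call using named variables keeps working even if parameters (with defaults) are later added to, or reordered within, the procedure (procedure and parameter names are illustrative):

-- Position-dependent: breaks, or silently misbehaves, if the parameter order ever changes
EXEC UserManagement.UserDetailsSave 1, N'Demo User';

-- Name-dependent: unaffected by parameter order and by new parameters that have defaults
EXEC UserManagement.UserDetailsSave @UserId = 1, @UserName = N'Demo User';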

Keywords

Schema & object names using reserved keywords should be avoided.
When absolutely necessary, reserved names must be wrapped in [square brackets], even when no error occurs otherwise.
(Bad) Example: CREATE PROCEDURE [Version].[Procedure]

Inline Comments

In addition to the object header, here are some guidelines around inline comments:
- Use them to briefly describe a part of the code that may not be easily understood
-- In case @UserId is still null, try to search the Inactive Users table
IF @UserId IS NULL...
- Use "--" for a single-line comment and /* … */ for a block comment
- Don't "over-comment" the obvious
e.g.
-- We're going to select UserId and UserName from our Users table
SELECT UserId, UserName FROM Users

Logging

At the time of writing this document, much of the "logging" and/or "debugging" inside stored procedures is done using "PRINT".
This is an incorrect usage that basically misses both purposes.
It is highly recommended to create a simple logging table for logical messages.
There are pre-existing logging platforms such as Log-Table Viewer (link). A logging table/sproc is also provided within the sample project.
If you do need an immediate console return - use RAISERROR (WITH NOWAIT) instead of PRINT.
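A minimal sketch of both ideas (the logging table here is illustrative; the table/sproc shipped with the sample project may look different):

-- A simple logging table for logical messages
CREATE TABLE dbo.ProcedureLog (
    LogId    INT IDENTITY(1, 1) PRIMARY KEY,
    LoggedAt DATETIME2     NOT NULL DEFAULT SYSUTCDATETIME(),
    Source   SYSNAME       NOT NULL,
    Message  NVARCHAR(MAX) NOT NULL
);
GO

-- Inside a procedure: log a logical message to the table...
INSERT INTO dbo.ProcedureLog (Source, Message)
VALUES (N'UserManagement.UserDetailsSave', N'Nightly cleanup started');

-- ...and when an immediate console message is really needed, prefer RAISERROR over PRINT
RAISERROR (N'Step 1 of 3 completed', 0, 1) WITH NOWAIT;  -- severity 0 = informational, flushed immediately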

Returning Results

There are quite a few approaches for getting a result back from an executed stored procedure. These include:
- Using the RETURN exit code
- Using OUTPUT @variables
- Returning a table (or tables)
The high-level recommendation is to return a table. (A DataTable or DataSet on the caller side.)
Using OUTPUT variables may also be a valid option but can be a bit tricky from a backward compatibility point of view.
The reason is that output variables always need to be last, which means that if the procedure needs new variables in a newer version, the output variable(s) get pushed further down.
As long as all the code against this procedure does not rely on the order (i.e. it is populated by @variable=value), you are safe. Otherwise, consider either returning a table and/or refactoring your client code.
Note that not all stored procedures need to return anything (for example, insisting on having the last line as "RETURN 0"). In many cases, executing without an exception being thrown means a successful execution.

Error handling

Following both the output & logging paragraphs above, we recommend the following guidelines for handling errors (a sketch follows this list):
- To handle errors, use "BEGIN TRY" / "BEGIN CATCH" / "THROW" (SQL Server 2012 and up)
- Only use error handling when an additional action is taken in the error handling section; otherwise, let the caller handle the exception
- Some errors may be reported using the logging mechanism described above. Keep in mind that if your error handling includes rolling back a transaction, you should log the error AFTER it is rolled back.
- Remember: returning the entire exception (either by not handling it or by using "THROW;" in the handling section) is usually more helpful to the calling app than formatting your own textual message.
- Using RAISERROR is not recommended
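Putting the above together, a typical pattern looks roughly like this (a sketch only; the logging table is the illustrative one from the Logging section):

BEGIN TRY
    BEGIN TRANSACTION;

    -- ... actual work ...

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Roll back first; logging inside an open, doomed transaction would be rolled back as well
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    INSERT INTO dbo.ProcedureLog (Source, Message)
    VALUES (N'UserManagement.UserDetailsSave', ERROR_MESSAGE());

    -- Re-throw the original exception so the caller gets the full error
    THROW;
END CATCH;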

References (project/.dacpacs)

SSDT provides easy (database) reference usage. A database reference can either be another project in the same solution, or any compiled .dacpac file.
A reference can be embedded as:
- Different Server
- Same Server, different database
- Same Server/Database
The choice of whether to use a project or a .dacpac reference is very similar to choosing between referencing a .dll vs. hosting a class in your .Net code.
In general, the recommendation is that if the reference does not directly and exclusively exist under the same project (in a logical sense) - use a .dacpac.
For example - many developers create their own "Common" SQL library for repeating tasks. Such a library should be referenced as a .dacpac from any using project.
You can find an example in our sample project, where LoggerSupport.dacpac is referred to (configured as 'same database').
Note: if you are using a .dacpac reference and actively work with Debug/Release configurations, you may need to change your project file manually so that it automatically points at the correct referred .dacpac according to the build configuration.
In our sample project, this means changing UserManagementSampleDatabase.sqlproj to the following:
- Current:
  • <HintPath>..\LoggerSupport\bin\Debug\LoggerSupport.dacpac</HintPath>
- Modified:
  • <HintPath>..\LoggerSupport\bin\$(Configuration)\LoggerSupport.dacpac</HintPath>

Unit-Testing

Visual Studio offers an integrated database unit-testing environment.
The sample project contains a unit-test sample. Some recommendations include:
- Recommended:
  • Create a separate class for each procedure (rather than containing all tests in a single class)
  • Name the class after the procedure name
  • Multiple tests for a specific procedure should be created under the same class
  • Test names should be declarative
- Alternative:
  • Create a class per logical group of tests, for example (test names colored):
    • User
      • UserDetailsSave
        • Add User
        • Update User
      • UserDetailsDelete
        • Delete existing user
        • Delete non-existing user
References: