Archive for ‘T-SQL’

June 27, 2014

Cleaning up after disabling Azure SQL Data Sync

If you are a SQL Server enthusiast like me, you were probably excited when Microsoft introduced Azure SQL Data Sync. It allows you to setup synchronization between two databases with a simple point & click interface.

What I liked about it:

  • You can synchronize in multiple directions and your databases are available for use while synchronization is running (Options available: Bi-directional, Sync to the Hub or Sync from the Hub)
  • No need to change your model to accommodate any special requirements – it just works with SQL Azure compatible databases
  • For small databases you can be up and running within minutes

What I didn’t like:

  • Synchronization is tightly bound to the schema at the time you created your SQL Data Sync Group’s Sync Rules (I know, a mouthful, right?). Anytime you make schema changes you have to update your synchronization configuration. Look under “Sync Rules” for this. Don’t forget to Refresh your Schema every time you use this tool as it seems to cache your schema. Note that this will only replicate the schema for your tables. You have to manually synchronize other objects (views, stored procedures, functions, etc).
  • Synchronization is performed by an agent that runs periodically. The lowest frequency available is 5 minutes. Feature request: Allow for a continuous option to sync changes as they happen.
  • SQL Data Sync adds quite a few objects in your database (the purpose of this article). A new schema will be added to your database – DataSync. With supporting objects. Based on my experience, at least 11 stored procedures, 3 triggers and 1 type is created for every table in your database.
  • You can only synchronize within the same Azure account. I had a use case that necessitated doing this across two accounts (one in China and one in the US) and was not able to use this technology.

OK, so you tried it out and decided its not for you. Disabling it is simple – Delete the SQL Data Sync Group and you’re done, right?

Almost… Say you didn’t do that, but simply dropped one of the databases or for some reason the group deletion failed to fully cleanup your database. In either case, you’ll be left with quite a few objects in your database. In my testing, as of 6-27-2014, when you delete the sync group, you’re still left with 4 tables and the schema “DataSync” isn’t dropped, this is probably a bug that will be fixed eventually, but in the meantime below are some scripts to help you out.

This is how you can find all the objects created by DataSync:

-- Find all SQL Data Sync triggers
select name from sys.triggers where name like '%_dss_%_trigger';
-- Find all SQL Data Sync tables
select table_schema + '.' + table_name from information_schema.tables where table_schema='DataSync'
-- Find all SQL Data Sync procedures
select 'DataSync.' + name from sys.procedures where schema_id = SCHEMA_ID('DataSync')
-- Find all SQL Data Sync types
select name from sys.types where schema_id = SCHEMA_ID('DataSync')

Note that in the case of triggers, I am relying on the current naming convention. Be careful that this doesn’t match any of your triggers. If it does, exclude them by adding additional where criteria. The rest should be safe, assuming you haven’t created any of your own objects under the DataSync schema.

This script will generate the drop statements you need:

-- Hit CTRL-T for "Results to Text"
set nocount on
-- Generate drop scripts for all SQL Data Sync triggers
select 'drop trigger [' + name + ']' from sys.triggers where name like '%_dss_%_trigger';
-- Generate drop scripts for all SQL Data Sync tables
select 'drop table ' + table_schema + '.[' + table_name + ']' from information_schema.tables where table_schema='DataSync'
-- Generate drop scripts for all SQL Data Sync procedures
select 'drop procedure DataSync.[' + name + ']' from sys.procedures where schema_id = SCHEMA_ID('DataSync')
-- Generate drop scripts for all SQL Data Sync types
select 'drop type DataSync.[' + name + ']' from sys.types where schema_id = SCHEMA_ID('DataSync')
-- COPY output and run in a separate connection

Finally, drop the schema:

-- Once you run the drop scripts above, drop the schema
drop schema DataSync

You are done!

Here is the complete script: azure_sql_data_sync_clean.sql

January 25, 2011

Re-index All Tables in your SQL Database: Short & Dirty Script

I am not advocating that you use this in a production system, but if you’re looking to re-index all tables in your database (say on development or test), here is a short script for it:

exec sp_msforeachtable 'dbcc dbreindex(''?'', '''', 80)'

Note, the last parameter controls the fill factor, set it to 0 to re-use the last known fill factor or read more about it in Books Online. With sp_msforeachtable you’re looping through all tables in the database and executing the provided script. As you can see above you use a question mark (?) to substitute the table name. This is a decent alternative to manual looping through INFORMATION_SCHEMA.TABLES which is another way to get a list of all tables in your database.

If you were just looking to obtain a dbcc script you can run this command and it will generate the equivalent script that is run by the first command I showed:

	'dbcc dbreindex(''' + TABLE_NAME + ''', '''', 80)' 
January 3, 2011

T-SQL: Parsing data type xml

If you only occasionally dabble in XML and XPath utilizing the XML data type in SQL Server can be a bit daunting. I found it very useful in passing parameters to stored procedures containing object graphs. The same can be accomplished the table parameter if you only have a list of one type of objects, but if you need to pass an object graph of parent/child objects of different types, using XML will be your best bet.

Below I demonstrate how one could represent a list of Contact objects in an XML document and subsequently access the data via a select statement. the @Contacts parameter could be used as a parameter to a stored procedure if you’re running SQL 2005 or higher. Use your imagination for the rest.

declare @Contacts xml

set @Contacts = 

    ContactID = Contact.value('ContactID[1]', 'int'),
    Email = Contact.value('Email[1]', 'varchar(255)')
    @Contacts.nodes('root/Contact') as Contacts([Contact])
Tags: , ,
November 26, 2010

T-SQL coalescing column values across rows

Recently, I came across a requirement to take a resultset containing NULL fields and fill a column’s NULL value with the last known non-null value of the given column. This is best illustrated via an example.

If we have the results:



The desired output would be:


After testing a few methodologies, the fastest solution utilized an update with a variable.

Suppose, we have this table definition:

declare @mytable table (n int)

The statement to “fill the gaps” would be:

-- this declaration should match the column type containing NULLs
declare @n int
set @n = null -- by default it will be NULL, but this is a good practice
update @mytable 
     set @n = coalesce(n, @n), n = coalesce(n, @n)

To explain, the above statement assigns a new value to the variable @n if the current row’s “n” column is not null. Otherwise it assigns the value of @n. Next it will assign the value of @n to column “n”, IF column “n” is null. This solution will scan every row (in the order of insertion into your table or based on a clustered index if one exists) and replace NULL values in the column of your choice with the last non-NULL value in the sequence of values.

Review the definition of COALESCE in Books Online if you’re not clear.

Tags: ,