Thursday, December 8, 2011

MERGE Statement MATCHED clause headache relief

Have you ever been using a MERGE statement and when you get the MATCHED clause you think to yourself, “This is easy, I’ll just script out an UPDATE statement and do a quick replace”? Only when you get around to it doing the quick replace it isn’t so quick.

A friend of mine(SB) once said “I work hard at working smart”. It is a great motto to work by especially when working in a technical role. So in the spirit of this motto, I decided to find a smarter way to do a quick replace for a MERGE statement’s MATCHED clause.

So here is our scenario: I’ve got a table with about 200 columns in it. I need to update about 180 of them from my staging table to my dimension table. When I script out an update statement for my table and put the column list in my MERGE statement I get something like this:

WHEN MATCHED THEN
UPDATE SET
[RegistrationDate] = <RegistrationDate, int,>
,[RegistrationDateKey] = <RegistrationDateKey, int,>
,[CreationDate] = <CreationDate, datetime,>
,[CreationDateKey] = <CreationDateKey, int,>
,[FirstName] = <FirstName, varchar(100),>
,[MiddleName] = <MiddleName, varchar(100),>
,[LastName] = <LastName, varchar(100),>
,[AddressLine1] = <AddressLine1, varchar(100),>
,[AddressLine2] = <AddressLine2, varchar(100),>
,[City] = <City, varchar(100),>
,[State] = <State, varchar(5),>
,[PostalCode] = <PostalCode, varchar(100),>
,[EMail] = <EMail, varchar(200),>
,[PhoneNumber] = <PhoneNumber, varchar(100),>
,[CustomDateTimeValueLastChangedDateTime] =
<CustomDateTimeValueLastChangedDateTime, datetime,>

,[CustomDateTimeName1] = <CustomDateTimeName1, varchar(100),>
,[CustomDateTimeValue1] = <CustomDateTimeValue1, datetime,>
,[CustomDateTimeName2] = <CustomDateTimeName2, varchar(100),>
,[CustomDateTimeValue2] = <CustomDateTimeValue2, datetime,>
,[CustomDateTimeName3] = <CustomDateTimeName3, varchar(100),>
,[CustomDateTimeValue3] = <CustomDateTimeValue3, datetime,>
,[CustomDateTimeName4] = <CustomDateTimeName4, varchar(100),>
,[CustomDateTimeValue4] = <CustomDateTimeValue4, datetime,>

,[CustomDateTimeName5] = <CustomDateTimeName5, varchar(100),>
......

Only I need it to look like this:

WHEN MATCHED THEN
UPDATE SET
[RegistrationDate] = Source.[RegistrationDate]
,[RegistrationDateKey] = Source.[RegistrationDateKey]
,[CreationDate] = Source.[CreationDate]
,[CreationDateKey] = Source.[CreationDateKey]
,[FirstName] = Source.[FirstName]
,[MiddleName] = Source.[MiddleName]
,[LastName] = Source.[LastName]
,[AddressLine1] = Source.[AddressLine1]
,[AddressLine2] = Source.[AddressLine2]
,[City] = Source.[City]
,[State] = Source.[State]
,[PostalCode] = Source.[PostalCode]
,[EMail] = Source.[EMail]
,[PhoneNumber] = Source.[PhoneNumber]
,[CustomDateTimeValueLastChangedDateTime] = Source.[CustomDateTimeValueLastChangedDateTime]
,[CustomDateTimeName1] = Source.[CustomDateTimeName1]
,[CustomDateTimeValue1] = Source.[CustomDateTimeValue1]
,[CustomDateTimeName2] = Source.[CustomDateTimeName2]
,[CustomDateTimeValue2] = Source.[CustomDateTimeValue2]
,[CustomDateTimeName3] = Source.[CustomDateTimeName3]
,[CustomDateTimeValue3] = Source.[CustomDateTimeValue3]
,[CustomDateTimeName4] = Source.[CustomDateTimeName4]
,[CustomDateTimeValue4] = Source.[CustomDateTimeValue4]

,[CustomDateTimeName5] = Source.[CustomDateTimeName5]

.....


I’ve been using regular expressions for a long time, but I recently discovered a little known feature called replacement expressions. Basically, it allows the replace to use a portion of what it found and put it back. This is great news!
So here is what I came up with:
In my Find What box I use the following: {\[[:a]+\]} = \<[:a, ()]+\>
In my Replace with box I use the following: \1 = Source.\1




I run my replace all and I instantly have the columns with the syntax I need. I hope this comes in handy for you; I know it sure came in handy for me.

Thursday, October 21, 2010

Database Mirroring monitor for ETL

BY Jim Bennett

Do you have databases that are mirrored that your ETL is populating?  Does the database mirror ever start to lag? 

If you answered yes to either of these questions then this is for you.  SQL Server has a handy table that is created when database mirroring is configured.  This table (msdb.dbo.dbm_monitor_data) allows you to see the status of your mirrored databases.  So how can we make use of this in ETL?  By using the code provided below, you can pause your ETL if database mirroring is behind and give SQL Server a chance to catch up before you start loading more data from your ETL process. 

The usp_CheckMirroringStatus stored procedure does two things for you:  1) it captures information about the status of the mirrored database and logs this information to a table (dbo.MirroringStatus).  This can help show you longer term trends about how mirroring is working on the server.  2) the stored procedure will not finish until the redo_queue falls below your threshold.  This essentially can give the SQL Server some breathing room before more data is introduced from your ETL. 

Overall this is a handy piece of logic that can be baked-in to your ETL process for those times when your server is not keeping up with your ETL process.

IF  EXISTS (SELECT 1 FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[MirroringStatus]') AND type in (N'U'))
DROP TABLE [dbo].[MirroringStatus]
GO

PRINT 'Create Table dbo.MirroringStatus'
GO
CREATE TABLE [dbo].[MirroringStatus](
       [Id] [int] IDENTITY(1,1) NOT NULL,
       database_name varchar(128),
       mirroring_state varchar(20),
       witness_status varchar(15),
       log_flush_rateKBs int,
       send_queue_sizeKB int,
       send_rateKBs int,
       redo_queue_sizeKB int,
       redo_rateKBs int,
       transaction_delayMS int,
       transactions_per_sec int,
       local_time datetime,
       end_of_log_lsn numeric(25,0),
       failover_lsn numeric(25,0)
 CONSTRAINT [PK_MirroringStatus] PRIMARY KEY CLUSTERED
(
       [Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON)
)
GO

IF  EXISTS (SELECT 1 FROM sys.procedures WHERE name = 'usp_CheckMirroringStatus')
       DROP PROCEDURE [dbo].[usp_CheckMirroringStatus]
GO

PRINT 'Create SP dbo.usp_CheckMirroringStatus'
GO
CREATE PROCEDURE [dbo].[usp_CheckMirroringStatus]
  @dbname varchar(128),
  @RedoQueueThresholdKB int = 1024,
  @WaitDelayTimeInSeconds int = 15
AS
/*
Procedure Name: usp_CheckMirroringStatus
Created By:     Jim Bennett – CapstoneBI
Created Date:   10/20/2010
Website:       
www.capstonebi.com
Purpose:        Allows database mirroring to be monitored and cause calling
    process to wait until mirroring is caught up.
*/

DECLARE @isFinished tinyint = 0;
DECLARE @CurrentRedoQueueSizeKB int;

CREATE TABLE #Result
       (
       database_id int,
       role int,
       mirroring_state int,
       witness_status int,
       log_flush_rate int,
       send_queue_size int,
       send_rate int,
       redo_queue_size int,
       redo_rate int,
       transaction_delay int,
       transactions_per_sec int,
       [time] datetime,
       end_of_log_lsn numeric(25,0),
       failover_lsn numeric(25,0),
       local_time datetime
       )

IF (
SELECT COUNT(*)
FROM sys.database_mirroring
WHERE database_id = DB_ID(@dbname)
AND mirroring_guid IS NOT NULL
) > 0
BEGIN
       --The database is being mirrored
       PRINT 'Database: ' + @dbname + ' is mirrored'

       WHILE (@isFinished = 0)
       BEGIN
              TRUNCATE TABLE #Result;

              PRINT 'Checking Redo Queue Size'
             INSERT INTO #Result
              SELECT TOP 1 *
              FROM msdb.dbo.dbm_monitor_data WITH (NOLOCK)
              WHERE database_id = DB_ID(@dbname)
              ORDER BY [time] DESC

              SELECT
                     @isFinished = CASE WHEN redo_queue_size <= @RedoQueueThresholdKB THEN 1 ELSE 0 END,
                     @CurrentRedoQueueSizeKB = redo_queue_size
              FROM #Result
              PRINT 'Current RedoQueue Size in KB: ' + CAST (@CurrentRedoQueueSizeKB as varchar(50))

              IF (@isFinished = 0)
                     WAITFOR DELAY @WaitDelayTimeInSeconds

              INSERT INTO dbo.MirroringStatus
                     (
                     [database_name],
                     mirroring_state,
                     witness_status,
                     log_flush_rateKBs,
                     send_queue_sizeKB,
                     send_rateKBs,
                     redo_queue_sizeKB,
                     redo_rateKBs,
                     transaction_delayMS,
                     transactions_per_sec,
                     local_time,
                     end_of_log_lsn,
                     failover_lsn
                     )
               SELECT
                     db_name(database_id),
                     CASE mirroring_state
                           WHEN 0 THEN 'Suspended'
                           WHEN 1 THEN 'Disconnected'
                           WHEN 2 THEN 'Synchronizing'
                           WHEN 3 THEN 'Pending Failover'
                           WHEN 4 THEN 'Synchronized'
                           ELSE 'Unknown'
                     END,
                     CASE witness_status
                           WHEN 0 THEN 'Unknown'
                           WHEN 1 THEN 'Connected'
                           WHEN 2 THEN 'Disconnected'
                            ELSE 'Unknown'
                     END,
                     log_flush_rate,
                     send_queue_size,
                     send_rate,
                     redo_queue_size,
                     redo_rate,
                     transaction_delay,
                     transactions_per_sec,
                     local_time,
                     end_of_log_lsn,
                     failover_lsn
               FROM #Result

       END

       DROP TABLE #Result

END
ELSE
       PRINT 'Database: ' + @dbname + ' is not mirrored'

 

GO

.