This following is a simple, yet informative, exercise that provides insight into just how big a 64-bit integer truly is.

The maximum positive value for bigint is ((2^63)-1), and it requires 8 bytes of storage for each value. Using some very simple math we come up with the following

(Maximum positive bigint value) * (8 Bytes per value) = Total storage required to store all positive bigint values

((2^63)-1) * 8 = 73,786,976,294,838,206,456 Bytes
____________________________________________
73.7 EB (Exabyte’s)
65,536 PB (Petabytes)
67,108,864 TB (Terabytes)

That’s right, sixty-seven million Terabytes! This is without any additional data in the records, only a bigint ID column with all positive values. If you use negative values, then just double everything shown above.

Note: This same exercise, using the int (4 byte) data type, takes only 8 GB.

In many cases when using a CLR stored procedure I have needed to increase the memory allocated for CLR execution within the SQL Server instance. This is accomplished using a SQL Server startup option, as listed in this article on MSDN. The “-g” option is the one we are interested in for CLR, so be sure to read and understand what it does before making any changes. The default of 256MB is sufficient in some cases, but with memory intensive CLR procedures it may fall short of requirements. Below is a sample memory configuration I used for the system running the CLR GZIP compression procedure.

16GB Physical RAM
14,400MB Configured for SQL Server use (90% of total)
2,048MB Configured using “-g” startup option

Problem: Business needs to query and analyze the data from over 5 million XML files per week. There are over 100 XML file versions, none of which have a well documented schema. The task involves retrieving data from all XML elements in a file, and storing them in a flexible manner to allow for easy evaluation and profiling. Once the information has been scrutinized, a proper relational database schema and procedures can be produced.

Solution: Use CLR to indiscriminately shred the XML and insert it into Key-Value database tables. The downside of this type of architecture is a significant amount of work to ‘flatten’ the data into a (normal) results set for reporting purposes, as well as poor performance for these types of queries. This disadvantage is far outweighed by the ease of implementation, as well as the excellent data profiling and discovery capabilities of this design.

Part 1 – The C# Assembly
This procedure will take XML data from the XMLData parameter value, or by passing the full path to the XML file in the fileName variable and setting the doFileRead variable to true. A forward-only XMLReader is used to quickly shred the in-memory XML data and produce the elementFullPath and elementValue variables. The fileID is passed through to each call of insert_XMLKeyValuePair.

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Text;
using System.IO;
using System.IO.Compression;
using System.Xml;

public partial class ShredXML
{

    /// <summary>
    /// places XML data in memory stream
    /// shreds XML
    /// inserts Key-Value pair into database
    /// </summary>
    /// <remarks>
    /// http://developerspla.net/
    /// </remarks>
    /// <param name="fileID">Unique file identifier, by default is created by 'insert_XMLFile' procedure</param>
    /// <param name="fileName">name of the file, or full path to file if using 'doFileRead' option</param>
    /// <param name="XMLData">XML data to shred, or empty if using 'doFileRead' option</param>
    /// <param name="doFileRead">causes XML data to be read from the file specified in 'fileName' variable</param>

    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void clrsp_ShredXML(SqlInt32 fileID, SqlString fileName, SqlString XMLData, SqlBoolean doFileRead)
    {
        Stream memStream;

        //read XML data from file, or get from XMLData variable
        if (doFileRead == true)
        {
            StreamReader sr = new StreamReader(fileName.ToString(), ASCIIEncoding.Default);
            memStream = new MemoryStream(ASCIIEncoding.Default.GetBytes(sr.ReadToEnd()));
        }
        else
        {
            memStream = new MemoryStream(ASCIIEncoding.Default.GetBytes(XMLData.ToString()));
        }

        //create XMLReader
        XmlTextReader xmlReader = new XmlTextReader(memStream);

        //array used to hold individual elements of current XML path
        string[] xmlNodes = new string[10];
        string nodePath = "";
        string nodeName = "";
        string nodeValue = "";

        //use context connection for CLR procedure
        using (SqlConnection cnn = new SqlConnection("context connection = true"))
        {

            //create SQL command and parameters
            SqlCommand cmd = new SqlCommand("insert_XMLKeyValuePair");
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.Add(new SqlParameter("@fileID", SqlDbType.Int));
            cmd.Parameters.Add(new SqlParameter("@elementFullPath", SqlDbType.VarChar, 255));
            cmd.Parameters.Add(new SqlParameter("@elementValue", SqlDbType.Variant));

            //open SQL connection
            cnn.Open();

            //read XML data using XMLReader
            while (xmlReader.Read())
            {
                //get XML node type
                XmlNodeType nType = xmlReader.NodeType;

                //process element path
                if (nType == XmlNodeType.Element)
                {
                    nodeName = xmlReader.Name.ToString();
                    xmlNodes[xmlReader.Depth] = nodeName;

                }

                //process value data and insert sql record
                if (nType == XmlNodeType.Text)
                {
                    nodePath = "";
                    for (int i = 2; i < xmlReader.Depth; i++)
                    {
                        nodePath += "<" + xmlNodes[i] + ">";
                    }

                    nodeValue = xmlReader.Value.ToString();

                    cmd.Parameters["@fileID"].Value = fileID;
                    cmd.Parameters["@elementFullPath"].Value = nodePath.ToString();
                    cmd.Parameters["@elementValue"].Value = nodeValue;
                    cmd.Connection = cnn;

                    cmd.ExecuteNonQuery();
                }
            }
        }
    }
}

Part 2 – Register Assembly and CLR Stored Procedure

USE [Sandbox]
GO

ALTER DATABASE [Sandbox]
SET TRUSTWORTHY ON
GO

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[clrsp_ShredXML]') AND type in (N'P', N'PC'))
DROP PROCEDURE [dbo].[clrsp_ShredXML]
GO

IF  EXISTS (SELECT * FROM sys.assemblies asms WHERE asms.name = N'ShredXML')
DROP ASSEMBLY [ShredXML]
GO

CREATE ASSEMBLY [ShredXML] FROM 'E:\DATA\Users\Mike\My Projects\ShredXML\ShredXML\bin\Release\shredxml.dll'
WITH PERMISSION_SET = EXTERNAL_ACCESS
GO

CREATE PROCEDURE [clrsp_ShredXML]
	@fileID int
	,@fileName nvarchar (255)
	,@XMLData nvarchar(MAX)
	,@doFileRead bit
AS EXTERNAL NAME [ShredXML].[ShredXML].[clrsp_ShredXML]
GO

Part 3 – Create SQL Server Database Objects

--XMLElements
CREATE TABLE [dbo].[XmlElements] (
	[ElementID] [int] IDENTITY (1, 1) NOT NULL
	,[ElementName] [varchar] (255) NOT NULL
	,[DateCreated] [datetime] NOT NULL
) ON [PRIMARY]
GO

--XMLElementValues
CREATE TABLE [dbo].[XmlElementValues] (
	[FileID] [int] NOT NULL
	,[ElementID] [int] NOT NULL
	,[ElementValue] [sql_variant] NOT NULL
	,[DateCreated] [datetime] NULL
) ON [PRIMARY]
GO

--XMLFiles
CREATE TABLE [dbo].[XMLFiles](
	[FileID] [int] IDENTITY(1,1) NOT NULL
	,[FileName] varchar (255)
	,[DateCreated] [datetime] NOT NULL
) ON [PRIMARY]
GO

--insert_XMLFile
CREATE PROCEDURE [dbo].[insert_XMLFile]
	@XMLData varchar(max) = NULL
	,@fileName varchar(255) = NULL
	,@doFileRead bit = 0
AS
BEGIN

	DECLARE @fileID int

	INSERT	XMLFiles
			(FileName
			,DateCreated)
	SELECT	@fileName
	        ,GETDATE()

	SET @fileID = SCOPE_IDENTITY()

	EXEC clrsp_ShredXML
			@fileID
			,@fileName
			,@XMLData
			,@doFileRead

END
GO

--insert_XMLKeyValuePair
CREATE PROCEDURE [dbo].[insert_XMLKeyValuePair]
(
	@fileID int
	,@elementFullPath	varchar(255)
    ,@elementValue      sql_variant

)
AS

BEGIN
	DECLARE	@elementID int

CheckElementID:
	SET @elementID=
		(SELECT	TOP(1) ElementID
		FROM	dbo.XmlElements XE WITH (NOLOCK)
		WHERE	XE.ElementName = @elementFullPath)

	/* Check if the element already exist in XmlElements table */
	If (@elementID Is Null)
	BEGIN 

		INSERT	dbo.XmlElements
				(ElementName
				,DateCreated)
		SELECT	@elementFullPath
		        ,GETDATE()

		GOTO	CheckElementID

	END

	INSERT	XmlElementValues
			(FileID
			,ElementID
			,ElementValue
			,DateCreated)
	SELECT	@fileID
			,@elementID
			,@elementValue
			,GETDATE()

END

Kills all user processes (SPIDs) for a database, waits 5 seconds, then returns any remaining running processes.

--SQL Server process(SPID) killer
--Works best with 'Results to Text' option
--Be sure to set the database name
--http://developerspla.net/2010/tsql/sql-server-kill-user-processes/

--For result display purposes
SET NOCOUNT ON
GO

DECLARE @spid VARCHAR(10)
        ,@database VARCHAR(100)

/*****BE SURE TO SET THIS********/
SET     @database = 'sandbox'
--SET @database = DB_NAME()
/********************************/

--Check that database exists
IF NOT EXISTS (SELECT * FROM [master].[dbo].[sysdatabases] sd WHERE sd.[name] LIKE @database)
BEGIN
    PRINT   'Database does not exist'
    GOTO Finished
END

PRINT   'Current processes for database ''' + @database + ''''
PRINT   ''

--Get current user processes for selected database
SELECT  [session_id]    = es.[session_id]
        ,[login_name]   = es.[login_name]
        ,[status]       = es.[status]
FROM    sys.dm_exec_sessions es
JOIN    sys.dm_exec_requests er
    ON  es.security_id = er.session_id
    AND er.database_id = DB_ID(@database)
WHERE   es.[session_id] <> @@SPID
AND     es.[is_user_process] = 1

PRINT   ''
PRINT   ''
PRINT   'Killing processes for database ''' + @database + ''''
PRINT   ''
PRINT   ''

--Get cursor for each user process to kill
DECLARE SPIDCursor CURSOR FOR
    SELECT  [session_id]    = es.[session_id]
    FROM    sys.dm_exec_sessions es
    JOIN    sys.dm_exec_requests er
        ON  es.security_id = er.session_id
        AND er.database_id = DB_ID(@database)
    WHERE   es.[session_id] <> @@SPID
    AND     es.[is_user_process] = 1

OPEN    SPIDCursor

FETCH   NEXT
FROM    SPIDCursor
INTO    @spid

--Use instead of @@FETCHSTATUS = 0
WHILE (SELECT fetch_status FROM	sys.dm_exec_cursors(@@SPID)WHERE [name] = 'SPIDCursor') = 0  BEGIN

	--Kill process
	PRINT 'KILL process ' + @spid
	EXEC ('KILL ' + @spid)

	FETCH NEXT FROM SPIDCursor INTO @spid
END
CLOSE SPIDCursor
DEALLOCATE SPIDCursor

--Wait for 5 seconds
WAITFOR DELAY '000:00:05'

PRINT ''
PRINT 'Processes remaining for database ''' + @database + ''''
PRINT ''

--Get any user processes still remaining on selected database
SELECT  [session_id]    = es.[session_id]
        ,[login_name]   = es.[login_name]
        ,[status]       = es.[status]
FROM    sys.dm_exec_sessions es
JOIN    sys.dm_exec_requests er
    ON  es.security_id = er.session_id
    AND er.database_id = DB_ID(@database)
WHERE   es.[session_id] <> @@SPID
AND     es.[is_user_process] = 1

Finished:
GO

SET NOCOUNT OFF
GO

The T-SQL code below creates a database mail profile and account, and then configures them for use. This is necessary (unless for some reason you are still using SQLMail) in order to receive notifications for events like fatal errors, or SQL Agent job status notification.

Note: This will not work on SQL Server 2000, as Database Mail is a feature of SQL Server 2005 (9.0) and up.

--Configure SQL Server database mail
--http://developerspla.net/2010/tsql/configure-database-mail/

-- Turn on advanced configuration options
EXECUTE sp_configure 'show advanced options', 1;
GO
RECONFIGURE
GO

-- Enable database mail
EXECUTE sp_configure 'Database Mail XPs', 1;
GO
RECONFIGURE
GO

-- Turn off advanced configuration options
EXECUTE sp_configure 'show advanced options', 0;
GO
RECONFIGURE
GO

-- Create a Database Mail profile
EXECUTE msdb.dbo.sysmail_add_profile_sp
@profile_name = @@SERVERNAME,
@description = 'My Mail Profile' ;

-- Create a Database Mail account
EXECUTE msdb.dbo.sysmail_add_account_sp
@account_name = @@SERVERNAME,
@description = 'My SQL Server Instance',
@email_address = 'dbateam@mycompany.com',
@replyto_address = 'dbateam@mycompany.com',
@display_name = @@SERVERNAME,
@mailserver_name = 'myexchangeserver.mycompany.com' ;

-- Add the account to the profile
EXECUTE msdb.dbo.sysmail_add_profileaccount_sp
@profile_name = @@SERVERNAME,
@account_name = @@SERVERNAME,
@sequence_number =1;

-- Grant access to the profile to the DBMailUsers role
-- Set as default mail profile
EXECUTE msdb.dbo.sysmail_add_principalprofile_sp
@profile_name = @@SERVERNAME,
@principal_id = 0,
@is_default = 1 ;

--Check that mail config was successful
SELECT * FROM msdb.dbo.sysmail_profile
SELECT * FROM msdb.dbo.sysmail_account
--Quick check for query timeouts
--http://developerspla.net/2010/tsql/quick-check-for-query-timeouts/

SELECT   *
FROM     sys.dm_exec_query_optimizer_info
WHERE    counter = 'timeout'

This helpful query uses dynamic management views to provide a quick overview of query performance. A simple change of the ORDER BY statement can give you a different perspective on the information, depending on your current area of concern or interest.

--Top 20 queries
--http://developerspla.net/2010/tsql/top-20-queries/

SELECT 	TOP 20
        [CreationTime]              = es.creation_time
        ,[LastExecutionTime]        = es.last_execution_time
        ,[TotalElapsedTime]         = es.total_elapsed_time
        ,[TotalPhysicalReads]       = es.total_physical_reads
        ,[TotalLogicalReads]        = es.total_logical_reads
        ,[TotalLogicalWrites]       = es.total_logical_writes
        ,[AvgElapsedTime]           = (es.total_elapsed_time / es.execution_count)
        ,[AgvPhysicalReads]         = (es.total_physical_reads / es.execution_count)
        ,[AvgLogicalReads]          = (es.total_logical_reads / es.execution_count)
        ,[AvgLogicalWrites]         = (es.total_logical_writes/ es.execution_count)
        ,[LastElapsedTime]          = es.last_elapsed_time
        ,[LastPhysicalReads]        = es.last_physical_reads
        ,[LastLogicalReads]         = es.last_logical_reads
        ,[LastLogicalWrites]        = es.last_logical_writes
        ,[ExecutionCount]           = es.execution_count
        ,[QueryText]                = (SELECT	SUBSTRING(text, (statement_start_offset/2)+1,
                                        (CASE
                                            WHEN statement_end_offset = -1 THEN
                                                LEN(CONVERT(nvarchar(max),text)) * 2
                                            ELSE
                                                statement_end_offset
                                        END
                                        - statement_start_offset)/2)
                                      FROM 		sys.dm_exec_sql_text(sql_handle))
FROM    sys.dm_exec_query_stats es
ORDER BY [AvgLogicalReads] DESC

I use the SyntaxHighlighter Evolved plugin for WordPress to format and highlight my code snippets. The plugin uses the SyntaxHighlighter JavaScript package by Alex Gorbatchev, and it’s accompanying java script files, called ‘brushes’, to highlight many different programming languages. While it is possible to register new brush files for additional languages, I decided to modify an existing file for simplicity’s sake.

Being that I work mostly with SQL Server, I decided to tailor the ‘shBrushSql.js’ file specifically for T-SQL. I started by looking for lists of the SQL 2008 built-in functions, which I found a good copy of at sql2008.info. Next was the reserved keywords, which i pulled from a MSDN BOL article for SQL Server 2008. Lastly, I compiled the list of text operators from a few sources, and made sure there were no duplicates between lists. The final enhancement was to update the RegEx statement to better handle comments. I am by no means a RegEx expert, more like a newbie, but I found this current statement to work on all of my tested comments.

Code for the updated ‘shBrushSql.js’ file is below.

/**
 * SyntaxHighlighter
 * http://alexgorbatchev.com/
 *
 * SyntaxHighlighter is donationware. If you are using it, please donate.
 * http://alexgorbatchev.com/wiki/SyntaxHighlighter:Donate
 *
 * @version
 * 2.1.364 (October 15 2009)
 *
 * @copyright
 * Copyright (C) 2004-2009 Alex Gorbatchev.
 *
 * @license
 * This file is part of SyntaxHighlighter.
 *
 * SyntaxHighlighter is free software: you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * SyntaxHighlighter is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with SyntaxHighlighter. If not, see <http://www.gnu.org/copyleft/lesser.html>.
 */
SyntaxHighlighter.brushes.Sql = function()
{
	var funcs	=	'abs acos app_name ascii asin assemblyproperty asymkey_id '+
					'asymkeyproperty atan atn2 avg case cast convert ceiling cert_id char '+
					'charindex checksum_agg coalesce col_length col_name collationproperty '+
					'columnproperty columns_updated containstable cos cot count count_big '+
					'current_timestamp current_user '+
					'database_principal_id databaseproperty databasepropertyex datalength '+
					'dateadd datediff datename datepart day db_id db_name degrees dense_rank '+
					'difference error_line error_message error_number error_procedure '+
					'error_severity error_state exp file_id file_idex file_name filegroup_id '+
					'filegroup_name filegroupproperty fileproperty floor fn_helpcollations '+
					'fn_listextendedproperty fn_my_permissions '+
					'fn_virtualfilestats formatmessage freetexttable fulltextcatalogproperty '+
					'fulltextserviceproperty getansinull getdate getutcdate grouping '+
					'has_perms_by_name host_id host_name ident_current ident_incr ident_seed '+
					'identity index_col indexkey_property indexproperty is_member '+
					'is_srvrolemember isdate isnull isnumeric key_guid key_id key_name '+
					'left len log log10 lower ltrim max min month nchar newid ntile nullif '+
					'object_definition object_id object_name object_schema_name '+
					'objectproperty objectpropertyex opendatasource openquery openrowset '+
					'openxml original_login parsename patindex pi '+
					'power pwdcompare pwdencrypt quotename radians rand rank replace '+
					'replicate reverse right round row_number rowcount_big rtrim '+
					'schema_id schema_name scope_identity serverproperty '+
					'session_user sessionproperty setuser sign sin soundex space '+
					'sql_variant_property sqrt square stats_date stdev stdevp str stuff '+
					'substring sum suser_id suser_name suser_sid suser_sname symkeyproperty '+
					'fn_builtin_permissions sysdatetime sysdatetimeoffset '+
					'system_user sysutcdatetime tan textptr textvalid type_id type_name '+
					'typeproperty unicode upper user_id user_name user_name var varp '+
					'xact_state year';

	var keywords =	'add alter as asc authorization backup begin '+
					'break browse bulk by cascade check checkpoint '+
					'close clustered collate column commit compute '+
					'constraint contains continue create '+
					'current current_date current_time '+
					'cursor database dbcc deallocate declare default delete deny '+
					'desc disk distinct distributed double drop dump else end '+
					'errlvl escape except exec execute exit external fetch '+
					'file fillfactor for foreign freetext from full '+
					'function goto grant group having holdlock '+
					'identity_insert identitycol if index insert intersect '+
					'into is key kill lineno load merge national '+
					'nocheck nonclustered of off offsets on open '+
					'option order '+
					'over percent pivot plan precision primary print proc '+
					'procedure public raiserror read readtext reconfigure references '+
					'replication restore restrict return revert revoke rollback '+
					'rowcount rowguidcol rule save schema securityaudit select '+
					'set shutdown statistics '+
					'table tablesample textsize then to top tran transaction trigger '+
					'truncate tsequal union unique unpivot update updatetext use '+
					'user values varying view waitfor when where while with '+
					'bigint bit decimal int money numeric smallint smallmoney tinyint '+
					'float real date datetimeoffset datetime2 smalldatetime datetime time '+
					'binary image varbinary hierarchyid sql_variant timestamp uniqueidentifier xml';

	var operators =	'all and any between exists cross in inner join like not null or outer some';

	this.regexList = [
		{ regex: /([/]\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+[/])|(--.*)/gm, css: 'comments' },			// one line and multiline comments
		{ regex: SyntaxHighlighter.regexLib.multiLineDoubleQuotedString,	css: 'string' },			// double quoted strings
		{ regex: SyntaxHighlighter.regexLib.multiLineSingleQuotedString,	css: 'string' },			// single quoted strings
		{ regex: new RegExp(this.getKeywords(funcs), 'gmi'),				css: 'color2' },			// functions
		{ regex: new RegExp(this.getKeywords(operators), 'gmi'),			css: 'color1' },			// operators and such
		{ regex: new RegExp(this.getKeywords(keywords), 'gmi'),				css: 'keyword' }			// keyword
		];
};

SyntaxHighlighter.brushes.Sql.prototype	= new SyntaxHighlighter.Highlighter();
SyntaxHighlighter.brushes.Sql.aliases	= ['sql'];

The code below is a pasting of all the functions, keywords, and operators from the brush file, as well as some sample comments, to demonstrate the SyntaxHighlighter at work.

--One line comment
No Comment
/*Multi-line Comment
continued...*/
No Comment
/**************start big comment******************
big comment continued...
big comment continued...
/**************end big comment******************/
No Comment
/*
*Comment
*Comment
*Comment
*/

Functions:
abs acos app_name ascii asin assemblyproperty asymkey_id
asymkeyproperty atan atn2 avg case cast convert ceiling cert_id char
charindex checksum_agg coalesce col_length col_name collationproperty
columnproperty columns_updated containstable cos cot count count_big
current_timestamp current_user
database_principal_id databaseproperty databasepropertyex datalength
dateadd datediff datename datepart day db_id db_name degrees dense_rank
difference error_line error_message error_number error_procedure
error_severity error_state exp file_id file_idex file_name filegroup_id
filegroup_name filegroupproperty fileproperty floor fn_helpcollations
fn_listextendedproperty fn_my_permissions
fn_virtualfilestats formatmessage freetexttable fulltextcatalogproperty
fulltextserviceproperty getansinull getdate getutcdate grouping
has_perms_by_name host_id host_name ident_current ident_incr ident_seed
identity index_col indexkey_property indexproperty is_member
is_srvrolemember isdate isnull isnumeric key_guid key_id key_name
left len log log10 lower ltrim max min month nchar newid ntile nullif
object_definition object_id object_name object_schema_name
objectproperty objectpropertyex opendatasource openquery openrowset
openxml original_login parsename patindex pi
power pwdcompare pwdencrypt quotename radians rand rank replace
replicate reverse right round row_number rowcount_big rtrim
schema_id schema_name scope_identity serverproperty
session_user sessionproperty setuser sign sin soundex space
sql_variant_property sqrt square stats_date stdev stdevp str stuff
substring sum suser_id suser_name suser_sid suser_sname symkeyproperty
fn_builtin_permissions sysdatetime sysdatetimeoffset
system_user sysutcdatetime tan textptr textvalid type_id type_name
typeproperty unicode upper user_id user_name user_name var varp
xact_state year

Keywords:
add alter as asc authorization backup begin
break browse bulk by cascade check checkpoint
close clustered collate column commit compute
constraint contains continue create
current current_date current_time
cursor database dbcc deallocate declare default delete deny
desc disk distinct distributed double drop dump else end
errlvl escape except exec execute exit external fetch
file fillfactor for foreign freetext from full
function goto grant group having holdlock
identity_insert identitycol if index insert intersect
into is key kill lineno load merge national
nocheck nonclustered of off offsets on open
option order
over percent pivot plan precision primary print proc
procedure public raiserror read readtext reconfigure references
replication restore restrict return revert revoke rollback
rowcount rowguidcol rule save schema securityaudit select
set shutdown statistics
table tablesample textsize then to top tran transaction trigger
truncate tsequal union unique unpivot update updatetext use
user values varying view waitfor when where while with
bigint bit decimal int money numeric smallint smallmoney tinyint
float real date datetimeoffset datetime2 smalldatetime datetime time
binary image varbinary hierarchyid sql_variant timestamp uniqueidentifier xml

Operators:
all and any between exists cross in inner join like not null or outer some
--Fast row count of all tables
--http://developerspla.net/2010/t-sql/row-count-of-all-tables/

--SQL Server 2000+
SELECT      [TableName]		= so.name
            ,[RowCount]		= si.rowcnt
FROM        sysindexes si
JOIN        sysobjects so
ON          si.id = so.id
WHERE       si.indid < 2  --clustered index or heap
AND         OBJECTPROPERTY(so.id, 'IsMSShipped') = 0
ORDER BY    [RowCount] DESC

--SQL Server 2005+
SELECT      [TableName]     = OBJECT_SCHEMA_NAME(sp.[object_id]) + '.'
                                + OBJECT_NAME(sp.[object_id])
            ,[RowCount]     = SUM(sp.rows)
FROM        sys.partitions sp
WHERE       OBJECT_SCHEMA_NAME(sp.[object_id]) <> 'sys'
AND         sp.index_id < 2
GROUP BY    object_id
ORDER BY    [RowCount] DESC

This code is an extension of a CLR UDF I originally found here. The changes came out of a need to package large amounts of XML files, stored in SQL Server, for distribution to multiple clients. The distribution system would call this procedure with the ID’s of 500 queued records to be queried, wrapped, and compressed. The reasoning behind doing this in the RDBMS versus the distribution system was simple: save 80-90% on network bandwidth, and drastically reduce the amount of system resources required in the distribution system to work with the extremely large data sets.

This code has been abstracted from its original purpose, and tailored to be much more flexible. Although originally incarnated solely for use with XML data, this adaptation will compress any number of records, with an optional header/footer, and in any data type that can be CAST to varbinary. I highly recommend reading this MSDN blog post regarding CLR memory, as memory control is vital to the operation of this procedure. As a note: I had to set the -g startup flag to 2048MB (incrementally), on the SQL Server instance because the default of 256MB was nowhere near sufficient for concurrent executions.

Realized Performance:
Average Record Size: 250KB
Average Result Set Size Per Execution (Uncompressed): 125MB
Average Compression: 87%
Average Round-Trip Execution Time: 11 seconds
Average Throughput (Call to Delivery): 11.3MB per second

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Text;
using System.IO;
using System.IO.Compression;
using System.Xml;

public partial class StoredProcedures
{

    /// <summary>
    /// builds SQL statement
    /// encodes & compressess header
    /// queries data from table
    /// reads, compresses, and concatenates data
    /// encodes & compresses footer
    /// returns compressed data
    /// </summary>
    /// <remarks>
    /// http://developerspla.net/2010/t-sql/clr-sproc-concatenation-gzip-compression/
    /// </remarks>
    /// <param name="tableName">name of table containing both recordID column and data column</param>
    /// <param name="idColumnName">name of column containing recordIDs</param>
    /// <param name="recordIDs"> comma delimited list of recordIDs</param>
    /// <param name="dataColumnName">name of column containing data</param>
    /// <param name="headerString">string for wrapping beginning of record data</param>
    /// <param name="footerString">string for wrapping end of record data</param>

    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void clrsp_GetCompressedData(SqlString tableName
        , SqlString idColumnName
        , SqlString recordIDs
        , SqlString dataColumnName
        , SqlString headerString
        , SqlString footerString)
    {

        //build sql query string
        StringBuilder sbQuery = new StringBuilder(200);
        sbQuery.Append(@"SELECT CAST([");
        sbQuery.Append(dataColumnName);
        sbQuery.Append("] as varbinary(max))");
        sbQuery.Append(Environment.NewLine);
        sbQuery.Append("FROM ");
        sbQuery.Append(tableName);
        sbQuery.Append(" WITH (NOLOCK)");
        sbQuery.Append(Environment.NewLine);
        sbQuery.Append("WHERE ");
        sbQuery.Append(idColumnName);
        sbQuery.Append(" IN (");
        sbQuery.Append(recordIDs.ToString());
        sbQuery.Append(")");

        //empty recordIDs string
        recordIDs = null;

        //create memory stream to hold compressed bytes
        MemoryStream compressedData = new MemoryStream();

        //create instance of GZipStream class to use as the compressor
        GZipStream compressor = new GZipStream(compressedData, CompressionMode.Compress, true);

        //get encoder to convert string to byte array
        System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();

        if (headerString != null)
        {
            //convert header string to byte array - required to compress data
            byte[] headerBytes = encoding.GetBytes(headerString);

            //empty header string
            headerString = null;

            //write header byte array to compressedData stream using compressor object
            compressor.Write(headerBytes, 0, headerBytes.Length);

            //empty header byte array
            headerBytes = null;
        }

        //create connection to sql server based on current execution context
        using (SqlConnection conn = new SqlConnection("context connection=true"))
        {
            //create new sql command using query text and connection
            SqlCommand cmd = new SqlCommand(sqlQuery, conn);

            //open sql connection
            conn.Open();

            //create sql data reader and execute command against SQL Server
            SqlDataReader reader = cmd.ExecuteReader();

            //create byte array to hold record data for compressing
            byte[] blobData;

            //read one record per loop from the executed SQL command
            while (reader.Read())
            {
                //get byte array from data reader column(0) and stuff into blobData
                blobData = reader.GetSqlBytes(0).Buffer;

                //write blobData byte array to compressedData stream using compressor object
                compressor.Write(blobData, 0, blobData.Length);
            }

            //close sql data reader
            reader.Close();

            //release reader resources
            reader.Dispose();

            //release connection resources
            conn.Dispose();
        }

        if (footerstring != null)
        {
            //convert footer string to byte array - required to compress data
            byte[] footerBytes = encoding.GetBytes(footerString);

            //empty footer string
            footerString = null;

            //write footer byte array to compressedData stream using compressor object
            compressor.Write(footerBytes, 0, footerBytes.Length);

            //empty footer byte array
            footerBytes = null;
        }

        // closing compressor to allow ALL compressed bytes to be written to compressedData stream
        compressor.Flush();
        compressor.Close();
        compressor = null;

        //Create and format timestamp
        string timeStamp = (DateTime.Now.ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ss.ff")) + "Z";

        //create sql record to use for returning results to client
        SqlDataRecord record = new SqlDataRecord(
            new SqlMetaData("timestamp", SqlDbType.NVarChar, 30)
            ,new SqlMetaData("compressedData", SqlDbType.Image));

        //set values for each column in sql record
        record.SetString(0, timeStamp.ToString());
        record.SetBytes(1, 0, compressedData.ToArray(), 0, (int)compressedData.Length);

        ////used for testing output of compressed data
        //FileStream testFileWriter = new FileStream(@"G:\test.gz", FileMode.Create);
        //testFileWriter.Write(compressedData.ToArray(), 0, (int)compressedData.Length);
        //testFileWriter.Close();

        //empty memory stream
        compressedData = null;

        //return record to client
        SqlContext.Pipe.Send(record);

        //empty sql record
        record = null;

        //force immediate garbage collection
        GC.Collect();
    }
}
Powered by WordPress Web Design by SRS Solutions © 2012 Developers Planet Design by SRS Solutions