Jane's Technical Stuff

Tuesday, September 30, 2008

SQL Coding Dojo


Last week I led a SQL based Coding Dojo as part of the Madgex ILP programme. A lot of ILP sessions are in a presentation format, and I wanted to do something with a bit more, or in fact a lot more, attendee participation. I've been interested in the idea of deliberate practice for a while, and Richard has been to quite a few coding dojos and so I figured it was time to give it a try. Shortly after I'd decided to do one, I attended a Skills Matter evening and stumbled across Ivan Sanchez, who blogged about Starting a Coding dojo which helped me get my thoughts in hand.

I chose to use SQL as the area to practice, mainly because it is an area I am really familiar with and so could help out if necessary. Also I have a copy of Joe Celko's SQL Puzzles and Answers which provided me with a great collection of puzzles to practice with.

I chose a puzzle called Double Duty which is described as follows:
A person may have more than one role. The roles are defined as follows:
O – Officer
D – Director

The PersonRole table is defined as:
PersonName VARCHAR(50)
RoleCode CHAR(1)


Produce a list of People and their roles where they are either Officer or Director, and where anyone with both the O and D role is represented once as B – Both.

I've put a copy of the SQL file I produced to create the tables, and populate the data here, so feel free to take a look.

I opened the session by explaining what a dojo was (a training hall for practice in the martial arts), what a coding dojo was (a practice area for coding skills) and introduced the problem domain (I provided print outs of the table structure, data and expected results). I then introduced the roles we were going to use - driver and co-pilot - and the rules we were going to adhere to. In our main meeting room we have a large plasma screen and so I connected this to my laptop so that the audience could see exactly the same as the driver and co-pilot. I provided a couple of SQL books and a SQL Server cheatsheet for reference.

After the initial self-consciousness had gone, the pairs settled into the idea pretty well, and it was interesting to watch the different pairs interact and form new (rapid) working relationships. There were 7 participants, and each spent 5 minutes as the driver and 5 minutes as the co-pilot. By the end of the session there were 4 solutions produced which meant that the group as a whole felt that they'd achieved something. We ran a quick retrospective at the end of the session and the following were the major points:
  • The audience found a certain amount of frustration by not being able to help, and not being able to shout out suggestions.
  • At least one person acting as the co-pilot found it hard to articulate what needed to be done.
  • At least one person admitted to finding it hard to think without having their fingers on the keyboard.
  • At least one person learnt something about SQL.
  • Quite a few people found a laptop keyboard hard to use, so a proper keyboard would have helped.

As a result of the success of this one, I've booked in another one for a couple of months time - again SQL based but I'm also thinking of making use of TDD Problems for another practicable skill.

Labels: , , ,

// posted by Jane @ 8:28 PM   save to del.icio.us

Comments:

Monday, September 22, 2008

SELECT COUNT(*) vs SELECT COUNT(colname)


Richard mentioned this article to me yesterday, and was surprised at what it told him.

To summarize:

SELECT COUNT() has 2 meanings:
  1. Count the number of rows
  2. Count the number of values
Sometimes these are the same thing, but not always.

SELECT COUNT(*) always counts the number of rows in the result

SELECT COUNT(colname) counts the number of times that the column colname is not null

So, if colname is always populated the result will be the same but otherwise the value will be different.

Labels:

// posted by Jane @ 1:57 PM   save to del.icio.us

Comments:

Wednesday, September 03, 2008

Database Comparison tools: Redgate SQL Compare Review


I first started using Redgate SQL Compare when I was at Glass's in 2003/04 and I'm still using it 4 years later. Over those years the product has evolved and new products have been released from Redgate. I have only ever used it for SQL Server (7, 2000 and 2005) and so can't comment on it's appropriateness or reliableness targeting other database products.

At the time that I first found SQL Compare, there weren't that many tools available that did a reliable job of comparing objects and producing scripts of the differences. I know that one of my colleagues at Madgex is a fan of SQL Delta. One day we sat down to compare the tools and discovered that there really wasn't much to choose between them.

I have always used SQL Compare more as a tool to check my scripted updates, rather than to produce those scripts. After a bad experience at a previous company when a member of my team managed to empty an articles table rather than update it when manipulating the database via Enterprise Manager I've habitually manually created defensive SQL change scripts with appropriate transactions and error handling.

SQL Compare is simple to use, works quickly and produces a detailed list of differences between database objects. The differences that are detected are configurable, so you can choose to ignore or report upon white space or comment differences.

When you first open the screen you are presented with a screen offering options of what you want to compare.



Once you have registered a couple of databases to compare, it is a short wait until the differences are displayed.



The display of differences are divided into:
  • objects that appear on the source database but not the target
  • objects that appear on the target database but not the source
  • objects that appear on both databases but are different
  • objects that are the same on both databases




For each object that is different you can choose to view the details of that object which will display a SQL based description of the object with the differences identified by a highlight line - sometimes it might be the whole object, a part of an object (i.e.a column definition) and sometimes it will be a constraint, or a grant statement that is missing.



From the summary screen you can select the changes that you're interested in and get SQL Compare to automatically synchonise your target database to match the source, or vice versa. You can also get it to generate some SQL to do the generation to allow you to put your SQL upgrade scripts into a build process, or under source control. In the majority of cases I will use this tool to check my upgrade scripts, rather than to produce them and if I spot any difference I tend to write my own, defensive, SQL to make the upgrade scripts.



In the 4 or so years I've been using SQL Compare I've found it to be a stable, evolving product which I have come to rely upon completely to check upgrade processes. I made the most use of this tool at Glass's where I was working on a product which was client/server application where it couldn't be relied upon that the customer was always running the latest version. Consequently, our preparation for release process involved installing the earliest supported version of the system, and upgrading it to the release candidate on one machine and installing a clean, new build of the release candidate from scratch and comparing the two and producing upgrade SQL scripts for any identified differences before starting the process again. We also included a call to the command line version into the automated MSBuild process producing an HTML report of the differences which could be used to fail the build if anything other than 0 differences was reported.

Labels: , , , ,

// posted by Jane @ 9:36 AM   save to del.icio.us

Comments:

Wednesday, August 20, 2008

SQL Snap


At last September's BarCamp Brighton I saw the CSS Specificity Snap cards and this gave me an idea. So, armed with a concept of showing the different ways of producing the same output using the SQL Server 2005 flavour of SQL I started generating SQL statements. Alex was kind enough to make them pretty, and last week I ran a Madgex ILP session to play.

There are 26 cards in my pack, each card has a SQL statement, and a letter. The letter is used for the crib sheet to allow me to easily spot the matches.



I prepared by producing two piles of 13 cards, with a match in both piles. I then put one pile in the middle of the table face down, and dealt out the rest of the cards to the 5 people who were playing. I placed the 3 spare cards on the table face up so that we could all see them and proceeded to turn them over one by one. When a card was turned over everyone looked at the SQL on the overturned card, and checked their cards and the spare cards to see if there was a match. If a match wasn't spotted then I explained what the SQL was doing, and at only one point did I have to say which letter the matching card would have. During this I had quite a few comments about functions people didn't recognise - especially COALESCE and NULLIF. I also explained some of the performance, and functional differences between the SQL statements when they were seen - so for instance I explained the differences between DELETE and TRUNCATE but I still I have these as a match because they can provide the same result and provided an interesting talking point.

The second game we played was a memory game where I placed all of the cards on the table face down and everyone turned over 2 cards at a time. If the SQL on the cards resulted in the same output, then it was a match and they took the cards away. If they didn't then the cards got turned back over and the next person had a go.

All in all this session took about 30 minutes, and resulted in quite a lot of noise and laughter (so much so that someone came and closed the door of the room we were in).

The following table contains the Letter Code, SQL statement, and the matching Letter code. They are ordered so that the matches are grouped together.
CodeSQLMatch
VSELECT CAST(GETDATE() AS VARCHAR(11))G
GSELECT CONVERT (VARCHAR(11), GETDATE())V
ZSELECT name FROM sysobjects WHERE xtype = 'P'Q
QSELECT name FROM sys.proceduresZ
DSELECT sysobjects.name, syscolumns.name FROM sysobjects INNER JOIN syscolumns ON sysobjects.id = syscolumns.id WHERE sysobjects.xtype = 'u'A
ASELECT sysobjects.name, syscolumns.name FROM sysobjects, syscolumns WHERE sysobjects.id = syscolumns.id AND sysobjects.xtype = 'u'D
USELECT getdate()O
OEXEC('SELECT getdate()')U
JDELETE FROM TestT
TTRUNCATE TABLE TestJ
YSELECT ISNULL(NULL,1)W
WSELECT COALESCE(NULL,1)Y
PSELECT [name], xtype FROM sysobjects ORDER BY xtypeM
MSELECT [name], xtype FROM sysobjects ORDER BY 2P
HSELECT NULLIF(1,1)R
RSELECT CASE 1 WHEN 1 THEN NULL ELSE 1 ENDH
ISELECT [name], xtype FROM sysobjects ORDER BY 1N
NSELECT [name], xtype FROM sysobjects ORDER BY [name]I
XSELECT CAST(GETDATE() AS VARCHAR(20))F
FSELECT CONVERT (VARCHAR(20), GETDATE())X
SSELECT name FROM syscolumns WHERE id = ( SELECT id FROM sysobjects WHERE xtype = 'u' AND name = 'Jobs' )K
KWITH objects (id) AS
( SELECT id FROM sysobjects WHERE xtype = 'u' AND name = 'Jobs' ) SELECT name FROM syscolumns INNER JOIN objects ON syscolumns.id = objects.id
S
LINSERT INTO Job (JobID, PrimaryJobTypeID) VALUES (1,1)B
BINSERT INTO Job (JobID, PrimaryJobTypeID) SELECT 1,1L
CDECLARE @sMessage AS VARCHAR(20) SET @sMessage = 'Hello'E
EDECLARE @sMessage AS VARCHAR(20) SELECT @sMessage = 'Hello'C

Labels: ,

// posted by Jane @ 8:17 PM   save to del.icio.us

Comments:

Wednesday, August 06, 2008

Intersect vs Inner Join - Timings


I read the article SQL SERVER - 2005 - Difference Between INTERSECT and INNER JOIN - INTERSECT vs. INNER JOIN today and was interested in the relative speed of Inner join vs Intersect for simple matcjhes and so thought I'd make another use of my timing code to see the difference.

I produced a couple of very simple tables:
Summary ([ID] INT, [Name] NVARCHAR(10))
Detail ([ID] INT, [Age] INT)

both with primary keys on ID and populated them both as follows:

INSERT INTO Summary
SELECT 1, 'Jane' UNION ALL
SELECT 2, 'Richard'

INSERT INTO Detail
SELECT 1, 36 UNION ALL
SELECT 2,40


I chose to do a very simple task - retrieve the ID for an entry which is in both Summary and Detail - note: this assumes that the ID is a foreign key constraint and the IDs are relating to the same item.

Using my rudimentary timing code, I think that intersect is more performant than the inner join - based on running the same query 100000 times

SELECT Summary.ID
FROM Summary
INNER JOIN Detail
ON Summary.ID = Detail.ID


and

SELECT ID FROM Summary
INTERSECT
SELECT ID FROM Detail


which has the following results:
11746
DescriptionTimeInMS
Join
Intersect10203

I ran them both together and took a look at the actual execution plans and it does indicate that the intersect takes a little less effort - which must be down to the Seek instead of Scan in the Join'd table detail


So, another option for me to remember to consider when determining the best way to get at some data

Labels: ,

// posted by Jane @ 6:28 PM   save to del.icio.us

Comments:

Friday, July 18, 2008

TSQL: Enumations and constants


Bruce sent me a link the other day to an article T4 template for generating SQL view from C# enumeration which I found interesting from a modelling constants/enumerations in SQL viewpoint.

The example used was modelling an enumeration of ContactType which has valid items of Individual and Organisation.

The article used a view to model this, as per
CREATE VIEW enumContactType
AS
  SELECT
    0 AS Individual,
    1 AS Organization


and then using it within a SELECT as

SELECT *
FROM Contact
WHERE Type = (SELECT Organization FROM enumContactType)

(Note: in the original article Oleg used a schema called enum, but I'm just ignoring this at the moment and have thus changed the name from enum.ContactType to enumContactType)

An alternative


In my previous company, we used Scalar-Valued Functions to mimic constants, and I guess this could be extended to enumerations. I thought I'd re-create the above example and give it a try to see how it looks and compares.

So, to model the enumeration ContactType, I've created two functions as follows:
CREATE FUNCTION enumContactTypeIndividual()
RETURNS INT
AS
BEGIN
  RETURN 0
END
GO

CREATE FUNCTION enumContactTypeOrganisation()
RETURNS INT
AS
BEGIN
  RETURN 1
END
GO


And then to reproduce the SELECT query I wrote:
SELECT *
FROM Contact
WHERE Type = dbo.enumContactTypeOrganisation()


The resulting data matches that used in the VIEW model and provides an alternative. I'm sure that a template could be written to produce those functions as an output as per the end part of Oleg's article.

Performance and timings


I was interested in the relative performance of these two methods, so armed with my timing code from last week I checked them out. I amended the SELECT to bring back the COUNT(*) FROM Contact into a local integer variable, and ran it 1000000 times.

The results are as follows:
CodeDescription TimeInMS
EnumViewUsing the view13010
EnumUDFUsing the UDF21450

showing that the view method is more performant.

I then changed the function to make use of SCHEMABINDING. The new functions look like:
CREATE FUNCTION enumContactTypeIndividual()
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
  RETURN 0
END
GO

CREATE FUNCTION enumContactTypeOrganisation()
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
  RETURN 1
END
GO


And the timings change to be:
CodeDescription TimeInMS
EnumViewUsing the view13010
EnumUDFUsing the UDF20280

which do reduce the time taken for the UDF but still means that the view is faster.

For interests sake I then ran a comparison timing against the code using the literal as:
SELECT *
FROM Contact
WHERE Type = 1

which resulted in
CodeDescription TimeInMS
EnumLiteralUsing the literal12043

showing it is faster, but not by much, than the view.

Summary


So, what has this shown?
  • Using a view is quite efficient and effective for modelling enumerations
  • Using a UDF is an alternative, but is slower
  • Schema binding makes UDF usage quicker
  • The difference between using a VIEW and using the hard-coded literal isn't a lot in perfomance terms

Labels: ,

// posted by Jane @ 1:24 PM   save to del.icio.us

Comments:
Great article. Thanks Jane! I would not have expected a UDF to be slower than a VIEW. Any ideas on why?
 

Tuesday, July 08, 2008

INFORMATION_SCHEMA views


As I alluded to the other day, I'm gradually weaning myself off my dependency on (the fairly ugly) sys.objects, sys.columns etc as a way to query the meta data about my database. Instead I'm using the SQL-92 compliant INFORMATION_SCHEMA views.

Information schema views provide an internal, system table-independent view of the SQL Server metadata. Information schema views enable applications to work correctly although significant changes have been made to the underlying system tables. The information schema views included in SQL Server 2005 comply with the SQL-92 standard definition for the INFORMATION_SCHEMA.


So now, when I'm writing database upgrade scripts and attempting to write defensive SQL (which is my usual position these days, regardless of whether I think the script will be run more than once - lets just say I've learnt from making such assumptions) I usually wrap
ALTER TABLE
statements within
IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE Table_Name = 'MyTable' AND Column_Name = 'MyNewColumn'),
CREATE TABLE
statements within
IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE Table_Name = 'MyTable') etc

The main area that I have to revert to the sys views for is indexes, and finding out what columns are included in which index, which is the uglier, but no-less-effective
SELECT
  OBJECT_NAME (i.object_id) AS Tablename,
  i.name AS IndexName,
  c.name AS ColumnName,
  CASE ic.is_descending_key
  WHEN 1 THEN 'DESC'
  ELSE 'ASC'
  END as ColumnSort
FROM sys.indexes i
INNER JOIN sys.index_columns ic
ON i.object_id = ic.object_id
AND i.index_id = ic.index_id
INNER JOIN sys.columns c
ON ic.object_id = c.object_id
AND ic.column_id = c.column_id
INNER JOIN sys.objects o
on c.object_id = o.object_id
WHERE o.type = 'U'
ORDER BY TableName, indexName, ic.key_ordinal


MSDN has an interesting article Querying the SQL Server System Catalog FAQ which has examples for finding out (using the various object catalog views) many different areas of meta data across a SQL Server 2005 database and is worth using as a starting point.

Labels: ,

// posted by Jane @ 2:20 PM   save to del.icio.us

Comments:
Cool, I've just INFORMATION_SCHEMA that to save me scrolling through the enormous MSSMS treeview. Love it!
 

Monday, July 07, 2008

TSQL - Timings


A while ago I blogged about how to get the date element of a datetime column in TSQL. In that post I said
I would probably have done it via a CONVERT/CAST operation, converting to a VARCHAR and then back to a DATETIME, but this is a much more efficient method.

but I didn't prove it at the time. I gave it some more thought and wanted to know what the differences were, so I wrote some timings code.

This script creates one table Timings with columns of Code, Description, ActionTime and IsComplete. It has a combined primary key of Code and IsComplete. Code must be unique - and can be a string of up to 10 characters long to uniquely identify the action being timed. IsComplete is used to differentiate between the start time and end time of the process being monitored.

The script also creates 3 stored procedures:
  • up_RecordStart which takes 2 parameters - the unique code and optional description. This is used to record the start of the activity being monitored.
  • up_RecordEnd which takes just 1 parameter - the code - should match the code used in up_RecordStart. This is used to record the end of the activity being monitored.
  • up_GetTimings which again takes just 1 parameter - the code to return the timings from. It then returns the Code, Description and the length of time the action took in ms.

I wrote some script to then use these objects to test the assertion I made that FLOOR and combinations of converting DATETIME to FLOAT etc would be more efficient than using either CAST or CONVERT to VARCHAR(12) and back again to a DATETIME.

------------------------------
-- Clean up before we start --
------------------------------
DELETE FROM Timings
WHERE Code IN ('FLR','CONVERT','CAST')
GO

---------------------
-- Try using Floor --
---------------------
EXEC up_RecordStart @Code='FLR', @Description='SELECT CONVERT(DATETIME,FLOOR(CONVERT(FLOAT,GETDATE())))'
GO

DECLARE @i AS INTEGER
DECLARE @floorDate AS DATETIME
SET @i = 0

WHILE @i < 1000000 -- try the next statement for 1000000 times - this should be enough to see some differences
BEGIN

  SET @floorDate = CONVERT(DATETIME,FLOOR(CONVERT(FLOAT,GETDATE())))
  SET @i = @i + 1
END
PRINT @floorDate
GO

EXEC up_RecordEnd @Code='FLR'
GO

-----------------------
-- Try using convert --
-----------------------
EXEC up_RecordStart @Code='CONVERT', @Description='SELECT CONVERT(DATETIME,CONVERT(VARCHAR(12),GETDATE()))'
GO

DECLARE @i AS INTEGER
DECLARE @floorDate AS DATETIME
SET @i = 0
WHILE @i < 1000000
BEGIN

  SET @floorDate = CONVERT(DATETIME,CONVERT(VARCHAR(12),GETDATE()))
  SET @i = @i + 1
END
PRINT @floorDate
GO

EXEC up_RecordEnd @Code='CONVERT'
GO

--------------------
-- Try using Cast --
--------------------
EXEC up_RecordStart @Code='CAST', @Description='SELECT CAST(CAST(GETDATE() AS VARCHAR(12)) AS DATETIME)'
GO

DECLARE @i AS INTEGER
DECLARE @floorDate AS DATETIME
SET @i = 0
WHILE @i < 1000000
BEGIN

  SET @floorDate = CAST(CAST(GETDATE() AS VARCHAR(12)) AS DATETIME)
  SET @i = @i + 1
END
PRINT @floorDate
GO

EXEC up_RecordEnd @Code='CAST'
GO

-------------------------
-- Now get the timings --
-------------------------
EXEC up_GetTimings 'FLR'
GO

EXEC up_GetTimings 'CONVERT'
GO

EXEC up_GetTimings 'CAST'
GO


This results in the following data being returned:
CodeDescription TimeInMS
FLRSELECT CONVERT(DATETIME,FLOOR(CONVERT(FLOAT,GETDATE())))1313
CONVERTSELECT CONVERT(DATETIME,CONVERT(VARCHAR(12),GETDATE()))3236
CASTSELECT CAST(CAST(GETDATE() AS VARCHAR(12)) AS DATETIME)3203

which shows that the method using FLOOR is more efficient, and that there isn't a lot to chose between CONVERT and CAST

Labels: ,

// posted by Jane @ 7:11 PM   save to del.icio.us

Comments:

Friday, July 04, 2008

Monty Hall problem - TSQL


Following on from this morning's post about the Monty Hall problem, and proving it in PHP I figured I'd prove it in TSQL as well.

So here is my SQL version.

To maintain consistency with my PHP version, I've made it output similar text, so the results are along the lines of:
Monty Hall Problem
This is a simple TSQL query to prove the Monty Hall problem [http://en.wikipedia.org/wiki/Monty_hall_problem]

------------------------
The Results are in:
------------------------
Out of 10000 games, the contestant was right to swap 66.94% of the time and wrong 33.06% of the time


The TSQL version is a bit more elegant with regards to working out which door to open for the contestant, as it is a simple statement of
SELECT TOP 1 @Opened = DoorNumber
FROM @Doors
WHERE DoorNumber NOT IN (@Prize, @Picked)
ORDER BY NEWID()

making the most of set theory to enable the exclusion of the @Prize door and the @Picked door as opposed to the same thing in my PHP code
$remaining = array();
/* the gameshow host opens a door which has nothing behind it, so the gameshow host knows where the prize is
but can't choose to open the door the contestant has chosen, so remove both picked and prize from the options,
this leaves either one of two doors that can be opened, so pick one randomly */
for ($i=0; $i<3; $i ++)
{
  switch($doors[$i])
  {
    case $prize:
      break;
    case $picked:
      break;
    default:
      array_push($remaining,$doors[$i]);
  }
}
$opened = $remaining[array_rand($remaining)];

which is all a bit more procedural and, at least to my mind, less elegant - but then again I like the syntax of SQL which either makes me a freak or a masochist (according to at least one colleague)

Labels: ,

// posted by Jane @ 6:08 PM   save to del.icio.us

Comments:

Monday, June 30, 2008

Find a string in a stored procedure - Revisited


A little over 2 years ago I posted a couple of methods of finding the instances of a string within stored procedures - at the time I'd left some debug 'Print @' code in at least one stored procedure and needed to find it before releasing the code.

Today, I wanted to do the same thing, and as over the past 2 years I've been weaning myself off accessing the sysobjects tables, instead making use of the INFORMATION_SCHEMA views, I decided it was time to add an update to that earlier post.

So, using INFORMATION_SCHEMA.ROUTINES it is coded as :
  SELECT SPECIFIC_NAME
  FROM INFORMATION_SCHEMA.ROUTINES
  WHERE ROUTINE_DEFINITION LIKE '%PRINT @%'

Labels: , ,

// posted by Jane @ 2:53 PM   save to del.icio.us

Comments:
Hi Jane

Brighton Bloggers Aggregate doesn't seem to be working

hope all OK with you

best wishes

Ian Lawes/Abi Rhodes
 

Monday, June 16, 2008

Scripting out object level SQL


In SQL Server 2000 Enterprise Manager provided the ability to output SQL file on an object level for tables, functions, stored procedures etc. In SQL Server 2005 this option disapeared and the only outputting was in a single file. This wasn't great for controlling objects within a source control system.

A while ago I stumbled across Scriptio, a ClickOnce application to do just this. It has a very simple interface, and you can tell that it was designed to solve a problem, rather than as an end solution.

Tab One - Database Objects



Scriptio front page

To get a list of database objects, enter a Server name, select either Use Windows Authentication or enter a Username and password and press Connect. This should populate the Database drop down. Select a database and the bottom section should be populated with the objects that are scriptable within your database. From here, it is a simple case of selecting the appropriate options from the top right hand section (Include CREATE, Include DROP etc) and selecting the appropriate objects from the bottom section (there is a handy Script All option which selects everything. I have never been able to get the Schema or Type drop downs to populate, which may be something to do with the design of my objects.

Tab Two - More Options



Scriptio Options

The options tab allows choices of how many files to create (or even if you want to create a file), where to store them and some additional scripting objects (Include collation, Qualify CREATEs with Schema etc).

Tab Three - Generated Script



scriptio script generated

This tab displays the SQL generated - in an single box, allowing the content to be copied to clipboard and used elsewhere. Just as an observation, I've never seen the "Save As..." button enabled but I've never needed to use it either so that hasn't been a problem.

Sample script - a table



IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Book]') AND type in (N'U'))
DROP TABLE [dbo].[Book]
GO

/****** Object: Table [dbo].[Book] Script Date: 06/16/2008 17:30:18 ******/
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Book](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Title] [varchar](255) COLLATE Latin1_General_CI_AS NOT NULL,
[Author] [varchar](255) COLLATE Latin1_General_CI_AS NOT NULL,
[PublisherId] [int] NULL,
[ISBN] [varchar](25) COLLATE Latin1_General_CI_AS NULL,
[PublishedYear] [int] NULL,
[BookAgeId] AS (case datepart(year,getdate())-[PublishedYear] when (0) then (1) when (1) then (2) else (3) end)
) ON [PRIMARY]

GO


Sample script - a stored procedure



IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[up_GetBooks]') AND type in (N'P', N'PC'))
DROP PROCEDURE [dbo].[up_GetBooks]
GO


/****** Object: StoredProcedure [dbo].[up_GetBooks] Script Date: 06/16/2008 17:30:19 ******/
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: Jane Dallaway
-- Create date: 12 June 2008
-- Description: Get all books, in full detail
-- =============================================
CREATE PROCEDURE up_GetBooks
AS
BEGIN
SET NOCOUNT ON;

SELECT Book.Id, Book.Title, Book.Author, Publisher.Name, Book.ISBN, Book.PublishedYear, BookAge.Name BookAge
FROM Book
INNER JOIN BookAge
ON Book.BookAgeId = BookAge.Id
INNER JOIN Publisher
ON Book.PublisherId = Publisher.Id

END

GO


Summary



This is a great utility to allow scripting on an object by object basis. The source code is available too.

Labels: ,

// posted by Jane @ 5:43 PM   save to del.icio.us

Comments:

Friday, May 30, 2008

Searching for 'special' characters in a database using TSQL


One of my colleagues asked me for some quick SQL help earlier. Some data in the database he was looking at had ended up with a 'special character' on the end of it in a specific table. This didn't affect all rows, just some, but he needed helping identify those rows.

For the sake of arguments, lets imagine that the table was called Item, and the affected column was named Name, and there was also an Id column named ItemId. He knew one of the affected Items, and so we examined the data to determine what character it was at the end. To do this we did

SELECT ASCII(SUBSTRING(Name,10,1))
FROM Item
WHERE ItemId = 100


This informed us that the character was actually ASCII value 160. Knowing this, I then used the PATINDEX function and produced a query like:

SELECT *
FROM Item
WHERE PATINDEX('%' + CHAR(160) + '%',Name) > 0


which seemed to return the row that he'd already identified along with new ones. This is probably not the most performant method, but this was needed to check something in a debugging/support environment and not code that would be run on a regular basis.

Labels: ,

// posted by Jane @ 12:56 PM   save to del.icio.us

Comments:

Thursday, May 01, 2008

TSQL - How to get the date element of a datetime object


In SQL Server 2000 and 2005 there is no concept of a DATE data type, only a DATETIME. So, how do you get the DATE element only, i.e. with a 00:00 time element. One of my colleagues needed to do this, so after a bit of google searching he discovered the following.

-- Get today's date without the time element
SELECT CONVERT(DATETIME,FLOOR(CONVERT(FLOAT,GETDATE())))


I would probably have done it via a CONVERT/CAST operation, converting to a VARCHAR and then back to a DATETIME, but this is a much more efficient method.

In SQL Server 2008 there is a DATE data type so this will no longer be an issue.

Labels: ,

// posted by Jane @ 5:38 PM   save to del.icio.us

Comments:

Monday, April 28, 2008

More on transactions


After Friday's experiment with transactions in SQL Server I got to wondering about what would happen to actions between the
  ROLLBACK TRAN
and the
  IF @@TRANCOUNT > 0
    COMMIT TRAN

so I amended the code to have an additional INSERT as follows (Note: I also changed the final COMMIT TRAN to be a ROLLBACK TRAN to help show the differenence)

  BEGIN TRAN
  INSERT INTO COUNT VALUES (1)
    BEGIN TRAN
    INSERT INTO COUNT VALUES (4)
    ROLLBACK TRAN
  INSERT INTO COUNT VALUES (99) -- New line
  IF @@TRANCOUNT > 0
    ROLLBACK TRAN


which results in the value 99 being inserted into the table, outside of the scope of any transactions - all transactions were rolled back in the inner ROLLBACK TRAN

To ensure that the second part of the statement, which might be unrelated to the first and therefore not dependent on the result of that transaction, is within a transaction, the following seems to work with the expected results:

  BEGIN TRAN
  INSERT INTO COUNT VALUES (1)
    BEGIN TRAN
    INSERT INTO COUNT VALUES (4)
  ROLLBACK TRAN

  IF @@TRANCOUNT = 0
    BEGIN TRAN
    INSERT INTO COUNT VALUES (99)
  IF @@TRANCOUNT > 0
    ROLLBACK TRAN


No new row is added as a result of this action as now all actions are covered within a transaction.

Labels: ,

// posted by Jane @ 1:03 PM   save to del.icio.us

Comments:

Friday, April 25, 2008

TSQL Transactions - simple nesting


After Bruce and I were looking at DBVerse earlier, we started specifiying how our tool would work, especially with regards to transaction. Our outer process will need to handle its own transactions, and will need to be aware of inner errors and inner transactions to ensure that errors don't ensue.

For example, in the following scenario
BEGIN TRAN
INSERT INTO COUNT VALUES (1)
  BEGIN TRAN
  INSERT INTO COUNT VALUES (4)
  ROLLBACK TRAN
COMMIT TRAN

the following error is produced
Msg 3902, Level 16, State 1, Line 17
The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION.


This is because the inner ROLLBACK TRAN rollsback ALL transactions, not just the inner transaction. So, by the time the COMMIT TRAN is called there are no longer any transactions.

To get around this, the code can be changed to be:
BEGIN TRAN
INSERT INTO COUNT VALUES (1)
  BEGIN TRAN
  INSERT INTO COUNT VALUES (4)
  ROLLBACK TRAN

IF @@TRANCOUNT > 0
  COMMIT TRAN

which works happily and will not cause an error.

NOTE: Count is just a test table I created with a single column of Count which is defined as an int. Nothing very exciting.

Labels: ,

// posted by Jane @ 6:01 PM   save to del.icio.us

Comments:

Wednesday, March 26, 2008

Defensive SQL Updates


One of my recent tasks, well before my month off anyway, was to upgrade a number of similar SQL Server 2005 databases to a new structure. The databases had all started off the same, and were going to end up the same, but had been subjected to a differing number of bug fixes and enhancements along the way.

If I was updating just one database I would have used the excellent Redgate SQL Compare (of which more in a future post) but as I had to update many then this would not have been the most reliable way.

Instead, I wrote defensive SQL, that is SQL which checks for impact before doing it. So, before adding a column for instance, it checks to see if that column exists, and only attempts to add it if it doesn't - thus preventing errors. The whole update script was wrapped in a Transaction, and by making use use of a variable @bError of type BIT to track any errors, the last statement could COMMIT or ROLLBACK the transaction as appropriate.

So, for example, to add a new column to a table, here is the script I used:
IF @bError = 0
BEGIN
  IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE Table_Name = 'MyTable' AND Column_Name = 'MyNewColumn')
  BEGIN

    ALTER TABLE MyTable ADD MyNewColumn INT NOT NULL

    IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE Table_Name = 'MyTable' AND Column_Name = 'MyNewColumn')
    BEGIN
      PRINT 'Error Adding column ColumnName to table MyTable'
      SET @bError = 1
    END
    ELSE
    BEGIN
      PRINT 'Added column ColumnName to table MyTable'
    END
  END
END


By checking for the column's existance after applying the change, I could determine if an error had occurred, and set the @bError flag accordingly. I used the Print statements to produce a log of exactly what stages had been carried out on each database, using the function DB_NAME() to output the name of the current database at the beginning of the log.

Labels: ,

// posted by Jane @ 9:37 AM   save to del.icio.us

Comments:

Thursday, January 24, 2008

Finding some text within a SQL Server 2005 Database


I was asked what sounded like a fairly simple question today - is there a way to do a full text search of a database, without having to set up full text searching. As this was to answer an immediate query - where was some text that was being displayed coming from - and not within the context of a query to form part of an application I thought about it a while, and decided that some dynamic SQL was what was needed. I started off with the fact that I wanted to pull together a statement like:
SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%SearchTerm%'
and then run that across all columns which are text, ntext, varchar, nvarchar, char or nchar.

Getting the list of table and column names is reasonably straightforward
SELECT TABLE_NAME,COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE IN ('varchar', 'nvarchar', 'text', 'ntext', 'char', 'nchar')


I wanted to loop through all of these and get the counts, so I used a local table variable to hold the results
DECLARE @tabSearchableColumns TABLE (TableName VARCHAR(100), ColumnName VARCHAR(100), Matches int) and simply inserted the results of the above query into it (along with a NULL for the Matches column). I then set up a loop, looping based on the COUNT(*) FROM @tabSearchableColumns WHERE Matches IS NULL. Within that loop I take the top item from @tabSearchableColumns, and use the TableName and ColumnName to populate the SQL SELECT above.

Now came the interesting part, getting the variable @intDataCount from the following statement:
EXEC (SELECT @intDataCount = COUNT(*) FROM TableName WHERE ColumnName LIKE '%SearchTerm%)
as the variable @intDataCount has the scope of the execute statement, and doesn't get propogated to any outside variable with the same name.

To get around this involves the use of sp_executesql specifying the parameter that we expect to get out, i.e. EXEC sp_executesql @strSQL, N'@intDataCount INT OUTPUT', @intDataCount OUTPUT

An important point here is that @strSQL, or the string being sent in as the SQL string MUST be cast be an NVARCHAR, either by DECLARE @strSQL NVARCHAR(100) or N'SELECT ColumnName FROM TableName'. Otherwise, don't be surprised to encounter the following error:
Msg 214, Level 16, State 2, Procedure sp_executesql, Line 1
Procedure expects parameter '@statement' of type 'ntext/nchar/nvarchar'.


Finally, having got the count out into my @intDataCount variable, I can update the appropriate row in tabSearchableColumns, redo the COUNT(*) FROM @tabSearchableColumns WHERE Matches IS NULL and continue looping. The final stage is just to select the table and column data which has Matches > 0.

The script for FindTableColumnDataMatches is available for download, and as with sp_generateinsert is tested against SQL Server 2005, leave me comments if you've found this useful, or have suggestions for improvements.

Labels: , ,

// posted by Jane @ 7:59 PM   save to del.icio.us

Comments:

Wednesday, January 23, 2008

Update to spu_generateinsert - use of UNION SELECT


Jon Green from 4R Systems Inc left a comment earlier proposing a suggested improvement to spu_generateinsert.

He suggested that it would be easier to scan the produced SQL if it made use of SELECT and UNION SELECT rather than producing a whole host of INSERT statements. This changes the output from:
INSERT INTO [TableName] ([ColumnA], [ColumnB]) VALUES ('A','B')
INSERT INTO [TableName] ([ColumnA], [ColumnB]) VALUES ('C','D')

and replaces it with
INSERT INTO [TableName] ([ColumnA], [ColumnB])
SELECT 'A','B'
UNION SELECT 'C','D'


This is a great suggestion and I have implemented it by the use of an additional parameter @producesingleinsert. By default this is set to 0 which produces the separate INSERT statements as it has done all along. If it is specified as 1, however, it now produces a single statement making use of SELECT and UNION SELECT. I have implemented this as an optional parameter so as not to break any reliance on this functionality.

I also made another change today which is to exclude calculated columns. This has involved joining INFORMATION_SCHEMA.COLUMNS to syscolumns via sysobjects to make use of syscolumns.IsComputed.

The final change was the addition of another optional parameter @ debug. Again this defaults to 0, but if set to 1 prints out the SELECT clause which will be used to obtain the data. This can help to debug problems with the procedure, both in terms of the internal logic, but also with regards to checking that the criteria specified is being handled correctly.

As ever this is work in progress, so please continue sending in your suggestions and I'll attempt to implement them.

Download the script from here.

Labels: , ,

// posted by Jane @ 6:52 PM   save to del.icio.us

Comments:

Tuesday, January 15, 2008

More on spu_generateinsert


Dave posted a comment on the original post the other day seeking for help with resolving an error he was having running the script. The problem transpired to be with dealing with columns of data type image. On investigation, I discovered that there were quite a few column types which wouldn't work as they hadn't been catered for.

I've updated the script to work with all the data types I can get it to work with. For images, binary and varbinary columns, it attempts to insert NULL and puts a warning at the bottom of the generated script saying:
-- ** WARNING: There is an image column in your table which has not been migrated - this has been replaced with NULL. You will need to do this by hand. Images are not supported by this script at this time. or
-- ** WARNING: There is a binary or varbinary column in your table which has not been migrated - this has been replaced with NULL. You will need to do this by hand. Binary and VarBinary are not supported by this script at this time. as appropriate
This isn't great, but is at least informative and explains what has happened.

I will try and think about a way of scripting out data from columns of type image, binary and varbinary, but at least, in the meantime the script no longer errors (at least to my knowledge - if you know otherwise, then please let me know and I'll make more fixes).

Download updated spu_generateinsert.

Labels: , , ,

// posted by Jane @ 6:32 PM   save to del.icio.us

Comments:
What's the limitation on image/binary columns? Is it how to represent them in a plain-text SQL command? If so, how about representing them with e.g. base64? Presumably, it's not that simple though...!
 

Thursday, January 03, 2008

Update - Generate Insert statement from table for SQL Server


I had a bit of spare time today, so I've updated spu_generateInsert.sql to fix a couple of annoyances:
  • it now handles text and ntext appropriately - wrapping the data in single quotes, and escaping any single quotes as appropriate - it also displays a warning indicating that text data will be converted into varchar(8000)
  • it now handles NULL values appropriately, and so the step in my previous post on this procedure about "replace of 'NULL' with NULL" can now be safely ignored.
I've put comments into the stored procedure to indicate the changes.

I've tested this on SQL Server 2005 but I'd recommend testing it and checking it yourself before relying on it.

Download it here

Labels: , , ,

// posted by Jane @ 6:32 PM   save to del.icio.us

Comments:
Great stuff!
 
This post was linked from Reflective Perspective - Chris Alcock - Morning Brew #4
 
Thank u so much Jane!
 

Tuesday, December 04, 2007

Getting the ID from a concatenated string


A colleague is doing some data migration work, and had a string he needed to manipulate to get at the id. The string had been put together as a title, with spaces replaced with -, and an ID with the title and ID separated by a -, i.e. Software-Developer-123 where title is Software Developer and the ID is 123.

He'd written some code to manipulate this, but it was taking quite a while and wanted to know if there was a one line TSQL statement to do it.

After a bit of head scratching we came up with the following:

DECLARE @strTemp VARCHAR(100)
SET @strTemp = 'Software-Developer-123'
SELECT SUBSTRING(@strTemp, LEN(@strTemp)-(CHARINDEX('-',REVERSE(@strTemp)))+2, CHARINDEX('-',REVERSE(@strTemp))-1)


This returns 123, and so is a (long) one line fix to his problem. As there is no LastIndexOf or similar function, I've used the REVERSE function to enable us to find the last instance of - and then its just some manipulation using CHARINDEX and SUBSTRING to get at the ID. The use of the variable @strTemp was just so we could easily swap in different test data and so isn't of any real importance to the solution.

I really enjoy challenges like this and have emailed this on as a brainteaser to the rest of the development team.

Update:
@DECLARE @strTemp VARCHAR(100)
SET @strTemp = 'Software-Developer-123'

SELECT RIGHT(@strTemp, CHARINDEX('-',REVERSE(@strTemp))-1)

A lot cleaner :-) Thanks Alex.

Labels: , ,

// posted by Jane @ 5:33 PM   save to del.icio.us

Comments:

Tuesday, November 13, 2007

Generate SQL Insert statement from table data


Years ago I stumbled upon a helpful SQL script (SQL Server 2000 and 2005) written by a chap called Keith E Kratochvil which you could point at a table in your database and which would then generate a series of Insert scripts.

I made some amendments at the time and over the next few years to make a bit more useful to me for some specific requirements, including the ability to take bits of a where clause to restrict the data coming back, allowing me to script subsets of data. I've carried this script with me from company to company, and decided it was time to share the updated script.

One word of advice, before running the generated inserts, do a replace of 'NULL' with NULL as I failed to get NULLs appended to a string without causing the whole string to become NULL.

The procedure is available here so feel free to take a look and use it if you find it helpful.

Labels: , , ,

// posted by Jane @ 5:39 PM   save to del.icio.us

Comments:
Used this again today - thanks!
 
Hi Jane.

Ran the sp and got the following errors:

-- ** Start of Inserts

Msg 536, Level 16, State 5, Procedure sup_GenerateInsert, Line 151
Invalid length parameter passed to the SUBSTRING function.
Msg 156, Level 15, State 1, Line 1
Incorrect syntax near the keyword 'FROM'.
-- ** End of Inserts

Probably "user error" but I would appreciate any insights you might be able to offer.

Thanks.
 
Hi Dave

You haven't left me an email address or any other way to contact you, and your blogger profile is switched off or similar, so I'm hoping you're going to head back here and take a look.

There is a section in the script which looks like the following:

SELECT @SelList = SUBSTRING(@SelList, 1,(len(@SelList)-1))
SELECT @SelList = @SelList + ' FROM ' + @table
--SELECT @SelList AS TheSelList

I recommend uncommenting the "SELECT @SelList AS TheSelList" line to help enable debugging.

I'd be interested in seeing what is being generated, and the structure of the table you're running it against if you have the time. I'll then try and sort out the problem and update the stored procedure and hopefully fix the problem. Drop me a line at jane(@)dallaway.com - without the ()s of course :-) with any information you have to help.

Thanks
Jane