Tuesday, September 30, 2008
SQL Coding Dojo
Last week I led a SQL based Coding Dojo as part of the
Madgex ILP programme. A lot of ILP sessions are in a presentation format, and I wanted to do something with a bit more, or in fact a lot more, attendee participation. I've been interested in the idea of deliberate practice for a while, and Richard has been to quite a few coding dojos and so I figured it was time to give it a try. Shortly after I'd decided to do one, I attended a
Skills Matter evening and stumbled across
Ivan Sanchez, who blogged about
Starting a Coding dojo which helped me get my thoughts in hand.
I chose to use SQL as the area to practice, mainly because it is an area I am really familiar with and so could help out if necessary. Also I have a copy of
Joe Celko's SQL Puzzles and Answers which provided me with a great collection of puzzles to practice with.
I chose a puzzle called Double Duty which is described as follows:
A person may have more than one role. The roles are defined as follows:
O – Officer
D – Director
The PersonRole table is defined as:
PersonName VARCHAR(50)
RoleCode CHAR(1)
Produce a list of People and their roles where they are either Officer or Director, and where anyone with both the O and D role is represented once as B – Both.
I've put a copy of the SQL file I produced to create the tables, and populate the data
here, so feel free to
take a look.
I opened the session by explaining what a dojo was (a training hall for practice in the martial arts), what a coding dojo was (a practice area for coding skills) and introduced the problem domain (I provided print outs of the table structure, data and expected results). I then introduced the roles we were going to use - driver and co-pilot - and the rules we were going to adhere to. In our main meeting room we have a large plasma screen and so I connected this to my laptop so that the audience could see exactly the same as the driver and co-pilot. I provided a couple of SQL books and a
SQL Server cheatsheet for reference.
After the initial self-consciousness had gone, the pairs settled into the idea pretty well, and it was interesting to watch the different pairs interact and form new (rapid) working relationships. There were 7 participants, and each spent 5 minutes as the driver and 5 minutes as the co-pilot. By the end of the session there were 4 solutions produced which meant that the group as a whole felt that they'd achieved something. We ran a quick retrospective at the end of the session and the following were the major points:
- The audience found a certain amount of frustration by not being able to help, and not being able to shout out suggestions.
- At least one person acting as the co-pilot found it hard to articulate what needed to be done.
- At least one person admitted to finding it hard to think without having their fingers on the keyboard.
- At least one person learnt something about SQL.
- Quite a few people found a laptop keyboard hard to use, so a proper keyboard would have helped.
As a result of the success of this one, I've booked in another one for a couple of months time - again SQL based but I'm also thinking of making use of
TDD Problems for another practicable skill.
Labels: CodingDojo, ilp, madgex, SQL
// posted by Jane @ 8:28 PM
Comments:
Monday, September 22, 2008
SELECT COUNT(*) vs SELECT COUNT(colname)
Richard mentioned
this article to me yesterday, and was surprised at what it told him.
To summarize:
SELECT COUNT() has 2 meanings:
- Count the number of rows
- Count the number of values
Sometimes these are the same thing, but not always.
SELECT COUNT(*) always counts the number of rows in the result
SELECT COUNT(colname) counts the number of times that the column colname is not null
So, if colname is always populated the result will be the same but otherwise the value will be different.
Labels: SQL
// posted by Jane @ 1:57 PM
Comments:
Wednesday, September 03, 2008
Database Comparison tools: Redgate SQL Compare Review
I first started using
Redgate SQL Compare when I was at Glass's in 2003/04 and I'm still using it 4 years later. Over those years the product has evolved and new products have been released from Redgate. I have only ever used it for SQL Server (7, 2000 and 2005) and so can't comment on it's appropriateness or reliableness targeting other database products.
At the time that I first found SQL Compare, there weren't that many tools available that did a reliable job of comparing objects and producing scripts of the differences. I know that one of my colleagues at
Madgex is a fan of
SQL Delta. One day we sat down to compare the tools and discovered that there really wasn't much to choose between them.
I have always used SQL Compare more as a tool to check my scripted updates, rather than to produce those scripts. After a bad experience at a previous company when a member of my team managed to empty an articles table rather than update it when manipulating the database via Enterprise Manager I've habitually manually created
defensive SQL change scripts with appropriate transactions and error handling.
SQL Compare is simple to use, works quickly and produces a detailed list of differences between database objects. The differences that are detected are configurable, so you can choose to ignore or report upon white space or comment differences.
When you first open the screen you are presented with a screen offering options of what you want to compare.

Once you have registered a couple of databases to compare, it is a short wait until the differences are displayed.

The display of differences are divided into:
- objects that appear on the source database but not the target
- objects that appear on the target database but not the source
- objects that appear on both databases but are different
- objects that are the same on both databases

For each object that is different you can choose to view the details of that object which will display a SQL based description of the object with the differences identified by a highlight line - sometimes it might be the whole object, a part of an object (i.e.a column definition) and sometimes it will be a constraint, or a grant statement that is missing.

From the summary screen you can select the changes that you're interested in and get SQL Compare to automatically synchonise your target database to match the source, or vice versa. You can also get it to generate some SQL to do the generation to allow you to put your SQL upgrade scripts into a build process, or under source control. In the majority of cases I will use this tool to check my upgrade scripts, rather than to produce them and if I spot any difference I tend to write my own, defensive, SQL to make the upgrade scripts.

In the 4 or so years I've been using SQL Compare I've found it to be a stable, evolving product which I have come to rely upon completely to check upgrade processes. I made the most use of this tool at Glass's where I was working on a product which was client/server application where it couldn't be relied upon that the customer was always running the latest version. Consequently, our preparation for release process involved installing the earliest supported version of the system, and upgrading it to the release candidate on one machine and installing a clean, new build of the release candidate from scratch and comparing the two and producing upgrade SQL scripts for any identified differences before starting the process again. We also included a call to the command line version into the automated
MSBuild process producing an HTML report of the differences which could be used to fail the build if anything other than 0 differences was reported.
Labels: database change management, review, SQL, SQLServer2005, tool
// posted by Jane @ 9:36 AM
Comments:
Wednesday, August 20, 2008
SQL Snap
At last September's BarCamp Brighton I saw the
CSS Specificity Snap cards and this gave me an idea. So, armed with a concept of showing the different ways of producing the same output using the SQL Server 2005 flavour of SQL I started generating SQL statements.
Alex was kind enough to make them pretty, and last week I ran a
Madgex ILP session to play.
There are 26 cards in my pack, each card has a SQL statement, and a letter. The letter is used for the crib sheet to allow me to easily spot the matches.

I prepared by producing two piles of 13 cards, with a match in both piles. I then put one pile in the middle of the table face down, and dealt out the rest of the cards to the 5 people who were playing. I placed the 3 spare cards on the table face up so that we could all see them and proceeded to turn them over one by one. When a card was turned over everyone looked at the SQL on the overturned card, and checked their cards and the spare cards to see if there was a match. If a match wasn't spotted then I explained what the SQL was doing, and at only one point did I have to say which letter the matching card would have. During this I had quite a few comments about functions people didn't recognise - especially
COALESCE and
NULLIF. I also explained some of the performance, and functional differences between the SQL statements when they were seen - so for instance I explained the differences between
DELETE and TRUNCATE but I still I have these as a match because they can provide the same result and provided an interesting talking point.
The second game we played was a memory game where I placed all of the cards on the table face down and everyone turned over 2 cards at a time. If the SQL on the cards resulted in the same output, then it was a match and they took the cards away. If they didn't then the cards got turned back over and the next person had a go.
All in all this session took about 30 minutes, and resulted in quite a lot of noise and laughter (so much so that someone came and closed the door of the room we were in).
The following table contains the Letter Code, SQL statement, and the matching Letter code. They are ordered so that the matches are grouped together.
| Code | SQL | Match |
| V | SELECT CAST(GETDATE() AS VARCHAR(11)) | G |
| G | SELECT CONVERT (VARCHAR(11), GETDATE()) | V |
| Z | SELECT name FROM sysobjects WHERE xtype = 'P' | Q |
| Q | SELECT name FROM sys.procedures | Z |
| D | SELECT sysobjects.name, syscolumns.name FROM sysobjects INNER JOIN syscolumns ON sysobjects.id = syscolumns.id WHERE sysobjects.xtype = 'u' | A |
| A | SELECT sysobjects.name, syscolumns.name FROM sysobjects, syscolumns WHERE sysobjects.id = syscolumns.id AND sysobjects.xtype = 'u' | D |
| U | SELECT getdate() | O |
| O | EXEC('SELECT getdate()') | U |
| J | DELETE FROM Test | T |
| T | TRUNCATE TABLE Test | J |
| Y | SELECT ISNULL(NULL,1) | W |
| W | SELECT COALESCE(NULL,1) | Y |
| P | SELECT [name], xtype FROM sysobjects ORDER BY xtype | M |
| M | SELECT [name], xtype FROM sysobjects ORDER BY 2 | P |
| H | SELECT NULLIF(1,1) | R |
| R | SELECT CASE 1 WHEN 1 THEN NULL ELSE 1 END | H |
| I | SELECT [name], xtype FROM sysobjects ORDER BY 1 | N |
| N | SELECT [name], xtype FROM sysobjects ORDER BY [name] | I |
| X | SELECT CAST(GETDATE() AS VARCHAR(20)) | F |
| F | SELECT CONVERT (VARCHAR(20), GETDATE()) | X |
| S | SELECT name FROM syscolumns WHERE id = ( SELECT id FROM sysobjects WHERE xtype = 'u' AND name = 'Jobs' ) | K |
| K | WITH objects (id) AS ( SELECT id FROM sysobjects WHERE xtype = 'u' AND name = 'Jobs' ) SELECT name FROM syscolumns INNER JOIN objects ON syscolumns.id = objects.id | S |
| L | INSERT INTO Job (JobID, PrimaryJobTypeID) VALUES (1,1) | B |
| B | INSERT INTO Job (JobID, PrimaryJobTypeID) SELECT 1,1 | L |
| C | DECLARE @sMessage AS VARCHAR(20) SET @sMessage = 'Hello' | E |
| E | DECLARE @sMessage AS VARCHAR(20) SELECT @sMessage = 'Hello' | C |
Labels: SQL, SQLServer2005
// posted by Jane @ 8:17 PM
Comments:
Wednesday, August 06, 2008
Intersect vs Inner Join - Timings
I read the article
SQL SERVER - 2005 - Difference Between INTERSECT and INNER JOIN - INTERSECT vs. INNER JOIN today and was interested in the relative speed of Inner join vs Intersect for simple matcjhes and so thought I'd make
another use of my
timing code to see the difference.
I produced a couple of very simple tables:
Summary ([ID] INT, [Name] NVARCHAR(10))
Detail ([ID] INT, [Age] INT)both with primary keys on ID and populated them both as follows:
INSERT INTO Summary
SELECT 1, 'Jane' UNION ALL
SELECT 2, 'Richard'
INSERT INTO Detail
SELECT 1, 36 UNION ALL
SELECT 2,40I chose to do a very simple task - retrieve the ID for an entry which is in both Summary and Detail - note: this assumes that the ID is a foreign key constraint and the IDs are relating to the same item.
Using my rudimentary timing code, I think that intersect is more performant than the inner join - based on running the same query 100000 times
SELECT Summary.ID
FROM Summary
INNER JOIN Detail
ON Summary.ID = Detail.IDand
SELECT ID FROM Summary
INTERSECT
SELECT ID FROM Detail which has the following results:
| Description | TimeInMS |
| Join | 11746
| Intersect | 10203 |
I ran them both together and took a look at the actual execution plans and it does indicate that the intersect takes a little less effort - which must be down to the Seek instead of Scan in the Join'd table detail

So, another option for me to remember to consider when determining the best way to get at some data
Labels: SQL, SQLServer2005
// posted by Jane @ 6:28 PM
Comments:
Friday, July 18, 2008
TSQL: Enumations and constants
Bruce sent me a link the other day to an article
T4 template for generating SQL view from C# enumeration which I found interesting from a modelling constants/enumerations in SQL viewpoint.
The example used was modelling an enumeration of ContactType which has valid items of Individual and Organisation.
The article used a view to model this, as per
CREATE VIEW enumContactType
AS
SELECT
0 AS Individual,
1 AS Organizationand then using it within a SELECT as
SELECT *
FROM Contact
WHERE Type = (SELECT Organization FROM enumContactType)(Note: in the original article Oleg used a schema called enum, but I'm just ignoring this at the moment and have thus changed the name from enum.ContactType to enumContactType)
An alternative
In my previous company, we used
Scalar-Valued Functions to mimic constants, and I guess this could be extended to enumerations. I thought I'd re-create the above example and give it a try to see how it looks and compares.
So, to model the enumeration ContactType, I've created two functions as follows:
CREATE FUNCTION enumContactTypeIndividual()
RETURNS INT
AS
BEGIN
RETURN 0
END
GO
CREATE FUNCTION enumContactTypeOrganisation()
RETURNS INT
AS
BEGIN
RETURN 1
END
GOAnd then to reproduce the SELECT query I wrote:
SELECT *
FROM Contact
WHERE Type = dbo.enumContactTypeOrganisation()The resulting data matches that used in the VIEW model and provides an alternative. I'm sure that a template could be written to produce those functions as an output as per the end part of
Oleg's article.
Performance and timings
I was interested in the relative performance of these two methods, so armed with my
timing code from
last week I checked them out. I amended the SELECT to bring back the COUNT(*) FROM Contact into a local integer variable, and ran it 1000000 times.
The results are as follows:
| Code | Description | TimeInMS |
| EnumView | Using the view | 13010 |
| EnumUDF | Using the UDF | 21450 |
showing that the view method is more performant.
I then changed the
function to make use of SCHEMABINDING. The new functions look like:
CREATE FUNCTION enumContactTypeIndividual()
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
RETURN 0
END
GO
CREATE FUNCTION enumContactTypeOrganisation()
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
RETURN 1
END
GOAnd the timings change to be:
| Code | Description | TimeInMS |
| EnumView | Using the view | 13010 |
| EnumUDF | Using the UDF | 20280 |
which do reduce the time taken for the UDF but still means that the view is faster.
For interests sake I then ran a comparison timing against the code using the literal as:
SELECT *
FROM Contact
WHERE Type = 1which resulted in
| Code | Description | TimeInMS |
| EnumLiteral | Using the literal | 12043 |
showing it is faster, but not by much, than the view.
Summary
So, what has this shown?
- Using a view is quite efficient and effective for modelling enumerations
- Using a UDF is an alternative, but is slower
- Schema binding makes UDF usage quicker
- The difference between using a VIEW and using the hard-coded literal isn't a lot in perfomance terms
Labels: SQL, SQLServer2005
// posted by Jane @ 1:24 PM
Comments:
Tuesday, July 08, 2008
INFORMATION_SCHEMA views
As I
alluded to the other day, I'm gradually weaning myself off my dependency on (the fairly ugly)
sys.objects,
sys.columns etc as a way to query the meta data about my database. Instead I'm using the SQL-92 compliant
INFORMATION_SCHEMA views.
Information schema views provide an internal, system table-independent view of the SQL Server metadata. Information schema views enable applications to work correctly although significant changes have been made to the underlying system tables. The information schema views included in SQL Server 2005 comply with the SQL-92 standard definition for the INFORMATION_SCHEMA.
So now, when I'm writing database upgrade scripts and attempting to write
defensive SQL (which is my usual position these days, regardless of whether I think the script will be run more than once - lets just say I've learnt from making such assumptions) I usually wrap
ALTER TABLE statements within
IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE Table_Name = 'MyTable' AND Column_Name = 'MyNewColumn'),
CREATE TABLE statements within
IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE Table_Name = 'MyTable') etc
The main area that I have to revert to the sys views for is indexes, and finding out what columns are included in which index, which is the uglier, but no-less-effective
SELECT
OBJECT_NAME (i.object_id) AS Tablename,
i.name AS IndexName,
c.name AS ColumnName,
CASE ic.is_descending_key
WHEN 1 THEN 'DESC'
ELSE 'ASC'
END as ColumnSort
FROM sys.indexes i
INNER JOIN sys.index_columns ic
ON i.object_id = ic.object_id
AND i.index_id = ic.index_id
INNER JOIN sys.columns c
ON ic.object_id = c.object_id
AND ic.column_id = c.column_id
INNER JOIN sys.objects o
on c.object_id = o.object_id
WHERE o.type = 'U'
ORDER BY TableName, indexName, ic.key_ordinalMSDN has an interesting article
Querying the SQL Server System Catalog FAQ which has examples for finding out (using the various
object catalog views) many different areas of meta data across a SQL Server 2005 database and is worth using as a starting point.
Labels: SQL, SQLServer2005
// posted by Jane @ 2:20 PM
Comments:
Monday, July 07, 2008
TSQL - Timings
A while ago I blogged about
how to get the date element of a datetime column in TSQL. In that post I said
I would probably have done it via a CONVERT/CAST operation, converting to a VARCHAR and then back to a DATETIME, but this is a much more efficient method.
but I didn't prove it at the time. I gave it some more thought and wanted to know what the differences were, so I wrote some
timings code.
This script creates one table
Timings with columns of
Code,
Description,
ActionTime and
IsComplete. It has a combined primary key of Code and IsComplete. Code must be unique - and can be a string of up to 10 characters long to uniquely identify the action being timed. IsComplete is used to differentiate between the start time and end time of the process being monitored.
The script also creates 3 stored procedures:
- up_RecordStart which takes 2 parameters - the unique code and optional description. This is used to record the start of the activity being monitored.
- up_RecordEnd which takes just 1 parameter - the code - should match the code used in up_RecordStart. This is used to record the end of the activity being monitored.
- up_GetTimings which again takes just 1 parameter - the code to return the timings from. It then returns the Code, Description and the length of time the action took in ms.
I wrote some script to then use these objects to test the assertion I made that FLOOR and combinations of converting DATETIME to FLOAT etc would be more efficient than using either CAST or CONVERT to VARCHAR(12) and back again to a DATETIME.
------------------------------
-- Clean up before we start --
------------------------------
DELETE FROM Timings
WHERE Code IN ('FLR','CONVERT','CAST')
GO
---------------------
-- Try using Floor --
---------------------
EXEC up_RecordStart @Code='FLR', @Description='SELECT CONVERT(DATETIME,FLOOR(CONVERT(FLOAT,GETDATE())))'
GO
DECLARE @i AS INTEGER
DECLARE @floorDate AS DATETIME
SET @i = 0
WHILE @i < 1000000 -- try the next statement for 1000000 times - this should be enough to see some differences
BEGIN
SET @floorDate = CONVERT(DATETIME,FLOOR(CONVERT(FLOAT,GETDATE())))
SET @i = @i + 1
END
PRINT @floorDate
GO
EXEC up_RecordEnd @Code='FLR'
GO
-----------------------
-- Try using convert --
-----------------------
EXEC up_RecordStart @Code='CONVERT', @Description='SELECT CONVERT(DATETIME,CONVERT(VARCHAR(12),GETDATE()))'
GO
DECLARE @i AS INTEGER
DECLARE @floorDate AS DATETIME
SET @i = 0
WHILE @i < 1000000
BEGIN
SET @floorDate = CONVERT(DATETIME,CONVERT(VARCHAR(12),GETDATE()))
SET @i = @i + 1
END
PRINT @floorDate
GO
EXEC up_RecordEnd @Code='CONVERT'
GO
--------------------
-- Try using Cast --
--------------------
EXEC up_RecordStart @Code='CAST', @Description='SELECT CAST(CAST(GETDATE() AS VARCHAR(12)) AS DATETIME)'
GO
DECLARE @i AS INTEGER
DECLARE @floorDate AS DATETIME
SET @i = 0
WHILE @i < 1000000
BEGIN
SET @floorDate = CAST(CAST(GETDATE() AS VARCHAR(12)) AS DATETIME)
SET @i = @i + 1
END
PRINT @floorDate
GO
EXEC up_RecordEnd @Code='CAST'
GO
-------------------------
-- Now get the timings --
-------------------------
EXEC up_GetTimings 'FLR'
GO
EXEC up_GetTimings 'CONVERT'
GO
EXEC up_GetTimings 'CAST'
GO This results in the following data being returned:
| Code | Description | TimeInMS |
| FLR | SELECT CONVERT(DATETIME,FLOOR(CONVERT(FLOAT,GETDATE()))) | 1313 |
| CONVERT | SELECT CONVERT(DATETIME,CONVERT(VARCHAR(12),GETDATE())) | 3236 |
| CAST | SELECT CAST(CAST(GETDATE() AS VARCHAR(12)) AS DATETIME) | 3203 |
which shows that the method using FLOOR
is more efficient, and that there isn't a lot to chose between CONVERT and CAST
Labels: SQL, SQLServer2005
// posted by Jane @ 7:11 PM
Comments:
Friday, July 04, 2008
Monty Hall problem - TSQL
Following on from
this morning's post about the Monty Hall problem, and proving it in PHP I figured I'd prove it in TSQL as well.
So
here is my SQL version.
To maintain consistency with
my PHP version, I've made it output similar text, so the results are along the lines of:
Monty Hall Problem
This is a simple TSQL query to prove the Monty Hall problem [http://en.wikipedia.org/wiki/Monty_hall_problem]
------------------------
The Results are in:
------------------------
Out of 10000 games, the contestant was right to swap 66.94% of the time and wrong 33.06% of the timeThe TSQL version is a bit more elegant with regards to working out which door to open for the contestant, as it is a simple statement of
SELECT TOP 1 @Opened = DoorNumber
FROM @Doors
WHERE DoorNumber NOT IN (@Prize, @Picked)
ORDER BY NEWID()making the most of set theory to enable the exclusion of the @Prize door and the @Picked door as opposed to the same thing in my PHP code
$remaining = array();
/* the gameshow host opens a door which has nothing behind it, so the gameshow host knows where the prize is
but can't choose to open the door the contestant has chosen, so remove both picked and prize from the options,
this leaves either one of two doors that can be opened, so pick one randomly */
for ($i=0; $i<3; $i ++)
{
switch($doors[$i])
{
case $prize:
break;
case $picked:
break;
default:
array_push($remaining,$doors[$i]);
}
}
$opened = $remaining[array_rand($remaining)];which is all a bit more procedural and, at least to my mind, less elegant - but then again I like the syntax of SQL which either makes me a freak or a masochist (according to at least one colleague)
Labels: Monty Hall problem, SQL
// posted by Jane @ 6:08 PM
Comments:
Monday, June 30, 2008
Find a string in a stored procedure - Revisited
A little over 2 years ago I
posted a couple of methods of finding the instances of a string within stored procedures - at the time I'd left some debug 'Print @' code in at least one stored procedure and needed to find it before releasing the code.
Today, I wanted to do the same thing, and as over the past 2 years I've been weaning myself off accessing the sysobjects tables, instead making use of the
INFORMATION_SCHEMA views, I decided it was time to add an update to that earlier post.
So, using
INFORMATION_SCHEMA.ROUTINES it is coded as :
SELECT SPECIFIC_NAME
FROM INFORMATION_SCHEMA.ROUTINES
WHERE ROUTINE_DEFINITION LIKE '%PRINT @%'Labels: SQL, SQLServer, SQLServer2005
// posted by Jane @ 2:53 PM
Comments:
Monday, June 16, 2008
Scripting out object level SQL
In SQL Server 2000 Enterprise Manager provided the ability to output SQL file on an object level for tables, functions, stored procedures etc. In SQL Server 2005 this option disapeared and the only outputting was in a single file. This wasn't great for controlling objects within a source control system.
A while ago I
stumbled across Scriptio, a ClickOnce application to do just this. It has a very simple interface, and you can tell that it was designed to solve a problem, rather than as an end solution.
Tab One - Database Objects

To get a list of database objects, enter a Server name, select either Use Windows Authentication or enter a Username and password and press Connect. This should populate the Database drop down. Select a database and the bottom section should be populated with the objects that are scriptable within your database. From here, it is a simple case of selecting the appropriate options from the top right hand section (Include CREATE, Include DROP etc) and selecting the appropriate objects from the bottom section (there is a handy Script All option which selects everything. I have never been able to get the Schema or Type drop downs to populate, which may be something to do with the design of my objects.
Tab Two - More Options

The options tab allows choices of how many files to create (or even if you want to create a file), where to store them and some additional scripting objects (Include collation, Qualify CREATEs with Schema etc).
Tab Three - Generated Script

This tab displays the SQL generated - in an single box, allowing the content to be copied to clipboard and used elsewhere. Just as an observation, I've never seen the "Save As..." button enabled but I've never needed to use it either so that hasn't been a problem.
Sample script - a table
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Book]') AND type in (N'U'))
DROP TABLE [dbo].[Book]
GO
/****** Object: Table [dbo].[Book] Script Date: 06/16/2008 17:30:18 ******/
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Book](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Title] [varchar](255) COLLATE Latin1_General_CI_AS NOT NULL,
[Author] [varchar](255) COLLATE Latin1_General_CI_AS NOT NULL,
[PublisherId] [int] NULL,
[ISBN] [varchar](25) COLLATE Latin1_General_CI_AS NULL,
[PublishedYear] [int] NULL,
[BookAgeId] AS (case datepart(year,getdate())-[PublishedYear] when (0) then (1) when (1) then (2) else (3) end)
) ON [PRIMARY]
GOSample script - a stored procedure
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[up_GetBooks]') AND type in (N'P', N'PC'))
DROP PROCEDURE [dbo].[up_GetBooks]
GO
/****** Object: StoredProcedure [dbo].[up_GetBooks] Script Date: 06/16/2008 17:30:19 ******/
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: Jane Dallaway
-- Create date: 12 June 2008
-- Description: Get all books, in full detail
-- =============================================
CREATE PROCEDURE up_GetBooks
AS
BEGIN
SET NOCOUNT ON;
SELECT Book.Id, Book.Title, Book.Author, Publisher.Name, Book.ISBN, Book.PublishedYear, BookAge.Name BookAge
FROM Book
INNER JOIN BookAge
ON Book.BookAgeId = BookAge.Id
INNER JOIN Publisher
ON Book.PublisherId = Publisher.Id
END
GOSummary
This is a great utility to allow scripting on an object by object basis. The
source code is available too.
Labels: SQL, SQLServer2005
// posted by Jane @ 5:43 PM
Comments:
Friday, May 30, 2008
Searching for 'special' characters in a database using TSQL
One of my colleagues asked me for some quick SQL help earlier. Some data in the database he was looking at had ended up with a 'special character' on the end of it in a specific table. This didn't affect all rows, just some, but he needed helping identify those rows.
For the sake of arguments, lets imagine that the table was called Item, and the affected column was named Name, and there was also an Id column named ItemId. He knew one of the affected Items, and so we examined the data to determine what character it was at the end. To do this we did
SELECT ASCII(SUBSTRING(Name,10,1))
FROM Item
WHERE ItemId = 100This informed us that the character was actually ASCII value 160. Knowing this, I then used the
PATINDEX function and produced a query like:
SELECT *
FROM Item
WHERE PATINDEX('%' + CHAR(160) + '%',Name) > 0which seemed to return the row that he'd already identified along with new ones. This is probably not the most performant method, but this was needed to check something in a debugging/support environment and not code that would be run on a regular basis.
Labels: SQL, SQLServer2005
// posted by Jane @ 12:56 PM
Comments:
Thursday, May 01, 2008
TSQL - How to get the date element of a datetime object
In SQL Server 2000 and 2005 there is no concept of a
DATE data type, only a
DATETIME. So, how do you get the
DATE element only, i.e. with a 00:00 time element. One of my colleagues needed to do this, so after a bit of google searching he discovered the following.
-- Get today's date without the time element
SELECT CONVERT(DATETIME,FLOOR(CONVERT(FLOAT,GETDATE())))I would probably have done it via a
CONVERT/
CAST operation, converting to a
VARCHAR and then back to a
DATETIME, but this is a much more efficient method.
In SQL Server 2008 there is a
DATE data type so this will no longer be an issue.
Labels: SQL, SQLServer
// posted by Jane @ 5:38 PM
Comments:
Monday, April 28, 2008
More on transactions
After Friday's
experiment with transactions in SQL Server I got to wondering about what would happen to actions between the
ROLLBACK TRANand the
IF @@TRANCOUNT > 0
COMMIT TRANso I amended the code to have an additional INSERT as follows (Note: I also changed the final
COMMIT TRAN to be a
ROLLBACK TRAN to help show the differenence)
BEGIN TRAN
INSERT INTO COUNT VALUES (1)
BEGIN TRAN
INSERT INTO COUNT VALUES (4)
ROLLBACK TRAN
INSERT INTO COUNT VALUES (99) -- New line
IF @@TRANCOUNT > 0
ROLLBACK TRANwhich results in the value 99 being inserted into the table, outside of the scope of any transactions - all transactions were rolled back in the inner
ROLLBACK TRANTo ensure that the second part of the statement, which might be unrelated to the first and therefore not dependent on the result of that transaction, is within a transaction, the following seems to work with the expected results:
BEGIN TRAN
INSERT INTO COUNT VALUES (1)
BEGIN TRAN
INSERT INTO COUNT VALUES (4)
ROLLBACK TRAN
IF @@TRANCOUNT = 0
BEGIN TRAN
INSERT INTO COUNT VALUES (99)
IF @@TRANCOUNT > 0
ROLLBACK TRANNo new row is added as a result of this action as now all actions are covered within a transaction.
Labels: ilp, SQL
// posted by Jane @ 1:03 PM
Comments:
Friday, April 25, 2008
TSQL Transactions - simple nesting
After
Bruce and I were looking at
DBVerse earlier, we started specifiying how our tool would work, especially with regards to transaction. Our outer process will need to handle its own transactions, and will need to be aware of inner errors and inner transactions to ensure that errors don't ensue.
For example, in the following scenario
BEGIN TRAN
INSERT INTO COUNT VALUES (1)
BEGIN TRAN
INSERT INTO COUNT VALUES (4)
ROLLBACK TRAN
COMMIT TRANthe following error is produced
Msg 3902, Level 16, State 1, Line 17
The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION.This is because the inner
ROLLBACK TRAN rollsback ALL transactions, not just the inner transaction. So, by the time the
COMMIT TRAN is called there are no longer any transactions.
To get around this, the code can be changed to be:
BEGIN TRAN
INSERT INTO COUNT VALUES (1)
BEGIN TRAN
INSERT INTO COUNT VALUES (4)
ROLLBACK TRAN
IF @@TRANCOUNT > 0
COMMIT TRANwhich works happily and will not cause an error.
NOTE: Count is just a test table I created with a single column of Count which is defined as an int. Nothing very exciting.
Labels: ilp, SQL
// posted by Jane @ 6:01 PM
Comments:
Wednesday, March 26, 2008
Defensive SQL Updates
One of my recent tasks, well before my month off anyway, was to upgrade a number of similar SQL Server 2005 databases to a new structure. The databases had all started off the same, and were going to end up the same, but had been subjected to a differing number of bug fixes and enhancements along the way.
If I was updating just one database I would have used the excellent
Redgate SQL Compare (of which more in a future post) but as I had to update many then this would not have been the most reliable way.
Instead, I wrote defensive SQL, that is SQL which checks for impact before doing it. So, before adding a column for instance, it checks to see if that column exists, and only attempts to add it if it doesn't - thus preventing errors. The whole update script was wrapped in a Transaction, and by making use use of a variable
@bError of type
BIT to track any errors, the last statement could
COMMIT or
ROLLBACK the transaction as appropriate.
So, for example, to add a new column to a table, here is the script I used:
IF @bError = 0
BEGIN
IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE Table_Name = 'MyTable' AND Column_Name = 'MyNewColumn')
BEGIN
ALTER TABLE MyTable ADD MyNewColumn INT NOT NULL
IF NOT EXISTS (SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE Table_Name = 'MyTable' AND Column_Name = 'MyNewColumn')
BEGIN
PRINT 'Error Adding column ColumnName to table MyTable'
SET @bError = 1
END
ELSE
BEGIN
PRINT 'Added column ColumnName to table MyTable'
END
END
ENDBy checking for the column's existance after applying the change, I could determine if an error had occurred, and set the
@bError flag accordingly. I used the Print statements to produce a log of exactly what stages had been carried out on each database, using the function
DB_NAME() to output the name of the current database at the beginning of the log.
Labels: SQL, SQLServer2005
// posted by Jane @ 9:37 AM
Comments:
Thursday, January 24, 2008
Finding some text within a SQL Server 2005 Database
I was asked what sounded like a fairly simple question today - is there a way to do a full text search of a database, without having to set up full text searching. As this was to answer an immediate query - where was some text that was being displayed coming from - and not within the context of a query to form part of an application I thought about it a while, and decided that some dynamic SQL was what was needed. I started off with the fact that I wanted to pull together a statement like:
SELECT COUNT(*) FROM TableName WHERE ColumnName LIKE '%SearchTerm%'and then run that across all columns which are text, ntext, varchar, nvarchar, char or nchar.
Getting the list of table and column names is reasonably straightforward
SELECT TABLE_NAME,COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE IN ('varchar', 'nvarchar', 'text', 'ntext', 'char', 'nchar')I wanted to loop through all of these and get the counts, so I used a local table variable to hold the results
DECLARE @tabSearchableColumns TABLE (TableName VARCHAR(100), ColumnName VARCHAR(100), Matches int) and simply inserted the results of the above query into it (along with a NULL for the Matches column). I then set up a loop, looping based on the
COUNT(*) FROM @tabSearchableColumns WHERE Matches IS NULL. Within that loop I take the top item from @tabSearchableColumns, and use the TableName and ColumnName to populate the SQL SELECT above.
Now came the interesting part, getting the variable
@intDataCount from the following statement:
EXEC (SELECT @intDataCount = COUNT(*) FROM TableName WHERE ColumnName LIKE '%SearchTerm%)as the variable @intDataCount has the scope of the execute statement, and doesn't get propogated to any outside variable with the same name.
To get around this involves the use of
sp_executesql specifying the parameter that we expect to get out, i.e.
EXEC sp_executesql @strSQL, N'@intDataCount INT OUTPUT', @intDataCount OUTPUTAn important point here is that @strSQL, or the string being sent in as the SQL string MUST be cast be an
NVARCHAR, either by
DECLARE @strSQL NVARCHAR(100) or
N'SELECT ColumnName FROM TableName'. Otherwise, don't be surprised to encounter the following error:
Msg 214, Level 16, State 2, Procedure sp_executesql, Line 1
Procedure expects parameter '@statement' of type 'ntext/nchar/nvarchar'.Finally, having got the count out into my @intDataCount variable, I can update the appropriate row in tabSearchableColumns, redo the
COUNT(*) FROM @tabSearchableColumns WHERE Matches IS NULL and continue looping. The final stage is just to select the table and column data which has Matches > 0.
The script for
FindTableColumnDataMatches is available for
download, and as with
sp_generateinsert is tested against SQL Server 2005, leave me comments if you've found this useful, or have suggestions for improvements.
Labels: FindTableColumnDataMatches, SQL, SQLServer2005
// posted by Jane @ 7:59 PM
Comments:
Wednesday, January 23, 2008
Update to spu_generateinsert - use of UNION SELECT
Jon Green from 4R Systems Inc left a
comment earlier proposing a suggested improvement to
spu_generateinsert.
He suggested that it would be easier to scan the produced SQL if it made use of SELECT and UNION SELECT rather than producing a whole host of INSERT statements. This changes the output from:
INSERT INTO [TableName] ([ColumnA], [ColumnB]) VALUES ('A','B')
INSERT INTO [TableName] ([ColumnA], [ColumnB]) VALUES ('C','D')and replaces it with
INSERT INTO [TableName] ([ColumnA], [ColumnB])
SELECT 'A','B'
UNION SELECT 'C','D'This is a great suggestion and I have implemented it by the use of an additional parameter
@producesingleinsert. By default this is set to 0 which produces the separate INSERT statements as it has done all along. If it is specified as 1, however, it now produces a single statement making use of SELECT and UNION SELECT. I have implemented this as an optional parameter so as not to break any reliance on this functionality.
I also made another change today which is to exclude calculated columns. This has involved joining
INFORMATION_SCHEMA.COLUMNS to
syscolumns via
sysobjects to make use of
syscolumns.IsComputed.
The final change was the addition of another optional parameter
@ debug. Again this defaults to 0, but if set to 1 prints out the SELECT clause which will be used to obtain the data. This can help to debug problems with the procedure, both in terms of the internal logic, but also with regards to checking that the criteria specified is being handled correctly.
As ever this is work in progress, so please continue sending in your suggestions and I'll attempt to implement them.
Download the script from
here.
Labels: spu_generateinsert, SQL, SQLServer2005
// posted by Jane @ 6:52 PM
Comments:
Tuesday, January 15, 2008
More on spu_generateinsert
Dave posted a
comment on the
original post the other day seeking for help with resolving an error he was having running
the script. The problem transpired to be with dealing with columns of data type image. On investigation, I discovered that there were quite a few column types which wouldn't work as they hadn't been catered for.
I've updated the script to work with all the data types I can get it to work with. For images, binary and varbinary columns, it attempts to insert NULL and puts a warning at the bottom of the generated script saying:
-- ** WARNING: There is an image column in your table which has not been migrated - this has been replaced with NULL. You will need to do this by hand. Images are not supported by this script at this time. or
-- ** WARNING: There is a binary or varbinary column in your table which has not been migrated - this has been replaced with NULL. You will need to do this by hand. Binary and VarBinary are not supported by this script at this time. as appropriate
This isn't great, but is at least informative and explains what has happened.
I will try and think about a way of scripting out data from columns of type image, binary and varbinary, but at least, in the meantime the script no longer errors (at least to my knowledge - if you know otherwise, then please let me know and I'll make more fixes).
Download updated
spu_generateinsert.
Labels: spu_generateinsert, SQL, SQLServer, SQLServer2005
// posted by Jane @ 6:32 PM
Comments:
Thursday, January 03, 2008
Update - Generate Insert statement from table for SQL Server
I had a bit of spare time today, so I've updated
spu_generateInsert.sql to fix a couple of annoyances:
- it now handles text and ntext appropriately - wrapping the data in single quotes, and escaping any single quotes as appropriate - it also displays a warning indicating that text data will be converted into varchar(8000)
- it now handles NULL values appropriately, and so the step in my previous post on this procedure about "replace of 'NULL' with NULL" can now be safely ignored.
I've put comments into the stored procedure to indicate the changes.
I've tested this on SQL Server 2005 but I'd recommend testing it and checking it yourself before relying on it.
Download it
hereLabels: spu_generateinsert, SQL, SQLServer, SQLServer2005
// posted by Jane @ 6:32 PM
Comments:
Tuesday, December 04, 2007
Getting the ID from a concatenated string
A colleague is doing some data migration work, and had a string he needed to manipulate to get at the id. The string had been put together as a title, with spaces replaced with -, and an ID with the title and ID separated by a -, i.e. Software-Developer-123 where title is Software Developer and the ID is 123.
He'd written some code to manipulate this, but it was taking quite a while and wanted to know if there was a one line TSQL statement to do it.
After a bit of head scratching we came up with the following:
DECLARE @strTemp VARCHAR(100)
SET @strTemp = 'Software-Developer-123'
SELECT SUBSTRING(@strTemp, LEN(@strTemp)-(CHARINDEX('-',REVERSE(@strTemp)))+2, CHARINDEX('-',REVERSE(@strTemp))-1)This returns 123, and so is a (long) one line fix to his problem. As there is no LastIndexOf or similar function, I've used the
REVERSE function to enable us to find the last instance of - and then its just some manipulation using
CHARINDEX and
SUBSTRING to get at the ID. The use of the variable @strTemp was just so we could easily swap in different test data and so isn't of any real importance to the solution.
I really enjoy challenges like this and have emailed this on as a brainteaser to the rest of the development team.
Update:
@DECLARE @strTemp VARCHAR(100)
SET @strTemp = 'Software-Developer-123'
SELECT RIGHT(@strTemp, CHARINDEX('-',REVERSE(@strTemp))-1)A lot cleaner :-) Thanks Alex.
Labels: SQL, SQLServer, SQLServer2005
// posted by Jane @ 5:33 PM
Comments:
Tuesday, November 13, 2007
Generate SQL Insert statement from table data
Years ago I stumbled upon a helpful SQL script (SQL Server 2000 and 2005) written by a chap called Keith E Kratochvil which you could point at a table in your database and which would then generate a series of Insert scripts.
I made some amendments at the time and over the next few years to make a bit more useful to me for some specific requirements, including the ability to take bits of a where clause to restrict the data coming back, allowing me to script subsets of data. I've carried this script with me from company to company, and decided it was time to share the updated script.
One word of advice, before running the generated inserts, do a replace of 'NULL' with NULL as I failed to get NULLs appended to a string without causing the whole string to become NULL.
The procedure is available
here so feel free to take a look and use it if you find it helpful.
Labels: spu_generateinsert, SQL, SQLServer, SQLServer2005
// posted by Jane @ 5:39 PM
Comments: