Jane Dallaway

Jane Dallaway

Jane Dallaway  //  Development manager, photographer, dog owner and snowboarder based in Brighton, UK
Email: jane @ dallaway.com
Also at:        

Changing the direction of results from a SQL Server ORDER BY clause

I have a stored procedure which returns me a list of items, and I want the directional order of them to change based on a parameter. I didn't want to resort to dynamic SQL, so I had a bit of a hunt around and found the excellent blog post Dynamic/Conditional Order By Clause in SQL Server/T-SQL.  Here's my result:

 

CREATE Procedure [dbo].[mp_GetItems]
(
 @OldestFirst BIT
)
AS
SELECT 
  Item.ID, Item.Description
FROM 
Items
ORDER BY 
  CASE WHEN @OldestFirst = 1 THEN CreatedDate END ASC,
  CASE WHEN @OldestFirst = 0 THEN CreatedDate END DESC

GO

which works a treat. The key is, as mentioned in the blog post

Ascending and Descending actions need to be grouped into separate CASE statements, separated with a comma

 

Loading mentions Retweet
Filed under  //  Development   SQL   SQLServer   SQLServer2005  

Comments (0)

TSQL: Escaping % in a SQL Like clause

Today I had cause to search a text field in a SQL Server 2005 database for the phrase %3 (I had some data being url encoded incorrectly) and this wasn't something I'd tried to do before so I had to find out how to escape the % sign.  A quick google and a re-read of the LIKE syntax, and I discovered that to escape the % I had to do something like

WHERE [Value] LIKE '%!%3%' ESCAPE '!'

After talking this through with a colleague, he asked me to try some other escape characters, and here's what we found out:

Attempting to use a string as an escape phrase, i.e.

WHERE [Value] like '%WORD%1%' ESCAPE 'WORD'

doesn't work, and fails with an error of

 

Msg 506, Level 16, State 1, Line 1

The invalid escape character "WORD" was specified in a LIKE predicate.

 

Attempting to use % as the escape character, gets it quite confused 

WHERE [Value] like '%%%1%' ESCAPE '%'

returns 0 rows, but no errors

 

Attempting to use ' as an escape character, which needs to be escaped itself 

 

WHERE [Value] like '%''%1%' ESCAPE ''''

returns the correct rows

Loading mentions Retweet
Filed under  //  nts   SQL   SQLServer2005  

Comments (0)

TSQL: Get a list of values into a comma separated list without using a UDF

I needed to create a comma separated list from a series of values in a column in the database the other day, and being mindful of the affect of scalar UDFs on the query plan (after attending SQLBits VI) I looked around for an alternative method to my previous favourite using COALESCE.  A couple of blog comments mentioned using STUFF and FOR XML PATH to do this, and after a bit more research I found some examples from Kodyaz Development Resources which seemed to do the trick 

SELECT 
  STUFF(
    (
      SELECT ',' + <ColumnName>
      FROM <TableName>
      FOR XML PATH('')
    ), 1, 1, '') as CommaSeparatedList

Loading mentions Retweet
Filed under  //  nts   SQL  

Comments (1)

SQLBits VI

On Friday I attended my 2nd SQLBits conference.  This time it was hosted at the Church House Conference Centre, Westminster which was a good centre, with plenty of facilities and some beautiful rooms.  It also benefitted from being just a 15 minute walk from Victoria.

The day started with a hot breakfast sandwich (bacon, sausage or egg) and tea or coffee, before the first session started at 9.20am.  There were 4 streams of talks, and 3 different audiences - DBAs, Developers, or Business Intelligence users. There was always at least one talk for each audience going on at any one time giving a reasonable amount of choice. Only one talk had to be cancelled due to a speaker not being able to get to London because of the volcano and this was replaced with a panel session.

I chose to start by attending the talk 

Getting started with SQL Azure by Dr Keith Burns, Data Architect, Microsoft UK (I can't find a website for Keith that has been recently updated - his blog seems to stop in August 2007 - if you know of one, please leave me a comment)

SQL Azure - Keith Burns

As I mentioned the other day, I am interested in finding out more about the Microsoft offering for cloud computing, so this was an obvious choice.  As this was the first talk of the day, my notes are a little fragmented as I got my self into note taking mode, but my key takeaways were:

  • Reporting and the BI tools aren't on the cloud yet, so you'd need to use a local version of these tools
  • Availability is high - each database is in 3 places, in different scale units
  • DB performance is affected by "neighbours" on the same machine. Load balancing is happening automatically but is not perfect yet
  • Connections may get killed if it they are for long running transactions
  • Updates to SQL Azure will be rolled out on a 2 monthly basis
  • Databases can currently be 1GB, or 10GB.  If you exceed this you get a nasty looking error message.  If you currently have 1GB you can upgrade via an ALTER DATABASE statement.  A 50GB offering is in the works as is spatial data
  • Prices are in dollars, charged daily - so if you use it for one day you pay for one day - it works out that a 1GB database is approximately £72/year

Tea and pastries time before choosing a DBA talk (the developer one wasn't of real interest to me)

Inside the SQL Server Query optimizer by Conor Cunningham, Principal Architect, SQL Server Query Processor

Query optimizer Overview - Conor Cunningham

This was a packed out talk, and I ended up sitting on the floor, as did several others.

Conor's session was a bit full

I've never really known much about how the optimizer works, and I knew that this wasn't exactly going to be an introductory session, so I was somewhat concerned about how much of it I'd follow, and whether I'd actually gain anything out of it.  What I learnt was:

  • My maths isn't as good as it once was - I'm sure I'd have understood more whilst I was at university
  • The aim from query optimisation is "good enough" - trying to find the best query plan would take too long so good enough will usually do
  • Query optimizer is just a set theory engine
  • Just indexing is usually insufficient - you want to evaluate whether an index is "covering" as a non covering index will typically need to do a fetch from the base table (SQL Server 2005 and beyond I believe)
  • An index can be used to return data, not just for filtering or ordering - hence the use of covering indexes
  • Missing index DMVs can suggest covering indexes
  • Parallel queries are considered for "expensive" queries (> 5 seconds) but not all operators support parallelism and some scalar functions will prevent parallelism
  • User Defined Functions don't have good histogram support - so don't get optimized well - this was picked up in greater detail in Simon Sabin's talk later in the day - so forward to the end of this post if you can't wait to find out

Another quick cup of tea before moving on to 

Non-Functional Dependencies - A talktorial on getting performance and scalability right through proper Database Design and understanding Set and Relational theory by Tony Rogerson, SQL Server MVP.  The slides for this talk are available here

Performance & Scalability through normalisation and set theory - Tony Rogerson

A discussion on some of the things to be considered when using SQL.  Amongst them were:

  • a demo of the difference in performance around choosing the different kinds of surrogate keys - IDENTITY, NEWID and NEWSEQUENTIALID (which was new to me) - NEWID causes fragmentation of data and so affects performance, NEWSEQUENTIALID has the benefits of NEWID but without fragmentation
  • normalisation - a quick tour through what normalisation is and why you'd use it - I was happy to discover that the majority of this was still tucked away in my memory despite it being about 18 years since I was first taught it in University algorithms lessons
  • discussion of the use of NOLOCK (he doesn't like it) and the use of READ_COMMITED_SNAPSHOT (which seemed to have problems too and appeared to be able to break referential integrity) and I didn't come away with an answer of how to ensure that reading data didn't fail because of locks against the data

And then it was lunchtime - a nicely provided brown bag lunch featuring a sandwich, bag of crisp, impossible to peel orange, a muffin and a carton of fruit juice - and time to sit outside in Deans Yard, watching people playing football on the lawn and trying to get my brain to rapidly process everything I'd heard during the morning to allow me to prepare for an afternoon featuring more of the same.  The key sponsors had all been given an opportunity to display their wares during a lunchtime spot, so I stopped by to hear about

Redgate's new product SQL Source Control 

I've long been a fan of the SQL Compare tool so it seemed obvious to hear about the new product.  This demo seemed to fit in every one of their SQL related tools which meant I wish I'd made a Redgate Product Bingo card.  The premise of the tool sounds good, integrating into existing source control tools - SVN and TFS at the moment.  They also brought my attention to a free tool SQL Search which I'll be downloading and installing when I'm back in the office as it looks like it'll save me querying the INFORMATION_SCHEMA views quite so often.  Version 1 of the SQL Source Control tool doesn't manage data, which means that when you add a column that you need to be NOT NULL, you can't specify how that column should be populated for existing data - which means you get an error.  It appears that this might be a good solution whilst in early days development, but not so useful for live systems.  Deployment is via the following steps:

  1. get latest of scripts
  2. compare scripts to backup (via SQL Compare)
  3. let SQL Compare generate update scripts

Redgate are running an early access program at the moment allowing you to get hold of the tool prior to its actual release.

I had a few minutes here before the next session, although there was no tea to be found before moving on to

Denormalisation - having your cake and eating it by David Whitehorn and Yasmeen Ahmad

Denormalization - David Whitehorn and Yasmeen Ahmad

Another packed out session

Demormalisation session full

This was quite a theoretical discussion, rather than explaining any tips and tricks on how to achieve your denormalisation.  It also covered a lot of the same normalisation content that Tony had covered before lunch.  Mainly stating that normalisation is good, but sometimes denormalization is good too - for ease of use, or reporting purposes.  Normalisation guarantees good design but only forms part of it and when you normalise you remove a lot of insert, update, delete anomalies but you don't guarantee it which then means using triggers etc to ensure there are no anomalies whatsoever.  Normalisation also comes with some performance overheads which may be removed using denormalisation.

Possible methods for denormalisation are:

  • Duplicated data
  • Splitting tables
  • Using derived columns - store calculated values

Basically, the decision as to whether to normalise or denormalise depends what you're going to do with the data. 

Time for tea and donuts before the next session

Afternoon refreshments

Improving database performance by removing the database by Simon Munro

Simon Monro

This talk was about the concept of database less systems - it was pretty well attended which surprised me at a SQL conference.  One of the first things Simon said was

This talk is not about NoSQL.  It is about why NoSQL won't die

Which was a bit of a shame as I was quite keen to learn about NoSQL.  However, he did point us in the direction of this video from the last SQLBits conference.

Scalability is often mentioned as the reason why SQL isn't a good solution - but actually, NoSQL isn't going to be any better.  Scalability is about more than just this.  He recommended a book The Art of Scalability which gives guidance about the scalability of everything - hardware, software, processes, architectures etc

Some of the takeaway points for me were:

  • Before we change something, to make an effective change we need to understand why it is like it is in the first place
  • Data is temporal - it's value changes over time
  • Never forget that people create a lot of spreadsheets to mould data how they like it - so your database isn't likely to be the only place where data is stored no matter how much you want to believe it is

Simon had some of the best slide decks I saw on the day, I particularly liked this one

Simon Monro's slide deck - Do we want it to change?

and this one

Data slide from Simon Monro's deck

This was probably the most interesting and engaging of the talks I went to, and it challenged me to go away and think more about some of his points.

The final break of the day before the final talk which for me was

High Performance Functions by Simon Sabin

High Performance Functions - Simon Sabin

I'd seen Simon present on a couple of previous occasions so I already knew that he'd be a good way to end the day.  And he was.  This followed on nicely from Conor's earlier stuff about scalar valued User Defined Functions not being optimised properly and gave some demonstrations to prove it - basically the query plan showed lots of nested loops, which indicate that each scalar function believes itself to return only one row.  Simon acknowledged the reason that UDFs, when they were first created, were welcomed by the developer community was because of the help they gave to code reuse - something that I'm sure we all strive for.  The best workaround presented was to use query hints - hash /merge - to help query plan generation, where hash join is the safest option when there is no option but to use a scalar function. 

But an alternative, at least in SQL 2008, is to use Inline table valued functions, mainly because:

  • the function returns a query
  • they work a bit like a parameterized view
  • the query is consumed into main query for query plan because they are resolved down to base functions

Another alternative is to use the CLR as an alternative approach - the more complex the operation, the more chance that the CLR will perform better than SQL.

Simon's summary was:

  • Don't use scalar fns
  • Do convert to inline table fns
  • Do consider using CLR

And that was the end of SQLBits VI - the evening continued with beer, pizza and games at the conference centre, but I sloped off home with a brain full of information to process and a can of coke for my journey.  Another excellent SQLBits.

Loading mentions Retweet
Filed under  //  conference   event   SQL   SQLServer  

Comments (2)

Generating the SQL equivalent of Guid.Empty

I've been refactoring one of our projects today, and one of the areas I've been working on has been reducing 5 or 6 similar stored procedures into 1 more generic version.  In doing this, I've had to add a couple of nullable parameters, one of which is a GUID.  

If it is specified, then I need to set an AssignedToUser column to be the GUID, and I also need to record the date of the assignment.  I can do this all within the UPDATE statement, but I needed to only affect the assignment date when the GUID is specified.  Enter the ISNULL function, but what do I set my default value to when it is null? I really wanted to create a known GUID without hardcoding a fixed GUID - a SQL equivalent of Guid.Empty.  A quick bit of googling and I came up with

SELECT CAST(CAST(0 as BINARY) AS UNIQUEIDENTIFIER)

which evaluates to 

00000000-0000-0000-0000-000000000000

Job done, and recorded here so I can find it next time I need it

Loading mentions Retweet
Filed under  //  nts   SQL   SQLServer2005  

Comments (0)

TSQL: Recraft data from a 1 to many table to a 1 to 1 table preserving a single row of data

A colleague asked me to help with a SQL deletion today, and it took us both a while to work out a way to do it. As it was a one off item, we did it using temporary tables, but I figured I'd try and find out a better way of doing this.

The problem:
A table has been created which is a 1 to many relation table, but the business logic has now changed, and this is now a 1 to 1 relationship. So, we need to remove the excess rows to make the data comply with the new rule. In each case, we are going to keep the row with the lowest Child Item Id and delete the others.

The table is defined as:
CREATE TABLE [dbo].[Mapping]
(
  [ParentId] [int] NOT NULL,
  [ChildId] [int] NOT NULL
)

The data is populated as:
INSERT INTO [Mapping] ([ParentId], [ChildId])VALUES (1,1)
INSERT INTO [Mapping] ([ParentId], [ChildId])VALUES (1,3)
INSERT INTO [Mapping] ([ParentId], [ChildId])VALUES (2,1)
INSERT INTO [Mapping] ([ParentId], [ChildId])VALUES (2,4)
INSERT INTO [Mapping] ([ParentId], [ChildId])VALUES (1,5)
INSERT INTO [Mapping] ([ParentId], [ChildId])VALUES (3,1)

At the end of the deletion, the data present should be:
ParentId  ChildId
       1        1
       2        1
       3        1

What I needed to do was get a list of the items to keep, this was pretty simple
SELECT ParentId, MIN(ChildId) ChildId
FROM Mapping
GROUP BY ParentId

which produced a list which matched the data I wanted to keep. Next I wanted to get a list of the items to delete. To do this, I made use of the EXCEPT keyword, and set theory, to return me everything that was left when I'd taken my data to keep away from the total amount of data, or in SQL
SELECT ParentId, ChildId
FROM Mapping
EXCEPT
SELECT ParentId, MIN(ChildId) ChildId
FROM Mapping
GROUP BY ParentId

This produced the data I needed to delete.
Then all I needed to do was DELETE it. What I wanted to do was join the original table to the results of my set and delete them. But every time I've tried to delete based on a join I've had the following error
Msg 102, Level 15, State 1, Line 2
Incorrect syntax near 'm'.
Msg 102, Level 15, State 1, Line 9
Incorrect syntax near 'i'.

and given up. Today, I went and read the documentation, and discovered that it was possible, but I had to specify the tablename immediately after the DELETE keyword, i.e.
DELETE FROM Mapping
FROM Mapping m
INNER JOIN
(SELECT ParentId, ChildId
FROM Mapping
EXCEPT
SELECT ParentId, MIN(ChildId) ChildId
FROM Mapping
GROUP BY ParentId) i
ON m.ParentId = i.ParentId
AND m.ChildId = i.ChildId

That works, and leaves me with the correct data in the table, but looks rather odd...

Loading mentions Retweet
Filed under  //  SQL   SQLServer  

Comments (0)

SQL Server 2005 script to iterate through all the databases on a server

A few of the SQL Server 2005 helper scripts/maintenance scripts I've written recently have taken the form of
  • produce a SQL script
  • run it on all appropriate databases
  • display the output
To do this, I've adapted the following SQL to run the query I want, sometimes inserting data into a temporary table/table variable for later retrieval, and sometimes just selecting the results. The following example is checking each user database for a table called ErrorLog, and if there is one selecting the data from it, prefixing the results with a column identifying the server name (using @@SERVERNAME - this is because I've been running these scripts across many servers and manually concatenating the results) and the database name (using our local variable @DBName), and also printing any errors to the output window

DECLARE @Loop int
DECLARE @DBName varchar(300)
DECLARE @SQL varchar(max)
DECLARE @tableName VARCHAR(255)

SET @Loop = 1
SET @DBName = ''

   WHILE @Loop = 1
BEGIN

   SELECT TOP 1 @DBName = d.Name
   FROM master.sys.databases d
   WHERE d.Name > @DBName
   AND d.database_id not in (1, 2, 3, 4)
   ORDER BY d.Name

      SET @Loop = @@ROWCOUNT

   IF @Loop = 0
      BREAK

   SET @SQL =
      'IF EXISTS (SELECT 1 FROM [' + @DBName + '].sys.objects WHERE [name] = ''ErrorLog'')
            BEGIN
               SELECT ''' + @@SERVERNAME + ''', ''' + @DBName + ''', ErrorMessage, ErrorDate
               FROM [' + @DBName + '].dbo.ErrorLog
            END
            ELSE
            BEGIN
               PRINT ''ErrorLog table does not exist on server: ' + @@SERVERNAME + ' and database: ' + @DBName + '''
            END '

   BEGIN TRY
      EXEC master.dbo.sp_executeSQL @SQL
   END TRY
   BEGIN CATCH
      PRINT 'Error executing ' + @SQL + ' on server: ' + @@SERVERNAME + ' for database: ' + @DBName
   END CATCH
END

Loading mentions Retweet
Filed under  //  SQL   SQLServer2005  

Comments (0)

SQL: Building a comma separated list from a select clause

A colleague asked me today about generating a comma separated list of values based on the results of a SELECT statement. A quick google later and I found Using COALESCE to Build Comma-Delimited String

So, for my table Continent, and my usual SELECT of
SELECT [Name]
FROM Continent

with just a variable declaration and a use of COALESCE and hey presto, we get our required result of Africa, Antarctica, Asia, Australia and Oceania, Europe, North America, South America

The code is now:
DECLARE @List VARCHAR(1000)

SELECT @List = COALESCE(@List + ', ', '') + Name
FROM Continent

SELECT @List

Nice and simple. I like!

Loading mentions Retweet
Filed under  //  SQL   SQLServer  

Comments (0)

spu_generateInsert - handle unicode characters

I updated spu_generateinsert again today when one of my colleagues reminded me that I needed to prefix my strings with a capital N to ensure that any unicode values go into the database correctly.

So now instead of the call spu_generateinsert @table = 'LanguageData' producing
INSERT INTO [LanguageData] ([liID], [Value])VALUES (7,'D''Artagnan raconte qu''à sa première visite à')
it produces
INSERT INTO [LanguageData] ([liID], [Value])VALUES (7,N'D''Artagnan raconte qu''à sa première visite à')

As usual, the script is available here
And again as usual, leave me a comment if you think of some new functionality I should include, or any issues you come across.

Loading mentions Retweet
Filed under  //  spu_generateinsert   SQL   SQLServer2005  

Comments (0)

Stored Procedure comparison

I have been a long time fan of RedGate's SQL Compare tool to compare all aspects of a database. The work I'm doing at the moment needs me to integrate some new code into an existing project - some of which involves database manipulation. However, I'm only interested in a subset of stored procedures as many of them are different for valid reasons. I couldn't work out how to do this in SQL Compare so, I wrote a script to do it for me. It is called spu_CompareProcedures and I thought I'd share it in the hope that is of value to others.

There are 5 parameters available, they are:

  • db1 - the name of the 1st database to use
  • db2 - the name of the 2nd database to use
  • proceduresToCompare - either set to '' to compare ALL stored procedures in both databases, or set to a comma separated list to restrict what gets checked
  • displayOnlyDifferent - defaults to 1, i.e. only display the stored procedures which are different. If set to 0 it'll show the ones that are the same as well, but in most cases this is just too much information
  • debug - defaults to 0. Used to output the SQL being generated behind the scenes to enable debugging of this stored procedure

If I wanted to compare the procedures spu_generateinsert and spu_compareprocedures on the database MyMasterDatabase and MyOtherDatabase I would specify
spu_compareprocedures @db1 = 'MyMasterDatabase', @db2 = 'MyOtherDatabase', @proceduresToCompare='spu_generateinsert,spu_compareprocedures', @displayOnlyDifferent=1, @debug=0

As with spu_generateinsert, the output is displayed in the Messages window. It will report on what is being checked, any errors - i.e. procedures which exist in one database and not the other, or maybe either, and those which are different (or the same as well depending on the value of @displayOnlyDifferent). Sample output is:
******************************************************************
Comparing databases MyMasterDatabase and MyOtherDatabase
Objects:
spu_generateinsert
spu_compareprocedures
******************************************************************

*****************
** ERRORS **
*****************

spu_compareprocedures is missing for MyMasterDatabase
spu_compareprocedures is missing for MyOtherDatabase

*****************
** COMPARISONS **
*****************

No differences found

It is very simple, and just checks the contents of the stored procedure, it doesn't attempt to identify the differences. But, at least theoretically, this is enough to give a heads up as to the procedures to be concerned about.

Download the script here.

If you have suggestions, comments or bug reports, please leave me a comment and I'll do what I can to help out.

Loading mentions Retweet
Filed under  //  download   spu_compareprocedures   SQL  

Comments (0)