Jane Dallaway

Jane Dallaway

Jane Dallaway  //  Service Delivery manager, photographer, dog owner, gardener, reader, learner, software developer and occasional snowboarder

This blog contains all sorts of bits and bobs, from development related stuff, through process and productivity stuff, to photography stuff, and general inspiration things. It's a bit all over the place with no real theme, but then so am I

Email: jane @ dallaway.com
Also at:    

spu_generateinsert - handle schemas

Earlier today, James tweeted 

Pointing a colleague to the excellent spu_GenerateInsert by @JaneFoth

to which I responded with

@jamesbradshaw your name is even mentioned in the version history "suggestion by James Bradshaw"

and to which he responded with

@JaneFoth I have have just had the pointed out in the office. I have just to mod it to cope with schema names do you want my editions?

To which I of course said yes.  James sent the updated procedure through, and spurred on by his activity, I also added a fix I'd been considering for a while - to allow the procedure to deal with tablenames which have a . in them, after having recently discovered that this was valid (basically by wrapping the table name in [ and ]).

So, no updates in over a year, and then 2 in one day.  Must be the same as how it is with buses...  Anyway, as usual the updated procedure can be found here and is listed on my downloads page.

Filed under  //  SQLServer   download  

Comments (0)

Maximum length of SQL Server 2005 objects seems to be 128 characters

I couldn't find the answer to the maximum length of the name of a view in SQL Server 2005 online today, although I did find someone's helfpul blog post about the maximum length of a column name in SQL Server 2005, so a quick experiment later using the following code:

CREATE VIEW [vw_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890]
AS
SELECT * FROM Address

Resulted in:
Msg 103, Level 15, State 4, Line 1
The identifier that starts with'vw_1234567890.1234567890.1234567890.1234567890.1234567890.1234567890.1234567890.1234567890.1234567890.1234567890.1234567890.1234' is too long. Maximum length is 128.

Another experiment, this time for table creation

CREATE TABLE [1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890]
(
  [Id] [int] NOT NULL,
  [Line1] [varchar](50) NOT NULL,
  [Line2] [varchar](50) NULL,
  [City] [varchar](50) NOT NULL,
  CONSTRAINT [PK_Address] PRIMARY KEY CLUSTERED 
  ([Id] ASC) 
)ON [PRIMARY]

Resulted in:
Msg 103, Level 15, State 4, Line 1
The identifier that starts with '1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567890_1234567' is too long. Maximum length is 128.

Filed under  //  SQLServer   code  

Comments (0)

Changing the direction of results from a SQL Server ORDER BY clause

I have a stored procedure which returns me a list of items, and I want the directional order of them to change based on a parameter. I didn't want to resort to dynamic SQL, so I had a bit of a hunt around and found the excellent blog post Dynamic/Conditional Order By Clause in SQL Server/T-SQL.  Here's my result:

 

CREATE Procedure [dbo].[mp_GetItems]
(
 @OldestFirst BIT
)
AS
SELECT 
  Item.ID, Item.Description
FROM 
Items
ORDER BY 
  CASE WHEN @OldestFirst = 1 THEN CreatedDate END ASC,
  CASE WHEN @OldestFirst = 0 THEN CreatedDate END DESC

GO

which works a treat. The key is, as mentioned in the blog post

Ascending and Descending actions need to be grouped into separate CASE statements, separated with a comma

 

Filed under  //  SQLServer   code  

Comments (0)

TSQL: Escaping % in a SQL Like clause

Today I had cause to search a text field in a SQL Server 2005 database for the phrase %3 (I had some data being url encoded incorrectly) and this wasn't something I'd tried to do before so I had to find out how to escape the % sign.  A quick google and a re-read of the LIKE syntax, and I discovered that to escape the % I had to do something like

WHERE [Value] LIKE '%!%3%' ESCAPE '!'

After talking this through with a colleague, he asked me to try some other escape characters, and here's what we found out:

Attempting to use a string as an escape phrase, i.e.

WHERE [Value] like '%WORD%1%' ESCAPE 'WORD'

doesn't work, and fails with an error of

 

Msg 506, Level 16, State 1, Line 1

The invalid escape character "WORD" was specified in a LIKE predicate.

 

Attempting to use % as the escape character, gets it quite confused 

WHERE [Value] like '%%%1%' ESCAPE '%'

returns 0 rows, but no errors

 

Attempting to use ' as an escape character, which needs to be escaped itself 

 

WHERE [Value] like '%''%1%' ESCAPE ''''

returns the correct rows

Filed under  //  SQLServer   nts  

Comments (0)

SQL Server - List tables without a primary key

How do I list all tables WITHOUT a primary key?

Run the following SQL code:

SELECT T.TABLE_NAME AS 'Tables without PKs'
FROM INFORMATION_SCHEMA.TABLES AS T
WHERE NOT EXISTS
(SELECT *
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS TC
WHERE CONSTRAINT_TYPE = 'PRIMARY KEY'
AND T.TABLE_NAME = TC.TABLE_NAME)
AND T.TABLE_TYPE = 'BASE TABLE'

Another script used as part of my performance tuning, nice and simple and making use of the INFORMATION_SCHEMA views rather than the sys tables

Filed under  //  SQLServer  

Comments (0)

SQL Server - Find out if an index is being used

As I mentioned yesterday, I'm doing some SQL performance tuning at the moment, and I've ended up looking at a series of indexes and trying to work out which are being used, and which aren't.  A quick google led me to this blog post which gave me a script (which I've since adapted) based on sys.dm_db_index_usage_stats

SELECT 

DISTINCT OBJECT_NAME(sis.OBJECT_ID) TableName, 

si.name AS IndexName, 

sis.user_seeks, 

sis.user_scans, 

sis.user_lookups, 

sis.user_updates

FROM sys.dm_db_index_usage_stats sis

INNER JOIN sys.indexes si 

ON sis.OBJECT_ID = si.OBJECT_ID 

AND sis.Index_ID = si.Index_ID

WHERE sis.Database_ID = DB_ID('<Database Name>')

which shows me all the indexes that are present for my database.

 

Further restrictions can be made as follows:

 

WHERE sis.Database_ID = DB_ID('<Database Name>')

AND sis.OBJECT_ID = OBJECT_ID('<table name>')

which shows me all the indexes for my database and my specified table

 

WHERE sis.Database_ID = DB_ID('<Database Name>')

AND sis.OBJECT_ID = OBJECT_ID('<table name>')

AND si.name = '<index name>'

which shows me all the indexes for my database, my specified table and my specified index

 

WHERE sis.Database_ID = DB_ID('<Database Name>')

AND sis.user_seeks = 0

AND sis.user_scans = 0

AND sis.user_lookups = 0

which shows me all the indexes in the database which aren't being used - Note: the values held by sys.dm_db_index_usage_stats get reset after a restart of the SQL service

 

Filed under  //  SQLServer  

Comments (1)

SQL Server Execution Plans - what should I look for in SQL Server 2005?

I've spent some time today reviewing the query plans for my current project, and as ever I keep forgetting what I'm trying to get to.  I found this useful page which reminded me of the orders - from best to worst:

  1. CLUSTERED INDEX SEEK - Very fast and efficient - the table is physically ordered according to specified column(s) and SQL can go find it AND since all of the data is phsically ordered right, can pull the data sequentially
  2. INDEX SEEK - which is very fast and efficient.  The SQL server knows pretty much where the data is and can go directly to it and seek out the rows it needs.  The data isn't ordered in the DB by the fields in the index so it's likely not pulling the data sequentially like it is in the CLUSTERED INDEX SEEK, but it can still translate to a massive improvement in query execution cost/speed.
  3. INDEX SCAN - which as you can see still implies using the index, but it doesn't know exactly where the data is, so it may scan the whole index or a partial range of the index to find it's data. Can be very slow and costly, but still scans through the index as opposed to scanning the physical table. 
  4. CLUSTERED INDEX SCAN - a clustered index is really not much more than a table with the data physically ordered by specified columns.  So if you see a CLUSTERED INDEX SCAN, it's pretty much the same as a TABLE SCAN for performance because SQL has to physically search through every single row in the clustered index (which really IS the table).  
  5. TABLE SCAN - slow, inefficient.  SQL has to physically look at every single row in the table. Shouldn't see this often, if you do, you may need to rewrite something or add indexes.

I've managed to remove all the scans, and everything is now at least using a seek.

Filed under  //  SQLServer   nts  

Comments (0)

TSQL: Get a list of values into a comma separated list without using a UDF

I needed to create a comma separated list from a series of values in a column in the database the other day, and being mindful of the affect of scalar UDFs on the query plan (after attending SQLBits VI) I looked around for an alternative method to my previous favourite using COALESCE.  A couple of blog comments mentioned using STUFF and FOR XML PATH to do this, and after a bit more research I found some examples from Kodyaz Development Resources which seemed to do the trick 

SELECT 
  STUFF(
    (
      SELECT ',' + <ColumnName>
      FROM <TableName>
      FOR XML PATH('')
    ), 1, 1, '') as CommaSeparatedList

Filed under  //  SQLServer   code  

Comments (1)

SQLBits VI

On Friday I attended my 2nd SQLBits conference.  This time it was hosted at the Church House Conference Centre, Westminster which was a good centre, with plenty of facilities and some beautiful rooms.  It also benefitted from being just a 15 minute walk from Victoria.

The day started with a hot breakfast sandwich (bacon, sausage or egg) and tea or coffee, before the first session started at 9.20am.  There were 4 streams of talks, and 3 different audiences - DBAs, Developers, or Business Intelligence users. There was always at least one talk for each audience going on at any one time giving a reasonable amount of choice. Only one talk had to be cancelled due to a speaker not being able to get to London because of the volcano and this was replaced with a panel session.

I chose to start by attending the talk 

Getting started with SQL Azure by Dr Keith Burns, Data Architect, Microsoft UK (I can't find a website for Keith that has been recently updated - his blog seems to stop in August 2007 - if you know of one, please leave me a comment)

SQL Azure - Keith Burns

As I mentioned the other day, I am interested in finding out more about the Microsoft offering for cloud computing, so this was an obvious choice.  As this was the first talk of the day, my notes are a little fragmented as I got my self into note taking mode, but my key takeaways were:

  • Reporting and the BI tools aren't on the cloud yet, so you'd need to use a local version of these tools
  • Availability is high - each database is in 3 places, in different scale units
  • DB performance is affected by "neighbours" on the same machine. Load balancing is happening automatically but is not perfect yet
  • Connections may get killed if it they are for long running transactions
  • Updates to SQL Azure will be rolled out on a 2 monthly basis
  • Databases can currently be 1GB, or 10GB.  If you exceed this you get a nasty looking error message.  If you currently have 1GB you can upgrade via an ALTER DATABASE statement.  A 50GB offering is in the works as is spatial data
  • Prices are in dollars, charged daily - so if you use it for one day you pay for one day - it works out that a 1GB database is approximately £72/year

Tea and pastries time before choosing a DBA talk (the developer one wasn't of real interest to me)

Inside the SQL Server Query optimizer by Conor Cunningham, Principal Architect, SQL Server Query Processor

Query optimizer Overview - Conor Cunningham

This was a packed out talk, and I ended up sitting on the floor, as did several others.

Conor's session was a bit full

I've never really known much about how the optimizer works, and I knew that this wasn't exactly going to be an introductory session, so I was somewhat concerned about how much of it I'd follow, and whether I'd actually gain anything out of it.  What I learnt was:

  • My maths isn't as good as it once was - I'm sure I'd have understood more whilst I was at university
  • The aim from query optimisation is "good enough" - trying to find the best query plan would take too long so good enough will usually do
  • Query optimizer is just a set theory engine
  • Just indexing is usually insufficient - you want to evaluate whether an index is "covering" as a non covering index will typically need to do a fetch from the base table (SQL Server 2005 and beyond I believe)
  • An index can be used to return data, not just for filtering or ordering - hence the use of covering indexes
  • Missing index DMVs can suggest covering indexes
  • Parallel queries are considered for "expensive" queries (> 5 seconds) but not all operators support parallelism and some scalar functions will prevent parallelism
  • User Defined Functions don't have good histogram support - so don't get optimized well - this was picked up in greater detail in Simon Sabin's talk later in the day - so forward to the end of this post if you can't wait to find out

Another quick cup of tea before moving on to 

Non-Functional Dependencies - A talktorial on getting performance and scalability right through proper Database Design and understanding Set and Relational theory by Tony Rogerson, SQL Server MVP.  The slides for this talk are available here

Performance & Scalability through normalisation and set theory - Tony Rogerson

A discussion on some of the things to be considered when using SQL.  Amongst them were:

  • a demo of the difference in performance around choosing the different kinds of surrogate keys - IDENTITY, NEWID and NEWSEQUENTIALID (which was new to me) - NEWID causes fragmentation of data and so affects performance, NEWSEQUENTIALID has the benefits of NEWID but without fragmentation
  • normalisation - a quick tour through what normalisation is and why you'd use it - I was happy to discover that the majority of this was still tucked away in my memory despite it being about 18 years since I was first taught it in University algorithms lessons
  • discussion of the use of NOLOCK (he doesn't like it) and the use of READ_COMMITED_SNAPSHOT (which seemed to have problems too and appeared to be able to break referential integrity) and I didn't come away with an answer of how to ensure that reading data didn't fail because of locks against the data

And then it was lunchtime - a nicely provided brown bag lunch featuring a sandwich, bag of crisp, impossible to peel orange, a muffin and a carton of fruit juice - and time to sit outside in Deans Yard, watching people playing football on the lawn and trying to get my brain to rapidly process everything I'd heard during the morning to allow me to prepare for an afternoon featuring more of the same.  The key sponsors had all been given an opportunity to display their wares during a lunchtime spot, so I stopped by to hear about

Redgate's new product SQL Source Control 

I've long been a fan of the SQL Compare tool so it seemed obvious to hear about the new product.  This demo seemed to fit in every one of their SQL related tools which meant I wish I'd made a Redgate Product Bingo card.  The premise of the tool sounds good, integrating into existing source control tools - SVN and TFS at the moment.  They also brought my attention to a free tool SQL Search which I'll be downloading and installing when I'm back in the office as it looks like it'll save me querying the INFORMATION_SCHEMA views quite so often.  Version 1 of the SQL Source Control tool doesn't manage data, which means that when you add a column that you need to be NOT NULL, you can't specify how that column should be populated for existing data - which means you get an error.  It appears that this might be a good solution whilst in early days development, but not so useful for live systems.  Deployment is via the following steps:

  1. get latest of scripts
  2. compare scripts to backup (via SQL Compare)
  3. let SQL Compare generate update scripts

Redgate are running an early access program at the moment allowing you to get hold of the tool prior to its actual release.

I had a few minutes here before the next session, although there was no tea to be found before moving on to

Denormalisation - having your cake and eating it by David Whitehorn and Yasmeen Ahmad

Denormalization - David Whitehorn and Yasmeen Ahmad

Another packed out session

Demormalisation session full

This was quite a theoretical discussion, rather than explaining any tips and tricks on how to achieve your denormalisation.  It also covered a lot of the same normalisation content that Tony had covered before lunch.  Mainly stating that normalisation is good, but sometimes denormalization is good too - for ease of use, or reporting purposes.  Normalisation guarantees good design but only forms part of it and when you normalise you remove a lot of insert, update, delete anomalies but you don't guarantee it which then means using triggers etc to ensure there are no anomalies whatsoever.  Normalisation also comes with some performance overheads which may be removed using denormalisation.

Possible methods for denormalisation are:

  • Duplicated data
  • Splitting tables
  • Using derived columns - store calculated values

Basically, the decision as to whether to normalise or denormalise depends what you're going to do with the data. 

Time for tea and donuts before the next session

Afternoon refreshments

Improving database performance by removing the database by Simon Munro

Simon Monro

This talk was about the concept of database less systems - it was pretty well attended which surprised me at a SQL conference.  One of the first things Simon said was

This talk is not about NoSQL.  It is about why NoSQL won't die

Which was a bit of a shame as I was quite keen to learn about NoSQL.  However, he did point us in the direction of this video from the last SQLBits conference.

Scalability is often mentioned as the reason why SQL isn't a good solution - but actually, NoSQL isn't going to be any better.  Scalability is about more than just this.  He recommended a book The Art of Scalability which gives guidance about the scalability of everything - hardware, software, processes, architectures etc

Some of the takeaway points for me were:

  • Before we change something, to make an effective change we need to understand why it is like it is in the first place
  • Data is temporal - it's value changes over time
  • Never forget that people create a lot of spreadsheets to mould data how they like it - so your database isn't likely to be the only place where data is stored no matter how much you want to believe it is

Simon had some of the best slide decks I saw on the day, I particularly liked this one

Simon Monro's slide deck - Do we want it to change?

and this one

Data slide from Simon Monro's deck

This was probably the most interesting and engaging of the talks I went to, and it challenged me to go away and think more about some of his points.

The final break of the day before the final talk which for me was

High Performance Functions by Simon Sabin

High Performance Functions - Simon Sabin

I'd seen Simon present on a couple of previous occasions so I already knew that he'd be a good way to end the day.  And he was.  This followed on nicely from Conor's earlier stuff about scalar valued User Defined Functions not being optimised properly and gave some demonstrations to prove it - basically the query plan showed lots of nested loops, which indicate that each scalar function believes itself to return only one row.  Simon acknowledged the reason that UDFs, when they were first created, were welcomed by the developer community was because of the help they gave to code reuse - something that I'm sure we all strive for.  The best workaround presented was to use query hints - hash /merge - to help query plan generation, where hash join is the safest option when there is no option but to use a scalar function. 

But an alternative, at least in SQL 2008, is to use Inline table valued functions, mainly because:

  • the function returns a query
  • they work a bit like a parameterized view
  • the query is consumed into main query for query plan because they are resolved down to base functions

Another alternative is to use the CLR as an alternative approach - the more complex the operation, the more chance that the CLR will perform better than SQL.

Simon's summary was:

  • Don't use scalar fns
  • Do convert to inline table fns
  • Do consider using CLR

And that was the end of SQLBits VI - the evening continued with beer, pizza and games at the conference centre, but I sloped off home with a brain full of information to process and a can of coke for my journey.  Another excellent SQLBits.

Filed under  //  SQLServer   event   london  

Comments (2)

Generating the SQL equivalent of Guid.Empty

I've been refactoring one of our projects today, and one of the areas I've been working on has been reducing 5 or 6 similar stored procedures into 1 more generic version.  In doing this, I've had to add a couple of nullable parameters, one of which is a GUID.  

If it is specified, then I need to set an AssignedToUser column to be the GUID, and I also need to record the date of the assignment.  I can do this all within the UPDATE statement, but I needed to only affect the assignment date when the GUID is specified.  Enter the ISNULL function, but what do I set my default value to when it is null? I really wanted to create a known GUID without hardcoding a fixed GUID - a SQL equivalent of Guid.Empty.  A quick bit of googling and I came up with

SELECT CAST(CAST(0 as BINARY) AS UNIQUEIDENTIFIER)

which evaluates to 

00000000-0000-0000-0000-000000000000

Job done, and recorded here so I can find it next time I need it

Filed under  //  SQLServer   nts  

Comments (1)