SQL Frequently Asked Questions

The Toys and the Faint Cat
This is a story that I heard my dad tell a number of times. I was about 6 or 7 at the time and Christmas was nearing. My three brothers and I were over at my grandparents'; Mom and Dad had been out shopping for Christmas and they were wrapping presents. My mom went downstairs to do a load of laundry and Dad decided to play with one of the toys. He had bought a little toy robot, Mr. Machine, for my brother Jon. Mr. Machine was a battery-powered robot that rolled around on wheels and had a little bell and light bulbs for eyes that flashed. Basically, Mr. Machine would wander around on the floor and make lots of noise. Our cat took cover under the china buffet when Dad turned on the robot and the toy wandered around on the floor. Dad decided to watch with the lights turned off, so off went the lights. Shortly, Mr. Machine wandered over toward the china buffet. The cat objected: "Meow. Meoooww. MEEEOOWWW!" Then there was a gentle thud and it got quiet. Dad turned on the lights and the cat had fainted, out cold!

Now, if you are a cat lover you might be horrified. You might also be questioning whether this anecdote has anything to do with the topic; however, one of the problems that we have in Information Technology is deployment of the latest technology: new toys. Users of technology vary greatly. Some users can't wait to start using every new feature; others are afraid of the new features and are in general afraid of change. One of the real problems of managing technology changes is to work with each type of user, allowing gadget lovers to test out the new features without scaring off more timid users. As database administrators and developers, that is part of our job. Okay, let's look at some toys.
By Kent Waldrop

Overview of Topics
Be Careful with User Defined Functions
A Method to Count Specific Characters
Use "Set Oriented" Processes
Finding the Last Day of a Month
Separating Numeric from Alphanumeric Data
Filtering by Date and Time
Working with Lists
Data Backup
Pivoting Data

The Data Backup topic is really more of a beginner topic; however, it is very important so it is included. Coverage of this topic is intended to be skimmed rather than thoroughly discussed.

Topic priorities:
+ Data Backup
+ Set Oriented Processes
+ Filtering by Date and Time
+ Caution about User Defined Functions
+ Working with Lists
Least important (in order):
+ Separating Numeric from Alphanumeric Data (just do the "Left Side" example and skip the "Right Side" example)
+ The PARSENAME function in Working with Lists
+ Separating Numeric from Alphanumeric Data

Qualities of Good SQL Code
Efficient
Understandable (simple, obvious)
Stable (consistent)
Secure
Portable (reusable)
Maintainable (friendly)
Robust (scalable, resilient)
Documented

Keep in mind that the ranking of these qualities will vary depending on the priorities of the company or individual. Here I have put efficient at the top because it is normally my responsibility to optimize queries for efficiency. Under these circumstances efficiency is, for me, a higher priority than readability. Readability is frequently a higher priority for other people and groups that I work with. It is important to understand the priorities because some of the code that is presented is not the most obvious way to accomplish a given task.

Caution: Functions Under Construction
A caution about Scalar functions: Scalar functions can create query drag

Hidden Danger
Question: What is wrong with this function?
-- The parameter name and the RETURN expression are missing from the source;
-- @x and atan(1.0 / @x) are assumptions.
create function dbo.ACot (@x float)
returns float
with returns null on null input
as
begin
    return atan(1.0 / @x)
end
Two hints:
1. Why did I color one line red?
2. The problem is not in the code
Other issues: Yes, this function can have an overflow if the parameter is zero; however, that is not what I am getting at.

Scalar Functions Cause “Query Drag”
There is overhead for each function call
Scalar functions are called row-by-row
Use the defining computation rather than a simple scalar function
Use the definition of simple scalar functions like the inverse cotangent function for reference information
Beware using scalar functions in long running queries

The problem with the previous function is that it is a simple calculation; it is best to avoid using this function in any SQL queries so as to avoid the overhead that is associated with scalar functions. Save the use of scalar functions for situations in which the calculations are more complicated. Again, keep in mind that this looks at the problem from the perspective of making efficiency a higher priority than readability.
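The advice to use the defining computation can be sketched as follows. This is only an illustration: it assumes that dbo.ACot computes the arccotangent as ATAN(1/x), and the table variable @angles and its column x are made-up names.

```sql
-- Hypothetical sample data; @angles and x are illustration names only.
declare @angles table (x float not null)
insert into @angles
select 0.5 union all
select 1.0 union all
select 2.0

-- Function-call form (commented out because it needs dbo.ACot to exist):
-- the scalar function is invoked once for every row.
-- select x, dbo.ACot(x) as acot_x from @angles

-- Inline form: the defining computation is written directly in the query,
-- so there is no per-row function call overhead.
select x, atan(1.0 / x) as acot_x
from @angles
```

On three rows the difference is not measurable; the overhead becomes visible when the query touches many rows.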

Recognizing The Wrong Question
I have two tables: Hours Worked and Hours Paid. Hours Worked contains (1) employee id, (2) date and (3) hours. Hours Paid contains (1) employee id and (2) hours paid. How can I write a cursor to process the Hours Worked table, sum the hours worked for a given employee over a date range, and then update the hours paid column of the Hours Paid table with the computed sum? Assume a date range of 3/31/8 through 4/13/8.

The issue here has to do with changing the thought process related to solving what might otherwise be an iterative process. It is very important to learn to avoid writing SQL code that heavily uses cursors and while loops. The question here is along the line of "How do I update the table when ...". I worked with Sybase before cursors were introduced to Transact-SQL, and I got to witness and be a part of the performance decline that took place when cursors were introduced. I would guess that I have spent more time fixing performance problems that are associated with cursors than with any other single Transact-SQL feature. Even worse are situations in which cursor processes are nested. Cursor processes might perform badly; nested cursors have a tendency to incur dreadful performance.

Use Set Oriented Thinking
Record Oriented Thinking:
Process the table to identify source data
Identify conditions in which the source data records can be applied to the target data records
Loop through and process all records 1 at a time

Set Oriented Thinking:
Identify or collect a set of data that contains all of the needed source data
Use this data set to target records and apply changes to the target
Use joins instead of loops

If you choose to learn only two points from this slide series make this one of them: Use joins instead of loops

Update Test Data
declare @hours_worked table
( employee_id integer,
  date datetime,
  hours_worked numeric(9,2))
insert into @hours_worked
select 1, '3/28/8 15:03', 6.0 union all
select 1, '3/31/8 15:02', 8.0 union all
select 1, '4/1/8 15:02', 8.0 union all
select 1, '4/13/8 17:32', 9.5 union all
select 1, '4/14/8 15:01', 7.0 union all
select 2, '4/5/8 15:03', 8.0 union all
select 2, '4/9/8 15:03', 8.0
declare @hours_paid table
( employee_id integer,
  hours_paid numeric(9,2))
insert into @hours_paid
select 1, null union all
select 2, null

This slide shows the SQL to generate the example data for the UPDATE statement that follows in the next slide.

A Set Oriented Solution
;with payCTE as
( select employee_id, sum(hours_worked) as hoursPaid
  from @hours_worked
  where date >= '3/31/8' and date < '4/14/8'
  group by employee_id
)
update a
set hours_paid = hoursPaid
from @hours_paid a
join payCTE b
  on a.employee_id = b.employee_id
select * from @hours_paid
/* Sample Output:
employee_id hours_paid
1           25.50
2           16.00
*/

I used a common table expression (CTE) as the basis for the update statement. I frequently get asked about the semicolon at the beginning of the WITH statement. A syntax error results if the statement that precedes the WITH statement is not terminated with a semicolon. I have put the semicolon at the beginning of the statement in this case to emphasize the potential need to precede the WITH statement with a semicolon.

Go back to the original slide that posed the question. Note that the date range was 3/31 through 4/13. Also note the particular way the date range is filtered. Note especially that the high date is strictly less than 4/14/8. We will discuss this more shortly.

A Set Oriented Solution
-- The CTE defines a logical table that will be
-- used to join to the target table to obtain the
-- values for the update. A semicolon is placed at
-- the beginning of the CTE definition to
-- emphasize the need for the semicolon to avoid
-- getting hit with an SQL syntax error.
;with payCTE as
( select employee_id, sum(hours_worked) as hoursPaid
  from @hours_worked
  where date >= '3/31/8' and date < '4/14/8'
  group by employee_id
)
update a
set hours_paid = hoursPaid
from @hours_paid a
join payCTE b
  on a.employee_id = b.employee_id

Update Results
select * from @hours_paid
/* Sample Output: employee_id hours_paid */

A Derived Table Instead of a CTE
update a
set hours_paid = hoursPaid
from @hours_paid a
join
( select employee_id, sum(hours_worked) as hoursPaid
  from @hours_worked
  where date >= '3/31/8' and date < '4/14/8'
  group by employee_id
) b
  on a.employee_id = b.employee_id
select * from @hours_paid
/* Sample Output: employee_id hours_paid */

This might be the solution chosen for a version of SQL Server that does not support CTEs, which were introduced in SQL Server 2005. Also note that in this case the query might be slightly less complicated than the CTE version. Frequently this will not be true, but it is true in this particular case.

There Might Be Shortcuts
Question: How can I write a cursor to spin through a short list of states that are contained in a parameter to a function or stored procedure and fetch records that match for each state?

Again, this might be asking the wrong question. As we work on changing the question asked into a better question, it is also a good idea to see if there are any shortcuts that we might be able to take.

A Better Question: How do I pass a list as a parameter to a stored procedure or a function?

Filtering With A “List” Parameter
-- In this example the list is a rather short list consisting of 5 states
-- or less. A simple approach to solving this issue is to just test to
-- see if the state code is in a list of all possible 2-character substrings
-- in the parameter.
-- (The names @stateList and @states are assumed; the original variable
-- names are missing from the source.)
declare @stateList char(14) = 'VT,IL,MI,M'
declare @states table(id int identity, state char(2))
insert into @states
select 'CA' union all
select 'MI' union all
select 'CA' union all
select 'CO' union all
select 'VT' union all
select 'MI' union all
select 'M'
select id, state
from @states
where state in
( substring(@stateList, 1, 2),
  substring(@stateList, 4, 2),
  substring(@stateList, 7, 2),
  substring(@stateList, 10, 2),
  substring(@stateList, 13, 2))
/* Sample output: id state MI VT MI M */

This method must be used with caution. If this solution is applied to data other than state codes and there is any variation in the length of each segment, this method will likely not yield the expected results. If there is any doubt about the length of the data, use one of the other methods that follow.

Numbers Function
-- This function returns a list of integers "n" that range in value from 1 to
-- the integer passed as the parameter.
--
-- Credit: This function is based on a response in the MSDN forums that was
-- supplied by Jacob Sebastian and based on Itzik Ben-Gan's book.
--
-- (The function name dbo.fn_numbers and the parameter name @maxN are assumed;
-- the original names are missing from the source.)
create function dbo.fn_numbers (@maxN integer)
returns table
as
return
( WITH
  L0 AS (SELECT 1 AS C UNION ALL SELECT 1),       -- 2 rows
  L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),    -- 4 rows
  L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),    -- 16 rows
  L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),    -- 256 rows
  L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),    -- 65,536 rows
  L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B),    -- 4,294,967,296 rows
  numbers AS (SELECT ROW_NUMBER() OVER(ORDER BY C) AS N FROM L5)
  select n from numbers where n <= @maxN
)

I do not generally use a "numbers function". Rather, I would normally use the CTE that defines the numbers function directly in the query; that is, the WITH ... SELECT fragment that makes up the body of the function. However, there are times in which code will be more readable if the function is used, so I include the function definition. Note that this CTE (and the resulting function) uses the chain of CTEs L0 through L5 to build a potentially large set of numbers. Also, the CTE tends to run very fast. Frequently, the function will run faster than a table of numbers and won't suffer from lock contention like a table of numbers might.

Table of Numbers
-- A definition for a table of numbers. In this case I am
-- using a table of numbers that are the SMALLINT
-- datatype.
--
-- A description of a table of numbers can be found here: i-consider-using-an-auxiliary-numbers-table.html
--
-- (The function name dbo.fn_numbers is assumed; only "_numbers" survives
-- in the source.)
create table numbers
( n smallint not null
  constraint pk_numbers primary key
)
insert into numbers
select n from dbo.fn_numbers(32767)

Here is one of many possible definitions for a table of numbers. Note that the data type for the numbers is smallint. This is done to cut the size of the stored data in half: a smallint takes 2 bytes where an int takes 4. A potential problem of using a table of numbers might be lock contention. Note that I have used our "Numbers Function" to populate this table.

A "quick-and-dirty" table of numbers can be obtained from the master database using a query similar to:
select number from master.dbo.spt_values where name is null
This kind of query using the SPT_VALUES table should in general be avoided because (1) the number of entries returned by this select is different in SQL Server 2000 than it is in SQL Server 2005 and (2) this is not a documented feature and is not guaranteed to work in future releases. Be careful!

State List with a Table of Numbers
-- In this example the list can be longer -- up to 70 different state codes.
-- The code can be simplified by employing a "table of numbers" to spin
-- through the string.
-- (The names @stateList and @states are assumed; the original variable
-- names are missing from the source.)
declare @stateList varchar(210) = 'VT,IL,MI,M'
declare @states table(id int identity, state char(2))
insert into @states
select 'CA' union all
select 'MI' union all
select 'CA' union all
select 'CO' union all
select 'VT' union all
select 'MI' union all
select 'M'
select id, state
from @states
join numbers
  on n <= (len(@stateList) + 2) / 3
 and state = substring(@stateList, 3 * n - 2, 2)
/* Sample output: id state MI VT MI M */

The advantage of using a table of numbers for this query is that it works the same way in SQL Server 2000 as it does in SQL Server 2005. The disadvantage is that there might be contention issues. Again, this method might not work well if the length of the individual pieces of data between the commas can vary.

Shouldn’t This Be Easy?
Question: How do I find the last day of a given month?

The easiest way to compute the last day of a given month is to compute the first day of the next month and subtract 1 day. There are other alternatives if you have a “calendar table”.

The Last Day of a Given Month
-- The last day of the month is computed as:
-- (1) Use the DATEDIFF function to compute the number of months from the base date of '1900-01-01'
-- (2) Use the DATEADD function to compute the date that is the number of months computed in step #1 plus one more month to find the first day of the next month
-- (3) Subtract 1 day from the first day of the next month
-- (The variable name @aDate is assumed, and the base date is written as
-- '1900-01-01' because that is the date that 0 converts to; both are
-- missing from the source.)
declare @aDate datetime = current_timestamp
select dateadd(month, datediff(month, '1900-01-01', @aDate) + 1,
       '1900-01-01') - 1 as lastDayOfMonth
/* Sample output: a single lastDayOfMonth value with a time portion of 00:00:00.000 */

This is a nice query, but if you were to check up on me in the MSDN forums you would find that my usual code for this might be more like:
select dateadd(month, datediff(month, 0, @aDate) + 1, 0) - 1 as lastDayOfMonth
Note that this code is a little more compact; so why post the first answer? Here the difference is representing the date as '1900-01-01' rather than 0. It is my belief that '1900-01-01' is more intuitively recognized as a date than 0, and is therefore more readable. Also, note the potential subjective impact of formatting. I chose to format the first expression, using '1900-01-01', as a two-line expression. Because 0 is more compact than the '1900-01-01' representation, I chose to format the second expression as a one-line expression. Which one you choose is largely going to depend on personal preference and on which qualities of SQL are a higher priority to you or your group.
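For comparison, here is a minimal sketch that puts the "first day of next month minus one day" technique next to the built-in EOMONTH function. EOMONTH is not part of the original material and requires SQL Server 2012 or later; the variable name @d and the sample date are illustration values.

```sql
-- @d is an illustrative variable; '20080215' is a language-independent
-- datetime literal (15 February 2008, a leap year).
declare @d datetime = '20080215'

select
    -- first day of the next month, minus one day (datetime result)
    dateadd(month, datediff(month, 0, @d) + 1, 0) - 1 as lastDayClassic,
    -- built-in equivalent, SQL Server 2012 or later (date result)
    eomonth(@d) as lastDayEomonth
```

Both columns should land on 29 February 2008; the first carries a midnight time portion while the second is a pure date.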

A Quick Unscientific Survey
-- Which select statement do you like better?
Select dateadd(month, datediff(month, '1900-01-01', @aDate) + 1, '1900-01-01') - 1
Select dateadd(month, datediff(month, 0, @aDate) + 1, 0) - 1

I Don’t Like That Question
Question: How can I extract the numeric portion from a column that starts with numeric information and ends with other non-numeric information?

Some questions are glaringly “wrong” questions, but sometimes questions are just a little “off”: they aren’t particularly bothersome, but they just seem to “itch”. What is bothersome about this question? It tends to lead me to question the table design. One possibility is that the numeric portion and the alphanumeric portion of this string should be stored in separate columns.

Data for Numeric Split from Left
-- Split apart a string at a border where the string changes
-- from being numeric to being non-numeric. This routine
-- assumes that the pure numeric portion will always occupy
-- the leftmost portion of the string.
-- (The table variable name @test is assumed; the original name is
-- missing from the source.)
declare @test table (combinedString varchar(25))
insert into @test
select '1015Abc' union all
select '1022' union all
select '5157D2' union all
select 'Db2' union all
select '#' union all
select '#d' union all
select '' union all
select '105#' union all
select ' ' union all
select null

This slide provides a dataset for test purposes. This data is the input to the query presented on the next slide.

Split Numeric Data From Left
-- (@test is an assumed name for the test table variable.)
select combinedString,
       left(combinedString, patindex('%[^0-9]%', combinedString + 'X') - 1) as [Numeric Part],
       substring(combinedString, patindex('%[^0-9]%', combinedString + 'X'), 25) as [Remainder]
from @test
/* Sample Output (empty strings shown as blanks):
combinedString  Numeric Part  Remainder
1015Abc         1015          Abc
1022            1022
5157D2          5157          D2
Db2                           Db2
#                             #
#d                            #d

105#            105           #

NULL            NULL          NULL
*/

What is missing from this query?

Data for Numeric Split from Right
-- Split apart a string at a border where the string
-- changes from being non-numeric to being numeric.
-- This routine assumes that the pure numeric portion
-- will always occupy the rightmost portion of the
-- string.
-- (The table variable name @test is assumed; the original name is
-- missing from the source.)
declare @test table (combinedString varchar(25))
insert into @test
select 'Abc1015' union all
select '1022' union all
select 'D25157' union all
select 'Db2' union all
select '#' union all
select '#d' union all
select '' union all
select '#105' union all
select ' ' union all
select null
--select * from @test

This is the data used for a query that splits out a numeric portion that is on the right rather than on the left.

Split Numeric Data From Right
-- (@test is an assumed name for the test table variable.)
select combinedString,
       right(combinedString,
             patindex('%[^0-9]%', reverse(combinedString) + 'X') - 1) as [Numeric Part],
       left(combinedString,
            len(combinedString + 'X')
            - patindex('%[^0-9]%', reverse(combinedString) + 'X')) as [Remainder]
from @test
/* Sample Output (empty strings shown as blanks):
combinedString  Numeric Part  Remainder
Abc1015         1015          Abc
1022            1022
D25157          25157         D
Db2             2             Db
#                             #
#d                            #d

#105            105           #

NULL            NULL          NULL
*/

It’s the Wrong Time
Question: How can I select records from a table that match by a given date or that are within a given date range?

It’s the “Wrong Time”? What are you talking about?

Test Data Setup
-- ---------------------------------------
-- Indexed on “aDate” column
use tempdb
go
create table dbo.date_test
( id int primary key,
  aDate datetime)
create index aDate on dbo.date_test(aDate)
insert into dbo.date_test
select 1, '12/31/7 23:59:59.996' union all
select 2, '12/31/7 23:59:59.999' union all
select 3, '1/1/8' union all
select 4, '1/1/8 12:00' union all
select 5, '1/1/8 23:59:59.996' union all
select 6, '1/2/8' union all
select 7, '1/2/8 00:00:00.001' union all
select 8, '1/2/8 00:00:00.003'
--select * from dbo.date_test

Why is it the wrong time? There are a couple of common issues that occur with the way the date/time data type is represented. A quick glance at queries that involve date/time data that includes time might lead a casual observer to believe that time is stored to the millisecond; however, that is not the case. The time portion is actually stored to 1/300th of a second accuracy. When this data is displayed it is represented with a decimal that is rounded. Similarly, when date/time data is inserted into the database it is rounded to the nearest 1/300th of a second. See the data in red (the row with id 2)? The value inserted into the database for this select will be ROUNDED UP to '01/01/2008 00:00:00.000'. This can sometimes lead to surprises. Be aware of it!
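The rounding behavior can be checked directly with a few casts. This is a small sketch rather than part of the original test script; the literals use the language-independent 'yyyymmdd hh:mm:ss.mmm' form and the column aliases are arbitrary.

```sql
-- datetime keeps time in 1/300-second ticks, so the millisecond digit
-- always ends in 0, 3 or 7 after rounding.
select cast('20071231 23:59:59.996' as datetime) as r1, -- 2007-12-31 23:59:59.997
       cast('20071231 23:59:59.999' as datetime) as r2, -- 2008-01-01 00:00:00.000
       cast('20080102 00:00:00.001' as datetime) as r3, -- 2008-01-02 00:00:00.000
       cast('20080102 00:00:00.003' as datetime) as r4  -- 2008-01-02 00:00:00.003
```

The second column is the surprising one: a value that the application supplied as 2007 is stored as the first instant of 2008.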

Matching Records by Date
-- This is not a good method for filtering records
-- by date. See if you can spot the problem
select id, aDate
from dbo.date_test
where convert(varchar(10), aDate, 101) = '01/01/2008'
/* Sample Output:
id aDate
2  2008-01-01 00:00:00.000
3  2008-01-01 00:00:00.000
4  2008-01-01 12:00:00.000
5  2008-01-01 23:59:59.997
*/

Remember the “Rounding Up” problem of the previous slide? Note that record 2 is selected as a 1/1/2008 record. The application might “think” this is a 12/31/2007 record, but the database rounded it up and it went from being a 2007 record to being a 2008 record.

This slide also gets at the other “Wrong Time” issue. Earlier I stated that I have probably spent more time fixing cursor problems than any other problem. Another very frequent problem has to do with searching for data by date/time in a way that negates any ability to use a date/time based index: here the aDate column is wrapped in the CONVERT function. It is possible that this kind of problem occurs more frequently than the cursor problems. The advantage that this problem has over the cursor problem is that it is usually simple to fix. This is the reason that I say that this is not a good method for filtering.
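A single-day filter can usually be rewritten so that the column is left bare and an index on it remains usable. The following is a sketch under assumptions: the table variable @t, the variable @day and the sample rows are made up for illustration and stand in for dbo.date_test.

```sql
-- Hypothetical stand-in for dbo.date_test.
declare @t table (id int primary key, aDate datetime)
insert into @t
select 1, '20071231 23:59:59.996' union all   -- stored as 23:59:59.997
select 2, '20080101 00:00:00.000' union all
select 3, '20080101 12:00:00.000' union all
select 4, '20080102 00:00:00.000'

declare @day datetime = '20080101'

-- Half-open range: on or after midnight of @day and strictly before
-- midnight of the following day. No function is applied to aDate.
select id, aDate
from @t
where aDate >= @day
  and aDate < dateadd(day, 1, @day)
-- returns ids 2 and 3
```

The comparison values are computed once, while the column itself is compared as stored, which is what allows an index seek on a real table.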

Alternate Test Data
Truncate table dbo.date_test
insert into dbo.date_test
select 1, '12/31/7 23:59:59.996' union all
select 2, '12/31/7 23:59:59.999' union all
select 3, '1/1/8' union all
select 4, '1/1/8 12:00' union all
select 5, '1/2/8' union all
select 6, '1/3/8 23:59:59.996' union all
select 7, '1/4/8 00:00:00.001' union all
select 8, '1/4/8 00:00:00.003'
--select * from dbo.date_test

Now let's look at a method of filtering records for a given range of dates. The data provided above will be the input to the query on the next slide.

Matching by Date Range
-- Fetch records from a table that match on a range
-- of dates. In this particular case records
-- targeted are records in which the aDate column
-- has a date of 1/1/8, 1/2/8 or 1/3/8.
-- (The variable name @fromDate is assumed; only @toDate survives
-- in the source.)
declare @fromDate datetime = '1/1/8'
declare @toDate datetime = '1/3/8'
select id, aDate
from dbo.date_test
where aDate >= @fromDate
  and aDate < @toDate + 1
/* Sample Output:
id aDate
2  2008-01-01 00:00:00.000
3  2008-01-01 00:00:00.000
4  2008-01-01 12:00:00.000
5  2008-01-02 00:00:00.000
6  2008-01-03 23:59:59.997
*/

Note that the “aDate” column is no longer enclosed in a function. This means that it is now possible for this WHERE clause to utilize an index on the “aDate” column. Also, notice that the comparison that is performed for @toDate requires that the “aDate” column be strictly less than the @toDate variable plus one day. This is so that all data for @toDate will be included in the dataset. The “strictly less than” comparison is so that any records with a date/time of midnight of the day after @toDate will be excluded from the dataset.

Arrays and Lists
Question: How do you split a comma separated "multi-value" parameter to generate a list of the individual values? For example, how would the string '2,5,9,14,21' be split to return each individual number?
Returned as:
Number
------
2
5
9
14
21

Jens Suessmeyer’s “Split” Function
-- (The names @SplittedValues and @SplitLength, and the CHARINDEX, LEN,
-- SUBSTRING and RIGHT calls, are reconstructed; they are missing from
-- the source.)
CREATE FUNCTION dbo.Split
(
    @String VARCHAR(200),
    @Delimiter VARCHAR(5)
)
RETURNS @SplittedValues TABLE
(
    OccurenceId SMALLINT IDENTITY(1,1),
    SplitValue VARCHAR(200)
)
AS
BEGIN
    DECLARE @SplitLength INT
    WHILE LEN(@String) > 0
    BEGIN
        SELECT @SplitLength =
            (CASE CHARINDEX(@Delimiter, @String)
                 WHEN 0 THEN LEN(@String)
                 ELSE CHARINDEX(@Delimiter, @String) - 1 END)
        INSERT INTO @SplittedValues
        SELECT SUBSTRING(@String, 1, @SplitLength)
        SELECT @String =
            (CASE (LEN(@String) - @SplitLength)
                 WHEN 0 THEN ''
                 ELSE RIGHT(@String, LEN(@String) - @SplitLength - 1) END)
    END
    RETURN
END

Jens’ “Split” function is one of my personal favorites. If you are running SQL Server 2000 this function is an excellent way to get the job done. Two common methods with 2005 include the “split” function and an “XML substitution” method, which is discussed in a couple of slides. Note that the split function takes two arguments: (1) the string to be split and (2) a short delimiter string. This allows a bit of flexibility in defining the delimiter. For instance, if the input string is text that might otherwise contain commas, a pipe delimiter or a character like char(255) might be a more appropriate delimiter than the comma. Also note that the input variable character field is limited to 200 characters. This is something that I frequently change rather than use as presented.

Multi-value Parameter with Split Function
-- Use of Jens Suessmeyer's Split function to
-- split a "multi-value" parameter
-- that contains '2,5,9,14,21'
-- (The variable name @list is assumed; the original name is missing
-- from the source.)
declare @list varchar(20) = '2,5,9,14,21'
select * from dbo.Split(@list, ',')
/* Sample Output:
OccurenceId SplitValue
1           2
2           5
3           9
4           14
5           21
*/

This is an example of a good use of a multi-statement table function. The select statement sure looks simple. This function coupled with the CROSS APPLY operator can be applied in a number of different scenarios.
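As a sketch of the CROSS APPLY pattern mentioned above, the query below splits a list stored in each row of a table. To keep the example self-contained it uses the built-in STRING_SPLIT function, which is an assumption on my part: it is not in the original material and requires SQL Server 2016 or later (compatibility level 130). The table variable @orders and its columns are made-up names.

```sql
-- Hypothetical sample data: one delimited list per row.
declare @orders table (order_id int primary key, item_list varchar(50))
insert into @orders
select 1, '2,5,9' union all
select 2, '14,21'

-- CROSS APPLY evaluates the table function once per row of @orders and
-- pairs each row with the values split out of its own list.
select o.order_id, s.value as item
from @orders o
cross apply string_split(o.item_list, ',') s
```

With the dbo.Split function installed, the same shape should work by applying dbo.Split(o.item_list, ',') instead and selecting s.SplitValue.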

XML Substitution Used for Splitting
-- This query uses an "XML Substitution" method
-- to translate the string '2,5,9,21' into a
-- list of data segments.
-- (The variable name @list is assumed; the original name is missing
-- from the source.)
declare @list varchar(20) = '2,5,9,21'
select t.value('.', 'varchar(11)') as segment
from
( select cast( '<a>' + replace(@list, ',', '</a><a>') + '</a>' as xml) as xmlList
) as a (xmlList)
cross apply xmlList.nodes('//a') as x(t)
/* Sample Output:
segment
2
5
9
21
*/

The XML substitution method is currently a very popular method of splitting strings into lists. Each delimiter is replaced with a closing and an opening tag, the result is cast to XML, and the nodes() method then returns one row per element. The advantage of this method is that it does not require the definition of a table function and therefore can be applied to many different kinds of problems that involve splitting “string lists” into individual values.

Splitting Blocked Data
-- This example splits a list of integers that is broken into 8-byte
-- segments. In this case we want to be sure that we pick up
-- the blank segment that is at the end of the string.
-- (The variable name @list is assumed; the original name is missing
-- from the source.)
declare @list varchar(56) = ' '
select n, substring(@list, 8*n - 7, 8) as segment
from numbers
where n <= 7
  and n <= (len(@list + 'x') + 6)/8
/* Sample Output n segment 6 */

Representing a list of data in a blocked or segmented format rather than a delimited format can sometimes provide a boost to performance. The idea is to use the SUBSTRING function to address each individual block and to use a table of numbers, a numbers function or a numbers CTE to iterate through each block.

What is with the n <= 7 condition in the WHERE clause? Isn’t it unnecessary because of the n <= (len(@list + 'x') + 6)/8 condition?

Document the “Funny Stuff”
-- This example splits a list of integers that is broken into 8-byte
-- segments. In this case we want to be sure that we pick up
-- the blank segment that is at the end of the string.
--
-- The “WHERE n <= 7” is there so that the optimizer might choose
-- a better query plan – with a cost estimate of about 1/8 the
-- cost of the query plan without this filter.
varchar(56) = ' '
select n, 8*n - 7, 8) as segment
from numbers
where n <= 7
  and n <= + 'x') + 6)/8
/* Sample Output
n segment
6
*/

Technically, the n <= 7 condition is not necessary because of the other AND condition. However, adding it to the WHERE clause gives the optimizer a much better estimate and can result in a better query plan. The reason for having this condition is not obvious. This kind of “Funny Stuff” should be documented in the code to make it clear what is going on. Also, it might be better to put this comment immediately ahead of the WHERE clause rather than in the block at the top.
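Because the variable declaration and the SUBSTRING call are only partly visible above, here is a self-contained sketch of the blocked-list technique. The variable name @blocks, the sample values and the small numbers CTE are assumptions made for illustration; the deck's own numbers table is not shown. It assumes SQL Server 2008 or later for the inline DECLARE initialization.

```sql
-- Seven 8-character blocks: six right-justified integers plus one blank block.
declare @blocks varchar(56) =
    str(2, 8) + str(5, 8) + str(9, 8) + str(14, 8) + str(21, 8) + str(34, 8) + space(8);

-- Small stand-in for a numbers table (hypothetical; any numbers source works).
with numbers (n) as (
    select 1 union all select 2 union all select 3 union all select 4 union all
    select 5 union all select 6 union all select 7 union all select 8 union all
    select 9 union all select 10
)
select n,
       substring(@blocks, 8*n - 7, 8) as segment
from numbers
-- Documented "funny stuff": n <= 7 is logically redundant, but per the slide it
-- gives the optimizer a fixed upper bound and a much better estimate.
where n <= 7
  -- LEN ignores trailing blanks, so 'x' is appended to count the blank last block.
  and n <= (len(@blocks + 'x') + 6) / 8;
```

With these values the query returns seven rows, and the seventh segment is the blank block. Without the appended 'x', LEN would report 48 instead of 56 characters and the trailing blank block would be dropped.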

Split Out Numeric IP Values
Question: How do I extract the numeric segments of an IP Address?

We talked about “Shortcuts” earlier. The IP value question is one situation in which there is an obscure shortcut that can make your life easier. Take a look at the PARSENAME function in Books Online. PARSENAME imposes a number of constraints that normally make it a poor fit for splitting a string into its constituents; however, splitting apart an IP address is one case in which PARSENAME can be particularly useful.

Skip or only gloss over this slide and the next one if you are short on time.

Split Out Numeric IP Values
-- Split out the 4 numeric portions of an IP address
--
-- This particular query uses the PARSENAME built-in function to
-- rip out the individual numeric segments. PARSENAME will not
-- generally apply to segmenting out data because it is specific
-- in that (1) it cannot handle more than 4 segments and (2) it
-- must use the period character for the separator character.
varchar(15) = ' '
varchar(15) = ' '
varchar(15) = ' '
as [IP Address],
4) as [Seg 1],
3) as [Seg 2],
2) as [Seg 3],
1) as [Seg 4]
/* Sample Output:
IP Address Seg 1 Seg 2 Seg 3 Seg 4
*/

I suppose that to some extent this might be abuse of a feature in that this is not how PARSENAME is intended to be used; however, this is a simple query for extracting the values out of a properly formed IP address string.
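A minimal runnable version of the same idea might look like the following. The variable name @ip and the sample address are assumptions for illustration only. PARSENAME numbers its parts from the right, which is why segment 1 of the address is part 4.

```sql
-- Hypothetical sample address; assumes SQL Server 2008+ for the inline initialization.
declare @ip varchar(15) = '192.168.10.7';

select @ip                              as [IP Address],
       cast(parsename(@ip, 4) as int)   as [Seg 1],   -- leftmost part
       cast(parsename(@ip, 3) as int)   as [Seg 2],
       cast(parsename(@ip, 2) as int)   as [Seg 3],
       cast(parsename(@ip, 1) as int)   as [Seg 4];   -- rightmost part
```

For this address the four segments come back as 192, 168, 10 and 7. A string with more than four period-separated parts makes PARSENAME return NULL, which is one reason the shortcut only suits properly formed addresses.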

An Unfortunate Question
I have no backup of my database. Is there a way to recover data from a table after the data has been deleted?

A question similar to this comes up sometimes in the forum. There are a few recovery programs available that might be usable in this situation; however, frequently if you don’t have a backup you have no recourse. One of the maxims I have heard a number of times is that a database administrator is only as good as his backups.

It was a little after 3 PM one afternoon. My server monitoring program lost contact for some reason with our Atlanta server. The lost contact persisted for several minutes, so I called the application support center and explained my concern. I was told that they were unable to phone the Atlanta center and had no information. In about 15 minutes I got a response from the application support center: a big fire was in progress at the Atlanta center and both my primary and backup servers were probably lost. An hour or so later I found out that they had stopped sending the tape backups offsite for storage as part of a “cost savings” effort because these tapes had never been used – not even once. Eventually I figured out that about a month earlier I had transferred a file copy of an Atlanta database backup to another location as a precaution when there had been a hurricane threat.

That night when I got home my wife called me into the living room. The fire was on CNN. A few minutes later I got a call from a friend who asked if that fire killed my servers. “Yep.” Watching your servers burn down on national television gives you a … kind of a … special feeling – and it isn’t good!

This really is a beginner issue; however, if you take only one thought away from this slide series remember this one: Make sure your data is backed up and secure so that you can recover from unforeseen problems.

An Ounce of Prevention
Features to Assist Recovery:
Maintenance Plans
Database Mirroring
Replication
Backup and Restore
Detach and Attach

The intent is to gloss over and skim the next few pages.

SQL Server Recovery Models
Full Recovery Model
Bulk Logged Recovery Model
Simple Recovery Model
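As a quick illustration of where the recovery model shows up in practice, the sketch below lists the model of each database and switches one database to the full recovery model. The database name SalesDemo is hypothetical.

```sql
-- Report the recovery model (FULL, BULK_LOGGED or SIMPLE) of every database.
select name, recovery_model_desc
from sys.databases
order by name;

-- Switch a hypothetical database to the full recovery model.
alter database SalesDemo set recovery full;
```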

Transact SQL Recovery Commands
Backup
Restore
sp_detach_db
sp_attach_db
sp_attach_single_file_db

Shock slide follows; the intent is to hit hard with a surprising slide to drive the point home: backups are critically important.
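For readers who have never run these commands, a bare-bones backup and restore cycle might look like the sketch below. The database name SalesDemo and the file paths are hypothetical, and a real plan would also cover scheduling, retention and offsite copies.

```sql
-- Full backup; INIT overwrites any existing backup sets in the file.
backup database SalesDemo
    to disk = N'D:\Backups\SalesDemo_full.bak'
    with init, checksum;

-- Check that the backup file is readable and that its checksums are valid.
restore verifyonly
    from disk = N'D:\Backups\SalesDemo_full.bak'
    with checksum;

-- Log backup (requires the full or bulk-logged recovery model).
backup log SalesDemo
    to disk = N'D:\Backups\SalesDemo_log.trn'
    with init, checksum;

-- Restore the full backup, then roll forward the log backup.
-- REPLACE overwrites the existing database, so use it with care.
restore database SalesDemo
    from disk = N'D:\Backups\SalesDemo_full.bak'
    with norecovery, replace;

restore log SalesDemo
    from disk = N'D:\Backups\SalesDemo_log.trn'
    with recovery;
```

A backup that has never been restored is only a hope; running a test restore on another server from time to time is the surest way to know the files are usable.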

Memorandum: You’re SCREWED!!!
What are the results of not having sufficient backups when you lose your data? Exactly!

Pivots and EAV Tables
Question: How do I take these rows:

1  col1  value1
1  col2  value2
1  col3  value3
2  col1  value4
2  col2  value5
2  col3  value6

And transform them into this output:

id  col1    col2    col3
1   value1  value2  value3
2   value4  value5  value6

What is an EAV table? A table that contains (1) an entity identifier, (2) an attribute that is associated with a particular entity and (3) the value for the given entity-attribute. EAV tables represent data that is not normalized and as such should be used with caution. One of the problems with EAV tables is that while they might be convenient for developers to store values into, they are frequently troublesome when it comes to reading the data out. Reading the data out will frequently require some kind of pivot operation. Therefore, an EAV table is chosen as an example for using the SQL Server 2005 PIVOT operator.

Note that what the PIVOT operator does is convert rows into columns. SQL Server also includes an UNPIVOT operator to convert columns into rows.

EAV Table and Data Setup
-- An “EAV Table” is a table that is intended to be used
-- to store generic data to provide a high degree of
-- flexibility in use for an application developer.
-- However, EAV tables can require SQL queries that are
-- more complex than a standard table would require.
-- EAV queries are frequently slower than queries that
-- would target a typical relational table.
table
( entity integer,
  attribute varchar(12),
  aValue varchar(12)
)
insert
select 1, 'col1', 'value1' union all
select 1, 'col2', 'value2' union all
select 1, 'col3', 'value3' union all
select 2, 'col1', 'value1' union all
select 2, 'col2', 'value2' union all
select 2, 'col3', 'value3'
--Select *

This slide shows the setup data that is used for the example that follows on the next page.

SQL Server 2005 Pivot
-- This query uses the SQL Server 2005 PIVOT clause to
-- convert specific EAV rows into columns.
Select entity, col1, col2, col3
Pivot
( max(aValue) for attribute
  in([col1],[col2],[col3])
) pv
/* Output:
entity  col1    col2    col3
1       value1  value2  value3
2       value1  value2  value3
*/

A few things to note about this pivot query:
+ The “pivot” operator is included as part of the FROM clause
+ Note the “max(aValue)” expression; the pivot operator requires that the first expression be an aggregate expression
+ The column name that follows the “for” is the column that is searched for specific values
+ The “in” clause designates the particular values of the “for” column that get pivoted from rows into columns
+ Note the use of square brackets to designate target values
+ The “pivot” operator requires an alias – in this case “pv”
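Putting the setup and the pivot together, a self-contained sketch might look like this. The table variable name @eav is an assumption (the original name is not visible in the transcript), and the sample values follow the rows shown in the question slide.

```sql
-- Hypothetical table variable holding entity-attribute-value rows.
declare @eav table
( entity    integer,
  attribute varchar(12),
  aValue    varchar(12)
);

insert @eav
select 1, 'col1', 'value1' union all
select 1, 'col2', 'value2' union all
select 1, 'col3', 'value3' union all
select 2, 'col1', 'value4' union all
select 2, 'col2', 'value5' union all
select 2, 'col3', 'value6';

-- Every column that is neither aggregated nor named after FOR (here: entity)
-- acts as a grouping column, so the result has one row per entity.
select entity, col1, col2, col3
from @eav
pivot
( max(aValue) for attribute in ([col1], [col2], [col3])
) as pv
order by entity;
```

With this data the query returns one row for entity 1 (value1, value2, value3) and one row for entity 2 (value4, value5, value6). Attribute values that are not listed in the IN clause are simply ignored.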

Support Information Problem
Question: I have a support table that contains this information:
Application Name
Support Role – primary or secondary support
Contact – The name of the support associate
Contact Phone Number

How do I pivot this data such that each line displays:
Application Name
First Contact (Name)
First Phone (Phone Number)
Second Contact
Second Phone

The pivot operator can be used to solve some pivot type operations; however, the pivot operator is not available in SQL Server 2000 and earlier releases. Also, there are some pivot queries that are not as conveniently solved using the pivot operator. That is the case with this problem. This query does a “double pivot”: one pivot for the contact name and another pivot for the contact telephone.

Support Information Data Setup
-- The data for this problem is stored in
-- table variable
table
( application_name varchar(20),
  support_role char(1),
  contact varchar(10),
  contact_phone varchar(14)
)
insert
select 'Clean', 'P', 'Rick', '(904) ' union all
select 'Buggy', 'P', 'Jim', '(217) ' union all
select 'Buggy', 'S', 'Chris', '(309) ' union all
select 'New', 'S', 'Rick', '(904) '
--select *

This short script establishes test data for our pivot problem. This data is input to the page that follows.

“Classic” SQL 2000 Pivot
-- This method uses the SQL 2000 compatible "Standard Method" for performing
-- the pivot. This method is based on using an aggregate – in this case
-- max – and case statements to perform the pivot.
select application_name,
  max( case when support_role = 'P' then contact else '' end ) as [First Contact],
  max( case when support_role = 'P' then contact_phone else '' end ) as [First Phone],
  max( case when support_role = 'S' then contact else '' end ) as [Second Contact],
  max( case when support_role = 'S' then contact_phone else '' end ) as [Second Phone]
group by application_name
/* Sample Output:
application_name  First Contact  First Phone  Second Contact  Second Phone
Buggy             Jim            (217)        Chris           (309)
Clean             Rick           (904)
New                                           Rick            (904)
*/

The “classic” pivot method involves the use of an aggregate function and “case” syntax to select the values that are migrated from rows to columns. It is good to be familiar with this method because the “pivot” operator is somewhat limited in the kinds of problems that it can address.
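For a version that can be pasted and run as is, the sketch below combines the setup and the classic pivot. The table variable name @support and the phone numbers are invented for illustration; the transcript shows neither the original variable name nor complete phone numbers.

```sql
-- Hypothetical table variable and made-up phone numbers.
declare @support table
( application_name varchar(20),
  support_role     char(1),      -- 'P' = primary, 'S' = secondary
  contact          varchar(10),
  contact_phone    varchar(14)
);

insert @support
select 'Clean', 'P', 'Rick',  '(904) 555-0101' union all
select 'Buggy', 'P', 'Jim',   '(217) 555-0102' union all
select 'Buggy', 'S', 'Chris', '(309) 555-0103' union all
select 'New',   'S', 'Rick',  '(904) 555-0101';

-- "Double pivot": each CASE keeps the value for one role and blanks out the rest;
-- MAX then collapses the rows for an application into a single line.
select application_name,
       max(case when support_role = 'P' then contact       else '' end) as [First Contact],
       max(case when support_role = 'P' then contact_phone else '' end) as [First Phone],
       max(case when support_role = 'S' then contact       else '' end) as [Second Contact],
       max(case when support_role = 'S' then contact_phone else '' end) as [Second Phone]
from @support
group by application_name
order by application_name;
```

The ELSE '' branches make a missing contact show up as an empty string; leaving them out would return NULL for the missing role instead, which is sometimes easier for a calling application to test.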

Reference – General General Reference: Microsoft SQL Server MSDN:
Microsoft SQL Examples: SQL Server Central: Erland Sommarskog:

Reference – Scalar Functions
Problems with Scalar Function: Umachandar Jayachandran: Kent Waldrop: Sreeju:

Reference – Set Oriented Processing
Kent Waldrop: Kent Waldrop and Arnie Rowland: Umachandar Jayachandran:

Reference – Last Day of the Month
Finding the Last Day of the Month: Kent Waldrop: Arnie Rowland:

Reference – Separating Data
Separating Numeric from Alphanumeric Data: Kent Waldrop: Kent Waldrop, Reuben Shaffer, Jonathan Kehayias and Mark: Matt Miller and Jeff Moden:

Reference – Filtering by Date/Time
Filtering by Date and Time: David Brit and Kent Waldrop: Jonathan Kehayias: FindDataBetweenDates&referring Title=Home Kent Waldrop Jonathan Kehayias and Adam Haines:

Reference – Working With Lists
Erland Sommarskog: Arnie Rowland: CreateACommaDelimitedList&referringTitle=Home StringArrayInput&referringTitle=Home Umachandar Jayachandran:

Reference – Working With IP Addresses
Adam Machanic and Louis Davidson: Louis Davidson: Jeff Moden:

Reference – Data Pivots
Jonathan Kehayias: &referringTitle=Home &referringTitle=Home Kent Waldrop:
