How to reuse a subquery in T-SQL

In the past when I was creating sql statements there was many times I would need to have more than one subquery that was just slightly different than the first. However, now there is a better way to reuse a subquery in T-SQL and that is using a common table expression (CTE).

A common table expression (CTE) is defined by:


WITH cte_name_here AS
(
cte_query_here
)
outer_sql_query_that_uses_cte_name_here

You can provide aliases for the results of your common table expression by putting the aliases in the CTE query or in the CTE definition.


WITH mycte (CustomerNumber, InvoiceYear) AS
(
SELECT CustomerId, YEAR(invoicedatepaid)
FROM Invoice
)
SELECT CustomerNumber, InvoiceYear
FROM mycte ;

OR


WITH mycte AS
(
SELECT CustomerId AS CustomerNumber, YEAR(invoicedatepaid) AS InvoiceYear
FROM Invoice
)
SELECT CustomerNumber, InvoiceYear
FROM mycte;

You can use variables with common table expressions just like you can with sub queries. The example below will filter invoices just for the month of March.


DECLARE @StartDate AS DATETIME = '3/1/2014',
@EndDate AS DATETIME = '4/1/2014';

WITH mycte (CustomerNumber, InvoiceYear) AS
(
SELECT CustomerId, YEAR(invoicedatepaid)
FROM Invoice
WHERE invoicedatepaid >= @StartDate
AND invoicedatepaid < @EndDate ) SELECT CustomerNumber, InvoiceYear FROM mycte ;

Unlike subqueries, you cannot nest common table expressions directly, but you can build upon a previous one in sequence by referring to the previous CTE.


WITH mycte1 (CustomerNumber, InvoiceYear, RetailAmount) AS
(
SELECT CustomerId, YEAR(invoicedatepaid), RetailAmount
FROM Invoice
),
mycte2 (InvoiceYear, YearlyAmount) AS
(
SELECT InvoiceYear, SUM(RetailAmount)
FROM mycte1
),
SELECT InvoiceYear, YearlyAmount
FROM mycte2 ;

The best feature of a CTE is that you can refer to the same one more than once.


WITH mycte (CustomerNumber, InvoiceYear, RetailAmount) AS
(
SELECT CustomerId, YEAR(invoicedatepaid), RetailAmount
FROM Invoice
)
SELECT tab1.InvoiceYear, tab2.RetailAmount
FROM mycte tab1
inner join mycte tab2
where tab1.CustomerId = tab2.CustomerId
and tab1.InvoiceYear = tab2.InvoiceYear;

The benefit is that you can define the recordset once and not need to copy and paste it. This will help with maintenance later.

T-SQL – Best approach to remove the time portion of the datetime datatype

In the past, when I had a datetime datatype and I needed to show just the date portion, my best approach to remove the time portion of datetime datatype was converting to a character string. When you call the CONVERT function to change a DATETIME to a string (VARCHAR), it removes all time and allows you to format the date many different ways. For a complete list of date formats available to CONVERT, view the SQL Server Books Online (http://msdn.microsoft.com/en-us/library/ms187928.aspx)

Here are a couple examples to show how to format using CONVERT:


DECLARE @mydatetime DATETIME
SET @mydatetime = GETDATE()

SELECT CONVERT(VARCHAR,@mydatetime,1) AS [MMDDYY]
, CONVERT(VARCHAR,@mydatetime,101) AS [MMDDYYYY]
, CONVERT(VARCHAR,@mydatetime,103) AS [DDMMYYYY]

MMDDYY

MMDDYYYY

DDMMYYYY

03/19/14

03/19/2014

19/03/2014


DECLARE @mydatetime DATETIME
SET @mydatetime = GETDATE()

SELECT CONVERT(VARCHAR,@mydatetime,106) AS DayThreeMonthFourYear
, CONVERT(VARCHAR,@mydatetime,107) AS ThreeMonthDayFourYear
, CONVERT(VARCHAR,@mydatetime,108) AS TimePortionOfDate

DayThreeMonthFourYear

ThreeMonthDayFourYear

TimePortionOfDate

19 Mar 2014

Mar 19, 2014

16:17:15

The biggest drawback of using CONVERT to remove the time portion of the datetime datatype is that you lose the ability to use any of the date functions. For example, you cannot sort the column by calendar order, instead the sort will follow the numeric and alphabet sort order. Also, if you want to do date arithmetic, it does not work either. So, you end up calling the CONVERT function to make the datatype a string to remove the time and then CONVERT back to a datetime in order to use the date functions on the data. This is not very productive.

The best approach to remove the time portion of the datetime datatype is to use the DATEADD function.


DECLARE @mydatetime DATETIME
SET @mydatetime = GETDATE()

SELECT DATEADD(DAY,DATEDIFF(DAY,0,@mydatetime),0) AS Today
, DATEADD(DAY,DATEDIFF(DAY,0,@mydatetime),-1) AS Yesterday
, DATEADD(MONTH,DATEDIFF(MONTH,0,@mydatetime),0) AS FirstOfThisMonth
, DATEADD(MONTH,DATEDIFF(MONTH,-1,@mydatetime),-1) AS LastOfThisMonth
, DATEADD(MONTH,DATEDIFF(MONTH,0,@mydatetime),-1) AS LastOfPrevMonth

Today

Yesterday

FirstOfThisMonth

LastOfThisMonth

LastOfPrevMonth

2014-03-19

2014-03-18

2014-03-01

2014-03-31

2014-02-28

This will enable you to have great query performance by using a start date greater than “Today” and an end date less than “Tomorrow”. Instead of using the BETWEEN command or taking 1 second away from the end of today to get 24 hours.


DECLARE @mydatetime DATETIME
SET @mydatetime = GETDATE()

SELECT DATEADD(DAY,DATEDIFF(DAY,0,@mydatetime),0) AS Today
, DATEADD(DAY,DATEDIFF(DAY,0,@mydatetime),1) AS Tomorrow

SELECT *
FROM Invoices
WHERE CreatedDate >= DATEADD(DAY,DATEDIFF(DAY,0,@mydatetime),0)
AND CreatedDate < DATEADD(DAY,DATEDIFF(DAY,0,@mydatetime),1)

T-SQL – Aggregate data without joining to a subquery

When creating a reports, you may need to aggregate only a portion of the data in the main report. In the past, I would have created a sub query to get the aggregate data and the join it to the main report. However, using the new OVER function you can aggregate data without joining to a subquery.

It is common to use aggregate functions with the GROUP function like COUNT and SUM.


SELECT CustomerId, InvoiceId,
COUNT(*) OVER(PARTITION BY CustomerId) as NumberOfInvoices
FROM Invoices

The result of this query, is that you can return the number of invoices per customer, while still showing each invoice id.
The result of this query will the customer sales per year

CustomerId

InvoiceId

NumberOfInvoices

Cust1

123

2

Cust1

124

2

Cust2

125

3

Cust2

126

3

Cust2

127

3

Using the OVER with PARTITION can be used in the ORDER BY clause as well. This will enable you to sort the record set by the grouping.


SELECT CustomerId, InvoiceId,
COUNT(*) OVER(PARTITION BY CustomerId) as NumberOfInvoices
FROM Invoices
ORDER BY COUNT(*) OVER(PARTITION BY CustomerId) DESC;

The result of this query, is that you can return the number of invoices per customer, while still showing each invoice id.

CustomerId

InvoiceId

NumberOfInvoices

Cust2

125

3

Cust2

126

3

Cust2

127

3

Cust1

123

2

Cust1

124

2

If you want to use the new RANK, DENSE_RANK, ROW_NUMBER or NTILE ranking functions, you will need to use them with the OVER clause.

T-SQL – Transpose rows into columns or columns into rows

It is quite common to need to transpose rows into columns or columns into rows for reporting purposes. The simplest way to do this change is using the PIVOT operator.

Using “Static” Pivot columns is one way to change the row data into columns, but the column names must be known up front.


SELECT *
FROM
(SELECT CustomerId, YEAR(invoicedatepaid) AS InvoiceYear,
RetailAmount
FROM Invoice) as InvoiceValue
PIVOT(SUM(RetailAmount) FOR InvoiceYear IN ([2013], [2014])) AS SUBTAB;

The result of this query will the customer sales per year

CustomerId

2013

2014

Cust1

10023.12

18923.12

Cust2

11135.42

14312.36

Using “Static” UnPivot allows you to change the column data into rows, but the column names must be known up front. If the previous sample results was a table called CustomerYearlySales, the command to convert to rows would be:


SELECT CustomerId, InvoiceYear, RetailAmount
FROM CustomerYearlySales
UNPIVOT(RetailAmount FOR InvoiceYear IN ([2013], [2014])) AS SUBTAB;

This command will split the results up into rows for each customer and year.
The result of this query will be:

CustomerId

InvoiceYear

RetailAmount

Cust1

2013

10023.12

Cust1

2014

18923.12

Cust2

2013

11135.42

Cust2

2014

14312.36

T-SQL – Join two tables where the first table has no or multiple entries in other table or query

If you are trying to join a table of customers with a query of top 10 most recent invoices per customer, this is best achieved by using the APPLY operator. The APPLY operator involves one or two steps, depending on command used. This operator will help you join two tables where the first table has no or multiple entries in other table or query.

Using the CROSS APPLY operator, it will do a right table expression to the left table rows. This will join all the customers with the their top 10 most recent invoices. However, if they don’t have any invoices, they will not appear.

select C.CustomerId, subtab.invoiceid
FROM Customers as C
CROSS APPLY
(SELECT TOP 10 I.invoiceid, I.CustomerId,
FROM invoices as I
WHERE I.CustomerId = C.CustomerId
invoice BY I.invoiceid desc) as subtab;

The result of this query will be the top 10 most recent invoices for each customer, so long as each customer has one.

CustomerId

InvoiceId

Cust1

100

Cust1

99

Cust2

101

Cust2

98

Cust2

97

Using the OUTER APPLY operator, will do the same as a CROSS APPLY but it will show ALL customers, even ones without invoices. In the place of the invoiceid, the database will return NULL.

select C.CustomerId, subtab.invoiceid
FROM Customers as C
OUTER APPLY
(SELECT TOP 10 I.invoiceid, I.CustomerId,
FROM invoices as I
WHERE I.CustomerId = C.CustomerId
invoice BY I.invoiceid desc) as subtab;

The result of this query will be that you will have a list of all customers and where a customer has one or more invoices, you will have the top 10 per customer.

CustomerId

InvoiceId

Cust3

NULL

Cust1

100

Cust1

99

Cust2

101

Cust2

98

Cust2

97