Presentation is loading. Please wait.

Presentation is loading. Please wait.

A Linear Regression Algorithm Using Windowing Functions KEVIN MCCARTY.

Similar presentations


Presentation on theme: "A Linear Regression Algorithm Using Windowing Functions KEVIN MCCARTY."— Presentation transcript:

1 A Linear Regression Algorithm Using Windowing Functions KEVIN MCCARTY

2 What are Windowing Functions Introduced in SQL Server 2005 Used to add artificial values to a set in order to provide a specific ordering of rows ◦This was a difficult operation before, requiring liberal use of temp tables and a lot of code ◦Combines a “ranking” function with the OVER clause The ranking functions are: ◦ROW_NUMBER ◦RANK ◦DENSE_RANK ◦NTILE Windowing functions can work on individual rows or as part of an aggregate query. There are aggregate, ranking, distribution and offset functions – but we will only deal with ranking functions today 

3 What The Ranking Functions Do ROW_NUMBER – Based upon the order, attaches an ascending row value, similar to an identity RANK – Attaches an ascending row value based upon order, applying similar values to ties, in which case introduces gaps in the rank DENSE_RANK – Similar to RANK except it does not introduce gaps NTILE(n) – Splits data into subsets or groups and numbers each subgroup accordingly

4 The OVER clause Use the OVER clause to define the “window” or specific set of rows to apply the windowing function ◦OVER must be combined with an ORDER BY clause (which makes sense) SELECT BusinessEntityID AS SalesID, FirstName + ' ' + LastName AS FullName, SalesLastYear, ROW_NUMBER() OVER(ORDER BY SalesLastYear ASC) AS RowNumber FROM Sales.vSalesPerson;

5 Windowing Components Conceptual relationship between window elements: Result set (OVER) Window partition (PARTITION BY) Frame (ROWS BETWEEN) Note: Shameless theft from Microsoft courseware

6 Basic Syntax OVER ( [ ] -- optional [ ] [ ] - optional ) SELECT BusinessEntityID AS SalesID, FirstName + ' ' + LastName AS FullName, SalesLastYear, ROW_NUMBER() OVER(ORDER BY SalesLastYear ASC) AS RowNumber FROM Sales.vSalesPerson;

7 Partitioning Use Partitioning to limit sets of rows to those with the same value in a partitioning column ◦Another term for this is “framing” ◦Useful for isolating sets with specific characteristics without having to run a separate query SELECT BusinessEntityID AS SalesID, FirstName + ' ' + LastName AS FullName, SalesLastYear, TerritoryGroup, ROW_NUMBER() OVER(PARTITION BY TerritoryGroup ORDER BY SalesLastYear ASC) AS RowNumber FROM Sales.vSalesPerson;

8 Ranking Functions in Action

9 A Linear Regression Algorithm Fuzzy logic routines are often combined with optimization techniques

10 Why Use Ranking Functions? In this case, we have a fuzzy car driving around a track It’s goal is to navigate the track without hitting the sides Sometimes fuzzy algorithm is bad and needs to be fixed Other times algorithm is OK but can be better So how to optimize?

11 Why Use Ranking Functions? Note all the back-and-forth maneuvering. If we can reduce that we can optimize the car’s path. Linear regression allows us to determine a more optimal path The new path now becomes a “goal” for the algorithm Use genetic/memetic algorithm to tweak algorithm ◦In which case the “goal” becomes the fitness function Ranking functions allow us to do this as a set-based Operation instead of point-by-point. ◦Faster and easier too

12 Boring Explanation Use the second derivative to get a brute-force linear regression The tricky part is finding true peaks and valleys in the data This is where ranking functions come in


Download ppt "A Linear Regression Algorithm Using Windowing Functions KEVIN MCCARTY."

Similar presentations


Ads by Google