Functional Programming Data Aggregation and Nested Queries Ivan Yonkov Technical Trainer Software University

Table of Contents
1. LINQ Performance Benchmarks
2. Data Grouping
   1. Group By Clause
3. Nested Queries
   1. Declarative
   2. SelectMany()

LINQ Performance Benchmark

LINQ Performance Benchmark
- LINQ extension methods extend every implementation of IEnumerable<T> in a consistent manner
- Because of this interface, all the extended collections can be enumerated
- The extension methods rely on enumeration to do their work
- E.g. to determine the number of elements in a sequence that does not expose its own count, LINQ's Count() method enumerates the whole sequence
- In most cases the methods are not adapted to the specifics of the concrete collection they are called on

LINQ Performance Benchmark (2)
- Reading the Count property directly on a list takes only one step
- The Count() extension method is slower in comparison

    Stopwatch sw = new Stopwatch();
    sw.Start();
    int cnt = nums.Count; // 10M elements
    sw.Stop();
    Console.WriteLine(sw.Elapsed);

    sw.Restart(); // reset the stopwatch so the first measurement is not included
    cnt = nums.Count();
    sw.Stop();
    Console.WriteLine(sw.Elapsed);

LINQ Performance Benchmark (3)
- LINQ's Count() source code: c/System/Linq/Count.cs

    using (IEnumerator<TSource> e = source.GetEnumerator())
    {
        checked
        {
            while (e.MoveNext()) count++;
        }
    }
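The excerpt above is the enumeration fallback. As a rough illustration, the sketch below shows how such a method can first check whether the source already knows its size and only then fall back to walking the sequence. The names CountSketch, CountOf and Numbers are made up for this example, and the code is a simplified approximation, not the actual framework source.

```csharp
using System;
using System.Collections.Generic;

public static class CountSketch
{
    // Simplified, hypothetical re-implementation for illustration only.
    public static int CountOf<T>(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // Fast path: a collection already knows how many elements it holds.
        if (source is ICollection<T> collection)
            return collection.Count;

        // Fallback: walk the whole sequence, one MoveNext() per element.
        int count = 0;
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            checked
            {
                while (e.MoveNext()) count++;
            }
        }
        return count;
    }

    // A lazily generated sequence that is not an ICollection<T>.
    public static IEnumerable<int> Numbers(int n)
    {
        for (int i = 0; i < n; i++) yield return i;
    }

    public static void Main()
    {
        Console.WriteLine(CountOf(new List<int> { 1, 2, 3 })); // fast path, prints 3
        Console.WriteLine(CountOf(Numbers(5)));                // enumerates, prints 5
    }
}
```

Even on the fast path the extension method pays for a type check and an interface call, which is one reason a direct property read can still measure faster.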

LINQ Performance Benchmark (4)
- Taking a value by key from a dictionary takes only one step
- The FirstOrDefault() extension method is slower in comparison

    sw = new Stopwatch();
    sw.Start();
    string name = names["name_1000"]; // 10k names
    sw.Stop();
    Console.WriteLine(sw.Elapsed);

    sw.Restart(); // reset the stopwatch so the first measurement is not included
    name = names.Keys.FirstOrDefault(k => k == "name_1000");
    sw.Stop();
    Console.WriteLine(sw.Elapsed);
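A self-contained version of this comparison might look like the sketch below. The LookupBenchmark class, the BuildNames helper and the choice of int values are assumptions made for the example; the measured times depend on the machine, so no expected output is given.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class LookupBenchmark
{
    // Builds a dictionary with keys "name_0" .. "name_{n-1}" (hypothetical test data).
    public static Dictionary<string, int> BuildNames(int n)
    {
        var names = new Dictionary<string, int>();
        for (int i = 0; i < n; i++) names["name_" + i] = i;
        return names;
    }

    public static void Main()
    {
        var names = BuildNames(10000);

        // Hash-based lookup through the indexer.
        var sw = Stopwatch.StartNew();
        int byKey = names["name_1000"];
        sw.Stop();
        Console.WriteLine("Indexer:          " + sw.Elapsed);

        // Linear scan over the keys until the predicate matches.
        sw.Restart();
        string found = names.Keys.FirstOrDefault(k => k == "name_1000");
        sw.Stop();
        Console.WriteLine("FirstOrDefault(): " + sw.Elapsed);

        Console.WriteLine(byKey + " " + found);
    }
}
```

A single Stopwatch reading of one call is noisy; repeating each lookup many times and averaging would give a fairer comparison.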

LINQ Performance Benchmark (5)
- LINQ's FirstOrDefault() source code: c/System/Linq/First.cs
- If the source is an ordered enumerable, the call is delegated to that type's own FirstOrDefault(); otherwise the elements are scanned one by one until the predicate matches

    OrderedEnumerable<TSource> ordered = source as OrderedEnumerable<TSource>;
    if (ordered != null) return ordered.FirstOrDefault(predicate);
    foreach (TSource element in source)
    {
        if (predicate(element)) return element;
    }

Data Grouping

- Data grouping is the concept of aggregating data by association
- The concept is available in any data manipulation tool and data store, e.g. databases
- Most of the popular databases use a declarative language called SQL

    SELECT FirstName, LastName, Age FROM Students

    FirstName  LastName  Age
    Pesho      Petrov    22
    Dragan     Cankov    82

Data Grouping (2)
- Usually in the previous scenario students can be grouped by certain criteria (e.g. average age by FirstName)

    SELECT FirstName, AVG(Age) FROM Students GROUP BY FirstName

    FirstName  AVG(Age)
    Ivan       28
    Petar      26
    Georgi     24
    Maria      18

Data Grouping (2)
- Grouping can be applied to a data collection using the GroupBy extension method or the group keyword
- The group keyword is followed by the value that should be added to that particular group
- The by clause specifies the key (association) by which the data is grouped

    from {rangeVariable} in {collection}
    group {value} by {key}
    into {groupVariable}
    select {groupVariable}

Data Grouping (3)
- For instance, if the task is to group a collection of cities by their first letter:
    - The group keyword is followed by the value added to each group (the city)
    - The by clause is followed by the key (the first letter of that city)

    var citiesByLetter =
        from city in cities
        group city by city[0]
        into citiesWithLetter
        select citiesWithLetter;

Data Grouping (7)
- The previous code results in an enumerable collection of groups
- Each group consists of
    - A char as a key (the first letter of the city)
    - An enumerable of strings (each city that starts with that letter)
- The collection can be enumerated. Each value will be a group
- The group
    - Has a Key property: the first letter (char)
    - Can be enumerated to return each city name
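A minimal sketch of grouping cities by first letter and enumerating the result is shown below. The CityGroups class, the GroupByFirstLetter helper and the sample city names are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CityGroups
{
    // Groups city names by their first character (assumes no empty strings).
    public static IEnumerable<IGrouping<char, string>> GroupByFirstLetter(IEnumerable<string> cities)
    {
        return from city in cities
               group city by city[0]
               into citiesWithLetter
               select citiesWithLetter;
    }

    public static void Main()
    {
        // Hypothetical sample data.
        var cities = new[] { "Sofia", "Plovdiv", "Varna", "Pleven", "Sliven" };

        foreach (var group in GroupByFirstLetter(cities))
        {
            // group.Key is the first letter; enumerating the group yields its cities.
            Console.WriteLine(group.Key + ": " + string.Join(", ", group));
        }
        // S: Sofia, Sliven
        // P: Plovdiv, Pleven
        // V: Varna
    }
}
```

Groups come out in the order in which each key first appears in the source, and the cities inside a group keep their original relative order.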

Data Grouping (10)
- Let's implement the grouping from the first slides: the average age of students by their first name
- We have the following definition of a Student class

Data Grouping (11)
- And the following collection
    - Petar: (22+30)/2 = 52/2 = 26
    - Georgi: (20+38)/2 = 58/2 = 29
    - Ivan: (24)/1 = 24
    - Mimi: ( )/3 = 54/3 = 18

Data Grouping (12)
- We need to group Age by FirstName
- The result will have FirstName as a key and an enumerable of Ages as a value
- Then we need to aggregate the enumerable of Ages to their average
- An anonymous object can be returned instead of an IGrouping

Data Grouping (13)
- The result will be an enumerable of anonymous objects
- The resulting enumerable can be enumerated and each anonymous object printed
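The steps above can be sketched as follows. The Student class is a hypothetical reconstruction with only the two properties the text mentions, and the three ages used for Mimi (17, 18, 19) are assumed values chosen so that they sum to 54; the other ages come from the collection slide.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Student class; the original definition is not shown in the text.
public class Student
{
    public string FirstName { get; set; }
    public int Age { get; set; }
}

public static class AverageAgeQuery
{
    public static List<Student> SampleStudents()
    {
        return new List<Student>
        {
            new Student { FirstName = "Petar", Age = 22 },
            new Student { FirstName = "Georgi", Age = 20 },
            new Student { FirstName = "Ivan", Age = 24 },
            new Student { FirstName = "Mimi", Age = 17 },   // assumed age
            new Student { FirstName = "Petar", Age = 30 },
            new Student { FirstName = "Georgi", Age = 38 },
            new Student { FirstName = "Mimi", Age = 18 },   // assumed age
            new Student { FirstName = "Mimi", Age = 19 },   // assumed age
        };
    }

    public static Dictionary<string, double> AverageAgeByFirstName(IEnumerable<Student> students)
    {
        // Group the ages by first name, then project each group
        // into an anonymous object holding the key and the average.
        var averages =
            from student in students
            group student.Age by student.FirstName
            into ages
            select new { FirstName = ages.Key, AverageAge = ages.Average() };

        // Anonymous types cannot leave the method, so copy them into a dictionary.
        return averages.ToDictionary(a => a.FirstName, a => a.AverageAge);
    }

    public static void Main()
    {
        foreach (var pair in AverageAgeByFirstName(SampleStudents()))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```

With this sample data the averages are 26 for Petar, 29 for Georgi, 24 for Ivan and 18 for Mimi, matching the arithmetic on the collection slide.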

Data Grouping (14)
- The result is as expected

Data Grouping (15)
- The functional approach requires the GroupBy method
- The delegates it takes have the form:
    - Func<TSource, TKey>, Func<TSource, TElement>
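A sketch of the same average-age grouping written with the GroupBy extension method follows. It uses the overload that takes a key selector and an element selector; the Student class, the GroupByMethodDemo class and the sample data are assumptions for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Student class with only the properties needed here.
public class Student
{
    public string FirstName { get; set; }
    public int Age { get; set; }
}

public static class GroupByMethodDemo
{
    public static Dictionary<string, double> AverageAgeByFirstName(IEnumerable<Student> students)
    {
        return students
            .GroupBy(s => s.FirstName, s => s.Age)        // keySelector, elementSelector
            .ToDictionary(g => g.Key, g => g.Average());  // aggregate each group of ages
    }

    public static void Main()
    {
        var students = new List<Student>
        {
            new Student { FirstName = "Petar", Age = 22 },
            new Student { FirstName = "Ivan", Age = 24 },
            new Student { FirstName = "Petar", Age = 30 },
        };

        foreach (var pair in AverageAgeByFirstName(students))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```

The first lambda picks the key of each group and the second picks what is stored inside it, which mirrors the `group {value} by {key}` form of the query syntax.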

Nested Queries

- Very often we need to deal with the collection matching problem
    - To sort an array
    - To find products in one shop that are not present in any other
    - To find how many people in a collection of people are dating any of the others in that collection
- We will talk about the last one
- The Student definition is expanded with a string property holding the name of their current date

Nested Queries (2)
- The Student definition now looks like
- The GoesOutWith property holds the FirstName of another Student instance in the pool

Nested Queries (3)
- The students collection now has students with their dates

Nested Queries (4)
- Our task is to take each student and find all other students who go out with this student (or at least with someone of the same FirstName)
- For instance, we start traversing the collection with "Petar"
    - It seems that "Mimi" and "Geri" are dating "Petar"
- Then we hit "Georgi"
    - It seems that "Kali" and "Vanq" are dating a student with first name "Georgi" (never mind that it may not be the same Georgi)
- To find that out we need to traverse the collection all over again on each iteration
- This is called a nested query

Nested Queries (5)
- For each range variable student, introduce a nested range variable otherStudent to try the matchmaking
- Find those otherStudents whose GoesOutWith property equals the student's FirstName property

Nested Queries (6)
- The association (key) we will group by is the student's FirstName
- The values we will add to that association are the FirstNames of the otherStudents who date this student
- The result should have a string as a key and an enumerable of strings as a value
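Put together, the nested query described above might look like the sketch below. The Student class, the NestedQueryDemo class and the exact pairing of students are assumptions; only the names and the fact that "Mimi" and "Geri" date "Petar" while "Kali" and "Vanq" date "Georgi" come from the text. Two students named Georgi are included on purpose.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Student class; the original definition is not shown in the text.
public class Student
{
    public string FirstName { get; set; }
    public string GoesOutWith { get; set; }
}

public static class NestedQueryDemo
{
    public static List<Student> SampleStudents()
    {
        return new List<Student>
        {
            new Student { FirstName = "Petar", GoesOutWith = "Mimi" },
            new Student { FirstName = "Georgi", GoesOutWith = "Kali" },
            new Student { FirstName = "Mimi", GoesOutWith = "Petar" },
            new Student { FirstName = "Geri", GoesOutWith = "Petar" },
            new Student { FirstName = "Kali", GoesOutWith = "Georgi" },
            new Student { FirstName = "Vanq", GoesOutWith = "Georgi" },
            new Student { FirstName = "Georgi", GoesOutWith = "Vanq" },
        };
    }

    public static Dictionary<string, List<string>> DatesByFirstName(List<Student> students)
    {
        var groups =
            from student in students
            from otherStudent in students   // nested range variable
            where otherStudent.GoesOutWith == student.FirstName
            group otherStudent.FirstName by student.FirstName
            into dates
            select dates;

        return groups.ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        foreach (var pair in DatesByFirstName(SampleStudents()))
        {
            Console.WriteLine(pair.Key + ": " + string.Join(", ", pair.Value));
        }
    }
}
```

Because "Georgi" appears twice as a key, the inner traversal runs twice for that name and the Georgi group ends up holding "Kali" and "Vanq" two times each.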

Nested Queries (8)
- Enumerate the group collection

Nested Queries (9)
- The result has duplicates because some keys appear twice in the collection, so the nested query finds their corresponding dates once again

Nested Queries (10)
- The same can be achieved via the SelectMany() extension method
- It takes two delegates as arguments
    - Func<TSource, IEnumerable<TCollection>> collectionSelector
    - Func<TSource, TCollection, TResult> resultSelector
- The implementation can be translated to

    (rangeVar) => collection,
    (rangeVar, nestedRangeVar) => resultObject
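A minimal sketch of the matchmaking query written with SelectMany() is given below. The Student class, the SelectManyDemo class and the sample data are assumptions made for the example; the collection selector returns the students dating the current one, and the result selector pairs the two names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Student class with only the properties needed here.
public class Student
{
    public string FirstName { get; set; }
    public string GoesOutWith { get; set; }
}

public static class SelectManyDemo
{
    public static Dictionary<string, List<string>> DatesByFirstName(List<Student> students)
    {
        return students
            .SelectMany(
                // collectionSelector: everyone who goes out with this student
                student => students.Where(other => other.GoesOutWith == student.FirstName),
                // resultSelector: pair the outer element with each inner element
                (student, other) => new { Name = student.FirstName, Date = other.FirstName })
            .GroupBy(pair => pair.Name, pair => pair.Date)
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        var students = new List<Student>
        {
            new Student { FirstName = "Petar", GoesOutWith = "Mimi" },
            new Student { FirstName = "Mimi", GoesOutWith = "Petar" },
            new Student { FirstName = "Geri", GoesOutWith = "Petar" },
        };

        foreach (var pair in DatesByFirstName(students))
        {
            Console.WriteLine(pair.Key + ": " + string.Join(", ", pair.Value));
        }
    }
}
```

The compiler translates a query with two from clauses into a SelectMany() call of this shape, so both forms traverse the collection once per outer element.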

Nested Queries (12)
- The usual implementation of SelectMany() uses nested loops
- Source code: q/src/System/Linq/SelectMany.cs

    foreach (TSource element in source)
    {
        foreach (TCollection subElement in collectionSelector(element))
        {
            yield return resultSelector(element, subElement);
        }
    }

Summary
- LINQ can be slower when used instead of a data structure's built-in functionality
- Grouping places data under an association (key)
    - Can be combined with data aggregation
- Nested queries usually match an element with any other element in the collection
- LINQ is open source
    - Take a look on GitHub

Functional Programming Part 2

License
- This course (slides, examples, demos, videos, homework, etc.) is licensed under the "Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International" license
- Attribution: this work may contain portions from
    - "Fundamentals of Computer Programming with C#" book by Svetlin Nakov & Co. under CC-BY-SA license
    - "OOP" course by Telerik Academy under CC-BY-NC-SA license

Free Software University
- Software University Foundation: softuni.org
- Software University, High-Quality Education, Profession and Job for Software Developers: softuni.bg
- Software University on Facebook: facebook.com/SoftwareUniversity
- Software University on YouTube: youtube.com/SoftwareUniversity
- Software University Forums: forum.softuni.bg