# CS16: Introduction to Data Structures & Algorithms

## Presentation on theme: "CS16: Introduction to Data Structures & Algorithms"— Presentation transcript:

CS16: Introduction to Data Structures & Algorithms
Thursday, March 5, 2015 GRAPHS CS16: Introduction to Data Structures & Algorithms

Outline What is a graph? Terminology Properties Graph Types
Thursday, March 5, 2015 Outline What is a graph? Terminology Properties Graph Types Representations Performance BFS/DFS Applications

What is a graph? A graph defined by:
Thursday, March 5, 2015 What is a graph? A graph defined by: a set of vertices (V) a set of edges (E) Vertices and edges can both store data ORD PVD MIA DFW SFO LAX LGA HNL 849 802 1387 1743 1843 1099 1120 1233 337 2555 142 Example: A vertex represents an airport and stores the 3-letter airport code An edge represents a flight route between 2 airports and stores the mileage of the route

Terminology End vertices (or endpoints) of an edge
Thursday, March 5, 2015 Terminology End vertices (or endpoints) of an edge U and V are the endpoints of a Incident edges on a vertex a, d, and b are incident on V Adjacent vertices U and V are adjacent Degree of a vertex X has degree 5 Parallel (multiple) edges h and i are parallel edges Self-loop j is a self-loop X U V W Z Y a c b e d f g h i j

Terminology (2) Path Simple path Examples
Thursday, March 5, 2015 Terminology (2) Path sequence of alternating vertices and edges begins with a vertex ends with a vertex each edge is preceded and followed by its endpoints Simple path path such that all its vertices and edges are visited at most once Examples P1=(V–b->X–h->Z) is a simple path P2=(U–c->W–e->X–g->Y–f->W–d->V) is not a simple path, but is still a path P1 X U V W Z Y a c b e d f g h P2 Shortest path from U to X Another path? A third? A fourth?

Thursday, March 5, 2015 Graph Properties A graph G’ = (V’, E’) is a subgraph of G = (V, E) if V’ is contained in V and E’ is contained in E A graph is connected if there exists a path from each vertex to every other vertex in G. A cycle is a path that starts and ends at the same vertex A graph is acyclic if no subgraph is a cycle

Thursday, March 5, 2015 Graph Properties (2) A spanning tree is a subgraph that contains all the graph’s vertices in a single tree and enough edges to connect each vertex without cycles. Graph G Spanning Tree for G ORD PVD MIA DFW LGA ORD PVD MIA DFW LGA

Thursday, March 5, 2015 Graph Properties (3) A spanning forest is a subgraph that consists of a spanning tree in each connected component of a graph Spanning forests never contain cycles. Keep in mind that this might not be the “best” or shortest path to each node. ORD PVD MIA DFW SFO LAX LGA HNL

Thursday, March 5, 2015 Graph Properties (4) A graph G is a tree if and only if it satisfies any of the following 5 conditions: G has V-1 edges and no cycles G has V-1 edges and is connected G is connected, but removing any edge disconnects it G is acyclic, but adding any edges creates a cycle Exactly one simple path connects each pair of vertices in G

D = deg(v1) + deg (v2) + … + deg(vp).
Thursday, March 5, 2015 Graph Proof 1 Prove that the sum of the degrees of all the vertices of some graph G equals twice the number of edges of G. Let V = {v1, v2, …, vp}, where p is the number of vertices in G. When computing the total sum of degrees D, we see that D = deg(v1) + deg (v2) + … + deg(vp). Notice that each edge is counted twice in D: one for each of the two vertices incident on an edge. Thus D = 2*|E|, where |E| is the number of edges.

D ≥ 2(k+1)  2|E| ≥ 2(k+1)  |E| ≥ k + 1  |E| ≥ k
Thursday, March 5, 2015 Graph Proof 2 Using induction prove that in every connected graph: |E| ≥ |V| – 1 Base Case: A graph with one vertex will have 0 edges, and 0 ≥ 1 – 1 Assume: In a graph with k vertices, |E| ≥ k – 1 Let G be any connected graph with k + 1 vertices. We must show that |E| ≥ k. Let u be the vertex of minimum degree in G. If deg(u) = 1: u must be a leaf. Let G’ be G without u and its 1 incident edge. G’ is clearly still connected, because we only removed a leaf. By our assumption, G’ has at least k – 1 edges, which means that G has at least k edges. If deg(u) ≥ 2: Every vertex has at least two incident edges, so the total degree (D) of the graph is: D ≥ 2(k+1) Since we know from the last proof that D = 2*|E|: D ≥ 2(k+1)  2|E| ≥ 2(k+1)  |E| ≥ k + 1  |E| ≥ k True for 0, and for k+1 assuming k, so true for all n≥0.

Graph Types Directed graph Undirected graph all the edges are directed
Thursday, March 5, 2015 Graph Types Directed graph all the edges are directed ordered pair of vertices (u,v) first vertex u is the origin second vertex v is the destination e.g., a flight Undirected graph all the edges are undirected unordered pair of vertices (u,v) e.g., a flight route flight AA 600 ORD PVD ORD PVD route 849 miles

Graph Types: Directed Graph
Thursday, March 5, 2015 Graph Types: Directed Graph AA 123 ORD SFO US 32 SW 223 LH 21 LAX PA 23 DFW Here is a directed graph – note that each edge now has a beginning and an end it’s directed! Some of the types of graphs you listed are directed and some are undirected

Directed Acyclic Graph (DAG)
Thursday, March 5, 2015 Directed Acyclic Graph (DAG) CS32 CS16 CS18 CS15 CS17 CS19 CS123 CS224 CS141 CS242 CS125 CS128 CS22 means ‘is a prerequisite for’ And here is a directed acyclic graph directed but no cycles Each node is a task (or cs course) Each edge is an ordering (15 must come before 16) Topological ordering of the vertices is a list of all the vertices where the order implied by all the edges is preserved Everyone figure one out now (copy graph to board in the meantime) (left to right works because arrows all point that way) Raise your hand when you are done let’s write down a few check that they are correct We’ll talk much more about DAGs in future lectures… Acyclic = without cycles

Undirected Graph 849 PVDF 1843 ORD 142 SFO 802 LGA 1743 337 1387 2555
Thursday, March 5, 2015 Undirected Graph ORD PVDF MIA DFW SFO LAX LGA HNL 849 802 1387 1743 1843 1099 1120 1233 337 2555 142

Graph Representations
Thursday, March 5, 2015 Graph Representations Vertices usually stored in a list or hash set Three common methods of representing which vertices are adjacent: Edge list (or set) Adjacency lists (or sets) Adjacency matrix

Thursday, March 5, 2015 Edge List (or Set) Way of representing which vertices are adjacent to each other as a list of pairs Each element in the list is a single edge (a,b) from node a to node b Since order of this list doesn’t matter, we can use a single hash set instead to improve runtime Since order doesn’t matter, can be stored in a hash set, to make lookup faster Edge List: [ (1,1), (1,2), (1,5), (2,3), (2,5), (3,4), (4,5), (4,6) ]

Thursday, March 5, 2015 Adjacency Lists (or Sets) Another way of representing which vertices are adjacent to each other Each vertex has an associated list representing the vertices it neighbors Since order of these lists doesn’t matter, they could be hash sets instead to make lookup faster 1 2 3 4 5 6 [1,2,5] [1,3,5] [2,4] [3,5,6] [1,2,4] [4]

A = { (1,1), (1,2), (1,5), (2,3), (2,5), (3,4), (4,5), (4,6) }
Thursday, March 5, 2015 Sets Remember finding an element in a hashset is a constant operation: A = { (1,1), (1,2), (1,5), (2,3), (2,5), (3,4), (4,5), (4,6) } A.contains((4,5)) is O(1) Adjacency List elements can also be hashsets: B = [ {1, 2, 5}, {1, 3, 5}, {2, 4}, {3, 5, 6}, {1, 2, 4}, {4} ] B[2].contains(5) is still O(1) They can also be implemented as hashsets of hashsets! Oh boy!

Thursday, March 5, 2015 Adjacency Matrices Another way of representing which vertices are adjacent to each other Matrix is n x n, where n is the number of nodes in the graph true = edge false = no edge If m[u][v] is true, then node u has an edge to node v (and if the graph is undirected, we can assume the opposite is true as well)

Thursday, March 5, 2015 Adjacency Matrices (2) 1 2 3 4 5 6 T F

Thursday, March 5, 2015 Adjacency Matrices (3) We initialize the matrix to the predicted size of our graph (we can always expand the array later) When a vertex is added to the graph, we reserve a row and column of the matrix for that vertex When a vertex is removed, its row and column are set to false Since we can’t remove rows/columns from arrays, we keep a separate list of vertices to represent vertices actually present in the graph Added last bullet point to make it more clear what the students would actually do if using an adjacency matrix. Previously the last bullet read: “When a vertex is removed, its row and column are freed.”

Main Methods of the Graph ADT
Thursday, March 5, 2015 Main Methods of the Graph ADT Vertices and edges store values Ex: edge weights Accessor methods vertices() edges() incidentEdges(vertex) areAdjacent(v1, v2) endVertices(edge) opposite(vertex, edge) Update methods insertVertex(value) insertEdge(v1, v2) Note: Sometimes this function also takes a value! insertEdge(v1, v2, val) removeVertex(vertex) removeEdge(edge) See comments

24 Big-O Performance: Edge Set Adjacency Sets Adjacency Matrix Overall Space |V| + |E| |V|2 vertices()1 1* edges() |E| incidentEdges(v) |V| areAdjacent (v1, v2) 1 insertVertex(val) insertEdge(v1, v2) removeVertex(v) removeEdge(v1, v2) 1 In all three approaches, we maintain an additional list or set of vertices. * in-place Thursday, March 5, 2015

Big-O Performance (Edge Set)
Thursday, March 5, 2015 Big-O Performance (Edge Set) Operation Runtime Explanation vertices() O(1) Return the set of vertices edges() Return the set of edges incidentEdges(v) O(|E|) Iterate through each edge and check if it contains vertex v areAdjacent(v1,v2) Check if edge (v1,v2) exists in the set insertVertex(v) Add vertex v to the vertex list insertEdge(v1,v2) Add element (v1,v2) to the set removeVertex(v) Iterate through each edge and remove it if it has vertex v removeEdge(v1,v2) Remove edge (v1,v2)

Thursday, March 5, 2015 Big-O Performance (Adjacency Set) Operation Runtime Explanation vertices() O(1) Return the set of vertices edges() O(|E|) Concatenate each vertex with its subsequent vertices incidentEdges(v) Return v’s edge set areAdjacent(v1,v2) Check if v2 is in v1’s set insertVertex(v) Add vertex v to the vertex set insertEdge(v1,v2) Add v1 to v2’s edge set and vice versa removeVertex(v) O(|V|) Remove v from each of its adjacent vertices’ sets and remove v’s set removeEdge(v1,v2) Remove v1 from v2’s set and vice versa

Thursday, March 5, 2015 Big-O Performance (Adjacency Matrix) Operation Runtime Explanation vertices() O(1) Return the set of vertices edges() O(|V|2) Iterate through the entire matrix incidentEdges(v) O(|V|) Iterate through v’s row or column to check for trues Note: row/col are the same in an undirected graph. areAdjacent(v1,v2) Check index (v1,v2) for a true insertVertex(v) O(|V|)* Add vertex v to the matrix (*amortized) insertEdge(v1,v2) Set index (v1,v2) to true removeVertex(v) Set v’s row and column to false and remove v from the vertex list removeEdge(v1,v2) Remove v1 from v2’s set and vice versa See comment

Thursday, March 5, 2015 BFT/DFT Do you remember when we performed breadth first traversals and depth first traversals on trees? These traversals can also be applied to graphs! (A tree is just a special kind of graph, after all…) Can be used to traverse and “visit” nodes in a graph (BFT / DFT) Often used to search for a certain value in a graph (BFS / DFS)

Breadth First Traversal: Tree vs. Graph
29 Breadth First Traversal: Tree vs. Graph function treeBFT(root): //Input: Root node of tree //Output: Nothing Q = new Queue() Q.enqueue(root) while Q is not empty: node = Q.dequeue() doSomething(node) enqueue node’s children function graphBFT(start): //Input: start vertex start.visited = true Q.enqueue(start) for neighbor in node’s adjacent nodes: if not neighbor.visited: neighbor.visited = true Q.enqueue(neighbor) Mark nodes as visited, otherwise you might loop forever! doSomething(node) could print, add to list, decorate nodes, etc. Thursday, March 5, 2015

Thursday, March 5, 2015 Depth First Traversal One way to perform a DFT a graph is to use a stack instead of a queue as in BFT DFT can also be done recursively function recursiveDFT(node): // Input: start node // Output: Nothing node.visited = true for neighbor in node’s adjacent vertices: if not neighbor.visited: recursiveDFT(neighbor)

Applications Flight networks GPS maps The internet Facebook Graph
Thursday, March 5, 2015 Applications Flight networks GPS maps The internet pages are nodes links are edges Facebook Graph Modeling molecular structures atoms are nodes bonds are edges So many more!!

Applications: Flight Paths Exist
Thursday, March 5, 2015 Applications: Flight Paths Exist Problem: Given an undirected graph with airport (vertices) and flights (edges), determine whether it is possible to fly from one airport to another. Strategy: Use a breadth first search starting at the first node, and determine if the ending airport is ever visited. JFK PWM ORD PVD MIA DFW SFO LAX LGA HNL

Applications: Flight Paths Exist (2)
Thursday, March 5, 2015 Applications: Flight Paths Exist (2) Task: Is there a path from SFO to PVD? ORD PVD MIA DFW SFO LAX LGA HNL PWM JFK

Applications: Flight Paths Exist (2)
Thursday, March 5, 2015 Applications: Flight Paths Exist (2) Task: Is there a path from SFO to PVD? ORD PVD MIA DFW SFO LAX LGA HNL PWM JFK

Applications: Flight Paths Exist (2)
Thursday, March 5, 2015 Applications: Flight Paths Exist (2) Task: Is there a path from SFO to PVD? ORD PVD MIA DFW SFO LAX LGA HNL PWM JFK

Applications: Flight Paths Exist (2)
Thursday, March 5, 2015 Applications: Flight Paths Exist (2) Task: Is there a path from SFO to PVD? YES, now how do we do it in code? ORD PVD MIA DFW SFO LAX LGA HNL PWM JFK

Flight Paths Exist Pseudocode
Thursday, March 5, 2015 Flight Paths Exist Pseudocode function pathExists(from, to): //Input: from: vertex, to: vertex //Output: true if path exists, false otherwise Q = new Queue() from.visited = true Q.enqueue(from) while Q is not empty: airport = Q.dequeue() if airport == to: return true for neighbor in airport’s adjacent nodes: if not neighbor.visited: neighbor.visited = true Q.enqueue(neighbor) return false

Applications: Flight Layovers
Thursday, March 5, 2015 Applications: Flight Layovers Problem: Given an undirected graph with airport (vertices) and flights (edges), decorate the vertices with the least number of stops from a given source. If there is no way to get to a certain airport, decorate node with infinity. Strategy: Decorate each node with an initial ‘stop value’ of infinity. Use breadth first search to decorate each node with a ‘stop value’ of one greater than its previous. JFK PWM ORD PVD MIA DFW SFO LAX LGA HNL

Flight Layover Pseudocode
Thursday, March 5, 2015 Flight Layover Pseudocode function numStops(G, source): //Input: G: graph, source: vertex //Output: Nothing //Purpose: decorate each vertex with the lowest number of // layovers from source. for every node in G: node.stops = infinity Q = new Queue() source.stops = 0 source.visited = true Q.enqueue(source) while Q is not empty: airport = Q.dequeue() for neighbor in airport’s adjacent nodes: if not neighbor.visited: neighbor.visited = true neighbor.stops = airport.stops + 1 Q.enqueue(neighbor)