Chapter 2 The Big Picture.

Overview
The big picture answers several questions:
What are data structures?
What data structures do we study?
What are Abstract Data Types?
Why Object-Oriented Programming (OOP) and Java for data structures?
How do I choose the right data structures?

2.1 What Are Data Structures
How many cities with more than 250,000 people lie within 500 miles of Dallas, Texas? How many people in my company make over $100,000 per year? Can we connect all of our telephone customers with less than 1,000 miles of cable? To answer questions like these, it is not enough to have the necessary information. We must organize that information in a way that allows us to find the answers in time to satisfy our needs. Representing information is fundamental to computer science. The primary purpose of most computer programs is not to perform calculations, but to store and retrieve information, usually as fast as possible. For this reason, the study of data structures and the algorithms that manipulate them is at the heart of computer science. Studying them helps you understand how to structure information to support efficient processing.

A data structure is an aggregation of data components that, together, constitute a meaningful whole. The components themselves may be data structures, and this decomposition stops at some “atomic” unit.
Atomic or primitive type: a data type whose elements are single, non-decomposable data items (they cannot be broken into parts).
Composite type: a data type whose elements are composed of multiple data items. For example, take two integers x and y (simple elements) to form a point (x, y).
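As a minimal sketch of the point example, the class below aggregates two primitive int values into one composite value. The class name Point and its accessor names are illustrative choices, not part of the original slides.

```java
// A composite type: each Point is built from two atomic (primitive) ints.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() { return x; }

    int getY() { return y; }

    @Override
    public String toString() {
        return "(" + x + ", " + y + ")";
    }
}
```

The int fields are the atomic units here: they cannot be decomposed further, while a Point can be decomposed into its x and y.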

2.2 What Data Structures Do We Study?
Data structures fall into two categories: linear and non-linear. The category is based on how the data is conceptually organized or aggregated.
Linear structures
List, Queue and Stack are linear collections. Each of them serves as a repository in which entries may be added or removed at will. They differ in how these entries may be accessed once they are added.

List
The List is a linear collection of entries in which entries may be added, removed, and searched for without restrictions. There are two kinds of list: the ordered list and the unordered list.
Queue
Entries may only be removed in the order in which they were added: a queue is a First In, First Out (FIFO) data structure. A queue does not support searching for an entry.

Stack
Entries may only be removed in the reverse of the order in which they were added: a stack is a Last In, First Out (LIFO) data structure. A stack does not support searching for an entry.
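The difference between FIFO and LIFO removal can be sketched with java.util.ArrayDeque, which can play either role. The class and method names below (LinearOrderDemo, drainAsQueue, drainAsStack) are hypothetical helpers introduced only for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class LinearOrderDemo {
    // Adds the entries to a queue, then removes them all: FIFO order.
    static List<Integer> drainAsQueue(int... entries) {
        Deque<Integer> queue = new ArrayDeque<>();
        for (int e : entries) {
            queue.addLast(e);
        }
        List<Integer> removed = new ArrayList<>();
        while (!queue.isEmpty()) {
            removed.add(queue.removeFirst());
        }
        return removed;
    }

    // Pushes the entries onto a stack, then pops them all: LIFO order.
    static List<Integer> drainAsStack(int... entries) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int e : entries) {
            stack.push(e);
        }
        List<Integer> removed = new ArrayList<>();
        while (!stack.isEmpty()) {
            removed.add(stack.pop());
        }
        return removed;
    }

    public static void main(String[] args) {
        System.out.println(drainAsQueue(1, 2, 3)); // prints [1, 2, 3]
        System.out.println(drainAsStack(1, 2, 3)); // prints [3, 2, 1]
    }
}
```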

Trees
A tree is a nonlinear structure with a unique starting node (the root), in which each node is capable of having many child nodes, and in which a unique path exists from the root to every other node. Trees are useful for representing hierarchical relationships among data items.
Root: the top node of a tree structure; a node with no parent.

Not a Tree

Binary Tree
Binary tree: a tree in which each node can have up to two child nodes, a left child node and a right child node.
Leaf: a tree node that has no children.

A Binary Tree

Binary Search Tree
A binary tree in which the key value in any node is greater than the key value in its left child and any of that child's descendants (the nodes in the left subtree) and less than the key value in its right child and any of that child's descendants (the nodes in the right subtree).
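The ordering rule above is what makes insertion and search straightforward: at each node, compare the key and go left or right. The following is a minimal sketch under the assumption of int keys with duplicates ignored; BstNode and BinarySearchTreeSketch are illustrative names, not classes from the slides.

```java
class BstNode {
    final int key;
    BstNode left;
    BstNode right;

    BstNode(int key) {
        this.key = key;
    }
}

class BinarySearchTreeSketch {
    // Inserts key into the subtree rooted at root and returns the subtree root.
    // Smaller keys go left, larger keys go right; duplicates are ignored.
    static BstNode insert(BstNode root, int key) {
        if (root == null) {
            return new BstNode(key);
        }
        if (key < root.key) {
            root.left = insert(root.left, key);
        } else if (key > root.key) {
            root.right = insert(root.right, key);
        }
        return root;
    }

    // Follows a single path from the root, discarding one subtree at each step.
    static boolean contains(BstNode root, int key) {
        BstNode current = root;
        while (current != null) {
            if (key == current.key) {
                return true;
            }
            current = (key < current.key) ? current.left : current.right;
        }
        return false;
    }
}
```

Note that this sketch does no balancing, so the tree's shape depends on the insertion order.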

Binary Search Tree

AVL Tree
An AVL tree is a height-balanced binary search tree.
The AVL tree derives its importance from the fact that keeping the tree height-balanced speeds up searching to a remarkable degree.

Heap: Heap as a Priority Queue
A priority queue is a specialization of the FIFO queue: entries are assigned priorities, and the entry with the highest priority is the one to leave first. The heap is a special type of binary tree.
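The "highest priority leaves first" behavior can be sketched with java.util.PriorityQueue, which the Java documentation describes as based on a priority heap. By default that class removes the smallest element first, so this sketch passes Comparator.reverseOrder() on the assumption that a larger number means a higher priority. PriorityQueueDemo and leaveOrder are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class PriorityQueueDemo {
    // Adds all priorities, then removes entries until the queue is empty.
    // Assumption: a larger int means a higher priority.
    static List<Integer> leaveOrder(int... priorities) {
        PriorityQueue<Integer> queue = new PriorityQueue<>(Comparator.reverseOrder());
        for (int p : priorities) {
            queue.add(p);
        }
        List<Integer> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll()); // poll() removes the head: the highest priority
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(leaveOrder(2, 9, 5)); // prints [9, 5, 2]
    }
}
```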

Complete binary tree
A complete binary tree is one in which every level but the last must have the maximum number of nodes possible at that level. The last level may have fewer than the maximum possible number of nodes, but they must be arranged from left to right without any empty spots.

Hash Table
Hash function: a function used to manipulate the key of an element in a list to identify its location in the list.
Hashing: the technique for ordering and accessing elements in a list in a relatively constant amount of time by manipulating the key to identify its location in the list.
Hash table: the term used to describe the data structure used to store and retrieve elements using hashing.

Using a Hash Function to Determine the Location of the Element in an Array
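One common way to turn a key into an array location is to reduce the key's hash code modulo the array length. The sketch below assumes String keys and uses Math.floorMod so that negative hash codes still map to a valid index. HashIndexDemo and indexFor are illustrative names, and the slides do not say which hash function they use.

```java
class HashIndexDemo {
    // Maps a key to a slot in an array of the given capacity.
    // floorMod keeps the result in the range 0 .. capacity - 1
    // even when hashCode() is negative.
    static int indexFor(String key, int capacity) {
        return Math.floorMod(key.hashCode(), capacity);
    }

    public static void main(String[] args) {
        // "a".hashCode() is 97, so with 10 slots the key lands in slot 7.
        System.out.println(indexFor("a", 10)); // prints 7
    }
}
```

Two different keys can map to the same index; handling such collisions is outside the scope of this sketch.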

Graphs
Graph: a data structure that consists of a set of nodes and a set of edges that relate the nodes to each other.
Vertex: a node in a graph.
Edge (arc): a pair of vertices representing a connection between two nodes in a graph.
Two kinds of graphs:
Undirected graph: a graph in which the edges have no direction.
Directed graph (digraph): a graph in which each edge is directed from one vertex to another (or the same) vertex.
A general tree is a special kind of graph, since a hierarchy is a special system of relationships among entities. Graphs may be used to model systems of physical connections such as computer networks and airline routes, as well as abstract relationships such as course prerequisite structures. Standard graph algorithms answer certain questions we may ask of the system.

Graphs

Graphs
Adjacent vertices: two vertices in a graph that are connected by an edge.
Path: a sequence of vertices that connects two nodes in a graph.
Complete graph: a graph in which every vertex is directly connected to every other vertex.
Weighted graph: a graph in which each edge carries a value.
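A minimal sketch of an undirected, unweighted graph follows, stored as an adjacency map from each vertex to the set of its adjacent vertices. The representation, the String vertex labels, and the names UndirectedGraphSketch, addEdge and areAdjacent are assumptions made for illustration; the slides do not prescribe an implementation.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class UndirectedGraphSketch {
    // Each vertex maps to the set of vertices it shares an edge with.
    private final Map<String, Set<String>> adjacency = new HashMap<>();

    // An undirected edge is recorded in both directions.
    void addEdge(String u, String v) {
        adjacency.computeIfAbsent(u, k -> new HashSet<>()).add(v);
        adjacency.computeIfAbsent(v, k -> new HashSet<>()).add(u);
    }

    // Two vertices are adjacent when an edge connects them directly.
    boolean areAdjacent(String u, String v) {
        return adjacency.getOrDefault(u, Collections.emptySet()).contains(v);
    }
}
```

A directed graph would record each edge in one direction only, and a weighted graph would store a value alongside each neighbor.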

2.3 What Are Abstract Data Types?
Data type: meaningful data is organized into primitive data types such as integer, real, and boolean, and into more complex data structures such as arrays and binary trees. The idea of a data type therefore includes a specification of the possible values of that type together with the operations that can be performed on those values.
Abstract data type (ADT): a data type whose properties (domain and operations) are specified independently of any particular implementation. An abstract data type specifies the values of the type, but not how those values are represented as collections of bits, and it specifies operations on those values in terms of their inputs, outputs, and effects rather than as particular algorithms or program code. The primitive data types are themselves abstract data types.

2.4 Why OOP and Java for Data Structures?
A stack may be built using a List ADT. The Stack object contains a List object that implements its state, and the behavior of the Stack object is implemented in terms of the List object's behavior. A stack interface defines four operations: push, pop, getSize and isEmpty. Alternatively, the stack may reuse the List ADT by inheritance.
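The composition approach can be sketched as follows, using java.util.List as a stand-in for the List ADT. The class name ListBackedStack and the choice to throw NoSuchElementException on an empty pop are assumptions; only the four operation names come from the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

class ListBackedStack<T> {
    // The contained List object holds the stack's state.
    private final List<T> entries = new ArrayList<>();

    // The top of the stack is the end of the list.
    void push(T entry) {
        entries.add(entry);
    }

    T pop() {
        if (entries.isEmpty()) {
            throw new NoSuchElementException("stack is empty");
        }
        return entries.remove(entries.size() - 1);
    }

    int getSize() {
        return entries.size();
    }

    boolean isEmpty() {
        return entries.isEmpty();
    }
}
```

With composition, clients see only the four stack operations. Reuse by inheritance would also expose the list's own operations, such as adding or removing at arbitrary positions.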

2.5 How Do I Choose the Right Data Structures?
The interface of operations supported by a data structure is one factor to consider when choosing between several available data structures. Another is the efficiency of the data structure: How much space does it occupy? What are the running times of the operations in its interface?

Example
Implementing print job storage for a printer, where jobs are served in the order in which they arrive, requires a queue data structure. Maintaining a collection of entries in no particular order requires an unordered list data structure.

The running time of each operation in the interface
A data structure whose interface is the best fit may not necessarily be the best overall fit if the running times of its operations are not up to the mark. When we have more than one data structure implementation whose interfaces satisfy our requirements, we may have to select one by comparing the running times of the interface operations. Time is often traded off for space: either more space is consumed to increase speed, or a reduction in speed is accepted in exchange for a reduction in space consumption.

Time-space tradeoff
We are looking to “buy” the best implementation of a stack.
StackA: does not provide a getSize operation, i.e. there is no single operation that a client can use to get the number of entries in StackA.
StackB: provides a getSize operation, implemented in the manner we discussed earlier, by transferring entries back and forth between two stacks.
StackC: provides a getSize operation, implemented as follows: a variable called size is maintained that is incremented every time an entry is pushed and decremented every time an entry is popped.
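The two getSize strategies can be sketched side by side. This is an illustration under stated assumptions, not the original implementation: both classes wrap a java.util.ArrayDeque and deliberately use only its push, pop and isEmpty operations, ignoring the size() method that ArrayDeque already provides, so that the tradeoff stays visible.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class StackB<T> {
    private final Deque<T> entries = new ArrayDeque<>();

    void push(T entry) { entries.push(entry); }

    T pop() { return entries.pop(); }

    // No extra field, but every call moves all n entries to a temporary
    // stack and back again, counting them on the way.
    int getSize() {
        Deque<T> temp = new ArrayDeque<>();
        int count = 0;
        while (!entries.isEmpty()) {
            temp.push(entries.pop());
            count++;
        }
        while (!temp.isEmpty()) {
            entries.push(temp.pop()); // restores the original order
        }
        return count;
    }
}

class StackC<T> {
    private final Deque<T> entries = new ArrayDeque<>();
    private int size = 0; // one extra variable per stack

    void push(T entry) {
        entries.push(entry);
        size++;
    }

    T pop() {
        T entry = entries.pop(); // throws if empty, before size changes
        size--;
        return entry;
    }

    // Constant time: just reads the maintained counter.
    int getSize() { return size; }
}
```

StackB pays in time on every getSize call, while StackC pays in space with one counter per stack.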

Three situations:
Situation 1: we need to maintain a large number of stacks, with no need to find the number of entries.
Situation 2: we need to maintain only one stack, with a frequent need to find the number of entries. Ex: a stack array of length 10, with 1000 calls to getSize.
Situation 3: we need to maintain a large number of stacks, with an infrequent need to find the number of entries. Ex: stack arrays of length 10, with 3 calls to getSize.

In Situation 1, StackA fits the bill. It is tempting to pick StackC simply because we may want to play it safe: what if we need getSize in the future?

In Situation 2, the choice is between StackB and StackC, because we need getSize. getSize in StackB is more time-consuming than in StackC. Since we need only one stack, the additional size variable used by StackC is not an issue. Since we need to use getSize frequently, it is better to go with StackC.

Situation 3 also presents a choice between StackB and StackC. Because getSize calls are infrequent, we may choose to go with StackB and accept a loss in speed. The faster getSize delivered by StackC comes at the expense of an extra variable per stack, which may add up to considerable space consumption since we plan to maintain a large number of stacks.

getSize in StackB is more time-consuming than that in StackC. How can we quantify the time taken in either case? For each data structure we study, we present the running time of each operation in its interface.

Time complexity
Use of time complexity makes it easy to estimate the running time of a program. Complexity can be viewed as the maximum number of primitive operations that a program may execute. Primitive operations are single additions, multiplications, assignments, etc. We may leave some operations uncounted and concentrate on those that are performed the largest number of times. Such operations are referred to as dominant.

Time complexity
The operation in line 4 is dominant and will be executed n times. The complexity is described in Big-O notation: in this case O(n), that is, linear complexity.
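The listing that the slide refers to is not part of this transcript. As a hypothetical stand-in, not the original code, the sketch below sums the numbers 1 to n and counts how often the loop body runs. The names DominantOperationDemo and sumAndCount are illustrative, and the line numbering of the original listing is not reproduced.

```java
class DominantOperationDemo {
    // Returns { 1 + 2 + ... + n, number of times the loop body ran }.
    static long[] sumAndCount(int n) {
        long result = 0;
        long executions = 0;
        for (int i = 1; i <= n; i++) {
            result += i;   // dominant operation: executed n times
            executions++;  // instrumentation only, to make the count visible
        }
        return new long[] { result, executions };
    }

    public static void main(String[] args) {
        long[] r = sumAndCount(100);
        System.out.println(r[0] + " after " + r[1] + " additions"); // prints 5050 after 100 additions
    }
}
```

The initial assignments run once regardless of n, so they are left uncounted, and the addition inside the loop determines the O(n) complexity.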

Time complexity
The complexity specifies the order of magnitude within which the program will perform its operations. More precisely, in the case of O(n), the program may perform c · n operations, where c is a constant; however, it may not perform n^2 operations. When calculating the complexity we omit constants: regardless of whether the loop is executed 20 · n times or n / 5 times, we still have a complexity of O(n), even though the running time of the program may vary. When analyzing the complexity we must look for specific, worst-case examples of data that the program will take a long time to process.

Comparison of different time complexities

Exponential and factorial time
It is worth knowing that there are other types of time complexity, such as factorial time O(n!) and exponential time O(2^n). Algorithms with such complexities can solve problems only for very small values of n, because they would take too long to execute for large values of n.
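To get a feel for how quickly these functions grow, the sketch below computes n^2, 2^n and n! for a few small values of n. GrowthDemo and its method names are illustrative. The ranges noted in the comments are limits of the 64-bit long type, not of the mathematics.

```java
class GrowthDemo {
    static long quadratic(int n) {
        return (long) n * n;
    }

    // 2^n as a bit shift; only valid for 0 <= n <= 62 in a long.
    static long exponential(int n) {
        return 1L << n;
    }

    // n! by repeated multiplication; only valid for 0 <= n <= 20 in a long.
    static long factorial(int n) {
        long f = 1;
        for (int i = 2; i <= n; i++) {
            f *= i;
        }
        return f;
    }

    public static void main(String[] args) {
        for (int n : new int[] {5, 10, 20}) {
            System.out.println("n=" + n
                    + "  n^2=" + quadratic(n)
                    + "  2^n=" + exponential(n)
                    + "  n!=" + factorial(n));
        }
    }
}
```

At n = 20 the quadratic count is 400, the exponential count is 1,048,576, and the factorial count is 2,432,902,008,176,640,000.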

Comparison of rates of growth for different time complexities
