
1 Index Sen Zhang

2 INDEX When a table contains many records, Oracle (or any other RDBMS) can take a long time to scan the table for specific records.

3 Index Structures A secondary access structure used to speed up the retrieval of records in response to certain search conditions.

4 Indexes are useful because they help you locate a specific target within a large amount of data without having to look through every object.

5 A data table

First name  Last name  State of residence  Year of birth  Year of entry
John        Edward     VM                  1989           2001
Kathy       Alex       NY                  1987           2002
Joseph      Bush       NJ                  1983           2000
George      Clinton    CA                  1983           1999
Alex        Jordan     VA                  1991           2000
Bill        Herbert    AZ                  1984           1999
James       Perl       SC                  1986           1998
Narian      Geller     NC                  1992           1999
Frank       Thomason   FL                  1994           2005

6 Index table vs. data table

Data table:

First name  Last name  State of residence  Year of birth  Year of entry
John        Edward     VM                  1989           2001
Kathy       Alex       NY                  1987           2002
Joseph      Bush       NJ                  1983           2000
George      Clinton    CA                  1983           1999
Alex        Jordan     VA                  1991           2000
Bill        Herbert    AZ                  1984           1999
James       Perl       SC                  1986           1998
Narian      Geller     NC                  1992           1999
Frank       Thomason   FL                  1994           2005

Index table:

First name  Index pointer
Alex        5
Bill        6
Frank       9
George      4
James       7
John        1
Joseph      3
Kathy       2
Narian      8
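The relationship between the two tables can be sketched in a few lines of Python. This is only an illustrative model, not how Oracle stores an index: the index is a list of (first name, row number) pairs sorted by name, and the hypothetical helper `lookup` finds a row by binary search instead of scanning every record.

```python
from bisect import bisect_left

# First names from the data table, in row order (row numbers start at 1).
first_names = ["John", "Kathy", "Joseph", "George", "Alex",
               "Bill", "James", "Narian", "Frank"]

# The index table: (key, row number) pairs sorted by key.
index = sorted((name, row) for row, name in enumerate(first_names, start=1))


def lookup(index, name):
    """Return the row number stored for name, or None if it is absent.

    Hypothetical helper: binary search over the sorted (key, row) pairs.
    """
    # (name, 0) sorts before every real entry for that name (rows start at 1).
    i = bisect_left(index, (name, 0))
    if i < len(index) and index[i][0] == name:
        return index[i][1]
    return None


for key, row in index:
    print(key, row)
print("George is in row", lookup(index, "George"))
```

The sorted pairs match the index table on the slide, and the pointer for each name is the position of that record in the data table.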

7 Logically, at a conceptual level, an index pointer is simply a row number. Physically, the pointers refer to precise positions on external storage (disk) or in memory (once the data is loaded): they are offsets of records within the physical file. The file system in turn maps each offset to a specific sector on a specific track of a specific disk.

8 Which attributes should be indexed? A table can have multiple indexes, so which attribute(s) should be indexed? The choice depends on the requirements of the users and the application, that is, on which queries the indexes are meant to speed up, and on the database designer's or programmer's understanding of the application.

9 How an index works in a dynamic environment. Once you create an index on a table, Oracle automatically keeps the index synchronized with that table. When you insert a new record into the data table, Oracle also inserts a pointer into the index at the right position, so every insertion takes a little more time. Deletes and updates must likewise maintain the index, which adds some cost, but the net effect is subtler: a delete or update first has to locate the affected rows, and the index can speed up that lookup just as it speeds up a select.
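The extra work on insertion can be modelled with a small sketch. It assumes a simplified in-memory "table" and "index" (both hypothetical, and much simpler than what Oracle does internally): every insert appends the row and then also places a pointer at the correct sorted position in the index.

```python
from bisect import insort

table = []   # data rows, in insertion order
index = []   # (last_name, row_number) pairs, always kept sorted


def insert_row(first_name, last_name):
    """Append a row and keep the index synchronized (hypothetical helper)."""
    table.append((first_name, last_name))
    row_number = len(table)                 # row numbers start at 1
    # This is the additional cost that an indexed table pays on every insert.
    insort(index, (last_name, row_number))
    return row_number


insert_row("John", "Edward")
insert_row("Kathy", "Alex")
insert_row("Joseph", "Bush")
print(index)
```

The table keeps its insertion order while the index stays sorted by last name, so later searches on last name do not need to scan the table.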

10 Insertion into tables with indexes takes longer! So it is not always a good idea to add indexes to tables, especially in OLTP systems.

11 OLTP (On-Line Transaction Processing)
– supports a business's day-to-day activities
– high insertion rate, simple queries
– MB or GB

12 OLAP (On-Line Analytical Processing) system or Data Warehouse
– analyzes operational data
– low update rate, complex queries
– GB or TB
– a strategic business decision: high rewards, but low chance of success

13 DDL statement to create an index: CREATE INDEX index_name ON table_name (column_name_list);
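As a runnable illustration of the statement above, the sketch below uses Python's built-in `sqlite3` module rather than Oracle, so details of the query plan output are SQLite-specific. The table name `students`, the column names, and the index name `idx_students_last_name` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (first_name TEXT, last_name TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("John", "Edward", "VM"), ("Kathy", "Alex", "NY"), ("Alex", "Jordan", "VA")],
)

# Same shape as: CREATE INDEX index_name ON table_name (column_name_list);
conn.execute("CREATE INDEX idx_students_last_name ON students (last_name)")

query = "SELECT * FROM students WHERE last_name = ?"
rows = conn.execute(query, ("Jordan",)).fetchall()

# Ask SQLite how it would run the query; the last column describes each step.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("Jordan",)).fetchall()
plan_text = " ".join(str(step[-1]) for step in plan)

print(rows)
print(plan_text)
```

If the index is being used, the plan text names `idx_students_last_name` instead of describing a full scan of `students`.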

14 An index is an auxiliary way to organize your data file based on the characteristics of values contained in your data file.

15 Why indexes work: a proper data structure. With a suitable data structure, data manipulation operations such as inserting, deleting, updating, and searching can achieve better time complexity. For example, searching an unsorted list takes O(n) time, whereas searching a sorted list with binary search takes O(log n).
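The difference between O(n) and O(log n) can be made concrete by counting probes. The two functions below are a minimal sketch (the names `linear_search` and `binary_search` are chosen for this example); each returns the position found together with the number of elements it examined.

```python
def linear_search(items, target):
    """Scan from the front; return (position, elements examined)."""
    examined = 0
    for position, item in enumerate(items):
        examined += 1
        if item == target:
            return position, examined
    return -1, examined


def binary_search(sorted_items, target):
    """Halve the search range each step; return (position, elements examined)."""
    examined = 0
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        examined += 1
        if sorted_items[mid] == target:
            return mid, examined
        if sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, examined


data = list(range(1_000_000))          # already sorted
print(linear_search(data, 999_999))    # examines every element
print(binary_search(data, 999_999))    # examines at most about log2(n) elements
```

For one million sorted values the linear scan examines all 1,000,000 elements to reach the last one, while binary search never needs more than 20 probes.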

16 Binary search tree A binary search tree is a binary tree where every node has a value, every node's left subtree has values less than the node's value, and every right subtree has values greater. A new node is added as a leaf.
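A minimal sketch of this definition, assuming a simple `Node` class and helper names (`insert`, `contains`, `in_order`) that are not part of the original slides. Duplicate values are ignored here, which is one possible convention among several.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Add value as a new leaf and return the root of the tree."""
    if root is None:
        return Node(value)
    node = root
    while True:
        if value < node.value:
            if node.left is None:
                node.left = Node(value)
                return root
            node = node.left
        elif value > node.value:
            if node.right is None:
                node.right = Node(value)
                return root
            node = node.right
        else:
            return root  # duplicate: leave the tree unchanged


def contains(root, value):
    """Follow one path from the root, going left or right at each node."""
    node = root
    while node is not None:
        if value == node.value:
            return True
        node = node.left if value < node.value else node.right
    return False


def in_order(root):
    """Left subtree, node, right subtree: yields the values in sorted order."""
    if root is None:
        return []
    return in_order(root.left) + [root.value] + in_order(root.right)


root = None
for v in [50, 25, 75, 12, 45, 66, 90]:
    root = insert(root, v)
print(in_order(root))
```

Because smaller values always go left and larger values always go right, an in-order traversal visits the values in ascending order.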

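17 Are these BSTs? [Figure: two binary trees. Tree 1 contains the values 50, 25, 75, 12, 45, 66, 90; tree 2 contains the values 50, 25, 75, 12, 55, 73, 90.] Is tree 1 a BST? Is tree 2 a BST?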

18 Note that the worst case of this build_binary_tree routine is O(n^2): if you feed it a sorted list of values, it chains them into a linked list with no left subtrees. For example, build_binary_tree([1, 2, 3, 4, 5]) yields the tree (None, 1, (None, 2, (None, 3, (None, 4, (None, 5, None))))). There are a variety of balanced schemes for overcoming this flaw with simple binary trees.
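The build_binary_tree routine itself does not appear in this transcript. The sketch below is a hypothetical reconstruction that uses the same (left, value, right) tuple notation as the example output; the helpers `insert_value` and `height` are assumptions added for illustration.

```python
def insert_value(tree, value):
    """Return a new (left, value, right) tuple tree with value added as a leaf."""
    if tree is None:
        return (None, value, None)
    left, node_value, right = tree
    if value < node_value:
        return (insert_value(left, value), node_value, right)
    return (left, node_value, insert_value(right, value))


def build_binary_tree(values):
    """Insert the values one at a time, in the order given."""
    tree = None
    for value in values:
        tree = insert_value(tree, value)
    return tree


def height(tree):
    """Number of levels in the tree (0 for an empty tree)."""
    if tree is None:
        return 0
    left, _, right = tree
    return 1 + max(height(left), height(right))


chain = build_binary_tree([1, 2, 3, 4, 5])   # sorted input
bushy = build_binary_tree([3, 1, 4, 2, 5])   # same values, different order
print(chain)
print(height(chain), height(bushy))
```

With sorted input each new value descends through every existing node, so n insertions cost on the order of n^2 steps in total and the tree has as many levels as it has values. The same values in a different order give a shallower tree.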

19 Because databases typically cannot be kept entirely in memory (512 MB of main memory is considered good), B-trees or B*-trees are often used to index the data and provide fast access. In theory, searching an unindexed, unsorted database of n keys has a worst-case running time of O(n); if the same data is indexed with a B-tree, the same search runs in O(log n). For example, a linear search for a single key among one million (1,000,000) keys requires up to 1,000,000 comparisons. If the same data is indexed with a B-tree of minimum degree 10, at most 114 comparisons are required: the tree has at most 6 levels, and each node holds at most 19 keys.
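The figure of 114 comparisons can be reproduced with a short calculation. The sketch assumes the common textbook definition of minimum degree t (every node holds at most 2t - 1 keys, and a tree of height h holds at least 2t^h - 1 keys) and assumes that keys inside a node are compared one by one; the function name is made up for this example.

```python
def btree_worst_case_comparisons(n_keys, min_degree):
    """Upper bound on key comparisons for one search in a B-tree.

    Assumes a linear scan inside each node and the textbook bound that a
    B-tree of height h contains at least 2 * t**h - 1 keys.
    """
    # Largest height h that a tree with n_keys keys could possibly have.
    height = 0
    while 2 * min_degree ** (height + 1) - 1 <= n_keys:
        height += 1
    nodes_visited = height + 1              # one node per level, root to leaf
    max_keys_per_node = 2 * min_degree - 1
    return nodes_visited * max_keys_per_node


print(btree_worst_case_comparisons(1_000_000, 10))  # 6 levels * 19 keys = 114
```

Using binary search inside each node would lower the comparison count further, but the number of nodes visited, and therefore the number of disk accesses, stays the same.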

20 B*-tree It is a B-tree in which nodes are kept 2/3 full by redistributing keys to fill two child nodes, then splitting them into three nodes.

21 B-tree Definition: A balanced search tree in which every node has between ⌈m/2⌉ and m children, where m > 1 is a fixed integer called the order. The root may have as few as 2 children. This is a good structure if much of the tree is in slow memory (disk), since the height, and hence the number of accesses, can be kept small, say one or two, by picking a large m. Also known as a balanced multiway tree. Generalization: balanced tree, search tree.
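Searching such a multiway tree can be sketched as follows. The example builds a tiny tree of order 4 by hand (it does not implement insertion or node splitting), and the names `BTreeNode` and `btree_search` are illustrative only. Each node visited stands for one access to slow memory.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted keys stored in this node
        self.children = children or []    # empty for a leaf


def btree_search(node, key):
    """Return (found, nodes visited) for a search starting at node."""
    visited = 0
    while node is not None:
        visited += 1
        # Position of the first key >= the search key within this node.
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited
        if not node.children:
            return False, visited
        node = node.children[i]           # descend into the i-th subtree
    return False, visited


# Hand-built tree of order 4: the root has 3 children, all leaves on one level.
root = BTreeNode(
    [20, 40],
    [BTreeNode([5, 10]), BTreeNode([25, 30, 35]), BTreeNode([45, 50])],
)

print(btree_search(root, 30))
print(btree_search(root, 42))
```

Both searches touch only two nodes, the root and one leaf. A larger order m packs more keys into each node, which keeps the tree this shallow even for very large key sets.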

22 Clearly, indexing large amounts of data can significantly improve search performance. Although other balanced tree structures can be used, a B-tree also optimizes the costly disk accesses that are of concern when dealing with large data sets.

23 view

