Computational Biology

Slides:



Advertisements
Similar presentations
CS411 Database Systems Kazuhiro Minami 06: SQL. SQL = Structured Query Language Standard language for querying and manipulating data Has similar capabilities.
Advertisements

Union, Intersection, Difference (subquery) UNION (subquery) produces the union of the two relations. Similarly for INTERSECT, EXCEPT = intersection and.
SQL Group Members: Shijun Shen Xia Tang Sixin Qiang.
1 Introduction to SQL Select-From-Where Statements Multirelation Queries Subqueries.
SQL Queries Principal form: SELECT desired attributes FROM tuple variables –– range over relations WHERE condition about tuple variables; Running example.
Winter 2002Arthur Keller – CS 1806–1 Schedule Today: Jan. 22 (T) u SQL Queries. u Read Sections Assignment 2 due. Jan. 24 (TH) u Subqueries, Grouping.
SQL CSET 3300.
1 Database Systems Relations as Bags Grouping and Aggregation Database Modification.
1 Introduction to SQL Multirelation Queries Subqueries Slides are reused by the approval of Jeffrey Ullman’s.
IS698: Database Management Min Song IS NJIT. Overview  Query processing  Query Optmization  SQL.
SQL. 1.SQL is a high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation details that would be necessary in.
Fall 2001Arthur Keller – CS 1806–1 Schedule Today (TH) Bags and SQL Queries. u Read Sections Project Part 2 due. Oct. 16 (T) Duplicates, Aggregation,
CPSC-608 Database Systems Fall 2011 Instructor: Jianer Chen Office: HRBB 315C Phone: Notes #3.
CPSC-608 Database Systems Fall 2008 Instructor: Jianer Chen Office: HRBB 309B Phone: Notes #3.
Winter 2002Arthur Keller – CS 1807–1 Schedule Today: Jan. 24 (TH) u Subqueries, Grouping and Aggregation. u Read Sections Project Part 2 due.
1 More SQL Extended Relational Algebra Outerjoins, Grouping/Aggregation Insert/Delete/Update.
CPSC-608 Database Systems Fall 2011 Instructor: Jianer Chen Office: HRBB 315C Phone: Notes #2.
Chapter 6 Notes. 6.1 Simple Queries in SQL SQL is not usually used as a stand-alone language In practice there are hosting programs in a high-level language.
SCUHolliday6–1 Schedule Today: u SQL Queries. u Read Sections Next time u Subqueries, Grouping and Aggregation. u Read Sections And then.
Databases : SQL-Introduction 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof. Jeffrey D. Ullman distributes.
1 IT 244 Database Management System Lecture 11 SQL Select-From-Where Statements Meaning of queries Subqueries Ref : -A First Course in Database System.
Constraints on Relations Foreign Keys Local and Global Constraints Triggers Following lecture slides are modified from Jeff Ullman’s slides
Onsdag The concepts in a relation data model SQL DDL DML.
Computational Biology Dr. Jens Allmer Lecture Slides Week 6.
Databases 1 Second lecture.
1 CSCE Database Systems Anxiao (Andrew) Jiang The Database Language SQL.
1 Introduction to SQL Database Systems. 2 Why SQL? SQL is a very-high-level language, in which the programmer is able to avoid specifying a lot of data-manipulation.
Introduction to SQL Introduction Select-From-Where Statements Queries over Several Relations Subqueries.
1 Lecture 6 Introduction to SQL part 4 Slides from
Himanshu GuptaCSE 532-SQL-1 SQL. Himanshu GuptaCSE 532-SQL-2 Why SQL? SQL is a very-high-level language, in which the programmer is able to avoid specifying.
SCUHolliday - coen 1787–1 Schedule Today: u Subqueries, Grouping and Aggregation. u Read Sections Next u Modifications, Schemas, Views. u Read.
More SQL (and Relational Algebra). More SQL Extended Relational Algebra Outerjoins, Grouping/Aggregation Insert/Delete/Update.
Databases : SQL Multi-Relations 2007, Fall Pusan National University Ki-Joune Li These slides are made from the materials that Prof. Jeffrey D. Ullman.
1 Introduction to SQL Select-From-Where Statements Subqueries Grouping and Aggregation.
1 Introduction to Database Systems, CS420 SQL JOIN, Aggregate, Grouping, HAVING and DML Clauses.
1 Database Design: DBS CB, 2 nd Edition SQL: Select-From-Where Statements & Multi-relation Queries & Subqueries Ch. 6.
Select-From-Where Statements Multirelation Queries Subqueries
CPSC-310 Database Systems
Relational Database Systems 1
Schedule Today: Jan. 28 (Mon) Jan. 30 (Wed) Next Week Assignments !!
Slides are reused by the approval of Jeffrey Ullman’s
CPSC-310 Database Systems
Outerjoins, Grouping/Aggregation Insert/Delete/Update
Foreign Keys Local and Global Constraints Triggers
Databases : More about SQL
CPSC-310 Database Systems
Schedule Today: Next After that Subqueries, Grouping and Aggregation.
Introduction to Database Systems, CS420
CPSC-608 Database Systems
06a: SQL-1 The Basics– Select-From-Where
CS 440 Database Management Systems
CPSC-608 Database Systems
Database Design and Programming
CPSC-310 Database Systems
Database Models Relational Model
CPSC-310 Database Systems
IST 210: Organization of Data
CPSC-310 Database Systems
IT 244 Database Management System
CMSC-461 Database Management Systems
The Relational Data Model
CPSC-608 Database Systems
CPSC-608 Database Systems
More SQL Extended Relational Algebra Outerjoins, Grouping/Aggregation
CPSC-608 Database Systems
CPSC-608 Database Systems
CMSC-461 Database Management Systems
Instructor: Zhe He Department of Computer Science
Select-From-Where Statements Multirelation Queries Subqueries
Presentation transcript:

Computational Biology Dr. Jens Allmer Lecture Slides Week 7

MBG404 Overview Processing Pipelining Generation Data Storage Mining

A Relation is a Table Beers name manf Winterbrew Bud Lite Pete’s Attributes (column headers) Beers name manf Winterbrew Bud Lite Pete’s Anheuser-Busch Tuples (rows) Domain All possible values Contains data -> Instance

A DB is a Collection of Relations

Decompositions and Normal Forms So you stored your data in a high normal form and decomposed your Relations because of that. How do you now get any information from your DB?

Why SQL? SQL is a very-high-level language. Say “what to do” rather than “how to do it.” Avoid a lot of data-manipulation details needed in procedural languages like C++ or Java. Database management system figures out “best” way to execute query. Called “query optimization.” 6

Select-From-Where Statements SELECT desired attributes FROM one or more tables WHERE condition about tuples of the tables ORDER BY desired attributes 7

Our Running Example All our SQL queries will be based on the following database schema. Underline indicates key attributes. Beers(name, manf) Bars(name, addr, license) Drinkers(name, addr, phone) Likes(drinker, beer) Sells(bar, beer, price) Frequents(drinker, bar) 8

Example Using Beers(name, manf), what beers are made by Anheuser-Busch? SELECT name FROM Beers WHERE manf = ’Anheuser-Busch’; Notice SQL uses single-quotes for strings. SQL is case-insensitive, except inside strings. 9

Result of Query The answer is a relation with a single attribute, name Bud Bud Lite Michelob . . . The answer is a relation with a single attribute, name, and tuples with the name of each beer by Anheuser-Busch, such as Bud. 10

Meaning of Single-Relation Query Begin with the relation in the FROM clause. Apply the selection indicated by the WHERE clause. Apply the extended projection indicated by the SELECT clause. 11

Operational Semantics name manf tv Bud Anheuser-Busch Include tv.name in the result Check if Anheuser-Busch 12

* In SELECT clauses When there is one relation in the FROM clause, * in the SELECT clause stands for “all attributes of this relation.” Example using Beers(name, manf): SELECT * FROM Beers WHERE manf = ’Anheuser-Busch’; 13

Result of Query: Now, the result has each of the attributes of Beers. name manf Bud Anheuser-Busch Bud Lite Anheuser-Busch Michelob Anheuser-Busch . . . . . . Now, the result has each of the attributes of Beers. 14

Renaming Attributes If you want the result to have different attribute names, use “AS <new name>” to rename an attribute. Example based on Beers(name, manf): SELECT name AS beer, manf FROM Beers WHERE manf = ’Anheuser-Busch’ 15

Result of Query: beer manf Bud Anheuser-Busch Bud Lite Anheuser-Busch Michelob Anheuser-Busch . . . . . . 16

Expressions in SELECT Clauses Any mathematical expression that makes sense can appear as an element of a SELECT clause. Example: from Sells(bar, beer, price): SELECT bar, beer, price * 114 AS priceInYen FROM Sells; 17

Result of Query bar beer priceInYen Joe’s Bud 285 Sue’s Miller 342 … … … 18

Another Example: Constant Expressions From Likes(drinker, beer) : SELECT drinker, ’likes Bud’ AS whoLikesBud FROM Likes WHERE beer = ’Bud’; 19

Result of Query drinker whoLikesBud Sally likes Bud Fred likes Bud … … … … 20

Complex Conditions in WHERE Clause From Sells(bar, beer, price), find the price Joe’s Bar charges for Bud: SELECT price FROM Sells WHERE bar = ’Joe’’s Bar’ AND beer = ’Bud’; Notice how we get a single-quote in strings. 21

Patterns WHERE clauses can have conditions in which a string is compared with a pattern, to see if it matches. General form: <Attribute> LIKE <pattern> <Attribute> NOT LIKE <pattern> Pattern is a quoted string with % = “any string”; _ = “any character.” 22

Example From Drinkers(name, addr, phone) find the drinkers with area code 555: SELECT name FROM Drinkers WHERE phone LIKE ’%555-_ _ _ _’; 23

NULL Values Tuples in SQL relations can have NULL as a value for one or more components. Meaning depends on context. Two common cases: Missing value : e.g., we know Joe’s Bar has some address, but we don’t know what it is. Inapplicable : e.g., the value of attribute spouse for an unmarried person. 24

Comparing NULL’s to Values The logic of conditions in SQL is really 3-valued logic: TRUE, FALSE, UNKNOWN. When any value is compared with NULL, the truth value is UNKNOWN. But a query only produces a tuple in the answer if its truth value for the WHERE clause is TRUE (not FALSE or UNKNOWN). 25

Surprising Example From the following Sells relation: bar beer price Joe’s Bar Bud NULL SELECT bar FROM Sells WHERE price < 2.00 OR price >= 2.00; UNKNOWN UNKNOWN UNKNOWN 26

End Theory I 5 min mindmapping 10 min break

Practice I

Where is my DB? Try to remember where you stored the Gene Ontology terms we used last week You can also try the Recent part in the File menu (e.g.: you are on a different PC) Select the DB and open it up for our queries

Query 1 How many enteries are there in the top categories Biological process Cellular component Molecular function To find out use the query wizard At the end we want: SELECT term.namespace, Count(term.name) AS CountOfname FROM term GROUP BY term.namespace; Follow along as I use the query wizard

Query 2 Find all terms that contain the term ‘mito’ At the end we want: SELECT term.namespace, Count(term.name) AS CountOfname FROM term GROUP BY term.namespace; Follow along as I use the query editor

Query 3 Now you are on your own Find all definitions related to catalysis and ethanol At the end we want: SELECT def.defstr FROM def WHERE (((def.defstr) Like '*catalysis*') AND ((def.defstr) Like '*ethanol*'));

Constant

End Practice I 15 min break

Theory II

Multirelation Queries Interesting queries often combine data from more than one relation. We can address several relations in one query by listing them all in the FROM clause. Distinguish attributes of the same name by “<relation>.<attribute>” 36

Example Using relations Likes(drinker, beer) and Frequents(drinker, bar), find the beers liked by at least one person who frequents Joe’s Bar. SELECT beer FROM Likes, Frequents WHERE bar = ’Joe’’s Bar’ AND Frequents.drinker = Likes.drinker; Different then inner join A simple cross product of all combination of rows 37

Formal Semantics Almost the same as for single-relation queries: Start with the product of all the relations in the FROM clause. Apply the selection condition from the WHERE clause. Project onto the list of attributes and expressions in the SELECT clause. 38

Operational Semantics Imagine one tuple-variable for each relation in the FROM clause. These tuple-variables visit each combination of tuples, one from each relation. If the tuple-variables are pointing to tuples that satisfy the WHERE clause, send these tuples to the SELECT clause. 39

Example drinker bar drinker beer tv1 tv2 Sally Bud Sally Joe’s Likes Frequents to output check these are equal check for Joe 40

Controlling Duplicate Elimination Force the result to be a set by SELECT DISTINCT . . . Force the result to be a bag (i.e., don’t eliminate duplicates) by ALL, as in . . . UNION ALL . . . 41

Example: DISTINCT From Sells(bar, beer, price), find all the different prices charged for beers: SELECT DISTINCT price FROM Sells; Notice that without DISTINCT, each price would be listed as many times as there were bar/beer pairs at that price. 42

Products and Natural Joins R NATURAL JOIN S; Product: R CROSS JOIN S; Example: Likes NATURAL JOIN Serves; Relations can be parenthesized subqueries, as well. R OUTER JOIN S is the core of an outerjoin expression. It is modified by: Optional NATURAL in front of OUTER. Optional ON <condition> after JOIN. Optional LEFT, RIGHT, or FULL before OUTER. LEFT = pad dangling tuples of R only. RIGHT = pad dangling tuples of S only. FULL = pad both; this choice is the default. 43

Outerjoins R OUTER JOIN S is the core of an outerjoin expression. It is modified by: Optional NATURAL in front of OUTER. Optional ON <condition> after JOIN. Optional LEFT, RIGHT, or FULL before OUTER. LEFT = pad dangling tuples of R only. RIGHT = pad dangling tuples of S only. FULL = pad both; this choice is the default. 44

Overview SELECT (π) AS (σ) FROM (Table) AS (σ) WHERE (Condition) SELECT Table.Attribute AS newName ... FROM (Table1 INNER JOIN Table2) ON Table1.Attribute = Table2.Attribute ... GROUP BY Table.Attribute, ... HAVING (requires GROUP BY) (COUNT|AVG|MAX|MIN|SUM) (>|=|<) (ConstValue|SubQuery) WHERE Table1.Attribute (>|=|<|IS|IS NOT|LIKE) (ConstValue|Table2.Attribute|SubQuery) ORDER BY Table.Attribute, ... 45

End Theory II Mindmap 5 min Break 10 min

Practice II

Join Tables Join term and relationship Any problems? Term.id = relationship.to SELECT term.id, term.name, term.namespace, relationship.type, relationship.to FROM relationship INNER JOIN term ON relationship.to = term.id; Any problems?

Term Interrelationships Join term with itself Select those terms wher id matches to is_a SELECT term.id, term.name, term.namespace, term.is_a, term_1.id, term_1.is_a FROM term INNER JOIN term AS term_1 ON term.id = term_1.is_a;

Term Types Join term and relationship tables Find the number of types that are associated with each term Order by highest count first Final query: SELECT term.id, Count(relationship.type) AS CountOftype FROM term INNER JOIN relationship ON term.id = relationship.to GROUP BY term.id ORDER BY Count(relationship.type) DESC; Now change to only count distinct types How? SELECT dt.id, Count(dt.type) AS CountOftype FROM (SELECT DISTINCT id, type FROM term INNER JOIN relationship ON term.id = relationship.to) AS dt GROUP BY dt.id;

Beneficial Things to do Find data worth putting into a database Even simple anlyses you do/did in the lab may be worth Check online for interesting datasets if you cannot find anything Model the data in MS Access Make at least three tables Store the data in your database