Download presentation
Presentation is loading. Please wait.
1
Computational Biology
Dr. Jens Allmer Lecture Slides Week 7
2
MBG404 Overview Processing Pipelining Generation Data Storage Mining
3
A Relation is a Table Beers name manf Winterbrew Bud Lite Pete’s
Attributes (column headers) Beers name manf Winterbrew Bud Lite Pete’s Anheuser-Busch Tuples (rows) Domain All possible values Contains data -> Instance
4
A DB is a Collection of Relations
5
Decompositions and Normal Forms
So you stored your data in a high normal form and decomposed your Relations because of that. How do you now get any information from your DB?
6
Why SQL? SQL is a very-high-level language.
Say “what to do” rather than “how to do it.” Avoid a lot of data-manipulation details needed in procedural languages like C++ or Java. Database management system figures out “best” way to execute query. Called “query optimization.” 6
7
Select-From-Where Statements
SELECT desired attributes FROM one or more tables WHERE condition about tuples of the tables ORDER BY desired attributes 7
8
Our Running Example All our SQL queries will be based on the following database schema. Underline indicates key attributes. Beers(name, manf) Bars(name, addr, license) Drinkers(name, addr, phone) Likes(drinker, beer) Sells(bar, beer, price) Frequents(drinker, bar) 8
9
Example Using Beers(name, manf), what beers are made by Anheuser-Busch? SELECT name FROM Beers WHERE manf = ’Anheuser-Busch’; Notice SQL uses single-quotes for strings. SQL is case-insensitive, except inside strings. 9
10
Result of Query The answer is a relation with a single attribute,
name Bud Bud Lite Michelob . . . The answer is a relation with a single attribute, name, and tuples with the name of each beer by Anheuser-Busch, such as Bud. 10
11
Meaning of Single-Relation Query
Begin with the relation in the FROM clause. Apply the selection indicated by the WHERE clause. Apply the extended projection indicated by the SELECT clause. 11
12
Operational Semantics
name manf tv Bud Anheuser-Busch Include tv.name in the result Check if Anheuser-Busch 12
13
* In SELECT clauses When there is one relation in the FROM clause, * in the SELECT clause stands for “all attributes of this relation.” Example using Beers(name, manf): SELECT * FROM Beers WHERE manf = ’Anheuser-Busch’; 13
14
Result of Query: Now, the result has each of the attributes of Beers.
name manf Bud Anheuser-Busch Bud Lite Anheuser-Busch Michelob Anheuser-Busch Now, the result has each of the attributes of Beers. 14
15
Renaming Attributes If you want the result to have different attribute names, use “AS <new name>” to rename an attribute. Example based on Beers(name, manf): SELECT name AS beer, manf FROM Beers WHERE manf = ’Anheuser-Busch’ 15
16
Result of Query: beer manf Bud Anheuser-Busch Bud Lite Anheuser-Busch
Michelob Anheuser-Busch 16
17
Expressions in SELECT Clauses
Any mathematical expression that makes sense can appear as an element of a SELECT clause. Example: from Sells(bar, beer, price): SELECT bar, beer, price * 114 AS priceInYen FROM Sells; 17
18
Result of Query bar beer priceInYen Joe’s Bud 285 Sue’s Miller 342
… … … 18
19
Another Example: Constant Expressions
From Likes(drinker, beer) : SELECT drinker, ’likes Bud’ AS whoLikesBud FROM Likes WHERE beer = ’Bud’; 19
20
Result of Query drinker whoLikesBud Sally likes Bud Fred likes Bud … …
… … 20
21
Complex Conditions in WHERE Clause
From Sells(bar, beer, price), find the price Joe’s Bar charges for Bud: SELECT price FROM Sells WHERE bar = ’Joe’’s Bar’ AND beer = ’Bud’; Notice how we get a single-quote in strings. 21
22
Patterns WHERE clauses can have conditions in which a string is compared with a pattern, to see if it matches. General form: <Attribute> LIKE <pattern> <Attribute> NOT LIKE <pattern> Pattern is a quoted string with % = “any string”; _ = “any character.” 22
23
Example From Drinkers(name, addr, phone) find the drinkers with area code 555: SELECT name FROM Drinkers WHERE phone LIKE ’%555-_ _ _ _’; 23
24
NULL Values Tuples in SQL relations can have NULL as a value for one or more components. Meaning depends on context. Two common cases: Missing value : e.g., we know Joe’s Bar has some address, but we don’t know what it is. Inapplicable : e.g., the value of attribute spouse for an unmarried person. 24
25
Comparing NULL’s to Values
The logic of conditions in SQL is really 3-valued logic: TRUE, FALSE, UNKNOWN. When any value is compared with NULL, the truth value is UNKNOWN. But a query only produces a tuple in the answer if its truth value for the WHERE clause is TRUE (not FALSE or UNKNOWN). 25
26
Surprising Example From the following Sells relation: bar beer price
Joe’s Bar Bud NULL SELECT bar FROM Sells WHERE price < OR price >= 2.00; UNKNOWN UNKNOWN UNKNOWN 26
27
End Theory I 5 min mindmapping 10 min break
28
Practice I
29
Where is my DB? Try to remember where you stored the Gene Ontology terms we used last week You can also try the Recent part in the File menu (e.g.: you are on a different PC) Select the DB and open it up for our queries
30
Query 1 How many enteries are there in the top categories
Biological process Cellular component Molecular function To find out use the query wizard At the end we want: SELECT term.namespace, Count(term.name) AS CountOfname FROM term GROUP BY term.namespace; Follow along as I use the query wizard
31
Query 2 Find all terms that contain the term ‘mito’
At the end we want: SELECT term.namespace, Count(term.name) AS CountOfname FROM term GROUP BY term.namespace; Follow along as I use the query editor
32
Query 3 Now you are on your own
Find all definitions related to catalysis and ethanol At the end we want: SELECT def.defstr FROM def WHERE (((def.defstr) Like '*catalysis*') AND ((def.defstr) Like '*ethanol*'));
33
Constant
34
End Practice I 15 min break
35
Theory II
36
Multirelation Queries
Interesting queries often combine data from more than one relation. We can address several relations in one query by listing them all in the FROM clause. Distinguish attributes of the same name by “<relation>.<attribute>” 36
37
Example Using relations Likes(drinker, beer) and Frequents(drinker, bar), find the beers liked by at least one person who frequents Joe’s Bar. SELECT beer FROM Likes, Frequents WHERE bar = ’Joe’’s Bar’ AND Frequents.drinker = Likes.drinker; Different then inner join A simple cross product of all combination of rows 37
38
Formal Semantics Almost the same as for single-relation queries:
Start with the product of all the relations in the FROM clause. Apply the selection condition from the WHERE clause. Project onto the list of attributes and expressions in the SELECT clause. 38
39
Operational Semantics
Imagine one tuple-variable for each relation in the FROM clause. These tuple-variables visit each combination of tuples, one from each relation. If the tuple-variables are pointing to tuples that satisfy the WHERE clause, send these tuples to the SELECT clause. 39
40
Example drinker bar drinker beer tv1 tv2 Sally Bud Sally Joe’s Likes
Frequents to output check these are equal check for Joe 40
41
Controlling Duplicate Elimination
Force the result to be a set by SELECT DISTINCT . . . Force the result to be a bag (i.e., don’t eliminate duplicates) by ALL, as in UNION ALL . . . 41
42
Example: DISTINCT From Sells(bar, beer, price), find all the different prices charged for beers: SELECT DISTINCT price FROM Sells; Notice that without DISTINCT, each price would be listed as many times as there were bar/beer pairs at that price. 42
43
Products and Natural Joins
R NATURAL JOIN S; Product: R CROSS JOIN S; Example: Likes NATURAL JOIN Serves; Relations can be parenthesized subqueries, as well. R OUTER JOIN S is the core of an outerjoin expression. It is modified by: Optional NATURAL in front of OUTER. Optional ON <condition> after JOIN. Optional LEFT, RIGHT, or FULL before OUTER. LEFT = pad dangling tuples of R only. RIGHT = pad dangling tuples of S only. FULL = pad both; this choice is the default. 43
44
Outerjoins R OUTER JOIN S is the core of an outerjoin expression. It is modified by: Optional NATURAL in front of OUTER. Optional ON <condition> after JOIN. Optional LEFT, RIGHT, or FULL before OUTER. LEFT = pad dangling tuples of R only. RIGHT = pad dangling tuples of S only. FULL = pad both; this choice is the default. 44
45
Overview SELECT (π) AS (σ) FROM (Table) AS (σ) WHERE (Condition)
SELECT Table.Attribute AS newName ... FROM (Table1 INNER JOIN Table2) ON Table1.Attribute = Table2.Attribute ... GROUP BY Table.Attribute, ... HAVING (requires GROUP BY) (COUNT|AVG|MAX|MIN|SUM) (>|=|<) (ConstValue|SubQuery) WHERE Table1.Attribute (>|=|<|IS|IS NOT|LIKE) (ConstValue|Table2.Attribute|SubQuery) ORDER BY Table.Attribute, ... 45
46
End Theory II Mindmap 5 min Break 10 min
47
Practice II
48
Join Tables Join term and relationship Any problems?
Term.id = relationship.to SELECT term.id, term.name, term.namespace, relationship.type, relationship.to FROM relationship INNER JOIN term ON relationship.to = term.id; Any problems?
49
Term Interrelationships
Join term with itself Select those terms wher id matches to is_a SELECT term.id, term.name, term.namespace, term.is_a, term_1.id, term_1.is_a FROM term INNER JOIN term AS term_1 ON term.id = term_1.is_a;
50
Term Types Join term and relationship tables
Find the number of types that are associated with each term Order by highest count first Final query: SELECT term.id, Count(relationship.type) AS CountOftype FROM term INNER JOIN relationship ON term.id = relationship.to GROUP BY term.id ORDER BY Count(relationship.type) DESC; Now change to only count distinct types How? SELECT dt.id, Count(dt.type) AS CountOftype FROM (SELECT DISTINCT id, type FROM term INNER JOIN relationship ON term.id = relationship.to) AS dt GROUP BY dt.id;
51
Beneficial Things to do
Find data worth putting into a database Even simple anlyses you do/did in the lab may be worth Check online for interesting datasets if you cannot find anything Model the data in MS Access Make at least three tables Store the data in your database
Similar presentations
© 2024 SlidePlayer.com Inc.
All rights reserved.