Presentation is loading. Please wait.

Presentation is loading. Please wait.

Computational Biology Dr. Jens Allmer Lecture Slides Week 6.

Similar presentations


Presentation on theme: "Computational Biology Dr. Jens Allmer Lecture Slides Week 6."— Presentation transcript:

1 Computational Biology Dr. Jens Allmer Lecture Slides Week 6

2 MBG404 Overview Data Generation Processing Storage Mining Pipelining

3 A Relation is a Table Attributes (column headers) Tuples (rows) Contains data -> Instance Domain All possible values namemanf Winterbrew Bud Lite Pete’s Anheuser-Busch Beers

4 A DB is a Collection of Relations

5 Decompositions and Normal Forms So you stored your data in a high normal form and decomposed your Relations because of that. How do you now get any information from your DB?

6 Why SQL? SQL is a very-high-level language. –Say “what to do” rather than “how to do it.” –Avoid a lot of data-manipulation details needed in procedural languages like C++ or Java. Database management system figures out “best” way to execute query. –Called “query optimization.”

7 Select-From-Where Statements SELECT desired attributes FROM one or more tables WHERE condition about tuples of the tables ORDER BYdesired attributes

8 Our Running Example All our SQL queries will be based on the following database schema. –Underline indicates key attributes. Beers(name, manf) Bars(name, addr, license) Drinkers(name, addr, phone) Likes(drinker, beer) Sells(bar, beer, price) Frequents(drinker, bar)

9 Example Using Beers(name, manf), what beers are made by Anheuser-Busch? SELECT name FROM Beers WHERE manf = ’Anheuser-Busch’; Notice SQL uses single-quotes for strings. SQL is case-insensitive, except inside strings.

10 Result of Query name Bud Bud Lite Michelob... The answer is a relation with a single attribute, name, and tuples with the name of each beer by Anheuser-Busch, such as Bud.

11 Meaning of Single-Relation Query Begin with the relation in the FROM clause. Apply the selection indicated by the WHERE clause. Apply the extended projection indicated by the SELECT clause.

12 Operational Semantics Check if Anheuser-Busch namemanf BudAnheuser-Busch tv Include tv.name in the result

13 * In SELECT clauses When there is one relation in the FROM clause, * in the SELECT clause stands for “all attributes of this relation.” Example using Beers(name, manf): SELECT * FROM Beers WHERE manf = ’Anheuser-Busch’;

14 Result of Query: namemanf BudAnheuser-Busch Bud LiteAnheuser-Busch MichelobAnheuser-Busch... Now, the result has each of the attributes of Beers.

15 Renaming Attributes If you want the result to have different attribute names, use “AS ” to rename an attribute. Example based on Beers(name, manf): SELECT name AS beer, manf FROM Beers WHERE manf = ’Anheuser-Busch’

16 Result of Query: beermanf BudAnheuser-Busch Bud LiteAnheuser-Busch MichelobAnheuser-Busch...

17 Expressions in SELECT Clauses Any mathematical expression that makes sense can appear as an element of a SELECT clause. Example: from Sells(bar, beer, price): SELECT bar, beer, price * 114 AS priceInYen FROM Sells;

18 Result of Query barbeerpriceInYen Joe’sBud285 Sue’sMiller342 … … …

19 Another Example: Constant Expressions From Likes(drinker, beer) : SELECT drinker, ’likes Bud’ AS whoLikesBud FROM Likes WHERE beer = ’Bud’;

20 Result of Query drinkerwhoLikesBud Sallylikes Bud Fredlikes Bud …

21 Complex Conditions in WHERE Clause From Sells(bar, beer, price), find the price Joe’s Bar charges for Bud: SELECT price FROM Sells WHERE bar = ’Joe’’s Bar’ AND beer = ’Bud’; Notice how we get a single-quote in strings.

22 Patterns WHERE clauses can have conditions in which a string is compared with a pattern, to see if it matches. General form: – LIKE – NOT LIKE Pattern is a quoted string with –% = “any string”; –_ = “any character.”

23 Example From Drinkers(name, addr, phone) find the drinkers with area code 555: SELECT name FROM Drinkers WHERE phone LIKE ’%555-_ _ _ _’;

24 NULL Values Tuples in SQL relations can have NULL as a value for one or more components. Meaning depends on context. Two common cases: –Missing value : e.g., we know Joe’s Bar has some address, but we don’t know what it is. –Inapplicable : e.g., the value of attribute spouse for an unmarried person.

25 Comparing NULL’s to Values The logic of conditions in SQL is really 3-valued logic: TRUE, FALSE, UNKNOWN. When any value is compared with NULL, the truth value is UNKNOWN. But a query only produces a tuple in the answer if its truth value for the WHERE clause is TRUE (not FALSE or UNKNOWN).

26 Surprising Example From the following Sells relation: barbeerprice Joe’s BarBudNULL SELECT bar FROM Sells WHERE price = 2.00; UNKNOWN

27 End Theory I 5 min mindmapping 10 min break

28 Practice I

29 Where is my DB? Try to remember where you stored the Gene Ontology terms we used last week You can also try the Recent part in the File menu (e.g.: you are on a different PC) Select the DB and open it up for our queries

30 Query 1 How many enteries are there in the top categories –Biological process –Cellular component –Molecular function To find out use the query wizard –At the end we want: SELECT term.namespace, Count(term.name) AS CountOfname FROM term GROUP BY term.namespace; –Follow along as I use the query wizard

31 Query 2 Find all terms that contain the term ‘mito’ –At the end we want: SELECT term.namespace, Count(term.name) AS CountOfname FROM term GROUP BY term.namespace; –Follow along as I use the query editor

32 Query 3 Now you are on your own Find all definitions related to catalysis and ethanol At the end we want: SELECT def.defstr FROM def WHERE (((def.defstr) Like '*catalysis*') AND ((def.defstr) Like '*ethanol*'));

33 Constant

34 End Practice I 15 min break

35 Theory II

36 Multirelation Queries Interesting queries often combine data from more than one relation. We can address several relations in one query by listing them all in the FROM clause. Distinguish attributes of the same name by “. ”

37 Example Using relations Likes(drinker, beer) and Frequents(drinker, bar), find the beers liked by at least one person who frequents Joe’s Bar. SELECT beer FROM Likes, Frequents WHERE bar = ’Joe’’s Bar’ AND Frequents.drinker = Likes.drinker; Different then inner join A simple cross product of all combination of rows

38 Formal Semantics Almost the same as for single-relation queries: 1.Start with the product of all the relations in the FROM clause. 2.Apply the selection condition from the WHERE clause. 3.Project onto the list of attributes and expressions in the SELECT clause.

39 Operational Semantics Imagine one tuple-variable for each relation in the FROM clause. –These tuple-variables visit each combination of tuples, one from each relation. If the tuple-variables are pointing to tuples that satisfy the WHERE clause, send these tuples to the SELECT clause.

40 Example drinker bardrinker beer tv1tv2 Sally Bud Sally Joe’s Likes Frequents to output check these are equal check for Joe

41 Controlling Duplicate Elimination Force the result to be a set by SELECT DISTINCT... Force the result to be a bag (i.e., don’t eliminate duplicates) by ALL, as in... UNION ALL...

42 Example: DISTINCT From Sells(bar, beer, price), find all the different prices charged for beers: SELECT DISTINCT price FROM Sells; Notice that without DISTINCT, each price would be listed as many times as there were bar/beer pairs at that price.

43 Products and Natural Joins Natural join: R NATURAL JOIN S; Product: R CROSS JOIN S; Example: Likes NATURAL JOIN Serves; Relations can be parenthesized subqueries, as well. R OUTER JOIN S is the core of an outerjoin expression. It is modified by: 1.Optional NATURAL in front of OUTER. 2.Optional ON after JOIN. 3.Optional LEFT, RIGHT, or FULL before OUTER.  LEFT = pad dangling tuples of R only.  RIGHT = pad dangling tuples of S only.  FULL = pad both; this choice is the default.

44 Outerjoins R OUTER JOIN S is the core of an outerjoin expression. It is modified by: 1.Optional NATURAL in front of OUTER. 2.Optional ON after JOIN. 3.Optional LEFT, RIGHT, or FULL before OUTER.  LEFT = pad dangling tuples of R only.  RIGHT = pad dangling tuples of S only.  FULL = pad both; this choice is the default.

45 Overview SELECT (π) AS (σ) FROM (Table) AS (σ) WHERE (Condition) SELECT Table.Attribute AS newName... FROM (Table1 INNER JOIN Table2) ON Table1.Attribute = Table2.Attribute... GROUP BY Table.Attribute,... HAVING (requires GROUP BY) (COUNT|AVG|MAX|MIN|SUM) (>|=|<) (ConstValue|SubQuery) WHERE Table1.Attribute (>|=|<|IS|IS NOT|LIKE) (ConstValue|Table2.Attribute|SubQuery) ORDER BY Table.Attribute,...

46 End Theory II Mindmap 5 min Break 10 min

47 Practice II

48 Join Tables Join term and relationship –Term.id = relationship.to SELECT term.id, term.name, term.namespace, relationship.type, relationship.to FROM relationship INNER JOIN term ON relationship.to = term.id; Any problems?

49 Term Interrelationships Join term with itself Select those terms wher id matches to is_a SELECT term.id, term.name, term.namespace, term.is_a, term_1.id, term_1.is_a FROM term INNER JOIN term AS term_1 ON term.id = term_1.is_a;

50 Term Types Join term and relationship tables Find the number of types that are associated with each term Order by highest count first Final query: –SELECT term.id, Count(relationship.type) AS CountOftype –FROM term INNER JOIN relationship ON term.id = relationship.to –GROUP BY term.id –ORDER BY Count(relationship.type) DESC; Now change to only count distinct types –How? –SELECT dt.id, Count(dt.type) AS CountOftype –FROM (SELECT DISTINCT id, type FROM term INNER JOIN relationship ON term.id = relationship.to) AS dt –GROUP BY dt.id;

51 Beneficial Things to do Find data worth putting into a database –Even simple anlyses you do/did in the lab may be worth –Check online for interesting datasets if you cannot find anything Model the data in MS Access –Make at least three tables Store the data in your database


Download ppt "Computational Biology Dr. Jens Allmer Lecture Slides Week 6."

Similar presentations


Ads by Google