Part I: Tips and techniques from curators Kate Dreher TAIR, AraCyc, PMN Carnegie Institution for Science.


1 Part I: Tips and techniques from curators Kate Dreher TAIR, AraCyc, PMN Carnegie Institution for Science

2 Sometimes, one gene isn't enough... Scientists often want to work with more than one gene or protein related by a common feature. TAIR (and the PMN) offer some basic tools to: create customized data sets (e.g. lists of genes or proteins); add more information to data sets; analyze data sets.

3 Creating customized data sets using TAIR and the PMN. Data sets can be based on many different criteria: overall sequence alignment (DNA or protein); sequence motifs (DNA or protein); protein domains and biochemical properties; gene/protein function (subcellular location, molecular function, biological process); expression pattern; biochemical pathway; mapping region; phenotype; gene families. How do you generate these data sets?

4 Creating customized data sets: TAIR. Data sets can be obtained using several strategies at TAIR: advanced search pages; data-mining tools.

5 Creating customized data sets: PMN. Data sets can be obtained using several strategies at the PMN. What is the PMN? It is the home of AraCyc, the Arabidopsis metabolic pathway database. The Plant Metabolic Network (PMN): maintains a set of metabolic pathway databases for Arabidopsis and other plants; provides tools to analyze metabolic data; generates new metabolic pathway databases for crops and other important plants.

6 Metabolic pathway data in AraCyc at the PMN. AraCyc pathway pages contain several types of data: pathway, enzyme, gene, reaction, compound, evidence code.

7 Metabolic pathway data in AraCyc at the PMN. Pathway pages contain curated comments and useful links.

8 Creating customized data sets: PMN. Data sets can be obtained using several strategies at the PMN: advanced search page; data-mining tools (*coming soon*); metabolic pathway pages.

9 Enhancing customized data sets. Additional information can be obtained for your data set using: the bulk data retrieval tool; FTP files.

10 Customized data sets: case studies. You have mapped a mutation that disrupts flower development to a region of Chromosome 1. What are some good candidates in the mapping interval? (1) Get a list of all the genes in the mapping interval and find candidates involved in flower development. (2) Find all the associated gene function (GO) and expression (PO) annotations for the candidate genes. (3) Obtain gene confidence scores for all associated gene models to choose a sequence for complementation.

11 Customized data sets: Flower development. Get a list of all the genes in a mapping interval involved in flower development. PVV4.1NCC1

12 Customized data sets: Flower development. AT1G09000: MAP kinase kinase kinase activity, cellular component unknown, embryo, flower, flower development, kinase activity, leaf, petal differentiation and expansion stage, response to oxidative stress, root, seed, shoot apex, whole plant, D bilateral stage, E expanded cotyledon stage, F mature embryo stage. Choose gene models to express for complementation experiments...

13 Customized data sets: Flower development Obtain gene confidence scores for all associated gene models

14 Creating customized data sets. You work on a transcription factor that affects jasmonic acid (JA) biosynthesis. Do JA biosynthetic genes share common sequences in their promoters? (1) Obtain a list of all the genes involved in JA biosynthesis. (2) Get upstream promoter sequences. (3) Search for over-represented DNA sequences in promoters.

15 Customized data sets: JA biosynthesis jasmonic acid

16 Customized data sets: JA biosynthesis Take this gene list to TAIR... to get upstream sequences

17 Customized data sets: JA biosynthesis Get upstream promoter sequences

18 Customized data sets: JA biosynthesis. Search for over-represented or prevalent DNA sequences in promoters: use the Motif Analyzer in TAIR to identify common 6-mers. AT1G69490, AT1G48270, AT1G11870, AT1G12820
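
The idea of looking for common 6-mers in a set of promoters can be sketched in a few lines of Python. This is only an illustration of the concept, not the algorithm used by the Motif Analyzer in TAIR. The function names and the toy sequences are hypothetical, and the sketch scans the forward strand only, without any background-frequency statistics.

```python
from collections import Counter

def kmers_in(seq, k=6):
    """Return the set of distinct k-mers found in one sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shared_kmers(promoters, k=6, min_fraction=1.0):
    """Return the k-mers present in at least min_fraction of the promoters."""
    counts = Counter()
    for seq in promoters.values():
        # Each promoter contributes a k-mer at most once.
        counts.update(kmers_in(seq, k))
    needed = min_fraction * len(promoters)
    return sorted(kmer for kmer, n in counts.items() if n >= needed)

# Hypothetical toy sequences, NOT real upstream sequences from TAIR.
promoters = {
    "geneA": "TTACGTGGCAAT",
    "geneB": "GGACGTGGCTTA",
    "geneC": "CCCACGTGGCAA",
}

print(shared_kmers(promoters))
```

Lowering `min_fraction` (for example to 0.5) would also report 6-mers that occur in only some of the promoters.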

19 Creating customized data sets. You are studying a protein with an exciting new domain: Thr-x-Ala-x-Ile-x-Arg (TxAxIxR). Are there other TxAxIxR proteins? Do they share additional domains? (1) Find all of the proteins that have the TxAxIxR domain. (2) Identify all of the other domains found in those proteins.

20 Customized data sets: TxAxIxR proteins Find all of the proteins that have the TxAxIxR domain
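
As a rough illustration of what a TxAxIxR pattern search does, the pattern can be written as a regular expression and run over a set of protein sequences. This is a minimal sketch, assuming "x" means any single amino acid. The protein names and sequences are hypothetical, and this is not the search tool shown on the slide.

```python
import re

# Thr-x-Ala-x-Ile-x-Arg in one-letter code; "x" is assumed to be any residue.
# The lookahead lets overlapping occurrences be reported as well.
TXAXIXR = re.compile(r"(?=(T[A-Z]A[A-Z]I[A-Z]R))")

def find_motif(proteins):
    """Map each matching protein name to the 1-based start positions of the motif."""
    hits = {}
    for name, seq in proteins.items():
        positions = [m.start() + 1 for m in TXAXIXR.finditer(seq.upper())]
        if positions:
            hits[name] = positions
    return hits

# Hypothetical toy sequences, NOT real Arabidopsis proteins.
proteins = {
    "protein1": "MKTSAGIRRLLE",
    "protein2": "MSTAIRLLK",
    "protein3": "TLAVIKRGGTQAEIPR",
}

print(find_motif(proteins))
```

The resulting list of protein names is the kind of customized data set that can then be used to look up the other domains in those proteins.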

21 Customized data sets: TxAxIxR proteins Identify all of the other domains found in those proteins

22 Analyzing data sets. Sometimes you want to analyze data sets. We have a few analysis tools. Here, analyze = DISPLAY data in a visual manner with a few statistics; data must be pre-cleaned. If you want to display quantitative metabolic data on genes, enzymes or compounds: OMICs Viewer. If you want to look for over-represented annotations for a list of genes or proteins (e.g. all the genes up-regulated in a mutant, or all of the proteins found in the ovule): GO categorization tool.

23 GO categorization Classify your list of genes/proteins using GO annotations
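
Classifying a gene list by annotation amounts to counting how many genes carry each term. The sketch below shows that counting step only. The gene names and their annotations are hypothetical, and the code is not the TAIR GO categorization tool, which works from curated GO annotations.

```python
from collections import Counter

def categorize(annotations):
    """Return {term: (gene_count, percent_of_genes)} for a gene-to-terms mapping."""
    counts = Counter()
    for terms in annotations.values():
        # Count each gene at most once per term.
        counts.update(set(terms))
    total = len(annotations)
    return {term: (n, 100.0 * n / total) for term, n in counts.most_common()}

# Hypothetical annotations for illustration only.
annotations = {
    "geneA": ["flower development", "kinase activity"],
    "geneB": ["flower development"],
    "geneC": ["response to oxidative stress", "kinase activity"],
    "geneD": ["flower development"],
}

for term, (n, pct) in categorize(annotations).items():
    print(f"{term}: {n} genes ({pct:.0f}%)")
```

Deciding whether a term is over-represented would additionally require comparing these counts with the counts for the whole genome, which this sketch does not attempt.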

24 GO categorization

25 ... or use a tool at AmiGO (on hand-out)

26 Putting TAIR and the PMN to work for you. Use TAIR to find detailed information for specific genes or proteins: locus page, gene model page, protein page (many sections, many data types, many external links); GBrowse (many tracks); new gene confidence scores as part of the TAIR9 release. Use TAIR and the PMN to generate and work with customized data sets: create and add data to lists of proteins and genes (specific and Advanced Search pages, motif analysis tools, FTP files with large data sets); visualize and analyze data (OMICs Viewer at the PMN, GO categorization at TAIR). If you're having trouble getting any information you want...

27 We are here to help! Please visit us and ask questions at the Curation Booth! Workshop Part II: Practice sets and individual help

28 Acknowledgements TAIR, AraCyc, and the PMN Current Curators: - Tanya Berardini (lead curator – functional annotation) - David Swarbreck (lead curator – structural annotation) - Peifen Zhang (Director and lead curator- metabolism) - A. S. Karthikeyan (curator) - Philippe Lamesch (curator) - Donghui Li (curator) - Rajkumar Sasidharan (curator) Recent Past Contributors: - Debbie Alexander (curator) - Christophe Tissier (curator) - Hartmut Foerster (curator) Tech Team Members: - Bob Muller (Manager) - Larry Ploetz (Sys. Administrator) - Raymond Chetty - Anjo Chi - Vanessa Kirkup - Cynthia Lee - Tom Meyer - Shanker Singh - Chris Wilks Metabolic Pathway Software: - Peter Karp and SRI group Eva Huala (Director and Co-PI) Sue Rhee (PI and Co-PI)

29 Part I: Tips and techniques from curators Bonus slides... Kate Dreher TAIR, AraCyc, PMN Carnegie Institution for Science

30 Customized data sets: Flower development Find all the associated GO terms and PO terms and get evidence codes

31 Obtain a list of all the genes involved in JA biosynthesis Customized data sets: JA biosynthesis

32

33 Another option: use the pathway page

34 OMICs Viewer

35

36 Customized data sets: JA biosynthesis Experimental results provide a more detailed sequence: (A or T)C(A or C or G)TCGGT(G or T)A
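
The degenerate sequence above can be written in IUPAC nucleotide codes as WCVTCGGTKA (W = A or T, V = A, C or G, K = G or T) and turned into a regular expression for scanning promoters. The following is a minimal sketch. The helper names and the test sequence are hypothetical, and only the forward strand is searched.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(motif):
    """Compile an IUPAC motif string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))

# (A or T)C(A or C or G)TCGGT(G or T)A from the slide.
MOTIF = iupac_to_regex("WCVTCGGTKA")

def find_sites(seq):
    """Return (1-based start, matched sequence) for each forward-strand site."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(seq.upper())]

# Hypothetical toy sequence, NOT a real promoter.
promoter = "GGATCATCGGTGACCTTCGTCGGTTATT"

print(MOTIF.pattern)
print(find_sites(promoter))
```

To cover both strands, the same search could be repeated on the reverse complement of each promoter.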

