Symposium on Applied Bioinformatics

1 Symposium on Applied Bioinformatics
Rick Pascual

2

3 Where do I start?
Claimed reads from Bear Paw Hot Spring:
BPHS.AOIX1739-b2 DEFINE “b2”
BPHS.AOIX1739-g2 DEFINE “g2”
How are they related to the other Bear Paw reads?

4 Initial Alignment of Bear Paw Reads
A large portion at the beginning of each sequence matched up
The matching region averaged 60 nucleotides in length
Mostly 100% identity
Alignment of the reads further confirmed the findings

5 Print-Out of Finding

6

7 Going to Schoenfeld et al.
Viral DNA was physically sheared to 3 to 6 kb
Fragments were ligated to a double-stranded linker
Found that the aligned region matched one of the linker sequences:
5’-GGAGCAGTATCAGATACAAGCGGCCGCATC-3’

8 However…
Not all the reads matched exactly
Many had variations
Some did not seem to have the linker
Many had “N” nucleotides in their first parts
How do I take out the garbage?

9 W.W.J.D.? What Would Jeff Do?

10 Started out small…
Took my reads and found ways of cutting out the “garbage”
Did a COUNT-OF the exact matches of the linker among my reads
Didn’t find anything
Same results when applied to the Bear Paw reads

11 5’-GGAGCAGTATCAGATACAAGCGGCCGCATC-3’
6170 out of 8352 Bear Paw reads had only one exact match
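
The kind of tally reported on this slide can be sketched in Python. This is only an illustration, not the BioBIKE COUNT-OF call used in the talk: the function names and the toy reads are hypothetical, and the real input would be the 8352 Bear Paw reads.

```python
# Linker sequence as given on the slide.
LINKER = "GGAGCAGTATCAGATACAAGCGGCCGCATC"


def count_occurrences(read, pattern):
    """Count exact (possibly overlapping) occurrences of pattern in read."""
    count = 0
    start = read.find(pattern)
    while start != -1:
        count += 1
        start = read.find(pattern, start + 1)
    return count


def reads_with_single_match(reads, pattern=LINKER):
    """Number of reads that contain the pattern exactly once."""
    return sum(1 for r in reads if count_occurrences(r.upper(), pattern) == 1)


# Toy reads (made up for illustration only).
toy_reads = [
    LINKER + "ACGTACGT",                 # one match
    "TTTT" + LINKER + "AAAA" + LINKER,   # two matches
    "ACGTNNNNACGT",                      # no match
]
print(reads_with_single_match(toy_reads))
```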

12 BioBIKE
Created a loop function
Looked for the segment (5’-GCCGCATC-3’) in each read
If a match was found, took the sequence of the read from the calculated coordinate to the end
Collected the trimmed reads and returned them in a list
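
A rough Python equivalent of the loop described above might look like the following. The original was written in BioBIKE and is not shown in the transcript, so this is a sketch under assumptions: the function name is hypothetical, and the "calculated coordinate" is assumed to be the first position after the matched segment.

```python
# Short linker segment searched for in each read (from the slide).
SEGMENT = "GCCGCATC"


def trim_after_segment(reads, segment=SEGMENT):
    """Return the part of each read that follows the first segment match.

    Reads without a match are dropped, which mirrors the loss of reads
    noted later in the talk.
    """
    trimmed = []
    for read in reads:
        pos = read.find(segment)
        if pos == -1:
            continue  # no linker segment found: skip this read
        # Assumption: keep everything after the end of the match.
        trimmed.append(read[pos + len(segment):])
    return trimmed


print(trim_after_segment(["TT" + SEGMENT + "ACGT", "AAAA"]))
```

Only the forward orientation of the segment is searched here, so a read carrying the linker on the opposite strand would be discarded.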

13

14 Analysis of loop program
Worked on individual reads, as confirmed by aligning them to the edited reads
Also had the ends taken out; analysis and alignment of the back ends of several Bear Paw reads showed no significant evidence of similarity
Not as reliable as hoped for
Did not discriminate between forward and backward matches
Threw out a good amount of reads (~30%)
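
One possible way to tell forward from backward matches, which the loop did not do, is to search for the segment and for its reverse complement separately. The sketch below is not part of the original program; the function names and labels are hypothetical.

```python
# Translation table for complementing DNA; N stays N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def classify_match(read, segment="GCCGCATC"):
    """Label a read by the orientation(s) in which the segment occurs."""
    forward = segment in read
    backward = reverse_complement(segment) in read
    if forward and backward:
        return "both"
    if forward:
        return "forward"
    if backward:
        return "reverse"
    return "none"


print(reverse_complement("GCCGCATC"))
print(classify_match("TTGATGCGGCAA"))
```

Reads labeled "reverse" or "both" could then be trimmed differently or set aside for inspection rather than being handled like forward matches.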

15 Working with Edited Reads
SEQUENCE-SIMILAR-TO showed similarities (~80% identity) around the end and beginning regions of different reads

16 BPHSe.AOIX1739-b2

17 Initial Problems…
Similarities seemed to be in the middle portions of the reads and not at the very ends
Match was not 100%
Overlapping didn’t seem possible
Didn’t know how to extend reads

18

19 BPHSe.AOIX1739-b2

20

21 Linkage

22 Linkage

23 Analysis
Area had a high level of consensus
Over the four reads, there were many instances where three reads had the same nucleotide at the same coordinate
Used the alignment of the reads to create a segment that incorporated all the consensus coordinates as well as the most frequent nucleotide where the reads disagreed
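
The consensus step can be illustrated with a simple majority vote per alignment column. This is a minimal sketch, not the procedure used in the talk: it assumes the reads are already aligned to equal length with "-" for gaps, and the choice to emit "N" on a tie is an assumption made here.

```python
from collections import Counter


def consensus(aligned_reads):
    """Majority-vote consensus over equal-length, gap-padded reads."""
    if len({len(r) for r in aligned_reads}) != 1:
        raise ValueError("reads must be non-empty and of equal aligned length")
    result = []
    for column in zip(*aligned_reads):
        # Rank the bases in this column by frequency, ignoring gaps.
        ranked = Counter(b for b in column if b != "-").most_common()
        if not ranked:
            result.append("-")   # column contains only gaps
        elif len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            result.append("N")   # tie: no dominant nucleotide
        else:
            result.append(ranked[0][0])
    return "".join(result)


# Four toy aligned reads; at each position at least three agree.
print(consensus(["ACGT", "ACGA", "ACTT", "AGGT"]))
```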

24 Analysis

25 Analysis
Combined the segment (222 nt) with the linked reads into a longer contig
Looked for ORFs through NCBI and GeneMark
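
To show what an ORF search does at its simplest, here is a toy scan for ATG-to-stop stretches in the three forward reading frames. It is only a sketch: the NCBI and GeneMark tools used in the talk are far more sophisticated (GeneMark relies on statistical gene models), and the function name, the `min_codons` parameter, and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Return (start, end) 1-based inclusive coordinates of forward ORFs.

    An ORF here runs from the first ATG in a frame to the next in-frame
    stop codon; its length in codons includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start + 1, i + 3))
                start = None
    return orfs


# Toy sequence: ATG, three GCA codons, then a TAA stop.
toy = "CC" + "ATG" + "GCA" * 3 + "TAA" + "GG"
print(find_orfs(toy, min_codons=2))
```

The reverse strand is not scanned here; running the same function on the reverse complement of the contig would cover it.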

26 Analysis

27 Analysis

28 Analysis
GeneMark showed signs of a gene of definite length (from coordinate 821 to 973)
Results from BLAST were null

29 Future Directions
Extend contigs to be able to find more genes for the map
Use more reads beyond the claimed reads
Use information from the alignment to build a phylogenetic tree
Conduct more analyses on various aspects of the reads

30 THANK YOU!

