EDNA analyze Wang Ying & Huang Junman.


1 eDNA analyze Wang Ying & Huang Junman

2 Introduction
Environmental DNA (eDNA): trace DNA in samples such as water, soil, or faeces. eDNA is a mixture of potentially degraded DNA from many different organisms.

3

4 Sampling sites
Shanghai Ocean University
The East China Sea
Shanghai Ocean Aquarium
Yellow Mountain

5 Technology
1. Hybridization
2. PCR

6 MiFish, a set of universal PCR primers for metabarcoding environmental DNA from fishes: detection of more than 230 subtropical marine species

7

8 Data pre-processing
MiSeq reads
-> FastQC, SUGAR: low-quality reads out, high-quality reads kept
-> SolexaQA: tail-trimmed paired-end reads
-> FLASH: assembled reads
-> Perl: reads with ambiguous sites or unusual lengths out, reads of the expected size ( bp) kept
-> TagCleaner: remove primer sequences, convert FASTQ to FASTA
-> Pre-processed reads
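The filtering step for ambiguous sites and unusual lengths can be sketched in a few lines of Python. This is only an illustration of the idea, not the Perl script used in the pipeline; the function names and the length bounds are hypothetical, and the real expected amplicon size is not given on the slide.

```python
def keep_read(seq, min_len, max_len):
    """Return True if the read has no ambiguous bases and a plausible length."""
    seq = seq.upper()
    # Any character other than A, C, G, T (for example N) counts as ambiguous.
    if any(base not in "ACGT" for base in seq):
        return False
    return min_len <= len(seq) <= max_len


def filter_reads(reads, min_len, max_len):
    """Keep only the reads (name -> sequence) that pass keep_read."""
    return {name: seq for name, seq in reads.items()
            if keep_read(seq, min_len, max_len)}


# Toy reads; the bounds 8..12 are made up for this example.
reads = {"r1": "ACGTACGTAC", "r2": "ACGTNCGTAC", "r3": "ACG"}
kept = filter_reads(reads, min_len=8, max_len=12)
```

Here r2 is dropped for containing an N and r3 for being too short, so only r1 remains.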

9 Taxonomic assignment
Pre-processed reads
-> UCLUST "derep-fulllength": sequences with >=10 identical reads become processed reads; the rest are under-represented sequences
-> UCLUST "usearch-global": under-represented sequences are aligned pairwise against the processed reads and considered identical if the identity is >=99%
-> BLASTN (E-value threshold of 10^-5): identity >=97%, the same species; identity <97%, not the same species
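The thresholds on this slide can be written down as a small Python sketch. It only mimics the logic (dereplication with a copy-number cutoff, then an identity and E-value test); it does not call UCLUST or BLASTN, and the function names are hypothetical.

```python
from collections import Counter

MIN_COPIES = 10       # slide: >=10 identical reads
MIN_IDENTITY = 97.0   # slide: >=97% identity means the same species
MAX_EVALUE = 1e-5     # slide: E-value threshold of 10^-5


def dereplicate(reads, min_copies=MIN_COPIES):
    """Collapse identical reads and split them by copy number."""
    counts = Counter(reads)
    processed = {seq: n for seq, n in counts.items() if n >= min_copies}
    under_represented = {seq: n for seq, n in counts.items() if n < min_copies}
    return processed, under_represented


def is_same_species(identity, evalue):
    """Apply the identity and E-value cutoffs to one BLASTN top hit."""
    return identity >= MIN_IDENTITY and evalue <= MAX_EVALUE


# Toy input: one sequence seen 12 times, another seen 3 times.
reads = ["ACGT"] * 12 + ["ACGA"] * 3
processed, under_represented = dereplicate(reads)
```

The 99% rescue step for under-represented sequences is left out because it needs a pairwise aligner.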

10 Gene Bank
https://www.ncbi.nlm.nih.gov/genome/browse/
Vertebrata
Organelles

11 Download
NCBI gene bank -> Smith-Waterman -> 11 gene bank -> Add flank sequence -> Gene bank
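For readers unfamiliar with Smith-Waterman, the following is a minimal sketch of the local alignment score in Python. The scoring values (match 2, mismatch -1, linear gap -2) are assumptions chosen for illustration; the slides do not say which tool or parameters were used.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0, which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# A short query found inside a longer database sequence.
score = smith_waterman_score("TTACGTT", "ACG")
```

Only the score is computed here; recovering the aligned region would also need a traceback through the full matrix.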

12 Outline
Database: Smith-Waterman -> getTarget
Original data: Trim -> Rmrep (remove replicates) -> Assembly
BLAST -> Result -> Selection

13 Read & contig

14 Trim & Rmrep
Trim: low-quality reads, too-short reads, adapters
Rmrep: rmrep.pl
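A rough Python sketch of the Trim and Rmrep ideas is given below. It is not rmrep.pl, whose contents are not shown in the slides; the adapter sequence, the length cutoff, and the function names are all hypothetical, and quality-based trimming is omitted.

```python
def trim_adapter(seq, adapter):
    """Cut the read at the first occurrence of the adapter, if present."""
    idx = seq.find(adapter)
    return seq if idx == -1 else seq[:idx]


def trim_and_rmrep(reads, adapter, min_len):
    """Trim adapters, drop too-short reads, then drop exact duplicates."""
    seen = set()
    unique = []
    for seq in reads:
        seq = trim_adapter(seq, adapter)
        if len(seq) < min_len:
            continue            # too short after trimming
        if seq in seen:
            continue            # replicate of a read already kept
        seen.add(seq)
        unique.append(seq)
    return unique


# Toy reads with a made-up adapter sequence.
reads = ["ACGTACGTAGATCGG", "ACGTACGT", "ACG", "TTTTGGGGCC"]
cleaned = trim_and_rmrep(reads, adapter="AGATCGG", min_len=5)
```

The first two reads become identical after adapter trimming, so only one copy is kept.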

15 Assembly
Trinity
ABySS

16 getTarget seq
getarget.pl: use 11 target sequences of Oreochromis to find the target sequence (120 bp) + flank (300 bp) in the database
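Extracting a target region together with its flank can be sketched as follows. This is not getarget.pl; the function name is hypothetical, and it assumes the flank is added on each side of the hit, which the slide does not state.

```python
def extract_with_flank(seq, start, end, flank):
    """Return seq[start:end] extended by `flank` bases on both sides.

    start and end are 0-based, end exclusive; the window is clipped
    at the ends of the sequence.
    """
    left = max(0, start - flank)
    right = min(len(seq), end + flank)
    return seq[left:right]


# Toy database sequence with a 4-base "target" in the middle.
db_seq = "A" * 10 + "CCCC" + "G" * 10
region = extract_with_flank(db_seq, start=10, end=14, flank=3)
```

With the numbers from the slide, a call would pass the coordinates of the 120 bp hit and a flank of 300.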

17 BLAST

18 E-value
The lower the E-value, or the closer it is to zero, the more "significant" the match is. However, virtually identical short alignments can have relatively high E-values, because the calculation of the E-value takes into account the length of the query sequence. These high E-values make sense because shorter sequences have a higher probability of occurring in the database purely by chance.

19 Bit score The bit score gives an indication of how good the alignment is; the higher the score, the better the alignment. In general terms, this score is calculated from a formula that takes into account the alignment of similar or identical residues, as well as any gaps introduced to align the sequences.
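The link between bit score and E-value can be illustrated with the standard relation E = m * n * 2^(-S'), where S' is the bit score, m the query length, and n the database length. The sketch below uses that relation directly; BLAST itself works with effective lengths, so real reported values differ, and the numbers here are made up.

```python
def expect_value(bit_score, query_len, db_len):
    """Approximate E-value from a bit score: E = m * n * 2^(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical hits against the same database of 10^9 bases:
# a short alignment can only reach a low bit score, a long one a high score.
short_hit = expect_value(bit_score=40, query_len=20, db_len=10**9)
long_hit = expect_value(bit_score=200, query_len=120, db_len=10**9)
```

The short hit ends up with a much larger E-value than the long one, in line with the remark about short sequences on the E-value slide.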

20 Selection
Set threshold: identity >97%
Pick out the hit with the highest identity and the lowest E-value
Pick out the contig with more reads
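The selection rules can be sketched in Python as a filter followed by a ranked choice. The record layout (dictionaries with `identity`, `evalue`, and `reads` keys) and the function name are assumptions for illustration, and the order in which ties are broken is one plausible reading of the slide.

```python
def select_best(hits, min_identity=97.0):
    """Pick one hit: identity above the threshold, then highest identity,
    then lowest E-value, then the contig supported by more reads."""
    candidates = [h for h in hits if h["identity"] > min_identity]
    if not candidates:
        return None
    return max(candidates,
               key=lambda h: (h["identity"], -h["evalue"], h["reads"]))


# Made-up hits for one target sequence.
hits = [
    {"contig": "c1", "identity": 96.5, "evalue": 1e-50, "reads": 900},
    {"contig": "c2", "identity": 99.0, "evalue": 1e-40, "reads": 15},
    {"contig": "c3", "identity": 99.0, "evalue": 1e-40, "reads": 120},
]
best = select_best(hits)
```

Contig c1 fails the identity threshold; c2 and c3 tie on identity and E-value, so the one with more reads is chosen.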

21 Result
identity, e-value, contig length, number of reads, species name
Total number of species
Length of target seq & flank
Which target seq (Tn) each species corresponds to

