Useful shell commands: head/tail, cut, sort, uniq. Virginie Orgogozo, March 2011.


Grep

Prints the lines that contain the given pattern.

Options
-c  Show only a count of the matching lines.
-v  Show only the lines that do not match the pattern (inverted search).
-i  Ignore case.
-E  Use extended regular expressions. Put the pattern in quotes, use [] to indicate a character range, [[:space:]] for \s, and [[:digit:]] for \d.
-n  Show the line number of each match.
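The options above can be tried on a small file. This is only a sketch: the sample file and its four lines are made up for illustration and do not come from the course data.

```shell
# Hypothetical sample file, created only for this illustration.
tmp=$(mktemp)
printf 'ATOM 1 N ASN\natom 2 CA ASN\nHETATM 3 O HOH\nATOM 12 CB PRO\n' > "$tmp"

# -c: count the lines that contain "ATOM" (case-sensitive).
count_exact=$(grep -c "ATOM" "$tmp")

# -i with -c: the same count, ignoring case.
count_nocase=$(grep -ci "atom" "$tmp")

# -v: keep only the lines that do NOT contain "ATOM".
not_matching=$(grep -v "ATOM" "$tmp")

# -n: prefix each match with its line number.
numbered=$(grep -n "PRO" "$tmp")

# -E with character classes: ATOM lines whose number has exactly two digits.
two_digit=$(grep -E "^ATOM [[:digit:]]{2}[[:space:]]" "$tmp")

echo "exact: $count_exact, ignoring case: $count_nocase"
echo "$not_matching"
echo "$numbered"
echo "$two_digit"

rm -f "$tmp"
```

Combining single-letter options, as in -ci, is equivalent to writing -c -i.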

Agrep (Approximate grep)

Searches for a nearly exact match.

Options
-d "\>"  Use > as the delimiter between records rather than end-of-line.
-B -y  Return only the best match.
-2  Return results with up to this many mismatches (here 2) between the query and a record. The maximum allowed is 8.
-l  List only the names of the files that contain a match.
-i  Case-insensitive search.

Example
$ agrep -B -y -d "\>" CYG FPexcerpt.fta

Useful tips

How to write tab or enter characters in the shell?
Press Ctrl+V first and then the special character. "Enter" is represented by "^M".

How to search for negative numbers with grep?
Escape the leading minus sign so that grep does not read the pattern as an option:
$ grep "\-122" ctd.txt
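Both tips can also be applied in a script, where Ctrl+V is not available. The sketch below uses a made-up tab-separated file (not ctd.txt) and shows two standard alternatives to the backslash escape: the -e option, which marks the next argument as the pattern, and --, which ends option parsing.

```shell
# Hypothetical tab-separated file, created only for this illustration.
tmp=$(mktemp)
printf 'lat\tlon\n37.77\t-122.41\n48.85\t2.35\n' > "$tmp"

# -e tells grep that the next argument is the pattern, even if it starts with "-".
neg_with_e=$(grep -e "-122" "$tmp")

# "--" ends option parsing, so everything after it is a pattern or a file name.
neg_with_dashes=$(grep -- "-122" "$tmp")

# In a script, a literal tab can be produced with printf instead of Ctrl+V.
tab=$(printf '\t')
tab_then_minus=$(grep -c "${tab}-" "$tmp")

echo "$neg_with_e"
echo "$neg_with_dashes"
echo "lines with a tab followed by a minus sign: $tab_then_minus"

rm -f "$tmp"
```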

Cut Head/tail Grep Sort Uniq


Exercise

From structure_1sl8.pdb, obtain the number of amino acids of each type.

Input (structure_1sl8.pdb):
HEADER LUMINESCENT PROTEIN 05-MAR-04 1SL8
TITLE CALCIUM-LOADED APO-AEQUORIN FROM AEQUOREA VICTORIA
COMPND MOL_ID: 1;
COMPND 2 MOLECULE: AEQUORIN 1;
COMPND 3 CHAIN: A;
COMPND 4 ENGINEERED: YES
(...)
ATOM 1 N ASN A N
ATOM 2 CA ASN A C
ATOM 3 C ASN A C
ATOM 4 O ASN A O
ATOM 5 CB ASN A C
ATOM 6 CG ASN A C
(...)

Solution:
$ grep "^ATOM" structure_1sl8.pdb | cut -c 18-21,24-26 | sort -u | cut -c 1-3 | uniq -c | sort -nr

After grep and the first cut (residue name and residue number of every atom):
ASN 11
PRO 12
LYS 13
(...)

After sort -u (one line per residue):
ALA 113
ALA 125
ALA 133
ALA 138
ALA 181
ALA 189
ALA 42
ALA 52
ALA 57
ALA 63
ALA 66
ALA 71
ALA 92
ARG 108
ARG 152
ARG 169
ARG 17
ARG 32
ARG 59
ARG 90
ARG 98
ASN 102
ASN 11
ASN 123
(...)

After the second cut, uniq -c and sort -nr (number of residues of each type, most frequent first):
16 GLU
15 GLY
15 ASP
13 LYS
13 ILE
13 ALA
12 LEU
9 SER
8 VAL
8 ASN
8 ARG
7 TYR
7 THR
7 PHE
6 TRP
6 GLN
5 PRO
5 MET
5 HIS
3 CYS

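The same pipeline can be tested without the real structure file. The sketch below builds a hypothetical miniature PDB-style file (the residues and atom names are made up), assuming the usual fixed-width ATOM layout with the residue name in columns 18-20 and the residue number in columns 23-26, which is what the cut -c ranges rely on.

```shell
# Hypothetical miniature PDB-style file; residues and atoms are made up.
pdb=$(mktemp)
{
  printf 'HEADER    TOY EXAMPLE\n'
  # The format is reused for every group of five arguments:
  # serial number, atom name, residue name, chain, residue number.
  printf 'ATOM  %5d %-4s %3s %s%4d\n' \
    1 N  ASN A 11 \
    2 CA ASN A 11 \
    3 N  PRO A 12 \
    4 CA PRO A 12 \
    5 N  LYS A 13 \
    6 N  ASN A 102 \
    7 CA ASN A 102
} > "$pdb"

# Keep the ATOM records, extract residue name and number, one line per residue.
residues=$(grep "^ATOM" "$pdb" | cut -c 18-21,24-26 | sort -u)

# Keep only the residue name, count each type, most frequent first.
counts=$(printf '%s\n' "$residues" | cut -c 1-3 | uniq -c | sort -nr)

# Summary values: total number of residues and the most frequent type.
n_residues=$(printf '%s\n' "$residues" | wc -l | tr -d ' ')
top=$(printf '%s\n' "$counts" | head -n 1 | awk '{print $1, $2}')

printf '%s\n' "$counts"
echo "residues: $n_residues, most frequent: $top"

rm -f "$pdb"
```

The sort -u step matters because each residue appears once per atom. Without it, uniq -c would count atoms rather than amino acids. Note also that cut -c 24-26 keeps only the last three characters of the residue number, which is sufficient as long as residue numbers stay below 1000.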