MCB3895-004 Lecture #3 Sept 2/14 Intro to UNIX terminal.

1 MCB3895-004 Lecture #3 Sept 2/14 Intro to UNIX terminal

2 Introduction to UNIX: Nearly all bioinformatics software runs on UNIX and its derivatives (e.g., LINUX and Mac OS); very little runs on Windows. Bioinformatics is very strongly tied to the open-source software movement: lots of help is available online, and most programs are free. Windows is not very open-source friendly.

3 Windows users: Option 1: Do all of your work connected to the Biotechnology Cluster server. Download sshclient (ftp://ftp.uconn.edu/restricted/ssh/). Option 2: Install LINUX to run in parallel with Windows (e.g., Biolinux, http://nebc.nerc.ac.uk/tools/bio-linux).

4 Terminal: The terminal is the primary way to do computational biology. Mac: Applications/Utilities/Terminal. Linux: Applications/Accessories/Terminal. Windows: sshclient.

5 Assignment: A handy resource for learning the basics of UNIX is the “Unix and Perl Primer for Biologists” (http://korflab.ucdavis.edu/Unix_and_Perl/unix_and_perl_v3.1.1.pdf). The commands it demonstrates mainly involve creating, removing, and moving around files and directories. Once you learn them, these commands will take you far beyond what you can do with a more familiar GUI like Mac Finder or Windows Explorer.

6 Worthy of special comment: 1. Directory trees. 2. Using tab to autocomplete. 3. Wildcard characters like * to perform the same operation on multiple files (this is insanely useful once you get the hang of it!). 4. Using nano as a very basic text editor. Never, ever, ever use Word for this! 5. Use underscores “_”, not spaces, in your filenames.
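A minimal sketch of the wildcard and underscore points, run in a throwaway directory. The file names (sample_1.txt, notes.md, text_files) are made up for illustration.

```shell
# Work in a temporary directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Underscores instead of spaces keep names easy to type and to match.
touch sample_1.txt sample_2.txt notes.md

# The shell expands *.txt to every matching filename before ls runs.
txt_files=$(ls *.txt)
echo "$txt_files"

# One command acts on all matching files at once.
mkdir text_files
mv *.txt text_files/
ls text_files
```

Only the two .txt files are moved; notes.md stays where it was because it does not match the pattern.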

7 Directory trees: All computer files are organized hierarchically. Each folder has an address, e.g., /Users/Jonathan/Laptop_backup/Desktop/e-Books

9 A quick reference to where you are in UNIX: “/” - root; “~” - your user home directory; “.” - “here”, the directory you are in now; “../” - one level up in the directory tree.
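These shortcuts can be tried out with cd. The sketch below builds a small, hypothetical tree (project/data/raw) in a temporary location and moves around in it; the directory names are assumptions chosen for the example.

```shell
# Build a small tree: project/data/raw
base=$(mktemp -d)
mkdir -p "$base/project/data/raw"

cd "$base/project/data/raw"
cd ..                        # one level up: now in project/data
up_one=$(basename "$PWD")

cd ../..                     # two levels up: now in $base
cd ./project                 # "./" means "starting from here"
here=$(basename "$PWD")

cd ~                         # "~" is your home directory
pwd

cd /                         # "/" is the root of the whole tree
root_dir=$PWD
echo "$up_one $here $root_dir"
```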

10 More UNIX tricks: “>” (greater than) redirects the output of a command into a new file, e.g., ls * > list. A list of the files in this directory is now stored in the file “list”. If “list” already exists, its contents are overwritten.
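A short sketch of redirection in a temporary directory. The .fasta file names are hypothetical, and the pattern *.fasta is used instead of * only to keep the example output predictable.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch genome_a.fasta genome_b.fasta

# Send the listing to a file instead of the screen.
ls *.fasta > list
first_listing=$(cat list)
echo "$first_listing"

# Using ">" on the same file again replaces the old contents.
echo "new contents" > list
cat list
```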

11 More UNIX tricks: cat joins multiple files together, e.g., cat file1 file2 > file3. file3 contains file1 and file2 joined together; file1 and file2 still exist as they were.
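The cat example can be reproduced with two one-line files. The file contents here are invented for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf 'line from file1\n' > file1
printf 'line from file2\n' > file2

# Join the two files, in order, into a third file.
cat file1 file2 > file3

# With a single argument, cat simply prints that file.
cat file3
```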

12 More UNIX tricks: grep extracts all lines containing a particular pattern from a file, e.g., grep "NP_" file1. Prints every line that contains the pattern “NP_” to the screen.
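A small sketch of grep on a FASTA-style file. The sequence identifiers and sequences below are made up for the example and do not refer to real records.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical FASTA-style file: header lines start with ">".
printf '>NP_000001.1 example protein A\nMKTAYIAK\n' > file1
printf '>XP_000002.1 example protein B\nMSLLTEVE\n' >> file1
printf '>NP_000003.1 example protein C\nMADEEKLP\n' >> file1

# Keep only the lines that contain the pattern NP_.
matches=$(grep "NP_" file1)
echo "$matches"
```

Note that the pattern is wrapped in plain straight quotes; curly quotes pasted from a slide or word processor would be treated as part of the pattern.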

13 More UNIX tricks: wc counts the newlines, words, and bytes in a file, e.g., wc file1. Prints an output like this: 10602 18921 752002 file1 (newlines, words, bytes, filename).
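To see what each wc column means, it helps to run it on a file small enough to count by hand. This sketch uses an invented two-line file.

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf 'one two three\nfour five\n' > file1

# All three counts at once: newlines, words, bytes, then the filename.
wc file1

# Individual counts; reading via "<" keeps the filename out of the output.
n_lines=$(wc -l < file1)
n_words=$(wc -w < file1)
n_bytes=$(wc -c < file1)
echo "lines=$n_lines words=$n_words bytes=$n_bytes"
```

The file has 2 newlines, 5 words, and 24 bytes; the exact spacing of the output differs between systems.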

14 More UNIX tricks: “|” (pipe) directs the output of one command into another, e.g., grep "NP_" file1 | wc. Sends the output of the grep command into wc; because grep extracts lines from a file, this can be used to count the number of lines matching the grep expression. e.g., grep "NP_" file1 | less. Displays the grep result as a list you can scroll through.
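A sketch of counting matches with a pipe, again on a made-up FASTA-style file. The second pipeline chains three commands and uses grep's -v flag, which keeps the lines that do not match.

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf '>NP_000001.1 a\nMKT\n>XP_000002.1 b\nMSL\n>NP_000003.1 c\nMAD\n' > file1

# Count the lines that contain NP_.
n_np=$(grep "NP_" file1 | wc -l)
echo "$n_np"

# Pipes can be chained: take header lines, drop the NP_ ones, count the rest.
n_other=$(grep ">" file1 | grep -v "NP_" | wc -l)
echo "$n_other"
```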

15 More UNIX tricks: gzip / gunzip: single file compression. e.g., gunzip file.txt.gz decompresses to file.txt and removes file.txt.gz. e.g., gzip file.txt creates the compressed file file.txt.gz and removes file.txt.
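The round trip can be checked in a temporary directory. The file contents are invented; the point is that each command replaces its input file with the converted one.

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf 'ACGTACGT\n' > file.txt

# Compress: file.txt is replaced by file.txt.gz.
gzip file.txt
after_gzip=$(ls)
echo "$after_gzip"

# Decompress: file.txt.gz is replaced by file.txt.
gunzip file.txt.gz
after_gunzip=$(ls)
echo "$after_gunzip"
cat file.txt
```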

16 More UNIX tricks: tar: file archive management. e.g., tar -cf all.tar * creates the tar archive all.tar containing all files in that directory; the individual files are unchanged. e.g., tar -xf all.tar extracts all files from the tar archive all.tar to the current directory; all.tar is not deleted. tar is very commonly used before gzip, producing “tarballs”.
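A sketch of making and unpacking a tarball, under the assumption of a hypothetical directory to_archive holding two small files. The archive is written to the parent directory here so that a repeated run cannot pick up all.tar itself through the * wildcard.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir to_archive restored
printf 'a\n' > to_archive/file1
printf 'b\n' > to_archive/file2

# Bundle every file in to_archive into one archive in the parent directory.
cd to_archive
tar -cf ../all.tar *
cd ..

# -t lists the archive contents without extracting anything.
contents=$(tar -tf all.tar)
echo "$contents"

# tar then gzip gives a compressed "tarball"; reverse the steps to unpack.
gzip all.tar
gunzip all.tar.gz
cd restored
tar -xf ../all.tar
ls
```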

17 Connecting to the Bioinformatics facility server: Use the UNIX command ssh, e.g., ssh -l jlklassen bbcsrv3.biotech.uconn.edu. It will ask for a password. The first time you connect, it will ask you to confirm the server's RSA key (a security feature). Your terminal now controls the bioinformatics facility server, not your own machine. You can have multiple terminals open at the same time.

18 Transferring files to the Bioinformatics facility server: Method 1: Filezilla (https://filezilla-project.org/). Nice GUI. Works on all platforms. Install the client, not the server.

19 Transferring files to the Bioinformatics facility server: Method 2: UNIX command scp. e.g., scp all.tar jlklassen@bbcsrv3.biotech.uconn.edu:all.tar (copies all.tar from my computer to the biotech server). e.g., scp -r jlklassen@bbcsrv3.biotech.uconn.edu:dir/ . (copies the directory “dir” from the biotech server to the current working directory). The “-r” flag indicates “recursive”, needed for directories.

20 Text editors: Using nano works, but it can be cumbersome for complex tasks. Word is always bad! It adds layers you don’t see. Mac and LINUX have TextEdit and Gedit as default text editors; both work well. Windows: Notepad and Wordpad are insufficient. I suggest downloading Gedit for Windows (https://wiki.gnome.org/Apps/Gedit). Other options exist for all platforms.

21 Assignment: See instructions posted on the website at http://wp.mcb3895.mcb.uconn.edu. Part 1: work through Korf manual sections U1-U27 (some commands require external files; ignore these but understand what they do). Part 2: log on to the Biotech server, download a genome from NCBI, and answer the questions given. The assignment is due at the start of class 1 week from today.

22 Command line power! The simplest way to download these data is to use the terminal command wget: $ wget -r --no-directories --retr-symlinks -P Acaricomes_phytoseiuli/ ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/bacteria/Acaricomes_phytoseiuli/latest_assembly_versions/GCF_000376245.1_ASM37624v1/ Deconstructed: -r: “recursive”, i.e., download everything in this directory. --no-directories: do not recreate the entire ftp directory structure. --retr-symlinks: NCBI uses a fancy file structure based on “symbolic links”, where a file points to another file somewhere else; “--retr-symlinks” gets the actual files, not just the links. -P Acaricomes_phytoseiuli/: where to put the output.

