Presentation is loading. Please wait.

Presentation is loading. Please wait.

The Linux Command Line Interface

Similar presentations


Presentation on theme: "The Linux Command Line Interface"— Presentation transcript:

1 The Linux Command Line Interface
Basics of Linux I The Linux Command Line Interface Today we’re going to look at the Linux command line and shell scripting. This is a really broad topic so we’re only going to be able to touch on many of the details. It’s also quite a difficult session to pitch, as we have users of all abilities here today. [web] portal.biohpc.swmed.edu [ ] Updated for

2 Study Resources : A Free Book
500+ pages * Some of the materials covered in today’s training is from this book

3 Study Resources : Tutorial Website

4 Study Resources : Cheat Sheets
On the portal Training -> Slides & Handouts

5 Study Resources : Follow Along…
You can follow along using: the Nucleus Web terminal on the BioHPC portal: Putty or SSH from your PC Terminal from your MacBook ssh

6 Study Resources: Man pages
Get a command’s help page: man <command> Press q or Ctrl-C to exit the man page

7 The Terminal Not too long ago….. Computers were primarily found in research centers, business, educational institutions, and libraries. Access points to these computers were called terminals. Simple keyboard and monitor interface Computer may be a small, single unit or part of a larger network Many of these computers ran a licensed UNIX operating system developed by AT&T

8 Modern Unix Descendants
GNU/Linux Created by: Linus Torvalds and thousands of community developers 1991-today macOS Created by: Apple, Inc. 2001-today Android Created by: Google Forked from Linux 2008-today 2001-today

9 What operating system do BioHPC machines primarily run on?
Red Hat Enterprise Linux 7.4 Created by Red Hat, Inc. GNU/Linux distribution Linux Kernel 3.10 Gnome 3 Desktop Environment Bourne-Again Shell (bash)

10 SSH – Secure Shell The majority of your interactions with the Nucleus cluster will likely be through SSH. Most modern GNU/Linux distributions have an OpenSSH client installed by default. Mac OS X also has SSH. PuTTY is recommended for MS Windows. ssh Encrypted traffic

11 The Text (Command-Line Interface) Shell
The interaction between user and the operating system is provide by a shell. The shell accepts keyboard commands and hands them off to the operating system. The BioHPC default shell is Bash – the Bourne-Again Shell. ~]$ echo –n Hello, world! user name arguments host name A command, typed at the Unix command prompt, looks something like this: The shell is scary, but a little knowledge can make difficult things easy, and time-consuming things quick. working directory options/switches name of the command

12 Logging into Nucleus – Where Am I?
~]$ pwd – print working directory ~]pwd /home2/s178337 ls – list contents of a directory ~]ls ~]ls /home2/s178337

13 Linux Basics: The File System
Files on a Linux system are arranged in a hierarchical directory structure. / apps bin home2 project work {uid} {department} {PI_lab} {uid or project_name} shared tmp

14 Navigating the file system
How does one change their working directory? cd – change directory Let’s try accessing our personal work directory. ~]$ cd /work/biohpcadmin/s178337

15 Linux Command Line: Files and Directories
Files and directories may be referenced by an absolute or relative path Absolute path—specify the location of a file or directory from / (root directory) cd /work/biohpcadmin Pros: You know exactly where you are going! Cons: Tedious if there are many nested folders Relative path— paths relative to your working directory Working directory: Up one folder: Relative path of subfolder: s178337 / work biohpcadmin s178337 (I am here) Demonstrate chaining relative paths

16 Determining your storage quota
~]$ quota -s Disk quotas for user s (uid ): Filesystem space quota limit grace files quota limit grace lysosomehome:/home2 42872M M M k ~]$ biohpc_quota Current BioHPC Storage Quotas: FILE | SPACE USAGE | NUMBER OF FILES SYSTEM | USED SOFT HARD | USED SOFT HARD | | User quotas for s178337 home2 | M M M | k work | G T T | Group quotas for biohpc_admin project | T k k | archive | T T T |

17 How much storage is a directory occupying?
Example: cd /work/biohpcadmin/s178337/gutenberg What’s inside? gutenberg]$ ls inefficient_reader.py~ large_txt.bin cat.txt multithreaded_efficient_reader.py cat_txt.py multithreaded_efficient_reader.py~ cat_txt.py~ multithreaded_inefficient_reader.py efficient_reader.py multithreaded_inefficient_reader.py~ efficient_reader.py~ readme.txt inefficient_efficient_readers.tar.gz README.txt~ inefficient_reader.py slurm_cat.sh How much space does this directory and all of its contents use? digits_jobs]$ du -hs . 23G . disk usage human-readable; summarize

18 Exploring the file system
~]$ cd /project/shared/biohpc_training Print the contents of the directory using the long format. ~]$ ls –l biohpc_training]$ ls -l total -rwxr-xr–x 1 root root Feb 5 11:50 c475_r0ck_4m_1_r16h7.txt -rwxr-xr–x 1 root root Feb 5 13:48 HD728.R1.fastq.gz -rwxr-xr–x 1 root root Oct 31 11:34 large_txt.bin -rwxr-xr–x 1 root root Feb 5 13:49 regions.txt permissions file name owner Size (bytes) group ownership last modified time

19 Exploring the file system
What type of file is c475_r0ck_4m_1_r16h7.txt ? biohpc_training]$ file c475_r0ck_4m_1_r16h7.txt c475_r0ck_4m_1_r16h7.txt: ASCII text Let’s concatenate the contents of this file to the standard output of the terminal. (this just one way to print a file to the terminal) biohpc_training]$ cat c475_r0ck_4m_1_r16h7.txt Cats rock! Am I right? While file extensions help with organization, we can still determine what a file is without an extension. What type of file is RJ_WS? biohpc_training]$ file RJ_WS RJ_WS: ASCII text

20 Using bash auto-completion.
Bash has a very useful auto-completion shortcut for typing commands more quickly. Give it a try! Try to type: cat /project/shared/biohpc_training/c475_r0ck_4m_1_r16h7.txt Using bash auto-completion.

21 Viewing large text files
A file does not have to be very large before concatenating them to the standard output becomes unhelpful. To illustrate, let’s examine HD728.R1.fastq.gz. The file extension has a .fastq.gz file extension, but what does using file produce? biohpc_training]$ file HD728.R1.fastq.gz HD728.R1.fastq.gz: gzip compressed data, was "HD728.R1.fastq", from Unix, last modified: Sun May 13 07:25: The file is a compressed file. It’s contents would be unreadable to us. Let’s decompress the file first using gzip. $ gzip -cd HD728.R1.fastq.gz > HD728.R1.fastq Exercise: Using the program wc, count how many lines of text are inside HD728.R1.fastq How would one access information on how to use this program?

22 Viewing large text files
biohpc_training]$ wc -l HD728.R1.fastq HD728.R1.fastq Considering that most terminal emulators default to a size of 80 x 24 characters, trying to cat 2M+ lines to the standard output is going to be a bit problematic. Let’s try it anyway…

23 Interrupting a Running Program
What happens if I need to kill a program that is running? Pressing CTRL + C will send an interruption signal (SIGINT) to the program which usually kills it. If not….you should consult man kill

24 Not always practical to print an entire file to the shell.
Head, Tail, More, Less Not always practical to print an entire file to the shell. Commands: head – Print the first 10 lines of each FILE to standard output tail – Print the first 10 lines of each FILE to standard output. Exercise: Print the first 50 lines of HD728.R1.fastq! Hint: man head What if one wanted just wanted to peruse the file without using cat? Commands: more – a filter for paging through text one screenful at a time less – like more, but allows both backward movement and forward movement

25 Standard Streams

26 Redirection What if one wanted to take the stdout of a command and save it to a file? process stdin stderr “>” redirects stdout to a file $ ls /usr/bin > ls_out.txt “>>” appends stdout to a file $ echo “EOF” >> ls_out.txt Why do we see stdout for: ? $ ls /bin/usr > ls_out.txt stdout

27 What if one wanted to redirect both stdout and stderr to a file?
Redirection What if one wanted to redirect both stdout and stderr to a file? process stdin stderr Method 1: Two redirections: stderr to stdout using file descriptors and then stdout redirected to file stdout numeric file descriptor: 1 stderr numeric file descriptor: 2 $ ls –l /bin/usr > out.txt 2>$1 Method 2: Single redirection $ ls –l /bin/usr &> out.txt stdout

28 What if I DON’T CARE AT ALL about the stdout and stderr ?!?!?!
Redirection What if I DON’T CARE AT ALL about the stdout and stderr ?!?!?! process stdin We can simply redirect stdout and stderr to a special place where only things can go in and never come out. $ ls –l /bin &> /dev/null stderr stdout

29 Cryptic commands! Cheat sheet on the portal.
Text Editors Editors vi / vim Cryptic commands! Cheat sheet on the portal. Quick tutorial: Emacs An extensible, customizable text editor Quick tutorial: nano Easier to use. Quick tutorial: Any text editor from your PC or Mac Mount your directories as network drives Emacs is about the same functionality and difficulty as vi/vim

30 $ ls -l Permissions Owner Group Permissions
drwxr-xr-x 4 dtrudgian biohpc_admin 58 Feb 16 15:13 all_training drwxr-xr-x 7 dtrudgian biohpc_admin 140 Feb 12 10:36 Apps drwxr-xr-x 2 dtrudgian biohpc_admin 26 Feb 16 15:12 cli_training drwxr-xr-x 8 dtrudgian biohpc_admin 4.0K Feb 16 14:25 Cluster_Installs drwxr-xr-x 3 dtrudgian biohpc_admin 4.0K Feb 16 11:49 Desktop drwxr-xr-x 2 dtrudgian biohpc_admin 10 Feb 16 14:10 Documents drwxr-xr-x 9 dtrudgian biohpc_admin 135 Feb 16 14:32 Downloads -rw-r--r-- 1 dtrudgian biohpc_admin 336 Feb 16 15:16 error.txt drwxr-xr-x 10 dtrudgian biohpc_admin 4.0K Feb 9 12:45 Git drwxr-xr-x 17 dtrudgian biohpc_admin 4.0K Feb 16 15:17 ownCloud drwxr-xr-x 2 dtrudgian biohpc_admin 10 Feb 16 14:18 Pictures drwxr-xr-x 5 dtrudgian biohpc_admin 102 Feb 4 11:19 portal_jobs Permissions Owner Group

31 File Permissions

32 Add up the permissions you need for each class, e.g. rx = 5 rw = 6
Octal Permissions Add up the permissions you need for each class, e.g. rx = 5 rw = 6 rwx = 7 r = 4 w = 2 x = 1 -rw-r--r-- 1 dtrudgian biohpc_admin 336 Feb 16 15:16 error.txt Owner can read+write Group can read Others can read

33 chmod u/g/a +/- r/w/x filename
Permissions chmod u/g/a +/- r/w/x filename Class: u = user (owner) g = group a = all r read w write x execute + Add permission - Remove permission chmod g+rw test.txt Add read/write permissions for the group chmod a+x script.sh Add execute permission for everyone chmod g-x script.sh Remove execute permission for the group chmod 700 script.sh -rwx------ chmod 640 script.sh –rw-r-----

34 Demo - Creating a shared folder
Public Sequencing data /archive/shared/biohpc_training/sample_data/vc_calling_session2 Create a copy that is readable and writeable by your only your primary group in /project/department/lab/shared/sequencing_data Move all bam files into /project/<department>/<lab>/shared/sequencing_data/bam Move all fastq.gz files into /project/<department>/<lab>/shared/sequencing_data/fastq Remove any additional files in /archive/shared/biohpc_training/sample_data/vc_calling_session2

35 Create an empty directory
Copying data Create an empty directory $ mkdir –p /project/biohpcadmin/shared/sequencing_data (-p ensures parent folders are created if not already present) Copy each everything recursively (-r) from the source to destination $ cp -r /archive/shared/biohpc_training/sample_data/vc_calling_session2/* /project/biohpcadmin/shared/sequencing_data/ OR Copy the entire folder recursively as: $ cp -r /archive/shared/biohpc_training/sample_data/vc_calling_session2 /project/biohpcadmin/shared/sequencing_data

36 To move files, one can use the mv command.
Moving data To move files, one can use the mv command. Note: mv will attempt to preserve original permissions Moving all .bam files $ cd /project/biohpcadmin/shared/sequencing_data $ mkdir bam $ mv *.bam bam/ Moving all .fastq files $ mkdir fastq $ mv *.fastq fastq/

37 Be very cautious of your ability to destroy files!
Deleting Files Be very cautious of your ability to destroy files! There is no Recycling Bin to restore your files. Once files are deleted by the CLI, it is generally difficult to recover them. Make sure important data is backed up! rm is used similarly to cp and mv To delete everything in a folder (non-recursively) $ rm /project/biohpcadmin/shared/sequencing_data/* To delete a folder recursively $ rm –r /project/biohpcadmin/shared/sequencing_data To delete objects interactively (slowest, but safest) $ rm –i /project/biohpcadmin/shared/sequencing_data/*

38 * Match any number of characters ls * Any file
Wildcards * Match any number of characters ls * Any file ls notes* Any file beginning with notes ls *.txt Any file ending in .txt ls *2019* Any file with 2015 somewhere in its name ? Match a single character ls data_00?.txt Matches data_001, data002, data_00A etc. [] Match a set of characters (bracket expression) ls data_00[ ].txt ls data_00[0-9].txt Matches data_001 – data_009, not data_00A

39 Applying new Permissions
$ ls -la vc_calling_session2/ total drwxr-xr-x 2 s biohpc_admin Feb 12 13:29 . drwxrwxrwx 3 root root Feb 12 13:29 .. -rwxr-xr-x 1 s biohpc_admin Feb 12 13:29 germline.design.txt -rwxr-xr-x 1 s biohpc_admin Feb 12 13:29 HD728.nanocourse.bam -rwxr-xr-x 1 s biohpc_admin Feb 12 13:29 HD728.nanocourse.bam.bai -rwxr-xr-x 1 s biohpc_admin Feb 12 13:29 HD728.R1.fastq.gz By default, cp will apply your ownership and primary group to files Are these the permissions we want for group-only read and write? If not, what command will apply the appropriate permissions? $ chmod –R 770 /project/biohpcadmin/shared/sequencing_data

40 Environmental Variables – Controlling the behavior of the Shell
Several variables control the behavior of the shell. You can print all of these variables with: $ env Or print them individually $ echo $SHELL /bin/bash $ echo $HOME /home2/s178337 $ echo $USER S178337 $PATH variable is one of the most important and tells the shell where your programs are: $ echo $PATH /cm/shared/apps/slurm/ /sbin:/cm/shared/apps/slurm/ /bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/ibutils/bin:/sbin:/usr/sbin:/cm/local/apps/environment-modules/3.2.10/bin The module system on BioHPC modifies this $PATH so that programs are made available to the user. One can also manually edit their $PATH $ export PATH=/home2/s178337/bin:$PATH

41 Overview of commands used
Full Name Description man manual man <command> opens manual for a command ssh secure shell opens a remote shell on a server echo prints statement to standard output pwd print working directory prints current working directory cd change directory change to specified directory ls list list contents of a directory file determines type of file cat concatenate concatenates files to standard output head prints the top n-lines of a text file tail prints the bottom n-lines of a text file more outputs a text file page by page less like more, but allows backwards traversal of a file du disk usage calculate disk usage of a file or folder nano simple text editor cp copy copy a file or directory from a source to a destination mv move moves a file from a directory from a source to a destination rm remove deletes a file or a directory chmod change mode modifies permissions of a file or directory


Download ppt "The Linux Command Line Interface"

Similar presentations


Ads by Google