PERL SCRIPTING

COMPUTER BASICS
CPU, RAM, hard drive.
The CPU can operate directly only on data held in its registers.
[Diagram: CPU, RAM, hard drive]

COMPUTER LANGUAGES
Machine language: binary code executed directly by the CPU. Usually specific to a CPU model. Fast.
Assembly language: maps binary instructions to short mnemonic instructions. Platform-dependent. Fast.
High-level language: “human-like” syntax, usually not tied to a particular CPU. Compiled into machine code before use. Fast. E.g. C, C++, Fortran, Pascal, Basic.
Scripting language: usually not compiled into binary code; interpreted and executed on request. Slow. E.g. Perl, PHP, Python, JavaScript, Bash script, Ruby.
Byte-code language: source code is converted to platform-independent intermediate code for rapid compilation. Intermediate speed. E.g. Java, Microsoft .NET.

TWO ELEMENTS OF A PROGRAM
Data structure and algorithm.
Different data structures often have corresponding, well-optimized algorithms for processing and extracting information (this is the subject of computer science).
For example: inserting (algorithm) a node (data structure) into a linked list (data structure).
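The linked-list example can be sketched in Perl as follows. This is only an illustration, assuming each node is a hash reference; the key names (sValue, refNext) and the function names are made up for this sketch and do not come from the slides.

```perl
use strict;
use warnings;

# Each node is a hash reference holding a value and a link to the next node.
# The key names (sValue, refNext) are illustrative choices.
sub fnInsertAfter {
    my ($refNode, $sValue) = @_;
    my $refNewNode = { sValue => $sValue, refNext => $refNode->{refNext} };
    $refNode->{refNext} = $refNewNode;   # splice the new node into the chain
    return $refNewNode;
}

# Walk the list from the head and join the values for display.
sub fnListToString {
    my ($refHead) = @_;
    my @arrValues;
    for (my $refNode = $refHead; defined $refNode; $refNode = $refNode->{refNext}) {
        push @arrValues, $refNode->{sValue};
    }
    return join(' -> ', @arrValues);
}

my $refHead = { sValue => 'A', refNext => undef };
fnInsertAfter($refHead, 'C');
fnInsertAfter($refHead, 'B');            # lands between A and C
print fnListToString($refHead), "\n";    # A -> B -> C
```

The insertion touches only two links, no matter how long the list is, which is why this algorithm pairs well with this data structure.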

BASIC TYPES
Bit: 1 bit has 2 states, 1 or 0.
1 byte = 8 bits, i.e. max(1 byte) = 11111111 (binary) = 255.
Every character in the ASCII encoding fits in 1 byte. In C, the byte data type is in fact written as “char”.
The byte is the smallest addressable unit of storage.
A Boolean (true/false) theoretically needs only 1 bit, but in practice it usually takes 1 byte.
How many Boolean states can you store using 1 byte?
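One way to explore the closing question is to treat a byte as eight on/off flags and set them with bitwise operators. The sketch below is an illustration only; the variable names are made up.

```perl
use strict;
use warnings;

my $nFlags = 0;                       # one byte, all bits off: 00000000

$nFlags |= (1 << 0);                  # turn on flag 0
$nFlags |= (1 << 7);                  # turn on flag 7

my $bFlag7 = ($nFlags & (1 << 7)) ? 1 : 0;   # 1: flag 7 is set
my $bFlag3 = ($nFlags & (1 << 3)) ? 1 : 0;   # 0: flag 3 is not set

printf "%08b = %d\n", $nFlags, $nFlags;      # 10000001 = 129
printf "%08b = %d\n", 0xFF, 0xFF;            # 11111111 = 255
```

Eight independent true/false flags fit in one byte, giving 2^8 = 256 distinct combinations (0 to 255).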

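BASIC TYPES
Integer: typically 32 bits. Signed: -2,147,483,648 ~ 2,147,483,647; unsigned: 0 ~ 4,294,967,295.
Long integer: 64 bits.
Float: 32 bits. 24 bits of precision for the significand (23 stored plus an implicit leading bit), 8 bits for the exponent and 1 bit for the sign.
Floating-point numbers can lose precision. Try this in Perl: print 0.6/0.2-3;
A safer way is to round first: sub round { my ($n) = @_; return int($n + ($n < 0 ? -0.5 : 0.5)); } print round(0.6/0.2)-3;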
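The precision problem can be tried out with the short script below. It is a sketch: the function name fnRound and the tolerance 1e-9 are choices made for this example, and the rounding helper uses an explicit sign test so that it also works for 0.

```perl
use strict;
use warnings;

# Round half away from zero; safe for 0 and for negative numbers.
sub fnRound {
    my ($fValue) = @_;
    return int($fValue + ($fValue < 0 ? -0.5 : 0.5));
}

my $fDiff = 0.6 / 0.2 - 3;
print "Raw difference: $fDiff\n";                       # typically a tiny nonzero value
print "After rounding: ", fnRound(0.6 / 0.2) - 3, "\n"; # 0

# Alternative: compare floats with a tolerance instead of ==.
my $bNearlyEqual = (abs(0.6 / 0.2 - 3) < 1e-9) ? 1 : 0;
print "Nearly equal: $bNearlyEqual\n";                  # 1
```

Rounding suits values that are expected to be whole numbers; the tolerance test is the more general habit when comparing any two floating-point results.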

POINTERS / REFERENCES
A pointer (called a reference in some other languages) is essentially an integer.
This integer stores a memory address.
That memory address refers to another variable.
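In Perl the same idea appears as references. The sketch below uses refaddr from the core Scalar::Util module to show the address a reference holds; the "ref" name prefix is an assumption of this example.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $nLen = 60;
my $refLen = \$nLen;                         # reference to $nLen
printf "Address held by the reference: 0x%x\n", refaddr($refLen);

$$refLen = 61;                               # dereference: changes $nLen itself
print "nLen is now $nLen\n";                 # nLen is now 61

my @arrLoci = ('locusA', 'locusB');
my $refLoci = \@arrLoci;                     # reference to an array
push @{$refLoci}, 'locusC';                  # modifies @arrLoci through the reference
print "Number of loci: ", scalar(@arrLoci), "\n";   # Number of loci: 3
```

The printed address differs from run to run, but writing through the reference always changes the original variable.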

COMPLEX TYPES
Set: an unordered collection of unique values.
Array (vector): an ordered collection of values of the same basic type. Indexes start from 0 in most languages; last index = length - 1.
Hash: key => value pairs. Keys must be unique.
An array can be thought of as a special hash whose keys are ordered, consecutive integers.
String: in C, a string is simply an array of characters. In many other languages, strings are treated as a “basic type”. Most algorithms for arrays also work for strings.
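A brief Perl illustration of arrays, hashes and strings follows. The locus names are placeholders invented for this sketch; the codon entries are standard genetic-code facts.

```perl
use strict;
use warnings;

# Array: ordered values, indexed from 0.
my @arrLoci = ('locusA', 'locusB', 'locusC');
my $nLastIndex = scalar(@arrLoci) - 1;          # last index = length - 1
print "First: $arrLoci[0], last: $arrLoci[$nLastIndex]\n";

# Hash: unique keys mapped to values.
my %arrCodon = ('ATG' => 'M', 'TGG' => 'W');
$arrCodon{'TAA'} = '*';                         # new key: adds a pair
$arrCodon{'ATG'} = 'Met';                       # existing key: overwrites the value
print "Number of keys: ", scalar(keys %arrCodon), "\n";   # Number of keys: 3

# String: a scalar in Perl, but it can be split into an array of characters,
# so array algorithms such as reverse apply to it.
my @arrChars = split(//, 'ATGC');
my $sReversed = join('', reverse(@arrChars));
print "$sReversed\n";                           # CGTA
```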

COMPLEX TYPES
Classes: object-oriented programming.
A class packages related data of different data types, together with the algorithms associated with them, into a convenient black box for you to use.
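A minimal Perl class might look like the sketch below. The class name DnaSequence, its methods and the "o" prefix for objects are hypothetical, chosen only to show data and algorithms packaged together.

```perl
use strict;
use warnings;

package DnaSequence;

# Constructor: stores the sequence in uppercase inside the object.
sub new {
    my ($sClass, $sDna) = @_;
    my $refSelf = { sDna => uc($sDna) };
    return bless($refSelf, $sClass);
}

# Algorithm bundled with the data: sequence length.
sub fnGetLength {
    my ($refSelf) = @_;
    return length($refSelf->{sDna});
}

# Algorithm bundled with the data: fraction of G and C letters.
sub fnGetGcFraction {
    my ($refSelf) = @_;
    my $nGcCount = () = $refSelf->{sDna} =~ /[GC]/g;   # count the matches
    return $nGcCount / $refSelf->fnGetLength();
}

package main;

my $oSequence = DnaSequence->new('atgc');
print "Length: ", $oSequence->fnGetLength(), "\n";           # Length: 4
print "GC fraction: ", $oSequence->fnGetGcFraction(), "\n";  # GC fraction: 0.5
```

The caller never touches the internal hash; it only uses the methods, which is the black-box idea.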

PERL
Perl lumps all “basic types” together as “scalar”, marked with “$”.
The Perl interpreter decides how to treat a scalar based on what it “looks like”.
This is convenient but sometimes problematic, especially when you parse a user-provided data file.
Arrays: definition @, reference $.
Hashes: definition %, reference $.
RegExp.
Handlers.
use strict;
Perl has an ugly grammar.
Perl has many shortcuts, such as $_. DO NOT USE THEM!
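The “looks like” behavior, and one way to guard against it when reading user data, can be sketched as follows. The field values are made up; looks_like_number comes from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

my $sValue = '10';
my $nSum    = $sValue + 5;    # numeric context: 15
my $sJoined = $sValue . 5;    # string context: "105"
print "$nSum $sJoined\n";     # 15 105

# Fields as they might arrive from a user-provided file (made-up values).
my @arrFields = ('0.25', '1e-3', 'NA', '12abc');
my @arrNumeric;
foreach my $sField (@arrFields) {
    if (looks_like_number($sField)) {
        push @arrNumeric, $sField;
    } else {
        print "Skipping non-numeric field: $sField\n";
    }
}
print "Numeric fields: @arrNumeric\n";   # Numeric fields: 0.25 1e-3
```

Checking each field before doing arithmetic keeps values such as 'NA' from being silently treated as 0.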

FLOW CONTROL
for, foreach, while, unless, until, if / elsif / else
[Flowchart: statements]
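Each of these keywords is shown once in the sketch below. The numbers are arbitrary and the example is only meant to illustrate the syntax.

```perl
use strict;
use warnings;

my @arrMessages;

# foreach with if / elsif / else
foreach my $nLen (59, 60, 61) {
    if ($nLen % 3 == 0) {
        push @arrMessages, "$nLen: whole codons";
    } elsif ($nLen % 3 == 1) {
        push @arrMessages, "$nLen: 1 extra base";
    } else {
        push @arrMessages, "$nLen: 2 extra bases";
    }
}

# C-style for loop
my $nTotal = 0;
for (my $nIndex = 1; $nIndex <= 3; $nIndex++) {
    $nTotal += $nIndex;                 # 1 + 2 + 3
}

# while runs as long as the condition is true
my $nCountdown = 3;
while ($nCountdown > 0) {
    $nCountdown--;
}

# until runs as long as the condition is false
my $nCount = 0;
until ($nCount >= 3) {
    $nCount++;
}

# unless means "if not"
unless ($nTotal == 0) {
    push @arrMessages, "total is $nTotal";
}

print join("\n", @arrMessages), "\n";
```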

FUNCTIONS (SUBROUTINES)
Traditionally, “subroutines” do not accept parameters.
“Function” is a better term, but because Perl is ugly it continues to use “sub”.
sub functionname {
    my ($param1, $param2) = @_;  # get the parameters
    return xxxx;
}
Call: functionname($param1, $param2);
I prefix all private functions with “fn”, but you don’t need to do that.
However, do capitalize the first letter of each word!
Use verb + noun phrases as function names: fnGetFileName(), fnDownloadPicture().
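A runnable version of the pattern is sketched below. The slides only name fnGetFileName; the body given here (using basename from the core File::Basename module) is a guess at what such a function might do, and fnCountBase and the file path are invented for illustration.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Hypothetical implementation: return the file name part of a path.
sub fnGetFileName {
    my ($sPath) = @_;              # parameters arrive in @_
    return basename($sPath);
}

# Two parameters, verb + noun name: count one base, ignoring case.
sub fnCountBase {
    my ($sDna, $sBase) = @_;
    my $nCount = 0;
    foreach my $sLetter (split(//, uc($sDna))) {
        $nCount++ if $sLetter eq uc($sBase);
    }
    return $nCount;
}

print fnGetFileName('/data/genome/chr1.fa'), "\n";   # chr1.fa
print fnCountBase('ATGGAA', 'a'), "\n";              # 3
```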

HOW TO NAME VARIABLES
Variable names should reflect their basic types.
Give descriptive names, with each word capitalized.
I use C-style prefixes on them:

Type          Prefix    Example
bool          b         $bGenomeLoaded
integer       n         $nLen
float         n/f       $fAlleleFreq
string        s         $sInFile
file handler  h         $hInFile
array         arr       $arrLoci
hash          arr       $arrGeneID
constant      ALLCAPS   MAX_LINE

1. Start with the DNA sequence ATGGAAATGGAGAGGCCTCTGCAAATGATGCCGGATTGTTTCAGACATATAGAAATGTCT. Report its length and check whether the length is divisible by 3. Also check that it is a valid DNA sequence. If a check fails, do not continue.
2. Translate it into a peptide sequence using the universal codon table.
3. Display it on screen with the DNA on the first line and the translated amino acids on the second line, each aligned with the middle letter of its codon.
4. This DNA sequence goes through generation after generation of replication.
5. At each replication, it has a user-specified probability (0-1) of a single-nucleotide mutation. This mutation probability is specified on the command line.

6. If a mutation happens, 1 random letter in the DNA is changed to A, T, C or G with equal probability. It's okay if the letter "changes" to the same letter.
7. At each generation, display the DNA and protein sequences as described in step 3, along with the generation number.
8. At each generation, check whether a stop codon has occurred. If so, the protein has lost its function: stop the evolution and output the generation at which the stop codon occurred.
9. The program should handle DNA sequences with uppercase or lowercase letters.
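One possible shape for this program is sketched below. It is not a reference solution: the function names, the default probability of 0.1 when no argument is given, the output wording and the cap on generations are all assumptions of this sketch. The codon table is built from the standard genetic code written in T, C, A, G order.

```perl
use strict;
use warnings;

# Standard genetic code; amino acids listed for codons in T, C, A, G order.
# '*' marks a stop codon.
my @arrBases = ('T', 'C', 'A', 'G');
my $sAminoAcids = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %arrCodonTable;
my $nIndex = 0;
foreach my $sBase1 (@arrBases) {
    foreach my $sBase2 (@arrBases) {
        foreach my $sBase3 (@arrBases) {
            $arrCodonTable{$sBase1 . $sBase2 . $sBase3} = substr($sAminoAcids, $nIndex, 1);
            $nIndex++;
        }
    }
}

# Steps 1 and 9: only A, C, G, T in either case, length divisible by 3.
sub fnIsValidDna {
    my ($sDna) = @_;
    return ($sDna =~ /\A[ACGT]+\z/i && length($sDna) % 3 == 0) ? 1 : 0;
}

# Step 2: translate codon by codon.
sub fnTranslateDna {
    my ($sDna) = @_;
    my $sProtein = '';
    for (my $nPos = 0; $nPos + 3 <= length($sDna); $nPos += 3) {
        $sProtein .= $arrCodonTable{uc(substr($sDna, $nPos, 3))};
    }
    return $sProtein;
}

# Step 3: DNA on line 1, each amino acid under the middle letter of its codon.
sub fnFormatAlignment {
    my ($sDna) = @_;
    my $sAligned = '';
    foreach my $sResidue (split(//, fnTranslateDna($sDna))) {
        $sAligned .= " $sResidue ";
    }
    return uc($sDna) . "\n" . $sAligned . "\n";
}

# Step 6: change one random letter to A, T, C or G with equal probability.
sub fnMutateDna {
    my ($sDna) = @_;
    my @arrLetters = ('A', 'T', 'C', 'G');
    substr($sDna, int(rand(length($sDna))), 1) = $arrLetters[int(rand(4))];
    return $sDna;
}

my $fMutationProb = @ARGV ? $ARGV[0] : 0.1;   # default of 0.1 is an assumption
die "Usage: mutation.pl probability (0-1)\n"
    unless $fMutationProb =~ /\A(?:\d+\.?\d*|\.\d+)\z/ && $fMutationProb <= 1;
my $nMaxGenerations = 100000;                 # safety cap, not part of the assignment
my $sDna = uc('ATGGAAATGGAGAGGCCTCTGCAAATGATGCCGGATTGTTTCAGACATATAGAAATGTCT');

die "Invalid DNA sequence\n" unless fnIsValidDna($sDna);
print "Length: ", length($sDna), "\n";

my $nGeneration = 0;
print "Generation $nGeneration\n", fnFormatAlignment($sDna);
# Steps 4 to 8: replicate until a stop codon appears (or the cap is reached).
while (index(fnTranslateDna($sDna), '*') < 0 && $nGeneration < $nMaxGenerations) {
    $nGeneration++;
    $sDna = fnMutateDna($sDna) if rand() < $fMutationProb;
    print "Generation $nGeneration\n", fnFormatAlignment($sDna);
}
print "Stop codon found at generation $nGeneration\n"
    if index(fnTranslateDna($sDna), '*') >= 0;
```

With this table the starting sequence is 60 letters long and translates to MEMERPLQMMPDCFRHIEMS. Because mutations are random, the stopping generation differs on every run.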

Create a shell script called getdistr.sh
1. Run the simulation mutation.pl 1000 times for each of the mutation probabilities 0.01, 0.1 and 0.5.
2. Collect all DNA and protein sequence output in dist_$mutationprob.log.
3. Collect the generation at which the stop codon first occurs in dist_$mutationprob.txt.
4. Use R to plot dist_0.01.txt, dist_0.1.txt and dist_0.5.txt on one histogram (each parameter in a different color). The X axis should be log10(Generation).