Martin Kruliš 29. 2. 2016 by Martin Kruliš (v1.1)1.



2 • Dynamic Nature of PHP
    ◦ Values
      ▪ Exist in a managed memory space
      ▪ Created as literals, results of expressions, or by internal constructions and functions
      ▪ Explicit data type: boolean, integer, float, string, array, object, resource, NULL
    ◦ Memory Management
      ▪ Uses copy-on-write and reference counting
      ▪ Values are not always copied on assignment
      ▪ Once a value has zero references, it is garbage collected
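The value semantics described on this slide can be sketched as follows. This is only an illustration with made-up variable names; it shows the observable behavior (an assignment acts like a copy), not the engine's internal bookkeeping.

```php
<?php
// Arrays are values: assignment behaves like a copy, although the
// engine may postpone the real copy until one side is modified.
$original = array(1, 2, 3);
$copy = $original;          // nothing is duplicated yet (copy-on-write)
$copy[] = 4;                // the write separates $copy from $original

$originalCount = count($original);  // 3
$copyCount = count($copy);          // 4

unset($copy);               // drops the name; unreferenced data can be freed
$copyExists = isset($copy); // false
```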

3 • Dynamic Nature of PHP
    ◦ Variables
      ▪ Mnemonic references to values
      ▪ No declarations, created on the first assignment
      ▪ In the global or local scope
        Globals can be mapped into local context (global keyword)
      ▪ No explicit type (type is determined by current value)
        Type can be changed with a new assignment
      ▪ Existence can be tested (isset) and terminated (unset)
    ◦ Arrays
      ▪ An array key behaves in many ways like a variable
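A short sketch of these variable rules, with hypothetical names ($counter, bump(), $config) chosen only for illustration:

```php
<?php
$counter = 10;                  // created on first assignment, no declaration

function bump()
{
    global $counter;            // maps the global variable into the local scope
    $counter++;
    return $counter;
}

$afterBump = bump();            // 11

$value = 42;
$typeBefore = gettype($value);  // "integer"
$value = 'forty-two';           // same variable, new type
$typeAfter = gettype($value);   // "string"

$existsBefore = isset($value);  // true
unset($value);
$existsAfter = isset($value);   // false

// Array keys can be tested and removed the same way as variables.
$config = array('debug' => true);
unset($config['debug']);
$hasDebug = isset($config['debug']); // false
```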

4 • Implications
    ◦ Large data handling
      ▪ Shared reading works fine thanks to CoW
      ▪ Explicit unset() may release data that is no longer needed
    ◦ There are no pointers…
      ▪ Some data structures depend on pointers/references
      ▪ Instead of pointers
        The arrays are flexible enough
        Objects are passed by reference (like in C#/Java)
        Variable variables
        Explicit references

5 • Indirect Access to Values
    ◦ Name of one variable is stored in another variable
        $a = 'b';
        $$a = 42; // the same as $b = 42;
        $a = 'b'; $b = 'c'; $c = 'd';
        $$$$a = 'hello'; // the same as $d = 'hello';
    ◦ Braces { } can be used to avoid ambiguous situations
    ◦ Can be used with members, functions, classes, …
        $obj = new $className();
        $obj->$varName = 42;
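The following sketch puts the indirect access forms together, including the brace syntax. The class Point and all variable names are hypothetical and serve only as an example.

```php
<?php
$name = 'total';
$$name = 10;                    // the same as $total = 10
${$name . 'Max'} = 99;          // braces build the name: $totalMax = 99

// Braces also make array access unambiguous:
$list = array('x', 'y');
$x = 'from x';
$viaBraces = ${$list[0]};       // uses $list[0] as the name, i.e. reads $x

class Point
{
    public $x = 0;
}

$className = 'Point';
$propName = 'x';
$obj = new $className();        // the same as new Point()
$obj->$propName = 42;           // the same as $obj->x = 42
```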

6 • References
    ◦ Similar to Unix hard-links in FS
    ◦ Multiple variables attached to the same data
      ▪ Reference is taken by the & operator
    ◦ Independent of object references
      ▪ A reference to an object can be created
        $a = 1;
        $b = &$a;
        $b++;
        echo $a; // prints 2
    [Diagram: $a and $b are attached to the same value, int(1), which becomes int(2) after $b++]

7 • Arguments as References
    ◦ Similar usage as var keyword in Pascal
        function inc(&$x) { $x++; }
  • Returning References
        function &findIt($what) {
          global $myArray;
          return $myArray[$what]; // no & here; the & in the declaration marks the reference return
        }
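A runnable sketch of both forms follows. The array $inventory is a hypothetical stand-in for the slide's $myArray. Note that the caller also has to use & when binding the returned reference.

```php
<?php
function inc(&$x)
{
    $x++;                       // modifies the caller's variable
}

$inventory = array('apples' => 3, 'pears' => 5);

// The & before the function name declares a reference-returning function.
function &findIt($what)
{
    global $inventory;
    return $inventory[$what];
}

$n = 1;
inc($n);                        // $n is now 2

$slot = &findIt('apples');      // & is needed at the call site as well
$slot = 10;                     // writes through to $inventory['apples']
```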

8 • References vs. Pointers
        function foo(&$var) { $var = &$GLOBALS['bar']; }
        $x = 42;
        foo($x);
        How is $x affected?
  • The unset() Function
    ◦ Does not remove data, only the variable
    ◦ Data are removed when not referenced
  • The global Declaration
        global $a; is equivalent to $a = &$GLOBALS['a'];
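The question on this slide can be explored with a small variation that rebinds the parameter to a local variable instead of $GLOBALS['bar']. The function name rebind() is hypothetical. Assigning a new reference to $var only re-attaches the local name, so the caller's variable keeps its value.

```php
<?php
function rebind(&$var)
{
    $other = 100;
    $var = &$other;             // re-attaches the local name only
    $var = 7;                   // writes to $other, not to the caller's variable
}

$x = 42;
rebind($x);
echo $x, "\n";                  // prints 42

// unset() removes a name, not data that another name still refers to.
$a = 'data';
$b = &$a;
unset($a);
echo $b, "\n";                  // prints data
```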

9 • Declaration
    ◦ Keyword function followed by the identifier
        function foo([args, …]) { … body … }
    ◦ Function body
      ▪ Pretty much anything (even a function/class declaration)
      ▪ Nested functions/classes are declared once the function is called for the first time
      ▪ Functions are second-class citizens and the identifier space is flat
    ◦ Results
      ▪ Optional argument of the return construct
      ▪ Only one value, but it can be an array or an object

10 • Argument Declarations
     ◦ Implicit (default) values may be provided
         function foo($x, $y = 1, $z = 2) { … }
       ▪ Arguments with implicit values are aligned to the right (they come last)
       ▪ Note that PHP functions do not support overloading
   • Variable Number of Arguments
     ◦ Any function can be called with more arguments than formally declared
     ◦ Functions func_num_args(), func_get_arg(), and func_get_args() provide access to such arguments
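As a minimal sketch of default values and extra arguments, consider the hypothetical functions greet(), sumAll(), and countArgs() below. func_get_args() returns only the arguments that were actually passed, so defaults that were not supplied do not appear in it.

```php
<?php
function greet($name, $greeting = 'Hello')
{
    return $greeting . ', ' . $name;
}

function sumAll($first, $second = 0)
{
    $total = 0;
    foreach (func_get_args() as $arg) {  // all passed arguments, declared or not
        $total += $arg;
    }
    return $total;
}

function countArgs()
{
    return func_num_args();
}

$default = greet('Martin');          // "Hello, Martin"
$custom = greet('Martin', 'Ahoj');   // "Ahoj, Martin"
$one = sumAll(1);                    // 1
$many = sumAll(1, 2, 3, 4);          // 10
$count = countArgs('a', 'b', 'c');   // 3
```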

11 • Indirect Calling
     ◦ Calling a function by its name stored in a variable
         function foo($x, $y) { … }
         $funcName = 'foo';
         $funcName(42, 54); // the same as foo(42, 54);
   • Similar Constructs
     ◦ Using specialized invocation functions
       ▪ call_user_func('foo', 42, 54);
       ▪ call_user_func_array('foo', array(42, 54));
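The three calling styles give the same result, as this small sketch shows. The function add() is a hypothetical stand-in for the slide's foo().

```php
<?php
function add($x, $y)
{
    return $x + $y;
}

$funcName = 'add';
$direct = $funcName(42, 54);                             // the same as add(42, 54)
$viaCall = call_user_func('add', 42, 54);                // arguments listed one by one
$viaArray = call_user_func_array('add', array(42, 54));  // arguments passed as an array
```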

12 • Testing Function Existence
     ◦ function_exists() – tests whether the given function exists
     ◦ get_defined_functions() – list of all function names
   • Cleanup Functions
     ◦ register_shutdown_function() – registers a function, which is executed when the script finishes
   • Special Case of Left-side Function
     ◦ Function list() is used at the left side of assignments
       ▪ Reverse logic – it fills its arguments
         list($a, $b, $c) = array(1, 2, 3);
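A brief sketch of list() and function_exists(). The name no_such_func_xyz is deliberately made up so that the lookup fails.

```php
<?php
// list() sits on the left side and is filled from the array on the right.
list($a, $b, $c) = array(1, 2, 3);

// Positions can be skipped by leaving them empty.
list(, $second) = array('x', 'y');

$hasStrlen = function_exists('strlen');            // true: built-in function
$hasBogus = function_exists('no_such_func_xyz');   // false: never defined
```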

13 • Lambda (Nameless) Functions
     ◦ A unique name is generated automatically
     ◦ Function create_function()
       ▪ Gets the arguments and the body as strings
       ▪ Returns an identifier of the newly created function
         (a string identifier, which cannot collide with regular identifiers)
     ◦ Useful in many situations
       ▪ One-call functions
       ▪ Call-back functions
         $mul = create_function('$x, $y', 'return $x * $y;');
         echo $mul(3, 4); // prints out '12'

14 • Anonymous Functions
     ◦ A better way to implement nameless functions
         $fnc = function($args) { …body… };
     ◦ The anonymous function is an instance of Closure
       ▪ It can be passed on like an object
     ◦ The visible variables must be explicitly stated
         $fnc = function(…) use ($var, &$refvar) { … };
       ▪ These variables are captured in the closure
       ▪ Variables passed by reference can be modified
     Example 1
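The capture rules can be sketched as follows, with hypothetical names ($factor, $calls, $scale). A variable captured by value is frozen when the closure is created, while one captured by reference stays connected to the outer variable.

```php
<?php
$factor = 3;
$calls = 0;

$scale = function ($x) use ($factor, &$calls) {
    $calls++;                   // by reference: updates the outer $calls
    return $x * $factor;        // by value: the $factor seen at creation time
};

$factor = 100;                  // no effect on the closure

$result = array_map($scale, array(1, 2, 3));
$isClosure = $scale instanceof Closure;
```

Here the closure is passed to array_map() as a callback, which is one common use of anonymous functions.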


16 Charsets, text processing, and regular expressions

17 • One Charset to Rule Them All
     ◦ HTML, PHP, database (connection), text files, …
     ◦ Determined by the language(s) used
       ▪ Unicode covers almost every language
     ◦ Early incoming, late outgoing conversions
   • Charset in Meta-data
     ◦ Must be in HTTP headers
         header('Content-Type: text/html; charset=utf-8');
     ◦ Do not use HTML meta element with http-equiv
       ▪ Except special cases (like saving HTML file locally)

18 • Multibyte Character Encoding
     ◦ Some charsets (e.g., UTF-8, UTF-16, …) use more than one byte per character
     ◦ Standard string functions are ANSI based
       ▪ They treat each byte as a char
   • Multibyte String Functions Library
     ◦ Standard library, often present in PHP
     ◦ Duplicates most of the standard string functions, but with prefix mb_ (mb_strlen, mb_strpos, …)
     ◦ Encoding conversions: mb_convert_encoding()
     ◦ mb_internal_encoding() – specifies the internal encoding used in PHP
     Example 2
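The difference between byte-based and multibyte functions can be seen in a short sketch. It assumes that the mbstring extension is loaded and that the source file is saved in UTF-8; the sample word is arbitrary.

```php
<?php
$word = "Kruliš";                             // "š" takes two bytes in UTF-8

$bytes = strlen($word);                       // 7: counts bytes
$chars = mb_strlen($word, 'UTF-8');           // 6: counts characters
$upper = mb_strtoupper($word, 'UTF-8');       // "KRULIŠ"

// Convert to a single-byte charset that contains "š".
$latin2 = mb_convert_encoding($word, 'ISO-8859-2', 'UTF-8');
$latin2Bytes = strlen($latin2);               // 6: one byte per character
```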

19 • Encoding Input Data from HTTP
     ◦ Usually done transparently
       ▪ Check “mbstring” section of php.ini
     ◦ Can be done manually: mb_parse_str()
   • Databases
     ◦ The database or the database connection usually needs to be configured
     ◦ An example for MySQL database
       ▪ mysqli_set_charset()

20 • Lexicographical Comparison of Strings
     ◦ Best to be done elsewhere (in DBMS for instance)
     ◦ The strcmp() function is binary safe (it compares bytes)
     ◦ For locale-aware comparison (e.g., strcoll()), the locale must be set correctly (setlocale())
   • Iconv Library
     ◦ An alternative to Multibyte String Functions
     ◦ Fewer functions
     ◦ Easier for encoding conversions
       ▪ Can deal with missing mappings and replacements

21 • What to Verify or Sanitize
     ◦ Everything that possibly comes from users: $_GET, $_POST, $_COOKIE, …
     ◦ Data that comes from external sources (database, text files, …)
   • When to Verify or Sanitize
     ◦ On input – verify correctness
       ▪ Before you start using data in $_GET, $_POST, …
     ◦ On output – sanitize to prevent injections
       ▪ When data are inserted into HTML, SQL queries, …

22 • How to Verify
     ◦ Regular expressions
     ◦ Filter functions
       ▪ filter_input(), filter_var(), …
       ▪ Useful for special validations (e-mail, URL, IP, …)
   • How to Sanitize
     ◦ String and filter functions, regular expressions
     ◦ htmlspecialchars() – encoding for HTML
     ◦ urlencode() – encoding for URL
     ◦ DBMS-specific functions (mysqli_escape_string())
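A minimal sketch of verifying on input and sanitizing on output. The input values are hard-coded here for illustration; in a real script they would come from $_GET, $_POST, and similar sources.

```php
<?php
// Verification: filter_var() returns the (converted) value or false.
$validEmail = filter_var('user@example.com', FILTER_VALIDATE_EMAIL);
$invalidEmail = filter_var('not-an-email', FILTER_VALIDATE_EMAIL);   // false
$age = filter_var('42', FILTER_VALIDATE_INT);                        // int 42

// Sanitization for HTML output: special characters become entities.
$rawComment = '<script>alert("x")</script>';
$safeHtml = htmlspecialchars($rawComment, ENT_QUOTES, 'UTF-8');

// Sanitization for a URL query string.
$safeUrl = 'search.php?q=' . urlencode('a&b c');
```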

23 • String Search Patterns
     ◦ Special syntax that encodes a program (language) for a regular automaton
     ◦ Simple to use
       ▪ Encoding is (mostly) human readable
     ◦ POSIX and Perl Standards
   • Usage
     ◦ Searching strings, listing matches
     ◦ Find and replace
     ◦ Splitting a string into an array of strings

24 • Expression
     ◦ <separator>expr<separator>modifiers
     ◦ Separator is a single character (usually /, #, %, …)
     ◦ Pattern modifiers are flags that affect the evaluation
   • Base Syntax
     ◦ Sequence of atoms
     ◦ Atom could be
       ▪ Simple (non-meta) character (letter, number, …)
       ▪ Dot (.) represents any character
       ▪ A list of characters in [] ([abc], [0-9a-z_], …)

25 • Important Meta-characters
     ◦ \ – an escaping character for other meta-characters
     ◦ Anchors ^, $ marking start/end of a string/line
       ▪ ^ in character class definition inverts the set
     ◦ [, ] – character class definition
     ◦ {, } – min/max quantifier: atom{n}, atom{min,max}
       ▪ [0-9]{8} (8-digit number), .{1,9} (1-9 chars)
     ◦ (, ) – subpattern (treated like an atom)
     ◦ *, +, ? – repetitions, shorthand notations of {0,}, {1,}, and {0,1} respectively
     ◦ | – branches (ptrn1|ptrn2)

26 • Character Classes
     ◦ Pre-defined classes identified by names [:name:]
       ▪ For example [ab[:digit:]] matches a, b, and 0-9
     ◦ alpha – letters
     ◦ digit – decimal digits
     ◦ alnum – letters and digits
     ◦ blank – horizontal whitespace (space and tab)
     ◦ space – any whitespace (including line breaks)
     ◦ lower, upper – lowercase/uppercase letters
     ◦ cntrl – control characters
     ◦ xdigit – hexadecimal digits

27 • Modifiers
     ◦ i – case insensitive
     ◦ m – multiline mode (^, $ match start/end of a line)
     ◦ s – '.' matches also a newline character
     ◦ x – ignore whitespace in regex (except in character class constructs)
     ◦ S – more extensive performance optimizations
     ◦ U – switch to non-greedy evaluation
       ▪ Greedy evaluation means that patterns with *, +, or ? try to match as many characters as possible
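The meta-characters, named classes, and modifiers from the preceding slides can be tried out with a few preg_match() calls. The subject strings are arbitrary examples.

```php
<?php
// Quantifier with anchors: exactly eight digits.
$eightDigits = preg_match('/^[0-9]{8}$/', '12345678');       // 1

// Named class inside a character class.
$posixClass = preg_match('/^[ab[:digit:]]+$/', 'a1b2');      // 1

// i: case-insensitive matching.
$caseInsensitive = preg_match('/^hello$/i', 'HeLLo');        // 1

// m: ^ and $ match at line boundaries, so both lines match.
$lineCount = preg_match_all('/^\w+$/m', "one\ntwo", $lines); // 2

// Greedy (default) versus non-greedy (U modifier).
preg_match('/<.+>/', '<b>bold</b>', $greedy);
preg_match('/<.+>/U', '<b>bold</b>', $lazy);
```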

28 • Subpatterns
     ◦ To ensure correct operation precedence
         (one|two|three){1,3}
     ◦ To add modifiers to only a part of the expression
         (?modifiers:ptrn)
     ◦ To mark important parts of the expression
       ▪ Used to retrieve parts of a string after matching
       ▪ Named subpatterns (?<name>ptrn), or (?'name'ptrn)
       ▪ Unnamed subpatterns (no capturing in matching) (?:ptrn)
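A sketch of the three uses of subpatterns. The date pattern and the subject strings are made up for illustration.

```php
<?php
// Named and non-capturing subpatterns.
$pattern = '/(?<year>\d{4})-(?<month>\d{2})(?:-\d{2})?/';
preg_match($pattern, 'Released 2016-02-29', $m);
// $m[0] is the whole match; named groups appear under their name
// and under their numeric index; (?:...) captures nothing.

// Precedence: the quantifier applies to the whole alternation.
$repeated = preg_match('/^(one|two|three){1,3}$/', 'onetwo');   // 1

// Modifier limited to a part of the expression.
$localHit = preg_match('/a(?i:b)c/', 'aBc');                    // 1
$localMiss = preg_match('/a(?i:b)c/', 'ABc');                   // 0
```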

29 • E-mail Verification (RFC 2822)
   (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

30 • preg_match($ptrn, $subj [, &$matches])
     ◦ Searches given string by a regex
     ◦ Returns 1 if the pattern matches the subject, 0 if it does not (false on error)
     ◦ The $matches array gathers the matched substrings of subject with respect to the expression and subpatterns
       ▪ Subpatterns are indexed from 1
       ▪ Index 0 holds the match of the entire expression
       ▪ Named patterns are indexed associatively by their names
     ◦ Example: matching "6 eggs, 3 spoons of oil, 250g of flour" against /[[:digit:]]+/ yields array(1) { [0] => string("6") }
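The slide's example can be reproduced directly. The preg_match_all() call is an addition not mentioned on the slide; it is included only to show how all matches, rather than the first one, could be collected.

```php
<?php
$subject = '6 eggs, 3 spoons of oil, 250g of flour';

// preg_match() stops at the first match.
$found = preg_match('/[[:digit:]]+/', $subject, $matches);

// preg_match_all() collects every match and returns their count.
$count = preg_match_all('/[[:digit:]]+/', $subject, $all);
```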

31 • preg_replace($ptrn, $repl, $str)
     ◦ Search and replace substrings in a string
       ▪ Each match of the pattern is replaced
       ▪ Replacement may contain references to subpatterns
   • preg_split($ptrn, $str [, $limit])
     ◦ Similar to explode() function
     ◦ Split a string into an array of strings
     ◦ The pattern is used to match delimiters
       ▪ Delimiters are not part of the result
     Example 3
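A small sketch of both functions with arbitrary sample data. The replacement string refers to subpatterns as $1, $2, $3; single quotes keep PHP from interpolating them as variables.

```php
<?php
// Reorder date parts using references to subpatterns.
$date = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$3. $2. $1', '2016-02-29');

// Split on commas or semicolons with optional surrounding whitespace.
$parts = preg_split('/\s*[,;]\s*/', 'eggs , oil;flour');

// The optional limit caps the number of resulting pieces.
$limited = preg_split('/,/', 'a,b,c', 2);
```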

32 • Differences
     ◦ The expression is not enclosed by separators
       ▪ No modifiers can be added
     ◦ Only simple subpatterns
     ◦ Only a few escape sequences
   • Functions
     ◦ ereg(), ereg_replace(), split()
     ◦ Each function has a case-insensitive version with an i in its name
       ▪ eregi() – case insensitive version of ereg()
     ◦ Deprecated since PHP 5.3


