Numpy (Numerical Python)

Presentation transcript:

Numpy (Numerical Python) John R. Woodward

Introduction
NumPy (Numeric Python): fast computation with n-dimensional arrays.

Numpy
Based around one data structure: the ndarray (n-dimensional array).
Import with: import numpy as np
Usage is np.command(xxx)

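ndarrays
1d: 5,67,43,76,2,21
    a = np.array([5, 67, 43, 76, 2, 21])
2d:
    4,5,8,4
    6,3,2,1
    8,6,4,3
    a = np.array([[4, 5, 8, 4], [6, 3, 2, 1], [8, 6, 4, 3]])
Note the outer pair of brackets around the rows: np.array takes a single nested list.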
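A minimal sketch of the two constructions above, written for Python 3 (so print is a function, unlike the Python 2 statements on the slides). The names a1 and a2 are illustrative only.

```python
import numpy as np

# 1-D array built from a flat list
a1 = np.array([5, 67, 43, 76, 2, 21])

# 2-D array: the three rows are wrapped in one outer list
a2 = np.array([[4, 5, 8, 4],
               [6, 3, 2, 1],
               [8, 6, 4, 3]])

print(a1.ndim, a1.shape)
print(a2.ndim, a2.shape)
```

Passing the rows as separate arguments instead of one nested list does not build a 2-D array, because the second positional argument of np.array is the dtype.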

*, +
    import numpy as np
    data = np.random.randn(2, 3)
    print data
    print data * 10
    print data + data
OUTPUT
    [[ 0.079  -0.8418 -0.0838]
     [-1.4497  0.6628  1.1006]]
    [[  0.7896  -8.4175  -0.8378]
     [-14.4973   6.6275  11.0059]]
    [[ 0.1579 -1.6835 -0.1676]
     [-2.8995  1.3255  2.2012]]

Shape and dtype
    print data.shape
    print data.dtype
OUTPUT
    (2L, 3L)
    float64
Numpy tries to infer the datatype.

Creating ndarrays
    data1 = [6, 7.5, 8, 0, 1]
    arr1 = np.array(data1)
    print arr1
OUTPUT
    [ 6.   7.5  8.   0.   1. ]
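The dtype inference mentioned above can be sketched as follows, assuming Python 3. The variable names are illustrative; the exact integer width chosen for ints depends on the platform, so the example only checks that it is an integer type.

```python
import numpy as np

ints = np.array([1, 2, 3])                      # all integers -> an integer dtype
mixed = np.array([6, 7.5, 8, 0, 1])             # one float promotes every element to float64
forced = np.array([1, 2, 3], dtype=np.float64)  # an explicit dtype overrides inference

print(ints.dtype, mixed.dtype, forced.dtype)
```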

Multidimensional arrays
    data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
    arr2 = np.array(data2)
    print arr2
    print arr2.ndim
    print arr2.shape
OUTPUT
    [[1 2 3 4]
     [5 6 7 8]]
    2
    (2L, 4L)

    print arr2
    print arr2.ndim
    print type(arr2.ndim)
    print arr2.shape
    print type(arr2.shape)
    print arr2.shape[0]
    print arr2.shape[1]
OUTPUT
    [[1 2 3 4]
     [5 6 7 8]]
    2
    <type 'int'>
    (2L, 4L)
    <type 'tuple'>
    2
    4

3d array
    data2 = [[[1]]]
    arr2 = np.array(data2)
    print arr2
    print arr2.ndim
    print arr2.shape
OUTPUT
    [[[1]]]
    3
    (1L, 1L, 1L)

More making arrays
    np.zeros(10)
    np.zeros((3, 6))
    np.empty((2, 3, 2))
OUTPUT
    [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
    [[ 0.  0.  0.  0.  0.  0.]
     [ 0.  0.  0.  0.  0.  0.]
     [ 0.  0.  0.  0.  0.  0.]]
    [[[ 0.  0.]
      [ 0.  0.]
      [ 0.  0.]]
     [[ 0.  0.]
      [ 0.  0.]
      [ 0.  0.]]]
Note: np.empty does not initialise its contents, so the values it shows are arbitrary and only happen to be zero here.
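A short sketch of the difference between the two constructors, assuming Python 3. The explicit fill of the np.empty array is an addition for illustration, not something the slides do.

```python
import numpy as np

z = np.zeros((3, 6))      # guaranteed to contain only zeros
e = np.empty((2, 3, 2))   # allocated but not initialised: contents are arbitrary
e[:] = 0.0                # assign before reading if the values matter

print(z.shape, e.shape)
```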

Operations between arrays and scalars
    arr = np.array([1., 2., 3.])
    print arr
    print arr * arr
    print arr - arr
    print 1 / arr
    print arr ** 0.5
OUTPUT
    [ 1.  2.  3.]
    [ 1.  4.  9.]
    [ 0.  0.  0.]
    [ 1.      0.5     0.3333]
    [ 1.      1.4142  1.7321]

Array creation functions

    a = np.array([True], dtype=np.int64)
    print a
    print a.dtype
    a = np.array([True], dtype=np.bool)
    print a
    print a.dtype
OUTPUT
    [1]
    int64
    [ True]
    bool

NumPy data types 1

NumPy data types 2

astype
    arr = np.array([3.7, -1.2, -2.6])
    print arr
    print arr.astype(np.int32)
OUTPUT
    [ 3.7 -1.2 -2.6]
    [ 3 -1 -2]
Note that the fractional part has been truncated (toward zero), not rounded.
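To make the truncation concrete, here is a small Python 3 sketch that contrasts astype with rounding first. Using np.rint before the cast is one possible way to round; it is an illustration, not something the slides prescribe.

```python
import numpy as np

arr = np.array([3.7, -1.2, -2.6])

truncated = arr.astype(np.int32)          # drops the fractional part: 3, -1, -2
rounded = np.rint(arr).astype(np.int32)   # rounds to nearest first: 4, -1, -3

print(truncated)
print(rounded)
```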

astype - string to float
    numeric_strings = np.array(['1.25', '-9.6', '42'], dtype=np.string_)
    print numeric_strings
    print numeric_strings.astype(float)
OUTPUT
    ['1.25' '-9.6' '42']
    [  1.25  -9.6   42.  ]

Basic indexing and slicing (broadcasting)
    arr = np.arange(10)
    print arr
    print arr[5]
    print arr[5:8]
    arr[5:8] = 12
    print arr
OUTPUT
    [0 1 2 3 4 5 6 7 8 9]
    5
    [5 6 7]
    [ 0  1  2  3  4 12 12 12  8  9]

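The original array has changed
    arr_slice = arr[5:8]
    arr_slice[1] = 12345
    print arr
    arr_slice[:] = 64
    print arr
OUTPUT
    [    0     1     2     3     4    12 12345    12     8     9]
    [ 0  1  2  3  4 64 64 64  8  9]
A slice is a view on the original array, not a copy, so writing to arr_slice changes arr.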
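The view behaviour can be checked directly. The sketch below assumes Python 3; the names view and independent are illustrative, and .copy() is shown as one way to get a slice that does not write through.

```python
import numpy as np

arr = np.arange(10)

view = arr[5:8]          # a slice shares memory with arr
view[:] = 64             # ...so this also changes arr

independent = arr[5:8].copy()   # an explicit copy owns its own data
independent[:] = 0              # arr is not affected

print(arr)
print(np.shares_memory(arr, view), np.shares_memory(arr, independent))
```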

2d array
    arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    print arr2d[2]
    print arr2d[0][2]
    print arr2d[0, 2]
(the last two are the same)
OUTPUT
    [7 8 9]
    3
    3

3d array
    arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
    print arr3d
    print arr3d[0]
    print arr3d[1]
OUTPUT
    [[[ 1  2  3]
      [ 4  5  6]]
     [[ 7  8  9]
      [10 11 12]]]
    [[1 2 3]
     [4 5 6]]
    [[ 7  8  9]
     [10 11 12]]

Indexing with slices – 1D
    arr = np.arange(10)
    print arr
    print arr[1:6]
OUTPUT
    [0 1 2 3 4 5 6 7 8 9]
    [1 2 3 4 5]

Indexing with slices – 2D
    arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    print arr2d
    print arr2d[:2]
    print arr2d[:2, 1:]
OUTPUT
    [[1 2 3]
     [4 5 6]
     [7 8 9]]
    [[1 2 3]
     [4 5 6]]
    [[2 3]
     [5 6]]

Indexing with slices – 2D
    arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    print arr2d[1, :2]
    print arr2d[2, :1]
    print arr2d[:, :1]
OUTPUT
    [4 5]
    [7]
    [[1]
     [4]
     [7]]

Fancy indexing
Indexing using integer arrays.
    arr = np.empty((8, 4))
    for i in range(8):
        arr[i] = i
    print arr
    print arr[[4, 3, 0, 6]]
    print arr[[-3, -5, -7]]
Negative indices select from the end.
OUTPUT
    [[ 0.  0.  0.  0.]
     [ 1.  1.  1.  1.]
     [ 2.  2.  2.  2.]
     [ 3.  3.  3.  3.]
     [ 4.  4.  4.  4.]
     [ 5.  5.  5.  5.]
     [ 6.  6.  6.  6.]
     [ 7.  7.  7.  7.]]
    [[ 4.  4.  4.  4.]
     [ 3.  3.  3.  3.]
     [ 0.  0.  0.  0.]
     [ 6.  6.  6.  6.]]
    [[ 5.  5.  5.  5.]
     [ 3.  3.  3.  3.]
     [ 1.  1.  1.  1.]]
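A Python 3 sketch of the same fancy-indexing example, with the selected row numbers pulled out so they are easy to check. The names picked and from_end are illustrative. Unlike a slice, indexing with an integer list returns a copy rather than a view.

```python
import numpy as np

arr = np.empty((8, 4))
for i in range(8):
    arr[i] = i               # row i is filled with the value i

picked = arr[[4, 3, 0, 6]]       # rows in the order requested
from_end = arr[[-3, -5, -7]]     # negative indices count from the last row

print(picked[:, 0])
print(from_end[:, 0])
```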

Transposing arrays and swapping axes
    arr = np.arange(15).reshape((3, 5))
    print arr
    print arr.shape
    print arr.T
    print arr.T.shape
OUTPUT
    [[ 0  1  2  3  4]
     [ 5  6  7  8  9]
     [10 11 12 13 14]]
    (3L, 5L)
    [[ 0  5 10]
     [ 1  6 11]
     [ 2  7 12]
     [ 3  8 13]
     [ 4  9 14]]
    (5L, 3L)

Inner Product (dot operator)
    arr = np.arange(4).reshape((2, 2))
    print arr
    print arr.T
    print np.dot(arr.T, arr)
    print np.dot(arr, arr.T)
OUTPUT
    [[0 1]
     [2 3]]
    [[0 2]
     [1 3]]
    [[ 4  6]
     [ 6 10]]
    [[ 1  3]
     [ 3 13]]

Inner Product (dot operator)
    arr = np.arange(9).reshape((3, 3))
    print arr
    print np.dot(arr.T, arr)
OUTPUT
    [[0 1 2]
     [3 4 5]
     [6 7 8]]
    [[45 54 63]
     [54 66 78]
     [63 78 93]]

arr*arr
    arr = np.arange(9).reshape((3, 3))
    print arr
    print arr*arr
OUTPUT
    [[0 1 2]
     [3 4 5]
     [6 7 8]]
    [[ 0  1  4]
     [ 9 16 25]
     [36 49 64]]
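The contrast between the element-wise product and the matrix product can be sketched side by side. This assumes Python 3.5 or later, where the @ operator is available as an alternative spelling of np.dot for 2-D arrays.

```python
import numpy as np

arr = np.arange(9).reshape((3, 3))

elementwise = arr * arr        # multiplies matching entries
matrix = np.dot(arr, arr)      # row-by-column matrix product
same = arr @ arr               # operator form of the matrix product

print(elementwise)
print(matrix)
```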

Fast element-wise array functions
    arr = np.arange(5)
    print arr
    print np.sqrt(arr)
    print np.exp(arr)
OUTPUT
    [0 1 2 3 4]
    [ 0.      1.      1.4142  1.7321  2.    ]
    [  1.       2.7183   7.3891  20.0855  54.5982]

element-wise maximum
    x = np.random.randn(4)
    y = np.random.randn(4)
    print x
    print y
    print np.maximum(x, y)
OUTPUT
    [-0.9691 -1.4411  1.2614 -0.9615]
    [-0.0398 -0.0692 -1.6854 -0.3902]
    [-0.0398 -0.0692  1.2614 -0.3902]

element-wise add
    x = np.random.randn(4)
    y = np.random.randn(4)
    print x
    print y
    print np.add(x, y)
OUTPUT
    [ 0.0987 -1.2579 -1.4827 -1.4299]
    [-0.2855 -0.7548 -1.0134  0.7546]
    [-0.1868 -2.0127 -2.4961 -0.6753]

Zip two lists together
    a = [1, 2, 3]
    b = [10, 20, 30]
    zipAB = zip(a, b)
    print zipAB
OUTPUT
    [(1, 10), (2, 20), (3, 30)]

Zip three lists together
    a = [1, 2, 3]
    b = [10, 20, 30]
    c = [True, False, True]
    zipABC = zip(a, b, c)
    print zipABC
OUTPUT
    [(1, 10, True), (2, 20, False), (3, 30, True)]

This is the same as a list comprehension over zip
    a = [1, 2, 3]
    b = [10, 20, 30]
    c = [True, False, True]
    result = [(x, y, z) for x, y, z in zip(a, b, c)]
    print result
OUTPUT
    [(1, 10, True), (2, 20, False), (3, 30, True)]

conditionals
    result = [(x if z else y) for x, y, z in zip(a, b, c)]
    print result
OUTPUT
    [1, 20, 3]
NOTE: the boolean value in c decides which list each value is taken from (a if True, b if False).

where
An easier way to do this with np:
    a = [1, 2, 3]
    b = [10, 20, 30]
    c = [True, False, True]
    np.where(c, a, b)
OUTPUT
    [ 1 20  3]

types
    result = [(x if z else y) for x, y, z in zip(a, b, c)]
    print type(result)
    result = np.where(c, a, b)
    print type(result)
OUTPUT
    <type 'list'>
    <type 'numpy.ndarray'>
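The two approaches can be compared in one runnable Python 3 sketch. The names via_list and via_where are illustrative.

```python
import numpy as np

a = [1, 2, 3]
b = [10, 20, 30]
c = [True, False, True]

# Pure-Python version: pick from a where c is True, otherwise from b
via_list = [x if z else y for x, y, z in zip(a, b, c)]

# NumPy version: same selection, but the result is an ndarray
via_where = np.where(c, a, b)

print(via_list, type(via_list))
print(via_where, type(via_where))
```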

where(arr > 0, 2, -2)
    arr = np.random.randn(4, 4)
    print np.where(arr > 0, 2, -2)
OUTPUT
    [[ 2  2 -2 -2]
     [-2  2 -2  2]
     [-2 -2 -2 -2]
     [ 2 -2  2  2]]

where(arr > 0, 2, arr)
    arr = np.random.randn(4, 4)
    print np.where(arr > 0, 2, arr)
OUTPUT
    [[ 2.      2.     -0.9611 -0.3916]
     [-1.0966  2.     -1.9922  2.    ]
     [-0.2241 -0.9337 -0.8178 -1.1036]
     [ 2.     -1.096   2.      2.    ]]
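Because the random inputs above change on every run, the following Python 3 sketch uses a small fixed array (an assumption made purely so the result is reproducible) to show both forms of np.where with a condition on the array itself.

```python
import numpy as np

arr = np.array([[0.5, -1.2],
                [-0.3, 2.0]])

signs = np.where(arr > 0, 2, -2)     # scalar for both branches
capped = np.where(arr > 0, 2, arr)   # scalar where positive, original value otherwise

print(signs)
print(capped)
```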

Mathematical and statistical methods
    arr = np.random.randn(5, 4)
    print arr.mean()
    print np.mean(arr)
    print arr.sum()

Axis
An array has one or more axes, labelled 0, 1, 2, … These are just the dimensions.

Mean of rows/columns (axis)
    arr = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
    print arr
    print arr.mean(axis=0)
    print arr.mean(axis=1)
OUTPUT
    [[0 1 2]
     [3 4 5]
     [6 7 8]]
    [ 3.  4.  5.]
    [ 1.  4.  7.]

Sum different axis
    arr = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
    print arr
    print arr.sum(0)
    print arr.sum(1)
OUTPUT
    [[0 1 2]
     [3 4 5]
     [6 7 8]]
    [ 9 12 15]
    [ 3 12 21]
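A compact Python 3 sketch of how the axis argument picks the direction of a reduction. The variable names are illustrative: axis=0 collapses the rows and leaves one value per column, axis=1 collapses the columns and leaves one value per row, and no axis reduces over everything.

```python
import numpy as np

arr = np.array([[0, 1, 2],
                [3, 4, 5],
                [6, 7, 8]])

col_means = arr.mean(axis=0)   # one mean per column
row_means = arr.mean(axis=1)   # one mean per row
col_sums = arr.sum(axis=0)
row_sums = arr.sum(axis=1)
total = arr.sum()              # no axis: sum of all elements

print(col_means, row_means)
print(col_sums, row_sums, total)
```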

Cumulative sum
    arr = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
    print arr
    print arr.cumsum(0)
    print arr.cumsum(1)
OUTPUT
    [[0 1 2]
     [3 4 5]
     [6 7 8]]
    [[ 0  1  2]
     [ 3  5  7]
     [ 9 12 15]]
    [[ 0  1  3]
     [ 3  7 12]
     [ 6 13 21]]
The two results accumulate along different axes.

Cumulative product
    arr = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
    print arr
    print arr.cumprod(0)
    print arr.cumprod(1)
OUTPUT
    [[0 1 2]
     [3 4 5]
     [6 7 8]]
    [[ 0  1  2]
     [ 0  4 10]
     [ 0 28 80]]
    [[  0   0   0]
     [  3  12  60]
     [  6  42 336]]
The two results accumulate along different axes.
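Both cumulative operations can be checked together in Python 3. The sketch uses cumsum for the running sum and cumprod for the running product, each along axis 0 (down the columns) and axis 1 (across the rows); the result names are illustrative.

```python
import numpy as np

arr = np.array([[0, 1, 2],
                [3, 4, 5],
                [6, 7, 8]])

sum_down = arr.cumsum(0)      # running sum down each column
sum_across = arr.cumsum(1)    # running sum across each row
prod_down = arr.cumprod(0)    # running product down each column
prod_across = arr.cumprod(1)  # running product across each row

print(sum_down)
print(prod_across)
```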

Methods for Boolean arrays
    arr = np.random.randn(10)
    print (arr > 0).sum()
1/ makes a random array
2/ counts only the positive entries
OUTPUT
    2 (of course, this number can change)

Methods for Boolean arrays
    bools = np.array([False, False, True, False])
    print bools.any()
    print bools.all()
This asks whether any/all of the values in bools are True.
OUTPUT
    True
    False
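A reproducible Python 3 sketch of the Boolean-array methods, using a small fixed array in place of the random one (an assumption made so the count is predictable). Summing a Boolean mask counts its True entries because True is treated as 1.

```python
import numpy as np

arr = np.array([-1.5, 0.2, 3.0, -0.7])

positive = arr > 0            # Boolean mask, one entry per element
count = positive.sum()        # number of True entries
some = positive.any()         # True if at least one entry is True
every = positive.all()        # True only if every entry is True

print(positive, count, some, every)
```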

Sorting 1
    arr = np.random.randn(4)
    print arr
    arr.sort()
    print arr
OUTPUT
    [-0.301  -0.1785 -0.9659 -0.6087]
    [-0.9659 -0.6087 -0.301  -0.1785]

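SORTING 2
Sorting on different axes (shown with the 3x3 integer array from the slide's output).
    arr = np.array([[1, 3, 2], [8, 4, 9], [3, 5, 8]])
    print arr
    arr.sort(0)
    print arr
    arr.sort(1)
    print arr
OUTPUT
    [[1 3 2]
     [8 4 9]
     [3 5 8]]
    [[1 3 2]
     [3 4 8]
     [8 5 9]]
    [[1 2 3]
     [3 4 8]
     [5 8 9]]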
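The sketch below, in Python 3, replays the two in-place sorts on the 3x3 integer array that appears in the output above, and adds np.sort as a non-mutating alternative. Keeping a copy called original is an addition for illustration.

```python
import numpy as np

original = np.array([[1, 3, 2],
                     [8, 4, 9],
                     [3, 5, 8]])

arr = original.copy()
arr.sort(0)                  # in place: each column sorted top to bottom
after_axis0 = arr.copy()
arr.sort(1)                  # in place: each row sorted left to right
after_axis1 = arr.copy()

sorted_copy = np.sort(original, axis=1)   # returns a new array, original untouched

print(after_axis0)
print(after_axis1)
```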

Unique and other set logic
    names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
    print np.unique(names)
    print sorted(set(names))
OUTPUT
    ['Bob' 'Joe' 'Will']
    ['Bob', 'Joe', 'Will']


Testing Membership
Test if something is in an array.
    values = np.array([6, 0, 0, 3, 2, 5, 6])
    print np.in1d(values, [2, 3, 6])
OUTPUT
    [ True False False  True  True False  True]
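A Python 3 sketch of the same membership test. It uses np.isin, which newer NumPy releases offer for this job alongside the np.in1d call shown on the slide; treating it as the replacement here is a choice made for this example. The mask can then be used to pull out the matching values.

```python
import numpy as np

values = np.array([6, 0, 0, 3, 2, 5, 6])

mask = np.isin(values, [2, 3, 6])   # True where the element is one of 2, 3, 6
hits = values[mask]                 # Boolean indexing keeps only the matches

print(mask)
print(hits)
```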

Storing arrays on disk in binary format
    arr = np.arange(10)
    print arr
    np.save('some_array', arr)
    arr1 = np.load('some_array.npy')
    print arr1
NOTE: the file extension is .npy

Saving multiple arrays
    arr3 = np.arange(3)
    arr5 = np.arange(5)
    np.savez('array_archive.npz', a=arr3, b=arr5)
    arch = np.load('array_archive.npz')
    print type(arch)
    print arch['a']
    print arch['b']
    print dict(arch)
np.load returns a dict-like object.
OUTPUT
    <class 'numpy.lib.npyio.NpzFile'>
    [0 1 2]
    [0 1 2 3 4]
    {'a': array([0, 1, 2]), 'b': array([0, 1, 2, 3, 4])}
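A self-contained Python 3 round trip for both formats. Writing into a temporary directory and the names single, archive, loaded and b_loaded are assumptions made so the example leaves no files behind.

```python
import os
import tempfile

import numpy as np

arr3 = np.arange(3)
arr5 = np.arange(5)

with tempfile.TemporaryDirectory() as tmp:
    # One array per .npy file
    single = os.path.join(tmp, 'some_array.npy')
    np.save(single, arr3)
    loaded = np.load(single)

    # Several named arrays in one .npz archive
    archive = os.path.join(tmp, 'array_archive.npz')
    np.savez(archive, a=arr3, b=arr5)
    with np.load(archive) as arch:     # dict-like; closed when the block ends
        names = sorted(arch.files)
        b_loaded = arch['b']

print(loaded, names, b_loaded)
```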

Saving and loading text files
array_ex.txt
    1,2,3,4
    3,4,5,6
    arr = np.loadtxt('array_ex.txt', delimiter=',')
    print arr
    print type(arr)
OUTPUT
    [[ 1.  2.  3.  4.]
     [ 3.  4.  5.  6.]]
    <type 'numpy.ndarray'>
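The same load can be tried without creating a file. This Python 3 sketch feeds the two lines of array_ex.txt to np.loadtxt through an in-memory io.StringIO object, which is a stand-in chosen for the example.

```python
import io

import numpy as np

# Same contents as array_ex.txt, held in memory instead of on disk
text = io.StringIO("1,2,3,4\n3,4,5,6\n")

arr = np.loadtxt(text, delimiter=',')   # values are parsed as floats by default

print(arr)
print(arr.shape, arr.dtype)
```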

Indexing elements in a NumPy array

Two- dimensional array slicing

3d 2x2x2
    a = np.array([[[3, 1], [4, 3]], [[2, 4], [3, 3]]])
Slice 0:
    3,1
    4,3
Slice 1:
    2,4
    3,3

Indexing
    a[0]          first slice: [[3 1] [4 3]]
    a[1]          second slice: [[2 4] [3 3]]
    a[0][0]       first row of the first slice: [3 1]
    a[0][0][0]    a single element: 3
    a[0][1]       second row of the first slice: [4 3]
    a[1][0]       first row of the second slice: [2 4]

Slicing
    a[0,0,0]      same element as a[0][0][0]: 3
    a[0,1,0]      slice 0, row 1, column 0: 4
    a[:,0]        first row of both slices: [[3 1] [2 4]]
    a[:,:,0]      both slices, both rows, column 0: [[3 4] [2 3]]
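A Python 3 sketch that evaluates a few of these expressions on the 2x2x2 array, with the nesting written so that each slice is its own bracketed pair of rows. The result names are illustrative.

```python
import numpy as np

a = np.array([[[3, 1], [4, 3]],     # slice 0
              [[2, 4], [3, 3]]])    # slice 1

first_slice = a[0]          # 2x2 array
chained = a[0][1]           # index step by step
combined = a[0, 1]          # same element selection in one index
first_rows = a[:, 0]        # row 0 of every slice
first_cols = a[:, :, 0]     # column 0 of every row in every slice

print(a.shape)
print(first_rows)
print(first_cols)
```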

Slicing
Remember the slicing range a:b means elements from index a up to b-1 (b itself is excluded).

Data Types
Every element in an ndarray has the same type.
Basic types: int, float, complex, bool, object, string, unicode

Data Types
Types also have a defined size in bits, e.g. int32, float64.
The size defines storage size and accuracy.
To set the type:
    a = np.array([1, 2], dtype=np.int32)
    print a.dtype
OUTPUT
    int32
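How the size affects storage and accuracy can be sketched in Python 3 as follows. The variable names are illustrative; itemsize reports bytes per element, and the float32 value of 0.1 differs from the float64 one because fewer bits are kept.

```python
import numpy as np

small = np.array([1, 2], dtype=np.int32)
wide = np.array([1, 2], dtype=np.float64)

print(small.dtype, small.itemsize, small.nbytes)   # 4 bytes per element
print(wide.dtype, wide.itemsize, wide.nbytes)      # 8 bytes per element

# Same decimal literal stored at two precisions
x32 = np.array([0.1], dtype=np.float32)
x64 = np.array([0.1], dtype=np.float64)
differs = float(x32[0]) != float(x64[0])
print(differs)
```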

Iterating and Processing
You can iterate through an ndarray if you like:
    for e in a:
        print e
or for e in a[0]: etc.
but this is complicated and not advised. There is a better way ...

Element-wise Operations
    a = a * 2
    a = a + 5
    a = a + b
etc.
Functions:
    a.sum()
    a.max()
    a.mean()
    a.round()
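To show why element-wise operations are the better way, this Python 3 sketch computes the same result twice: once with explicit loops and once with whole-array arithmetic. The 2x2 input and the names looped and vectorised are assumptions made for the example.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Explicit iteration: works, but verbose and slow for large arrays
looped = np.empty_like(a)
for i, row in enumerate(a):
    for j, e in enumerate(row):
        looped[i, j] = e * 2 + 5

# Element-wise form: one expression applied to every element
vectorised = a * 2 + 5

print(vectorised)
print(vectorised.sum(), vectorised.max(), vectorised.mean())
```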

Slices and Indexes
    a[0] = a[0] / 2
    a[0,0,0] += 1
    a[:,1,1].sum()