1 Peter Fox Data Analytics – ITWS-4963/ITWS-6965 Week 1b, January 24, 2014 Relevant software and getting it installed.



2 Admin info (keep/print this slide)
Class: ITWS-4963/ITWS-6965
Hours: 12:00pm-1:50pm Tuesday/Friday
Location: SAGE 3101
Instructor: Peter Fox
Instructor contact: pfox@cs.rpi.edu, 518.276.4862 (do not leave a msg)
Contact hours: Monday** 3:00-4:00pm (or by email appt)
Contact location: Winslow 2120 (sometimes Lally 207A, announced by email)
TA: Lakshmi Chenicheri, chenil@rpi.edu
Web site: http://tw.rpi.edu/web/courses/DataAnalytics/2014
–Schedule, lectures, syllabus, reading, assignments, etc.

3 Today
Install application software
Get some data and read, explore, etc.
Install data technology and related software

4 Gnu R
R Studio – see R-intro.html in manuals
–http://www.rstudio.com/ide/download/
–Manuals - http://cran.r-project.org/doc/manuals/
–Libraries – at the command line – library(), or select the packages tab, and check/uncheck as needed
–http://cran.r-project.org/doc/manuals/R-lang.html

5 Scipy/numpy/iPython (NB)
Windows/Linux
–http://scipy.org/install.html
If you have a Mac
–Anaconda – http://continuum.io/downloads (preferred)
Use Launcher to install Spyder (and iP Qt)
–Do you have macports installed? ‘$ which port’
–No? (sorry – ask me for details…)
Install Xcode (from http://developer.apple.com/download - you will need to register - academic)
http://www.macports.org/install.php
Also see individual packages on the install page: http://scipy.org/getting-started.html
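
One way to confirm the Python installs above took effect is to ask Python which packages it can find. The following is a minimal sketch, not part of the original slides; the package list is an assumption and should be edited to match whatever you chose to install.

```python
import importlib.util

# Top-level import names for the packages this slide asks you to install.
# The list itself is an assumption for illustration; edit it to match your setup.
PACKAGES = ["numpy", "scipy", "IPython", "spyder"]


def is_installed(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


def report(names):
    """Map each package name to whether Python can locate it."""
    return {name: is_installed(name) for name in names}


if __name__ == "__main__":
    for name, found in report(PACKAGES).items():
        print(f"{name:10s} {'found' if found else 'MISSING'}")
```

Run it with the same Python interpreter that Spyder or iPython uses; a package reported as MISSING under one interpreter may still be installed under another.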

6 Matlab
http://dotcio.rpi.edu/services/software-labs
Student version
License works within RPI network, so may have to use VPN if outside
http://mathesaurus.sourceforge.net/octave-r.html – R for Matlab users

7 Files
http://escience.rpi.edu/data/DA
This is where the files for assignments and exercises will be placed

8 Exercises – getting data in
Rstudio
–Read in csv file (two ways to do this) - GPW3_GRUMP_SummaryInformation_2010.csv
–Read in excel file (directly or by csv convert) - 2010EPI_data.xls (2010EPI_data tab)
–See if you can plot some variables
–Anything in common between them?

9 Exercises
Scipy
–In Spyder read in a matlab file:
import scipy.io as sio
mat_contents = sio.loadmat('Williams40.mat')
mat_contents
Explore – plot, etc.
–Read in a csv file (your choice)
–Write out as matlab file, i.e. sio.savemat (see File I/O help http://docs.scipy.org/doc/scipy/reference/tutorial/io.html)
–http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html - start looking
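
The csv-to-matlab round trip in this exercise can be sketched as follows. This is an illustration under stated assumptions, not the course solution: the file names, column names, and values are made up so the example runs without the course data, and `csv_to_mat` is a hypothetical helper. It assumes a csv file with one header row and purely numeric columns.

```python
import csv
import os
import tempfile

import numpy as np
import scipy.io as sio


def csv_to_mat(csv_path, mat_path):
    """Read a numeric csv (one header row) and save each column as a matlab variable."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        rows = [[float(value) for value in row] for row in reader]
    data = np.array(rows)
    columns = {name: data[:, i] for i, name in enumerate(header)}
    sio.savemat(mat_path, columns)
    return columns


# Build a tiny stand-in csv file (made-up values, not course data).
workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "example.csv")
mat_path = os.path.join(workdir, "example.mat")
with open(csv_path, "w", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["depth", "temperature"])
    writer.writerows([[0.0, 18.5], [10.0, 17.2], [20.0, 15.9]])

csv_to_mat(csv_path, mat_path)

# Read the file back the same way the slide reads Williams40.mat.
mat_contents = sio.loadmat(mat_path)
variables = sorted(k for k in mat_contents if not k.startswith("__"))
print(variables)
print(mat_contents["depth"])
```

Note that `loadmat` returns a dictionary that also holds metadata keys beginning with `__`, and that one-dimensional arrays come back as two-dimensional row vectors by default, so `.ravel()` is handy when exploring.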

10 Exercises
Matlab
–Read in two different datasets: sw40_30s.mat or sw29adcp.mat; UChicago30.mat or Williams40.mat
–Explore them…
–Read in the csv files

11 If time or for fun…
se_eqs.xls
–Plot it
–Fit it
PRESSURE.xls
–Plot it
–Smooth it
–Fit it
…
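
For the "smooth it" and "fit it" steps, a minimal numpy sketch might look like the following. The slides do not say which smoothing or fitting method is intended, so a moving average and a straight-line least-squares fit are assumptions here, and the data are synthetic stand-ins because the contents of se_eqs.xls and PRESSURE.xls are not shown.

```python
import numpy as np


def moving_average(values, window):
    """Smooth a 1-D series with an unweighted moving average of length `window`."""
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fully overlaps the data,
    # so the result has len(values) - window + 1 points.
    return np.convolve(values, kernel, mode="valid")


def fit_line(x, y):
    """Least-squares straight-line fit; returns (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept


# Synthetic stand-in data: a line y = 2x + 1 plus small made-up wiggles.
x = np.arange(10, dtype=float)
noise = np.array([0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.0])
y = 2.0 * x + 1.0 + noise

smoothed = moving_average(y, 3)
slope, intercept = fit_line(x, y)
print("smoothed:", smoothed)
print("slope, intercept:", slope, intercept)
```

For the "plot it" step, the same arrays can be passed to a plotting library of your choice; real data may call for a higher-order fit or a different smoother.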

12 Install-fest… continues
http://projects.apache.org/indexes/category.html#database
–Hadoop (MapReduce)
–Pig (http://wiki.apache.org/pig/RunPig)
–HIVE (http://hive.apache.org/releases.html)
https://cwiki.apache.org/confluence/display/Hive/GettingStarted
https://cwiki.apache.org/confluence/display/Hive/Tutorial
https://cwiki.apache.org/confluence/display/Hive/LanguageManual
–Cassandra (binaries from DataStax)
And MongoDB - http://www.mongodb.org/

13 Objective
Get a good feel for the complexity and maturity of the data and tools environments
See some real data and start to consider what it will take to work with it
Big and complex data means time and memory, and laptops can only do so much
We’ll soon look at the intersections like RHadoop: https://github.com/RevolutionAnalytics/RHadoop/wiki

14 No more reading this week
Complete the installs as best you can
Pick your preferred application and data software and read up on them, try some examples

