2017 OTN-TOOLBOX Presented by Marta Mihoff and Alex Nunes

1 2017 OTN-TOOLBOX Presented by Marta Mihoff and Alex Nunes
Assisted by Brian Jones, Jon Pye, Lenore Bajona
More informal. We are going to work today.

2 Start up Toolbox
Open a CMD window.
Navigate to the install folder (Desktop/OTN-toolbox).
Execute the command “vagrant up”.
This should remain visible. Maybe with a scribble poster.

3 URLS
New R-Studio (user changed to vagrant, pw otn123)
New R notebooks
Python notebooks
This should remain visible. Maybe with a scribble poster.

4 Rstudio Changes
Cosmetic only.
User changed to “vagrant”; the password is the same, “otn123”.
Removed the Virtual Machine GUI, which none of you will notice.
File structure: programs are in the folder “otn-toolbox”.
The “data” folder is accessible from inside “otn-toolbox” or on its own.
While we are waiting for everyone to get their toolboxes started, I’ll give you a brief overview of what is changed and new.

5 File Structure
Home folder
otn-toolbox folders
Some new “notebooks” folders, which you should ignore

6 Code
Exists in folders.
The code is PUBLIC. You can see the code and change it in any way you want.
Changing files in these folders could break everything. You can recover by installing a new copy.
We recommend that you change copies rather than the original files.

7 DATA Folder
The “data” folder exists independently from all the code.
It is accessible from RStudio or from Desktop/OTN-Toolbox.
NEVER delete or rename the data folder.
Copy files into the data folder to make them accessible to programs.
In RStudio, files should be saved into the “data” folder.
Files and folders that are not in the “data” folder will be lost or overwritten on an update.

8 R and PY Notebooks
New wrappers for the same code that is executed from the RStudio GUI.
You may find them easier to use.
r-notebooks offer the same set of functions available in RStudio.
py-notebooks offer the same set plus new functions.
In future, all new functions will be developed for py-notebooks.

9 New Tools
Available in PY-Notebooks only.
data_subsetting.ipynb
Creates a subset of an input file based on a date range or a column value. Useful when the input file is extremely large and the run time is long.
residence_index.ipynb
Offers four methods to choose from. Mix and match.
interactive_residence_index.ipynb
Same as the previous, with a different map.
visual_detection_timeline.ipynb
Creates an interactive time series from a detection file.
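
To make the idea behind data_subsetting.ipynb concrete, here is a minimal sketch of subsetting by date range or column value using only the Python standard library. It is not the toolbox's own code: the function name subset_rows, the sample rows, and the timestamp format "%Y-%m-%d %H:%M:%S" are assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

# Assumed timestamp layout; adjust to match your own detection file.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def subset_rows(rows, date_column=None, start=None, end=None,
                column=None, value=None):
    """Yield rows inside [start, end] and/or whose `column` equals `value`.

    Hypothetical helper: any filter left as None is simply not applied.
    """
    for row in rows:
        if date_column is not None:
            when = datetime.strptime(row[date_column], DATE_FORMAT)
            if start is not None and when < start:
                continue
            if end is not None and when > end:
                continue
        if column is not None and row[column] != value:
            continue
        yield row


# Made-up sample data standing in for a detection file.
SAMPLE = """datecollected,station,catalognumber
2016-05-01 10:00:00,ST01,TAG-A
2016-06-15 12:30:00,ST02,TAG-A
2016-07-20 08:45:00,ST01,TAG-B
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
subset = list(subset_rows(reader,
                          date_column="datecollected",
                          start=datetime(2016, 6, 1),
                          end=datetime(2016, 12, 31),
                          column="station",
                          value="ST01"))
```

Because the rows are streamed one at a time, the same approach works on a file opened with open(...) without loading the whole file into memory.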

10 File Preparation
OTN detection extracts are ready to go as is.
VUE CSV export needs preparation:
Latitude and longitude columns must be filled in
Rename column receiver → station
Rename column transmitter → catalognumber
Rename column datetime → datecollected
Column unqdetecid can be added with function add_uniquecid
Data Subset
If your file is very large use the subset tool: py_notebooks/data_subsetting.ipynb
The data subset tool and add unique id tools are self-explanatory. Stick your hand up if you need help.
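
The column renames listed for a VUE CSV export can also be scripted. The sketch below uses only the Python standard library and assumes the export's header uses exactly the lowercase names given on the slide (receiver, transmitter, datetime, latitude, longitude); the function name prepare_vue_csv and the sample rows are hypothetical. It does not add unqdetecid, which the toolbox handles with add_uniquecid.

```python
import csv
import io

# Renames taken from the slide: old column name -> name the toolbox expects.
RENAMES = {
    "receiver": "station",
    "transmitter": "catalognumber",
    "datetime": "datecollected",
}


def prepare_vue_csv(src, dst):
    """Copy src to dst with renamed columns.

    Returns the line numbers (header is line 1) of rows whose latitude or
    longitude is empty, since those must be filled in before use.
    """
    reader = csv.DictReader(src)
    fieldnames = [RENAMES.get(name, name) for name in reader.fieldnames]
    writer = csv.DictWriter(dst, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    missing = []
    for line_no, row in enumerate(reader, start=2):
        renamed = {RENAMES.get(key, key): val for key, val in row.items()}
        if not renamed["latitude"].strip() or not renamed["longitude"].strip():
            missing.append(line_no)
        writer.writerow(renamed)
    return missing


# Made-up sample export; the second data row lacks coordinates.
SAMPLE = (
    "datetime,receiver,transmitter,latitude,longitude\n"
    "2016-05-01 10:00:00,VR2W-1,A69-1,44.6,-63.5\n"
    "2016-05-01 11:00:00,VR2W-1,A69-1,,\n"
)

out = io.StringIO()
missing_rows = prepare_vue_csv(io.StringIO(SAMPLE), out)
prepared = out.getvalue()
```

Rows flagged in missing_rows still need their coordinates filled in by hand or from a station list before the file is used.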

11 Data Sub-setting: Demo

12 Notebook: Execution
The current cell is highlighted with a blue or green bar on the left-hand side.
When a cell is highlighted, clicking the run button executes the code in that cell.

13 Exercise: Filter suspect detections (45 min)
Copy your detection file into your “data” folder.
Choose one of the three URLs:
In py-notebooks open load_and_filter_detections.ipynb
In r-notebooks open filter_driver.ipynb
In RStudio open filter_driver.r
Everyone needs to do the first step of this exercise, to get a distance matrix.

14 Filter tool
What to fill in
These are the parameters you need to fill in:
Filename
detection_radius (use 400)

15 Filter Tool Output
Step 1
File of suspect detections
File of calculated distances between stations (Distance Matrix)
Step 2
File of filtered detections (suspects removed)
Distance Matrix (adjusted)
Look at records 14 to 17 in the suspect file.
Look at the distance matrix.
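
For readers wondering what a station distance matrix contains, the sketch below computes pairwise great-circle distances in metres with the haversine formula. This is a generic illustration, not the toolbox's implementation: the slides do not say how the filter tool calculates distances or how detection_radius adjusts them, and the station names and coordinates here are made up.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def distance_matrix(stations):
    """Return {station: {station: metres}} for every pair of stations."""
    names = sorted(stations)
    return {
        a: {b: haversine_m(*stations[a], *stations[b]) for b in names}
        for a in names
    }


# Hypothetical stations as (latitude, longitude) in decimal degrees.
STATIONS = {
    "ST01": (44.60, -63.50),
    "ST02": (44.61, -63.50),
    "ST03": (44.60, -63.49),
}

matrix = distance_matrix(STATIONS)
```

The result is symmetric with zeros on the diagonal, which is a quick sanity check to apply to any distance matrix file as well.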

16 Exercise: Interval or Cohort data (15 min)
For Interval data (one step):
In py-notebooks or r-notebooks open interval_data_driver.ipynb
In RStudio open interval_data_driver.r
For Cohort data (two steps):
In py-notebooks open detection_compression.ipynb first, then cohort_data.ipynb
In r-notebooks open compress_driver.ipynb first, then cohort_driver.ipynb
In RStudio open compress_driver.r, then cohort_driver.r
Choose whichever is more appropriate or interesting for your data.

17 Interval/Cohort
What to fill in
Interval: use outputs from the Filter step
detection_file <- 'detections.csv' # Detection file input name
distance_matrix <- 'detections_distance_matrix_v00.csv'
OR for Cohort
Compression:
detection_file <- 'detections.csv'
Cohort (needs output from the compression step):
time_interval <- 6
compressed_file <- 'compressed_detections.csv'
File names will appear in messages. Cut them from the messages and paste them into the parameters you are filling in. You will need the .csv suffix.

18 Residence / Visual Timeline

19 Teach yourself to program
Free open software
Extremely powerful
Standardized
Python
Python(x,y): rival to MATLAB and Rstudio
PostgreSQL
One of the best things you can do to further your career is teach yourself to program. You will be way ahead of your colleagues who have not bothered to do this. You may think you don’t have time. But consider how much time you would spend doing these very simple, common, everyday tasks if we had not provided these programming solutions. With some basic programming skills, much more complex questions can be answered. In reality, you do not have time not to learn to program.

20 How? Coursera and Code Academy
Code Academy: Python course
Rice University: An Introduction to Interactive Programming in Python. Next session Sep 15
University of Michigan: Programming for Everybody. Next session Oct 6
Johns Hopkins: R Programming. Part of the "Data Science" Specialization. Next session Oct 6
There are some wonderful online courses. The ones listed here do not require any programming experience. Coursera is an education platform that partners with top universities and organizations worldwide to offer courses online for anyone to take, for free. I can attest to the quality of the instructors and the course syllabi. They are first class.

21 Python solutions for common Science questions.
Data Science from Scratch, Joel Grus, O’Reilly Media Inc., 2015

