R Meets Julia. Dr. Chris Musselle – Consultant

Presentation transcript:

1 R Meets Julia. Dr. Chris Musselle – Consultant, cmusselle@mango-solutions.com

2 Outline. Julia: What, So What, When? Julia: Where It's Currently At. Julia and R. Case Study: Calculating String Similarity.

3 Julia (julialang.org). A flexible dynamic language appropriate for scientific and numerical computing. Arrived Feb 2012 after 2 years of development at MIT. Julia 0.3 released Aug 2014. Free and open source (MIT licensed).

4 Language Features. Performance comparable to compiled languages. Designed with distributed computing in mind. Dynamic typing, optional type declarations, multiple dispatch. Libraries written in Julia; git-based package management. Direct calling of C and Fortran libraries. Interactive REPL (“Read-Eval-Print Loop”).

5 The Vision. “We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as MATLAB, as good at gluing programs together as the shell. … something that provides the distributed power of Hadoop - without the kilobytes of boilerplate Java and XML” (Julia’s authors). Source: http://julialang.org/blog/2012/02/why-we-created-julia/

6 Too Good to be True? Scientific computing, though it requires high performance, has shifted to dynamic languages. They are more productive: human time is more expensive than CPU time. Many advancements in compiler techniques and language design over the years, e.g. JIT compilation, can now greatly mitigate the performance trade-off associated with a dynamic language. But this has required building from the ground up.

7 So How Fast is Fast? Source: http://julialang.org/benchmarks/

8 Where’s Julia at now? Standard Library: core syntax, collections and data structures; linear algebra, BLAS, sparse matrices; package manager; graphics; unit and functional testing; profiling. External Packages: a total of 384 external packages written by 138 primary authors. http://pkg.julialang.org/

9 Who Uses it? JuliaLang – the core language; JuliaStats – statistics; JuliaOpt – numerical optimization library; JuliaSparse – sparse matrix solvers; JuliaDiff – differentiation tools; JuliaWeb – web stack tools; JuliaGPU – GPU computing; JuliaQuant – financial analysis libraries; JuliaAstro / JuliaQuantum – astronomy/physics/chemistry.

10 When to Use it? Julia allows fast prototyping of code that is also fast to execute. Best used to code up bespoke algorithms. The Julia ecosystem is in its infancy; the majority of packages focus on numerical computation. You may need to re-implement ‘tools’ from scratch, e.g. parsers, data structures, algorithms.

11 Julia and R? Calling R from Julia: https://github.com/lgautier/Rif.jl Calling Julia from R: system calls (a new session each time); https://github.com/armgong/RJulia

12 Case Study: String Similarity (Edit Distance). The minimum number of “edit” operations needed to turn one string into another, where an edit is an insertion, a deletion, or a substitution. E.g. edits from “kitten” to “sitting”: substitute “s” for “k” at position 1; substitute “i” for “e” at position 5; insert “g” at position 7.
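The three edits in this example can be checked by applying them by hand. The following is a minimal sketch in Python (one of the languages in the comparison), not code from the talk; the helper names `substitute` and `insert` are hypothetical, and positions are 1-indexed, so the inserted “g” becomes the seventh character.

```python
def substitute(s: str, pos: int, ch: str) -> str:
    """Replace the character at 1-indexed position pos with ch."""
    return s[:pos - 1] + ch + s[pos:]


def insert(s: str, pos: int, ch: str) -> str:
    """Insert ch so that it ends up at 1-indexed position pos."""
    return s[:pos - 1] + ch + s[pos - 1:]


word = "kitten"
word = substitute(word, 1, "s")  # "sitten"
word = substitute(word, 5, "i")  # "sittin"
word = insert(word, 7, "g")      # "sitting"
print(word)
```

Three edits are enough to reach “sitting”, which is why the edit distance between the two words is 3.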

13 Case Study: String Similarity (Edit Distance). This particular formulation is known as the Levenshtein distance. Used the optimised “dynamic programming” approach. Pseudocode available at http://en.wikipedia.org/wiki/Levenshtein_distance Applications: spell checking; computational biology; natural language processing; speech recognition.
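The implementations used in the talk are not reproduced in this transcript. As an illustration only, a dynamic programming version might look like the sketch below. It assumes the common two-row variant of the table-filling scheme, in which only the previous row of the distance table is kept in memory; the function name `levenshtein` is a hypothetical choice.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance using a two-row dynamic programming table."""
    # previous[j] is the distance between the prefix of a processed so far
    # and the first j characters of b. Initially that prefix is empty,
    # so the distance is j insertions.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string: i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Keeping two rows instead of the full table reduces memory from O(len(a) × len(b)) to O(len(b)) while the running time stays proportional to the product of the two lengths.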

14 Case Study: String Similarity (Edit Distance). Compared 5 different approaches: R_lev, written purely in R; R_adist, using the built-in adist function in R; Julia, written purely in Julia; Python_np_lev, written in Python (using numpy); Python_c_lev, a Python wrapper to a C function.

15 Results

16 Results (minus R_lev)

17 Key Results. The pure R implementation was over 10 times slower than adist and Python, and 33 times slower than Julia. Found Julia 2.5 to 3 times faster than Python and R. Reading line by line <<< Reading in all at once. Python + numpy ~ R’s built-in adist.

18 Summary. Julia certainly has great potential. Strengths: numerical computation in a dynamic “REPL” language with clean syntax. Weaknesses: playing catch-up with tools and libraries. Early days for integration with other languages; Julia → other languages is good, though. Don’t prototype your next algorithm in R if speed matters! Found Julia 2.5 to 3 times faster than Python and R.

19 Thank You For Your Attention. Any Questions? julialang.org Calling R from Julia: https://github.com/lgautier/Rif.jl Calling Julia from R: https://github.com/armgong/RJulia Edit distance: http://en.wikipedia.org/wiki/Levenshtein_distance

20 What’s Next? Accepted GSoC projects 2014: Libgit2 support; linear algebra for generic types; Julia + Light Table (IDE development); IJulia interactive widgets; 3D visualization package for Julia.

