
Henrik Bengtsson Mathematical Statistics, Centre for Mathematical Sciences Lund University, Sweden DSC-2003, Vienna. March 20-22, 2003.


1 The R.oo package: Robust object-oriented design & implementation with support for references
Henrik Bengtsson, hb@maths.lth.se
Mathematical Statistics, Centre for Mathematical Sciences, Lund University, Sweden
DSC-2003, Vienna, March 20-22, 2003

2 2 of 22 http://www.maths.lth.se/help/R/
Outline
Purpose and what the package is and is not.
RCC: R Coding Conventions (draft).
Reference variables.
The root class Object.
setMethodS3() & setConstructorS3().
Rdoc comments.
Static methods.
Virtual fields.
trycatch(): exception handling based on class.

3 Purposes
End user (the most important person at the end of the day!)
    – Provide consistent object-oriented APIs across different packages, e.g. by having a well-defined naming convention for classes, methods, fields and variables.
    – Make class inheritance more explicit.
    – Provide a simpler API, e.g. fewer arguments.
    – More memory-efficient packages.
Developer / programmer
    – Provide reference variables to reduce memory requirements and data redundancy.
    – R Coding Conventions, e.g. naming conventions.
    – Create generic functions automatically.
    – Make code cleaner and remove the need for tedious code repetition.
    – Minimize the risk of package conflicts.
    – More code checking when creating methods and classes, to catch errors early on.
    – Catch rare but "classical" bugs, e.g. using reserved words in method names.
    – Keep help pages up to date with the source code by allowing Rd documentation to be placed together with the code in the source files.

4 Real world example
    # Read all GenePix Result files
    gpr <- MicroarrayData$read(pattern="*.gpr")
    # Extract the foreground & background signals of the red and
    # the green channels. The slide layout is also included.
    raw <- as.RawData(gpr)
    # Get the background corrected signal as M=log(R/G) and A=log(RG)/2.
    ma <- getSignal(raw, bgSubtract=TRUE)
    normalizeWithinSlide(ma, method="p")           # print-tip normalization.
    knownGenes <- c(50,194,3433,5541,6384)
    plot(ma); highlight(ma, knownGenes)            # highlights the data points from the
    plotPrintorder(ma); highlight(ma, knownGenes)  # correct slide in the correct space.
    plotSpatial(ma); highlight(ma, knownGenes)
    plotSpatial3d(gpr, field="area", col=getColor(ma))
    # Write the normalized data to a tab-delimited file
    write(ma, "NormalizedExpressions.dat")

5 What the package is and isn't
R.oo is not supposed to replace S3 or S4. It is an extra layer on top of S3 (eventually S4) that moves the focus from S3 and S4 details to object-oriented design and implementation.
[Diagram: R.oo drawn as a layer on top of the R environment (S3 and eventually S4).]
It has been tested and verified for more than 2 years!

6 RCC: R Coding Conventions (draft)
Standardizes the coding style.
    – Example of the naming conventions:
        Variables, objects, fields and methods should start with a lower case letter (methods being verbs), e.g. shape$side and normalize().
        Classes should be nouns starting with an upper case letter, e.g. MicroarrayData.
        Constants should be in all upper case, e.g. Colors$RED.HUE.
Similar to Java.
Standards
    – make the code (and the design) easier to read, share and maintain.
    – reduce the risk of bugs and misunderstandings.
http://www.maths.lth.se/help/R/RCC/

7 Reference variables
Memory efficient.
Minimizes the amount of redundant data.
Very useful for some data structures, e.g. graphs.
References in R.oo are implemented using the environment data type.
    – Collected by the R garbage collector.
(More user-friendly method interfaces, since methods can "communicate" with each other by updating the state of the object.)
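
The reference behaviour described on this slide can be sketched in base R without R.oo. The helper names newCounter(), increment() and incrementList() are hypothetical and exist only for this illustration; the point is that an environment is passed by reference, while a list is copied.

```r
# Hypothetical helpers (not R.oo functions) showing reference semantics
# with an environment.
newCounter <- function() {
  env <- new.env()
  assign("count", 0, envir = env)
  env
}

# Updates the shared state in place; the caller does not reassign anything.
increment <- function(counter) {
  assign("count", get("count", envir = counter) + 1, envir = counter)
  invisible(counter)
}

a <- newCounter()
b <- a                 # 'b' refers to the same environment; no data is copied
increment(a)
increment(b)
countA <- get("count", envir = a)   # both updates hit the same object

# Contrast: a list has value semantics, so the update made inside the
# function is lost when the function returns.
incrementList <- function(x) {
  x$count <- x$count + 1
  invisible(x)
}
l <- list(count = 0)
incrementList(l)
countL <- l$count      # unchanged
```

Here countA ends up as 2 and countL stays 0, which is the difference between a reference variable and an ordinary R value.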

8 A common root class: Object
1. All classes should have the common root class Object.
    – A similar idea exists in R today, e.g. print(), as.character() etc., but a common root class makes it more explicit.
Object
    $(name): ANY
    $<-(name, value)
    [[(name): ANY
    [[<-(name, value)
    as.character(): character
    attach(private=FALSE, pos=2)
    clone(): Object
    data.class(): character
    detach()
    equals(other): logical
    extend(this,...className,...): Object
    finalize()
    getFields(private=FALSE): character[]
    hashCode(): integer
    ll(...): data.frame
    static load(file): Object
    objectSize(): integer
    print()
    save(file=NULL,...)

9 Object – the common root class
[Class diagram, classes listed by package; every class has Object as its root.]
R.oo: Object, Exception, RccViolationException
com.braju.sma: MicroarrayData, GenePixData, ImaGeneData, QuantArrayData, ScanAlyzeData, SpotData, SpotFinderData, MAData, RawData, RGData, TMAData, Layout, GalLayout
R.io: Reporter, HtmlReporter, LaTeXReporter, TextReporter, MultiReporter, File, FileFilter, RspEngine
R.graphics: BitmapImage, MonochromeImage, GrayImage, RGBImage, ColorDevice

10 A common root class: Object
1. All classes should have the common root class Object.
    – A similar idea exists in R today, e.g. print(), as.character() etc., but a common root class makes it more explicit.
2. Fields of an Object can be accessed as elements of a list, e.g.:
    – square$side and
    – square[["side"]] <- 23
3. Methods can also be called as
    – square$getArea()
4. The implementation of reference variables is taken care of within the Object class. Under the hood, we roughly have:
    "$.Object" <- function(object, name) {
      get(name, envir=attr(object, ".env"))
    }
    "$<-.Object" <- function(object, name, value) {
      assign(name, value, envir=attr(object, ".env"))
      object  # a replacement function must return the object
    }
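
A runnable base-R version of the "under the hood" idea might look as follows. The class name RefObject and its constructor are hypothetical stand-ins for R.oo's Object, and the lookup is restricted to the object's own environment (inherits = FALSE), which is an assumption of this sketch rather than something stated on the slide.

```r
# Hypothetical constructor: the object itself is a dummy value; all fields
# live in an environment stored in the ".env" attribute.
RefObject <- function(...) {
  env <- new.env()
  fields <- list(...)
  for (name in names(fields)) assign(name, fields[[name]], envir = env)
  structure(NA, .env = env, class = "RefObject")
}

# Field read: look the name up in the object's environment only.
"$.RefObject" <- function(object, name) {
  get(name, envir = attr(object, ".env"), inherits = FALSE)
}

# Field write: update the environment in place and return the handle.
"$<-.RefObject" <- function(object, name, value) {
  assign(name, value, envir = attr(object, ".env"))
  object
}

square <- RefObject(side = 2)
alias <- square           # copies the handle, not the fields
alias$side <- 23
sideSeen <- square$side   # the change made through 'alias' is visible here
```

Because both handles carry the same environment, an assignment through alias changes what square sees, which is the behaviour the Object class provides to all of its subclasses.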

11 setMethodS3()
Defines a method of a class.
Creates a generic function automatically if and only if it is missing.
RCC:
    – Methods should start with a lower case letter.
    – Asserts that a correct method name is used; reserved words and names of basic functions that must not be overwritten or redefined are protected.
Does not require the Object class.
    setMethodS3("plotPrintorder", "MAData", function(object,...) {... })
    setMethodS3("next", "Iterator", function(object,...) {... })
    Error: [2003-03-18 16:28:00] RccViolationException: Method names must not be same as a reserved keyword in R: next, cf. http://www.maths.lth.se/help/R/RCC/
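
The two checks on this slide (create the generic only when it is missing, refuse reserved words) can be approximated in a few lines of base R. This is only a sketch of the idea, not the R.oo implementation: defineMethod() and the RESERVED vector are hypothetical, and the list of protected names is deliberately short.

```r
# Hypothetical helper illustrating the idea; this is not the R.oo source.
RESERVED <- c("if", "else", "repeat", "while", "function", "for", "next",
              "break", "TRUE", "FALSE", "NULL", "NA", "Inf", "NaN")

defineMethod <- function(name, class, definition, envir = parent.frame()) {
  force(envir)
  if (name %in% RESERVED) {
    stop("Method names must not be the same as a reserved keyword in R: ", name)
  }
  # Create a generic only if no function with that name can be found.
  if (!exists(name, mode = "function", envir = envir)) {
    generic <- eval(substitute(function(...) UseMethod(NAME), list(NAME = name)))
    environment(generic) <- envir
    assign(name, generic, envir = envir)
  }
  # Store the method under the usual S3 name <generic>.<class>.
  assign(paste(name, class, sep = "."), definition, envir = envir)
  invisible(NULL)
}

defineMethod("getArea", "Square", function(object, ...) object$side^2)
sq <- structure(list(side = 3), class = "Square")
areaOfSq <- getArea(sq)   # dispatches to getArea.Square()

# A reserved word is rejected before anything is defined.
msg <- tryCatch(
  defineMethod("next", "Iterator", function(object, ...) NULL),
  error = function(e) conditionMessage(e)
)
```

The caller never writes the generic getArea() by hand; it appears as a side effect of defining the first method, and the attempt to define next() fails with an error message instead of silently producing an uncallable function.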

12 Problems with generic functions
Hard to check if a function (generic or not) already exists.
Ad hoc solutions for creating generic functions "automatically".
Under the S3 scheme, it is possible to create generic functions that are truly generic:
    normalize <- function(...) UseMethod("normalize")
Note that the first argument is omitted. Otherwise, it would be impossible to have default functions with no arguments, e.g. search().
The R.oo package automatically creates generic functions as above.
We are not aware of how to do the same in S4 (this is the main reason why R.oo is currently staying with S3).
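
To see why the argument-free generic matters, the following base-R sketch attaches three methods with different signatures to one generic. The methods themselves are invented for the illustration; only the form of the generic is taken from the slide.

```r
# The generic names no arguments of its own, so each method is free to
# choose its own signature.
normalize <- function(...) UseMethod("normalize")

# Default method that can be called without any argument at all.
normalize.default <- function(...) "nothing to normalize"

# Numeric vectors: centre and scale.
normalize.numeric <- function(x, ...) (x - mean(x)) / sd(x)

# Character vectors: trim white space, optionally convert to upper case.
normalize.character <- function(s, upper = FALSE, ...) {
  s <- trimws(s)
  if (upper) toupper(s) else s
}

z <- normalize(c(1, 2, 3))                # dispatches on the numeric vector
u <- normalize("  abc ", upper = TRUE)    # dispatches on the string
d <- normalize()                          # no argument: falls back to the default
```

Had the generic been declared as function(x, ...), the methods would be tied to a first argument named x, and the call without arguments would be awkward to support.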

13 setConstructorS3()
Defines the constructor method of a class, but also the class.
RCC:
    – Asserts that a correct class name is used; reserved words and names of basic functions that must not be overwritten or redefined are protected.
    – Class and constructor names should start with an UPPER CASE letter.
    – Constructors should be named the same as the class.
Does not require the Object class.
    setConstructorS3("MAData", function(M, A, layout=NULL) {
      extend(MicroarrayData(layout=layout), "MAData",
        M = as.matrix(M),
        A = as.matrix(A)
      )
    })
Constructor/class definition hybrid: Creates an object of the super class, which is then "extended" into an MAData object with additional fields.
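
The "create the superclass object, then extend it" pattern can be imitated with plain S3 lists. In the sketch below, extendClass(), Shape() and Square() are hypothetical names, and the objects are ordinary lists rather than R.oo references, so this shows only how the class vector and the fields are built up.

```r
# Hypothetical base-R analogue of the "extend" step: add fields to an
# existing object and put the new class name in front of the old ones.
extendClass <- function(object, className, ...) {
  fields <- list(...)
  for (name in names(fields)) object[[name]] <- fields[[name]]
  class(object) <- c(className, class(object))
  object
}

# Superclass constructor.
Shape <- function() structure(list(), class = "Shape")

# Subclass constructor: build a Shape, then extend it into a Square.
Square <- function(side = 0) {
  extendClass(Shape(), "Square", side = side)
}

sq <- Square(3)
classes <- class(sq)               # subclass first, superclass second
isShape <- inherits(sq, "Shape")   # the Square is still a Shape
```

The constructor both defines what a Square contains and records where it sits in the hierarchy, which is the hybrid role the slide describes.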

14 Quick inspection of a class
print(<class>) or simply type the class name at the prompt and press ENTER, e.g.
    > MAData
    MAData extends MicroarrayData, Object {
      public A
      public layout
      public M
      ...
      normalizeWithinSlide(...)
      ...
      public plot(what="MvsA",...)
      public plot3d(...)
      public plotPrintorder(what="M",...)
      ...
      public print(...)
      public save(file=NULL, path=NULL,...)
    }
[Class diagram with the classes Object, MicroarrayData, MAData and Layout.]
MAData
    fields: A: matrix, M: matrix
    methods: as.RGData(): RGData, ..., normalizeWithinSlide(...), normalizeAcrossSlides(...), ...
Further methods shown: plot(...), plot3d(...), plotPrintorder(...), ...
Layout
    fields: ngrid.c: integer, ngrid.r: integer, nspot.c: integer, nspot.r: integer, ...
    methods: getName(...): character, getId(...): character, ..., nbrOfSpots(): integer, nbrOfGrids(): integer, ...

15 Quick inspection of an object
print(<object>) or simply <object> and ENTER at the prompt, which by default is equal to print(as.character(<object>)), e.g.
    > ma
    [1] "MAData: M (5184x4), A (5184x4), Layout: Grids: 4x4 (=16), spots in grids: 18x18 (=324), total number of spots: 5184. Spot name's are specified. Spot id's are specified."
ll(<object>) gives detailed information about the (public) fields, e.g.
    > ll(ma)
      member data.class     dimension object.size
    1 A      SpotSlideArray c(5184,4)      143940
    2 layout Layout         1                 428
    3 M      SpotSlideArray c(5184,4)      143940
    > ll(ma$layout)  # or ll(getLayout(ma))
       member       data.class dimension object.size
    1  geneGrps     NULL       0                   0
    2  geneSpotMap  NULL       0                   0
    3  id           character  5184            63868
    4  ngrid.c      numeric    1                  36
    ...
    11 printtipGrps NULL       0                   0

16 Rdoc: Source-to-Rd converter
Rdoc comments are Rd documentation within the source files:
    – easy to generate complete Rd files from source files.
    – less risk of forgetting to update Rd files.
    – automatically generates class hierarchy and method lists.
    – extra tags to include external files, e.g. example code.
Does not require the Object class.
    #####################################################################/**
    # @Class Matlab
    #
    # \title{Matlab client for remote or local Matlab access}
    #
    # \description{
    #  @include "Matlab.declaration.Rdoc"
    # }
    #
    # \usage{
    #  matlab <- Matlab(host="localhost", port=9999, remote=FALSE)
    # }
    #
    # \arguments{
    #  \item{host}{Name of host to connect to.
    #   Default value is \code{localhost}.}
    #  \item{port}{Port number on host to connect to.
    #   Default value is \code{9999}.}
    #  \item{remote}{If \code{TRUE}, all data to and from the Matlab server will
    #   be transferred through the socket connection, otherwise the data will
    #   be transferred via a temporary file. Default value is \code{FALSE}.}
    # }
    #
    # \section{Fields and Methods}{
    #  @include "Matlab.methods.Rdoc"
    #  @include "Matlab.inheritedMethods.Rdoc"
    # }
    #
    # \examples{\dontrun{@include "Matlab.Rex"}}
    #
    # \author{Henrik Bengtsson, \url{http://www.braju.com/R/}}
    #
    # \seealso{
    #  Stand-alone methods \code{\link{readMAT}()} and \code{\link{writeMAT}()}
    #  for reading and writing MAT file structures.
    # }
    #
    # @visibility public
    #*/######################################################################
    setConstructorS3("Matlab", function(host="localhost", port=9999, remote=FALSE) {
      extend(Object(), "Matlab",...

17 Static methods
Methods that are specific to a class and do not belong to a certain object.
Keeps the focus on classes/objects, not methods.
    – For instance, static method names are easy to remember for the end user ("first class then method"), e.g.
        MicroarrayData$read("slide1.gpr")
        Sound$read("chime.wav")
        Colors$getHeatColors(1:10)
      instead of
        readMicroarrayData("slide1.gpr")
        readSound("chime.wav")
        getHeatColors(1:10)
      which might not even be unique!

18 Virtual fields
Virtual fields are fields that do not exist, but appear to do so because of existing methods get<Field>() and set<Field>().
    – Example 1: The virtual field area of the Square class is defined by defining getArea() and setArea(): square$area will call getArea(square), which returns the area (calculated from the field side or in some other way); square$area <- -12 will call setArea(square, -12), which then throws an OutOfRangeException.
    – Example 2: Private fields, e.g. side, can be protected by defining setSide(), which throws a NoSuchFieldException.
    – Example 3: The constant field RED.HUE can be write protected by defining setRED.HUE(), which throws an AssignmentException.
    – Example 4: Provide cached fields that can be calculated from the other fields, but are cached in case they are accessed often and take a long time to calculate. The cache can be removed in case of low memory.
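
Example 1 above can be mimicked in base R by letting the "$" and "$<-" methods look for an accessor before touching the stored fields. Everything named here (the class VSquare, the helper accessor(), and the convention of looking up get<Field>.<class> and set<Field>.<class> as plain functions) is an assumption made for this sketch; R.oo's own lookup rules may differ, and stop() stands in for throwing an OutOfRangeException.

```r
# Hypothetical reference class: fields are kept in an environment.
VSquare <- function(side = 0) {
  env <- new.env()
  assign("side", side, envir = env)
  structure(NA, .env = env, class = "VSquare")
}

# Accessors that define the virtual field 'area'.
getArea.VSquare <- function(this) {
  get("side", envir = attr(this, ".env"))^2
}
setArea.VSquare <- function(this, value) {
  if (value < 0) stop("OutOfRangeException: the area must be zero or greater: ", value)
  assign("side", sqrt(value), envir = attr(this, ".env"))
  invisible(this)
}

# Finds e.g. getArea.VSquare for prefix "get" and field "area", or NULL.
accessor <- function(prefix, name, object) {
  fn <- paste0(prefix, toupper(substring(name, 1, 1)), substring(name, 2),
               ".", class(object)[1])
  if (exists(fn, mode = "function")) get(fn, mode = "function") else NULL
}

"$.VSquare" <- function(object, name) {
  getter <- accessor("get", name, object)
  if (!is.null(getter)) return(getter(object))
  get(name, envir = attr(object, ".env"), inherits = FALSE)
}

"$<-.VSquare" <- function(object, name, value) {
  setter <- accessor("set", name, object)
  if (!is.null(setter)) {
    setter(object, value)
  } else {
    assign(name, value, envir = attr(object, ".env"))
  }
  object
}

sq <- VSquare(side = 3)
area1 <- sq$area     # computed by getArea.VSquare() from 'side'
sq$area <- 16        # routed to setArea.VSquare(), which updates 'side'
side2 <- sq$side     # real field, read directly
msg <- tryCatch({ sq$area <- -12; "no error" },
                error = function(e) conditionMessage(e))
```

The field area is never stored; reading it computes a value and writing it either updates side or fails, so the caller cannot tell a virtual field from a real one.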

19 Summary example
    setConstructorS3("Square", function(side=0) {
      # Creates an object of class Square. Square, whose fields are
      # defined at the same time, extends the class Shape.
      extend(Shape(), "Square",
        side = side   # 'side' is public
      )
    })
    setMethodS3("setSide", "Square", function(this, side) {
      # sq$side <- "a" will throw a NonNumericException
      if (!is.numeric(side))
        throw(NonNumericException("Trying to set the side of a square to a non-numeric value: ", side))
      # sq$side <- -12 will throw an OutOfRangeException
      if (side < 0)
        throw(OutOfRangeException("The side of a square must be zero or greater: ", side))
      this$side <- side   # Assignment remains also after returning!
    })

20 Extended exception handling
Throw Exception objects, which can be caught (quietly) based on class, e.g.
    trycatch({
      # Calls setSide(), which throws an OutOfRangeException.
      sq$side <- -12
    }, NonNumericException = {
      cat("The side of a square must be a numeric value.\n")
    }, ANY = {
      # catches any other types of Exception (also try-error).
      print(Exception$getLastException())
    }, finally = {
      # always double the side whatever happens.
      sq$side <- 2*sq$side
    })
    Error: [2003-03-08 12:11:43] OutOfRangeException: The side of a square must be zero or greater: -12
[Class diagram: Object, Exception and RccViolationException (package R.oo); OutOfRangeException and NonNumericException.]
Exception
    static getLastException(): Exception
    getMessage(): character
    getWhen(): POSIX time
    throw()
Does not require the Object class.
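
Catching by class can also be tried out with base R's tryCatch() and classed conditions, which is what the sketch below uses in place of the package's trycatch() and Exception class. The constructor exception() and the function setSide() are hypothetical, and the square is a bare environment so that the finally clause can update it in place.

```r
# Hypothetical constructor for classed conditions (base R, not R.oo's Exception).
exception <- function(class, message) {
  structure(
    list(message = message, call = NULL, when = Sys.time()),
    class = c(class, "Exception", "error", "condition")
  )
}

setSide <- function(sq, side) {
  if (!is.numeric(side)) {
    stop(exception("NonNumericException", "The side must be numeric."))
  }
  if (side < 0) {
    stop(exception("OutOfRangeException",
                   paste("The side of a square must be zero or greater:", side)))
  }
  sq$side <- side
  invisible(sq)
}

sq <- new.env()   # an environment, so updates are visible to the caller
sq$side <- 1

caught <- tryCatch({
  setSide(sq, -12)
  "none"
}, NonNumericException = function(e) {
  "non-numeric"
}, Exception = function(e) {
  class(e)[1]               # any other Exception subclass ends up here
}, finally = {
  sq$side <- 2 * sq$side    # always runs, whatever happened above
})
```

The handlers are tried in order, so the more specific class is listed first; here the OutOfRangeException skips the first handler, is caught by the general one, and the side is doubled from 1 to 2 by the finally clause.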

21 Future
Make the API (even) more similar to the S4 API
    – Makes transitions between R.oo and S4 easier.
    – Less confusing for beginners.
Make an S4 version of the package
    – When the problem "generic functions are too restricted on matching argument" is solved.
Make it easier to declare private fields or constants.
Implement the mechanisms for field access in native code.
Publish R.oo on CRAN
    – Requires a stable API. After 2+ years it is indeed very stable, but any major changes after v1.0 will be annoying for the user.

22 Acknowledgements
The R development team
People on the r-help mailing list
All users that have given feedback to the project
See http://www.maths.lth.se/help/R/ for RCC, more documentation, help, examples, and installation of
    – the R.classes bundle: R.audio, R.base, R.graphics, R.io, R.lang, R.matlab, R.oo, R.tcltk, R.ui
    – the cDNA microarray package: com.braju.sma

