Portability and Safety Mahdi Milani Fard Dec, 2006 Java
Outline ● Language Overview History and design goals ● Design Goals Portability, Reliability, Safety, Simplicity, Efficiency ● Java Programming Language Objects in Java, Type system, Generics ● Java's Runtime Environment Virtual machine, Loader, Bytecodes, Security issues
History ● Primarily designed by James Gosling and others at Sun, 1990 – 95 ● Was called “Oak” at first ● Language for “set-top box” ● Was to be used for embedded computers in home appliances The market was not ready for this at the time Java was a side product
Design Goals ● Portability Internet-wide distribution: PC, Unix, Mac ● Reliability Avoid program crashes and error messages ● Safety Programmer may be malicious ● Simplicity and familiarity Appeal to average programmer; less complex than C++ ● Efficiency Important but secondary
Java's Success ● What made Java so successful? Portability? ● There are lots of other languages that are as portable as Java. Even C++ has portable standards now. Reliability? ● Reliable exception handling mechanisms are also implemented in C++ and many other languages. Safety? ● Again, there are many tools and mechanisms in other languages to support this, such as digital signatures for packages and components, and access rights. Simplicity? ● There are many languages easier and simpler than Java, such as Visual Basic.
Java's Success ● The true answer: None ● Tremendous marketing effort by Sun Microsystems ● Java is now on your cellphones, in your browser and on the servers you use every day ● Will it keep winning the market?
Simplicity ● Almost everything is an object ● All objects on heap, accessed through references ● Omitted troublesome features No functions, no multiple inheritance, no goto, no operator overloading, few automatic coercions
Portability ● Bytecode interpreter on many platforms Windows, Linux, Mac,... ● Maybe the most important portability feature is Java's portable GUI library There are many portable GUI libraries for C++, so what's the deal? ● Only a few C++ geeks can cross-compile such code for different platforms ● These libraries are not standard; they need to be installed ● Many “cool” platform-dependent features are not implemented for portability reasons
Reliability and Safety ● Typed source ● Typed bytecode language ● Run-time type and bounds checks ● Garbage collection ● Security mechanisms Sandboxing Digital signatures
Java Programming Language ● Syntax similar to C++ ● Simple and easy to learn and understand ● Fewer confusing syntax and programming features Better syntactic consistency Many samples of “responsible design” principle ● Enhanced mechanisms for exception handling ● Embedded mechanisms for documentation Javadoc
Java Classes and Objects ● Syntax similar to C++ ● Object has fields and methods is allocated on heap, not run-time stack accessible through reference (only ptr assignment) garbage collected (no explicit delete or free) ● There are also primitive types for efficiency int, double, boolean ● No header files A class will be defined only in a single file
A Java Sample Class
class Point {
    private int x;

    protected void setX(int y) { x = y; }

    public int getX() { return x; }

    Point(int xval) { // constructor
        x = xval;
    }
}
Object and Class Load and Destruction ● Java guarantees constructor call for each object Memory allocated Constructor called to initialize memory Issues related to inheritance ● We’ll discuss later… ● Static fields of class initialized at class load time Class loader is responsible for loading classes ● We’ll discuss later… ● Objects are garbage collected No explicit free, Avoids resulting type errors
Encapsulation and Packages ● Every field or method belongs to a class no global functions or variables ● Every class is part of some package Package can be unnamed (default) File declares which package code belongs to [Figure: PackageA contains ClassX and PackageB contains ClassY, each with fields and methods]
Visibility and Access ● Four visibility distinctions (both methods and fields) public ● accessible by any class in any package package (no modifier) ● accessible by any class in the same package protected ● accessible by classes in the same package and by subclasses in any package private ● accessible by methods of the same class
Java Types ● Two general kinds of types Primitive types – not objects ● Integers, Booleans, etc Reference types ● Classes, interfaces, arrays ● Static type checking Every expression has type, determined from its parts ● Dynamic type checking Downcast checked at run-time, may raise exception Method invocation (dynamic binding, polymorphism)
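The two kinds of checking on this slide can be sketched as follows. This is a minimal illustration, not part of the original slides: the classes `Animal`, `Dog`, `Cat` and the helper `TypeCheckDemo` are hypothetical names. The downcast is accepted by the static checker but verified at run time, and the method call is bound dynamically.

```java
// Hypothetical classes used only for illustration.
class Animal {
    String sound() { return "..."; }
}

class Dog extends Animal {
    @Override
    String sound() { return "Woof"; }
}

class Cat extends Animal {
    @Override
    String sound() { return "Meow"; }
}

class TypeCheckDemo {
    // Dynamic binding: the method that runs depends on the run-time class of a.
    static String describe(Animal a) {
        return a.sound();
    }

    // The downcast type-checks statically, but the JVM checks it again at run time.
    static boolean downcastToDogSucceeds(Animal a) {
        try {
            Dog d = (Dog) a;
            return d != null;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new Dog()));
        System.out.println(downcastToDogSucceeds(new Cat()));
    }
}
```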
Subtyping and Inheritance ● Interface the external view of an object (code) ● Implementation the internal representation of an object (code) ● Subtyping relation between interfaces only inherits the signatures ● Inheritance relation between implementations the code is inherited
Java's Interfaces ● Using the “implements” keyword for “interface” types (Only Subtyping) ● Similar to C++ pure abstract classes ● Flexibility Multiple subtyping can be done for a single class Allows subtype graph instead of tree Avoids problems with multiple inheritance of implementations (remember C++ “diamond”) ● Cost Offset in method lookup table not known at compile time Different bytecodes for method lookup (we'll see)
Interfacing Sample
interface Shape {
    public float area();
}

interface Drawable {
    public void draw();
}

class Circle implements Shape, Drawable {
    private float r;
    private Point c;

    public float area() { return (float) (Math.PI * r * r); }
    public void draw() { /* implementation */ }
}
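To show what multiple subtyping buys, here is a small runnable sketch in which one object is used through two unrelated interface types. It is a variation on the slide's sample under stated assumptions: `draw()` returns a `String` here (the slide's version returns `void`) so the result can be inspected, and the constructor is added for illustration.

```java
interface Shape {
    float area();
}

interface Drawable {
    String draw();   // assumption: returns a description instead of void
}

class Circle implements Shape, Drawable {
    private final float r;

    Circle(float r) {
        this.r = r;
    }

    @Override
    public float area() {
        return (float) (Math.PI * r * r);
    }

    @Override
    public String draw() {
        return "circle of radius " + r;
    }
}

class InterfaceDemo {
    public static void main(String[] args) {
        Circle c = new Circle(1.0f);
        Shape s = c;        // Circle <: Shape
        Drawable d = c;     // Circle <: Drawable
        System.out.println(s.area());
        System.out.println(d.draw());
    }
}
```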
Java's Inheritance ● Similar to Smalltalk, C++ Subclass inherits from superclass Using the “extends” keyword: ● class ColorPoint extends Point {... }; ● Single inheritance only ● Every class extends another class Superclass is Object if no other class named ● Methods of class Object: getClass toString equals hashCode clone ...
Java's Inheritance (cont) ● Java guarantees constructor call for each object This must be preserved by inheritance Subclass constructor must call super constructor ● Explicit (First line of the subclass constructor) ● Implicit (Only when there is a default constructor for the superclass) ● Restrict inheritance Final classes and methods cannot be redefined ● Important for security
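The constructor rule can be sketched as below. The classes `Person` and `Student` are hypothetical and not from the slides. Because `Person` has no no-argument constructor, the call to the superclass constructor cannot be implicit and must be written out; the `final` method illustrates restricted redefinition.

```java
// Hypothetical classes for illustration.
class Person {
    private final String name;

    // No no-argument constructor exists, so subclasses must call this one explicitly.
    Person(String name) {
        this.name = name;
    }

    // final: subclasses cannot redefine this method.
    public final String getName() {
        return name;
    }
}

class Student extends Person {
    private final int year;

    Student(String name, int year) {
        super(name);      // explicit superclass constructor call, first statement
        this.year = year;
    }

    public int getYear() {
        return year;
    }
}
```

Removing the `super(name)` line makes the compiler reject `Student`, since there is no `Person()` constructor to call implicitly.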
Array Types ● Automatically defined Array type T[ ], T[ ][ ],... exists for each class or interface type T Cannot extend array types (array types are final) ● Treated as reference type An array variable is a pointer to an array, can be null Example: Circle[ ] x = new Circle[array_size] Anonymous array expression: new int[ ] {1,2,3} ● Every array type is a subtype of Object; every array of a reference type is also a subtype of Object[ ] ● Length of array is not part of its static type
Array Subtyping ● Covariance if S <: T then S[ ] <: T[ ] ● Standard type error
class A { … }
class B extends A { … }

B[] bArray = new B[10];
A[] aArray = bArray;   // OK since B[] <: A[]
aArray[0] = new A();   // compiles, but raises ArrayStoreException at run time
Java Generics
● Without generics (cast needed on pop):
class Stack {
    void push(Object o) { ... }
    Object pop() { ... }
    ...
}

String s = "Hello";
Stack st = new Stack();
...
st.push(s);
...
s = (String) st.pop();

● With generics (no cast):
class Stack<A> {
    void push(A a) { ... }
    A pop() { ... }
    ...
}

String s = "Hello";
Stack<String> st = new Stack<String>();
st.push(s);
...
s = st.pop();
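A runnable version of the covariance problem, as a sketch: the wrapper class `ArrayCovarianceDemo` and its method name are added for illustration, and the empty classes `A` and `B` stand in for the elided ones on the slide. The store type-checks statically but is rejected by the run-time check.

```java
class A { }

class B extends A { }

class ArrayCovarianceDemo {
    // Returns true if the JVM rejects storing an A into an array that is really a B[].
    static boolean storeFails() {
        B[] bArray = new B[10];
        A[] aArray = bArray;      // allowed, since B[] <: A[]
        try {
            aArray[0] = new A();  // compiles, but the run-time store check fails
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeFails());
    }
}
```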
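A complete generic stack might look like the sketch below. The name `GenericStack`, the `ArrayList` backing store, and the `isEmpty` helper are assumptions made for illustration; the slide elides the implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical implementation; the slide only shows the signatures.
class GenericStack<A> {
    private final List<A> items = new ArrayList<>();

    void push(A a) {
        items.add(a);
    }

    A pop() {
        if (items.isEmpty()) {
            throw new IllegalStateException("stack is empty");
        }
        // Remove and return the most recently pushed element.
        return items.remove(items.size() - 1);
    }

    boolean isEmpty() {
        return items.isEmpty();
    }
}

class GenericStackDemo {
    public static void main(String[] args) {
        GenericStack<String> st = new GenericStack<>();
        st.push("Hello");
        String s = st.pop();   // no cast needed: the element type is String
        System.out.println(s);
    }
}
```

With the type parameter in place, `st.push(42)` on a `GenericStack<String>` is rejected at compile time rather than failing at a cast later.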
Java's Runtime Environment ● Virtual machine ● Loader ● Bytecodes ● Security issues
Java Virtual Machine ● Three different issues when you use the term “Java Virtual Machine”: The abstract specification ● Sun's specification Concrete implementation ● Sun's implementation ● Open source implementation Runtime instance ● The process named Java
JVM Architecture
Class Loader ● Loading: finding and importing the binary data for a type ● Linking: Verification Preparation Resolution ● Initialization: invoking Java code that initializes class variables to their proper starting values.
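The initialization step can be observed from Java code. In this sketch (all class names are hypothetical), the static initializer of `Lazy` runs when the class is first actively used, not when the program starts.

```java
// Hypothetical classes for illustration.
class InitLog {
    static final StringBuilder events = new StringBuilder();
}

class Lazy {
    static int value = 42;   // class variable set during initialization

    static {
        // Runs once, as part of class initialization.
        InitLog.events.append("Lazy initialized;");
    }
}

class ClassInitDemo {
    static String run() {
        InitLog.events.append("before use;");
        int v = Lazy.value;   // first active use of Lazy triggers its initialization
        InitLog.events.append("after use;");
        return InitLog.events.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call, the log shows the initializer running between the "before use" and "after use" entries.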
Runtime Areas
Heap
Method Area ● The fully qualified name of the type ● The fully qualified name of the type’s direct superclass ● Whether or not the type is a class or an interface ● The type’s modifiers (public,…) ● An ordered list of the fully qualified names of any direct superinterfaces ● The constant pool for the type ● Field information ● Method information ● All class (static) variables declared in the type ● A reference to class ClassLoader ● A reference to class Class
Activation Record ● The activation record has three parts: Local variables Operand stack Frame data ● The sizes of the local variables and operand stack are determined at compile time
Java Bytecode
● Java
class A extends Object {
    int i;
    void f(int val) { i = val + 1; }
}
● Bytecode
Method void f(int)
    aload 0      ; object ref this
    iload 1      ; int val
    iconst 1
    iadd         ; add val + 1
    putfield #4
    return
[Figure: JVM activation record, with local variables (this, val), operand stack, and data area (return address, exception info, constant pool resolution)]
Field and Method Access ● Instruction includes index into constant pool Constant pool stores symbolic names ● First execution Use symbolic name to find field or method ● Second execution Use modified “quick” instruction to simplify search putfield_quick 6
invokevirtual ● Search for method find the method entry in the constant pool pop arguments find method with the given name and signature using the reference
● Java
Object x;
…
x.equals("test");
● Bytecode
aload_1
ldc "test"
invokevirtual java/lang/Object/equals
Method Lookup
[Figure: a Point object (fields mptr, x) and a ColorPoint object (fields mptr, x, c), each pointing to its class's method table; the method tables point to the code for move and the code for darken. Sample field values shown: 3, 5, blue.]
Point p = new ColorPoint(3, 2, "RED");
p.move(2, 3); // (p.mptr[0])(p, 2, 3)
Bytecode Rewriting: invokevirtual ● After search, rewrite bytecode to use fixed offset into the vtable ● No search on second execution [Figure: bytecode “invokevirtual” with constant pool entry “A.foo()” is rewritten to “inv_virt_quick” with a vtable offset]
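Seen from the source level, method-table lookup is ordinary dynamic dispatch. The sketch below simplifies the slide's classes under stated assumptions: points are one-dimensional, and the `describe` method is added so the dispatch is visible. Only the names `Point`, `ColorPoint`, `move` and `darken` come from the slide.

```java
class Point {
    protected int x;

    Point(int x) {
        this.x = x;
    }

    void move(int dx) {
        x += dx;
    }

    String describe() {
        return "Point(" + x + ")";
    }
}

class ColorPoint extends Point {
    private String c;

    ColorPoint(int x, String c) {
        super(x);
        this.c = c;
    }

    // Extra method: appears only in ColorPoint's method table.
    void darken() {
        c = "dark " + c;
    }

    // Overrides the inherited entry in the method table.
    @Override
    String describe() {
        return "ColorPoint(" + x + ", " + c + ")";
    }
}

class MethodLookupDemo {
    static String run() {
        Point p = new ColorPoint(3, "blue");  // static type Point, run-time class ColorPoint
        p.move(2);                            // inherited code, found through the object's table
        return p.describe();                  // the overriding method is selected at run time
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```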
invokeinterface ● Interfaces ● Problem with multiple interfaces: the offset of a method in the lookup table is not known at compile time ● Solutions: Multiple method tables Search
Bytecode Rewriting: invokeinterface ● Cache address of method; check class on second use [Figure: bytecode “invokeinterface” with constant pool entry “A.foo()” is rewritten to “inv_int_quick” with the cached “A.foo()” and a vtable offset]
C++ Approach ● C extends A, B ● More memory usage ● Less dereferencing [Figure: layout of a C object under multiple inheritance, with an A subobject (vptr, A data), a B subobject (vptr, B data) and C data; pointers pa, pc and pb into the object; the C-as-A vtbl and C-as-B vtbl hold entries for C::f and B::g]
Java Security ● Security Prevent unauthorized use of computational resources ● Java security Code can read input from a careless user or malicious attacker (buffer overflow,...) Code may be written by a careless friend or malicious attacker (Trojans, viruses,...) ● Java is designed to reduce many security risks
Java Security Mechanisms ● Sandboxing Run program in restricted environment ● Analogy: child’s sandbox with only safe toys This term refers to ● Features of loader, verifier, interpreter that restrict program ● Java Security Manager, a special object that acts as access control “gatekeeper” ● Code signing Use cryptography to establish origin of class file ● This info can be used by security manager
Java Sandbox ● Four complementary mechanisms Class loader ● Separate namespaces for separate class loaders ● Associates protection domain with each class Verifier ● Checks code consistency JVM run-time checks ● No unchecked casts or other type errors, No array overflows,... ● Preserves private, protected visibility levels Security Manager ● Limits the access
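The JVM run-time checks can be seen directly in a small example. In this sketch (the class and method names are hypothetical), an out-of-range index raises an exception instead of reading adjacent memory, which is what rules out classic buffer overflows.

```java
class BoundsCheckDemo {
    // Every array access is bounds-checked by the JVM.
    static String readAt(int[] data, int index) {
        try {
            return "value " + data[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            return "rejected index " + index;
        }
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(readAt(data, 1));
        System.out.println(readAt(data, 3));
    }
}
```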
Why is typing a security feature? ● Java library functions call security manager ● Security manager object answers at run time Decide if calling code is allowed to do operation Examine protection domain of calling class ● Signer: organization that signed code before loading ● Location: URL where the Java classes came from Uses the system policy to decide access permission
Exception Handling ● Similar to C++ with a few new mechanisms ● Advantages of Exceptions Separating Error-Handling Code from "Regular" Code Propagating Errors Up the Call Stack Grouping and Differentiating Error Types ● Using subtyping and inheritance
Exception Handling (cont) ● Differences with C++ Only “Throwable” objects can be thrown Every checked exception must either be caught or declared to be thrown by the method Methods explicitly state the types of exceptions they throw Many error-handling checks can be done at compile time ● A compile error is reported if a checked exception is neither caught nor declared Unlike C++, exceptions are heavily used in Java's standard library
Exception Handling (cont)
method1 {
    try {
        call method2;
    } catch (exception e) {
        doErrorProcessing;
    }
}

method2 throws exception {
    call method3;
}

method3 throws exception {
    call readFile;
}
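The pseudocode above can be turned into real Java roughly as follows. This is a sketch under assumptions: `IOException` stands in for the unspecified exception type, `readFile` simulates a failure rather than touching the file system, and `method1` returns a string so the outcome can be checked.

```java
import java.io.IOException;

class PropagationDemo {
    // Only method1 handles the error; the others just declare it.
    static String method1() {
        try {
            method2();
            return "ok";
        } catch (IOException e) {
            return "handled: " + e.getMessage();
        }
    }

    static void method2() throws IOException {
        method3();
    }

    static void method3() throws IOException {
        readFile();
    }

    // Simulated failure, for illustration only.
    static void readFile() throws IOException {
        throw new IOException("file not found");
    }

    public static void main(String[] args) {
        System.out.println(method1());
    }
}
```

Dropping the `throws IOException` clause from `method2` or `method3` is a compile error, which is the compile-time check the previous slide refers to.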