Object Naming Analysis for Reverse-Engineered Sequence Diagrams
Atanas (Nasko) Rountev, Beth Harkness Connell
Ohio State University

Nasko Rountev - ICSE'05

Example of a UML Sequence Diagram
[Figure: sequence diagram with objects start:X, p:A, and n:A, and messages m1(), m2(), m3(), create(), and m4()]

UML Sequence Diagrams
- Popular UML artifacts for modeling of object interactions
- Design-time sequence diagrams
- Reverse-engineered sequence diagrams
  - Based on existing code
  - Iterative development; design recovery for software maintenance; software testing
  - Implemented in some commercial UML tools
    - Together ControlCenter (Borland)
    - EclipseUML (Omondo)

Reverse-Engineering Analyses
- Dynamic analysis: tracks a set of representative run-time executions
  - Several research tools
- Static analysis: examines only the code
  - Commercial tools (deficiencies)
  - Some research work (not comprehensive)
- RED tool for Java: PRESTO group at OSU
  - URL: presto.cse.ohio-state.edu/red
  - Call chain analysis; control-flow analysis; object naming analysis; visualization and navigation; test coverage measurements

Object Naming

    class X {
      void m1(A p) {
        A a = p.m2(this);
        a.m4();
      }
      void m3() { … }
    }
    class A {
      A m2(X q) {
        q.m3();
        return new A();
      }
      void m4() { … }
    }

[Figure: sequence diagram with objects start:X, p:A, q:X, a:A, and n:A, and messages m1, m2, m3, m4, and create]

Object Naming Schemes
- Based on variable names
  - Same run-time object could be represented by several diagram objects
  - Different run-time objects could be represented by the same diagram object
  - Handling of instance fields
- Based on points-to analysis
  - Does not work either
- Based on a new object naming analysis
  - “Inspired” by constant propagation analysis
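The two problems listed for naming by variable names can be observed at run time. The following is a minimal sketch, not the authors' tool: classes X and A mirror the slide example, while the `lastCaller` field, the changed return type of `m1`, and the `NamingDemo` class are hypothetical additions made only so the object identities can be checked.

```java
// Sketch only: X and A follow the slide example; lastCaller, the return
// value of m1, and NamingDemo are hypothetical additions for observation.
class A {
    X lastCaller;                    // remembers the object that m2 knows as "q"

    A m2(X q) {
        lastCaller = q;              // inside m2 the object is named "q"
        q.m3();
        return new A();
    }

    void m4() { }
}

class X {
    void m3() { }

    // Returns the object bound to the local variable "a" (void on the slide).
    A m1(A p) {
        A a = p.m2(this);            // inside m1 the same object is named "this"
        a.m4();
        return a;
    }
}

class NamingDemo {
    // One run-time object, two names: "this" in m1 and "q" in m2.
    static boolean sameObjectBehindTwoNames() {
        X start = new X();
        A p = new A();
        start.m1(p);
        return p.lastCaller == start;
    }

    // One name, two run-time objects: "a" is bound to a fresh A on each call of m1.
    static boolean oneNameTwoObjects() {
        X start = new X();
        A p = new A();
        A first = start.m1(p);
        A second = start.m1(p);
        return first != second;
    }

    public static void main(String[] args) {
        System.out.println(sameObjectBehindTwoNames());
        System.out.println(oneNameTwoObjects());
    }
}
```

Both checks return true, which is why a diagram that draws one lifeline per variable name would show q:X and start:X as separate objects even though they are the same object.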

Flow of Seed Values

    class X {
      void m1(A p) {
        A a = p.m2(this);
        a.m4();
      }
      void m3() { … }
    }
    class A {
      A m2(X q) {
        q.m3();
        return new A();
      }
      void m4() { … }
    }

[Figure: sequence diagram with objects start:X, p:A, and n:A, and messages m1, m2, m3, m4, and create]

Singleton Call Sites
- Singleton call site
  - Only one possible run-time receiver object
  - Receiver comes from a specific seed value:
    - Formal of the start method (incl. this)
    - A new X() expression that is provably executed at most once
- How about non-singleton call sites?

    A a = p.m2(this);
    if (…) a = new A();
    a.m4();
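To see why the last fragment is not a singleton call site, it helps to run it with both outcomes of the condition. The sketch below is an illustration under assumptions: the `site` field, the labels "n1" and "n2" for the two `new A()` expressions, and the boolean parameter standing in for the elided condition are all hypothetical and are not part of the original example.

```java
// Sketch only: the "site" field, the labels "n1"/"n2"/"p", and the boolean
// parameter are hypothetical additions used to observe the receiver of a.m4().
class A {
    final String site;               // label of the expression that created this object

    A(String site) { this.site = site; }

    A m2(X q) {
        q.m3();
        return new A("n2");          // creation site inside m2
    }

    String m4() { return site; }     // reports which object received the call
}

class X {
    void m3() { }

    // takeBranch stands in for the elided condition "if (…)".
    String m1(A p, boolean takeBranch) {
        A a = p.m2(this);            // singleton: the receiver is always the formal p
        if (takeBranch) a = new A("n1");
        return a.m4();               // non-singleton: the receiver is n1 or n2
    }
}

class SingletonDemo {
    public static void main(String[] args) {
        X start = new X();
        A p = new A("p");
        System.out.println(start.m1(p, false)); // prints "n2"
        System.out.println(start.m1(p, true));  // prints "n1"
    }
}
```

The call `p.m2(this)` has the same receiver on every path, while the receiver of `a.m4()` depends on the branch, so no single diagram object can stand for it precisely.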

Naming Analysis for Singleton Call Sites
- Goal: static analysis that identifies singleton call sites and their seed values
- Version 1 of the analysis
  - Interprocedural dataflow analysis, similar to interprocedural copy constant propagation
  - CFGs, dataflow lattice, dataflow functions
  - Three-phase algorithm; flow- and context-sensitive; an IDE analysis; MVP-precise
- Version 2 of the analysis
  - Various enhancements

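[Figure: control-flow graphs for m1 and m2, annotated with dataflow values]
- Nodes of m1: start_m1; p.m2(this); a=m2_ret_val; if(…) a=new A() (site 1); a.m4(); m4_ret; end_m1
- Nodes of m2: start_m2; q.m3(); m3_ret; m2_ret_val=new A() (site 2); end_m2
- Lattice elements: L_this, L_p, L_n1, L_n2, and one further element [symbol missing]
- Annotations: m2rv: L_n2; a: L_n1; a: L_n2; this: L_this; q: L_this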
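The merge of dataflow values at a point such as `a.m4()` can be sketched with a flat lattice in the style of copy constant propagation. This is a minimal sketch under assumptions, not the authors' implementation: the class `SeedValue`, the element names `TOP` and `BOTTOM`, and the `meet` method are hypothetical, and the slides do not confirm which symbols the analysis actually uses for the non-seed elements.

```java
// Sketch only: SeedValue, TOP, BOTTOM, and meet are hypothetical names for a
// flat lattice in the style of copy constant propagation.
final class SeedValue {
    static final SeedValue TOP = new SeedValue("TOP");        // no information yet
    static final SeedValue BOTTOM = new SeedValue("BOTTOM");  // more than one possible seed

    final String label;

    private SeedValue(String label) { this.label = label; }

    // Creates a lattice element such as L_this, L_p, L_n1, or L_n2.
    static SeedValue seed(String name) { return new SeedValue("L_" + name); }

    // Combines the values that reach a CFG node along two different paths.
    SeedValue meet(SeedValue other) {
        if (this == TOP) return other;
        if (other == TOP) return this;
        if (this == BOTTOM || other == BOTTOM) return BOTTOM;
        return this.label.equals(other.label) ? this : BOTTOM;
    }

    // A call site is singleton when its receiver maps to exactly one seed.
    boolean isSingleton() { return this != TOP && this != BOTTOM; }

    @Override
    public String toString() { return label; }
}

class LatticeDemo {
    public static void main(String[] args) {
        SeedValue fromReturn = SeedValue.seed("n2");   // a = m2_ret_val
        SeedValue fromBranch = SeedValue.seed("n1");   // if (…) a = new A()
        SeedValue atCall = fromReturn.meet(fromBranch);
        System.out.println(atCall + " singleton=" + atCall.isSingleton());
    }
}
```

Under these assumptions the two values reaching `a.m4()` differ, so their meet is the conservative element and the call site is reported as non-singleton. Without the `if` statement only L_n2 would reach the call.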

Handling of Fields
- Version 1: conservative treatment
  - a=b.f or a=C.sf results in value [symbol missing] for a
- Version 2: more precise
  - Static field C.sf that is not modified by any method reachable from the start method
    - Treated as a seed value
  - Instance field f: not modified by any method reachable from the start method
    - a=b.f, and the algorithm computes seed value x for b: introduce new seed value x.f
    - Iterative: e.g. could have x.f1.f2.f3.f4
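The Version 2 rule for instance fields can be sketched as a small function that extends a seed with a field name. This is a hedged illustration only: the class `FieldSeeds`, the method `seedForLoad`, the use of strings for seeds, and the use of `null` for the conservative value are all assumptions made for the example, not details taken from the analysis.

```java
import java.util.Collections;
import java.util.Set;

// Sketch only: FieldSeeds, seedForLoad, string-valued seeds, and null as the
// conservative value are hypothetical choices for illustration.
final class FieldSeeds {
    // Computes the seed for "a" in "a = b.f".
    // baseSeed:       seed computed for b, or null if b has no single seed
    // modifiedFields: fields written by some method reachable from the start method
    static String seedForLoad(String baseSeed, String field, Set<String> modifiedFields) {
        if (baseSeed == null) return null;                 // b itself is not a singleton
        if (modifiedFields.contains(field)) return null;   // f may change: stay conservative
        return baseSeed + "." + field;                     // introduce the new seed x.f
    }

    public static void main(String[] args) {
        Set<String> modified = Collections.singleton("g");
        // Iterative use: each load extends the seed computed for its base.
        String s1 = seedForLoad("x", "f1", modified);
        String s2 = seedForLoad(s1, "f2", modified);
        System.out.println(s2);                            // prints "x.f1.f2"
        System.out.println(seedForLoad("x", "g", modified)); // prints "null"
    }
}
```

Chaining the function is what yields longer seeds such as x.f1.f2.f3.f4; a single modified field anywhere in the chain makes the result conservative.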

Experiments
- 21 subject components from Java libraries and applications
- For each component, the analysis was executed multiple times, once for each (non-trivial) potential start method
- Implementation
  - Uses Soot (Sable group, McGill)
  - Uses several optimization techniques
- Experiments: Sun Fire 280-R, 900 MHz

[Chart: Number of Start Methods]

[Chart: Analysis Running Time (sec)]

[Chart: % Singleton Call Sites]

Singleton Call Sites
- Considered call chain depth up to 5
- For 18 of the 21 components: > 75% of the call sites were singleton sites
- For 7 of the 21 components: > 90%
- Examined component bigdecimal (55%)
  - 143 call sites that could not be resolved
  - 125 of the 143 were legitimately non-singleton: multiple possible run-time objects
- Conclusion: typically, the majority of call sites can be represented precisely in the reverse-engineered sequence diagrams

Conclusions and Future Work
- Low-cost, high-precision analysis
  - Relatively simple to implement
  - There is some room for improvement
- For non-singleton call sites: need careful investigation of trade-offs between different approaches
  - Some preliminary work under way
- Re-implement in Eclipse and make public, together with the other analyses in RED

Questions?
