Object-Oriented Reengineering Patterns Serge Demeyer Stéphane Ducasse Oscar Nierstrasz www.iam.unibe.ch/~scg/OORP.

Slides:



Advertisements
Similar presentations
SOFTWARE MAINTENANCE 24 March 2013 William W. McMillan.
Advertisements

Describing Process Specifications and Structured Decisions Systems Analysis and Design, 7e Kendall & Kendall 9 © 2008 Pearson Prentice Hall.
1 Object-Oriented Reengineering © S. Demeyer, S.Ducasse, O. Nierstrasz Lecture 2 Radu Marinescu Reverse Engineering.
© S. Demeyer, S. Ducasse, O. Nierstrasz Reverse Engineering.1 4. Reverse Engineering JEdit Experience What and Why Setting Direction  Most Valuable First.
OORPT Object-Oriented Reengineering Patterns and Techniques 7. Problem Detection Prof. O. Nierstrasz.
Stéphane Ducasse2.1 OOP? What is OOP? Why? OOP in a nutshell.
© S. Demeyer, S. Ducasse, O. Nierstrasz Reverse Engineering.1 2. Reverse Engineering What and Why Setting Direction  Most Valuable First First Contact.
OORPT Object-Oriented Reengineering Patterns and Techniques 4. Reverse Engineering Prof. O. Nierstrasz.
Introduction to Software Engineering 13. Software Evolution.
Introduction To System Analysis and Design
© S. Demeyer, S. Ducasse, O. Nierstrasz Duplication.1 7. Problem Detection Metrics  Software quality  Analyzing trends Duplicated Code  Detection techniques.
CS 290C: Formal Models for Web Software Lecture 10: Language Based Modeling and Analysis of Navigation Errors Instructor: Tevfik Bultan.
Object-Oriented Reengineering Oscar Nierstrasz Software Composition Group University of Bern.
© S. Demeyer, S. Ducasse, O. Nierstrasz Intro.1 1. Introduction Goals Why Reengineering ?  Lehman's Laws  Object-Oriented Legacy Typical Problems  common.
OORPT Object-Oriented Reengineering Patterns and Techniques 1. Introduction Prof. O. Nierstrasz.
Software Evolution Managing the processes of software system change
March 30, Exploratory Testing Testing Approaches Analytical Information-driven Intuitive Exploratory Design the tests and test concurrently Learn.
© S. Demeyer, S. Ducasse, O. Nierstrasz Reverse Engineering.1 3. Reverse Engineering What and Why Setting Direction  Most Valuable First First Contact.
OORPT Object-Oriented Reengineering Patterns and Techniques 8. Restructuring Prof. O. Nierstrasz.
Stéphane Ducasse6.1 Essential Concepts Why OO? What is OO? What are the benefits? What are the KEY concepts? Basis for all the lectures.
7. Duplicated Code Metrics Duplicated Code Software quality
© S. Demeyer, S. Ducasse, O. Nierstrasz Migration.1 5. Testing and Migration What and Why  Reengineering Life-Cycle Tests: Your Life Insurance !  Grow.
CS350/550 Software Engineering Lecture 1. Class Work The main part of the class is a practical software engineering project, in teams of 3-5 people There.
© Copyright Eliyahu Brutman Programming Techniques Course.
© S. Demeyer, S. Ducasse, O. Nierstrasz Visualization.1 3. Software Visualization Introduction SV in a Reengineering Context Static Code Visualization.
Introduction to Software Design Chapter 1. Chapter 1: Introduction to Software Design2 Chapter Objectives To become familiar with the software challenge.
1 An introduction to design patterns Based on material produced by John Vlissides and Douglas C. Schmidt.
OBJECT ORIENTED PROGRAMMING IN C++ LECTURE
Domain Modeling (with Objects). Motivation Programming classes teach – What an object is – How to create objects What is missing – Finding/determining.
Chapter 9 – Software Evolution and Maintenance
Introduction To System Analysis and design
Object Oriented Software Development
REFACTORING Lecture 4. Definition Refactoring is a process of changing the internal structure of the program, not affecting its external behavior and.
Software Reengineering & Evolution Serge Demeyer Stéphane Ducasse Oscar Nierstrasz
1 Shawlands Academy Higher Computing Software Development Unit.
Objects and Components. The adaptive organization The competitive environment of businesses continuously changing, and the pace of that change is increasing.
Sadegh Aliakbary Sharif University of Technology Fall 2011.
TOPIC R Software Maintenance, Evolution, Program Comprehension, and Reverse Engineering SEG4110 Advanced Software Design and Reengineering.
Teaching material for a course in Software Project Management & Software Engineering – part II.
05 - Patterns Intro.CSC4071 Design Patterns Designing good and reusable OO software is hard. –Mix of specific + general –Impossible to get it right the.
1 Object-Oriented Reengineering © S. Demeyer, S.Ducasse, O. Nierstrasz Lecture 1 Radu Marinescu Introduction to Object-Oriented Reengineering.
© S. Demeyer, S. Ducasse, O. Nierstrasz Intro.1 1. Introduction Goals Why Reengineering ?  Lehman's Laws  Object-Oriented Legacy Typical Problems  common.
An Introduction to Design Patterns. Introduction Promote reuse. Use the experiences of software developers. A shared library/lingo used by developers.
Object-Oriented Reengineering Patterns 5. Problem Detection.
1 The Software Development Process  Systems analysis  Systems design  Implementation  Testing  Documentation  Evaluation  Maintenance.
Introduction To System Analysis and Design
© S. Demeyer, S. Ducasse, O. Nierstrasz Chapter.1 10(a). Restructuring Most common situations Redistribute Responsibilities  Eliminate Navigation Code.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
Chapter 18 Object Database Management Systems. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Motivation for object.
Object-Oriented Reengineering Patterns 1. Introduction.
Manag ing Software Change CIS 376 Bruce R. Maxim UM-Dearborn.
1 What is OO Design? OO Design is a process of invention, where developers create the abstractions necessary to meet the system’s requirements OO Design.
S.Ducasse Stéphane Ducasse 1 Object-Oriented Programming.
Object-Oriented Modeling: Static Models. Object-Oriented Modeling Model the system as interacting objects Model the system as interacting objects Match.
The Software Development Process
1 CSCD 326 Data Structures I Software Design. 2 The Software Life Cycle 1. Specification 2. Design 3. Risk Analysis 4. Verification 5. Coding 6. Testing.
1 Object-Oriented Reengineering © S. Demeyer, S.Ducasse, O. Nierstrasz Lecture 3 Radu Marinescu Detailed Model Capture  Details matter  Pay attention.
Chapter 5: Software Re-Engineering Omar Meqdadi SE 3860 Lecture 5 Department of Computer Science and Software Engineering University of Wisconsin-Platteville.
Salman Marvasti Sharif University of Technology Winter 2015.
© 2006 Pearson Addison-Wesley. All rights reserved 2-1 Chapter 2 Principles of Programming & Software Engineering.
HNDIT23082 Lecture 06:Software Maintenance. Reasons for changes Errors in the existing system Changes in requirements Technological advances Legislation.
Testing OO software. State Based Testing State machine: implementation-independent specification (model) of the dynamic behaviour of the system State:
Generating Software Documentation in Use Case Maps from Filtered Execution Traces Edna Braun, Daniel Amyot, Timothy Lethbridge University of Ottawa, Canada.
March 1, 2004CS WPI1 CS 509 Design of Software Systems Lecture #6 Monday, March 1, 2004.
CMSC 345 Fall 2000 OO Design. Characteristics of OOD Objects are abstractions of real-world or system entities and manage themselves Objects are independent.
Chapter 18 Object Database Management Systems. Outline Motivation for object database management Object-oriented principles Architectures for object database.
1 The Software Development Process ► Systems analysis ► Systems design ► Implementation ► Testing ► Documentation ► Evaluation ► Maintenance.
Chapter 5 Introduction to Defining Classes Fundamentals of Java.
Software Metrics 1.
Presentation transcript:

Object-Oriented Reengineering Patterns Serge Demeyer Stéphane Ducasse Oscar Nierstrasz

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.2 Schedule 1. Introduction There are OO legacy systems too ! 2. Reverse Engineering How to understand your code 3. Visualization Scaleable approach 4. Restructuring How to Refactor Your Code 5. Code Duplication The most typical problems 6. Software Evolution Learn from the past 7. Conclusion

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.3 1. Introduction Goals Why Reengineering ?  Lehman's Laws  Object-Oriented Legacy Typical Problems  common symptoms  architectural problems & refactoring opportunities Reverse and Reengineering  Definitions  Techniques  Patterns

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.4 Goals We will try to convince you: Yes, Virginia, there are object-oriented legacy systems too! Reverse engineering and reengineering are essential activities in the lifecycle of any successful software system. (And especially OO ones!) There is a large set of lightweight tools and techniques to help you with reengineering. Despite these tools and techniques, people must do job and they represent the most valuable resource.

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.5 What is a Legacy System ? “legacy” A sum of money, or a specified article, given to another by will; anything handed down by an ancestor or predecessor.— Oxford English Dictionary  so, further evolution and development may be prohibitively expensive A legacy system is a piece of software that: you have inherited, and is valuable to you. Typical problems with legacy systems: original developers not available outdated development methods used extensive patches and modifications have been made missing or outdated documentation

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.6 Software Maintenance - Cost requirement design coding testing delivery x 1 x 5 x 10 x 20 x 200 Relative Maintenance Effort Between 50% and 75% of global effort is spent on “maintenance” ! Relative Cost of Fixing Mistakes Solution ? Better requirements engineering? Better software methods & tools (database schemas, CASE-tools, objects, components, …)?

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.7 Continuous Development 17.4% Corrective (fixing reported errors) 18.2% Adaptive (new platforms or OS) 60.3% Perfective (new functionality) The bulk of the maintenance cost is due to new functionality  even with better requirements, it is hard to predict new functions data from [Lien78a] 4.1% Other

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.8 (*) process-oriented structured methods, information engineering, data-oriented methods, prototyping, CASE-tools – not OO ! Contradiction ?No! modern methods make it easier to change... this capacity is used to enhance functionality! Modern Methods & Tools ? [Glas98a] quoting empirical study from Sasa Dekleva (1992) Modern methods (*) lead to more reliable software Modern methods lead to less frequent software repair and... Modern methods lead to more total maintenance time

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.9 Lehman's Laws A classic study by Lehman and Belady [Lehm85a] identified several “laws” of system change. Continuing change A program that is used in a real-world environment must change, or become progressively less useful in that environment. Increasing complexity As a program evolves, it becomes more complex, and extra resources are needed to preserve and simplify its structure. Those laws are still applicable…

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.10 What about Objects ? Object-oriented legacy systems = successful OO systems whose architecture and design no longer responds to changing requirements Compared to traditional legacy systems The symptoms and the source of the problems are the same The technical details and solutions may differ OO techniques promise better flexibility, reusability, maintainability …  they do not come for free

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.11 What about Components ? Components are very brittle … After a while one inevitably resorts to glue :)

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.12 How to deal with Legacy ? New or changing requirements will gradually degrade original design … unless extra development effort is spent to adapt the structure New Functionality Hack it in ? duplicated code complex conditionals abusive inheritance large classes/methods First … refactor restructure reengineer Take a loan on your software  pay back via reengineering Investment for the future  paid back during maintenance

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.13 Common Symptoms Lack of Knowledge obsolete or no documentation departure of the original developers or users disappearance of inside knowledge about the system limited understanding of entire system  missing tests Process symptoms too long to turn things over to production need for constant bug fixes maintenance dependencies difficulties separating products  simple changes take too long Code symptoms duplicated code code smells  big build times

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.14 Common Problems Architectural Problems insufficient documentation = non-existent or out-of-date improper layering = too few are too many layers lack of modularity = strong coupling duplicated code = copy, paste & edit code duplicated functionality = similar functionality by separate teams Refactoring opportunities misuse of inheritance = code reuse vs polymorphism missing inheritance = duplication, case- statements misplaced operations = operations outside classes violation of encapsulation = type-casting; C++ "friends" class abuse = classes as namespaces

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.15 The Reengineering Life-Cycle Requirements Designs Code (0) requirement analysis (1) model capture (2) problem detection (3) problem resolution (4) program transformation people centric lightweight

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.16 A Map of Reengineering Patterns Tests: Your Life Insurance Detailed Model Capture Initial Understanding First Contact Setting Direction Migration Strategies Detecting Duplicated Code Redistribute Responsibilities Transform Conditionals to Polymorphism

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.17 Summary Software “maintenance” is really continuous development Object-oriented software also suffers from legacy symptoms Reengineering goals differ; symptoms don’t Common, lightweight techniques can be applied to keep software healthy

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering Reverse Engineering What and Why First Contact  Interview during Demo Initial Understanding  Analyze the Persistent Data Detailed Model Capture  Look for the Contracts

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.19 What and Why ? Definition Reverse Engineering is the process of analysing a subject system  to identify the system’s components and their interrelationships and  create representations of the system in another form or at a higher level of abstraction.— Chikofsky & Cross, ’90 Motivation Understanding other people’s code (cfr. newcomers in the team, code reviewing, original developers left,...) Generating UML diagrams is NOT reverse engineering... but it is a valuable support tool

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.20 The Reengineering Life-Cycle (0) req. analysis (1) model capture issues scale speed accuracy politics Requirements Designs Code (0) requirement analysis (1) model capture (2) problem detection (3) problem resolution (4) program transformation

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.21 First Contact System experts Chat with the Maintainers Interview during Demo Talk with developers Talk with end users Talk about it Verify what you hear feasibility assessment (one week time) Software System Read All the Code in One Hour Do a Mock Installation Read itCompile it Skim the Documentation Read about it

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.22 First Project Plan Use standard templates, including: project scope  see "Setting Direction" opportunities  e.g., skilled maintainers, readable source-code, documentation risks  e.g., absent test-suites, missing libraries, …  record likelihood (unlikely, possible, likely) & impact (high, moderate, low) for causing problems go/no-go decision activities  fish-eye view

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.23 Solution: interview during demo -select several users -demo puts a user in a positive mindset -demo steers the interview Interview during Demo Solution: Ask the user!... however  Which user ?  Users complain  What should you ask ? Problem: What are the typical usage scenarios?

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.24 Initial Understanding understand  higher-level model Top down Speculate about Design Recover design Analyze the Persistent Data Study the Exceptional Entities Recover database Bottom up Identify problems

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.25 Analyze the Persistent Data Problem: Which objects represent valuable data? Solution: Analyze the database schema Prepare Model  tables  classes; columns  attributes  candidate keys (naming conventions + unique indices)  foreign keys (column types + naming conventions + view declarations + join clauses) Incorporate Inheritance  one to one; rolled down; rolled up Incorporate Associations  association classes (e.g. many-to-many associations)  qualified associations Verification  Data samples + SQL statements

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.26 Example: One To One Patient id: char(5) insuranceID: char(7) insurance: char(5) Salesman id: char(5) company: char(40) Person id: char(5) name: char(40) addresss: char(60) Patient id: char(5) insuranceID: char(7) insurance: char(5) Salesman id: char(5) company: char(40) Person id: char(5) name: char(40) addresss: char(60)

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.27 Example: Rolled Down Patient id: char(5) name: char(40) addresss: char(60) insuranceID: char(7) insurance: char(5) Salesman id: char(5) name: char(40) addresss: char(60) company: char(40) Patient id: char(5) insuranceID: char(7) insurance: char(5) Salesman id: char(5) company: char(40) Person id: char(5) name: char(40) addresss: char(60)

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.28 Example: Rolled Up Person id: char(5) name: char(40) addresss: char(60) insuranceID: char(7) «optional» insurance: char(5) «optional» company: char(40) «optional» Patient id: char(5) insuranceID: char(7) insurance: char(5) Salesman id: char(5) company: char(40) Person id: char(5) name: char(40) addresss: char(60)

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.29 Example: Qualified Association Patient id: char(5) … Treatment patientID: char(5) date: date nr: integer comment: varchar(255) Patient id: char(5) … Treatment comment: Text date: Date nr: Integer 1 1 addTreatment(d, n, t) lookupTreatment(d, n)

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.30 Initial Understanding (revisited) Top down Speculate about Design Analyze the Persistent Data Study the Exceptional Entities understand  higher-level model Bottom up ITERATION Recover design Recover database Identify problems

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.31 Detailed Model Capture Expose the design & make sure it stays exposed Tie Code and Questions Refactor to Understand Keep track of your understanding Expose design Step through the Execution Expose collaborations Use Your Tools Look for Key Methods Look for Constructor Calls Look for Template/Hook Methods Look for Super Calls Look for the Contracts Expose contracts Learn from the Past Expose evolution Write Tests to Understand

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.32 Look for the Contracts Problem: Which contracts does a class support? Solution: Look for common programming idioms Look for “key methods”  Intention-revealing names  Key parameter types  Recurring parameter types represent temporary associations Look for constructor calls Look for Template/Hook methods Look for super calls Use your tools!

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.33 Constructor Calls: Stored Result public class Employee { private String _name = ""; private String _address = ""; public File[ ] files = { }; … public class File { private String _description = ""; private String _fileID = ""; … public void createFile (int position, String description, String identification) { files [position] = new File (description, identification); } Employee _name _address File _description _fileID 1 *

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.34 Constructor Calls: "self" Argument public class Person { private String _name = ""; … public class Marriage { private Person _husband, _wife; public Marriage (Person husband, Person wife) { _husband = husband; _wife = wife;} … Person::public Marriage marryWife (Person wife) { return new Marriage (this, wife); } Person _name … Marriage _husband _wife Person _name … Marriage _husband _wife 11

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.35 Hook Methods public class PhoneDatabase {... protected Table fetchTable (String tableSpec) { //tableSpec is a filename; parse it as //a tab-separated table representation...}; public class ProjectDatabase extends PhoneDataBase {... protected Table fetchTable (String tableSpec) { //tableSpec is a name of an SQLTable; //return the result of SELECT * as a table...}; PhoneDatabase # fetchTable(tableSpec): Table ProjectDatabase Hook Method

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.36 Template / Hook Methods public class PhoneDatabase {... public void generateHTML (String tableSpec, HTMLRenderer aRenderer, Stream outStream) { Table table = this.fetchTable (tableSpec); aRenderer.render (table, outStream);} …}; public class HTMLRenderer {... public void render (Table table, Stream outStream) { //write the contents of table on the given outStream //using appropriate HTML tags …} PhoneDatabase generateHTML(String, HTMLRenderer, Stream) Template Method

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering Software Visualization Introduction  The Reengineering life-cycle Examples Lightweight Approaches  CodeCrawler Dynamic Analysis Conclusion

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.38 The Reengineering Life-cycle Requirements Designs Code (0) requirement analysis (1) model capture (2) problem detection (3) problem resolution (4) program transformation (2) problem detection issues Tool support Scalability Efficiency

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.39 Visualising Hierarchies Euclidean cones  Pros: More info than 2D  Cons: Lack of depth Navigation Hyperbolic trees  Pros: Good focus Dynamic  Cons: Copyright

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.40 Bottom Up Visualisation Filter All program entities and relations

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.41 A lightweight approach A combination of metrics and software visualization  Visualize software using colored rectangles for the entities and edges for the relationships  Render up to five metrics on one node: Size (1+2) Color (3) Position (4+5) Relationship Entity Y Coordinate Height Color tone Width X Coordinate

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.42 Nodes:Classes Edges:Inheritance Relationships Width: Number of attributes Height: Number of methods Color: Number of lines of code System Complexity View

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.43 LOC NOS Nodes:Methods Size: Method parameters Position X: Lines of code Position Y: Statements Method Assessment

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.44 Inheritance Classification View Boxes:Classes Edges:Inheritance Width: Number of Methods Added Height: Number of Methods Overridden Color: Number of Method Extended

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.45 Data Storage Class Detection View Boxes: Classes Width: Number of Methods Height: Lines of Code Color: Lines of Code

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.46 Industrial Validation Nokia (C MLOC >2300 classes) Nokia (C++/Java 120 kLOC >400 classes) MGeniX (Smalltalk 600 kLOC >2100classes) Bedag (COBOL 40 kLOC)... Personal experience 2-3 days to get something Used by developers + Consultants

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.47 Program Dynamics Simple Reproducible Scales well

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.48 Visualization of similarities in event traces Eliminate similarities Frequency Spectrum

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.49 Extract run-time coupling Apply datamining (“google”) Experiment with documented open-source cases (Ant, JMeter)  recall: %  precision: % Key Concept Identification Class IC_CC’ +web-mining Ant docs Project √√ UnknownElement √√ Task √√ Main √√ IntrospectionHelper √√ ProjectHelper √√ RuntimeConfigurable √√ Target √√ ElementHandler √√ TaskContainer ×√ Recall (%)90- Precision (%)60-

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering Restructuring Most common situations Redistribute Responsibilities  Move Behaviour Close to Data  Eliminate Navigation Code  Split up God Class  Empirical Validation Transform Conditionals to Polymorphism  Transform Self Type Checks  Transform Provider Type Checks

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.51 The Reengineering Life-Cycle Requirements Designs Code (0) requirement analysis (1) model capture (2) problem detection (3) problem resolution (4) program transformation (2) problem detection (3) problem resolution (4) program transformation issues OO paradigm When/When not

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.52 Redistribute Responsibilities Eliminate Navigation Code Data containers Monster client of data containers Split Up God Class Move Behaviour Close to Data Chains of data containers

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.53 Move Behavior Close to Data ( example 1/2 ) Employee +telephoneNrs +name(): String +address(): String Payroll +printEmployeeLabel() System.out.println(currentEmployee.name() ); System.out.println(currentEmployee.address() ); for (int i=0; i < currentEmployee.telephoneNumbers.length; i++) { System.out.print(currentEmployee.telephoneNumbers[i]); System.out.print(" "); } System.out.println(""); TelephoneGuide +printEmployeeTelephones() * * … for … System.out.print(" -- "); …

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.54 Move Behavior Close to Data ( example 2/2 ) Employee - telephoneNrs - name(): String - address(): String +printLabel(String) Payroll +printEmployeeLabel() public void printLabel (String separator) { System.out.println(_name); System.out.println(_address); for (int i=0; i < telephoneNumbers.length; i++) { System.out.print(telephoneNumbers[i]); System.out.print(separator); } System.out.println("");} TelephoneGuide +printEmployeeTelephones() * * … emp.printLabel(" -- "); … emp.printLabel(" "); …

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.55 Eliminate Navigation Code Problem:  How do you reduce the coupling due to classes that navigate an object graph? Solution:  Iteratively move behavior close the data Also Known As: Law of Demeter

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.56 The Law of Demeter Indirect Provider doSomething() Intermediate Provider +provider getProvider() Indirect Client intermediate.provider.doSomething() or intermediate.getProvider().doSomething() providerintermediate Law of Demeter: A method "M" of an object "O" should invoke only the methods of the following kinds of objects. 1. itself3. any object it creates /instantiates 2. its parameters4. its direct component objects

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.57 Car -engine +increaseSpeed() Eliminate Navigation Code … engine.carburetor.fuelValveOpen = true Engine +carburetor Car -engine +increaseSpeed() Carburetor +fuelValveOpen Engine -carburetor +speedUp() Car -engine +increaseSpeed() … engine.speedUp() carburetor.fuelValveOpen = true Carburetor -fuelValveOpen +openFuelValve() Engine -carburetor +speedUp() carburetor.openFuelValve()fuelValveOpen = true Carburetor +fuelValveOpen

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.58 Split Up God Class Problem: Break a class which monopolizes control? Solution: Incrementally eliminate navigation code Detection:  measuring size  class names containing Manager, System, Root, Controller  the class that all maintainers are avoiding How:  move behaviour close to data + eliminate navigation code  remove or deprecate façade However:  If God Class is stable, then don't split  shield client classes from the god class

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.59 Split Up God Class: 5 variants Controller A Filter1 Filter2 B Controller Filter1 Filter2 MailHeader C Controller Filter1 Filter2 MailHeader FilterAction D Controller Filter1 Filter2 MailHeader FilterAction NameValuePair E Mail client filters incoming mail Extract behavioral class Extract data class Extract behavioral class Extract data class

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.60 Empirical Validation Controlled experiment with 63 last- year master-level students (CS and ICT) Independent Variables Dependent Variables Experimental task Institution God class decomposition Time Accuracy

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.61 Interpretation of Results “Optimal decomposition” differs w.r.t. curriculum  Computer science: preference towards C- E  ICT-electronics: preference towards A-C Advanced OO training can induce a preference towards particular styles of decomposition  Consistent with [Arisholm et al. 2004]

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.62 Transform Conditionals to Polymorphism Transform Self Type Checks Test provider type Test self type Test external attribute Transform Client Type Checks Transform Conditionals into Registration Test null values Introduce Null Object Factor Out Strategy Factor Out State Test object state

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.63 class Message { private: int type_; void* data;... void send (Channel* ch) { switch (type_) { case TEXT : { ch->nextPutAll(data); break; } case ACTION : { ch->doAction(data);...  Transform Self Type Checks void makeCalls (Telephone* phoneArray[]) { for (Telephone *p = phoneArray; p; p++) { switch (p-> phoneType()) { case TELEPHONE::POTS : { POTSPhone* potsp = (POTSPhone*)p potsp->tourne(); potsp->call();... case TELEPHONE::ISDN : { ISDNPhone* isdnp = (ISDNPhone*)p isdnp->initLine(); isdnp->connect();...  Transform Client Type Checks Example: Transform Conditional

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.64 Message send() Message send() ActionMessage send() TextMessage send() switch (type_) { case TEXT : { ch->nextPutAll(data); break;} case ACTION : { ch->doAction(data);... Client1 Client2 Transform Self Type Check

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.65 TelephoneBox makeCall () Telephone POTSPhone... ISDNPhone... TelephoneBox makeCall () Telephone makeCall() POTSPhone makeCall()... ISDNPhone makeCall... Transform Client Type Check

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.66 Conclusion Navigation Code & Complex Conditionals  Most common lack of OO use Polymorphism is key abstraction mechanism  adds flexibility  reduces maintenance cost Avoid Risk  Only refactor when inferior code must be changed (cfr. God Class) Performance ?  Small methods with less navigation code are easier to optimise  Deeply nested if-statements cost more than virtual calls Long case statements cost as much as virtual calls

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering Code Duplication a.k.a. Software Cloning, Copy&Paste Programming Code Duplication  What is it?  Why is it harmful? Detecting Code Duplication Approaches A Lightweight Approach Visualization (dotplots) Duploc

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.68 The Reengineering Life-Cycle Requirements Designs Code (0) requirement analysis (1) model capture (2) problem detection (3) problem resolution (2) Problem detection issues Scale Unknown a priori

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.69 Code is Copied Small Example from the Mozilla Distribution (Milestone 9) Extract from /dom/src/base/nsLocation.cpp

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.70 Case StudyLOC Duplication without comments with comments gcc460’0008.7%5.6% Database Server 245’ %23.3% Payroll40’ %25.4% Message Board 6’ %17.4% How Much Code is Duplicated? Usual estimates: 8 to 12% in normal industrial code 15 to 25 % is already a lot!

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.71 Copied Code Problems General negative effect:  Code bloat Negative effects on Software Maintenance  Copied Defects  Changes take double, triple, quadruple,... Work  Dead code  Add to the cognitive load of future maintainers Copying as additional source of defects  Errors in the systematic renaming produce unintended aliasing Metaphorically speaking:  Software Aging, “hardening of the arteries”,  “Software Entropy” increases even small design changes become very difficult to effect

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.72 Code Duplication Detection Nontrivial problem: No a priori knowledge about which code has been copied How to find all clone pairs among all possible pairs of segments?

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.73 General Schema of Detection Process AuthorLevelTransformed Code Comparison Technique [John94a]LexicalSubstringsString-Matching [Duca99a]LexicalNormalized StringsString-Matching [Bake95a]SyntacticalParameterized StringsString-Matching [Mayr96a]SyntacticalMetric TuplesDiscrete comparison [Kont97a]SyntacticalMetric TuplesEuclidean distance [Baxt98a]SyntacticalASTTree-Matching

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.74 Simple Detection Approach (i) Assumption: Code segments are just copied and changed at a few places Code Transformation Step remove white space, comments remove lines that contain uninteresting code elements (e.g., just ‘else’ or ‘}’) … //assign same fastid as container fastid = NULL; const char* fidptr = get_fastid(); if(fidptr != NULL) { int l = strlen(fidptr); fastid = newchar[ l + 1 ]; … fastid=NULL; constchar*fidptr=get_fastid(); if(fidptr!=NULL) intl=strlen(fidptr) fastid = newchar[l+]

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.75 Simple Detection Approach (ii) Code Comparison Step  Line based comparison (Assumption: Layout did not change during copying)  Compare each line with each other line.  Reduce search space by hashing: 1. Preprocessing: Compute the hash value for each line 2. Actual Comparison: Compare all lines in the same hash bucket Evaluation of the Approach  Advantages: Simple, language independent  Disadvantages: Difficult interpretation

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.76 A Perl script for C++ (i)

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.77 A Perl script for C++ (ii) Handles multiple files Removes comments and white spaces Controls noise (if, {,) Granularity (number of lines) Possible to remove keywords

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.78 Output Sample Lines: create_property(pd,pnImplObjects,stReference,false,*iImplObjects); create_property(pd,pnElttype,stReference,true,*iEltType); create_property(pd,pnMinelt,stInteger,true,*iMinelt); create_property(pd,pnMaxelt,stInteger,true,*iMaxelt); create_property(pd,pnOwnership,stBool,true,*iOwnership); Locations: 6178/6179/6180/6181/ /6199/6200/6201/6202 Lines: create_property(pd,pnSupertype,stReference,true,*iSupertype); create_property(pd,pnImplObjects,stReference,false,*iImplObjects); create_property(pd,pnElttype,stReference,true,*iEltType); create_property(pd,pMinelt,stInteger,true,*iMinelt); create_property(pd,pnMaxelt,stInteger,true,*iMaxelt); Locations: 6177/ /6230 Lines = duplicated lines Locations = file names and line number

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.79 Visualization of Duplicated Code Visualization provides insights into the duplication situation A simple version can be implemented in three days Scalability issue Dotplots — Technique from DNA Analysis Code is put on vertical as well as horizontal axis A match between two elements is a dot in the matrix

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.80 Visualization of Copied Code Sequences All examples are made using Duploc from an industrial case study (1 Mio LOC C++ System) Detected Problem File A contains two copies of a piece of code File B contains another copy of this code Possible Solution Extract Method

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.81 Visualization of Repetitive Structures Detected Problem 4 Object factory clones: a switch statement over a type variable is used to call individual construction code Possible Solution Strategy Method

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.82 Visualization of Cloned Classes Class A Class B Class A Detected Problem: Class A is an edited copy of class B. Editing & Insertion Possible Solution Subclassing …

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.83 Visualization of Clone Families 20 Classes implementing lists for different data types Detail Overview

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.84 Lightweight is sometimes not enough It runs really everywhere (Smalltalk inside)  Contact us if you need it! Duploc is scalable, integrates detection and visualization

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering Software Evolution Exploiting the Version Control System  Visualizing CVS changes The Evolution Matrix Yesterday's weather It is not age that turns a piece of software into a legacy system, but the rate at which it has been developed and adapted without being reengineered. [Demeyer, Ducasse and Nierstrasz: Object-Oriented Reengineering Patterns]

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.86 The Reengineering Life-Cycle Requirements Designs Code (0) requirement analysis (1) model capture (2) problem detection (3) problem resolution (2) Problem detection Issues scale

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.87 Analyse CVS changes 4) Block Shift = Design Change 3) Triangle = Core Reduces 1) Vertical lines = Frequent Changers 2) Horizontal line = Shotgun Surgery

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.88 Ownership Map: Developer Activity DialogueMonologue EditTakeover Familiarization

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.89 The Evolution Matrix Last Version First Version Major Leap Removed Classes TIME (Versions) GrowthStabilisation Added Classes

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.90 Pulsar & Supernova Pulsar: Repeated Modifications make it grow and shrink. System Hotspot: Every System Version requires changes. Supernova: Sudden increase in size. Possible Reasons: Massive shift of functionality towards a class. Data holder class for which it is easy to grow. Sleeper: Developers knew exactly what to fill in.

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.91 Example: MooseFinder (38 Versions)

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.92 Yesterday’s Weather: Stability of Changes

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering Conclusion 1. Introduction There are OO legacy systems too ! 2. Reverse Engineering How to understand your code 3. Visualization Scaleable approach 4. Restructuring How to Refactor Your Code 4. Code Duplication The most typical problems 5. Software Evolution Learn from the past 6. Conclusion Did we convince you?

© S. Demeyer, S. Ducasse, O. Nierstrasz Object-Oriented Reengineering.94 Goals We will try to convince you: Yes, Virginia, there are object-oriented legacy systems too!  … actually, that's a sign of health Reverse engineering and reengineering are essential activities in the lifecycle of any successful software system. (And especially OO ones!)  … consequently, do not consider it second class work There is a large set of lightweight tools and techniques to help you with reengineering.  … check our book, but remember the list is growing Despite these tools and techniques, people must do job and represent the most valuable resource.  … pick them carefully and reward them properly  Did we convince you ?