Investigating JAVA Classes with Formal Concept Analysis Uri Dekel Based on M.Sc. work at the Israeli Institute of Technology. To appear:

Presentation transcript:

Investigating JAVA Classes with Formal Concept Analysis
Uri Dekel
Based on M.Sc. work at the Israeli Institute of Technology.
To appear: 10th Working Conference on Reverse Engineering (WCRE’03), and as a poster in OOPSLA’
Software Research Seminar (SSSG)

9/25/2003 Investigating Classes with FCA, Uri Dekel, Software Research Seminar

Outline
- Research goals and hypotheses
- A crash-course in formal concept analysis
- Interface visualization
- Reasoning about class implementation
- Applications to code inspection
- Additional research

Goals
- Research question: “Can we exploit the data-member based cohesion between function-methods in a class to reason about the class and discover errors?”
- Specifically:
  1. Provide a faster learning curve for new class users by improving interface presentation
  2. Assist reverse engineering by visualizing structure
  3. Assist code inspection by suggesting reading order
- Important principle: keep it simple to use and learn.

Hypothesis #1
- Data-member use is fundamental to understanding a class.
  - All possible implementations of an operation will use the same fields
  - Representation changes are rare
- Basis for cohesion-based metrics (e.g., LCOM)
- Analogous to global-variable-based modularization of procedural code.

Hypothesis #2
- Methods that use the same combination of fields are likely to be related.
  - e.g., get/set, add/remove, etc.
- Even more so due to the “shopping list approach”
  - Promotes complete interfaces using composite methods

Means
- Formal Concept Analysis
  - Mathematical classification technique
  - Uses a binary relation (context) between objects and attributes (not to be confused with the OO terms)
  - Produces a concept lattice (next slide)
  - Much literature on applications in various fields
- Example: Context of the Pnt3D class

Formal Concept Analysis
- Input: a context
  - O is a set of objects
  - A is a set of attributes
  - R is a binary relation between O and A
- Mapping: Galois connection
  - Common attributes of a set of objects
  - Common objects of a set of attributes
- Output: concepts, i.e. pairs of an object set and an attribute set, each of which is exactly the image of the other under the mapping
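The formulas on this slide are not present in the transcript. In the standard formulation of formal concept analysis they can be written as follows; the symbols \sigma, \tau, X and Y are notation assumed here, not taken from the slides.

```latex
\sigma(X) = \{\, a \in A \mid \forall o \in X : (o, a) \in R \,\}, \qquad X \subseteq O

\tau(Y) = \{\, o \in O \mid \forall a \in Y : (o, a) \in R \,\}, \qquad Y \subseteq A

(X, Y) \text{ is a concept} \iff \sigma(X) = Y \ \text{and}\ \tau(Y) = X
```

Here \sigma gives the common attributes of a set of objects and \tau gives the common objects of a set of attributes.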

Formal Concept Analysis
- A concept lattice is based upon a partial order between concepts.
- Example: Concepts of the Pnt3D class
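As a rough illustration of how concepts can be derived from a context, here is a minimal Python sketch. The context below is hypothetical (the Pnt3D table from the slides is not in the transcript), the function names are made up for this example, and the brute-force enumeration is only suitable for tiny inputs.

```python
from itertools import combinations

# Hypothetical context: method -> fields it uses.
# This is NOT the Pnt3D context shown on the original slide.
CONTEXT = {
    "getX": {"x"},
    "setX": {"x"},
    "getY": {"y"},
    "getZ": {"z"},
    "length2D": {"x", "y"},
    "length3D": {"x", "y", "z"},
}


def common_attributes(objects, context):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    result = set().union(*context.values())
    for obj in objects:
        result &= context[obj]
    return frozenset(result)


def common_objects(attributes, context):
    """Objects that have every attribute in `attributes`."""
    return frozenset(o for o, used in context.items() if attributes <= used)


def concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of objects.

    Exponential in the number of objects; a sketch, not a practical algorithm.
    """
    found = set()
    names = list(context)
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            intent = common_attributes(subset, context)
            extent = common_objects(intent, context)
            found.add((extent, intent))
    return found
```

Calling `concepts(CONTEXT)` returns a set of pairs, each holding a group of methods and the fields they all use.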

Concept Lattices
- A sparse concept lattice provides an alternate view of the tabular context and the full concept lattice
  - Each concept is a group of objects which have the same attributes
  - The attributes are the union of the attributes in that concept and in all the concepts that it dominates
- In our case, methods that use the same fields are clustered together
- Reveals structure and asymmetries

Interface Visualization
- The lattice partitions the methods in the interface into equivalence classes
  - Similar methods are heuristically clustered together.
  - An automatic “feature categorization”
- Lattice provides multidimensional connections
- Compare with simple lexical lists of methods
- (Note: class is “flattened” to remove inheritance details)
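The partition into equivalence classes can be sketched in a few lines of Python, assuming a method-to-fields map has already been extracted from the class. The function name and input shape are assumptions made for this illustration; the slides do not describe the tool's data structures.

```python
from collections import defaultdict


def partition_by_fields(uses):
    """Group methods that use exactly the same set of fields.

    `uses` maps a method name to an iterable of field names (hypothetical input).
    Returns a dict from frozenset-of-fields to a sorted list of method names.
    """
    groups = defaultdict(list)
    for method, fields in uses.items():
        groups[frozenset(fields)].append(method)
    return {fields: sorted(methods) for fields, methods in groups.items()}
```

Each resulting group corresponds to one cluster of methods, in contrast to a flat alphabetical list of the same methods.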

Interface Visualization
- To be effective, each concept should contain multiple methods on average
- A lattice can have up to n = 2^min(|M|,|F|) concepts (M: methods, F: fields)
- In a data set of circa 6000 classes:
  - In 99.5%, n < |M| + |F|
  - In 77.4%, n < |M|
- Example: Concepts vs. Methods in Eclipse.
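To see how loose the worst-case bound is compared with the observed sizes, consider a class with hypothetical sizes |M| = 30 and |F| = 10 (numbers chosen only for illustration, not taken from the data set):

```latex
n \le 2^{\min(|M|,\,|F|)} = 2^{\min(30,\,10)} = 2^{10} = 1024,
\qquad |M| + |F| = 30 + 10 = 40
```

The theoretical maximum is 1024 concepts, whereas the reported statistics suggest that a class of this size would almost always have fewer than 40.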

Case Study
- The Molecule class from CDK
- CDK: Chemistry Development Kit
  - Open source library of chemistry related classes
  - Developed at the Max Planck Institute in Germany
  - Used in chemistry visualization applications
- Why the Molecule class?
  - Has a large interface (nearly 75 public members)
  - The represented entity is familiar to most people
- Our technique revealed new errors in this class.

Case Study
- Lattice structure hints at class structure
- A lot of independent operations on the left.
  - Similar to a C struct.
- Cohesive component on the right.

Interface Visualization
- Multiple methods with similar signatures indicate possible repetition.
- Inconsistency in naming.
- Inconsistencies in return types.
- Because related methods are grouped in concepts, we can notice inconsistencies or repetitions

Investigate Implementation
- We examine fields and dependencies between concepts to understand the cohesive component
  - Collections of atoms and bonds
  - Micro-management of arrays (count field tracks available items)
- Inconsistencies and broken invariants.

Investigate Implementation
- Asymmetries are revealed by examining pairs of related concepts.

Embedded Call Graph
- A concept lattice clusters methods but does not portray interactions
- Call graphs show interaction between methods but layout does not depend on semantics
- Embedded call graph combines the two
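One way to picture the combination is to annotate each call edge with whether it stays inside a cluster of methods or crosses between clusters. The sketch below assumes clusters are defined by identical field sets; the function name and the input format are hypothetical, and the slides do not say how the actual tool represents the embedded graph.

```python
def embed_call_graph(uses, calls):
    """Annotate call edges with cluster membership.

    `uses` maps method name -> set of fields (hypothetical input).
    `calls` is an iterable of (caller, callee) pairs.
    Returns a list of (caller, callee, same_cluster) tuples, where
    same_cluster is True when both methods use exactly the same fields.
    """
    cluster_of = {method: frozenset(fields) for method, fields in uses.items()}
    embedded = []
    for caller, callee in calls:
        same_cluster = cluster_of[caller] == cluster_of[callee]
        embedded.append((caller, callee, same_cluster))
    return embedded
```

Edges flagged `False` are the ones a lattice-based layout would draw between different concepts.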

Code Inspection
- Lattice can help us select a reading order
  - Minimize focus shifts.
  - Similar methods are read consecutively.
- We define a global order between concepts.
  - e.g., each component separately, topological ordering, read by order of layers.
- We define a local order between methods in each concept.
  - e.g., topological ordering, read by order of simplicity, etc.
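A reading order of this kind might be sketched as follows, under two assumptions that are not from the slides: the global order places a group of methods after every group whose field set is a strict subset of its own, and the local order reads shorter methods first, using a hypothetical `size` map (for example, lines of code). `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def reading_order(uses, size):
    """Suggest an order in which to read the methods of a class.

    `uses` maps method name -> set of fields; `size` maps method name -> a
    simplicity measure such as line count. Both inputs are hypothetical.
    """
    # Cluster methods that use exactly the same fields.
    groups = {}
    for method, fields in uses.items():
        groups.setdefault(frozenset(fields), []).append(method)

    # Global order: topological sort over the strict-subset relation
    # between the field sets of the clusters.
    sorter = TopologicalSorter()
    for fields in groups:
        predecessors = [other for other in groups if other < fields]
        sorter.add(fields, *predecessors)

    # Local order: simplest method first, ties broken by name.
    order = []
    for fields in sorter.static_order():
        order.extend(sorted(groups[fields], key=lambda m: (size[m], m)))
    return order
```

With this choice, methods touching a single field are read before methods that combine that field with others, so each new method builds on fields already seen.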

Tooling Support
- Batch-mode prototype
  - Produces lattices and metrics
  - Database support for metrics and statistics research
- Interactive Eclipse plug-in prototype
  - Adds an additional view for .java files
  - Uses a simplistic external static analyzer.
  - Limited by current 2D capabilities of Eclipse.

Research Directions
- Conduct user studies to validate methodology
  - Preliminary user studies provided good feedback
- Lattice-based metrics suite
- Application to class design in CASE tools
  - Interactive class diagram editor based on concept lattice
  - Semantics assigned by connecting methods to fields. Compare with simply adding methods to a list as in current tools.

Research Directions
- Class-wide “diffing”
  - Provide a bird's-eye view of changed areas.
- Example: Differences between the original version of the “Graph” class of VGJ (Visualizing Graphs with Java) and the Technion adaptation of that class. Original parts appear in bold font; modifications appear in plain font.
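The slides do not describe how the class-wide diff is computed. As one plausible starting point, two versions of a class could be compared through their method-to-fields maps; everything in the sketch below (function name, input format, the three categories) is an assumption made for illustration.

```python
def diff_classes(old, new):
    """Compare two method -> fields maps for two versions of a class.

    Returns methods that were added, removed, or whose field usage changed.
    Inputs are hypothetical; this is not the method used in the original work.
    """
    old_methods, new_methods = set(old), set(new)
    return {
        "added": sorted(new_methods - old_methods),
        "removed": sorted(old_methods - new_methods),
        "changed": sorted(
            m for m in old_methods & new_methods if set(old[m]) != set(new[m])
        ),
    }
```

Methods listed under `changed` would move to a different place in the lattice, which is where a bird's-eye view would draw the reader's attention.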

Backup Material

Graph Class