Code and Pattern Mining in C/C++ Aditya S. Deshpande Namratha Nayak Guides: Dr. A.Serebrenik(TU/e) P.Kourzanov, ir(NXP) Y.Dajsuren, PDEng(Virage Logic)


1 Code and Pattern Mining in C/C++
Aditya S. Deshpande, Namratha Nayak
Guides: Dr. A. Serebrenik (TU/e), P. Kourzanov, ir (NXP), Y. Dajsuren, PDEng (Virage Logic)

2 Agenda
Introduction
Problem Definition
Data flow
Design patterns
Summary
/ Faculteit Wiskunde en Informatica PAGE 1 21-3-2015

3 Introduction
Code mining: the process of extracting patterns from source code.
Design patterns: a design pattern is a general, reusable solution to a commonly occurring problem in software design.
Streaming data: data streaming is the transfer of data at a steady, high-speed rate sufficient to support applications such as high-definition television or a radio signal.

4 Problem Definition
Lack of synchronisation between models and source code.
Significant amount of repetitive code in different modules.
Objective: identify patterns and integrate them in the framework.

5 Approach
Study the available data flow models.
Study the various design pattern matching methods and tools.

6 Data flow models
Kahn Process Networks.
Synchronous Data Flow.

7 Kahn Process Network - Introduction
Processes communicate via FIFOs.
Parallel communication is organized as follows:
- Autonomous computing stations are connected to each other in a network by communication lines.
- A station computes on data arriving on its input lines to produce output on some or all of its output lines.
Assumptions:
- Communication lines are the only means of communication.
- Communication lines transmit information within a finite time.

8 Kahn Process Network - Introduction
Restrictions:
- At any given time, a computing station is either computing or waiting for information on one of its input lines.
- Each computing station follows a sequential program.

9 Kahn Process Network - Example
From Kahn’s original 1974 paper:
    process f(in int u, in int v, out int w)
    {
        int i;
        bool b = true;
        for (;;) {
            i = b ? wait(u) : wait(v);  /* blocking read from u or v */
            printf("%i\n", i);
            send(i, w);                 /* forward the token to w */
            b = !b;                     /* switch to the other input */
        }
    }
The process alternately reads from u and v, prints the data value, and writes it to w.
[Figure: process f with input channels u and v and output channel w]

10 Kahn Process Network - Example
From Kahn’s original 1974 paper:
    process f(in int u, in int v, out int w)
    {
        int i;
        bool b = true;
        for (;;) {
            i = b ? wait(u) : wait(v);
            printf("%i\n", i);
            send(i, w);
            b = !b;
        }
    }
The process interface includes FIFOs.
wait() returns the next token in an input FIFO, blocking if it is empty.
send() writes a data value to an output FIFO.
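The behaviour of process f can be tried out in ordinary C++. The following is only a minimal single-threaded sketch, not Kahn's notation: `Channel`, `ProcessF` and `run_until_blocked` are hypothetical names, the blocking wait() is modelled by `step()` returning false when the selected input FIFO is empty, and the printf call is left out.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Unbounded FIFO channel carrying int tokens (hypothetical helper type).
struct Channel {
    std::deque<int> tokens;

    bool empty() const { return tokens.empty(); }
    void send(int value) { tokens.push_back(value); }
    int receive() {
        assert(!tokens.empty());  // callers must check empty() first
        int value = tokens.front();
        tokens.pop_front();
        return value;
    }
};

// Process f: alternately forwards one token from u, then one from v.
struct ProcessF {
    bool read_u = true;  // plays the role of `b` in the slide code

    // Tries to run one loop iteration.
    // Returns false where the blocking wait() would stall.
    bool step(Channel& u, Channel& v, Channel& w) {
        Channel& in = read_u ? u : v;
        if (in.empty()) return false;
        w.send(in.receive());
        read_u = !read_u;
        return true;
    }
};

// Runs f until it would block and returns everything it wrote to w.
std::vector<int> run_until_blocked(Channel& u, Channel& v) {
    Channel w;
    ProcessF f;
    while (f.step(u, v, w)) {
    }
    return std::vector<int>(w.tokens.begin(), w.tokens.end());
}
```

With 1, 3, 5 queued on u and 2, 4 queued on v, the sketch emits 1, 2, 3, 4, 5 and then stalls waiting on v, whatever order the tokens were queued in. That order-independence is the determinism property that makes Kahn process networks attractive.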

11 SDF - Introduction
A synchronous data flow (SDF) graph is a network of synchronous nodes (also called blocks).
For a synchronous node, the numbers of samples consumed and produced per firing are known a priori.
Homogeneous SDF: the special case in which every node consumes and produces exactly one sample on each arc per firing.

12 SDF - Delay
Delay in the signal processing sense: a unit delay on the arc between A and B means that the nth sample consumed by B is the (n-1)th sample produced by A.
An arc with delay d is initialized with d zero samples.
[Figure: arc from node A to node B, labelled with delay d]
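The effect of a delay can be illustrated with a small C++ sketch. It assumes a single arc from A to B on which both nodes move one sample per firing; `DelayedArc` and `consume_through_delay` are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// One SDF arc from A to B, pre-loaded with d zero samples (the delay).
class DelayedArc {
public:
    explicit DelayedArc(std::size_t d) : buffer_(d, 0) {}

    void produce(int sample) { buffer_.push_back(sample); }
    bool can_consume() const { return !buffer_.empty(); }
    int consume() {
        assert(can_consume());
        int sample = buffer_.front();
        buffer_.pop_front();
        return sample;
    }

private:
    std::deque<int> buffer_;
};

// Pushes the samples produced by A through an arc with delay d and
// returns what B consumes, one consumption per production.
std::vector<int> consume_through_delay(const std::vector<int>& produced,
                                       std::size_t d) {
    DelayedArc arc(d);
    std::vector<int> consumed;
    for (int sample : produced) {
        arc.produce(sample);
        consumed.push_back(arc.consume());
    }
    return consumed;
}
```

For d = 1 and produced samples 10, 20, 30, B consumes 0, 10, 20: each sample B sees is the one A produced one firing earlier, preceded by the initial zero.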

13 SDF - Implementation
Implementation requires:
- Buffering of the data samples passing between nodes.
- Scheduling nodes when their inputs are available.
Dynamic (= run-time) implementation requires:
- A run-time scheduler that checks when inputs are available and schedules nodes when a processor is free.

14 SDF - Implementation
Contribution of Lee-87:
- SDF graphs can be scheduled at compile time.
- No run-time scheduling overhead.
The compiler will:
- Determine the execution order of the nodes on one or multiple processors or data path units.
- Determine the communication buffers between nodes.
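A rough idea of what compile-time scheduling involves, sketched in C++ for the simplest possible case: one arc from A to B, where A produces `produce` samples per firing and B consumes `consume` samples per firing. The balance equation q_A * produce = q_B * consume is the standard SDF condition for a periodic schedule; the function names and the "fire B as soon as possible" policy are assumptions made for this illustration and are not taken from the slides.

```cpp
#include <cassert>
#include <string>

struct Repetitions { long a; long b; };                   // firings per period
struct Schedule { std::string order; long max_buffer; };

static long gcd_of(long x, long y) {
    while (y != 0) { long t = x % y; x = y; y = t; }
    return x;
}

// Smallest positive solution of q_a * produce == q_b * consume.
Repetitions solve_balance(long produce, long consume) {
    assert(produce > 0 && consume > 0);
    long g = gcd_of(produce, consume);
    return {consume / g, produce / g};
}

// Builds one schedule period: fire B whenever enough samples are
// buffered, otherwise fire A. Also records the peak buffer occupancy.
Schedule build_schedule(long produce, long consume) {
    Repetitions left = solve_balance(produce, consume);
    Schedule result{"", 0};
    long buffered = 0;
    while (left.a > 0 || left.b > 0) {
        if (left.b > 0 && buffered >= consume) {
            buffered -= consume;
            --left.b;
            result.order += 'B';
        } else {
            buffered += produce;
            --left.a;
            result.order += 'A';
        }
        if (buffered > result.max_buffer) result.max_buffer = buffered;
    }
    return result;
}
```

For produce = 2 and consume = 3, the repetitions are 3 firings of A and 2 of B, the period is AABAB, and the arc never holds more than 4 samples. Both the execution order and the buffer size are therefore known before the program runs.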

15 Design Patterns
Describe solutions for common recurring problems.
Can be used in a wider context, as they are defined informally.
Documenting them in a software system simplifies maintenance and program understanding.
Usually they are not documented, so design patterns need to be discovered from the source code.

16 Pattern Mining
The structure of a design pattern is searched for in the source code.
The pattern description should include the main properties of the design pattern, yet be flexible enough to describe slightly distorted pattern occurrences.
Helps to understand the relationships between the different parts of a large system.

17 Pattern Mining
Reverse engineering: analysis of a system to
- identify the components and their interrelationships;
- create representations of the system in another form.
Why tools for reverse engineering?
- Existing legacy code.
- High number of participants in code development.
Tools have been developed to mine the patterns from the source code.

18 Pattern Mining Tools
Aspects in which the mining tools differ:
- Programming language: tools for Java and C++.
- Method used to discover design patterns: graph matching, constraint satisfaction problem, pattern inference.
- Intermediate representation: abstract semantic graph, abstract syntax tree, matrix and vector.
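To make the graph matching idea concrete, here is a deliberately tiny C++ sketch. It assumes the intermediate representation has already been reduced to two edge sets (inheritance and aggregation) and looks for one structural signature: a class that both inherits from and aggregates the same type, which is typical of Composite and Decorator. `ClassGraph` and the function names are hypothetical; this is not how Columbus or any other tool named here is implemented.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Edge = std::pair<std::string, std::string>;  // (from, to)

// Hypothetical fact base extracted from source code.
struct ClassGraph {
    std::set<Edge> inherits;    // (derived, base)
    std::set<Edge> aggregates;  // (owner, part type)
};

void add_inheritance(ClassGraph& g, const std::string& derived,
                     const std::string& base) {
    assert(derived != base);  // a class cannot inherit from itself
    g.inherits.insert({derived, base});
}

void add_aggregation(ClassGraph& g, const std::string& owner,
                     const std::string& part) {
    g.aggregates.insert({owner, part});
}

// Returns (class, base) pairs where the class inherits from the base
// and also holds instances of it: Composite/Decorator candidates.
std::vector<Edge> find_composite_candidates(const ClassGraph& g) {
    std::vector<Edge> hits;
    for (const Edge& e : g.inherits) {
        if (g.aggregates.count(e) != 0) hits.push_back(e);
    }
    return hits;
}
```

A purely structural match like this only yields candidates. Telling Composite from Decorator, or discarding false positives, needs further checks such as the multiplicity of the aggregation or the call behaviour of the methods.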

19 Columbus - Design Pattern Mining Tool
Reverse engineering framework.
Developed in cooperation between the Research Group on Artificial Intelligence in Szeged, the Software Technology Laboratory of the Nokia Research Center, and FrontEndART Ltd.
Analyzes large C/C++ projects and extracts data according to the Columbus Schema.
Supports project handling, data extraction, data representation, data storage, filtering, and visualization.

20 Columbus - Design Pattern Mining Tool
Has a C/C++ extractor plug-in that parses the source code.
The information collected by the plug-in corresponds to the Columbus Schema.
The schema captures the C++ language at a low level of detail (i.e., the abstract syntax tree) and also contains higher-level elements (e.g., the semantics of types).
Supports various file formats for exporting the extracted data.

21 Other Pattern Mining Tools
Other tools to be studied:
- CPP2XMI
- Maisa
- CrocoPat

22 Issues to be considered
Can the tools support NXP source code?
Would it be possible to add proprietary patterns to these tools?
Can these tools be extended to support other languages like C?

23 Summary
Overview of the data flow models.
Introduced the design pattern mining tool Columbus.
Find the patterns present in the NXP source code and check whether these can be mined using the available tools.

24 References
E. A. Lee and D. G. Messerschmitt. Synchronous data flow. Proc. IEEE, vol. 75, pp. 1235–1245, Sept. 1987.
G. Kahn. The semantics of a simple language for parallel programming. In Proc. IFIP Congress, Stockholm, Sweden, Aug. 1974, pp. 471–475.
E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns - Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.

25 References
R. Ferenc, A. Beszedes, M. Tarkiainen, and T. Gyimothy. Columbus – Reverse Engineering Tool and Schema for C++. In Proceedings of the 18th International Conference on Software Maintenance (ICSM 2002), pages 172–181. IEEE Computer Society, Oct. 2002.
R. Ferenc and A. Beszedes. Data Exchange with the Columbus Schema for C++. In Proceedings of the 6th European Conference on Software Maintenance and Reengineering (CSMR 2002), pages 59–66. IEEE Computer Society, Mar. 2002.

26 References
Z. Balanyi and R. Ferenc. Mining Design Patterns from C++ Source Code. In Proceedings of the 19th International Conference on Software Maintenance (ICSM 2003), pages 305–314. IEEE Computer Society, Sept. 2003.

27 Questions

