1 A Comparative Study of Two Natural Language Processing Frameworks
Yixin Bian, Gunes Koru, Hongfang Liu
Department of Information Systems, University of Maryland, Baltimore County, MD 21250, USA
June 11, 2012

2 Introduction UIMA (Unstructured Information Management Architecture) is a framework for natural language processing, originally developed by IBM and now maintained by the Apache Software Foundation. GATE (General Architecture for Text Engineering) is a Java suite of tools originally developed at the University of Sheffield and now used worldwide by a large community of scientists and companies for a wide range of natural language processing tasks.

3 Introduction Both are developed in Java. Although they share common goals, the two architectures differ in many respects. Which one should be adopted?

4 Introduction In this paper, we compare them from three perspectives: software design quality (code metrics), software maintenance (code smells, bugs, bug survival curves), and the user's manual.

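5 The Comparison of Metrics

Number of classes: UIMA 2,187; GATE 2,822

| Metric | UIMA Min | UIMA Median | UIMA Max | UIMA Total | UIMA Average Value | GATE Min | GATE Median | GATE Max | GATE Total | GATE Average Value |
|---|---|---|---|---|---|---|---|---|---|---|
| Lines of Code | 0 | 25 | 2944 | 169,516 | 77.51 | 0 | 23 | 3869 | 228,454 | 80.95 |
| CBO | 0 | 2 | 84 | 11822 | 5.41 | 0 | 2 | 65 | 11203 | 3.97 |
| NOC | 0 | 0 | 71 | 1170 | 0.53 | 0 | 0 | 81 | 1027 | 0.36 |
| RFC | 0 | 6 | 347 | 35220 | 16.1 | 0 | 3 | 214 | 29909 | 10.6 |
| DIT | 0 | 1 | 10 | 3837 | 1.75 | 0 | 1 | 8 | 4731 | 1.68 |
| LCOM | 0 | 16 | 100 | 79374 | 36.29 | 0 | 0 | 100 | 85051 | 30.14 |
| WMC | 0 | 4 | 345 | 15166 | 6.93 | 0 | 2 | 180 | 15220 | 5.39 |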
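The "Average Value" column on this slide appears to be each metric's total divided by the number of classes. The following is a minimal sketch of that arithmetic, using a few totals copied from the slide; the function and variable names are illustrative and not part of the original study.

```python
# Class counts and per-metric totals copied from the metrics slide.
CLASS_COUNT = {"UIMA": 2187, "GATE": 2822}
TOTALS = {
    "Lines of Code": {"UIMA": 169516, "GATE": 228454},
    "CBO": {"UIMA": 11822, "GATE": 11203},
    "RFC": {"UIMA": 35220, "GATE": 29909},
    "WMC": {"UIMA": 15166, "GATE": 15220},
}


def average_per_class(total, classes):
    """Assumed definition of 'Average Value': metric total / number of classes."""
    return total / classes


def averages():
    """Return {metric: {framework: average rounded to two decimals}}."""
    return {
        metric: {
            fw: round(average_per_class(total, CLASS_COUNT[fw]), 2)
            for fw, total in by_framework.items()
        }
        for metric, by_framework in TOTALS.items()
    }


if __name__ == "__main__":
    for metric, row in averages().items():
        print(metric, row)
```

Under this assumption, the computed averages for the metrics included here agree with the slide's values after rounding.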

6 The Number of Code Smells

| Code Smell | Number of code smells in UIMA | Average (UIMA/KLOC) | Number of code smells in GATE | Average (GATE/KLOC) |
|---|---|---|---|---|
| Data Class | 6 | 0.035 | 11 | 0.05 |
| Data Clumps | 63 | 0.372 | 21 | 0.091 |
| Feature Envy | 26 | 0.153 | 0 | 0 |
| Refused Bequest | 101 | 0.6 | 448 | 2.05 |
| Long Message Chain | 19 | 0.112 | 30 | 0.137 |
| Shotgun Surgery | 23 | 0.136 | 189 | 0.863 |
| God Class | 16 | 0.094 | 48 | 0.219 |
| Total | 254 | 1.5 | 747 | 3.41 |
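The per-KLOC averages normalize smell counts by code size so that frameworks of different sizes can be compared. As a rough sketch, and assuming the UIMA column divides each count by the total lines of code reported on the metrics slide (169,516), the calculation could look like this. Names are illustrative, and only the UIMA column is covered.

```python
# Assumption: KLOC is derived from the UIMA "Lines of Code" total on the metrics slide.
UIMA_LOC = 169516

# Smell counts copied from the UIMA column of the code smell slide.
UIMA_SMELLS = {
    "Data Class": 6,
    "Data Clumps": 63,
    "Feature Envy": 26,
    "Refused Bequest": 101,
    "Long Message Chain": 19,
    "Shotgun Surgery": 23,
    "God Class": 16,
}


def smells_per_kloc(count, loc):
    """Smell density: occurrences per thousand lines of code."""
    return count / (loc / 1000)


def density_table(smells, loc):
    """Return {smell name: density rounded to three decimals}."""
    return {name: round(smells_per_kloc(n, loc), 3) for name, n in smells.items()}


if __name__ == "__main__":
    for name, density in density_table(UIMA_SMELLS, UIMA_LOC).items():
        print(f"{name}: {density}")
    print("Total smells:", sum(UIMA_SMELLS.values()))
```

Normalizing by size matters here because GATE has more lines of code than UIMA, so raw counts alone would overstate the difference.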

7 The Number of Bugs

| Detection Tool | UIMA | GATE |
|---|---|---|
| FindBugs (2.0.0) | 61 | 78 |
| PMD (5.0) | 1798 | 1794 |
| Lint4j (0.9.13) | 84 | 494 |

8 The Comparison of Bug Survival Curves
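The slide does not say how the survival curves were estimated. A common choice for this kind of analysis is the Kaplan-Meier estimator, so the sketch below assumes it; the function name and the sample data are hypothetical and do not come from the study. Each bug contributes a duration (days from report to fix, or to the end of observation) and a flag saying whether it was actually fixed.

```python
def kaplan_meier(durations, closed):
    """Return [(t, S(t))] at each time where at least one bug was fixed.

    durations: days each bug was observed (report to fix, or report to end of study)
    closed:    True if the bug was fixed at that time, False if still open (censored)
    """
    events = sorted(zip(durations, closed))
    at_risk = len(events)   # bugs still open just before the current time
    survival = 1.0          # estimated probability that a bug is still open
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        fixed = 0
        leaving = 0
        # Group all bugs that share the same observation time.
        while i < len(events) and events[i][0] == t:
            fixed += 1 if events[i][1] else 0
            leaving += 1
            i += 1
        if fixed:
            survival *= 1 - fixed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical data: five bugs, two of them never fixed during observation.
    days = [5, 10, 10, 20, 30]
    was_fixed = [True, True, False, True, False]
    for t, s in kaplan_meier(days, was_fixed):
        print(f"day {t}: {s:.2f} of bugs estimated still open")
```

A curve that drops faster means bugs are fixed sooner, which is how two frameworks' curves would be compared.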

9 The Comparison of User Manuals

Contents compared (UIMA, GATE):
- Catalog
- Tutorial of manual
- Overview and characteristics of software product
- Installation and setup
- Introduction of product application
- Frequently Asked Questions (FAQ) ×
- Known issues and problems with the software ×
- Terms, concepts and their basic definitions in software ×

10 Conclusion In terms of software design quality, software maintenance, and user's manual, UIMA is better than GATE.

11 Thank you!

