1
A Comparative Study of Two Natural Language Processing Frameworks
Yixin Bian, Gunes Koru, Hongfang Liu
Department of Information Systems, University of Maryland, Baltimore County, MD 21250, USA
June 11, 2012
2
Introduction
UIMA (Unstructured Information Management Architecture) is a framework for natural language processing, originally developed by IBM and now maintained by the Apache Software Foundation.
GATE (General Architecture for Text Engineering) is a Java suite of tools originally developed at the University of Sheffield and now used worldwide by a wide community of scientists and companies for many kinds of natural language processing tasks.
3
Introduction
Both frameworks are developed in Java. Although they share common goals, the two architectures differ in many respects. Which one should be adopted?
4
Introduction
In this paper, we compare them from three perspectives:
- Software design quality
  - Code metrics
- Software maintenance
  - Code smells
  - Bugs
  - Bug survival curves
- User's manual
5
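The Comparison of Metrics
Number of classes: UIMA 2,187; GATE 2,822

UIMA
Metric          Min  Median   Max    Total  Average
Lines of code     0      25  2944  169,516    77.51
CBO               0       2    84    11822     5.41
NOC               0       0    71     1170     0.53
RFC               0       6   347    35220    16.1
DIT               0       1    10     3837     1.75
LCOM              0      16   100    79374    36.29
WMC               0       4   345    15166     6.93

GATE
Metric          Min  Median   Max    Total  Average
Lines of code     0      23  3869  228,454    80.95
CBO               0       2    65    11203     3.97
NOC               0       0    81     1027     0.36
RFC               0       3   214    29909    10.6
DIT               0       1     8     4731     1.68
LCOM              0       0   100    85051    30.14
WMC               0       2   180    15220     5.39

CBO: coupling between objects. NOC: number of children. RFC: response for a class. DIT: depth of inheritance tree. LCOM: lack of cohesion in methods. WMC: weighted methods per class.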
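The "Average Value" column appears to be each metric's total divided by the number of classes. The slide does not state this, so treat it as an assumption; the short Python sketch below checks it against the figures above. The dictionary and function names are illustrative only.

```python
# Class counts and per-metric totals copied from the slide.
CLASS_COUNTS = {"UIMA": 2187, "GATE": 2822}

TOTALS = {
    "UIMA": {"LOC": 169516, "CBO": 11822, "NOC": 1170, "RFC": 35220,
             "DIT": 3837, "LCOM": 79374, "WMC": 15166},
    "GATE": {"LOC": 228454, "CBO": 11203, "NOC": 1027, "RFC": 29909,
             "DIT": 4731, "LCOM": 85051, "WMC": 15220},
}


def average_per_class(framework: str, metric: str) -> float:
    """Return the metric total divided by the framework's class count.

    Assumption: this is how the slide's "Average Value" was derived.
    """
    return TOTALS[framework][metric] / CLASS_COUNTS[framework]


if __name__ == "__main__":
    for framework, metrics in TOTALS.items():
        for metric in metrics:
            value = average_per_class(framework, metric)
            print(f"{framework:4} {metric:4} {value:.2f}")
```

Under this assumption the computed values agree with the slide to the precision shown, which also serves as a consistency check on the table.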
6
The Number of Code Smells
Code smell          UIMA count  UIMA per KLOC  GATE count  GATE per KLOC
Data Class                   6          0.035          11           0.05
Data Clumps                 63          0.372          21          0.091
Feature Envy                26          0.153           0              0
Refused Bequest            101            0.6         448           2.05
Long Message Chain          19          0.112          30          0.137
Shotgun Surgery             23          0.136         189          0.863
God Class                   16          0.094          48          0.219
Total                      254            1.5         747           3.41
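The per-KLOC averages can be read as the smell count divided by thousands of lines of code. The slide does not name the denominator; the sketch below assumes it is the total line count from the metrics slide (169,516 for UIMA), and the names used are illustrative.

```python
# UIMA smell counts as listed on the slide.
UIMA_SMELLS = {
    "Data Class": 6,
    "Data Clumps": 63,
    "Feature Envy": 26,
    "Refused Bequest": 101,
    "Long Message Chain": 19,
    "Shotgun Surgery": 23,
    "God Class": 16,
}

# Assumed denominator: UIMA's total lines of code from the metrics slide.
UIMA_LOC = 169_516


def smells_per_kloc(count: int, loc: int) -> float:
    """Return the number of smells per thousand lines of code."""
    return count / (loc / 1000)


if __name__ == "__main__":
    for smell, count in UIMA_SMELLS.items():
        print(f"{smell:20} {smells_per_kloc(count, UIMA_LOC):.3f}")
    total = sum(UIMA_SMELLS.values())
    print(f"{'Total':20} {smells_per_kloc(total, UIMA_LOC):.3f}")
```

With this denominator the UIMA column is reproduced to the precision shown. The GATE column is not reproduced exactly by the 228,454 total (747 / 228.454 is about 3.27 rather than 3.41), so a different line count may have been used there.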
7
The Number of Bugs
Detection tool      UIMA, GATE
FindBugs (2.0.0)    6178
PMD (5.0)           17981794
Lint4j (0.9.13)     84494
8
The Comparison of Bug Survival Curves
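The slides do not say how the survival curves were estimated. A common choice for this kind of data is the Kaplan-Meier estimator, so the sketch below uses it as an assumption, not as the authors' method. The bug lifetimes in the usage example are made up for illustration.

```python
from typing import List, Tuple


def kaplan_meier(durations: List[float], fixed: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) steps for bug lifetimes.

    durations[i] is how long bug i was observed (for example, days open).
    fixed[i] is True if the bug was fixed at that time, and False if it
    was still open when observation ended (right-censored).
    """
    # Survival only drops at times where at least one bug was fixed.
    event_times = sorted({t for t, f in zip(durations, fixed) if f})
    survival = 1.0
    curve = []
    for t in event_times:
        # Bugs still open (and still observed) just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Bugs fixed exactly at time t.
        fixes = sum(1 for d, f in zip(durations, fixed) if f and d == t)
        survival *= 1 - fixes / at_risk
        curve.append((t, survival))
    return curve


if __name__ == "__main__":
    # Hypothetical lifetimes in days; False marks bugs that were never fixed.
    days_open = [5, 10, 10, 20, 30]
    was_fixed = [True, True, False, True, False]
    for time, prob in kaplan_meier(days_open, was_fixed):
        print(f"day {time}: {prob:.2f} of bugs still open")
```

A curve that falls faster means bugs are fixed sooner, which is the property one would compare between the two frameworks.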
9
The Comparison of User Manuals
Contents compared for UIMA and GATE:
- Catalog
- Tutorial of manual
- Overview and characteristics of software product
- Installation and setup
- Introduction of product application
- Frequently Asked Questions (FAQ) ×
- Known issues and problems with the software ×
- Terms, concepts and their basic definitions in software ×
10
Conclusion
- Software design quality
- Software maintenance
- User's manual
UIMA is better than GATE.
11
Thank you!