1. Automated Natural Language Analysis of Requirements
Bob Ferguson, SEI
Giuseppe Lami, CNR
Sponsored by the U.S. Department of Defense. © 2005 by Carnegie Mellon University, Pittsburgh, PA 15213-3890.

2. (Graphic-only slide; no transcript text.)

3. Problem Statement
- Requirements and specifications are most often written in natural language.
- Inspections are an effective tool for defect identification and correction, but:
  - Inspections are time consuming and costly.
  - Inspections have been shown to identify 25-80% of defects, with a median of 57%.
- Can we improve the quality of requirements and specifications, and the efficiency of the process, by using an automated tool?

4. Why Natural Language?
- It is the most natural thing for us to do.
- It is generally understood by all parties (engineer, manager, user, sponsor, ...).
- Formal languages have been tried, but have succeeded only in very limited domains.
- Use cases and scenarios are incomplete and difficult to partition. For example, an industry standard may include a significant number of sentences, and the engineering staff may need several levels of detail.

5. What past efforts exist?
- NASA ARM: Automated Requirements Analysis tool.
- Tom Gilb, Competitive Engineering, 2005; describes "Planguage."
- Lightfoot, David, Formal Specification Using Z, Palgrave, 1998.
- TIGER: INCOSE.
- Formalize the language, or formalize the analysis?

6. Automation Assistance
Document analysis tools:
- Reduce cycle time and effort while producing better results than tedious manual review.
- Early detection and correction of simple but often costly errors lets analysts focus on more difficult problems.
(Diagram labels: Req. Docs; Evaluation criteria; Inspection or Review Process; Possibility for Automation.)

7. QuARS: Quality Analysis for Requirements and Specifications
An automated tool that takes natural-language input. Goals:
1. Reduce the potential for product defects resulting from problems in language usage.
2. Ease the editing and inspection burden on staff.
3. Facilitate the identification and analysis of similar and conflicting requirements.
Each of these goals contributes to improving the quality of requirements and specifications by enabling an improved process for defect identification.

8. Initial Tests
Two companies participated in the initial tests.
- Method, Company A: multiple versions of requirements documents that had previously been inspected were put through the QuARS tool. Any additional defects were analyzed, and each defect was traced back to its first occurrence.
- Method, Company B: a single requirements document was analyzed, and the results were compared to inspection results.

9. Company A Results – 1
A manufacturing concern that uses an external, independent consultant for inspection.
- Input: 2,396 statements.
- QuARS results: 692 identified defects; 6 hours of effort; 3 days of cycle time.
- Human (consultant) results: 279 identified defects; 10 business days; ~$6,000.

10. Company A Results – 2
- 20% of the sentences (484) were defective at the start of the inspection process; some sentences had multiple defects.
- QuARS processing time was approximately 6 hours, including the learning curve and the removal of false positives.
- Rate: 799 statements per hour.
- QuARS identified all non-graphical requirements defects that the human inspector identified.

11. Company B Results
An insurance company that uses a formal inspection process (flow: requirement document -> inspection process -> updated requirement document -> QuARS).
- Requirements statements: 574.
- 110 possible defects identified: 94 confirmed defects and 16 false positives, in 44 separately identifiable sentences.
- 8% of statements were defective post-inspection.

12. Benefit of Tool Usage
(Diagram comparing a tool-only path against a manual-only path through the inspection process.)
- Tool only (QuARS): 692 identified defects; 6 hours of effort; 3 days of cycle time.
- Manual only (consultant): 279 identified defects; 10 business days; ~$6,000.

13. Summary of Initial Results
QuARS shows significant promise for:
- Improving the quality of requirements documents.
- Reducing the cycle time to identify and remove requirements defects.
- Reducing cost and improving the efficiency and effectiveness of inspections.
It can also process other text-based documents, such as test cases and all types of specifications.

14. Basic Operation of QuARS
- Lexical analysis searches for words and phrases that are potentially defective (see the sketch below).
- Syntactical analysis uses sentence structure to search for additional problems.
- Readability analysis.
- A user-defined lexicon can be used to cluster and count requirements of a particular type (e.g., security).
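
A minimal sketch of how such a lexical pass could work, assuming a hand-built phrase lexicon; the word lists and category names below are illustrative, not QuARS's actual dictionaries:

```python
import re

# Illustrative lexicon only; QuARS's real dictionaries are larger and user-editable.
LEXICON = {
    "vague": ["clear", "easy", "useful", "adequate", "good", "bad"],
    "subjective": ["similar", "as fast as possible", "taking into account"],
    "optional": ["possibly", "eventually", "if possible", "if needed"],
}

def lexical_scan(sentences):
    """Yield (line_number, category, phrase) for each potentially defective wording."""
    for lineno, sentence in enumerate(sentences, start=1):
        lowered = sentence.lower()
        for category, phrases in LEXICON.items():
            for phrase in phrases:
                # Word-boundary match so "bad" does not fire inside "badge".
                if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                    yield lineno, category, phrase

requirements = ["The interface shall be easy to use.",
                "The report shall print monthly totals."]
for lineno, category, phrase in lexical_scan(requirements):
    print(f"line {lineno}: {category} wording: '{phrase}'")
```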

15. Definitions
- Lexicon: n., a dictionary-type listing of words and phrases for a selected purpose.
- Lexical: adj., relating to a lexicon.
- Syntax: n., the structure of a sentence according to the rules of grammar.
- Syntactical: adj., relating to the syntax or grammar of a language.
- Semantic: adj., relating to the meaning of text, often based on context and domain. (QuARS does not help with semantic analysis.)

16. QuARS Quality Model
Lexical analysis identifies words and phrases that are:
- Vague
- Subjective
- Implying choice or option
It also reports readability (Coleman-Liau; see the sketch below).
Syntactical analysis identifies:
- Weak phrases or verbs
- Multiplicity
- Implicit expressions
- Under-specification
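
The Coleman-Liau index named above is a published readability formula, so it can be sketched directly; the tokenization below is a simplification:

```python
import re

def coleman_liau(text):
    """Coleman-Liau readability index (approximates a U.S. grade level):
    CLI = 0.0588*L - 0.296*S - 15.8, where L is letters per 100 words
    and S is sentences per 100 words."""
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        return 0.0
    letters = sum(len(w) for w in words)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    L = 100.0 * letters / len(words)
    S = 100.0 * sentences / len(words)
    return 0.0588 * L - 0.296 * S - 15.8

print(round(coleman_liau("The system shall log every failed "
                         "authentication attempt within one second."), 1))
```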

17. Analysis Types – 1
Ambiguity: will the reader have a unique interpretation?
- Vague words and phrases: clear, easy, useful, adequate, good, bad, etc.
- Subjective wording: similar, having in mind, taking into account, as fast as possible.
Subjective usage often carries unexplored context. Without detailed familiarity with the organization, the business, and the individual's motives, such usage has a high probability of being misinterpreted.

18. Analysis Types – 2
Ambiguity (cont.)
- Implying a choice or option: possibly, eventually, if possible, if needed.
  - How will this choice be determined? Will the requirements change? Will we need a choice function?
- Implicit expressions: demonstrative adjectives such as "this" or "that" (e.g., "This report must have column totals for all dollar amounts."), and words like "above," "below," or "next."
  - If the sentence becomes separated from its antecedent, it will be impossible to understand.

19. Analysis Types – 3
- Weakness: may, should, can, could.
  - This is a form of under-specification; it suggests a need for a choice function or a future requirement (e.g., TBA).
- Under-specification: many words require a second noun to make their use specific (see the sketch below).
  - For example, "report," "flow," "access," and "function" are better written as "payroll-report," "control-flow," "write-access," and "check-distribution-function."
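
One way to flag under-specified nouns mechanically, assuming the hyphenated-compound convention shown above; the noun list is illustrative:

```python
import re

# Illustrative nouns that usually need a qualifying noun to be specific.
GENERIC_NOUNS = ["report", "flow", "access", "function"]

def underspecified(sentence):
    """Return generic nouns used bare, e.g. flag 'report' but accept 'payroll-report'."""
    hits = []
    lowered = sentence.lower()
    for noun in GENERIC_NOUNS:
        # Negative lookbehind for '-' skips hyphen-qualified compounds.
        if re.search(r"(?<!-)\b" + noun + r"\b", lowered):
            hits.append(noun)
    return hits

print(underspecified("The function shall produce the payroll-report."))
# -> ['function']  ('report' is qualified, so it is not flagged)
```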

20. Analysis Types – 4
- The multiplicity function identifies sentences with more than one subject, verb, or object (see the sketch below).
  - Example: The system will generate the "Payroll-report" and "City-tax-report" monthly.
- Multiplicity and implicitness are examples of syntactical analysis.
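
QuARS's own parser is not shown here, but the idea can be approximated with an off-the-shelf dependency parser. A rough sketch using spaCy (an assumption; requires `pip install spacy` and the `en_core_web_sm` model):

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # python -m spacy download en_core_web_sm

def multiplicity(sentence):
    """Count coordinated subjects/objects; a nonzero count suggests the
    sentence bundles several requirements into one."""
    doc = nlp(sentence)
    return sum(1 for tok in doc
               if tok.dep_ == "conj"
               and tok.head.dep_ in ("nsubj", "nsubjpass", "dobj", "pobj"))

s = 'The system will generate the "Payroll-report" and "City-tax-report" monthly.'
print(multiplicity(s))  # expected: 1 coordinated object -> candidate for splitting
```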

21. (Graphic-only slide; no transcript text.)

22. QuARS Output – 1: Subjectivity Analysis
Sample output:
  The line number: The Base Rate is always set as $0.02 per $100. This factor may be updated by Actuarial depending on market conditions.
  is defective because it contains the wording: depending on
- Number of evaluated sentences: 197
- Number of defective sentences: 9
- Defect rate: 4%

23. QuARS Output – 2: Implicit Analysis
Sample output:
  Line number 92: At a later date this field might be calculated based on actuarial criteria or from Risk Link Accumulation output.
  contains an implicit sentence: implicit determiner
- Number of evaluated sentences: 197
- Number of defective sentences: 6
- Defect rate: 3%
This particular statement was flagged twice, because the implicit usage is two-fold: "later" and "this."

24. Clustering Function
- Called "View Analysis."
- Construct a view dictionary of domain-related words, e.g. Security: {authorization, password, authentication, authorize, authenticate, secure-access, accessibility}.
- The "V" function in QuARS counts the total sentences in each section and the number of sentences in that section that include words from the requested lexicon (see the sketch below).
- While QuARS does not perform semantic analysis, this type of clustering facilitates analysis performed by humans.
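
A sketch of how such a view count could be computed, assuming requirements are already grouped by section; the simple substring match stands in for real tokenization:

```python
# Illustrative security view dictionary, taken from the slide above.
SECURITY_VIEW = ["authorization", "password", "authentication", "authorize",
                 "authenticate", "secure-access", "accessibility"]

def view_analysis(sections, view):
    """Per section: (total sentences, sentences containing a view word)."""
    results = {}
    for name, sentences in sections.items():
        hits = sum(1 for s in sentences
                   if any(word in s.lower() for word in view))
        results[name] = (len(sentences), hits)
    return results

document = {
    "Login": ["The user shall enter a password.",
              "The menu shall list recent files."],
    "Reports": ["The payroll-report shall print monthly."],
}
for section, (total, hits) in view_analysis(document, SECURITY_VIEW).items():
    print(f"{section}: {hits}/{total} sentences mention security terms")
```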

25. (Graphic-only slide; no transcript text.)

26. Creating a Lexicon (graphic-only slide; no transcript text.)

27. "False Positives"
Two types of false positives are most common:
1. QuARS identifies a possible defect, but the actual business usage is correct. Example: "commission" is usually vague, but it is specific in the insurance industry, where it is defined as compensation for a sale.
2. Implicit usage is sometimes acceptable, provided care is taken not to split the sentences into different parts of the document.

28. Removing False Positives
- Fix the lexicon: the dictionaries can be modified, and words can be added or deleted. Hence "commission" can be removed.
- Or flag the offending sentence as acceptable so that it is not shown in the report (see the sketch below).
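
Both fixes are simple to model. A sketch, assuming findings are (sentence_id, category, phrase) tuples; the data shapes are illustrative:

```python
# Fix 1: edit the lexicon so domain terms stop firing.
VAGUE_WORDS = {"clear", "easy", "commission"}
VAGUE_WORDS.discard("commission")   # specific in the insurance domain

# Fix 2: whitelist sentences a reviewer has accepted as-is.
ACCEPTED_SENTENCES = {42}

def filtered_report(findings):
    """Drop findings for sentences already accepted by a reviewer."""
    return [f for f in findings if f[0] not in ACCEPTED_SENTENCES]

print(filtered_report([(42, "vague", "easy"), (43, "vague", "clear")]))
# -> [(43, 'vague', 'clear')]
```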

29. Hiding False Positives (graphic-only slide; no transcript text.)

30. Tool Development Needs
Reporting:
- Consolidate defects by sentence instead of by defect type (see the sketch below). Currently, for example, all sentences having "vague words" are listed together.
- Use the requirement identifier to format the report.
Integration:
- Improve the user interface.
- Process other document types.
- Integrate with tools such as DOORS and CaliberRM.
Function:
- Possible extensions, such as identification of passive voice and indirect objects.
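
The by-sentence consolidation asked for under Reporting is straightforward to express. A sketch with hypothetical requirement IDs:

```python
from collections import defaultdict

def by_sentence(findings):
    """Regroup (req_id, category, phrase) findings so each requirement lists
    all of its defects together, instead of one list per defect type."""
    grouped = defaultdict(list)
    for req_id, category, phrase in findings:
        grouped[req_id].append((category, phrase))
    return dict(grouped)

findings = [("REQ-7", "vague", "easy"),
            ("REQ-7", "weak", "should"),
            ("REQ-9", "implicit", "this")]
for req_id, defects in sorted(by_sentence(findings).items()):
    print(req_id, defects)
```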

31. Planning: CNR*
- Discussing possibilities for near-term availability.
- Commercialize the tool: a third-party developer/integrator is sought, and discussions with tool vendors are under way.
- Continued research and development of the engine.
*Consiglio Nazionale delle Ricerche (Italian National Research Council)

32. Further Research – 1
Test QuARS in a high-maturity organization:
- Use orthogonal defect classification for escaped defects.
- What percentage of requirements defects might be avoided?
Such results are particularly useful: some high-maturity organizations report that more than 50% of fielded defects can be traced to defective requirements.

33. Further Research – 2
Test the use of QuARS in an acquisition setting. How much can we improve the RFP process?
- Preparation effort, cycle time, rework.
- Bidding analysis.
- Support for the bidding process in the PMO.

34. Further Research – 3
Benefits of using QuARS as part of the process:
- Simulation of secondary effects: does testing improve?
- If the mechanical errors are removed first, can human inspectors detect more of the semantic errors?
Inspectors have a couple of "bandwidth" limitations: in any one session, the two limiting factors are the size of the deliverable that can be processed in 2 hours and the maximum number of defects that can be discussed in 2 hours.

35. Further Research – 4
Develop suitable experiments for the "clustering" function.
- Consistency checking: can we more easily identify conflicting requirements? Have we duplicated a requirement?
- Completeness checking: for concerns like security and privacy, have we covered the bases within the different functional areas?

36. Proposed Process Model
- QuARS could be used for the first inspection, or used interactively as requirements are entered.
- Cycle time should be shorter, and fewer defects should escape.
- Does a process simulation show results related to our first slide?

37. (Graphic-only slide; no transcript text.)

38. References
- Lami, G., "An Automatic Tool for Improving the Quality of Software Requirements."
- Lami, G., et al., "An Automatic Quality Evaluation for Natural Language Requirements," matrix.iei.pi.cnr.it/FMT/WEBPAPER/P11RESFQ01.pdf
- Bucchiarone, A., "Quality Analysis of NL Requirements."

39. Contact Information
- Bob Ferguson, Software Engineering Institute
- Giuseppe Lami, Istituto di Scienza e Tecnologie dell'Informazione

