
1 Part 3 Real World Applications: SumTime-Mousam

2 In this lecture you learn (Dept. of Computing Science, University of Aberdeen)
SumTime-Mousam
–Knowledge acquisition
–Design
  Document planning
  Microplanning
  Realization
–Evaluation
  Post-edit
  End-user

3 Introduction
So far we studied
–Data analysis techniques
  Time series data
  Spatial data
–Visualization techniques
–NLG techniques
Now we will study
–SumTime-Mousam, a weather forecast text generation system
–HCE 3.0, a visual knowledge discovery tool

4 SumTime-Mousam
NLG system that automates the task of writing weather forecasts
–Developed in our department
Input: Numerical Weather Prediction (NWP) data
–Data samples for a few dozen parameters, every hour or every 3 hours, from two NWP models
Output: marine forecasts (forecasts for offshore oilrig applications)
Has been used by our industrial collaborator since June 2002
–Forecasts for 150 locations per day

5 Example

6 Example

7 Knowledge Acquisition (KA)
KA Tasks
–Think-aloud sessions
–Direct acquisition of knowledge
–Onsite observations
–Corpus analysis
–Collaborative prototype development

8 Corpus Description
SumTime-Meteo: parallel text-data corpus
Size: 1045 parallel text-data units
Unit
–NWP model data
–Human-written forecast text
Similar in concept to statistical MT (Machine Translation)
Naturally occurring
–written for oilrig staff in the North Sea
Distribution of the corpus
–Available in the public domain

9 Parallel Text - Data
WSW 10-15 increasing 17-22 by early morning, then gradually easing 9-14 by midnight.

10 Corpus Analyses
Meanings of time phrases
–Meanings of time phrases in terms of numerical data
–required for lexical choice in summarization
No standard time phrase mappings exist
Numerical time values not mentioned in forecasts

11 Alignment
Step 1
–Parsing the forecast texts
  parser tuned for forecast text syntax
  break the text into phrases
  extract information such as wind speed and wind direction
  parser carried forward values for the missing fields (shown later in the example)

12 Example
SSW 12-16 BACKING ESE 16-20 IN THE MORNING, BACKING NE EARLY AFTERNOON THEN NNW 24-28 LATE EVENING
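The parsing step can be illustrated with a minimal sketch. It is not the SumTime-Mousam parser: the verb list, the regular expressions, and the function name `parse_forecast` are assumptions made for illustration. It breaks the example text into phrases, extracts direction, speed range, and time phrase, and carries the previous speed forward when a phrase omits it.

```python
import re

# Illustrative vocabulary (an assumption, not the system's actual lexicon).
VERBS = "BACKING|VEERING|INCREASING|EASING"
DIRECTIONS = "N|NNE|NE|ENE|E|ESE|SE|SSE|S|SSW|SW|WSW|W|WNW|NW|NNW"

# Phrases are separated by commas, by THEN, or by the start of a new verb.
SPLIT_RE = re.compile(rf",\s*|\s+THEN\s+|\s+(?=(?:{VERBS})\b)")

# Every field is optional; whatever is left over is treated as the time phrase.
PHRASE_RE = re.compile(
    rf"^(?:(?P<verb>{VERBS})\s+)?"
    rf"(?:(?P<dir>{DIRECTIONS})\b\s*)?"
    rf"(?:(?P<lo>\d+)-(?P<hi>\d+)\s*)?"
    rf"(?P<time>.*)$"
)

def parse_forecast(text):
    """Return (verb, direction, (lo, hi), time_phrase) tuples, one per phrase."""
    states = []
    last_dir, last_speed = None, None
    cleaned = text.upper().strip().rstrip(".")
    for phrase in filter(None, SPLIT_RE.split(cleaned)):
        m = PHRASE_RE.match(phrase.strip())
        # Carry forward the previous value when a field is missing.
        direction = m.group("dir") or last_dir
        if m.group("lo"):
            speed = (int(m.group("lo")), int(m.group("hi")))
        else:
            speed = last_speed
        states.append((m.group("verb"), direction, speed, m.group("time").strip()))
        last_dir, last_speed = direction, speed
    return states

EXAMPLE = ("SSW 12-16 BACKING ESE 16-20 IN THE MORNING, "
           "BACKING NE EARLY AFTERNOON THEN NNW 24-28 LATE EVENING")

for state in parse_forecast(EXAMPLE):
    print(state)
```

In this sketch the phrase BACKING NE EARLY AFTERNOON has no speed of its own, so it inherits 16-20 from the preceding phrase.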

13 Alignment (2)
Step 2
–Associate each phrase with an entry in the input data set
  43% of the phrases matched with a single entry (without ambiguity)
  heuristics used for improving the accuracy of alignment to 70%
  Further improvements in alignment under investigation

14 Example (2)
Example Phrase: VEERING SW 10-14 BY EVENING
Input Data: 1800 SW
By evening ---------> 1800 hours
Example Phrase: BACKING ESE 16-20 IN THE MORNING
Input Data: 0600 ESE 18, 0900 ESE 16
In the morning -------------> 0600 hours
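A minimal sketch of Step 2, using the second example. The slides do not say which heuristics the system uses to resolve ambiguous matches, so the tie-break here (pick the earliest matching record) is an assumption chosen only because it reproduces the 0600 result shown; the function names are hypothetical.

```python
def candidates(direction, speed_range, records):
    """Input records whose direction matches and whose speed lies in the phrase's range."""
    lo, hi = speed_range
    return [r for r in records if r[1] == direction and lo <= r[2] <= hi]

def align(direction, speed_range, records):
    """Pick one record for the phrase, or None if nothing matches.

    Assumed heuristic: when several records match, take the earliest one.
    """
    matches = candidates(direction, speed_range, records)
    if not matches:
        return None
    return min(matches, key=lambda r: r[0])

# (time, direction, speed in knots) for the phrase BACKING ESE 16-20 IN THE MORNING
RECORDS = [("0600", "ESE", 18), ("0900", "ESE", 16)]

print(candidates("ESE", (16, 20), RECORDS))  # both records are candidates (ambiguous)
print(align("ESE", (16, 20), RECORDS))       # the assumed heuristic picks 0600
```

Collecting the chosen time for each time phrase over the whole corpus is what would give a numerical meaning to phrases such as "in the morning".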

15 Results

16 Limitations of Corpus Analysis
Quality of knowledge acquired
–good in some cases
–poor in many cases
–required clarifications from experts
Useful when used along with other KA techniques

17 KA Methodology
[Diagram labels: Directly Ask Experts for Knowledge; Structured KA with Experts; Corpus Analysis; Expert Revision; Initial Prototype; Initial Version of Full System; Final System]

18 SumTime-Mousam: Architecture
Document planning
–content selection and organisation
Microplanning
–selecting words and phrases
–ellipsis
Realisation
–output text using the words and phrases by applying grammar rules
Control Data
–derived from end user profile
[Diagram labels: Input Data; Doc. Planning; Micro Planning; Realisation; Output Text; Control Data]

19 Content Selection
What data items are worth picking up for the summary?
–Reasoning from first principles: no detailed user model
–Reusing data analysis techniques used by KDD community
  Attractive but not developed for communication
  Adapting data analysis techniques to suit needs of communication using the Gricean Maxims

20 Data Analysis
Expert’s View
–Step Method
–Report changes above thresholds (significant changes)
Corpus View
–Segmentation Method
–Report changes in slopes / report trends

21 Example
MAGNUS / THISTLE / NW HUTTON, EAST OF SHETLAND
day      hour  wind dir  wind speed (knots)
20-1-01  6     S         4
20-1-01  9     S         6
20-1-01  12    S         7
20-1-01  15    S         10
20-1-01  18    S         12
20-1-01  21    S         16
21-1-01  0     S         18
FORECAST FOR 06-24 GMT, 20 Jan 2001: S 02-06 INCREASING 16-20 BY EVENING

22 Expert’s View: Step Model
S 3-8 INCREASING 8-13 BY AFTERNOON AND 13-18 BY EVENING.
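The step method can be sketched as a simple threshold filter. The wind speeds below are the ones from the slide 21 example (hour 24 stands for 0000 on the following day); the 5-knot threshold and the function name are assumptions, since the slides do not give the forecasters' actual thresholds.

```python
def step_method(samples, threshold):
    """Keep the first sample, then every sample whose speed differs from the
    last reported speed by at least `threshold` knots."""
    reported = [samples[0]]
    for hour, speed in samples[1:]:
        if abs(speed - reported[-1][1]) >= threshold:
            reported.append((hour, speed))
    return reported

# (hour, wind speed in knots); direction is S throughout and is omitted.
SAMPLES = [(6, 4), (9, 6), (12, 7), (15, 10), (18, 12), (21, 16), (24, 18)]

print(step_method(SAMPLES, threshold=5))
```

With this assumed threshold the sketch reports three wind states, at 0600, 1500, and 2100, which has the same shape as the three-state forecast above (initial state, afternoon, evening).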

23 Corpus View: Segmentation Model
S 3-8 INCREASING 15-20 BY MIDNIGHT.
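A minimal sketch of linear segmentation on the slide 21 wind speeds. The slides do not specify which segmentation algorithm the system uses, so a simple top-down split is used here as a stand-in; the tolerance values and function names are assumptions. Each segment joins two observed samples (interpolation), and a segment is split at its worst-fitting sample when the vertical error exceeds the tolerance.

```python
def interpolate(p0, p1, hour):
    """Speed at `hour` on the straight line joining samples p0 and p1."""
    (t0, v0), (t1, v1) = p0, p1
    return v0 + (v1 - v0) * (hour - t0) / (t1 - t0)

def segment(points, tolerance):
    """Top-down segmentation: return the samples that end each linear segment."""
    first, last = points[0], points[-1]
    worst_i, worst_err = None, 0.0
    for i in range(1, len(points) - 1):
        hour, speed = points[i]
        err = abs(speed - interpolate(first, last, hour))
        if err > worst_err:
            worst_i, worst_err = i, err
    if worst_i is None or worst_err <= tolerance:
        return [first, last]
    left = segment(points[:worst_i + 1], tolerance)
    right = segment(points[worst_i:], tolerance)
    return left[:-1] + right  # the split point appears once

# (hour, wind speed in knots); hour 24 stands for 0000 on the next day.
SAMPLES = [(6, 4), (9, 6), (12, 7), (15, 10), (18, 12), (21, 16), (24, 18)]

print(segment(SAMPLES, tolerance=2.0))
print(segment(SAMPLES, tolerance=1.0))
```

With the looser tolerance the whole period is one rising trend with two wind states (start and midnight), which mirrors the shape of the forecast above; the tighter tolerance adds a breakpoint at 1200.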

24 Gricean Maxims (Grice 1975)
Maxim of Quality: Try to make your contribution one that is true. More specifically:
–Do not say what you believe to be false.
–Do not say that for which you lack adequate evidence.
Maxim of Quantity:
–Make your contribution as informative as is required (for the current purposes of the exchange).
–Do not make your contribution more informative than is required.
Maxim of Relevance: Be relevant.
Maxim of Manner: Be perspicuous. More specifically:
–Avoid obscurity of expression.
–Avoid ambiguity.
–Be brief.
–Be orderly.

25 Application of Gricean Maxims - Example
Maxim of Quality
–Try to report true values from the input data
–Use linear interpolation instead of linear segmentation
–Uncertainty in the input data needs to be communicated to the user

26 Sample Data

27 Linear Regression Vs Linear Interpolation

28 Linear Regression Vs Linear Interpolation (2)
Linear Regression
–S 03-07 INCREASING 16-20 BY MIDNIGHT
Linear Interpolation
–S 06-10 INCREASING 18-22 BY MIDNIGHT
Human Written Forecast
–S 06-10 INCREASING 18-22 BY MIDNIGHT
Although linear regression looks better visually, forecasters do not use it.
Uncertainty
–Speed values are mentioned as ranges, e.g. 06-10 and 18-22
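The difference between the two fits can be sketched in a few lines. The sample data on slide 26 is not reproduced in this transcript, so the sketch reuses the wind speeds from the slide 21 example as stand-in data; the function names are hypothetical. A least-squares line generally starts and ends at values that were never observed, whereas interpolation reports the observed end values, which is what the Maxim of Quality asks for.

```python
def least_squares_line(points):
    """Ordinary least-squares slope and intercept for (hour, speed) points."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in points)
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t

def regression_endpoints(points):
    """Fitted speeds at the first and last hour of the period."""
    slope, intercept = least_squares_line(points)
    return tuple(slope * t + intercept for t in (points[0][0], points[-1][0]))

def interpolation_endpoints(points):
    """Observed speeds at the first and last hour of the period."""
    return points[0][1], points[-1][1]

# Stand-in data: (hour, wind speed in knots) from the slide 21 example.
SAMPLES = [(6, 4), (9, 6), (12, 7), (15, 10), (18, 12), (21, 16), (24, 18)]

print(regression_endpoints(SAMPLES))
print(interpolation_endpoints(SAMPLES))
```

On this stand-in data the regression line runs from about 3.25 to about 17.6 knots, neither of which appears in the input, while interpolation reports the observed 4 and 18 knots.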

29 Intrinsic Evaluation of Content Determination
Metrics
–Short: Size (Accessibility)
–Accurate: Error (Informativeness)
Size Computation
–measured at the conceptual level
–number of wind states
Error Computation
–Vertical distance from the line of approximation
–combined error in wind speed and wind direction
–normalized
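The two metrics can be sketched for wind speed alone. The slides do not say how the speed and direction errors are combined or how the result is normalized, so this sketch covers speed only and normalizes by the observed speed range; both choices, and all function names, are assumptions. The data is again the slide 21 example.

```python
def approximate(summary, hour):
    """Speed at `hour` on the piecewise-linear approximation through `summary`."""
    for (t0, v0), (t1, v1) in zip(summary, summary[1:]):
        if t0 <= hour <= t1:
            return v0 + (v1 - v0) * (hour - t0) / (t1 - t0)
    raise ValueError(f"hour {hour} is outside the summary period")

def size(summary):
    """Size metric: the number of wind states mentioned."""
    return len(summary)

def normalized_error(samples, summary):
    """Mean vertical distance from the approximation, divided by the
    observed speed range (assumed normalization)."""
    speeds = [v for _, v in samples]
    spread = max(speeds) - min(speeds)
    mean_dist = sum(abs(v - approximate(summary, t)) for t, v in samples) / len(samples)
    return mean_dist / spread if spread else 0.0

SAMPLES = [(6, 4), (9, 6), (12, 7), (15, 10), (18, 12), (21, 16), (24, 18)]
TWO_STATE_SUMMARY = [(6, 4), (24, 18)]  # one rising segment, start to midnight

print(size(TWO_STATE_SUMMARY), normalized_error(SAMPLES, TWO_STATE_SUMMARY))
```

A shorter summary has a smaller size but usually a larger error, which is the trade-off the two metrics are meant to expose.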

30 Results of Evaluation
Segmentation produces shorter summaries without losing accuracy
Details
–In 16.5% of cases segmentation is better than step in both size and error
–In 0.56% of cases the step method is better than segmentation in both size and error
–In 2.5% of cases segmentation is better than step error-wise but worse size-wise
–In 32% of cases segmentation is better than step size-wise but worse error-wise
–In 31% of cases segmentation is better than step error-wise but equal size-wise

31 Micro-planning & Realization
Based on parallel corpus analysis (described earlier) and expert KA/revision
Details in papers at
–www.csd.abdn.ac.uk/research/sumtime/papers.html

32 SumTime-Mousam at Weathernews (UK) Ltd.
[Diagram labels: Data 1; Pre-edited Text; Edited Data; Text 1; Marfors Data Editor; SumTime-Mousam; Marfors Text Editor; NWP Data; Post-edited Text]

33 Post-edit Evaluation
Total number of forecasts analysed = 2728
2728 texts divided into 73041 phrases
7608 (10%) phrases could not be aligned
Alignment failures imply that forecasters are not happy with our content determination
–Which is dependent on a process called segmentation
Forecasters seem to perform more sophisticated reasoning than simple segmentation

34 Analysis Results (1)
Out of the successfully aligned phrases
–43914 phrases matched perfectly
–21519 phrases are mismatches
Detailed analysis of the mismatches
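The phrase counts on slides 33 and 34 are mutually consistent, which a short check makes explicit. The sketch uses only the figures stated on the slides; the variable and function names are illustrative.

```python
TOTAL_PHRASES = 73041   # phrases in the 2728 post-edited forecasts
UNALIGNED = 7608        # phrases that could not be aligned
PERFECT = 43914         # aligned phrases that matched perfectly
MISMATCHED = 21519      # aligned phrases that were mismatches

aligned = TOTAL_PHRASES - UNALIGNED

def pct(part, whole):
    """Percentage of `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)

# The aligned phrases split exactly into perfect matches and mismatches.
print(aligned, aligned == PERFECT + MISMATCHED)
print(pct(UNALIGNED, TOTAL_PHRASES))  # share of all phrases left unaligned
print(pct(PERFECT, aligned))          # share of aligned phrases left unchanged
print(pct(MISMATCHED, aligned))       # share of aligned phrases that were edited
```

So roughly two thirds of the aligned phrases were accepted by forecasters without change, and the remaining third is what the detailed mismatch analysis examines.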

35 Analysis Results (2)
The pie chart shows the results of phrase level comparisons
The bar chart shows the detailed analysis of the mismatched phrases

36 End-user Evaluation
73 end-users (oil company staff supporting offshore oilrigs) participated in this evaluation
The evaluation used forecasts produced by the following three methods
–human-written weather forecasts
–SumTime-Mousam generated weather forecasts
–SumTime-Mousam expressing human-selected content
Each participant completed a questionnaire that has two parts
–Part 1 showed a forecast produced by one of the above three methods (anonymous)
  Participant is required to answer comprehension questions based on the forecast
–Part 2 showed any two forecasts from the above three methods (anonymous)
  Participant specified his/her preference for one of the two forecasts
The main result
–end-users consider the SumTime-Mousam generated output linguistically better than human-written forecasts
–Content of SumTime-Mousam is not as good as human-selected content

37 Conclusion
SumTime-Mousam is the result of knowledge obtained from
–several knowledge acquisition studies
  Expert based
  Corpus based
–several evaluation studies
  Intrinsic evaluation
  Post-edit evaluation
  End-user evaluation
The development of SumTime-Mousam went through many cycles
Building novel technology requires an iterative approach with multiple KA and evaluation studies

