Text Analysis Using Automated Language Translators
CDT John Stanford, MAJ Ian McCulloh


Agenda
Overview and Hypothesis
Literature Review
Motivation (Radio Address Case Study)
Arabic Translation Data
Conclusions and Recommendations

Overview and Hypothesis
Text analysis is a useful tool for gathering intelligence.
A language barrier makes text analysis harder in non-English-speaking regions.
Hiring human translators to translate texts into English is slow, expensive, and a possible security risk.
Hypothesis: Output from automated machine translators such as the Forward Area Language Converter (FALCon) is difficult for the average person to understand, but is just as useful for text analysis as human-translated text.

Literature Review
This project relates to two ARL projects: FALCon and the ARL Dynamic Network Analysis Lab.
Language can be modeled mathematically as a network of concepts using an adjacency matrix (Sowa, 1984).
Preprocessing steps such as stemming, deletion, and thesaurus application prepare a text for analysis (Carley and Diesner, 2004).
AutoMap, being developed by Carnegie Mellon University, inputs texts and outputs adjacency matrices.
ORA, also being developed by CMU, inputs the adjacency matrices and outputs the mental models (Carley and Reminga, 2004).
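The preprocessing steps and the adjacency-matrix model named in the literature review can be illustrated with a minimal sketch. This is not AutoMap's algorithm: the delete list, the thesaurus, the crude suffix-stripping stemmer, and the sliding-window rule for linking concepts are all assumptions chosen only to make the idea concrete.

```python
import re

# Hypothetical toy resources; a real study would use much larger lists.
DELETE_LIST = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "on"}
THESAURUS = {"assault": "attack", "strike": "attack"}  # variant -> key concept


def stem(word):
    """Very crude suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenize, delete noise words, stem, then apply the thesaurus."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [t for t in tokens if t not in DELETE_LIST]
    stemmed = [stem(t) for t in kept]
    return [THESAURUS.get(t, t) for t in stemmed]


def adjacency_matrix(concepts, window=2):
    """Link each concept to the concepts in the next `window - 1` positions."""
    vocab = sorted(set(concepts))
    index = {c: i for i, c in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for i, c in enumerate(concepts):
        for other in concepts[i + 1 : i + window]:
            if other != c:
                # Undirected link: update both halves of the matrix.
                matrix[index[c]][index[other]] += 1
                matrix[index[other]][index[c]] += 1
    return vocab, matrix


# Invented example sentence, not taken from the study's data.
concepts = preprocess("The soldiers attacked the village and the strikes continued.")
vocab, matrix = adjacency_matrix(concepts)
```

Here `vocab` labels the rows and columns of `matrix`, and each cell counts how often two concepts appeared next to each other after preprocessing. A larger `window` links concepts that are further apart.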

Text Analysis Process

Radio Address Study
94 of the President’s weekly radio addresses analyzed.
Period covered: from just after Sep 11th to after the beginning of OIF (15 Sep 2001 to 21 June 2003).
Concept of ‘violence’ plotted on a timeline; high occurrence after Sep 11th and leading up to OIF.
[Timeline figure. Annotated events: George Bush speaks to UN General Assembly; 20 MAR, United States invades Iraq.]
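Plotting a concept on a timeline amounts to counting its occurrences in each dated text. The sketch below shows one way this could be done. The term set standing in for the concept of 'violence' is hypothetical, and the two texts are invented placeholders, not quotations from the radio addresses.

```python
from collections import Counter
from datetime import date

# Hypothetical variants of the concept; the study's actual thesaurus is not given.
VIOLENCE_TERMS = {"violence", "attack", "terror", "war"}


def concept_count(text, terms):
    """Count how many words in the text belong to the concept's term set."""
    words = Counter(w.strip(".,;:!?\"'").lower() for w in text.split())
    return sum(words[t] for t in terms)


def timeline(addresses, terms):
    """Return (date, count) pairs sorted by date, ready for plotting."""
    return sorted((d, concept_count(t, terms)) for d, t in addresses.items())


# Invented placeholder texts keyed by date.
addresses = {
    date(2002, 6, 1): "Our economy is growing and jobs are returning.",
    date(2001, 9, 15): "This week we saw terror and an attack on our nation.",
}
series = timeline(addresses, VIOLENCE_TERMS)
```

The resulting `series` can be passed to any plotting tool, with dates on the horizontal axis and counts on the vertical axis.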

Arabic Text Analysis
Arabic translated using CyberTrans, part of the FALCon package.
22 Arabic articles from the Department of State’s news site analyzed (US Dept of State, 2006).

Analysis Results
Top concepts for the two methods of translation are the same in 16 of the 22 articles.
The top concept in the human-translated text is among the top three machine-translated concepts for all articles.
When the methods differ, the human translation isn’t necessarily better.
[Figure panels: Human, Machine]
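The two agreement checks reported above (same top concept, and human top concept within the machine top three) can be expressed as a short comparison. This is a sketch under assumptions: it takes each article as a non-empty list of already-preprocessed concepts, ranks concepts by raw frequency, and uses invented concept lists rather than the study's data.

```python
from collections import Counter


def top_concepts(concepts, n):
    """Return the n most frequent concepts, most frequent first."""
    return [c for c, _ in Counter(concepts).most_common(n)]


def compare(human, machine, k=3):
    """For one article, check top-concept agreement between two translations."""
    human_top = top_concepts(human, 1)[0]
    machine_top = top_concepts(machine, k)
    return {
        "same_top": human_top == machine_top[0],
        "human_top_in_machine_top_k": human_top in machine_top,
    }


# Invented concept lists for a single hypothetical article.
human = ["trade", "trade", "trade", "aid", "aid", "peace"]
machine = ["aid", "aid", "aid", "trade", "trade", "peace"]
result = compare(human, machine)
```

Running `compare` over every article and tallying the two flags would give counts of the same kind as those on the slide. Ties in frequency are broken by first occurrence, which is a simplification.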

Conclusions and Recommendations
Automated text analysis makes it fast and economical to look at trends in local publications of strategically significant regions over either time or space.
Detailed statistical analysis must be done on this data.
Intelligence agencies that have access to large volumes of REDFOR data should run this kind of text analysis to verify that it works as well on REDFOR data as on BLUFOR data.
FALCon development should continue and possibly be expanded to other languages such as Farsi.

Works Cited
Bush, George. ( ). “President Bush’s Radio Addresses by date and topic.” Washington, DC: Office of the Press Secretary. Available from.
Carley, Kathleen and Diesner, Jana. (2004). Revealing Social Structure from Texts: Meta-Matrix Text Analysis as a novel method for Network Text Analysis. Causal Mapping for Information Systems and Technology Research: Approaches, Advances, and Illustrations. Harrisburg, PA: Idea Group Publishing.
Sowa, J.F. (1984). Conceptual Structures: Information Processing in Mind and Machine. Reading, MA: Addison-Wesley.
US Dept of State. (2006). “News from Washington.” Washington, DC: Office of the Press Secretary. Available from.

Questions?
Dept of Mathematical Sciences, United States Military Academy
Dynamic Network Analysis Lab, Army Research Lab