Giuseppe Attardi Dipartimento di Informatica Università di Pisa


Project Topics

Deep Learning Tokenizer
The Depling 2016 challenge requires a tokenizer for any of the Universal Dependencies treebanks at: http://universaldependencies.org
Choose one language.
Build a DL tokenizer using Keras, based on the approach of: Valerio Basile, Johan Bos and Kilian Evang. A General-Purpose Machine Learning Method for Tokenization and Sentence Boundary Detection (2013), http://gmb.let.rug.nl/elephant/
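
A common way to turn tokenization into a learning problem is to label every character of the raw text, so that a sequence model can be trained on the labels. The sketch below assumes a four-label scheme (S = first character of a sentence, T = first character of any other token, I = inside a token, O = outside any token); the function name `char_labels` and the input format are hypothetical, chosen only for illustration, and the code produces training labels rather than the Keras model itself.

```python
def char_labels(text, sentences):
    """Label each character of `text` as S, T, I or O.

    `sentences` is a list of sentences, each a list of token strings,
    given in the order in which they occur in `text` (assumed format).
    """
    labels = ["O"] * len(text)  # default: outside any token (e.g. whitespace)
    pos = 0
    for sentence in sentences:
        for i, token in enumerate(sentence):
            # Locate the token at or after the current position;
            # raises ValueError if tokens and text are misaligned.
            start = text.index(token, pos)
            labels[start] = "S" if i == 0 else "T"
            for k in range(start + 1, start + len(token)):
                labels[k] = "I"
            pos = start + len(token)
    return "".join(labels)


text = "Hi there. Bye!"
sentences = [["Hi", "there", "."], ["Bye", "!"]]
print(text)
print(char_labels(text, sentences))  # SIOTIIIITOSIIT
```

The label string has the same length as the text, so the pair can be fed to a character-level tagger as input and target sequences.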

Deep Learning POS for UD
The Depling 2016 challenge requires a POS tagger for any of the Universal Dependencies treebanks at: http://universaldependencies.org
Choose one language.
Build a DL POS tagger using a neural network, for example an LSTM that uses word embeddings and possibly character embeddings.

Convolutional Networks for Sentiment Analysis
SemEval annotated data: http://alt.qcri.org/semeval2017/task4/index.php?id=results
Unannotated data: 50 million tweets (ask Attardi)
Code: DeepNL, https://github.com/attardi/deepnl
Article: A. Severyn, A. Moschitti. UNITN: Training Deep Convolutional Neural Network for Twitter Sentiment Classification

POS tagging using Word Embeddings
Data: https://github.com/UniversalDependencies/UD_Italian-PoSTWITA
Embeddings: http://tanl.di.unipi.it/embeddings/
Article: K. Stratos, M. Collins. Simple Semi-Supervised POS Tagging. http://www.cs.columbia.edu/~stratos/research/naacl15semipos.pdf

Contextual Embeddings
See paper: G. Attardi. Representation of Word Sentiment, Idioms and Senses. Proc. of 6th Italian Information Retrieval Workshop (IIR 2015). Cagliari, 2015.
Use text from Wikipedia.
Extractor: https://github.com/attardi/wikiextractor

Entity Recognition and Linking
See task description and data at: http://neel-it.github.io

Question Answering from FAQ
See task description at: http://qa4faq.github.io

Context-aware Spell Checker
Words do not appear in isolation; they always have a context.
Implement a spell checker that, given a sentence with one misspelled word, identifies that word and suggests a correction.
Bonus: use of POS tagging; correct more than one error.
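
One possible baseline, sketched here under strong assumptions: treat any out-of-vocabulary word as the misspelling, generate candidates at edit distance 1, and rank them by how often they co-occur with the neighbouring words in a training corpus. The toy `CORPUS`, the function names and the scoring rule are all hypothetical; a real solution would need a large corpus and would also have to handle real-word errors.

```python
from collections import Counter

LETTERS = "abcdefghijklmnopqrstuvwxyz"

# Toy training corpus, for illustration only.
CORPUS = [
    "the cat sat on the mat",
    "the man ate a hat",
    "a cat ate the rat",
]


def edits1(word):
    """All strings at edit distance 1 from `word` (with adjacent swaps)."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + swaps + replaces + inserts)


def train(corpus):
    """Count unigrams and bigrams, with sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def correct(sentence, unigrams, bigrams):
    """Return (index, suggestion) for the first unknown word, or None."""
    words = ["<s>"] + sentence.lower().split() + ["</s>"]
    for i in range(1, len(words) - 1):
        if words[i] in unigrams:
            continue
        candidates = [w for w in edits1(words[i]) if w in unigrams]
        if not candidates:
            return i - 1, None
        left, right = words[i - 1], words[i + 1]
        # Rank by bigram counts with both neighbours, then by frequency.
        best = max(
            candidates,
            key=lambda w: (bigrams[(left, w)] + bigrams[(w, right)], unigrams[w]),
        )
        return i - 1, best
    return None


unigrams, bigrams = train(CORPUS)
print(correct("the man ate a xat", unigrams, bigrams))       # (4, 'hat')
print(correct("the xat sat on the mat", unigrams, bigrams))  # (1, 'cat')
```

The same misspelling "xat" is corrected differently in the two sentences, which is the point of using context: the candidate set is identical, only the neighbours change.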

Wikipedia Related Pages
Download the Wikipedia dump of a language of your choice.
Use the extractor: https://github.com/attardi/wikiextractor
Implement and compare a "related pages" function based on traditional bag-of-words representations and on doc2vec representations.
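
For the bag-of-words side of the comparison, a minimal sketch could rank pages by cosine similarity between raw term-count vectors. The `pages` dictionary stands in for extracted Wikipedia text, and the names `bow`, `cosine` and `related` are hypothetical; TF-IDF weighting and proper tokenization are left out for brevity.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector: term -> count, using whitespace tokenization."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def related(title, pages, k=2):
    """Titles of the k pages most similar to `title`, best first."""
    query = bow(pages[title])
    scores = {
        t: cosine(query, bow(text)) for t, text in pages.items() if t != title
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Stand-in for extracted Wikipedia pages (illustrative text only).
pages = {
    "Pisa": "city in tuscany italy with a leaning tower",
    "Florence": "city in tuscany italy famous for art",
    "LSTM": "recurrent neural network architecture for sequences",
}
print(related("Pisa", pages, k=1))  # ['Florence']
```

The doc2vec variant can reuse `related` unchanged if `bow` and `cosine` are replaced by a learned document vector and a dense cosine, which keeps the comparison between the two representations fair.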

Negation/Speculation Extraction
Determine the scope of negative or speculative statements, for example: "The lyso-platelet had no effect"; "MnlI-AluI could suppress the basal-level activity".
Approach: one classifier for identifying cues, one classifier to determine scope.
Data: BioScope dataset, http://rgai.inf.u-szeged.hu/index.php?lang=en&page=bioscope

Corpus of Product Reviews
Download reviews from online shops.
Classify them as positive/negative according to the number of stars.
Train a classifier to assign the score.
Bonus: use of a neural network for the classifier; use of pretrained embeddings.
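
The labeling step can be sketched as a simple mapping from star ratings to classes. The threshold used here (ratings above the midpoint are positive, below are negative, and the midpoint itself is discarded as neutral) is an assumption, not something the task prescribes, and `label_from_stars` is a hypothetical helper name.

```python
def label_from_stars(stars, max_stars=5):
    """Map a star rating to 'positive', 'negative', or None (neutral)."""
    middle = (max_stars + 1) / 2  # e.g. 3 on a 1-5 scale (assumed cut-off)
    if stars > middle:
        return "positive"
    if stars < middle:
        return "negative"
    return None  # neutral reviews are dropped from the training set


# Illustrative reviews as (text, stars) pairs.
reviews = [("great product", 5), ("it broke in a week", 1), ("it is ok", 3)]
dataset = [
    (text, label_from_stars(stars))
    for text, stars in reviews
    if label_from_stars(stars) is not None
]
print(dataset)
```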

Relation Extraction
Exploit word embeddings as features, plus extra hand-coded features.
Use the Factor Based Compositional Embedding Model (FCM): http://www.cs.jhu.edu/~mrg/publications/finere-naacl-2015.pdf
Data: SemEval 2014 Relation Extraction data

Entity Linking with Embeddings
Experiment with the technique of: R. Blanco, G. Ottaviano, E. Meij. 2014. Fast and Space-Efficient Entity Linking in Queries. labs.yahoo.com/_c/uploads/WSDM-2015-blanco.pdf

Extraction of Semantic Hierarchies
Use word embeddings as a measure of semantic distance.
Use Wikipedia as the source of text.
http://ir.hit.edu.cn/~jguo/papers/acl2014-hypernym.pdf
Example hierarchy: Organism > Plant > Ranunculaceae > Aconitum

Evalita 2018
Tasks from Evalita 2018: aspect-based sentiment analysis, emoji prediction, hate speech detection, irony detection in Twitter, automatic misogyny identification, …
Also check previous editions.

Deep Learning Applications
Entity Hierarchy Embeddings: http://www.cs.cmu.edu/~zhitingh/data/acl15entity.pdf
Character RNNs for text generation: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Morphology: Better Word Representations with Recursive Neural Networks for Morphology, Luong et al.
Polysemous words: Improving Word Representations Via Global Context And Multiple Word Prototypes, Huang et al. 2012
Natural language inference (logic)
Question answering
Automatic image captioning