The Age of the Word (L’età della parola). Giuseppe Attardi, Dipartimento di Informatica, Università di Pisa. ESA SoBigData, Pisa, 24 February 2015.


1 The Age of the Word (L’età della parola)
Giuseppe Attardi, Dipartimento di Informatica, Università di Pisa
ESA SoBigData, Pisa, 24 February 2015

2 Natural Language Learning
- Children learn to speak naturally, by talking with others
- Teach computers to learn language in a similarly natural way

3 Statistical Machine Learning
- Training on large document collections
- Requires ability to process Big Data
  - If we had used the same algorithms 10 years ago, they would still be running
- The Unreasonable Effectiveness of Big Data

4 Example: Machine Translation
Arabic to English, five-gram language models of varying size
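As a reminder of what a five-gram language model estimates, here is a minimal sketch that counts n-grams and computes an unsmoothed maximum-likelihood probability of a word given its four-word context. The function names and the toy corpus are illustrative assumptions, not taken from the talk; the systems referred to on the slide would use smoothing and vastly larger corpora.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_prob(tokens, context, word):
    """Unsmoothed estimate of P(word | context) from raw counts.

    With a 4-word context this is a five-gram model. Real systems add
    smoothing or back-off, which this sketch deliberately omits.
    """
    n = len(context) + 1
    grams = ngram_counts(tokens, n)
    contexts = ngram_counts(tokens, n - 1)
    denom = contexts[tuple(context)]
    return grams[tuple(context) + (word,)] / denom if denom else 0.0

# Toy corpus (hypothetical), just large enough to contain repeated contexts.
toy_corpus = "the cat sat on the mat and the cat sat on the sofa".split()
p = ngram_prob(toy_corpus, ["cat", "sat", "on", "the"], "mat")
print(p)
```

The point of the slide is that the quality of such estimates keeps improving as the training corpus grows, since more five-grams are observed at least once.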

5 Deep Learning
Breakthrough: 2006
[Figure: layered neural network]
- Output layer: prediction of target
- Hidden layers: learn more abstract representations
- Input layer: raw input
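The layered structure on this slide (raw input, hidden layers, output prediction) can be sketched as a plain forward pass. This is only an illustration under assumed choices: the layer sizes, the tanh and sigmoid activations, and the random initialization are hypothetical, and no training is shown.

```python
import math
import random

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an arbitrary, untrained start)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(x, layers):
    """Feed x through tanh hidden layers, then a sigmoid output layer."""
    for weights, biases in layers[:-1]:
        x = dense(x, weights, biases, math.tanh)
    weights, biases = layers[-1]
    return dense(x, weights, biases, lambda z: 1.0 / (1.0 + math.exp(-z)))

rng = random.Random(0)
# Input layer of 4 raw features, two hidden layers of 8 units, 1 output unit.
layers = [init_layer(4, 8, rng), init_layer(8, 8, rng), init_layer(8, 1, rng)]
prediction = forward([0.2, -0.1, 0.7, 0.0], layers)
print(prediction)
```

Each hidden layer re-represents the output of the layer below it, which is the sense in which deeper layers can learn more abstract representations once the weights are trained.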

6 Lots of Unlabeled Data
- Language Model
  - Corpus: 2 B words
  - Dictionary: 130,000 most frequent words
  - 4 weeks of training
- Parallel + CUDA algorithm
  - 2 hours
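Restricting the dictionary to the most frequent words is a standard preprocessing step for such language models. A minimal sketch follows, assuming a hypothetical `<unk>` placeholder for out-of-dictionary words and a toy size limit of 2 instead of 130,000; the talk does not say how rare words were actually handled.

```python
from collections import Counter

UNK = "<unk>"  # hypothetical placeholder for words outside the dictionary

def build_vocab(tokens, max_size):
    """Map the max_size most frequent words to integer ids; id 0 is UNK."""
    counts = Counter(tokens)
    vocab = {UNK: 0}
    for word, _ in counts.most_common(max_size):
        vocab[word] = len(vocab)
    return vocab

def encode(tokens, vocab):
    """Replace each token by its id, falling back to the UNK id."""
    return [vocab.get(t, vocab[UNK]) for t in tokens]

toy_tokens = "to be or not to be that is the question".split()
vocab = build_vocab(toy_tokens, max_size=2)
ids = encode(toy_tokens, vocab)
print(vocab, ids)
```

Capping the dictionary keeps the model's input and output layers at a fixed, manageable size regardless of how large the corpus grows.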

7 Word Embeddings
- Neighboring words are semantically related
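The claim that neighboring words are semantically related is usually checked by ranking words by vector similarity. The sketch below uses cosine similarity, which is an assumed (though common) choice, and a hand-made three-dimensional table of hypothetical vectors; real embeddings are learned from a corpus and have many more dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest(word, embeddings, k=2):
    """Return the k words whose vectors are most similar to the given word."""
    query = embeddings[word]
    others = [(cosine(query, vec), w) for w, vec in embeddings.items() if w != word]
    return [w for _, w in sorted(others, reverse=True)[:k]]

# Hypothetical vectors chosen by hand for illustration only.
toy_embeddings = {
    "paris": [0.9, 0.1, 0.0],
    "rome": [0.8, 0.2, 0.1],
    "apple": [0.0, 0.9, 0.3],
    "pear": [0.1, 0.8, 0.4],
}
neighbors = nearest("paris", toy_embeddings, k=1)
print(neighbors)
```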

8 A Unified Deep Learning Architecture for NLP
- NER (Named Entity Recognition)
- POS tagging
- Chunking
- Parsing
- SRL (Semantic Role Labeling)
- Sentiment Analysis

9 Deep Text Analysis
- Parsing
- Word Sense Disambiguation
- Anaphora Resolution
- Information Extraction
- Sentiment Analysis
- Text Entailment
- Question Answering
- Biomedical Text Analysis

10 QA on Alzheimer's Disease
[Figure: dependency parse of the sentence "the γ-secretase inhibitor Semagacestat failed to slow cognitive decline", with arc labels SUBJ, OBJ, APPO, OBJ, ROOT and entity annotations: protein, drug, substance, disorder (SNOMED: C0236848)]
QA on Alzheimer Competition

11 Correlation Symptoms-Diseases

12 Big Data, Big Brain
- Google DistBelief
  - Cluster capable of simulating 100 billion connections
  - Used to learn unsupervised image classification
  - Used to produce tiny ASR model
- Similar basic capability for processing image, audio and language
- European FET Brain project
