1
The Web Media Verification Challenge
Olga Papadopoulou, Markos Zampoglou, Symeon Papadopoulos, Yiannis Kompatsiaris
1st International School on “Learning from Signals, Images, and Video”, Thessaloniki, July 2019
2
Social media as news source and misinformation
Verifiably false or misleading information
Created, presented and disseminated for economic gain or to intentionally deceive the public
May cause public harm
“The weaponisation of on-line fake news and disinformation poses a serious security threat to our societies” (EC)
“Fake news represents a danger to democracy” (Eurobarometer study)
3
Multimedia Knowledge and Social Media Analysis Lab (MKLab)
Personnel: 3 senior researchers, 20+ post-doc researchers, 40+ research assistants, 6 PhD candidates
Research projects: 20 H2020 (coordinating 4), 8 national projects
Industry: Infalia spin-off, contracts (e.g. Motorola UK-US)
Publications: 146 journal articles, 8 patents, 437 conference papers
Events: MMM 2019, IVMSP 2018, Internet Science 2017, ESSIR 2015
Open-source tools and datasets available
Research areas: computer vision, semantic technologies, social media and big data, IoT/sensors, brain-computer interfaces
Applications: media, culture, security, smart cities, eHealth
4
Media verification activities
Areas: image tampering, social media mining, video verification, deep fakes
Projects: WeVerify (ongoing), TENSOR (ongoing), InVID, REVEAL, SocialSensor
Results:
- Tools: Image Verification Assistant, Tweet Verification Assistant, Context Aggregation and Analysis
- Datasets: Fake Video Corpus, Tweet verification corpus, Wild Web tampered image dataset
5
The many faces of disinformation
6
Three Challenges
1. Tampered image detection: use image forensics output to spot digitally manipulated images
2. Contextual video verification: leverage video metadata to produce a credibility score for the input video
3. Verification-oriented comment detection: build a classifier to easily select comments that are useful for verification
7
#1 Tampered image detection
8
Tampered image detection
[Figure: example tampered image, ground truth mask, forensic results, forensic results (colorized)]
9
Forensic output fusion: challenges
Different output styles depending on the algorithm
Inconsistent performance: not all algorithms work in every case
Distracting results on non-detections
Lack of large-scale datasets
10
Datasets:
- Columbia Uncompressed Image Splicing Detection Evaluation Dataset: 180 tampered / 180 untampered
- 1st IFS-TC Forensics Challenge: 447 tampered / 447 untampered
- Realistic Tampering Dataset: 220 tampered / 220 untampered
Split:
- Training: 718 tampered / 718 untampered
- Test: 128 tampered / 128 untampered

Baseline performance:
             Precision  Recall  F1-score
Untampered   0.68       0.86    0.76
Tampered     0.81       0.59    0.68
Average      0.74       0.73    0.72
11
Baseline solution and ideas
Baseline pipeline (see the code sketch below):
- Threshold the forensic map (T = 190)
- Extract connected components (with morphological clean-up)
- Feature extraction: per-component area, height, width and perimeter for the largest components, number of components, image moments
- Build a fixed-length feature vector from the statistics of the 3 largest components
- Decision tree classifier
Ideas:
- Better features (keypoints, CNN activations)
- Better classifiers (end-to-end DCNNs)
- Others (link to object detection / semantic segmentation outputs)
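A minimal Python sketch of this baseline, assuming the forensic output is available as an 8-bit grayscale heatmap. The function names, the morphology kernel and the use of Hu moments as "image moments" are illustrative choices, not the exact published implementation.

```python
# Sketch of the baseline: threshold the forensic heatmap, clean it up,
# describe its largest connected components, and classify with a decision tree.
import cv2
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def heatmap_features(heatmap, threshold=190, top_k=3):
    """Threshold the forensic heatmap and describe its top_k largest blobs."""
    _, mask = cv2.threshold(heatmap, threshold, 255, cv2.THRESH_BINARY)
    # Morphological opening removes small spurious detections.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

    n_labels, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    # Skip label 0 (background); sort components by area, largest first.
    comps = sorted(range(1, n_labels),
                   key=lambda i: stats[i, cv2.CC_STAT_AREA], reverse=True)

    feats = [float(n_labels - 1)]          # number of connected components
    for i in comps[:top_k]:
        comp_mask = (labels == i).astype(np.uint8)
        contours, _ = cv2.findContours(comp_mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        perimeter = cv2.arcLength(contours[0], True) if contours else 0.0
        hu = cv2.HuMoments(cv2.moments(comp_mask)).flatten()   # 7 Hu moments
        feats += [stats[i, cv2.CC_STAT_AREA],
                  stats[i, cv2.CC_STAT_WIDTH],
                  stats[i, cv2.CC_STAT_HEIGHT],
                  perimeter, *hu]
    # Pad to a fixed length when fewer than top_k components are found.
    per_comp = 4 + 7
    feats += [0.0] * (1 + top_k * per_comp - len(feats))
    return np.array(feats, dtype=np.float32)

def train_baseline(X, y):
    """Fit the decision tree on stacked heatmap feature vectors."""
    return DecisionTreeClassifier(random_state=0).fit(X, y)

# Usage (forensic_maps and labels are assumed to be precomputed):
# X = np.stack([heatmap_features(m) for m in forensic_maps])
# clf = train_baseline(X, labels)
```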
12
#2 Contextual video verification
13
Contextual video verification
Main hypothesis: misleading (aka “fake”) videos are described and published in different ways than trustworthy ones
Approach: leverage the different signals of quality and credibility contained in the video metadata
14
Fake Video Corpus
Unique cases: 200 fake and 180 real videos from YouTube, Facebook and Twitter
Total number of videos in cascades: 2920 fake and 2090 real
[Figure: example fake video types — staged, tampered, reuse]
15
Baseline solution and ideas
Baseline (see the code sketch below): feature extraction from the video title and channel statistics, RBF SVM classifier

Features from the video title:
- Text length
- Number of words
- Contains question/exclamation marks
- Contains 1st/2nd/3rd person pronoun
- Number of uppercase characters
- Number of positive/negative sentiment words
- Number of slang words
- Has ‘:’ symbol
- Number of question/exclamation marks

Features from channel metadata:
- Channel view count
- Channel subscriber count
- Channel video count
- Channel comment count

Baseline performance:
Precision  Recall  F-score
0.63       0.93    0.75

Ideas — explore better text mining pipelines:
- ELMo: deep contextualized word representations
- LSTM and Bi-LSTM (process word sequences in both directions)
- GPT-2
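A hedged sketch of this baseline, assuming the title text and channel statistics have already been collected. The feature set is abridged (sentiment and slang counts are omitted), and the pronoun lists and channel field names are illustrative assumptions, not the exact published feature definitions.

```python
# Sketch: hand-crafted title + channel features fed to an RBF-kernel SVM.
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

FIRST_PERSON = {"i", "me", "my", "we", "us", "our"}
SECOND_PERSON = {"you", "your", "yours"}
THIRD_PERSON = {"he", "she", "it", "they", "them", "his", "her", "their"}

def video_features(title, channel):
    """Hand-crafted features from the video title plus channel statistics."""
    words = title.lower().split()
    return np.array([
        len(title),                                # text length
        len(words),                                # number of words
        title.count("?"), title.count("!"),        # question / exclamation marks
        sum(w in FIRST_PERSON for w in words),     # 1st-person pronouns
        sum(w in SECOND_PERSON for w in words),    # 2nd-person pronouns
        sum(w in THIRD_PERSON for w in words),     # 3rd-person pronouns
        sum(c.isupper() for c in title),           # uppercase characters
        int(":" in title),                         # has ':' symbol
        channel["view_count"],                     # channel statistics
        channel["subscriber_count"],
        channel["video_count"],
        channel["comment_count"],
    ], dtype=np.float32)

def train_baseline(X, y):
    """Scale the features and fit an RBF-kernel SVM."""
    return make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, y)

# Usage (titles, channels and labels are assumed to be fetched beforehand):
# X = np.stack([video_features(t, c) for t, c in zip(titles, channels)])
# clf = train_baseline(X, labels)
```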
16
#3 Verification-oriented comment detection
17
Verification-oriented comment detection
Problem formulation:
- Comments often contain valuable pieces of information that can help an analyst verify or debunk a video
- Controversial videos attract thousands of comments: it is hard to separate useful comments from noise
Goal: build a classifier that assigns scores to video comments in proportion to their relevance for verification
18
Baseline solution and ideas
Baseline (see the code sketch below): filter comments by a predefined list of verification-oriented keywords, available in several languages (e.g. English, German, Greek, Arabic, Spanish, Farsi)
Example English keywords: lies, fake, wrong, lie, confirm, where, location, lying, false, incorrect, misleading, propaganda, liar
Ideas:
- Use the baseline solution as weak annotation and bootstrap a supervised classification scheme
- Consider additional features (e.g. comment length, contains link)
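A minimal sketch of the keyword-filter baseline, using the English keyword list from the slide; the tokenization, the hit-count scoring and the "flag anything with at least one hit" rule are illustrative assumptions. The resulting scores could also serve as the weak labels for the bootstrapped supervised classifier mentioned above.

```python
# Sketch: score each comment by the number of verification-oriented keywords it contains.
import re

VERIFICATION_KEYWORDS = {
    "lies", "fake", "wrong", "lie", "confirm", "where", "location",
    "lying", "false", "incorrect", "misleading", "propaganda", "liar",
}

def verification_score(comment: str) -> int:
    """Count keyword hits; comments with score > 0 are surfaced to the analyst."""
    tokens = re.findall(r"\w+", comment.lower())
    return sum(tok in VERIFICATION_KEYWORDS for tok in tokens)

comments = [
    "This is fake, the location is wrong",
    "Amazing video, love it!",
]
flagged = [(c, verification_score(c)) for c in comments if verification_score(c) > 0]
print(flagged)   # [('This is fake, the location is wrong', 3)]
```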
19
Let’s get it started: https://bit.ly/2XGnmS2
20
Thank you! Get in touch!