About TeMpTations & Masks


1 Information Security and Privacy Aspects of Using Online Machine Translation in CAT Tools
About TeMpTations & Masks
Translating and the Computer, Nov 15
Christine Bruckner, Freelance Translation Technology Consultant

2 Machine Translation and the Translator Today
… and MT plugins like DeepL Pro are fully secure and GDPR compliant (?!)

3 MT Integration in CAT Tools
- batch mode / via pre-translation => “post-editing”
- interactive lookup during translation at:
  - segment level
  - sub-segment level (AutoSuggest, MatchRepair, MT-based fuzzy match repair, TM validated MT match…)
CAT Tools: SDL Trados Studio, Across, memoQ, STAR Transit … and many more
These scenarios are not really new, see e.g. Heyn, Matthias (1996): Integrating machine translation into translation memory systems. pp

4 CAT+MT Integration – Example 1
Microsoft MT Plug-in and MT pre-translation in memoQ

5 CAT+MT Integration – Example 2
Interactive MT proposals (via AutoSuggest) in SDL Trados Studio 2019 via DeepL Pro plug-in

6 Free / Cheap Cloud MT Plug-ins in Common CAT Tools
MT providers: Google Cloud Translation, DeepL, MyMemory, Microsoft MT
CAT tool integration: SDL Trados Studio, Across, memoQ, Transit NXT
Customization / Adaptive MT:
- Google Cloud Translation: not in standard version; TMX import via Google AutoML Translation
- DeepL: no(t yet)
- MyMemory: TMX import; Adaptive MT (in Pro version)
- Microsoft MT: via Category ID (=domain) in standard version for SMT; TMX import via Microsoft Custom Translator

7 Information Security Aspects with Cloud MT Solutions
Confidentiality, Availability, Integrity
- Data encryption
- Avoid use / storing of data input by third parties
Confidentiality: The “wrong” people can see information in transit. MT sites can use your data in ways you did not intend. (Don DePalma, 2014: and-localization/article/free-machine-translation-can-leak-data/ )

8 Data Protection Aspects with Cloud MT Solutions
- Server location
- Data Processing Agreements
- Anonymization / Pseudonymization
Data Protection / Privacy: “data protection by design will become a legal obligation once the GDPR starts to apply in May Those who process personal data of individuals will have to take data protection into account “both at the time of the determination of the means for processing and at the time of the processing itself”, as Article 25 of the GDPR puts it.” communications-privacy_en
The MT user is the data controller and thus responsible and liable for what happens with third-party personal data.

9 Warn the User before Using Cloud MT in CAT Tools
(too few) warnings and links to Terms of Service of cloud MT providers in CAT environments
CAT Tools: SDL Trados Studio, Across, memoQ, STAR Transit
- Warnings when setting up / configuring MT plug-ins
- Warning only when setting up Google MT plug-in
- Warnings configurable: always / never / per project
- No warnings in UI

10 Examples from MT Terms of Service – MyMemory
MyMemory (Oct 2018) – Confidentiality
“We collect any segment submitted and store it on a long term basis, whether it’s public or private. [..] The contributions to the archive, whether they are “Public Data” or “Private Data”, are collected, processed and used by Translated to create statistics, set up new services and improve existing ones.”
[…] MyMemory uses external partners to outsource some developments and provide some functionalities: this involves data sharing. External Machine Translation providers is the most obvious example. Translated can entitle its partners of a usage license over “Public Data” or “Private Data” in order to improve the quality of our and/or 3rd parties' services (eg. machine translation suggestions, glossaries, language models, spell checkers...). ( Oct 2018)

11 Examples from MT Terms of Service – DeepL Pro
DeepL Pro (Oct 2018) – Privacy
Data Confidentiality: Your texts are deleted immediately after you've received the translation
Your Data is Secure: “DeepL Pro never stores the texts you are translating, and the connection to our servers is always encrypted. This means that your texts are not used for any purposes other than your translation, nor can they be accessed by third parties. As a company based in Germany, all our operations comply with European Union Data Protection laws.”
BUT: Servers are located in Iceland ( deploys-51-petaflop-supercomputer-on-verne-global-campus/ Sept 2017)
“7.12 Customer is obligated to observe all legal requirements for the collection, processing and use of data which is transmitted to and processed by DeepL for the Customer in connection with the provision of its services under this Agreement. In particular, Customer shall immediately inform DeepL if Customer intends to transmit personal data to DeepL using the API. Customer guarantees not to collect, process or use any personal data in connection with the API without the express consent of the data subject or sufficient other legal authorisation. […].” ( Oct 2018)

12 Masking of Personal Data in the Translation Process
Masking = anonymization or pseudonymization
When? At one of the process steps: Content creation, Pre-production & file preparation, Project creation, Project preparation, Translation task assignment, Review, Project finalization (process flow inspired by SDL Studio training presentation)
How?
- manually in source texts
- semi-automatically via search+replace strings + regex file
- automatically via tools
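
The semi-automatic option (search+replace with regular expressions) can be sketched roughly as follows. This is a minimal Python illustration, not the procedure shown in the talk: the two patterns and the placeholder labels `[EMAIL]` and `[DATE]` are assumptions, and real projects need rules tuned to their own content.

```python
import re

# Hypothetical rule set: (placeholder label, pattern). Assumed for illustration only.
PATTERNS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("DATE", re.compile(r"\b\d{1,2}\.\d{1,2}\.\d{4}\b")),  # e.g. 15.11.2018
]


def mask(text):
    """Replace every match of every rule with a typed placeholder."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


masked = mask("Contact jane.doe@example.com before 15.11.2018.")
print(masked)  # Contact [EMAIL] before [DATE].
```

Because the original values are discarded here, this variant behaves like anonymization; keeping a mapping from placeholder to original value would make it reversible.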

13 Masking during Translation Project Creation
The data is protected for the (whole) translation process – in source and target

14 Pseudonymization - Negative Side-Effects on Content
Examples: Google NMT, DeepL
Issues with:
- understandability
- MT quality
- potential adaptation needs (date format, transliteration, …)
- TM content re-use (wrong 100% matches)
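
One way a wrong 100% match could arise is sketched below in Python. The name pattern, the `[PERSON]` placeholder and the dictionary standing in for a translation memory are all assumptions made for illustration; no CAT tool works exactly like this.

```python
import re

# Hypothetical rule: a title followed by a capitalised surname.
TITLE_NAME = re.compile(r"\b(?:Mr|Ms)\. [A-Z][a-z]+")


def pseudonymize(segment):
    """Replace every title+name with the same generic placeholder."""
    return TITLE_NAME.sub("[PERSON]", segment)


first = pseudonymize("We sent the letter to Ms. Meyer.")
second = pseudonymize("We sent the letter to Mr. Schulz.")

# A dictionary stands in for the translation memory (masked source -> masked target).
tm = {first: "Wir haben den Brief an [PERSON] geschickt."}

# Both sources collapse to the same key, so the lookup reports an exact match.
match = tm[second]
restored = match.replace("[PERSON]", "Mr. Schulz")
print(restored)  # Wir haben den Brief an Mr. Schulz geschickt.
```

The restored German segment still carries the English title and would need "Herrn Schulz" here, so the exact match has to be revised after un-masking.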

15 Masking of Personal Data before Using Cloud MT
When? At one of the process steps: Content creation, Pre-production & file preparation, Project creation, Project preparation, Translation task assignment, Review, Project finalization (process flow inspired by SDL Studio training presentation)
Personal data is only protected during transmission to the cloud MT provider
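
Masking only for the round trip to the MT provider can be sketched as follows. This is a Python illustration under stated assumptions: `fake_cloud_mt` is a hypothetical stand-in rather than any provider's API, the `__PD0__` token format is invented, and a real MT engine may alter or drop such tokens.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mask_segment(segment):
    """Swap each e-mail address for a numbered token and remember the mapping."""
    mapping = {}

    def repl(match):
        token = f"__PD{len(mapping)}__"
        mapping[token] = match.group(0)
        return token

    return EMAIL.sub(repl, segment), mapping


def unmask_segment(segment, mapping):
    """Put the original values back into the MT output."""
    for token, original in mapping.items():
        segment = segment.replace(token, original)
    return segment


def fake_cloud_mt(segment):
    # Hypothetical stand-in for a cloud MT call; it only rewrites one phrase.
    return segment.replace("Write to", "Schreiben Sie an")


def translate_with_masking(segment):
    masked, mapping = mask_segment(segment)
    translated = fake_cloud_mt(masked)  # only the masked text leaves the machine
    return masked, unmask_segment(translated, mapping)


sent, result = translate_with_masking("Write to jane.doe@example.com today.")
print(sent)    # Write to __PD0__ today.
print(result)  # Schreiben Sie an jane.doe@example.com today.
```

The source segment in the editor stays untouched; the personal data is absent only from the text that is transmitted.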

16 Cloud MT Masking Options in CAT Tools
Only few available, for example:
- MT Enhanced plugin (Google MT, Microsoft MT) – Studio
- RyS plugin (Google MT) – Studio
Regex knowledge required for configuration; source text untouched; meaningful tags.
Config file for MT Enhanced plugin

17 Example Masking with MT Enhanced Plug-in
Google MT output in SDL Trados Studio with masking via MT Enhanced plug-in

18 Some Conclusions on Personal Data Masking in CAT Tools
- Additional efforts during preparation (tools, regex knowledge)
- Tagging of personal data decreases understandability of source text and MT output quality
- “meta-information” (type, gender etc.) should be preserved for better understanding by translators
- 100% matches need more revision after un-masking
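
Preserving such meta-information in the placeholder itself might look like the following Python sketch. The token layout (`[PERSON_F_1]`), the `PersonalData` class and the assumption that the items to mask are already known are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonalData:
    text: str         # the string to hide
    kind: str         # e.g. "person", "city"
    gender: str = ""  # optional, e.g. "f" or "m"


def make_token(item, index):
    """Build a placeholder that still tells the translator type and gender."""
    parts = [item.kind.upper()]
    if item.gender:
        parts.append(item.gender.upper())
    parts.append(str(index))
    return "[" + "_".join(parts) + "]"


def mask_known_items(segment, items):
    """Replace each known item with its typed token; return text and mapping."""
    mapping = {}
    for index, item in enumerate(items, start=1):
        if item.text in segment:
            token = make_token(item, index)
            segment = segment.replace(item.text, token)
            mapping[token] = item.text
    return segment, mapping


items = [PersonalData("Anna Meyer", "person", "f"), PersonalData("Munich", "city")]
masked, mapping = mask_known_items("Anna Meyer moved to Munich.", items)
print(masked)  # [PERSON_F_1] moved to [CITY_2].
```

A translator reading `[PERSON_F_1]` can still choose the right pronouns and agreement, which a bare `[PERSON]` would not allow.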

19 Some Conclusions on Personal Data Masking in CAT Tools
With a little additional background knowledge, it is often possible to identify the individuals you thought had been anonymized
Use secure offline MT solutions
Avoid masking efforts and disadvantages via
- clear information about document classification and personal data content
- guidelines + agreements with translators regarding cloud MT systems

20 Thank you! Christine.Bruckner@CATTMaTTers.de

