Telecooperation Technische Universität Darmstadt Copyrighted material; for TUD student use only E-Mail Q&A Telecooperation Group TU Darmstadt.


Telecooperation Technische Universität Darmstadt Copyrighted material; for TUD student use only Q&A Telecooperation Group TU Darmstadt

Prof. Dr. M. Mühlhäuser, Telekooperation ©

Interoperability
No need to implement everything from the RFCs
  – Way too much work
  – Correctly implemented, you would out-standard most common clients
Your implementation should have this functionality
  – 7bit encoding
  – Quoted-printable and Base64 encoding with all charsets Java can handle (i.e. every charsetName that does not throw an UnsupportedEncodingException)
  – Multipart messages are recognized and decoded correctly
  – Robustness: do not choke on unrecognized headers
Programs will be tested with public test cases plus secret ones
  – The secret test cases, too, only use the functionality mentioned above

Headers
Multiline headers
  – Line continuations start with a "folding whitespace", which may be a space or a tab (\t)
Ignore every header you do not know
  – If you want, you can also display additional headers like BCC; only those mentioned in milestone 3.1 are required
Case sensitivity
  – Header names are always case-insensitive, cf. RFC 2822, section "Characters will be specified […] by a case-insensitive literal value enclosed in quotation marks"
  – Header values used in the assignment are usually case-insensitive, e.g. for Content-Transfer-Encoding, Base64 and base64 are both possible
  – Exceptions: the multipart boundary, and all header values displayed to the user
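The folding and case-insensitivity rules above can be sketched as follows. This is only an illustration, not the course's reference solution: the class name HeaderSketch and its methods are hypothetical, and the sketch assumes the header block has already been split into lines. A later header with the same name simply overwrites an earlier one here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the assignment framework.
final class HeaderSketch {

    /** Appends continuation lines (starting with space or tab) to the previous header line. */
    static List<String> unfold(List<String> rawLines) {
        List<String> result = new ArrayList<>();
        for (String line : rawLines) {
            boolean continuation = !line.isEmpty()
                    && (line.charAt(0) == ' ' || line.charAt(0) == '\t');
            if (continuation && !result.isEmpty()) {
                int last = result.size() - 1;
                // Only the line break is removed; the folding whitespace stays.
                result.set(last, result.get(last) + line);
            } else {
                result.add(line);
            }
        }
        return result;
    }

    /** Parses "Name: value" lines into a map whose keys compare case-insensitively. */
    static Map<String, String> parse(List<String> rawLines) {
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String line : unfold(rawLines)) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                // Values are stored unchanged so they can be displayed as received.
                headers.put(line.substring(0, colon).trim(),
                            line.substring(colon + 1).trim());
            }
        }
        return headers;
    }
}
```

Unknown headers end up in the map as well and can simply be ignored by the caller; values that must be compared case-insensitively (such as the transfer encoding) can be checked with String.equalsIgnoreCase.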

Date
Look into the documentation of SimpleDateFormat
  – No need to parse each item yourself; it even recognizes "GMT" and "UTC" as time zones
  – Configure the parser with Locale.US so that it can parse things like "May"
Output via DateFormat.getDateTimeInstance()
Time zone
  – Setting it via SimpleDateFormat or Calendar#setTimeZone is preferred to manual time manipulation
  – Reason: DateFormat may be configured to display the time zone
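A minimal sketch of the date handling described above. The pattern string "EEE, d MMM yyyy HH:mm:ss Z" is an assumption about what the Date headers in the test mails look like (real mails may omit the weekday, for example), and the class name DateSketch is hypothetical.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper; the pattern is an assumed header layout.
final class DateSketch {

    /** Parses a Date header value such as "Mon, 15 May 2006 14:30:00 +0200". */
    static Date parse(String headerValue) {
        // Locale.US makes month and weekday names like "May" and "Mon" parseable
        // regardless of the default locale of the machine.
        SimpleDateFormat parser =
                new SimpleDateFormat("EEE, d MMM yyyy HH:mm:ss Z", Locale.US);
        try {
            return parser.parse(headerValue);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + headerValue, e);
        }
    }

    /** Formats the date for display in the given time zone, without manual arithmetic. */
    static String display(Date date, TimeZone zone) {
        DateFormat out = DateFormat.getDateTimeInstance(DateFormat.MEDIUM, DateFormat.LONG);
        out.setTimeZone(zone);
        return out.format(date);
    }
}
```

The displayed string depends on the default locale, so it is only meant for the user and should not be compared against fixed text.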

Attachments
Base64 encoded lines are always 76 characters wide; the only exception is the last line
If numberofchars % 4 != 0, you may just throw an exception and terminate
Do not use javax.mail.internet.MimeUtility or similar additional libraries for decoding
Use the Content-Disposition header to suggest a name for saving
Attachments that are not of type text/… don't have and don't need a charset
  – Just treat them as a stream of bytes/byte array

Base64 Example
Take a group of 4 characters: S W 4 g
Decode according to the RFC
  – S = 0x12; W = 0x16; 4 = 0x38; g = 0x20
  – Decoding may be done in groups: A-Z → char - 'A'; a-z → char - 'a' + 26; 0-9 → char - '0' + 26*2; +, / and = must be treated separately
Combine to a 24-bit number, shifting according to index (big endian)
  – 0x12 << 18 | 0x16 << 12 | 0x38 << 6 | 0x20 << 0 → 0x496e20
Shift the number back out in 8-bit blocks (also big endian)
  – Byte 0 = 0x496e20 >> 16 & 0xff = 0x49
  – Byte 1 = 0x496e20 >> 8 & 0xff = 0x6e
  – Byte 2 = 0x496e20 >> 0 & 0xff = 0x20
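The worked example above translates into Java roughly as follows. This is a sketch for a single group of four characters; the class and method names are hypothetical, and treating a trailing "=" as "one output byte fewer" is the standard Base64 padding rule rather than something stated on the slide.

```java
// Hypothetical helper that decodes exactly one Base64 group.
final class Base64GroupSketch {

    /** Maps one Base64 character to its 6-bit value. */
    static int sextet(char c) {
        if (c >= 'A' && c <= 'Z') return c - 'A';
        if (c >= 'a' && c <= 'z') return c - 'a' + 26;
        if (c >= '0' && c <= '9') return c - '0' + 26 * 2;
        if (c == '+') return 62;
        if (c == '/') return 63;
        throw new IllegalArgumentException("Not a Base64 character: " + c);
    }

    /** Decodes a group of four characters into one to three bytes. */
    static byte[] decodeGroup(String group) {
        if (group.length() != 4) {
            throw new IllegalArgumentException("Group must have 4 characters");
        }
        // Each trailing '=' means one output byte less.
        int padding = group.endsWith("==") ? 2 : group.endsWith("=") ? 1 : 0;

        // Combine the 6-bit values into one 24-bit number (big endian).
        int bits = 0;
        for (int i = 0; i < 4; i++) {
            int value = i < 4 - padding ? sextet(group.charAt(i)) : 0;
            bits |= value << (18 - 6 * i);
        }

        // Shift the number back out in 8-bit blocks (also big endian).
        byte[] out = new byte[3 - padding];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) (bits >> (16 - 8 * i) & 0xff);
        }
        return out;
    }
}
```

For the group from the slide, decodeGroup("SW4g") yields the bytes 0x49, 0x6e and 0x20.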

Decoding
Your own input stream
  – Elegant way of decoding Base64 and Quoted-Printable data (only a suggestion; you can do it differently)
  1. Extend java.io.InputStream
  2. Take a character array of undecoded data as parameter
  3. Override read()
     – Decode the character data when
     – Return -1 if end of data reached
  4. Let the InputStreamReader deal with the nasty problem of decoding charsets
Sample application has only 50 LoC for decoding quoted printable, 100 LoC for Base64
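One way the four steps above could look for quoted-printable data. This is a sketch, not the sample application mentioned on the slide: the class name and the decode helper are hypothetical, decoding happens lazily inside read(), and malformed escapes are reported as an IOException, which is a design choice of this sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical stream that decodes quoted-printable text byte by byte.
final class QuotedPrintableInputStream extends InputStream {
    private final char[] data;
    private int pos = 0;

    QuotedPrintableInputStream(char[] undecoded) {
        this.data = undecoded;
    }

    @Override
    public int read() throws IOException {
        while (pos < data.length) {
            char c = data[pos++];
            if (c != '=') {
                return c & 0xff;                 // literal character
            }
            if (pos < data.length && data[pos] == '\n') {
                pos++;                           // soft line break "=\n"
                continue;
            }
            if (pos + 1 < data.length && data[pos] == '\r' && data[pos + 1] == '\n') {
                pos += 2;                        // soft line break "=\r\n"
                continue;
            }
            if (pos + 1 < data.length) {
                int hi = Character.digit(data[pos], 16);
                int lo = Character.digit(data[pos + 1], 16);
                if (hi >= 0 && lo >= 0) {
                    pos += 2;
                    return hi << 4 | lo;         // "=XX" escape
                }
            }
            throw new IOException("Malformed escape at index " + (pos - 1));
        }
        return -1;                               // end of data
    }

    /** Convenience helper: decodes the bytes and lets InputStreamReader handle the charset. */
    static String decode(String undecoded, String charsetName) {
        try (Reader reader = new InputStreamReader(
                new QuotedPrintableInputStream(undecoded.toCharArray()), charsetName)) {
            StringBuilder text = new StringBuilder();
            for (int ch = reader.read(); ch != -1; ch = reader.read()) {
                text.append((char) ch);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the stream only produces raw bytes, the same class works for any charset Java supports; an unsupported charsetName surfaces as an UnsupportedEncodingException wrapped in the UncheckedIOException.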

Regular Expressions
Regular expressions are a nice way of filtering out substrings
A bit like file name patterns (*, ?), but more powerful
  – Letters and numbers match themselves
  – Punctuation characters usually have a special meaning; to match one literally, escape it with a \
      To match the character [, use \[
      Attention: you need to escape the backslash in Java strings → \[ == "\\["
  – Alternatives: use []
      [abc] matches a or b or c
      [A-Z] matches A or B or … or Z
      Negation: [^abc] matches any character except a, b or c
  – Wildcard: . matches any single character
  – Repetition
      * means "the previous element zero or more times"
      + means "the previous element one or more times"
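The building blocks above can be tried out with a few one-liners. The class name RegexBasicsSketch and the helper fullMatch are hypothetical; Pattern.matches tests whether the whole input matches the expression.

```java
import java.util.regex.Pattern;

// Hypothetical playground for the basic regex elements.
final class RegexBasicsSketch {

    /** True if the entire input matches the regular expression. */
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("\\[", "["));     // true: escaped punctuation
        System.out.println(fullMatch("[abc]", "b"));   // true: alternative
        System.out.println(fullMatch("[A-Z]", "q"));   // false: q is not in A-Z
        System.out.println(fullMatch("[^abc]", "d"));  // true: negation
        System.out.println(fullMatch("a.c", "axc"));   // true: wildcard
        System.out.println(fullMatch("ab*", "a"));     // true: zero repetitions allowed
        System.out.println(fullMatch("ab+", "a"));     // false: at least one b required
    }
}
```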

Regular Expressions with Java
Part of java.util.regex
First, compile the pattern to search for:
  – Pattern p = Pattern.compile("charset=[^ ]*")
  – The compile method has a variant that takes flags; use it for case-insensitivity: Pattern.CASE_INSENSITIVE
Next, make a Matcher for a String out of it
  – Matcher m = p.matcher("Content-Type: text/plain; charset=\"us-ascii\"")
Be sure to call the Matcher's find method
  – m.find()
m.group(0) now contains everything that matches
  – charset="us-ascii"

Grouping
You need the thing after "charset="
  – Solution 1: parse it yourself
  – Solution 2: add groups to the expression
Groups are signified by () and counted from 1
  – Pattern p = Pattern.compile("charset=([^ ]*)")
After matching, group(1) contains "\"us-ascii\""
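Put together, compiling with a flag, matching and reading the group might look like this. The class name CharsetSketch and the method rawCharset are hypothetical; returning null for a missing charset parameter is a choice made for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that extracts the charset parameter from a header line.
final class CharsetSketch {

    // Group 1 captures everything after "charset=" up to the next space.
    private static final Pattern CHARSET =
            Pattern.compile("charset=([^ ]*)", Pattern.CASE_INSENSITIVE);

    /** Returns the raw charset value (quotes included), or null if there is none. */
    static String rawCharset(String contentType) {
        Matcher m = CHARSET.matcher(contentType);
        // find() must be called before group(); it searches anywhere in the input.
        return m.find() ? m.group(1) : null;
    }
}
```

Note that the captured value still contains the surrounding quotation marks if the header used them, so they have to be stripped before the name is handed to an InputStreamReader.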

Debugging
Mail clients should be able to connect to the server and fetch the mail
Always helpful: try to connect to the POP server via telnet and issue POP commands manually
  – For closer examination, you may unzip the JAR file and have a look at "mailbox.xml"