XMλ. Contents: What is the problem? Hosoya’s approach. Shields’ approach. XMLambda and the UH. Conclusion.

Presentation transcript:

XMλ

Contents
- What is the problem?
- Hosoya’s approach
- Shields’ approach
- XMLambda and the UH
- Conclusion

What is the problem?
XML is a standard language of first-order, tree-like datatypes.
XML works well for describing static documents, but documents are typically dynamic, generated by a server.
Implementing a server for dynamic documents in conventional languages is hard:
- no direct support for XML or scripting language syntax
- no compile-time checks to ensure valid documents
Can custom languages developed for XML be embedded as combinator libraries within a Haskell-like language?

element Msg = ( ( (To|Bcc)* & From), Body)
element To = String
element Bcc = String
element From = String
element Body = P*
element P = String

XML (example document; markup not preserved): Our presentation is finished!

element Msg = ( ( (To|Bcc)* & From), Body)
element To = String
element Bcc = String
element From = String
element Body = P*
element P = String

XML operators:
|  union
*  sequence
&  unordered tuple
,  ordered tuple

What we are looking for:
XML → functional program
Document-type definition → type definitions
Regular expression → type
Element → term
Document validation → type checking

Possible solutions
1. Using a universal datatype

data Element = Atom String | Node String (List Element)

data Element = Atom String | Node String (List Element)

Node "Msg" [
  Node "To" [Atom …],
  Node "Bcc" [Atom …],
  Node "From" [Atom …],
  Node "Body" [ Node "P" [Atom "Our..."] ] ]

No validation possible: every document, valid or not, has the same type Element.
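The point of this slide can be sketched in plain Haskell. This is only an illustration: the address strings, `invalidMsg` and the run-time check `looksLikeMsg` are assumptions that do not appear in the slides, and the check handles just one ordering of the unordered `&`.

```haskell
-- Universal datatype from the slide, in plain Haskell syntax.
data Element
  = Atom String
  | Node String [Element]
  deriving (Eq, Show)

-- A message that follows the Msg schema (addresses are placeholders).
validMsg :: Element
validMsg =
  Node "Msg"
    [ Node "To"   [Atom "to@example.org"]
    , Node "Bcc"  [Atom "bcc@example.org"]
    , Node "From" [Atom "from@example.org"]
    , Node "Body" [Node "P" [Atom "Our..."]]
    ]

-- Violates the schema (no From, a Body nested inside To), yet it has
-- exactly the same type, so the compiler cannot reject it.
invalidMsg :: Element
invalidMsg = Node "Msg" [Node "To" [Node "Body" []]]

-- Hypothetical run-time check of the top-level shape only:
-- any number of To/Bcc children, then From, then Body.
looksLikeMsg :: Element -> Bool
looksLikeMsg (Node "Msg" kids) =
  case dropWhile isRecipient kids of
    [Node "From" _, Node "Body" _] -> True
    _                              -> False
  where
    isRecipient (Node tag _) = tag `elem` ["To", "Bcc"]
    isRecipient _            = False
looksLikeMsg _ = False
```

Both values typecheck as `Element`; only a run-time test such as `looksLikeMsg` can tell them apart.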

Possible solutions
1. Using a universal datatype
2. Using newtype declarations

newtype Msg = Msg (List (Either To Bcc), From, Body)
newtype From = From String
newtype To = To String
newtype Bcc = Bcc String
newtype Body = Body (List P)
newtype P = P String

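newtype Msg = Msg (List (Either To Bcc), From, Body)
newtype From = From String
newtype To = To String
newtype Bcc = Bcc String
newtype Body = Body (List P)
newtype P = P String

Msg ( [ Left (To …), Right (Bcc …) ], From …, Body [ P "Our..." ] )

Sound, but not complete: every term of type Msg is a valid document, but the tuple fixes one ordering, so some valid documents have no corresponding term.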
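A minimal Haskell rendering of the newtype encoding, assuming ordinary lists and tuples in place of the slide's `List` and with placeholder strings where the slide's values were lost. The helper `recipients` is hypothetical and only shows that the structure is now visible to the type checker.

```haskell
newtype To   = To String   deriving (Eq, Show)
newtype Bcc  = Bcc String  deriving (Eq, Show)
newtype From = From String deriving (Eq, Show)
newtype P    = P String    deriving (Eq, Show)
newtype Body = Body [P]    deriving (Eq, Show)

-- (To|Bcc)* becomes a list of Either; the unordered '&' is forced into
-- one fixed tuple order, which is where completeness is lost.
newtype Msg = Msg ([Either To Bcc], From, Body) deriving (Eq, Show)

msg :: Msg
msg = Msg ([Left (To "a"), Right (Bcc "b")], From "c", Body [P "Our..."])

-- Hypothetical helper: counts the To/Bcc entries of a message.
recipients :: Msg -> Int
recipients (Msg (rs, _, _)) = length rs

-- Rejected by the type checker, unlike with the universal datatype:
-- bad = Msg ([Left (To "a")], Body [], From "c")
```

A document that lists From before the recipients is valid under the schema but cannot be written as a `Msg` term without reordering it first.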

Possible solutions
1. Using a universal datatype
2. Using newtype declarations
3. Using regular expression types as primitive (Hosoya)

Possible solutions
1. Using a universal datatype
2. Using newtype declarations
3. Using regular expression types as primitive (Hosoya)
4. Using type-indexed rows (Shields)

Hosoya’s approach

Why Regular Expression Types?
Static typechecking: generated XML documents conform to the DTD
Or: invalid documents can never arise
For example: A must have at least one

Why Regular Expression Patterns?
Convenient programming constructs for manipulating documents
For instance, jump over arbitrary-length data and extract specific data:

type Person = person[Name, *,Tel?]
match p with
  person[Name, +,Tel ] -> …
  …

XDuce: Values
Primitives represent XML documents (trees)
For example:
I.e. a value is a sequence of nodes

XDuce: Regular Expression Types
Types correspond to document schemas
Familiar XML regular expressions:

type Tel = tel[String]
type Tels = Tel*
type Recip = Bcc|Cc
(Name, Tel*), Addr
T? = T|()
T+ = T,T*

Subtyping
Many algebraic laws:
- Associativity of concatenation and union: A|(B|C) = (A|B)|C
- Commutativity of union: A|B = B|A
These laws are crucial for XML processing, but lead to a complicated specification

Subtyping
Subtyping as set inclusion
First define which values belong to a type
One type is a subtype of another if the former denotes a subset of the latter
For example: (Name*, Tel*) <: (Name|Tel)*
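The set-inclusion reading can be sketched as follows. This is not XDuce's algorithm: membership is decided with Brzozowski derivatives, and inclusion is only approximated by testing every tag sequence up to a chosen length. The names `Ty`, `member` and `subtypeUpTo` are assumptions made for this sketch, and element contents are ignored.

```haskell
-- Regular expression types over element tags.
data Ty
  = Empty          -- no values at all
  | Eps            -- the empty sequence ()
  | Tag String     -- a single element with this tag
  | Seq Ty Ty      -- T1, T2
  | Alt Ty Ty      -- T1 | T2
  | Star Ty        -- T*

-- Does the type contain the empty sequence?
nullable :: Ty -> Bool
nullable Empty     = False
nullable Eps       = True
nullable (Tag _)   = False
nullable (Seq a b) = nullable a && nullable b
nullable (Alt a b) = nullable a || nullable b
nullable (Star _)  = True

-- Brzozowski derivative: the type of what may follow a leading tag t.
deriv :: String -> Ty -> Ty
deriv _ Empty   = Empty
deriv _ Eps     = Empty
deriv t (Tag u) = if t == u then Eps else Empty
deriv t (Seq a b)
  | nullable a  = Alt (Seq (deriv t a) b) (deriv t b)
  | otherwise   = Seq (deriv t a) b
deriv t (Alt a b) = Alt (deriv t a) (deriv t b)
deriv t (Star a)  = Seq (deriv t a) (Star a)

-- A value (sequence of tags) belongs to a type if the type remaining
-- after consuming every tag contains the empty sequence.
member :: [String] -> Ty -> Bool
member ts ty = nullable (foldl (flip deriv) ty ts)

-- Bounded approximation of s <: t: check all sequences up to length n.
subtypeUpTo :: Int -> [String] -> Ty -> Ty -> Bool
subtypeUpTo n alphabet s t =
  and [ member v t | k <- [0 .. n], v <- sequencesOf k, member v s ]
  where
    sequencesOf 0 = [[]]
    sequencesOf k = [ a : rest | a <- alphabet, rest <- sequencesOf (k - 1) ]

namesThenTels, namesOrTels :: Ty
namesThenTels = Seq (Star (Tag "name")) (Star (Tag "tel"))  -- (Name*, Tel*)
namesOrTels   = Star (Alt (Tag "name") (Tag "tel"))         -- (Name|Tel)*
```

On sequences up to length 4, `namesThenTels` is included in `namesOrTels`, while the converse fails on the value `["tel", "name"]`. A real checker needs an exact decision procedure rather than this bounded test.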

Pattern Matching: Exhaustiveness

type Person = person[Name, *,Tel?]
match p with
  person[Name, +,Tel?] -> …
  person[Name, *,Tel] -> …

Not exhaustive
Use subtyping to check: the input type must be a subtype of the union of the pattern types

Pattern Matching: Irredundancy

match p with
  person[Name, *,Tel?] -> …
  person[Name, +,Tel] -> …

Second clause redundant
A clause is redundant iff all the input values that can be matched by the pattern can also be matched by preceding patterns

Pattern Matching: Type Inference

type Name = name[String]
match (ps as Person*) with
  person[name[val n as String], *,Tel?],rest -> …

Avoid excessive type annotations
Use input type and pattern to infer types of
- bare variables (rest)
- bound variables (n)

Functions
First-order functions (explicitly typed): fun f(P):T = e
For example:

fun tels(val ps as Person*):Tel* =
  match ps with
    person[Name, *,tel[val t]],rest -> tel[t],tels(rest)
    person[Name, *],rest -> tels(rest)
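For comparison, a rough Haskell analogue of `tels`, assuming a record in place of the regular expression type. The field `extras` is a hypothetical stand-in for the element whose name is missing before the bare `*`, and the clause for the empty list is added here because Haskell lists need it.

```haskell
newtype Name = Name String deriving (Eq, Show)
newtype Tel  = Tel String  deriving (Eq, Show)

-- Rough counterpart of: type Person = person[Name, *, Tel?]
data Person = Person
  { name   :: Name
  , extras :: [String]   -- hypothetical: the elided repeated element
  , tel    :: Maybe Tel  -- Tel? becomes Maybe
  } deriving (Eq, Show)

-- Counterpart of: fun tels(val ps as Person*):Tel*
tels :: [Person] -> [Tel]
tels (Person _ _ (Just t) : rest) = t : tels rest  -- person[Name, *, tel[val t]], rest
tels (Person _ _ Nothing  : rest) = tels rest      -- person[Name, *], rest
tels []                           = []             -- end of the sequence
```

The Haskell version has to commit to one concrete representation, whereas the XDuce patterns match directly against the sequence type.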

Higher-order Functions
Functions as first-class citizens
Why desirable?
- Abstraction
Not supported by XDuce
What is needed?
- Subtyping for arrow types
So why not support higher-order functions?

Higher-order Functions
Function definitions are given by a fixed set G
G is used in T-APP (instead of the standard rule)
Consequence: T-ABS fails
Fix: redefine T-APP
Type annotations are needed to check pattern matches

Parametric Polymorphism
Generic typing using type variables instead of actual types
Why desirable?
- Abstraction from the structure of the problem
What is needed?
- Type abstraction
- Type application
So why no parametric polymorphism?

Parametric Polymorphism
Problems: forall X. (U|X) -> (T|X)
Pattern matching problems:
- Exhaustiveness / irredundancy checks
- Type inference
Typing constraints cannot be represented: forall X ∉ {U,T}. (U|X) -> (T|X)

Conclusions
Typed language with XML documents as primitive values
Regular expression types are fundamental
Regular expression pattern matching
No higher-order functions
No parametric polymorphism

Shields’ approach
“It is required that content models in element type declarations be deterministic”
Consequence 1: regular expressions must be 1-unambiguous
Unions and unordered tuples are formed from distinct members.
( (To, Bcc) & (Bcc, To) ) is 1-unambiguous
( (Bcc, To) & Bcc ) is not
( (To | Bcc) & Bcc ) is not

Shields’ approach
“It is required that content models in element type declarations be deterministic”
Consequence 2: it is possible to transform any XML element into a term:
*  sequence → list
,  tuple → tuple
|  union → type-indexed sum
&  unordered tuple → type-indexed product
| and & are both formed from type-indexed rows

Type-Indexed Rows
A type-indexed row is a list of types
Type constructors:
- Empty : Row
- (_#_) : Type → Row → Row
For example: (Int # Bool # Empty)

Type-indexed product (TIP):
- (All _) : Row → Type
Type-indexed coproduct (TIC):
- (One _) : Row → Type

Insertion Constraints
Insertion constraints are used to guarantee distinctness of elements:
a ins (Int # Bool # Empty) constrains a to be any type other than Int or Bool
(List b) ins (Int # Bool # Empty) is true for every b, since a list type is neither Int nor Bool

Type-indexed product (TIP):
- Triv : All Empty
- (_ && _) : extension
  forall (a: Type) (b: Row). a ins b => a → All b → All (a#b)
Type-indexed coproduct (TIC):
- (Inj _) : injection
  forall (a: Type) (b: Row). a ins b => a → One (a#b)
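These constructors can be approximated in GHC Haskell with type-level lists. This is a sketch under strong assumptions, not XMLambda itself: GHC lists are ordered, so the row here is not identified up to permutation, `Skip` is a hypothetical extra constructor needed to reach later row members, and `Ins` is a closed type family standing in for the `ins` constraint.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}
{-# LANGUAGE TypeFamilies, UndecidableInstances #-}

import Data.Kind (Constraint, Type)
import GHC.TypeLits (ErrorMessage (..), TypeError)

-- Rows are modelled as type-level lists: Empty is '[], a # b is a ': b.

-- Stand-in for "a ins row": solvable only if a differs from every member.
type family Ins (a :: Type) (row :: [Type]) :: Constraint where
  Ins a '[]      = ()
  Ins a (a ': r) = TypeError ('Text "type already present in row")
  Ins a (b ': r) = Ins a r

-- Type-indexed product: one value for each member of the row.
data All (row :: [Type]) where
  Triv  :: All '[]
  (:&&) :: Ins a r => a -> All r -> All (a ': r)
infixr 5 :&&

-- Type-indexed coproduct: a value of exactly one member of the row.
data One (row :: [Type]) where
  Inj  :: Ins a r => a -> One (a ': r)
  Skip :: One r -> One (a ': r)   -- hypothetical: move past the head

pair :: All '[Bool, Int]
pair = True :&& (1 :: Int) :&& Triv

-- Counterpart of the slide's \(x && y && Triv). (x, y), with a fixed order.
toTuple :: All '[a, b] -> (a, b)
toTuple (x :&& y :&& Triv) = (x, y)

choice :: One '[Bool, Int]
choice = Skip (Inj (3 :: Int))

describe :: One '[Bool, Int] -> String
describe (Inj b)        = "Bool " ++ show b
describe (Skip (Inj n)) = "Int " ++ show n

-- Rejected at compile time, because Int would occur twice in the row:
-- bad = (1 :: Int) :&& (2 :: Int) :&& Triv
```

When the member type is still a type variable, `Ins` simply does not reduce, which loosely mirrors a constraint that waits for more information.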

let tuple = \(x && y && Triv). (x, y)
in tuple (True && 1 && Triv)

Type checking:
Unify All (x # y # Empty) and All (Int # Bool # Empty)
Under constraint: x ins (y # Empty)
Overall term has type (Int, Bool) or (Bool, Int)!

Equality constraints
(c # d # Empty) eq (Int # Bool # Empty)
The constraint propagates until enough information is available to simplify it

Simplifying constraints
Simple unification:
  (a → Int) eq (Bool → b)
  simplifies to: a eq Bool, Int eq b
Row unification:
  (Int # a # Empty) eq (Bool # b # Empty)
  simplifies to: Int eq b, (a # Empty) eq (Bool # Empty)
Insertion:
  (a,b) ins (Bool # c # Empty)
  simplifies to: (a,b) ins (c # Empty)
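The simple-unification case can be sketched as a small first-order unifier. It covers only type variables, type constants and arrows. Row unification and insertion constraints are outside its scope, and the names `Type`, `Subst` and `unify` are chosen for this sketch.

```haskell
data Type
  = TVar String      -- type variable such as a or b
  | TCon String      -- type constant such as Int or Bool
  | TFun Type Type   -- function type t1 → t2
  deriving (Eq, Show)

-- Substitution as an association list from variable names to types.
type Subst = [(String, Type)]

-- Replace every bound variable, following chains of bindings.
apply :: Subst -> Type -> Type
apply s t@(TVar v) = maybe t (apply s) (lookup v s)
apply _ t@(TCon _) = t
apply s (TFun a b) = TFun (apply s a) (apply s b)

-- Occurs check: does variable v appear inside the type?
occurs :: String -> Type -> Bool
occurs v (TVar w)   = v == w
occurs _ (TCon _)   = False
occurs v (TFun a b) = occurs v a || occurs v b

-- Solve "t1 eq t2" by extending the substitution, or fail with Nothing.
unify :: Type -> Type -> Subst -> Maybe Subst
unify t1 t2 s = go (apply s t1) (apply s t2)
  where
    go (TVar v) t
      | t == TVar v = Just s
      | occurs v t  = Nothing
      | otherwise   = Just ((v, t) : s)
    go t (TVar v) = go (TVar v) t
    go (TCon a) (TCon b)
      | a == b    = Just s
      | otherwise = Nothing
    go (TFun a b) (TFun c d) = unify a c s >>= unify b d
    go _ _ = Nothing

-- The slide's example: (a → Int) eq (Bool → b)
example :: Maybe Subst
example =
  unify (TFun (TVar "a") (TCon "Int")) (TFun (TCon "Bool") (TVar "b")) []
```

For the slide's example the resulting substitution binds `a` to `Bool` and `b` to `Int`, matching the two simplified constraints.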

Introducing fresh typenames
Monomorphic: newtype xCoord = Int
  All (xCoord # Int # Empty)
Polymorphic: newtype xCoord = \(a:Type).a
  Allows the same newtype to occur more than once within a record!!
Introducing opaque newtypes
Type arguments are ignored in insertion constraints:
  newtype opaque xCoord = \(a:Type).a

XMLambda and UH
Conclusion
Why regular expression types (Hosoya)?
- Fundamental regular expression types
- Powerful pattern matching
- No higher-order functions and polymorphism
- Subtyping and parametric polymorphism?
Why type-indexed rows (Shields)?
- Flexibility: more general than regular expression types
- All nice characteristics of FP
- Constraint system?