Web data exchange formats Introduction and Overview.


Presentation on theme: "Web data exchange formats Introduction and Overview."— Presentation transcript:

1 Web data exchange formats Introduction and Overview

2 Web data exchange formats XML JSON YAML

3 XML outline What is XML & Why XML The rules of XML documents XML schema and validation XML processing DOM SAX JAXP JAXB Digester

4 Before XML HTML, Hyper-Text Markup Language, the most successful markup language of all time First definition, HTML 1.0 – 1992 Latest version, HTML 4.01 – 1999 Fixed collection of markup tags

5 What is XML? XML, Extensible Markup Language, is a framework for defining markup languages Created by the World Wide Web Consortium (W3C) to overcome the limitations of HTML Like HTML, XML is based on SGML - Standard Generalized Markup Language XML was designed with the Web in mind!

6 XML design goals 1.XML shall be straightforwardly usable over the Internet 2.XML shall support a wide variety of applications 3.XML shall be compatible with SGML 4.It shall be easy to write programs which process XML documents 5.The number of optional features in XML is to be kept to the absolute minimum, ideally zero

7 XML design goals 6.XML documents should be human-legible and reasonably clear 7.The XML design should be prepared quickly 8.The design of XML shall be formal and concise 9.XML documents shall be easy to create 10.Terseness in XML markup is of minimal importance

8 Typical XML usages Web development and content management Data exchange Data storage Configuration files Web services

9 Historical outline The development of XML began in the mid-90s Initial XML draft – November 1996 XML 1.0, W3C recommendation – February 1998 XML 1.1 – February 2004

10 More about XML XML lets us define our own tags Each XML language is targeted to a particular application domain XML specification says nothing about the semantics of the markup tags XML is internationalized and platform independent

11 XML specification Is located at XML 1.0: http://www.w3.org/TR/REC-xml/ XML 1.1: http://www.w3.org/TR/xml11/ Defines the basic rules for XML documents

12 Sample XML document David Gilmour Richard Wright Nick Mason

13 Examples of XML markups XHTML WML - Wireless Markup Language MathML – Mathematical Markup Language ebXML - Electronic Business XML CML - Chemical Markup Language MusicXML – Musical Scores Markup Language ThML - Theological Markup Language See more at http://en.wikipedia.org/wiki/List_of_XML_markup_languages

14 XHTML versus HTML XHTML 1.0 is W3C’s XMLification of HTML 4.01 The most notable differences: HTML allows certain elements to omit the end tag (forbidden in XML) Element and attribute names must be lowercase Attribute values in XHTML must be present and they must be surrounded by quotes

15 XML document rules The creators of XML decided to enforce document structure from the beginning The XML specification requires a parser to reject any XML document that doesn't follow the basic rules A parser is a piece of code that attempts to read a document and interpret its contents

16 Three kinds of XML documents Invalid documents Don't follow the syntax rules defined by XML specification or DTD/schema Valid documents Follow both the XML syntax rules and the rules defined in their DTD/schema Well-formed documents Follow the XML syntax rules but don't have a DTD/schema

17 How to check an XML document? An easy way to check whether an XML document is well-formed: simply open it in a browser

18 XML main notions There are three common terms used to describe parts of an XML document: tags elements attributes David Gilmour David Gilmour David Gilmour

19 Rule: The root element An XML document must be contained in a single element Hello, World! Hello, World! Hola, el Mundo!

20 Rule: Elements can't overlap Invalid XML documents: John Brown My name is John Brown.

21 Rule: End tags are required You can't leave out any end tags If an element contains no markup at all it is called an empty element In empty elements in XML documents, you can put the closing slash in the start tag My name is John Brown I am 25 years old...

22 Rule: Elements are case sensitive In HTML, and are the same; in XML, they're not Elements are case sensitive Elements are case sensitive

23 Rule: Quoted attribute values There are two rules for attributes in XML documents: Attributes must have values Those values must be enclosed within quotation marks (single or double)
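The rules on the preceding slides can be tried against the JAXP parser that ships with the JDK. The following is a minimal sketch, not taken from the original slides: the class name `WellFormedCheck` and the sample strings in `main` are hypothetical, chosen to exercise the root-element, overlap, end-tag, case-sensitivity and quoted-attribute rules.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class WellFormedCheck {

    /** Returns true if the parser accepts the text as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default error handler, which prints to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser must reject documents that break the basic rules.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<greeting>Hello, World!</greeting>",                    // single root
            "<greeting>Hello</greeting><greeting>Hola</greeting>",   // two roots
            "<p><b>John <i>Brown</b></i></p>",                       // overlapping elements
            "<p>My name is John Brown",                              // missing end tag
            "<a>text</A>",                                           // case mismatch
            "<img src=a.png/>",                                      // unquoted attribute
            "<img src='a.png'/>"                                     // empty element, quoted
        };
        for (String sample : samples) {
            System.out.println(isWellFormed(sample) + "  " + sample);
        }
    }
}
```

Only the first and last samples should be accepted; every other one violates exactly one of the rules above.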

24 XML declarations Most XML documents start with an XML declaration that provides basic information about the document to the parser An XML declaration is recommended, but not required <?xml version="1.0" encoding="UTF-8" standalone="no"?>

25 XML document as a tree Conceptually, an XML document is a hierarchical structure called an XML tree Although there is no consensus on the terminology used on XML trees, at least two standard terminologies exist: XPath Data Model XML Information Set http://www.ibm.com/developerworks/xml/library/x-hands-on-xsl/

26 Namespaces Different XML languages may use the same tags Namespaces are a solution to the name-clash problem <customer_summary xmlns:addr="http://www.xyz.com/addresses/" xmlns:books="http://www.zyx.com/books/" xmlns:mortgage="http://www.yyz.com/mortage/">... Mrs....... Lord of the Rings...... NC2948-388-1983...

27 Namespaces XML namespaces are similar to Java packages The string in a namespace definition looks like a URL, but it’s just a string! For simplicity, unprefixed element names are assigned a default namespace ( xmlns=“ ” ) Can be overridden using a declaration of the form xmlns=“URI”
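To make the name-clash point concrete, here is a small sketch that looks up two same-named elements by namespace URI. It reuses the `addr` and `books` URIs from the slide, but the element name `title` and the class `NamespaceDemo` are assumptions made for illustration, since the element names did not survive in the transcript.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NamespaceDemo {

    // Two elements share the local name "title" (an assumed name) but
    // live in different namespaces, so they do not clash.
    static final String XML =
        "<customer_summary xmlns:addr='http://www.xyz.com/addresses/'"
      + " xmlns:books='http://www.zyx.com/books/'>"
      + "<addr:title>Mrs.</addr:title>"
      + "<books:title>Lord of the Rings</books:title>"
      + "</customer_summary>";

    /** Returns the text of the first "title" element in the given namespace, or null. */
    static String titleIn(String namespaceUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Without this flag the parser treats "addr:title" as an opaque name.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(XML)));
            NodeList hits = doc.getElementsByTagNameNS(namespaceUri, "title");
            return hits.getLength() == 0 ? null : hits.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titleIn("http://www.xyz.com/addresses/"));
        System.out.println(titleIn("http://www.zyx.com/books/"));
    }
}
```

The lookup is keyed on the URI string, not on the prefix, which is why the prefix can be renamed freely in a document without changing its meaning.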

28 Defining document content The elements of particular XML language have to be defined in some way A schema is a formal definition of the syntax of an XML-based language Two main schema languages: DTD XML Schema

29 DTD - Document Type Definition Built-in schema language since the first XML working draft DTD is not itself written in XML notation

30 Document Type Declaration An XML document may contain a reference to a DTD schema XHTML documents often contain: <!DOCTYPE people SYSTEM "http://www.music.com/people.dtd"> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

31 DTD – Element declaration An element declaration looks as follows: <!ELEMENT element-name content-model> The content model defines the validity requirements for the contents (the sequence of immediate child nodes) of all elements of the given name

32 DTD – Content model Constructs used in content model description: EMPTY (empty contents); ANY (any contents); #PCDATA (character data); element name (an element); , (concatenation); | (union); ? (optional); * (zero or more repetitions); + (one or more repetitions)

33 DTD: example

34 DTD: Attribute-List declarations An attribute-list declaration looks as follows: <!ATTLIST element-name attribute-definitions> attribute-definitions is a list, each element of the form: attribute-name attribute-type default-declaration Default declarations: #REQUIRED (required); #IMPLIED (optional, no default); "value" (optional, value is the default); #FIXED "value" (as the previous, but only this value is permitted)

35 DTD: examples <!ATTLIST img alt CDATA #REQUIRED src CDATA #REQUIRED width CDATA #IMPLIED height CDATA #IMPLIED> 123 15th St. Troy NY 12180
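The DTD constructs above can be exercised with a validating JAXP parser. This is a sketch under stated assumptions: the `people`/`person`/`name`/`surname` structure and the `id` attribute are borrowed from the Digester slide later in the deck, but the DTD itself and the class name `DtdValidation` are invented here, and the DTD is placed in the internal subset so the example needs no external file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class DtdValidation {

    // Hypothetical DTD: zero or more persons, each with a name, a surname
    // and a required id attribute.
    static final String DTD =
        "<!DOCTYPE people [\n"
      + "  <!ELEMENT people (person*)>\n"
      + "  <!ELEMENT person (name, surname)>\n"
      + "  <!ATTLIST person id CDATA #REQUIRED>\n"
      + "  <!ELEMENT name (#PCDATA)>\n"
      + "  <!ELEMENT surname (#PCDATA)>\n"
      + "]>\n";

    /** Parses DTD + body and returns the validity errors the parser reported. */
    static List<String> validate(String body) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);   // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Validity violations arrive as recoverable errors; collect them.
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage());
                }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
        } catch (Exception e) {
            // Reached only for configuration problems or non-well-formed input.
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(validate(
            "<people><person id='1'><name>David</name>"
          + "<surname>Gilmour</surname></person></people>"));
        System.out.println(validate(
            "<people><person><name>David</name>"
          + "<surname>Gilmour</surname></person></people>"));
    }
}
```

A document can be well-formed and still fail here: the second call in `main` parses without a fatal error but reports the missing `#REQUIRED` attribute.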

36 XML Schema Shortly after XML 1.0, the W3C initiated the development of a next-generation schema language to address the problems with DTD Some judicious guiding design principles: the new schema language should be More expressive than XML DTD Expressed in XML Self-describing Simple enough

37 XML Schema Specification Published in 2001 The specification consists of the following parts: Part 0 - Primer: http://w3.org/TR/xmlschema-0 Part 1 - Document structures: http://w3.org/TR/xmlschema-1 Part 2 - Datatypes: http://w3.org/TR/xmlschema-2

38 XML Schema Unfortunately, the resulting language does not fulfill the original requirements It provides good support for namespaces, modularization and datatypes, but It is not simple – Part 1 alone is more than 160 pages, and even XML experts do not find it human-readable It is not fully self-describing – there is a schema for XML Schema, but it doesn’t capture all syntactical aspects of the language

39 XML Schema advantages Several advantages over DTDs XML schemas use XML syntax You can process a schema just like any other document XML schemas support datatypes Integers, floating point numbers, dates, times, strings, URLs XML schemas are extensible User-defined datatypes, derived datatypes XML schemas have more expressive power XML schemas support namespaces

40 XSD An XML Schema instance is an XML Schema Definition (XSD) and typically has the filename extension ".xsd" <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

41 Example: people.xsd

42 Declaring XML Schema To declare that people.xml uses people.xsd schema, need to add the following: <people xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="people.xsd">... <people xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation= "http://www.ante.lv/lab01-music-serverside/data/people.xsd">...

43 XML Schema: Defining elements To define an element is to define its name and content model (type) A type can be simple or complex A simple type cannot contain elements or attributes in its value A complex type can create the effect of embedding elements in other elements or it can associate attributes with an element

44 Simple, non-nested elements An element that does not contain attributes or other elements can be defined to be of a simple type predefined user-defined http://www.ibm.com/developerworks/xml/library/xml-schema/sidetable2.html

45 Complex types Elements with attributes must have a complex type Elements that embed other elements must have a complex type

46 Expressing constraints on elements XML Schema offers greater flexibility than DTD for expressing constraints on the content model of elements For example, element occurrence definition: DTD: * + ? XML Schema : maxOccurs minOccurs

47 XML validation Online XML validator against XML Schema: http://tools.decisionsoft.com/schemaValidate/ The Java API also provides a way to make an XML parser validate a document
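One way to validate from Java is the `javax.xml.validation` package in the JDK. The sketch below is illustrative only: the schema is a made-up stand-in for people.xsd (whose content is not in the transcript), and the `minOccurs`/`maxOccurs` limits and the `xsd:positiveInteger` type are chosen to show occurrence constraints and datatypes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

final class XsdValidation {

    // Hypothetical schema: 1 to 3 persons, each with name, surname and a
    // required id that must be a positive integer.
    static final String XSD =
        "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
      + "<xsd:element name='people'><xsd:complexType><xsd:sequence>"
      + "<xsd:element name='person' minOccurs='1' maxOccurs='3'>"
      + "<xsd:complexType><xsd:sequence>"
      + "<xsd:element name='name' type='xsd:string'/>"
      + "<xsd:element name='surname' type='xsd:string'/>"
      + "</xsd:sequence>"
      + "<xsd:attribute name='id' type='xsd:positiveInteger' use='required'/>"
      + "</xsd:complexType></xsd:element>"
      + "</xsd:sequence></xsd:complexType></xsd:element>"
      + "</xsd:schema>";

    /** Returns true if the document is valid against the schema above. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no error handler set, the first validity error is thrown.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static String person(String id) {
        return "<person id='" + id + "'><name>Nick</name><surname>Mason</surname></person>";
    }

    public static void main(String[] args) {
        System.out.println(isValid("<people>" + person("1") + "</people>"));
        System.out.println(isValid("<people>" + person("abc") + "</people>"));
        System.out.println(isValid("<people/>"));
    }
}
```

The `abc` id fails on the datatype and the empty `people` element fails on `minOccurs`, two checks that a DTD with `CDATA` attributes and `*`/`+` operators cannot express as precisely.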

48 XML processing APIs The three basic XML parsing interfaces are: Document Object Model (DOM) Simple API for XML (SAX) Streaming API for XML (StAX) Java API for XML Processing (JAXP) Provides common interfaces for processing XML documents (using DOM, SAX or StAX) XML to Java classes binding Java Architecture for XML Binding (JAXB) Digester

49 DOM The Document Object Model defines a set of interfaces to the parsed version of an XML document The parser reads in the entire document and builds an in-memory tree Your code can then use the DOM interfaces to manipulate the tree

50 DOM Using DOM API you can move through the tree to see what the original document contained delete sections of the tree rearrange the tree add new branches and so on...
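As a rough illustration of these tree operations, the sketch below parses a small document, deletes one branch and adds another. The class name `DomEdit`, the method name and the flat `person` elements are hypothetical and kept deliberately simple.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DomEdit {

    /**
     * Removes the first person element, appends a new one with the given
     * text, and returns the text of all person elements afterwards.
     */
    static List<String> dropFirstAndAppend(String xml, String newPerson) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();

            // The NodeList is live: it reflects later changes to the tree.
            NodeList persons = root.getElementsByTagName("person");

            // Delete a section of the tree.
            Node first = persons.item(0);
            if (first != null) {
                first.getParentNode().removeChild(first);
            }

            // Add a new branch.
            Element added = doc.createElement("person");
            added.setTextContent(newPerson);
            root.appendChild(added);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < persons.getLength(); i++) {
                result.add(persons.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(dropFirstAndAppend(
            "<people><person>Gilmour</person><person>Wright</person></people>",
            "Mason"));
    }
}
```

Everything here operates on the in-memory tree; nothing is written back until the document is serialized, for example with a `Transformer`.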

51 DOM issues The DOM builds an in-memory tree of an entire document. If the document is very large, this requires a significant amount of memory. This could also cause a significant delay. The DOM creates objects that represent everything in the original document, including elements, text, attributes, and whitespace. It may be extremely wasteful to create all those objects that will never be used.

52 SAX To get around the DOM issues, the XML-DEV participants created the SAX interface A SAX parser sends events to your code The parser tells you when it finds the start of an element the end of an element text the start or end of the document and so on...

53 SAX You decide which events are important to you A SAX parser doesn't create any objects at all You decide what kind of data structures you want to create to hold the data from SAX events

54 SAX issues SAX events are stateless A SAX event simply gives you the text that was found; it does not tell you what element contains that text. You have to write the state management code yourself. SAX events are not permanent If your application needs a data structure that models the XML document, you have to write that code yourself SAX is not controlled by a centrally managed organization (such as the W3C)
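The state-management point can be seen in a few lines. In this sketch (class name `SaxSurnames` and the `surname` element are assumptions borrowed from the deck's people example), the handler has to remember for itself that it is inside a `surname` element, because the `characters` event carries only text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxSurnames {

    /** Collects the text of every surname element using SAX events only. */
    static List<String> surnames(String xml) {
        List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // Hand-written state: are we currently inside <surname>?
            private boolean inSurname;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if ("surname".equals(qName)) {
                    inSurname = true;
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one text node, so append.
                if (inSurname) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("surname".equals(qName)) {
                    result.add(text.toString());
                    inSurname = false;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(surnames(
            "<people>"
          + "<person><name>David</name><surname>Gilmour</surname></person>"
          + "<person><name>Richard</name><surname>Wright</surname></person>"
          + "</people>"));
    }
}
```

No tree is built: the only data kept after parsing is the list the handler chose to fill, which is the trade-off against DOM described above.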

55 Proprietary XML parsers in Java jDOM http://www.jdom.org Xerces http://xerces.apache.org/xerces-j/ Woodstox http://woodstox.codehaus.org/ It is recommended to use a standard: Java API for XML Processing

56 Problem: the process of creating, for example, a DOMParser object in a Java program differs from one DOM parser to the next JAXP provides common interfaces for processing XML documents (using DOM, SAX, StAX or XSLT) JAXP provides interfaces such as the DocumentBuilderFactory and the DocumentBuilder that provide a standard interface to different parsers

57 JAXP DOM API diagram http://docs.oracle.com/javase/tutorial/jaxp/dom/index.html

58 JAXP SAX API diagram http://docs.oracle.com/javase/tutorial/jaxp/sax/index.html

59 JAXP + DOM: parsing XML

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true);
    factory.setValidating(true);
    // Validate against XML Schema rather than a DTD
    factory.setAttribute(
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage",
        "http://www.w3.org/2001/XMLSchema");
    DocumentBuilder db = factory.newDocumentBuilder();
    org.w3c.dom.Document doc = db.parse(input);

    // process the DOM document
    Element root = doc.getDocumentElement();
    for (Node node = root.getFirstChild(); node != null;
         node = node.getNextSibling()) {
        ...
    }

60 JAXP + DOM: creating XML [1]

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder docBuilder = factory.newDocumentBuilder();
    Document doc = docBuilder.newDocument();

    // Root element
    Element root = doc.createElement("music-summary");
    doc.appendChild(root);

    // Child element with a text node
    Element reportId = doc.createElement("report-id");
    String reportIdString = generateUniqueId();
    Text reportIdText = doc.createTextNode(reportIdString);
    reportId.appendChild(reportIdText);
    root.appendChild(reportId);
    ...

61 JAXP + DOM: creating XML [2]

    ...
    // An identity Transformer serializes the DOM tree to a file
    TransformerFactory transfactory = TransformerFactory.newInstance();
    Transformer transformer = transfactory.newTransformer();
    transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    FileWriter fw = new FileWriter(outputFile);
    StreamResult result = new StreamResult(fw);
    DOMSource source = new DOMSource(doc);
    transformer.transform(source, result);
    fw.close();  // flush and release the file

62 Java Architecture for XML Binding JAXB allows Java classes to be mapped to XML representations Steps using JAXB: Bind the schema for the XML document Unmarshal the document into Java content objects http://docs.oracle.com/javase/tutorial/jaxb/

63 Apache Jakarta Commons Digester Digester is a layer on top of the SAX API to make it easier to process XML input Digester makes it easy to create and initialise a tree of objects based on an XML input file The developer needs to write rules that tell Digester how to map input XML into Java objects Digester supports only one-way mapping: XML → Java objects

64 Code sample: Digester

    Digester digester = new Digester();
    digester.setValidating(false);
    // Create a PeopleHolder for <people> and a Person for each <person>
    digester.addObjectCreate("people", PeopleHolder.class);
    digester.addObjectCreate("people/person", Person.class);
    // Map the id attribute to the idString property
    digester.addSetProperties("people/person", "id", "idString");
    // Map element text to bean properties
    digester.addBeanPropertySetter("people/person/name", "name");
    digester.addBeanPropertySetter("people/person/surname", "surname");
    // Pass each finished Person to addPerson() on the parent object
    digester.addSetNext("people/person", "addPerson");

    PeopleHolder peopleHolder = (PeopleHolder) digester.parse(input);
    Vector people = peopleHolder.getPeople();

65 Other XML standards XSL (Extensible Stylesheet Language) XSLT (XSL Transformations) XPath (XML Path Language) XLink, XPointer XML security Web Services SOAP, WSDL, UDDI SVG, SMIL Many more...

66 JSON

67 JSON (JavaScript Object Notation) is a lightweight computer data interchange format Text-based, human-readable format for representing simple data structures and associative arrays easy for humans to read and write easy for machines to parse and generate Is based on a subset of the JavaScript programming language MIME type: application/json

68 JSON example

    {
      "firstName": "John",
      "lastName": "Smith",
      "address": {
        "streetAddress": "21 2nd Street",
        "city": "New York",
        "state": "NY",
        "postalCode": 10021
      },
      "phoneNumbers": [
        { "type": "home", "number": "212 555-1234" },
        { "type": "fax", "number": "646 555-4567" }
      ],
      "newSubscription": false,
      "companyName": null
    }

JSON Web page: http://json.org/

69 JSON structure http://json.org/

70 JSON from Facebook { "data": [ { "name": "Ann Blue", "id": "100002771239557" }, { "name": "David Green", "id": "100002808391341" } ] } Friends { "data": [ { "name": "Second event", "start_time": "2011-10-04T16:00:00", "end_time": "2011-10-04T18:00:00", "location": "14. auditorija", "id": "196365027094566", "rsvp_status": "attending" } ], "paging": { "previous": "https://graph.facebook.com/100002774971272/events?format=json&limit=25&since=1317744000", "next": "https://graph.facebook.com/100002774971272/events?format=json&limit=25&until=1317744000" } } FriendsEvents

71 User Profile JSON from Google+ { "kind": "plus#person", "etag": "\"GZR2X3-UK6zXRwPjCsTmgE7l6CI/feNs8dXzP9_SaZJBtANkXqtESTI\"", "urls": [ { "value": "https://plus.google.com/107192656717644038166" }, { "value": "https://www.googleapis.com/plus/v1/people/107192656717644038166" } ], "id": "107192656717644038166", "displayName": "Brian Red", "name": { "familyName": "Red", "givenName": "Brian" }, "url": "https://plus.google.com/107192656717644038166", "image": { "url": "https://lh4.googleusercontent.com/-S_Y0PMoBgT0/AAAAAAAAAAI/AAAAAAAAAAA/xzau2wEOUo8/photo.jpg?sz=50" } }

72 Venues JSON from Foursquare {... response: { groups: [ {... items: [ {... venue: { id: "4cf4102d899c6ea84fd0fec1" name: "Innocent Cafe"... } venue: { id: "4c93a04a58d4b60c2b012129" name: "MiiT"... }... }]........................................................ }

73 JSON parsers JSON-lib http://json-lib.sourceforge.net/ Google’s GSON http://code.google.com/p/google-gson/ FlexJSON http://flexjson.sourceforge.net/ Java API for JSON Processing: a part of Java EE 7 (standard)

74 JSON-lib JSON-lib is a Java library for transforming beans, maps, collections, arrays and XML to JSON and back again to beans Very simple for core JSON tasks Maven dependency: groupId net.sf.json-lib, artifactId json-lib, version 2.4, classifier jdk15

75 JSON-lib JSON object == net.sf.json.JSONObject JSON array == net.sf.json.JSONArray net.sf.json.JSONSerializer can transform any Java object to JSON notation and back with a simple and clean interface, leveraging all the builders in JSONObject and JSONArray

76 JSON-lib example

    import net.sf.json.*;

    InputStream response = httpClient.download(url);
    String content = IOUtils.toString(response);  // Apache Commons IO
    Object json = JSONSerializer.toJSON(content);
    if (json instanceof JSONObject) {
        JSONObject jsonObject = (JSONObject) json;
        Object data = jsonObject.get("data");
        if (data instanceof JSONArray) {
            JSONArray jsonArray = (JSONArray) data;
            for (Object jsonElement : jsonArray) {
                if (jsonElement instanceof JSONObject) {
                    JSONObject jsonFriend = (JSONObject) jsonElement;
                    String friendId = (String) jsonFriend.get("id");
                    String friendName = (String) jsonFriend.get("name");
                    ...
                }
            }
        }
    }

Apache Commons IO http://commons.apache.org/proper/commons-io/javadocs/api-2.4/org/apache/commons/io/IOUtils.html

77 GSON Library that can be used to convert: Java Objects into their JSON representation JSON string to an equivalent Java object Maven dependency: groupId com.google.code.gson, artifactId gson, version 2.2.4

78 GSON example

1. Define a class:

    public class GooglePlusUser {
        private String id;
        private Name name;

        class Name {
            public String givenName;
            public String familyName;
            public Name() {}
        }
        ...
    }

2. Parse JSON:

    Gson gson = new Gson();
    GooglePlusUser googleUser = gson.fromJson(json, GooglePlusUser.class);

79 Java API for JSON Processing Provides portable APIs to parse, generate, transform, and query JSON using: Object model API (similar to DOM) creates a random-access, tree-like structure that represents the JSON data in memory Streaming API provides a way to parse and generate JSON in a streaming fashion http://www.oracle.com/technetwork/articles/java/json-1973242.html

80 YAML YAML is yet another human-readable data serialization format (first proposed in 2001) Takes concepts from programming languages such as C, Perl, and Python, and ideas from XML YAML is a recursive acronym for "YAML Ain't Markup Language" (data-oriented, rather than document markup) Early in its development: "Yet Another Markup Language"

81 YAML example

    ---
    receipt: Oz-Ware Purchase Invoice
    date: 2007-08-06
    customer:
      given: Dorothy
      family: Gale
    items:
      - part_no: A4786
        descrip: Water Bucket (Filled)
        price: 1.47
        quantity: 4
      - part_no: E1628
        descrip: High Heeled "Ruby" Slippers
        price: 100.27
        quantity: 1
    ...

82 JSON versus YAML JSON syntax is a subset of YAML 1.2 Most JSON documents can be parsed by a YAML parser JSON's semantic structure is equivalent to the optional "inline-style" of writing YAML The Official YAML Web Site: http://www.yaml.org/

83 XML Resources Book “An Introduction to XML and Web Technologies”, A. Moller and M. Schwartzbach, 2006 Articles, online tutorials, and other technical resources on XML standards and technologies http://www.ibm.com/developerworks/xml IBM developerWorks: Introduction to XML http://www.ibm.com/developerworks/edu/x-dw-xmlintro-i.html

84 XML Resources Java Tutorial: Java API for XML Processing (JAXP) http://docs.oracle.com/javase/tutorial/jaxp/ Java Tutorial: Introduction to JAXB http://docs.oracle.com/javase/tutorial/jaxb/intro/

85 JSON Resources Java API for JSON Processing: An Introduction to JSON http://www.oracle.com/technetwork/articles/java/json-1973242.html Java EE 7 Tutorial: JSON Processing http://docs.oracle.com/javaee/7/tutorial/doc/jsonp.htm

