How the World Wide Web works UC Santa Cruz CMPS 10 – Introduction to Computer Science 9 May 2011.

1 How the World Wide Web works UC Santa Cruz CMPS 10 – Introduction to Computer Science www.soe.ucsc.edu/classes/cmps010/Spring11 ejw@cs.ucsc.edu 9 May 2011

2 Homework #3
- Will be an assignment working with Context Free Art
  - Described in class today
- Context Free Art is a domain-specific programming language for creating computer-generated artwork
  - www.contextfreeart.org
  - Program is freely available for PC/Mac/Linux
  - Standalone version works in PC labs (BE 109)
- In the assignment, you will:
  1. Take an existing Context Free Art program and modify some of the numbers in it, and then describe the impact on the final artworks generated
  2. In the same program, add a few new lines to an existing rule, and then describe the impact on the final artworks
  3. In the same program, add a new duplicate rule, then describe the impact
  4. For extra credit, write a Context Free Art program that is completely new (from scratch), describe how it works, and give an example of some artwork

3 Homework #3
- Assignment due Friday, May 13
- Help sessions:
  - Tuesday, May 10, 3-5pm, E2 307
  - Thursday, May 12, 4-6pm, location TBD
- Assignment details now on web

4 The three key advances of the Web
- When Tim Berners-Lee invented the Web, he had to solve three key problems:
  - Addressing (URL)
    - How to uniquely identify each web page in a human-friendly way
  - Presentation (HTML)
    - How to present the information in a web page to a human reader in a standard way
  - Transport (HTTP)
    - How to quickly transmit information from a remote computer to the user’s computer
    - Especially, how to do this faster than FTP
- These three technologies form the technical pillars of the World Wide Web
[Photo: Tim Berners-Lee (1993)]

5 Development context of the web
- When the web was under development (1989-1990), there was no one dominant internet information system
  - Instead, there was a collection of systems, including Usenet News, FTP, and Gopher
- Many different types of machines
  - PCs and Macs were not typically connected to the Internet
  - Most users of the Internet were academics on Unix workstations
  - But many other kinds of computers existed as well: mainframes, DEC workstations, NeXT machines, etc.
- In general, it was a challenge to access information uniformly from all computers
  - A given information system might be accessible from some, but not all, types of computer

6 Addressing the internet, circa 1989/1990
- Before the web, there was no uniform way to describe a specific information resource
  - One might say, “Go to the FTP server at ftp.ucsc.edu, log in anonymously, then get file /games/rogue.tar”
  - Worked OK for humans, but:
    - Hard for computers to understand
    - Not very compact: what if you wanted to make a list of these?
    - Or, what if you wanted to embed them in a link in a document?
- Ideally, one might want a way of identifying internet resources that includes:
  - The way to access it (i.e., which protocol: FTP, Gopher, etc.)
  - The name of the machine that has the resource (i.e., a domain name like ftp.ucsc.edu)
  - The local name of the resource: once you get to a given machine with a given protocol, what identifier (name) should be used to retrieve the resource?

7 URL
- Uniform Resource Locator
  - Uniform: the same format is used in a standard way for identifying all resources on the Internet
  - Resource: a specific chunk of information available on the internet, or a specific computational process (e.g., “current temperature”) on the internet
  - Locator: contains all of the information necessary to retrieve a web resource
- A URL should contain, within itself, all of the information necessary to identify and retrieve a specific resource on the internet
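As a rough illustration of how a URL packs the access method, machine name, and local name into one string, the spoken FTP instructions from the earlier slide can be assembled with Python's standard `urllib.parse` module. The helper name `make_url` is hypothetical, and the sketch only builds the string; it does not contact any server.

```python
from urllib.parse import urlunsplit


def make_url(scheme, host, path):
    """Combine protocol, machine name, and local name into one URL string."""
    # urlunsplit takes (scheme, netloc, path, query, fragment);
    # the query and fragment are left empty in this sketch.
    return urlunsplit((scheme, host, path, "", ""))


# "Go to the FTP server at ftp.ucsc.edu ... then get file /games/rogue.tar"
url = make_url("ftp", "ftp.ucsc.edu", "/games/rogue.tar")
print(url)
```

The resulting single string is compact enough to put in a list or embed in a document link, which is the property the slide emphasizes.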

8 Significance of the URL
- Any web resource can be retrieved using a URL
  - That is, URLs work at Internet scale
  - Most hypertext systems prior to the Web did not work at Internet scale; doing so required a significant shift in thinking
- Provides a standard way of writing down internet identifiers
- Extensible syntax: not limited to just the protocols and information systems available when URLs were invented
  - Has room to grow well into the future
- Compact representation
  - Easy to write down and share with other humans
  - Easy to embed within computer documents (such as links in HTML)
- Could easily map to file system paths, but was not required to map exactly to file system paths
  - The web was not intended to be a wide area network file system
- In retrospect, URLs have a kind of obvious, inevitable quality. This was not obvious at the time, and several other schemes existed prior to URLs that lacked some of the qualities above.

9 Anatomy of a URL
- {scheme}://{scheme-specific part}
- Usually:
  - {scheme}://{domain name}/{local identifier}
- For Web URLs:
  - Scheme = http (i.e., the protocol used to access it, HTTP)
  - Domain name = the name of the machine holding the web resource
  - Local identifier
    - A name that the machine identified by “domain name” understands, and can use to access a specific resource
    - Often a file name
    - Not required to be a file name
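The three parts named above can be pulled back out of a URL with `urllib.parse.urlsplit` from the Python standard library. This is a minimal sketch: the function name `anatomy` is hypothetical, and the `http://` prefix on the course address is an assumption, since the title slide lists the address without a scheme.

```python
from urllib.parse import urlsplit


def anatomy(url):
    """Split a URL into the parts named on the slide."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,            # how to access the resource
        "domain name": parts.hostname,     # which machine holds it
        "local identifier": parts.path,    # name that machine understands
    }


# Assumed scheme: the slide gives only www.soe.ucsc.edu/classes/cmps010/Spring11
info = anatomy("http://www.soe.ucsc.edu/classes/cmps010/Spring11")
for name, value in info.items():
    print(f"{name}: {value}")
```

Note that the local identifier here looks like a file path, but nothing in the URL requires the server to store it as a file.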

10 Document formats, circa 1989/90
- In 1989/90, there were two main document formats on the PC: Word and WordPerfect
  - But these only worked on PCs
  - Complex document formats, mostly publicly undocumented
    - Not really feasible for a single programmer to write a universal viewer for Word or WordPerfect documents
  - No Unix support
- On Unix, FrameMaker, LaTeX, and plain text were the main file formats
  - FrameMaker was also complex, like Word, and tied to a specific application
  - LaTeX was freely available, but required a compilation step to produce output (too slow for a web browser)
  - Text was freely available, and rendered quickly, but had limited typographic potential
    - Fixed-width fonts
    - No choice of fonts, variable-width fonts, lists, tables, etc.
- None of these document formats had built-in hypertext links

11 HTML
- In order to create a universal viewer for web resources, a new, simple document language was needed
- Through the 1980s, the Standard Generalized Markup Language (SGML) was in limited use
  - But the ideas behind SGML were known to most hypertext system researchers
- HTML took SGML and simplified it
  - Followed the same syntax, more or less, and created a set of standard tags (elements) for describing simple documents
- HTML: Hypertext Markup Language
  - Hypertext: the language needed to support hypertext linking
  - Markup: the style of language was a markup language
    - Textual content has additional annotations added that specify structure and formatting
  - Language: not a programming language, but a domain-specific document description language

12 Significance of HTML
- A simple document format that:
  - Is (relatively) easy to render to a screen
  - Can be edited using simple text editors
  - Is more-or-less human-readable
  - Is relatively space efficient
- An open standard
  - The document format is not owned and controlled by a single for-profit company
  - Prevents lock-in to a standard where a company might start charging high licensing fees for its use
- Implementation of HTML has grown to the point where it is supported by most general purpose computers
  - Writing a document in HTML means that you don’t have to worry whether people will be able to view it on their computer

13 HTML document structure
- Learning HTML is complex, and beyond the scope of a single lecture
- A document consists of a metadata portion (the head) and the main content (the body)
<html>
  <head> … metadata goes here, such as the title of the page … </head>
  <body> … main page content goes here … </body>
</html>
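To make the head/body split concrete, here is a small sketch using Python's standard `html.parser` module. The page text in `PAGE` and the class name `SectionSplitter` are made up for illustration; the parser simply records which section each piece of text falls in.

```python
from html.parser import HTMLParser

# A tiny illustrative page: metadata in the head, content in the body.
PAGE = """<html>
<head><title>Sample page</title></head>
<body><p>Main page content goes here.</p></body>
</html>"""


class SectionSplitter(HTMLParser):
    """Collect text separately for the head and the body of a document."""

    def __init__(self):
        super().__init__()
        self.section = None                      # "head", "body", or None
        self.text = {"head": [], "body": []}

    def handle_starttag(self, tag, attrs):
        if tag in self.text:                     # entering <head> or <body>
            self.section = tag

    def handle_endtag(self, tag):
        if tag == self.section:                  # leaving </head> or </body>
            self.section = None

    def handle_data(self, data):
        # Keep non-blank text that appears inside one of the two sections.
        if self.section and data.strip():
            self.text[self.section].append(data.strip())


splitter = SectionSplitter()
splitter.feed(PAGE)
splitter.close()
print(splitter.text)
```

The title ends up under the head and the paragraph under the body, matching the metadata/content division described on the slide.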

14 HTML markup
- The general idea of markup is that you add annotations to the core text of a document
- The annotations indicate the type of text (i.e., structural markup, such as heading, list item, etc.) or the presentation of the text (italics, bold, etc.)
- HTML elements “mark up” the text in the document
[Slide example: the HTML source and the rendered view of a page labeled “Sample page”, containing “Chapter 1 heading” followed by the paragraph “This is a paragraph. It has sentences. It also has an italicized and a bolded word.”]
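One way to see the difference between the annotations and the core text is to separate them programmatically. The sketch below uses Python's standard `html.parser`; the specific tags in `SNIPPET` (`h1`, `p`, `i`, `b`) are an assumption chosen for illustration, and `MarkupLister` is a hypothetical class name.

```python
from html.parser import HTMLParser

# Assumed markup for the slide's sample text (tag choice is illustrative).
SNIPPET = (
    "<h1>Chapter 1 heading</h1>\n"
    "<p>It also has an <i>italicized</i> and a <b>bolded</b> word.</p>"
)


class MarkupLister(HTMLParser):
    """Record the annotations (tags) separately from the core text."""

    def __init__(self):
        super().__init__()
        self.annotations = []    # element names, in order of appearance
        self.core_text = []      # text with the markup removed

    def handle_starttag(self, tag, attrs):
        self.annotations.append(tag)

    def handle_data(self, data):
        self.core_text.append(data)


lister = MarkupLister()
lister.feed(SNIPPET)
lister.close()
core = "".join(lister.core_text)
print("annotations:", lister.annotations)
print("core text:", core)
```

Here `h1` and `p` are structural annotations (heading, paragraph), while `i` and `b` are presentational ones (italics, bold); the core text is what remains once they are stripped.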

15 HTTP
- HTTP: Hypertext Transfer Protocol
  - An Internet protocol that is used by Web browsers to retrieve information (web pages, images, etc.) from a remote web server
- Key qualities
  - Much faster than FTP for retrieving Internet resources
  - Extensible design, which permitted the web to grow to large scale
- See example
  - http://upload.wikimedia.org/wikipedia/commons/c/c6/Http_request_telnet_ubuntu.png
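An HTTP exchange is plain text, which is why it can be typed by hand into a telnet session. The sketch below shows roughly what a browser sends and how the first line of a reply can be read. It makes no network connection: `SAMPLE_RESPONSE` is a made-up reply for illustration, the helper names are hypothetical, and the host and path are taken from the course address on the title slide.

```python
def build_get_request(host, path):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, local identifier, version
        f"Host: {host}",          # the domain name part of the URL
        "Connection: close",      # ask the server to close after replying
        "",                       # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Extract (version, status code, reason) from the first response line."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


request = build_get_request("www.soe.ucsc.edu", "/classes/cmps010/Spring11")
print(request.decode("ascii"))

# Illustrative reply only; a real server's headers and body would differ.
SAMPLE_RESPONSE = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"
print(parse_status_line(SAMPLE_RESPONSE))
```

Notice how the pieces of the URL reappear in the request: the domain name goes in the `Host` header and the local identifier goes in the request line.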

