IMS5401 Web-based Systems Development Topic 2: Elements of the Web (a)Connecting computers (the Internet) (b)Connecting documents (hypertext)
2 Agenda 1.Network elements of the web 2.Topic 2(a): The Internet 3.Topic 2(b): Hypertext and the web 4.Some implications of the web’s networks for its applications 5.Summary
3 Elements of the Web THE WEB Connecting computers Digital representation of documents Display and organisation of documents Linking documents
Network elements of the web The web is based on two communication networks: A network of computers and other digital devices (called the Internet) A network of documents which are stored on Internet-based computers The characteristics of these networks determine the usefulness of the web for any application Different networks have unique features (eg phone, TV, radio, mail, road, etc), but are based on common general concepts
5 Basic network concepts Nodes Identifying and naming nodes Links ‘Rules’ for connecting via links (protocols) Centralised vs de-centralised networks Localised vs distributed networks
6 Some key requirements for an effective network Commonality of need and requirements across user population Accessibility at affordable cost Compatibility of technology Standards for common usage Simplicity of usage Control and management structures Note that different purposes/applications may also have their own specialist network needs
The Internet as a network of computers: Brief history 1940s Theory of digital machine communication (Shannon and Weaver) 1962 RAND Corporation: redundant networks, packet switching and message routing ARPANET and other networks 1973 Work on TCP/IP begins 1974 First use of term “Internet” 1983 establishment of the Internet with TCP/IP and DNS Web is developed on the Internet
8 Internet elements: Nodes and addresses An Internet address is the address of a computer or digital device Every device on the Internet has to have a unique address - called an IP address (IP = Internet Protocol) (like house addresses) IP addresses are a string of numbers For simplicity, use ‘domain names’ instead of numbers - eg sims.monash.edu.au The Domain Name System keeps track of domain names and IP addresses
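As a rough illustration of the slide above: a version 4 IP address is a single 32-bit number, usually written as four decimal numbers separated by dots. The sketch below uses Python's standard `ipaddress` module. The address 192.0.2.1 is an assumption, taken from a range reserved for documentation; it is not an address from the lecture.

```python
import ipaddress

# 192.0.2.1 is a placeholder from a documentation-only range (an assumption).
address = ipaddress.ip_address("192.0.2.1")

# The dotted form is a readable way of writing one 32-bit number.
as_number = int(address)

# Converting the number back gives the same dotted form.
round_trip = str(ipaddress.ip_address(as_number))

print(address.version, as_number, round_trip)
```

Domain names exist precisely so that people do not have to remember numbers like these.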
9 Internet elements: Maintaining Nodes - The Domain Name System (DNS) The DNS spreads the task of maintaining address information for each node DNS servers maintain and share a distributed database of domain names and IP addresses Each DNS server maintains addresses in its domain (eg Monash DNS looks after all addresses in the .monash.edu.au domain) DNS servers pass requests for access to a particular address through the DNS until a server recognises the address (or sends an error message if the address can’t be found)
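The idea of passing a request along until some server recognises the name can be sketched with a toy model. This is a simplification under stated assumptions, not how real DNS software is written: the `NameServer` class is hypothetical, the address 192.0.2.10 is a made-up documentation address, and real resolvers also use referrals and caching.

```python
class NameServer:
    """A toy name server that knows its own records and its sub-domains."""

    def __init__(self, domain, records=None, children=None):
        self.domain = domain            # the domain this server looks after
        self.records = records or {}    # domain name -> IP address
        self.children = children or []  # servers for sub-domains

    def resolve(self, name):
        # Answer directly if this server recognises the name.
        if name in self.records:
            return self.records[name]
        # Otherwise pass the request to the server for the matching sub-domain.
        for child in self.children:
            if name == child.domain or name.endswith("." + child.domain):
                return child.resolve(name)
        # No server recognises the name: report an error.
        raise LookupError(f"address not found: {name}")


# Hypothetical hierarchy: root -> au -> edu.au -> monash.edu.au
monash = NameServer("monash.edu.au", records={"sims.monash.edu.au": "192.0.2.10"})
edu_au = NameServer("edu.au", children=[monash])
root = NameServer("", children=[NameServer("au", children=[edu_au])])

print(root.resolve("sims.monash.edu.au"))
```

Each server only needs to know its own part of the name space, which is what makes the database distributed.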
10 Internet elements: Links Many different types of physical network link connections between Internet nodes are possible: Medium for connection Cables/microwaves/radio waves Permanence of connection Dedicated lines/shared lines Capabilities of connection Fast transmission/slow transmission More detail in lecture on data communications
11 Internet elements: Message transmission along links A protocol is a set of rules for transmitting and receiving data. TCP/IP is the standard protocol for data transmission on the Internet (Transmission Control Protocol = TCP; Internet Protocol = IP) Messages are sent via packet switching: message split into small pieces (packets); each packet transmitted separately; packets re-assembled at receiver end Protocol checks to ensure message is transmitted correctly; re-transmits until it is right
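Packet switching as described above can be sketched in a few lines. This is a minimal illustration, not TCP itself: the function names are hypothetical, packets are numbered by position, and the checking and re-transmission that TCP performs are left out.

```python
import random


def split_into_packets(message, size):
    """Split a message into (sequence_number, chunk) packets."""
    return [(i // size, message[i:i + size]) for i in range(0, len(message), size)]


def reassemble(packets):
    """Rebuild the message by sorting packets on their sequence number."""
    return "".join(chunk for _, chunk in sorted(packets))


message = "Packets may arrive in any order."
packets = split_into_packets(message, size=8)

# Simulate packets taking different routes and arriving out of order.
random.shuffle(packets)

rebuilt = reassemble(packets)
print(rebuilt)
```

The sequence number carried by each packet is what lets the receiver put the pieces back in the right order.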
Topic 2(b) Hypertext and the web as a network of documents: Brief history The need for cross-referencing of documents 1945 Vannevar Bush and Memex 1967 Ted Nelson and hypertext - the Xanadu project Tim Berners-Lee and CERN - the ‘web’ (mesh) of CERN documents The World-wide Web of documents
13 Birth of the web as a document network Developed by Tim Berners-Lee to help CERN staff find and develop connections between their work Berners-Lee thought the working structure of CERN was like a “web” - complex networks of inter-relationships; information access should match that (originally called his idea a “mesh”) Aim: to enable “… a pool of information to develop which could grow and evolve with the organisation and the projects it describes”
14 Features of Berners-Lee’s web proposal (not all were implemented) Based around hypertext connections between a wide range of documents and document types Different types of links to represent different relationships Internal to CERN (no mention of Internet) Device independence “Live” links See reference to Berners-Lee’s original proposal in the tute and unit resources
15 The CERN web becomes the world-wide web Berners-Lee posted his idea on a news group on the Internet; his idea and his software were distributed for free, and quickly spread world-wide The Internet was the obvious machine network on which to store the document network Others added and improved “the web” As uncontrolled as the Internet: entirely open and non-proprietary basic standards World Wide Web Consortium (W3C) set up to try to co-ordinate and direct its development
16 Nodes on the document network Every document (or page) has its own unique address The address of a web page is called its URL (Uniform Resource Locator) A web site is a collection of linked pages on a web server To get a document from a web site all I have to do is send the address of the document to the web server on which it is stored
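To make the parts of a URL concrete, the sketch below splits one with Python's standard `urllib.parse` module. The URL itself is a made-up example (an assumption), not one from the lecture.

```python
from urllib.parse import urlsplit

# A hypothetical URL, used only for illustration.
url = "http://www.example.edu/units/topic2/index.html"
parts = urlsplit(url)

print(parts.scheme)  # the protocol used to fetch the page
print(parts.netloc)  # the web server that stores the page
print(parts.path)    # the location of the page on that server
```

The server name in the middle is a domain name, so finding a page relies on the Internet's addressing as well as the document's own address.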
17 Managing the document nodes with web servers At a basic level, web servers do two things: –store web pages –receive and respond to requests for their web pages Web servers have specialist software to manage web pages and page requests Large organisations maintain dedicated web servers for managing web traffic Large web sites use computers specifically configured for web tasks, running server software (eg Apache, Microsoft Server 2000, etc)
18 Links Berners-Lee developed: –a simple document mark-up language (HTML) which allows a document to identify a link to another document –a simple program (called a browser) which can display HTML documents and respond to document links Clicking on a link reference causes the browser to send a request for that linked document to the address where it is stored The server where it is stored then sends a copy of the requested document to the browser which then displays it on your screen
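How a document identifies its links can be seen by reading them out of a fragment of HTML. The sketch below uses Python's standard `html.parser` module; the `LinkCollector` class and the sample page are hypothetical, written only for this illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the target address of every link (<a href="...">) in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # In HTML, a link is an <a> tag whose href attribute holds the address.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# A made-up page with two links.
page = (
    '<p>See <a href="topic2b.html">hypertext</a> and '
    '<a href="http://www.example.edu/">another site</a>.</p>'
)

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```

A browser does the same kind of reading, then requests the address behind whichever link the user clicks.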
19 Document transmission through the network The standard protocol which is used to transmit web pages is called HTTP (Hypertext Transfer Protocol) (hence the “http://” at the start of a URL) HTTP requests and responses are carried over TCP/IP for transmission on the Internet Note: under HTTP, there is no on-going connection established between the two computers (compare a telephone call with SMS messages)
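The request-and-response pattern, with no on-going connection, can be tried out on one machine. The sketch below is a minimal illustration using Python's standard `http.server` and `http.client` modules. The page content, the handler class and the `fetch_twice` function are assumptions made for this example, and the server runs only on the local machine.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# A made-up page that the toy server stores.
PAGE = b"<html><body><a href='/other.html'>A link</a></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Respond to every GET request with a copy of the same page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_twice():
    """Start a local server, then request the page twice, separately."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0 = any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        bodies = []
        for _ in range(2):
            # Each request opens its own connection and closes it afterwards.
            connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            connection.request("GET", "/index.html")
            bodies.append(connection.getresponse().read())
            connection.close()
        return bodies
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_twice()` returns two copies of the page. The server keeps no memory of the first request when it answers the second, which is the point of the SMS comparison.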
Implications of the web’s networks for its applications Any network has its own unique characteristics and features; its ability to support a particular use depends on the ‘fit’ between them (eg consider road vs train network, phone vs mail network, etc) Changing a network’s features to suit one purpose may hamper its ability to meet another As the developer of a web application, you must understand how your application fits the features of the web’s networks
(a) Key characteristics of the Internet as a communications network The level of automation of the network The lack of physical focus for the network (there is no ‘centre’; geography is irrelevant) The extent of the network (number of machines and users); its future extendability The lack of ‘ownership’ of the network The underlying philosophy of access/use and sharing (“information wants to be free”) (Compare with other communications networks – phone, mail, TV, radio, etc)
22 Features of the Internet as a network (1) Accessibility Initially accessible only via computer; now also from other devices (phone, fridge?!, etc) Initial set-up requires technical support; subsequent access is simple Stability Ever-changing and expanding (but not obvious to users) Many standards are still being established; driven by market and standards organisations Reliability Very robust and rarely breaks down DNS system is main point of vulnerability
23 Features of the Internet as a network (2) Authorisation of access Easy to enable access Virtually impossible to prevent access Knowledge requirements for users Minimal given appropriate support software (email package, web browser, etc) Finding things/people/etc can be very hard unless you know how Management and control Difficult to monitor because of its size Virtually impossible to regulate because of its de-centralisation, scope and extent
24 The Internet without other web features The Internet by itself can adequately support any activity which is about communication of messages - email, chat rooms, bulletin boards, etc No in-built capabilities for formatting, document management, hypertext, multimedia, etc Therefore, for document viewing and exchange the Internet needs more than just connectivity
25 The Internet before other web features Major uses were for email, file transfer, bulletin boards and news groups Usage characteristics: technically complex required a range of specialist tools (gopher, veronica, etc) non-standard interfaces/representation methods command-driven interface to ‘publish’ on the Internet you needed technical skills For anything more complex than email, the Internet was useful only for technically competent people
26 The Internet trade-off Potential positives: Ease of access Reliability of communication Expandability Open-ness to all (no ownership) Potential negatives: The expanding traffic stream Open-ness to mis-use and abuse Impossibility of regulation or control How will the balance between these affect your proposed application?
27 The future of the Internet? Some differing views: – The Tragedy of the Commons - will the Internet (and with it the web) go the same way? – The expanded Internet - the Internet fridge (and other devices)? – The regulated Internet? – The interplanetary Internet?! See resources for some Internet standards organisations and some analyses of the Internet Implications for applications??
28 4(b) Hypertext: The good and bad of hypertext document networking? Good? –Linking to related information –Linking to source information –Linking to more detailed information –Providing multiple paths to suit different needs Bad? –Linking to dead-ends or incorrect information –Confusing the reader with information overload –Confusing the reader over which path to take - lost in hyper-space –Enabling theft of document content
29 Key features of the web as a document network: documents/pages vs data The web was built to connect documents and pages, not data. Contrast with traditional IS which are based around data, not pages Sending pages creates issues for: –applications which only want data (discuss further in relation to interactivity) –the needs for document formatting for searching (discuss further in relation to mark-up languages) These problems have led to the need for XML to enable documents to be more data-oriented
30 Key features of the web as a document network: page-serving vs interactivity The web was built to send a page or document, but not to enable an exchange of information. Contrast with traditional IS which are based around information exchanges between user and system This creates problems for web applications which require information exchanges (discuss further in lecture on interactivity) These problems have led to complex work-arounds requiring scripting languages
31 Key features of the web as a document network: sending a copy vs allowing to read Document ownership did not matter for Berners-Lee’s web (“…these are of secondary importance at CERN, where information exchange is more important than secrecy”) This creates problems with: –copyright and control of ownership of material published on the web –controlling document access; how do I stop people accessing documents they should not have? Complex legal issues over copyright (Napster, etc) and censorship (pornography, etc)
32 Designing to fit within the limits of effective document connectivity Too many links can confuse the reader Maintaining links becomes an on-going site maintenance problem (dead links = frustrated reader; out-of-date links = mis-led reader) Getting ‘lost in hyperspace’ becomes a major risk for users, making good site navigation and site architecture vital What is a link for? Different types of links?
33 Document networks: the problem of scale As it gets bigger, a document network can hold more useful information and links … … but as it gets bigger, it gets harder to manage it and find things in it Tim Berners-Lee designed his web for one organisation (small scale); now it works globally (massive scale) How well is the structure of the web suited to this different scale? What does this mean for applications?
34 The trade-off for the web as a document network Document connectivity as implemented in the web meets some information needs … … but creates some new problems of its own Possible solutions: –Design and build systems in ways which compensate for the problems (see future topics on web design) –Modify the mark-up languages and software used for formatting and finding web pages (see future discussion of XML and search engines) –Re-design the entire web (see future discussion on the semantic web) Web supporters are trying to implement all of these
Summary: Implications for applications The web enables very powerful network connectivity of computers/digital devices and documents Each form of connectivity was designed with a particular application/purpose in mind; its features reflect the need This connectivity can be used to support other applications, but how well do they suit? If not, is it possible to work around the problems?
36 What does this mean for information professionals? Understanding what forms of connectivity people need/want Understanding what connectivity the web technology can offer Understanding how to apply that technology to what people need/want Accepting the limitations of what can be done (with technology OR with people !)