LIS650 lecture 0 Introductory lecture Thomas Krichel 2004-01-23
administrative matters Course home page is at http://wotan.liu.edu/home/krichel/lis650p04s First quiz next lecture! Deadline to finish web site: one week after the end of the last lecture. You will not be able to change your web site between the deadline and the time that the grade is issued! Subscribe to class mailing list https://lists.liu.edu/mailman/listinfo/cwp-lis650-krichel
today introduction to the course talk about you the basic ingredients of the web, without html introduction to our basic technical set up introduction to html
Course history Course was first run as an institute 2002-05-13 to 2002-05-17 Title was Webmastering I: the static web site. To the curriculum committee, this title did not sound academic enough. Since Web Site Architecture and Design is now the full title, WeSaD (pronounced like wizard) is the official abbreviation. Webmastering is still what we want to learn.
teaching WeSaD WeSaD combines many aspects: –Authoring pages –Work on the organization of data to fit onto pages –Set display style of different pages –Organize the contribution of data –Maintain a technical web installation Some of them can be learned in a course, but others cannot. Emphasis has to be on learnable elements.
teaching philosophy Pointing and clicking in a piece of software is not enough Explain underlying principles Promote standards –HTML 4.01 –CSS level 2.1 Avoid proprietary software
WeSaD contents Deals with the maintenance of a static web site. Such a web site remains the same whatever the user does with it. Topics include –html –css –site usability and information architecture, as far as relevant for static web sites –http, uri, web server
things this course does not do Forms: allow you to design forms that users fill in. But you do not have the programming skills to do something with the form. Any HTML elements that require executable contents are not covered. Frames: allow you to put several documents into one physical document. Most experts advise against them. We do not cover image maps. We do not cover some advanced CSS properties.
Other courses: webmastering II Deals with building dynamic web sites. –Users fill in a form –Users submit the form –Web server returns a page that is specific to the request of the user. Teaches a language called PHP, which is widely used to generate such web sites. –Introduces you to computer programming –Trains your analytical thinking.
other courses: webmastering III Deals with XML –XML is a syntax to encode any kind of data. –XML can be constrained to only allow certain types of data (XML Schema) –XML can be transformed to render the data in various ways (XSLT) Achieves a separation of contents and presentation of a web page. Advanced course, covers both Schema and Transformation
The world wide web The World Wide Web (Web) is a network of information resources. The Web relies on three mechanisms to make these resources readily available to the widest possible audience: –A uniform naming scheme for locating resources on the Web (e.g., URIs). –Protocols, for access to named resources over the Web (e.g., HTTP). –Hypertext, for easy navigation among resources (e.g., HTML).
URI introduction Every resource available on the Web -- HTML document, image, video clip, program, etc. -- has an address that may be encoded by a Uniform Resource Identifier, or "URI". URIs typically consist of three pieces: –The naming scheme of the mechanism used to access the resource. –The name of the machine hosting the resource. –The name of the resource itself, given as a path.
example URI http://openlib.org/home/krichel This URI may be read as follows: There is a document available via the HTTP protocol, residing on the site openlib.org, accessible via the path "/home/krichel". mailto:krichel@openlib.org This URI may be read as follows: There is an email user krichel in the domain openlib.org to whom email may be sent.
client / server protocol The web operates mostly on http. This is a client-server protocol. The client software is run on the local PC that you are using. –It is called a web browser or user agent. Our server is a piece of hardware called wotan.liu.edu –It runs the Debian GNU/Linux operating system on an Intel architecture. –It provides http daemon software that serves http requests. The particular software is called Apache.
communication with the server The protocol for communicating with the server is the secure shell, ssh for short. It is based on public-key cryptography. We use two ssh clients –For file editing and manipulation, we use putty. –For file transfer, we use winscp. –Both are available on the web. Telnet and ftp servers are not available on wotan.liu.edu. Telnet and ftp do not encrypt the communication stream; therefore they are not secure.
registration time As part of the course, you are being provided with web space on the server wotan.liu.edu, at the URL http://wotan.liu.edu/~username where username is a user name that you will choose now. It is my intention to maintain this web space for you into the foreseeable future. You should also choose a password, now. I will now register you.
login time Use putty to connect to wotan.liu.edu on port 22. Set other attributes of the session as you like, using the menu on the left, for example –colors –font shapes and sizes –bell Save the session as wotan (in the first screen) to save all the customization. You do not normally need to log in to the machine, unless you want to work with it.
free software I maintain the wotan.liu.edu server but you can build your own server if –you have Internet access –you have an old PC to spare All the server software, as well as putty and winscp, is free and open-source. It is one of my fundamental beliefs that free information should run on free software. The library community can learn a hell of a lot from the free software community. See my talk at http://openlib.org/home/krichel/presentations/new_york_2003-11-07.ppt
installing software at home Go to your favorite search engine to search for –putty –winscp Download and run windows-style installer software to install both pieces of software. Download and install a recent version of at least two browsers. I suggest –Netscape Navigator at http://channels.netscape.com/ns/browsers/download.jsp –Opera at http://www.opera.com
putty and winscp You can either maintain files on wotan.liu.edu –by logging into wotan.liu.edu –using a file editor there, for example nano –past experience has shown that this is hard for students with no UNIX experience. You can also maintain text files locally –each time you make a change, you save the file and upload to wotan.liu.edu using winscp. –you can use Notepad locally to maintain text files –I do not recommend using WordPad and Word.
create a web page in MS notepad Open Microsoft notepad. Type the text
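The text to type did not survive in this transcript. As a hedged sketch only, a nearly empty page that should validate as HTML 4.01 Strict could look like the following. The title text and the empty div are assumptions, not the content of the original slide.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<!-- the title element is required; its text here is a placeholder -->
<title>empty page</title>
</head>
<body>
<!-- the strict DTD wants at least one block-level element in the body -->
<div></div>
</body>
</html>
```

The first two lines are the document type declaration; everything else is the smallest head and body that the strict DTD accepts.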
Saving the web page save as empty.html. If you want to open it again in notepad –open notepad –select file/open –list all files –empty.html Don't click on the file. Don't choose edit in the context menu.
upload and view file Once you have your file empty.html, use the menus of winscp to upload it to the public_html directory in your home directory on wotan.liu.edu. It has to be in public_html! Once it is there, use a web browser to view it at http://wotan.liu.edu/~user/empty.html, where user is your user id. Then validate it at http://validator.w3.org. –enter the URL of the page that you want to validate –hit the validate button It has to be in public_html!
public_html Is your web directory. It is automagically created for you when Thomas registers you. The web server will map requests to http://wotan.liu.edu/~user/file to show the file /home/user/public_html/file. Here user stands for your user id, and file is the file name. If file ends with .html or .htm the web browser will be told that the file is an HTML file. It will be rendered accordingly by the browser.
index.html The web server on wotan will map requests to http://wotan.liu.edu/~user to show the file ~user/public_html/index.html If this file is not there, the server will prepare an HTML document from the list of files that it finds in the directory and send it to the user agent. Once you have a file index.html, the web user can no longer see that listing of the files in your directory.
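A minimal sketch of an index.html that links to the other files in public_html, so that visitors can still find them once the automatic listing is no longer shown. The file names empty.html and test.html come from this lecture; the title and the link texts are made up for illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>my LIS650 pages</title>
</head>
<body>
<!-- each link uses a relative URI, resolved against the directory of index.html -->
<ul>
<li><a href="empty.html">the empty page</a></li>
<li><a href="test.html">my test page</a></li>
</ul>
</body>
</html>
```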
HTML and XHTML HTML is the hypertext markup language. HTML is a markup language that is widely used on the World Wide Web (WWW). The latest, and probably last, version of HTML is at http://www.w3.org/TR/html4/ The W3C, the standard making body for the WWW, has issued XHTML, a replacement of HTML that is compatible with XML. We will ignore XHTML for the rest of the course.
what is markup? Everything in a document that is not content. It can be given in two ways 1: Procedural –Codes identify point size, style, font, etc. –Usually only understood by defining tool –Example: Microsoft Word 2: Descriptive –Describes purpose of text within the document –Chapter head, Paragraph, Section Head, TOC –Structure and Style are kept separate –Example: LaTeX, SGML
SGML Standard Generalized Markup Language Descriptive approach with three separate layers –structure: types of information in document –content: the information itself –style: matches typesetting with structure Developed for the publishing industry by a group around Goldfarb. So complicated that no software implements it fully Document Type Definition (DTD) –Defines the structure
Document Type Definition (DTD) Describes information the document handles –e.g. Title, TOC, Chapter, Section Relationships between fields –e.g. A Chapter contains Sections Consistency Logical structure Information defined by tags
HTML HyperText Markup Language Defines an SGML DTD –Head, Title, Body, Paragraph, etc. –Headings, Bold, Italic, etc. –Table, List, Image, etc. –Links to other documents –Forms Style applied by Web Browser –User has some control
HTML history HTML was a very bare-bones language when first invented by Tim Berners-Lee. It did not describe pages with much visual appeal. In the 90s, successful browsers invented extensions that aimed to stretch the visual boundaries of HTML. Some of these extensions found their way into the official HTML spec issued by the W3C.
my HTML I will teach HTML 4.01. This version has two different DTDs: –the loose DTD –the strict DTD I will only do the tags of the strict DTD The loose DTD has more tags, but all the functionality of these tags is best done with style sheets. Thus, the pages created with HTML only will look rather boring. But we do cover style sheets later.
HTML tags HTML markup is written as tags. Tags are written as pairs (typically) –begin with <tag>, the "tag start" –end with </tag>, the "tag end" –tag is the tag name Can be nested Can contain non-markup data Tag names are case-insensitive, but it is best to use the same case, consistently, for human readability.
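To illustrate pairs, nesting, and case, here is a small fragment that would go inside the body of a page. The elements chosen (p, strong, em, br) are only examples picked for this sketch.

```html
<!-- a pair: start tag, character data, end tag -->
<p>This paragraph contains plain text.</p>

<!-- nesting: em opens inside strong, so it must also close inside strong -->
<p>This is <strong>very <em>important</em></strong>.</p>

<!-- tag names are case-insensitive, so this works, but mixed case is hard to read -->
<P>Same element, different case.</p>

<!-- br is one of the tags that has no end tag -->
<p>first line<br>second line</p>
```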
attributes to tags <tag attribute_name_one="value_one" attribute_name_two="value_two"> Here attribute_name_one and attribute_name_two are attribute names and value_one and value_two are attribute values. I will say: <tag> requires attribute "attribute" if the attribute is mandatory. I will say <tag> takes attribute "attribute" if the attribute is optional.
Example <a href="http://openlib.org/home/krichel">Thomas Krichel</a> –the whole thing is an <a> tag. (I surround tag names with <>) –href is an attribute name –http://openlib.org/home/krichel is the value of the "href" attribute (I surround attribute names with straight quotes) –Thomas Krichel is character data.
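The link described here can be written out as in the following sketch, which also adds a second, optional attribute. In HTML 4.01, <a> takes the attributes "href" and "title"; the title text is invented for illustration.

```html
<p>
<!-- href holds the URI of the link target; title is optional extra information -->
<a href="http://openlib.org/home/krichel"
   title="home page of Thomas Krichel">Thomas Krichel</a>
</p>
```

The link sits inside a p because the strict DTD does not allow a text-level tag such as <a> directly inside the body.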
Characters: concept A character set combines two things –Character repertoire: a set of characters, e.g. "A" –Character code positions: defines a number for each character in the repertoire. Character encoding is a way to encode the code positions in bytes To correctly display a document, the user agent needs to know both!
playing safe with characters Only use the characters on the US keyboard, don't insert symbols. Save as ASCII or UTF-8. Never save as "Unicode" within MS Notepad. If you encounter a character that is not on your keyboard, use an SGML entity.
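One common way to tell the user agent which encoding a page uses is a meta element in the head. This is a sketch that assumes the file was saved as UTF-8; the declared charset has to match how the file was actually saved, and the title text is a placeholder.

```html
<head>
<!-- declares the character encoding; in HTML 4.01 meta has no end tag -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>a page saved as UTF-8</title>
</head>
```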
Special Characters Inserted as an entity reference –Format can be &code; Ex. &amp; –Insert an ampersand –Codes are often abbreviations of the character names –Codes can be in hex form Ex. &#x26; to insert an ampersand http://www.w3.org/TR/REC-html40/sgml/entities.html has the list
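A short fragment showing the forms side by side, as a sketch. The first three lines should all display as "Fish & chips"; the sample words are invented.

```html
<p>
Fish &amp; chips    <!-- named entity for the ampersand -->
<br>
Fish &#38; chips    <!-- decimal code position 38 -->
<br>
Fish &#x26; chips   <!-- hexadecimal code position 26 -->
<br>
2 &lt; 3            <!-- less-than sign, which would otherwise start a tag -->
<br>
caf&eacute;         <!-- e with acute accent, not on the US keyboard -->
</p>
```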
classifying tags There is a whole bunch of different tags. We can group tags together in different ways. In the following, I will explain some of the ways. –block-level vs text-level tags –tags that require closing vs those that do not.
block-level vs text-level tags Block-level tags contain data that is aligned vertically by visual user agents. Text-level tags are aligned horizontally by visual user agents. There are a number of reasons behind this distinction –Block-level tags can contain other block-level tags and text-level tags. –Text-level tags cannot contain block-level tags. –Visual user agents start a new line at the beginning of block-level tags. –Multidirectional text would be impossible without it.
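The containment rules can be sketched with a fragment for the body of a page. Here div and p stand in for block-level tags and em for a text-level tag; these choices are only for illustration.

```html
<!-- a block-level div containing block-level p tags: allowed -->
<div>
<p>A paragraph with <em>emphasized</em> text-level content.</p>
<p>A second paragraph starts on a new line.</p>
</div>

<!-- not allowed: a text-level tag around a block-level tag
<em><p>This does not validate.</p></em>
-->
```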
common frame for pages We look at empty.html again. Here is the start again: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> This is an SGML document type declaration. It says which kind of HTML it is. Use empty.html as a start to compose all your pages.
special topic: images The appeal of the web to the masses has a lot to do with its capability to transport images. Image formats are independent of the web, but there are two classic formats that are widely supported by user agents. –GIF –JPEG
GIF stands for Graphics Interchange Format. developed by CompuServe. unresolved patent issues make the format abhorred by the free software community. 256 colors maximum uses a lossless compression technique
GIF has three tricks interlacing: –when downloading the file, the browser can show every fourth row first –user gets an idea of the picture before it is sharp transparency –some GIFs are transparent, so you can place them on top of content that already exists –technically, the GIF has one color as the background color, and pixels of that color are ignored by the user agent animation –some GIFs are in fact sequences of GIFs that can be rendered one after the other.
JPEG The Joint Photographic Experts Group is a standard-making body for images. JPEG images can support millions of colors. The compression is lossy, i.e. the JPEG file will look like the original image, but not be the same. The compression does not work well with drawings. There are no copyright and patent problems with JPEG
working with wotan You can work with wotan directly if you like. Use putty to connect to wotan.liu.edu, then type –cd public_html You can start from empty.html, the file that validates, and copy it to test.html –cp empty.html test.html –nano test.html Then you can change test.html to try out the tags as I discuss them here.
working on the local machine Open empty.html on your web site and save it as test.html –edit it with notepad to be safe –open it with Internet Explorer to see the rendered html –to validate, you have to upload the file first to your public_html directory on wotan.liu.edu –Then use the W3C validator at http://validator.w3.org
literature I work from the text of the official standard at http://www.w3.org/TR/html4/ To work with it faster, I made a copy at http://wotan.liu.edu/~krichel/html4/ You can work from any HTML book.
Homework Look at course home page http://wotan.liu.edu/home/krichel/lis650p04s Send firstname.lastname@example.org your secret word for course result delivery. Prepare a one-page max summary of the type of website that you want to build, bring printed copy with you next week. Prepare for quiz at the beginning of next lecture.
http://openlib.org/home/krichel Thank you for your attention!