LIS900C: webmastering I: the static web site Thomas Krichel 2003-05-12.

1 LIS900C: webmastering I: the static web site Thomas Krichel

2 structure of talk First talk about me My concepts of webmastering Then talk about you Introduction to wotan –files –basic commands –editing –creating your first page Images Links Introduction to HTML

3 about me Born 1965, in Völklingen (Germany). Studied economics and social sciences at the Universities of Toulouse, Paris, Exeter and Leicester. PhD in theoretical macroeconomics. Lecturer in Economics at the University of Surrey between 1993 and 2001. Since 2001 assistant professor at the Palmer School.

4 why? During my research assistantship period (1990 to 1993), I was constantly frustrated with difficult access to scientific literature. At the same time, I discovered easy access to freely downloadable software over the Internet. I decided to work towards downloadable scientific documents. This led to my library career (eventually).

5 steps taken I 1993 founded the NetEc project at later available at as well as at These are networking projects targeted to the economics community. The bulk is –Information about working papers –Downloadable working papers –Journal articles were added later

6 steps taken II Set up RePEc, a digital library for economics research. Catalogs –Research documents –Collections of research documents –Researchers themselves –Organizations that are important to the research process Decentralized collection, model for the open archives initiative

7 steps taken III Co-founder of the Open Archives Initiative. Work on the Academic Metadata Format. Co-founded rclis, a RePEc clone for Research in Computing, Library and Information Science. Continuing work on freely available abstracting and indexing services on the Internet.

8 webmaster Two definitions come to mind –A webmaster is a person who has write access to a set of files that are available for display on the World Wide Web. –A webmaster is a person who has control over a software installation that can deliver web pages. The second is stricter. We mostly use the first one.

9 webmastering Webmastering combines many aspects: –Authoring pages –Working on the organization of data to fit onto pages –Setting the display style of different pages –Organizing the contribution of data –Maintaining a technical web installation Some of them can be learned in a course, but others cannot. Emphasis has to be on learnable elements.

10 teaching philosophy Pointing and clicking in a piece of software is not enough. Explain underlying principles. Promote standards. –HTML 4.01 –CSS level 2 Avoid proprietary software.

11 webmastering I Deals with the maintenance of a static web site. Such a web site remains the same whatever the user does with it. Topics include –html –http –information architecture –web server

12 things this course does not do Forms: allow you to design forms that users fill in. But you do not have the programming skills to do something with the form. That is part of webmastering II. Frames: allow you to put several documents into one physical document. This is seldom done well. Any HTML elements that require executable contents are not covered. We do not cover image maps.

13 webmastering II Deals with building dynamic web sites. –Users fill in a form –Users submit the form –The web server returns a page that is specific to the request of the user. Teaches a language called PHP, which is widely used to generate such web sites. –Introduces you to computer programming –Trains your analytical thinking.

14 webmastering III Deals with XML –XML is a syntax to encode any kind of data. –XML can be constrained to only allow certain types of data (XML Schema) –XML can be transformed to render the data in various ways (XSLT) Achieves a separation of the contents and the presentation of a web page. This is an advanced course that covers both Schema and Transformation.


16 The world wide web The World Wide Web (Web) is a network of information resources. The Web relies on three mechanisms to make these resources readily available to the widest possible audience: –A uniform naming scheme for locating resources on the Web (e.g., URIs). –Protocols, for access to named resources over the Web (e.g., HTTP). –Hypertext, for easy navigation among resources (e.g., HTML).

17 URI introduction Every resource available on the Web -- HTML document, image, video clip, program, etc. -- has an address that may be encoded by a Uniform Resource Identifier, or "URI". URIs typically consist of three pieces: –The naming scheme of the mechanism used to access the resource. –The name of the machine hosting the resource. –The name of the resource itself, given as a path.
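The three pieces of a URI can be pulled apart with plain shell parameter expansion. This is only a sketch: the host name www.example.org is a hypothetical placeholder, not the course server, and real URIs can carry further parts (port, query) that this ignores.

```shell
# Hypothetical example URI; www.example.org is a reserved placeholder host.
uri="http://www.example.org/home/krichel"

scheme="${uri%%://*}"   # naming scheme: everything before "://"
rest="${uri#*://}"      # what follows the "://"
host="${rest%%/*}"      # machine name: up to the first "/"
path="/${rest#*/}"      # resource name, given as a path

echo "scheme: $scheme"
echo "host:   $host"
echo "path:   $path"
```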

18 example URI This URI may be read as follows: There is a document available via the HTTP protocol, residing on the machine, accessible via the path "/home/krichel". This URI may be read as follows: There is user krichel in a domain to whom may be sent.

19 client / server protocol The web operates mostly on HTTP. This is a client-server protocol. The client software runs on the local PC that you are using. It is called a web browser or user agent. Our server is a piece of hardware called It runs the Debian GNU/Linux operating system on an Intel architecture. It provides HTTP daemon software that serves HTTP requests. The particular software is called Apache.

20 communication with the server The protocol for communicating with the server is ssh, the secure shell. It is based on public-key cryptography. We use two ssh clients –For file editing and manipulation, we use putty. –For file transfer, we use winscp. Both are available on the web. Telnet and ftp servers are not available on wotan. Telnet and ftp do not encrypt the communication stream; therefore they are not secure.

21 free software I maintain the server. You can build your own server if –you have Internet access –you have an old PC All the server software, as well as putty and winscp, is free and open-source. It is one of my fundamental beliefs that free information should run on free software. The library community can learn a hell of a lot from the free software community.

22 installing putty and winscp Go to your favorite search engine to search for putty. Download and install putty. Do the same thing with winscp. Here you can use the formal installer. Download and install a recent version of at least two browsers. I recommend Netscape Navigator and Microsoft Internet Explorer. Try to do this on the lab machines.

23 registration time As part of the course, you are being provided with web space on the server, at the URL where username is a user name that you will choose now. It is my intention to maintain this web space for you into the foreseeable future. You should also choose a password now. I will now register you.

24 login time Use putty, port 22. Set other attributes of the session as you like, using the menu on the left, for example –colors –font shapes and sizes –bell Save the session as wotan (in the first screen) to save all the customization. Do the same thing at home!

25 issuing commands While you are logged in, you talk to the computer by issuing commands. Your commands are read by a command line interpreter. The command line interpreter is called a shell. You are using the Bourne Again Shell, bash. bash lets you browse the command history with the arrow keys. bash lets you edit commands with the arrow keys. exit is the command to leave the shell.

26 files, directories and links Files are contiguous chunks of data on disks that are required for software applications. A link is a file that contains the address of another file. Microsoft calls it a shortcut. Directories are files that contain other files. Microsoft calls them folders. In UNIX, the directory separator is / The top directory is / on its own.

27 home directory When you first log in to wotan you are placed in your home directory /home/username cd is the command that gets you back to the home directory. The home directory is also abbreviated as ~ cd ~user gets you to the home of user user. cd ~ does what?
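A small sketch of the two ways home. To keep it safe to run anywhere, it points HOME at a scratch directory first; that stand-in is an assumption of this example, on wotan your real home would be /home/username.

```shell
# Stand-in home directory (an assumption, so nothing real is touched).
HOME="$(mktemp -d)"
export HOME

cd /                     # start somewhere else
cd                       # no argument: back to the home directory
after_plain_cd="$PWD"

cd /
cd ~                     # ~ expands to the home directory: the same move
after_tilde_cd="$PWD"

echo "$after_plain_cd"
echo "$after_tilde_cd"
```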

28 ~/public_html Is your web directory. It is automagically created for you when Thomas registers you. The web server on wotan will map requests to to show the file ~user/public_html/index.html The web server will map requests to to show the file ~user/public_html/file The server does this with a configuration option set by Thomas.

29 changing directory, listing files cd directory changes into the directory directory. The current directory is . and its parent directory is .. ls lists files. As an exercise, move around the directory structure and discover the files that the directories hold with ls. IMPORTANT NOTE: bash allows completion of file and directory names with the TAB character.
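The moves described on this slide can be tried in a scratch directory. The directory name site is made up for this sketch.

```shell
start="$(mktemp -d)"     # scratch directory, so nothing real is touched
mkdir "$start/site"      # "site" is a hypothetical directory name

cd "$start/site"         # change into a directory
cd .                     # "." is the current directory: nothing changes
in_site="$PWD"

cd ..                    # ".." is the parent directory
in_parent="$PWD"

ls                       # lists what the parent holds, here just "site"
```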

30 more on listing files ls lists files. ls -l makes a long listing. It contains –elementary type and permissions –owner –group –size –date –name Date, name and size are what interest us; the rest is for the computer guru.

31 general structure of commands commandname -flag --option Where commandname is the name of a command. flag can be a letter. Several letters set several flags at the same time. An option can also be expressed with -- and a word; this is more user-friendly than flags. Let us look at an example with the ls command.

32 example ls lists files. ls -l makes a long listing. ls -a lists all files, not only regular files but hidden files as well –all files that start with a dot are hidden. ls -la lists all files in a long listing. ls --all is the same as ls -a. --all is known as a long option.
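A quick way to see the flags at work, sketched in a scratch directory with two made-up file names. The long option --all is left as a comment because it belongs to GNU ls (as on a Debian machine) and may be missing elsewhere.

```shell
demo="$(mktemp -d)"          # scratch directory
cd "$demo"
touch index.html .hidden     # hypothetical files: one regular, one hidden

plain="$(ls)"                # hidden files are left out
all="$(ls -a)"               # . and .. and .hidden show up as well
combined="$(ls -la)"         # two flags set with one dash
separate="$(ls -l -a)"       # the same two flags, given one by one

echo "$all"
# ls --all                   # GNU long option, same as ls -a
```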

33 copying and removing cp file copyfile copies file file to file copyfile. If copyfile is a directory, it copies into the directory. mv file movedfile moves file file to file movedfile If movedfile is a directory, it moves file into the directory. rm file removes file There is no recycling bin!!
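The copy, move and remove commands can be rehearsed safely in a scratch directory. The directory name backup is an assumption of this sketch.

```shell
work="$(mktemp -d)"          # scratch directory
cd "$work"
echo "hello" > file

cp file copyfile             # file and copyfile now both exist
mkdir backup                 # hypothetical target directory
cp file backup               # target is a directory: creates backup/file

mv copyfile movedfile        # a move within one directory is a rename
mv movedfile backup          # target is a directory: moves into it

rm file                      # gone for good, there is no recycling bin
ls backup
```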

34 directories and files mkdir directory makes a directory rmdir directory removes an empty directory rm -r directory removes a directory and all its files more file –Pages contents of file, no way back less file –Pages contents of file u to go back q to quit
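The difference between rmdir and rm -r shows up as soon as a directory has something in it. A sketch with made-up names, again in a scratch directory:

```shell
work="$(mktemp -d)"                  # scratch directory
cd "$work"
mkdir empty full
touch full/page.html

rmdir empty                          # works: the directory is empty

if rmdir full 2>/dev/null; then      # rmdir refuses a non-empty directory
    refused=no
else
    refused=yes
fi
echo "rmdir refused the full directory: $refused"

rm -r full                           # removes the directory and all its files
```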

35 file transfer You can use winscp to upload and download files to wotan. If uploaded files in the web directory remain invisible, that is most likely a problem with permissions. –chmod 644 * will put it right for the files –chmod 755 . (yes, with a dot) will put it right for the current directory * is a wildcard for all files. rm -r * is a command to avoid.
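What the two chmod commands do can be checked with ls. In this sketch a scratch directory stands in for ~/public_html, and the files start out with overly strict permissions to imitate a bad upload; both are assumptions of the example.

```shell
web="$(mktemp -d)"               # stand-in for ~/public_html
cd "$web"
touch index.html photo.jpg       # hypothetical uploaded files
chmod 600 index.html photo.jpg   # imitate an upload nobody else can read

chmod 644 *                      # files: owner reads and writes, others read
chmod 755 .                      # directory: others may list and enter it

file_mode="$(ls -l index.html | cut -c1-10)"
dir_mode="$(ls -ld . | cut -c1-10)"
echo "$file_mode $dir_mode"
```

The first ten characters of a long listing are the type and permissions, so the output should read -rw-r--r-- for the file and drwxr-xr-x for the directory.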

36 editing There is a plethora of editors available. For the neophyte, nano works best. nano file edits the file file. nano -w switches off line wrapping. nano shows the commands available at the bottom of the screen. Note that ^letter, where letter is a letter, means pressing CONTROL and the letter letter at the same time.

37 copy and paste Putty lets you copy and paste text between Windows and UN*X. On the Windows machine, it uses the Windows approach to copy and paste. On the UN*X machine, –you copy by highlighting with the left mouse button –you paste using the middle button. If your mouse does not have a middle button, you can emulate it by pressing the left and right buttons at the same time. –try this out!

38 your first page Type: cd Type: cd public_html (you can type cd pu and complete the name with TAB) Type: nano page.html where page is the name of the page. Edit your file. Find your file on the web with a web browser. You have written your first web page!

39 special topic: images The appeal of the web to the masses has a lot to do with its capability to transport images. Image formats are independent of the web, but there are two classic formats that are widely supported by user agents. –GIF –JPEG

40 GIF Stands for graphics interchange format. Developed by CompuServe. Unresolved patent issues make the format abhorred by the free software community. 256 colors maximum. Uses a loss-less compression technique.

41 GIF has three tricks interlacing: –when downloading the file, the browser can show every eighth row first and fill in the rows between in later passes –the user gets an idea of the picture before it is sharp transparency –some GIFs are transparent, so you can see them on top of what already exists –technically, the GIF has one color as the background color, and pixels of that color are ignored by the user agent animation –some GIFs are in fact sequences of GIFs that can be rendered one after the other.

42 JPEG The Joint Photographic Experts Group is a standard-making body for images. JPEG images can support millions of colors. The compression is lossy, i.e. the JPEG file will look like the original image, but not be the same. The compression does not work well with drawings. There are no copyright and patent problems with JPEG.

43 HTML and XHTML HTML is the hypertext markup language. HTML is a markup language that is widely used on the World Wide Web (WWW). The latest, and probably last, version of HTML is at The W3C, the standard-making body for the WWW, has issued XHTML, a replacement of HTML that is compatible with XML. We will ignore XHTML for the rest of the course.

44 what is markup? Everything in a document that is not content. It can be given in two ways 1: Procedural –Codes identify point size, style, font, etc. –Usually understood by defining tool –Example: M$ Word 2: Descriptive –Describes purpose of text within the document –Chapter head, Paragraph, Section Head, TOC –Structure and Style are kept separate –Example: LaTeX, SGML

45 Procedural vs Descriptive

46 SGML Standard Generalized Markup Language Descriptive approach with three separate layers –structure: types of information in document –content: the information itself –style: matches typesetting with structure Document Type Definition (DTD) –Defines the structure Developed for the publishing industry by a group around Goldfarb. So complicated that no software implements it fully

47 SGML Document Type Definition Describes information the document handles –e.g. Title, TOC, Chapter, Section Relationships between fields –e.g. A Chapter contains Sections Consistency Logical structure Information defined by tags

48 HTML HyperText Markup Language Defined by an SGML DTD –Head, Title, Body, Paragraph, etc. –Headings, Bold, Italic, etc. –Table, List, Image, etc. –Links to other documents –Forms Style applied by Web Browser –User has some control

49 HTML tags HTML markup is written as tags. Tags are written as pairs (typically) –begin with <atag>, the "tag start" –end with </atag>, the "tag end" –atag is the tag name Can be nested. Can contain non-markup data. Tag names are case-insensitive, but it is best to use the same case, consistently, for human readability.

50 attributes to tags <atag attribute_name_one="value_one" attribute_name_two="value_two"> Here attribute_name_one and attribute_name_two are attribute names and value_one and value_two are attribute values.

51 common frame for pages Put the following in your pages: The first three lines are the SGML document type declaration that says which kind of HTML it is. Use a validator service to check your pages.
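The frame shown on the slide is not reproduced in this text, so the following is only a sketch of what such a frame can look like. It assumes the HTML 4.01 Strict document type declaration from the W3C specification (here laid out on two lines), and it writes the page to a scratch location with a made-up title and paragraph; on wotan the file would live in ~/public_html instead.

```shell
# Scratch location (an assumption); on the server use ~/public_html/page.html
page="$(mktemp -d)/page.html"

# The quoted 'EOF' keeps the shell from touching the HTML text.
cat > "$page" <<'EOF'
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>My first page</title>
</head>
<body>
<p>Hello, web!</p>
</body>
</html>
EOF

cat "$page"
```

Everything between the body tags is yours to change; the declaration at the top is what a validator reads to decide which rules to check the page against.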

52 Thank you for your attention!

