World Wide Web Tutorial

Northwestern University EECS UNIX Lecture Series
May 17, 1994
Jennifer Myers

Welcome to the World Wide Web tutorial created for the May 17th UNIX lecture seminar series sponsored by the Electrical Engineering and Computer Science Department. This document outlines the talk, contains the tutorial itself, and provides background and supporting material.

It is assumed that the reader has a familiarity with the Internet and the basic Internet services (for example, telnet, Usenet news, FTP, gopher). It is also assumed, for the moment, that the reader is accessing the tutorial via the WWW client NCSA Mosaic. When you encounter a highlighted image or word, such as this, you have found a hypertext link. Single click the left mouse button on the highlighted item to traverse the link; to return, select Back from the bottom control panel or the Navigate menu, or simply press `b'.

Goals

By the end of the tutorial, you will:

  1. Be able to differentiate between the World Wide Web and Mosaic.
  2. Become familiar with the two WWW browsers installed on the EECS departmental machines: lynx (a curses-based client), and Mosaic (a Motif-based client).
  3. Learn how to save pointers to resource pages to which you would like to return.
  4. Learn the basics of HyperText Markup Language (HTML).
  5. Be on your way to composing your own ``hyplan,'' with hypertext links to your favorite resources.

Outline

This talk is divided into three distinct parts. In the first part, I will present an overview of the World Wide Web project. In the second part, I will demonstrate how to use two popular WWW browsers: lynx, a curses-based browser which can run on VT100 terminals, and Mosaic, a graphical browser which runs under the X Window System. In the third part, I will present an introduction to HTML and provide instructions for publishing on the local HTTP server.


Part 1: What is the World Wide Web?

In a proposal first written in March of 1989 and revised in October 1990, two researchers at CERN, Tim Berners-Lee and Robert Cailliau, described a single hypertext-based user interface to a wide variety of networked information sources. A single user interface would greatly simplify the effort of retrieving data from multiple sources, a task which previously required an array of software packages and machine platforms. The usability of existing information would presumably be increased.

Hypertext, ``non-sequential writing,'' is a term coined by Ted Nelson in the 1960s. In a sense, reference books which refer to other sections of the book within the text (such as with the notation cf.) are hypertext.

The term ``World Wide Web'' comes from the notion that individual nodes of information are linked with hypertext, and this creates a ``web'' of information. It is ``world-wide'' in that WWW browsers speak not one, but multiple protocols and retrieve information from distant machines around the globe over the Internet. In the words of Tim Berners-Lee, the World Wide Web is ``the universe of network-accessible information, an embodiment of human knowledge.''

By November 1990, Tim Berners-Lee had developed a prototype WWW browser for the NeXT. In January 1992, the line-mode browser, www, was in public release. One year after that, in February of 1993, came the initial alpha release of X Mosaic, as well as several other independently developed WWW browsers for the X Window System. By August 1993, NCSA Mosaic had been released for both MS Windows and Macintosh platforms.

While the World Wide Web encompasses previously existing protocols such as gopher and FTP, also critical to the WWW proposal was the development of a new protocol, specially designed for the needs of a distributed hypertext system. It is a fast, stateless, object-oriented protocol called HyperText Transfer Protocol (HTTP).

World Wide Web browsers access servers by way of a new common addressing scheme known as Uniform Resource Locators, or URLs. For example, here is a URL for a file accessible via a gopher server:

gopher://epsilon.eecs.nwu.edu/00/.txts/eecsnet_bulletins

In a similar fashion, URLs can be constructed which point to files available via HTTP, anonymous FTP, WAIS databases ... anything accessible through the World Wide Web.
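To make the structure of a URL concrete, here is a minimal sketch that splits the gopher URL above into its access scheme, host name, and path using POSIX shell parameter expansion. The variable names are illustrative only, and the sketch assumes a URL of the simple scheme://host/path shape, with no port number or user name.

```shell
# Split a URL into scheme, host, and path (assumes scheme://host/path).
url='gopher://epsilon.eecs.nwu.edu/00/.txts/eecsnet_bulletins'

scheme=${url%%://*}              # everything before the first "://"
rest=${url#*://}                 # everything after the first "://"
host=${rest%%/*}                 # up to the first "/"
case $rest in
    */*) path=/${rest#*/} ;;     # remainder, with the leading "/" restored
    *)   path=/ ;;               # URL named only a host
esac

echo "scheme: $scheme"
echo "host:   $host"
echo "path:   $path"
```

The scheme tells the browser which protocol to speak, the host tells it which machine to contact, and the path is handed to the server on that machine.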

The World Wide Web project also introduced a markup language, HTML, which is used to create hypertext documents. Hypertext documents are but a fraction of the document types routinely accessed via WWW clients, however. WWW clients also access, for example, plain text files, GIF files, ULAW files, MPEG files, and PostScript files.

The World Wide Web Frequently Asked Questions list is a good source for further information.


Part 2: Two WWW Browsers: Lynx and NCSA Mosaic

Lynx

Lynx is a World Wide Web browser for curses-oriented displays, such as VT100 terminals. It was developed by Lou Montulli of the University of Kansas. The Lynx Users Guide is available on-line. To access the Lynx Help Files from within lynx, hit the h key.

For a summary of valid keystroke commands within lynx, refer to the guide on page 2 of the tutorial handouts.

On the EECS departmental machines, lynx (version 2.1) is installed in /usr/local/bin/lynx.
Type lynx at the UNIX shell command prompt to start lynx.

NCSA Mosaic

NCSA Mosaic is a World Wide Web browser for the X Window System (it is also now available on MS Windows and Apple Macintosh platforms). It was developed by Marc Andreessen and Eric Bina of the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign.

(Incidentally, Eric, Marc, Lou, and several others have recently formed a start-up company, Mosaic Communications Corporation, in the Bay Area. Lynx and NCSA Mosaic, despite fears to the contrary, will continue to be supported by continuing and new members of the development teams at the University of Kansas and NCSA.)

Online documentation is available via the Help menu bar item within Mosaic. Mosaic's man page (type man Mosaic at the UNIX shell command prompt) contains essentially the same information. Part of the users guide, Navigating the WWW with Mosaic, may be found in the tutorial handout.

On the EECS departmental machines, Mosaic (version 2.4) is installed in /usr/bin/X11/Mosaic.
Type Mosaic at the UNIX shell command prompt to start Mosaic.


Part 3: An Introduction to HTML

HTML, or HyperText Markup Language, is the markup language developed for representing hypertext in the World Wide Web. HTML, like LaTeX, is used to represent the semantic structure of a document, and not its precise physical layout. This gives WWW clients considerable latitude in how the interpreted document will be displayed. This is important in that WWW clients run on a wide variety of platforms, with widely varying display capabilities. What this means for you, as an HTML coder, is that you should design your use of HTML to reflect the semantic structure of your document and avoid the use of clever ``hacks'' (using tags for their display properties in your browser-of-choice when that use does not reflect the intended meaning of the tag).

Because HTML is an evolving language and several browsers (most notably Mosaic) are very forgiving of incorrect HTML syntax, it is very easy to write ``bad HTML'' that ``works'' under Mosaic. It may fail, though, when viewed with another WWW browser. Peter Flynn is one of the key players in the design of HTML. His guide is the best that I have seen, and it is especially good at defining proper HTML syntax. We will work from his guide. But first, we must set up an area in your home directory in which you can experiment with HTML and Web publishing.

Publishing on the EECS HTTP server

The EECS HTTP server has been configured so that files stored in users' home directories may be retrieved with a URL of the form:

http://www.eecs.nwu.edu/~username/path/to/file

The HTTP server fetches the file:

/homes/username/public_html/path/to/file

So, to create your own publishing area, type in the following commands at the UNIX prompt:
  1. mkdir ~/public_html
  2. chmod a+rx ~/public_html
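
The URL-to-file rule described above can be sketched as a small shell function. This is only an illustration of the mapping as stated in this tutorial, not the server's actual code; the function name url_to_file is hypothetical, and it expects just the part of the URL after the host name.

```shell
# Hypothetical helper: map a /~username/... URL path to the file the
# server would read, following the /homes/username/public_html rule.
url_to_file() {
    url_path=$1                      # e.g. /~username/path/to/file
    rest=${url_path#'/~'}            # drop the leading "/~"
    user=${rest%%/*}                 # text up to the first "/"
    case $rest in
        */*) file=${rest#*/} ;;      # text after the first "/"
        *)   file= ;;                # URL named only the user
    esac
    echo "/homes/$user/public_html/$file"
}

url_to_file '/~username/path/to/file'
```

Anything outside your public_html directory has no corresponding URL under this rule, which is why only files you copy there can be retrieved.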

Once you have created your public_html directory, click the Open... button on Mosaic's bottom control panel and type in the URL:

http://www.eecs.nwu.edu/~username/

...replacing ``username'' with your username.

What happens when you select the link Parent Directory?

Now copy a file into ~/public_html. For example, ~/.xinitrc:

cp ~/.xinitrc ~/public_html/xinitrc

(Note: no preceding ``.'' It could have been named ~/public_html/.xinitrc, but the server's automatic directory indexing has been configured not to display dot files.)

Re-open the URL:

http://www.eecs.nwu.edu/~username/

Do you see your file? Click on the file name to open it. If you get the error message...


403 Forbidden

Your client does not have permission to get URL /~username/xinitrc from this server.


...then your file does not have the world read permission bit set. (In other circumstances, you might receive this message because the server which you are accessing has been instructed not to serve that document to you, based on the address of the machine your client is running on.) A quick way to ensure that all files in the ~/public_html directory have the necessary permissions is to execute the command:

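chmod -R a+rX ~/public_html

This command sets the correct permissions for all files and directories contained within ~/public_html: everything becomes world-readable, and the capital X adds execute (search) permission only to directories and to files that are already executable.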
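
To see which files would trigger the 403 error before and after the fix, a sketch along the following lines may help. It works in a scratch directory created with mktemp -d (a stand-in for ~/public_html, so nothing in your home directory is touched); the directory and file names are made up for the demonstration.

```shell
# Scratch directory standing in for ~/public_html (an assumption for safety).
demo=$(mktemp -d)
mkdir "$demo/notes"
echo 'hello' > "$demo/notes/readme.txt"
chmod -R go-rwx "$demo"              # simulate files the server cannot read

# List everything that lacks the world-read bit.
unreadable_before=$(find "$demo" ! -perm -004 -print)

chmod -R a+rX "$demo"                # the fix from the tutorial

unreadable_after=$(find "$demo" ! -perm -004 -print)

echo "not world-readable before the fix:"
echo "$unreadable_before"
echo "not world-readable after the fix: ${unreadable_after:-none}"

rm -rf "$demo"                       # clean up the scratch directory
```

The same find command, pointed at ~/public_html instead of the scratch directory, lists any of your own files that still lack the world-read bit.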

HyperText Markup Language

If you followed the exercise above, you have already served a plain-text file over HTTP (an HTML file, too, but that was the directory listing and the HTTP server automatically generated it).

In the remainder of this tutorial, we will discuss the magical language of HTML. To proceed, open example.html.
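
As a first taste, here is a minimal sketch of a ``hyplan''-style page written from the shell. The file name hyplan.html, the page text, and the link label are all illustrative assumptions; the link target is the gopher URL from Part 1. The page is written to a scratch directory standing in for ~/public_html, and it uses tags for their meaning (a title, a heading, a list of links) rather than for how one particular browser happens to draw them.

```shell
# Scratch directory standing in for ~/public_html (an assumption for safety).
pub=$(mktemp -d)

# Write a small page: a title, one heading, and a list holding one link.
cat > "$pub/hyplan.html" <<'EOF'
<html>
<head>
<title>My hyplan</title>
</head>
<body>
<h1>My hyplan</h1>
<p>A few favorite resources:</p>
<ul>
<li><a href="gopher://epsilon.eecs.nwu.edu/00/.txts/eecsnet_bulletins">EECSnet bulletins</a></li>
</ul>
</body>
</html>
EOF

chmod a+r "$pub/hyplan.html"         # world-readable, as the server requires
echo "wrote $pub/hyplan.html"        # remove the scratch directory when done
```

Copied into your real ~/public_html, a file like this would appear in the directory listing alongside xinitrc and would be displayed as a formatted page rather than as plain text.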


jmyers@eecs.nwu.edu (17-May-94)