A history of HTML

Included in this chapter is information on:

• How the World Wide Web began
• The events and circumstances that led to the World Wide Web's current popularity
• How HTML has grown from its conception in the early 1990s

Summary

HTML has had a life-span of roughly seven years. During that time, it has evolved from a simple
language with a small number of tags to a complex system of mark-up, enabling authors to create
all-singing-and-dancing Web pages complete with animated images, sound and all manner of
gimmicks. This chapter tells you something about the Web's early days, HTML, and about the
people, companies and organizations who contributed to HTML+, HTML 2, HTML 3.2 and
finally, HTML 4.

This chapter is a short history of HTML. Its aim is to give readers some idea of how the HTML
we use today was developed from the prototype written by Tim Berners-Lee in 1992. The story is
interesting - not least because HTML has been through an extremely bumpy ride on the road to
standardization, with software engineers, academics and browser companies haggling about the
language like so many Members of Parliament debating in the House of Commons.

1989: Tim Berners-Lee invents the Web with HTML as its publishing language

The World Wide Web began life in the place where you would least expect it: at CERN, the
European Laboratory for Particle Physics in Geneva, Switzerland. CERN is a meeting place for
physicists from all over the world, where highly abstract and conceptual thinkers engage in the
contemplation of complex atomic phenomena that occur on a minuscule scale in time and space.
This is a surprising place indeed for the beginnings of a technology which would, eventually,
deliver everything from tourist information, online shopping and advertisements, financial data,
weather forecasts and much more to your personal computer.

Tim Berners-Lee is the inventor of the Web. In 1989, Tim was working in a computing services
section of CERN when he came up with the concept; at the time he had no idea that it would be
implemented on such an enormous scale. Particle physics research often involves collaboration
among institutes from all over the world. Tim had the idea of enabling researchers from remote
sites in the world to organize and pool together information. But far from simply making
available a large number of research documents as files that could be downloaded to individual
computers, he suggested that you could actually link the text in the files themselves.

In other words, there could be cross-references from one research paper to another. This would
mean that while reading one research paper, you could quickly display part of another paper that
holds directly relevant text or diagrams. Documentation of a scientific and mathematical nature
would thus be represented as a `web' of information held in electronic form on computers across
the world. This, Tim thought, could be done by using some form of hypertext, some way of
linking documents together by using buttons on the screen, which you simply clicked on to jump
from one paper to another. Before coming to CERN, Tim had already worked on document
production and text processing, and had developed his first hypertext system, `Enquire', in 1980
for his own personal use.

Tim's prototype Web browser on the NeXT computer came out in 1990.

Through 1990: The time was ripe for Tim's invention

The fact that the Web was invented in the early 1990s was no coincidence. Developments in
communications technology during that time meant that, sooner or later, something like the Web
was bound to happen. For a start, hypertext was coming into vogue and being used on computers.
Also, the number of Internet users was growing: there was an increasing
audience for distributed information. Last, but not least, the new domain name system had made
it much easier to address a machine on the Internet.

Hypertext

Although already established as a concept by academics as early as the 1940s, it was with the
advent of the personal computer that hypertext came out of the cupboard. In the late 1980s, Bill
Atkinson, an exceptionally gifted programmer working for Apple Computer Inc., came up with
an application called Hypercard for the Macintosh. Hypercard enabled you to construct a series
of on-screen `filing cards' that contained textual and graphical information. Users could navigate
these by pressing on-screen buttons, taking themselves on a tour of the information in the process.

Hypercard set the scene for more applications based on the filing card idea. Toolbook for the PC
was used in the early 1990s for constructing hypertext training courses that had `pages' with
buttons which could go forward or backward or jump to a new topic. Behind the scenes, buttons
would initiate little programs called scripts. These scripts would control which page would be
presented next; they could even run a small piece of animation on the screen. The application
entitled Guide was a similar application for UNIX and the PC.

Hypercard and its imitators caught the popular imagination. However, these packages still had
one major limitation: hypertext jumps could only be made to files on the same computer. Jumps
made to computers on the other side of the world were still out of the question. Nobody yet had
implemented a system involving hypertext links on a global scale.

The domain name system

By the mid-1980s, the Internet had a new, easy-to-use system for naming computers. This
involved using the idea of the domain name. A domain name comprises a series of letters
separated by dots, for example: `www.bo.com' or `www.erb.org.uk'. These names are the easy-to-
use alternative to the much less manageable and cumbersome IP address numbers.

The Domain Name System (DNS) maps domain names onto IP addresses,
keeping the IP addresses `hidden'. DNS was an absolute breakthrough in making the Internet
accessible to those who were not computer nerds. As a result of its introduction, email addresses
became simpler. Before DNS, email addresses had all sorts of hideous codes such as
exclamation marks, percent signs and other extraneous information to specify the route to the
other machine.
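
The mapping from names to numbers can be pictured with a toy lookup table. The sketch below is only an illustration in Python, not how DNS itself is implemented: the real system is a distributed database queried over the network, whereas this is a single table held in memory. The function name resolve and the table NAME_TABLE are invented for the example, and the addresses are made-up values taken from a range reserved for documentation, not the real addresses of these hosts.

```python
# Toy name table. The addresses are invented values from the 192.0.2.0/24
# documentation range; they are NOT the real addresses of these hosts.
NAME_TABLE = {
    "www.bo.com": "192.0.2.10",
    "www.erb.org.uk": "192.0.2.20",
}


def resolve(domain_name):
    """Return the IP address recorded for a domain name.

    Domain names are case-insensitive, and a trailing dot is allowed,
    so both are normalized before the lookup.
    """
    key = domain_name.lower().rstrip(".")
    if key not in NAME_TABLE:
        raise KeyError(f"unknown host: {domain_name}")
    return NAME_TABLE[key]


if __name__ == "__main__":
    for name in ("www.bo.com", "WWW.ERB.ORG.UK"):
        print(name, "->", resolve(name))
```

The point of the sketch is simply that the user only ever types the name; the number stays behind the scenes.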

Choosing the right approach to create a global hypertext system

To Tim Berners-Lee, global hypertext links seemed feasible, but it was a matter of finding the
correct approach to implementing them. Using an existing hypertext package might seem an
attractive proposition, but this was impractical for a number of reasons. To start with, any
hypertext tool to be used worldwide would have to take into account that many types of
computers existed that were linked to the Internet: Personal Computers, Macintoshes, UNIX
machines and simple terminals. Also, many desktop publishing methods were in vogue: SGML,
Interleaf, LaTeX, Microsoft Word, and Troff among many others. Commercial hypertext
packages were computer-specific and could not easily take text from other sources; besides, they
were far too complicated and involved tedious compiling of text into internal formats to create the
final hypertext system.

What was needed was something very simple, at least in the beginning. Tim demonstrated a
basic, but attractive way of publishing text by developing some software himself, and also his
own simple protocol - HTTP - for retrieving other documents' text via hypertext links. Tim's own
protocol, HTTP, stands for HyperText Transfer Protocol. The text format for HTTP was named
HTML, for HyperText Mark-up Language; Tim's hypertext implementation was demonstrated on
a NeXT workstation, which provided many of the tools he needed to develop his first prototype.
By keeping things very simple, Tim encouraged others to build upon his ideas and to design
further software for displaying HTML, and for setting up their own HTML documents ready for
access.

Tim bases his HTML on an existing internationally agreed upon method of text mark-up

The HTML that Tim invented was strongly based on SGML (Standard Generalized Mark-up
Language), an internationally agreed upon method for marking up text into structural units such
as paragraphs, headings, list items and so on. SGML could be implemented on any machine. The
idea was that the language was independent of the formatter (the browser or other viewing
software) which actually displayed the text on the screen. The use of pairs of tags such as
<TITLE> and </TITLE> is taken directly from SGML, which does exactly the same. The SGML
elements used in Tim's HTML included P (paragraph); H1 through H6 (heading level 1 through
heading level 6); OL (ordered lists); UL (unordered lists); LI (list items) and various others. What
SGML does not include, of course, are hypertext links: the idea of using the anchor element with
the HREF attribute was purely Tim's invention, as was the now-famous `www.name.name'
format for addressing machines on the Web.

Basing HTML on SGML was a brilliant idea: other people would have invented their own
language from scratch but this might have been much less reliable, as well as less acceptable to
the rest of the Internet community. Certainly the simplicity of HTML, and the use of the anchor
element A for creating hypertext links, was what made Tim's invention so useful.
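
To make the idea of paired tags and anchors concrete, here is a small sketch in Python. It holds a tiny document written with the elements mentioned above (TITLE, H1, P and A with its HREF attribute) and uses the standard library's html.parser module to pull out the link targets. The page text, the file name paper2.html and the names LinkCollector and collect_links are all invented for illustration; this is a modern convenience, not a reconstruction of Tim's software.

```python
from html.parser import HTMLParser

# A tiny page in the style of early HTML. The content and link targets
# are hypothetical examples.
PAGE = """<TITLE>Research notes</TITLE>
<H1>Research notes</H1>
<P>See the <A HREF="paper2.html">second paper</A> and
the <A HREF="http://www.erb.org.uk/diagram.html">diagram</A>.
"""


class LinkCollector(HTMLParser):
    """Collect the HREF value of every anchor (A) element."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lower case,
        # so <A HREF=...> arrives here as tag "a" with attribute "href".
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


def collect_links(html_text):
    """Return the list of link targets found in html_text, in order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


if __name__ == "__main__":
    print(collect_links(PAGE))
```

Everything a browser needs in order to follow a link is in that one attribute, which is a large part of why the scheme was so easy for others to build upon.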

September 1991: Open discussion about HTML across the Internet begins

Far from keeping his ideas private, Tim made every attempt to discuss them openly online across
the Internet. Coming from a research background, this was quite a natural thing to do. In
September 1991, the WWW-talk mailing list was started, a kind of electronic discussion group in
which enthusiasts could exchange ideas and gossip. By 1992, a handful of other academics and
computer researchers were showing interest. Dave Raggett from Hewlett-Packard's Labs in
Bristol, England, was one of these early enthusiasts, and, following electronic discussion, Dave
visited Tim in 1992.

Here, in Tim's tiny room in the bowels of the sprawling buildings of CERN, the two engineers
further considered how HTML might be taken from its current beginnings and shaped into
something more appropriate for mass consumption. Trying to anticipate the kind of features that
users really would like, Dave looked through magazines, newspapers and other printed media to
get an idea of what sort of HTML features would be important when that same information was
published online. Upon return to England, Dave sat down at his keyboard and resolutely
composed HTML+, a richer version of the original HTML.

Late 1992: NCSA is intrigued by the idea of the Web

Meanwhile on the other side of the world, Tim's ideas had caught the eye of Joseph Hardin and
Dave Thompson, both of the National Center for Supercomputing Applications, a research institute
at the University of Illinois at Urbana-Champaign. They managed to connect to the computer at
CERN and download copies of two free Web browsers. Realizing the importance of what they
saw, NCSA decided to develop a browser of their own to be called Mosaic. Among the
programmers in the NCSA team were Marc Andreessen - who later made his millions by selling
Web products - and the brilliant programmer Eric Bina - who also became rich, courtesy of the
Web. Eric Bina was a kind of software genius who reputedly could stay up three nights in
succession, typing in a reverie of hacking at his computer.

December 1992: Marc Andreessen makes a brief appearance on WWW-talk

Early Web enthusiasts exchanged ideas and gossip over an electronic discussion group called
WWW-talk. This was where Dave Raggett, Tim Berners-Lee, Dan Connolly and others debated
how images (photographs, diagrams, illustrations and so on) should be inserted into HTML
documents. Not everyone agreed upon the way that the relevant tag should be implemented, or
even what that tag should be called. Suddenly, Marc Andreessen appeared on WWW-talk and,
without further ado, introduced the Mosaic team's idea for the IMG tag.

It was quite plain that the others were not altogether keen on the design of IMG, but Andreessen
was not easily redirected. The IMG tag was implemented in the form suggested by the Mosaic
team on its browser and remains to this day firmly implanted in HTML. This was much to the
chagrin of supporters back in academia who invented several alternatives to IMG in the years to
come. Now, with the coming of HTML 4, the OBJECT tag potentially replaces IMG, but this is,
of course, some years later.

March 1993: Lou Montulli releases the Lynx browser version 2.0a

Lou Montulli was one of the first people to write a text-based browser, Lynx, which ran
on terminals and on computers that used DOS without Windows. Lou
Montulli was later recruited to work with Netscape Communications Corp., but nonetheless
remained partially loyal to the idea of developing HTML as an open standard, proving a real asset
to the HTML working group and the HTML Editorial Board in years to come. Lou's enthusiasm
for good, expensive wine, and his knowledge of excellent restaurants in the Silicon Valley area
were to make the standardization of HTML a much more pleasurable process.

Early 1993: Dave Raggett begins to write his own browser

While Eric Bina and the NCSA Mosaic gang were hard at it hacking through the night, Dave
Raggett of Hewlett-Packard Labs in Bristol was working part-time on his Arena browser, on
which he hoped to demonstrate all sorts of newly invented features for HTML.

April 1993: The Mosaic browser is released

In April 1993, version 1 of the Mosaic browser was released for Sun Microsystems Inc.'s
workstation, a computer used in software development running the UNIX operating system.
Mosaic extended the features specified by Tim Berners-Lee; for example, it added images, nested
lists and fill-out forms. Academics and software engineers later would argue that many of these
extensions were very much ad hoc and not properly designed.

Late 1993: Large companies underestimate the importance of the Web

Dave Raggett's work on the Arena browser was slow because he had to develop much of it single-
handedly: no money was available to pay for a team of developers. This was because Hewlett-
Packard, in common with many other large computer companies, was quite unconvinced that the
Internet would be a success; indeed, the need for a global hypertext system simply passed them
by. For many large corporations, the question of whether or not any money could be made from
the Web was unclear from the outset.

There was also a misconception that the Internet was mostly for academics. In some companies,
senior management was assured that the telephone companies would provide the technology for
global communications of this sort, anyway. The result was that individuals working in research
labs in the commercial sector were unable to devote much time to Web development. This was a
bitter disappointment to some researchers, who gratefully would have committed nearly every
waking moment toward shaping what they envisioned would be the communications system of
the future.

Dave Raggett, realizing that there were not enough working hours left for him to succeed at what
he felt was an immensely important task, continued writing his browser at home. There he would
sit at a large computer that occupied a fair portion of the dining room table, sharing its slightly
sticky surface with paper, crayons, Lego bricks and bits of half-eaten cookies left by the children.
Dave also used the browser to show text flow around images, forms and other aspects of HTML
at the First WWW Conference in Geneva in 1994. The Arena browser was later used for
development work at CERN.

May 1994: NCSA assigns commercial rights for Mosaic browser to Spyglass, Inc.

In May 1994, Spyglass, Inc. signed a multi-million dollar licensing agreement with NCSA to
distribute a commercially enhanced version of Mosaic. In August of that same year, the
University of Illinois at Urbana-Champaign, the home of NCSA, assigned all future commercial
rights for NCSA Mosaic to Spyglass.

May 1994: The first World Wide Web conference is held in Geneva, with HTML+ on show

Although Marc Andreessen and Jim Clark had commercial interests in mind, the rest of the
World Wide Web community had quite a different attitude: they saw themselves as joint creators
of a wonderful new technology, which certainly would benefit the world. They were jiggling with
excitement. Even quiet and retiring academics became animated in discussion, and many seemed
evangelical about their new-found god of the Web.

At the first World Wide Web conference organized by CERN in May 1994, all was merry with
380 attendees - who mostly were from Europe but also included many from the United States.
You might have thought that Marc Andreessen, Jim Clark and Eric Bina surely would be there,
but they were not. For the most part, participants were from the academic community, from
institutions such as the World Meteorological Organization, the International Center for
Theoretical Physics, the University of Iceland and so on. Later conferences had much more of a
commercial feel, but this one was for technical enthusiasts who instinctively knew that this was
the start of something big.

At the World Wide Web conference in Geneva. Left to right: Joseph Hardin from NCSA, Robert
Cailliau from CERN, Tim Berners-Lee from CERN and Dan Connolly (of HTML 2 fame) then
working for Hal software.

During the course of that week, awards were presented for notable achievements on the Web;
these awards were given to Marc Andreessen, Lou Montulli, Eric Bina, Rob Hartill and Kevin
Hughes. Dan Connolly, who proceeded to define HTML 2, gave a slide presentation entitled
Interoperability: Why Everyone Wins, which explained why it was important that the Web
operated with a proper HTML specification. Strange to think that at least three of the people who
received awards at the conference were later to fly in the face of Dan's idea that adopting a cross-
company uniform standard for HTML was essential.

Dave Raggett had been working on some new HTML ideas, which he called HTML+. At the
conference it was agreed that the work on HTML+ should be carried forward to lead to the
development of an HTML 3 standard. Dave Raggett, together with CERN, developed Arena
further as a proof-of-concept browser for this work. Using Arena, Dave Raggett, Henrik Frystyk
Nielsen, Håkon Lie and others demonstrated text flow around a figure with captions, resizable
tables, image backgrounds, math and other features.

A panel discussion at the Geneva conference. Kevin Altis from Intel, Dave Raggett from HP
Labs, Rick `Channing' Rodgers from the National Library of Medicine.

The conference ended with a glorious evening cruise on board a paddle steamer around Lake
Geneva with Wolfgang and the Werewolves providing Jazz accompaniment.

September 1994: The Internet Engineering Task Force (IETF) sets up an HTML working
group

In early 1994, an Internet Engineering Task Force working group was set up to deal with HTML.

The Internet Engineering Task Force is the international standards and development body of the
Internet and is a large, open community of network designers, operators, vendors and researchers
concerned with the evolution and smooth operation of the Internet architecture. The technical
work of the IETF is done in working groups, which are organized by topic into several areas; for
example, security, network routing, and applications. The IETF is, in general, part of a culture
that sees the Internet as belonging to The People. This was even more so in the early days of the
Web.

The feelings of the good ol' days of early Web development are captured in the song, The Net
Flag, which can be found `somewhere on the Internet'. The first verse runs as follows:

The people's web is deepest red,
And oft it's killed our routers dead.
But ere the bugs grew ten days old,
The patches fixed the broken code.

Chorus:

So raise the open standard high
Within its codes we'll live or die
Though cowards flinch and Bill Gates sneers
We'll keep the net flag flying here.

In keeping with normal IETF practices, the HTML working group was open to anyone in the
engineering community: any interested computer scientist could potentially become a member
and, once on its mailing list, could take part in email debate. The HTML working group met
approximately three times a year. At these meetings its members would enjoy a good haggle about
HTML features present and future, pleasantly suffused with coffee and beer, and stride about
plush hotel lobbies sporting pony tails, T-shirts and jeans without the slightest care.

July 1994: HTML specification for HTML 2 is released

During 1993 and early 1994, lots of browsers had added their own bits to HTML; the language
was becoming ill-defined. In an effort to make sense of the chaos, Dan Connolly and colleagues
collected all the HTML tags that were widely used and collated them into a draft document that
defined the breadth of what Tim Berners-Lee called HTML 2. The draft was then circulated
through the Internet community for comment. With the patience of a saint, Dan took into account
numerous suggestions from HTML enthusiasts far and wide, ensuring that all would be happy
with the eventual HTML 2 definition. He also wrote a Document Type Definition for HTML 2, a
kind of mathematically precise description of the language.

November 1994: Netscape is formed

During 1993, Marc Andreessen apparently felt increasingly irritated at simply being on the
Mosaic project rather than in charge of it. Upon graduating, he decided to leave NCSA and head
for California where he met Jim Clark, who was already well known in Silicon Valley and who
had money to invest. Together they formed Mosaic Communications, which then became
Netscape Communications Corp. in November, 1994. What they planned to do was create and
market their very own browser.

The browser they designed was immensely successful - so much so in fact, that for some time to
come, many users would mistakenly think that Netscape invented the Web. Netscape did its best
to make sure that even those who were relying on a low-bandwidth connection - that is, even
those who only had a modem-link from a home personal computer - were able to access the Web
effectively. This was greatly to the company's credit.

Following a predictable path, Netscape began inventing its own HTML tags as it pleased without
first openly discussing them with the Web community. Netscape rarely made an appearance at the
big International WWW conferences, but it seemed to be driving the HTML standard. It was a
curious situation, and one that the inner core of the HTML community felt they must redress.

Late 1994: The World Wide Web Consortium forms

The World Wide Web Consortium was formed in late 1994 to fulfill the potential of the Web
through the development of open standards. They had a strong interest in HTML. Just as an
orchestra insists on the best musicians, so the consortium recruited many of the best-known
names in the Web community. Headed up by Tim Berners-Lee, here are just some of the players
in the band today (1997):

Members of the World Wide Web Consortium at the MIT site. From left to right are Henrik
Frystyk Nielsen, Anselm Baird-Smith, Jay Sekora, Rohit Khare, Dan Connolly, Jim Gettys, Tim
Berners-Lee, Susan Hardy, Jim Miller, Dave Raggett, Tom Greene, Arthur Secret, Karen
MacArthur.

• Dave Raggett on HTML; from the United Kingdom.
• Arnaud le Hors on HTML; from France.
• Dan Connolly on HTML; from the United States.
• Henrik Frystyk Nielsen on HTTP and on enabling the Web to go faster; from Denmark.
• Håkon Lie on style sheets; from Norway. He is located in France, working at INRIA.
• Bert Bos on style sheets and layout; from the Netherlands.
• Jim Miller on investigating technologies that could be used in rating the content of Web
pages; from the United States.
• Chris Lilley on style sheets and font support; from the United Kingdom.

The W3 Consortium is based in part at the Laboratory for Computer Science at the Massachusetts
Institute of Technology in Cambridge, Massachusetts, in the United States; and in part at INRIA,
the Institut National de Recherche en Informatique et en Automatique, a French governmental
research institute. The W3 Consortium is also located in part at Keio University in Japan. You
can look at the Consortium's Web pages on `www.w3.org'.

The consortium is sponsored by a number of companies that directly benefit from its work on
standards and other technology for the Web. The member companies include Digital Equipment
Corp.; Hewlett-Packard Co.; IBM Corp.; Microsoft Corp.; Netscape Communications Corp.; and
Sun Microsystems Inc., among many others.

Through 1995: HTML is extended with many new tags


During 1995, all kinds of new HTML tags emerged. Some, like the BGCOLOR attribute of the
BODY element and FONT FACE, which control stylistic aspects of a document, found
themselves in the black books of the academic engineering community. `You're not supposed to
be able to do things like that in HTML,' they would protest. It was their belief that such things as
text color, background texture, font size and font face were definitely outside the scope of a
language whose only intent was to specify how a document would be organized.

March 1995: HTML 3 is published as an Internet Draft

Dave Raggett had been working for some time on his new ideas for HTML, and at last he
formalized them in a document published as an Internet Draft in March, 1995. All manner of
HTML features were covered. A new tag for inserting images called FIG was introduced, which
Dave hoped would supersede IMG, as well as a whole gamut of features for marking up math
and scientific documents. Dave dealt with HTML tables and tabs, footnotes and forms. He also
added support for style sheets by including a STYLE tag and a CLASS attribute. The latter was to
be available on every element to encourage authors to give HTML elements styles, much as you
do in desktop publishing.

Although the HTML 3 draft was very well received, it was somewhat difficult to get it ratified by
the IETF. The belief was that the draft was too large and too full of new proposals. To get
consensus on a draft 150 pages long and about which everyone wanted to voice an opinion was
optimistic - to say the least. In the end, Dave and the inner circle of the HTML community
decided to call it a day.

Of course, browser writers were very keen on supporting HTML 3 - in theory. Inevitably, each
browser writer chose to implement a different subset of HTML 3's features as they were so
inclined, and then proudly proclaimed to support the standard. The confusion was mind-boggling,
especially as browsers even came out with extensions to HTML 3, implying to the ordinary gent
that normal HTML 3 was, of course, already supported. Was there an official HTML 3 standard
or not? The truth was that there was not, but reading the computer press you might never have
known the difference.

March 1995: A furor over the HTML Tables specification

Dave Raggett's HTML 3 draft had tackled the tabular organization of information in HTML.
Arguments over this aspect of the language had continued for some time, but now it was time to
really get going. At the 32nd meeting of the IETF in Danvers, Massachusetts, Dave found a group
from the SGML brethren who were up in arms over part of the tables specification because it
contradicted the CALS table model. Groups such as the US Navy use the CALS table model in
complex documentation. After long negotiation, Dave managed to placate the CALS table
delegates and altered the draft to suit their needs. HTML tables, which were not in HTML
originally, finally surfaced from the HTML 3 draft to appear in HTML 3.2. They continue to be
used extensively for the purpose of providing a layout grid for organizing pictures and text on the
screen.

August 1995: Microsoft's Internet Explorer browser comes out

Version 1.0 of Microsoft Corp.'s Internet Explorer browser was announced. This browser was
eventually to compete with Netscape's browser, and to evolve its own HTML features. To a
certain extent, Microsoft built its business on the Web by extending HTML features. The
ActiveX feature made Microsoft's browser unique, and Netscape developed a plug-in called
Ncompass to handle ActiveX. This whole idea whereby one browser experiments with an
extension to HTML only to find others adding support to keep even, continues to the present.

In November 1995, Microsoft's Internet Explorer version 2.0 arrived for its Windows NT and
Windows 95 operating systems.

September 1995: Netscape submits a proposal for frames

At this time, Netscape submitted a proposal for frames, which involved the screen being divided
into independent, scrollable areas. The proposal was implemented on Netscape's Navigator
browser before anyone really had time to comment on it, but nobody was surprised.

November 1995: The HTML working group runs into problems

The HTML working group was an excellent idea in theory, but in practice things did not go quite
as expected. With the immense popularity of the Web, the HTML working group grew larger and
larger, and the volume of associated email soared exponentially. Imagine one hundred people
trying to design a house. `I want the windows to be double-glazed,' says one. `Yes, but shouldn't
we make them smaller, while we're at it,' questions another. Still others chime in: `What material
do you propose for the frames - I'm not having them in plastic, that's for sure'; `I suggest that we
don't have windows, as such, but include small, circular port-holes on the Southern elevation...'
and so on.

You get the idea. The HTML working group emailed each other in a frenzy of electronic activity.
In the end, its members became so snowed under with email that no time was left for
programming. For software engineers, this was a sorry state of affairs, indeed: `I came back after
just three days away to find over 2000 messages waiting,' was the unhappy lament of the HTML
enthusiast.

Anyway, the HTML working group still was losing ground to the browser vendors. The group
was notably slow in coming to a consensus on a given HTML feature, and commercial
organizations were hardly going to sit around having tea, pleasantly conversing on the weather
whilst waiting for the results of debates. And they did not.

November 1995: Vendors unite to form a new group dedicated to developing an HTML
standard

In November 1995, Dave Raggett called together representatives of the browser companies and
suggested they meet as a small group dedicated to standardizing HTML. Imagine his surprise
when it worked! Lou Montulli from Netscape, Charlie Kindel from Microsoft, Eric Sink from
Spyglass, Wayne Gramlich from Sun Microsystems, Dave Raggett, Tim Berners-Lee and Dan
Connolly from the W3 Consortium, and Jonathan Hirschman from Pathfinder convened near
Chicago and made quick and effective decisions about HTML.

November 1995: Style sheets for HTML documents begin to take shape

Bert Bos, Håkon Lie, Dave Raggett, Chris Lilley and others from the World Wide Web
Consortium and elsewhere met in Versailles near Paris to discuss the deployment of Cascading Style
Sheets. The name Cascading Style Sheets implies that more than one style sheet can interact to
produce the final look of the document. Using a special language, the CSS group advocated that
everyone would soon be able to write simple styles for HTML, as one would do in Microsoft
Word and other desktop publishing software packages. The SGML contingent, who preferred a
LISP-like language called DSSSL - it rhymes with whistle - seemed out of the race when
Microsoft promised to implement CSS on its Internet Explorer browser.
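
The word `cascading' can be illustrated with a rough sketch in Python. Here each style sheet is modeled as a table from an element name to its property settings, and the sheets are merged in order so that a later sheet overrides an earlier one wherever both set the same property. This is a deliberate simplification and an assumption of the example: the real CSS rules also weigh such things as how specific a selector is and whether a setting is marked important. The function name cascade and the three sample sheets are invented for illustration.

```python
def cascade(*style_sheets):
    """Merge style sheets in order; later sheets win on conflicts.

    Each sheet maps an element name to a dict of property settings.
    The input sheets are left unmodified.
    """
    result = {}
    for sheet in style_sheets:
        for selector, declarations in sheet.items():
            # Start a fresh dict for a new selector, then layer the
            # declarations on top of whatever earlier sheets supplied.
            result.setdefault(selector, {}).update(declarations)
    return result


# Hypothetical sheets: the browser's defaults, the reader's preferences
# and the author's own styles, applied in that order.
browser_default = {"H1": {"font-size": "24pt", "color": "black"}}
reader_sheet = {"P": {"font-size": "14pt"}}
author_sheet = {"H1": {"color": "navy"}, "P": {"font-family": "serif"}}

final = cascade(browser_default, reader_sheet, author_sheet)

if __name__ == "__main__":
    for selector, declarations in final.items():
        print(selector, declarations)
```

In this sketch the heading keeps the browser's size but takes the author's color, which is the sense in which several sheets `interact' to produce the final look.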

November 1995: Internationalization of HTML Internet Draft

Gavin Nicol, Gavin Adams and others presented a long paper on the internationalization of the
Web. Their idea was to extend the capabilities of HTML 2, primarily by removing the restriction
on the character set used. HTML could then be used to mark up languages beyond those covered
by the Latin-1 character set, taking in a wider variety of alphabets and character sets,
including those that read from right to left.

December 1995: The HTML working group is dismantled

Since the IETF HTML working group was having difficulties coming to consensus swiftly
enough to cope with such a fast-evolving standard, it was eventually dismantled.

February 1996: The HTML ERB is formed

Following the success of the November, 1995 meeting, the World Wide Web Consortium formed
the HTML Editorial Review Board to help with the standardization process. This board consisted
of representatives from IBM, Microsoft, Netscape, Novell, Softquad and the W3 Consortium, and
did its business via telephone conference and email exchanges, meeting approximately once
every three months. Its aim was to collaborate and agree upon a common standard for HTML,
thus putting an end to the era when browsers each implemented a different subset of the language.
The bad fairy of incompatibility was to be banished from the HTML kingdom forever, or one
could hope so, perhaps.

Dan Connolly of the W3 Consortium, also author of HTML 2, deftly accomplished the feat of
chairing what could be quite a raucous meeting of the clans. Dan managed to make sure that all
representatives had their say and listened to each other's point of view in an orderly manner. A
strong chair was absolutely essential in these meetings.

In preparation for an ERB meeting, specifications describing new aspects of HTML were made
electronically available for ERB members to read. Then, at the meeting itself, the proponent
explained some of the rationale behind the specification, and then dearly hoped that all who were
present also concurred that the encapsulated ideas were sound. Questions such as, `should a
particular feature be included, or should we kick it out,' would be considered. Each representative
would air his point of view. If all went well, the specification might eventually see daylight and
become a standard. At the time of writing, the next HTML standard, code-named Cougar, has
begun its long journey in this direction.

The BLINK tag was ousted in an HTML ERB meeting. Netscape would only abolish it if
Microsoft agreed to get rid of MARQUEE; the deal was struck and both tags disappeared. Both
of these extensions have always been considered slightly goofy by all parties. Many tough
decisions were to be made about the OBJECT specification. Out of a chaos of several different
tags - EMBED, APP, APPLET, DYNSRC and so on - all associated with embedding different
types of information in HTML documents, a single OBJECT tag was chosen in April, 1996. This
OBJECT tag did not become part of the HTML standard until 1997.

April 1996: The W3 Consortium working draft on Scripting comes out

Based on an initial draft by Charlie Kindel, and, in turn, derived from Netscape's extensions for
JavaScript, a W3C working draft on the subject of Scripting was written by Dave Raggett. In one
form or another, this draft should eventually become part of standard HTML.

July 1996: Microsoft seems more interested than first imagined in open standards

In April 1996, Microsoft's Internet Explorer became available for Macintosh and Windows 3.1
systems.

Thomas Reardon had been excited by the Web even at the second WWW conference held in
Darmstadt, Germany in 1995. One year later, he seemed very interested in the standardization
process and apparently wanted Microsoft to do things the right way with the W3C and with the
IETF. Traditionally, developers are somewhat disparaging about Microsoft, so this was an
interesting turn of events. It should be said that Microsoft did, of course, invent tags of their own,
just as did Netscape. These included the remarkable MARQUEE tag that caused great mirth
among the more academic HTML community. The MARQUEE tag made text dance about all
over the screen - not exactly a feature you would expect from a serious language concerned with
structural mark-up such as paragraphs, headings and lists.

The worry that a massive introduction of proprietary products would kill the Web continued.
Netscape acknowledged that vendors needed to push ahead of the standards process and innovate.
They pointed out that, if users like a particular Netscape innovation, then the market would drive
it to become a de facto standard. This seemed quite true at the time and, indeed, Netscape has
innovated on top of that standard again. It's precisely this sequence of events that Dave Raggett
and the World Wide Web Consortium were trying to avoid.

December 1996: Work on `Cougar' is begun

The HTML ERB became the HTML Working Group and began to work on `Cougar', the next
version of HTML, due for completion in late Spring, 1997 and eventually to become HTML 4. With
all sorts of innovations for the disabled and support for international languages, as well as
style sheet support, extensions to forms, scripting and much more, HTML 4 breaks away from the
simplicity and charm of HTML of earlier years!

[Photograph: Dave Raggett, co-editor of the HTML 4 specification, at work composing at the
keyboard at his home in Boston.]

January 1997: HTML 3.2 is ready

Success! In January 1997, the W3 Consortium formally endorsed HTML 3.2 as an HTML cross-
industry specification. HTML 3.2 had been reviewed by all member organizations, including
major browser vendors such as Netscape and Microsoft. This meant that the specification was
now stable and approved of by most Web players. By providing a neutral forum, the W3
Consortium had successfully obtained agreement upon a standard version of HTML. There was
great rejoicing, indeed. HTML 3.2 took the existing IETF HTML 2 standard and incorporated
features from HTML+ and HTML 3. HTML 3.2 included tables, applets, text flow around
images, subscripts and superscripts.

One might well ask why HTML 3.2 was called HTML 3.2 and not, let's say, HTML 3.1 or HTML
3.5. The version number is open to discussion just as much as is any other aspect of HTML. The
version number is often one of the last details to be decided.

Update
Spring 1998: Cougar has now fully materialized as HTML 4.0 and is a W3C Proposed
Recommendation. But do the major browsers implement HTML 4.0, you wonder? As usual in the
computer industry, there is no simple answer. Certainly things are heading in that direction.
Neither Netscape's nor Microsoft's browser completely implements style sheets in the way
specified, which is a pity, but no doubt they will make amends. There are a number of
peculiarities in the way that OBJECT works but we very much hope that this will also eventually
be implemented in a more consistent manner.

What is HTML?

An Opinion About the Purpose and Use of the Worldwide Web's Document Markup
Language

Michael H. Kelsey

Adapted from a pair of articles posted to comp.infosystems.www.authoring.html on 23 June 1995
and 6 July 1995. Part 1 outlines my idea of the philosophy behind HTML, as a
platform-independent way to identify content, rather than presentation. Part 2 describes my opinion of
HTML as a simple language, accessible to non-professionals, and the importance of that for
allowing everyone to make their ideas and opinions available to the world community.

Please send any comments, criticisms, or corrections to me, or post them to the
comp.infosystems.www.authoring.html Usenet Newsgroup.

Last revised: 12 August 1995 (forgot NAME anchors)

In article <3sf10f$9oc@folda.ismennt.is>, Laura V. wrote:

OK, I've read and tried out the basics in HTML...would anyone care to recommend a "next step"
manual? I'm an artist, so I'll be using lots of graphics. I'm thinking of doing a page designed for
Netscape so I can get really specific, then have an alternate page in "proper html" for other
browsers.

Thanks alot! Laura V.

Before looking for a particular manual, it might be useful for you to get comfortable with the
"philosophy" of HTML. It sounds like you believe you should approach HTML markup from a
"visual" perspective. If you do this, I believe you will find the process extremely difficult, and
you will likely end up with results that are unsatisfying. Not because of anything you do "wrong,"
but because you choose the wrong tool for the task. It would be like trying to use a chisel and
mallet to paint a watercolor. It can be done, and might even make a powerful artistic statement,
but it would not make a good watercolor portrait.

HTML was and is designed as a structural description of a document -- think of it as describing
the "composition" of a piece, rather than specific things like the canvas material, the color of the
frame, and so on. Those things are important, but there are other tools (the local user's browser
configuration) to define them. HTML is a language which allows you to identify each component
of a document as a particular piece of information. Those components can then be extracted,
searched, or presented in a variety of ways, whichever is appropriate for the particular user.

The elements ("tags") of HTML are used to identify structural components:

<P>
"This is a paragraph"
<H1>
"What follows is the main section of this document"
<H2>
"What follows is a subsection of the main section"
<UL>
"This is a list of items which don't need to be ordered"
<BLOCKQUOTE>
"This is a long quotation from another author"
<DT>
"This is a term which is going to be defined"
<TABLE>
"This is a collection of two-dimensional data"
and so on. If you are interpreting the document visually, then there are certain conventions by
which each of these components is presented, to make it easy for a person to recognize and
associate the information. But you might not be interpreting the document visually -- you might
have a program which takes a document and generates a table of contents, or an executive
summary, or a list of citations. Such programs are designed to pick out particular structural
components, extract them, maybe even convert one set of markup tags to another, and produce a
new document which contains some or all of the original information, in a different format.
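
The table-of-contents program mentioned above can be sketched in a few lines. What follows is only an illustration, written in Python with its standard html.parser module; the names HeadingCollector and table_of_contents and the sample markup are invented for this sketch and do not come from any real tool. It picks out the heading elements, ignores every presentational attribute, and produces an indented outline.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every H1-H6 element."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []   # finished (level, text) pairs
        self._level = None   # level of the heading being read, if any
        self._parts = []     # text fragments of that heading

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lower case; attrs (such as
        # ALIGN) are deliberately ignored, since an outline has no use
        # for presentation hints.
        if tag in self.HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._level is not None:
            text = " ".join("".join(self._parts).split())
            self.headings.append((self._level, text))
            self._level = None


def table_of_contents(html_source):
    """Return an indented outline, one line per heading."""
    collector = HeadingCollector()
    collector.feed(html_source)
    collector.close()
    return ["  " * (level - 1) + text for level, text in collector.headings]


# Hypothetical sample document, used only for illustration.
sample = """
<H1 ALIGN=CENTER>A history of HTML</H1>
<P>Some introductory text.</P>
<H2 ALIGN=RIGHT>The early days</H2>
<P>More text.</P>
<H2>Standardization</H2>
"""

for line in table_of_contents(sample):
    print(line)
```

The point of the sketch is that the program never needs to know how a heading was meant to look; the structural markup alone is enough to build the outline.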

Now, certainly visual presentation is important, and there are many situations in which a
document author will want to encourage a particular presentation of the information. Centering,
alignment, wrap-around text flow; none of these things are structural in themselves, but they add
important visual cues to assist the reader in understanding the relationships between the
information components of the document. Attributes are used to qualify, or modify, the
presentation of specific structural elements of an HTML document.

<P ALIGN=CENTER>
"This paragraph should be centered on the page"
<H1 ALIGN=RIGHT>
"This heading should be put at the right margin"
<A HREF="Some-Url">
"This anchor point is a hyperlink to a document"
<UL PLAIN>
"This unordered list should be presented without any bullets"

There are two important things to realize about attributes. First, they are "suggestions:" if the end
user is accessing your document in an unconventional manner, these suggestions may well be
ignored. For example, if I'm using a program to generate a table of contents by extracting all of
the section headings from my document, obviously it doesn't matter whether I wanted the heading
centered, left- or right-justified -- in the table of contents the headings should be presented in the
form of an outline, with different indention levels indicating what sort of section each is, so the
ALIGN attributes would be ignored.

Second, some browsers may not be programmed to understand certain attributes. If you make use
of standard HTML elements, _every_ browser will be able to identify the structure, and the
information content, of your document. They will do the best job they can to present that
structure in a manner that the user can easily identify and understand, even if they cannot follow
all of the special suggestions you have made for that presentation.

On the other hand, if you "misuse" the elements of HTML, believing that a particular structural
component _must_ be presented in a particular way, or using elements in a context where they are
meaningless, then it will be difficult if not impossible for browsers to present your information
sensibly. Different browser programmers have made different choices about the "best" way to
render the structure of a document. Sometimes these choices are forced by limitations of the
device for which the browser is designed (a voice-synthesizer cannot do hanging indents, any
more than a landscape mural can be painted with a G-major triad). Other times, the choices are
aesthetic, and different people have different ideas. Each browser is (or better be!) internally
consistent, but there isn't really any reason to expect two different browsers to do things
identically.

To return to my original point about choosing the right tool for the job. If you are presenting
textual information, for example a retrospective of your own work, where you show some of your
paintings and drawings, with a description of your thoughts and motivations for each one, then
HTML is a terrific way to make that available on the Web. Anyone, with any sort of browser
(even a non-graphical one!) can access and understand your information, if you use HTML
elements to mark the logical, structural components of your document, and attributes to
recommend a particularly pleasing arrangement of those components.

On the other hand, if you are seeking to compose "multimedia art" as an end in itself, HTML may
not be the appropriate tool. If you want to combine letters and words of different sizes and
shapes, mingle fragments of text with graphic images, ensure that there is exactly 17 mm of space
between these two images, because "17" has mystical significance, then HTML would lead you
down a path of despair. Every browser would try to interpret (or worse, completely ignore!) your
markup differently, and the special configurations that each user has for his or her browser (font
sizes, window size, colors, etc.) would make your art appear radically different to different
people. For this purpose, a true, well-designed page-layout language like PostScript or PDF
(Adobe Acrobat), would give you the essential control needed for such a visual project, and
would guarantee that anyone who looked at your artwork online would see either the same thing,
or would see nothing at all (if they don't have the appropriate viewer).

HTML is not a page description language, and PDF is not a device-independent document
markup language. Using either one for the other purpose is a mistake.

In article <3ti871$btc_003@dialup.inch.com>, Justin Greene <jgreene@greene.com> wrote:

Much of the fluff is people experimenting with a technology designed for delivery of content who
do not understand what delivery of content means, nor do they have the training to do it properly
(design training, not HTML). Many of these sites have design for ego sake, not content sake. Just
because Joy Schmo has some DTP program on her system does not make her qualified to produce
presentations. There is a need to understand the tools at hand and the proper use of them.

HTML is an incredibly simple language. Anyone, with little or no training -- even little or no
experience with computers! -- can see pages on the Web, look at the HTML source, and start
making their own Web pages within an hour or two. I'm not exaggerating, I have watched
someone (one of our secretaries) do exactly this.

The reason Justin and I seem to disagree so much, I think, is because he is assuming that the
people who are producing material for the Web are, or want to be, graphic designers. They aren't.
They are business people, college students, housewives, teenagers, even kindergarteners! Justin
wants a powerful language that is suitable for both proper, platform-independent markup of
content, and for powerful graphic design to present that content on sufficiently advanced
platforms. For him (and even for the scientific geeks like me for whom the Web was invented)
that would be terrific, and we'd see some really great sites develop as a result.

But for the millions (yes, millions) of potential information providers out there, the housewives
and businessmen and teenagers, that's not reasonable. It would be like giving a kid a D-9
Caterpillar Tractor so he can build a snow fort. There are some really powerful graphic design
systems out there -- Acrobat and PostScript both have plans to incorporate hypermedia capability,
for example, though I have yet to see a mass market browser for either one.

HTML is meant to be a very simple, pure content language, where most (originally all) of the
minutiae of presentation were left to the platform-specific software. The idea was that you'd have
graphic designers contributing to the development of browsers -- making sensible choices of
fonts, paragraph spacing, list and table rendering, and so on -- so that the user would get an
intelligible presentation of whatever information was made available. If someone had a need for
exact graphical control within an otherwise content-driven HTML document, they would make
use of the limited inline-graphic capabilities to achieve that control.

I believe this is an appropriate goal for HTML, because it puts the power of expression into the
hands of the maximum number of people. Everyone in the world (eventually), can make his or
her voice heard. The niece of the desaparecido who announced her home page a month or so ago;
the fourth grader in Cedar Rapids who wants to find a pen pal in New Zealand; the physicist who
wants to make his current research projects available to his colleagues. The graphic designers
have far more powerful tools available at their disposal, and they have the years of training
necessary to use them effectively, that I don't see why they need to regress to something as
primitive as HTML to make their message

Getting Started

Terms to Know
WWW
World Wide Web
Web
World Wide Web
SGML
Standard Generalized Markup Language--a standard for describing markup languages
DTD
Document Type Definition--this is the formal specification of a markup language, written
using SGML
HTML
HyperText Markup Language--HTML is an SGML DTD
In practical terms, HTML is a collection of platform-independent styles (indicated by
markup tags) that define the various components of a World Wide Web document.
HTML was invented by Tim Berners-Lee while at CERN, the European Laboratory for
Particle Physics in Geneva.

What Isn't Covered

This primer assumes that you:

• know how to use NCSA Mosaic or some other Web browser
• have a general understanding of how Web servers and client browsers work
• have access to a Web server (or that you want to produce HTML documents for personal
use in local-viewing mode)

HTML Version

This guide reflects the most current specification--HTML Version 4.0--plus some additional
features that have been widely and consistently implemented in browsers. Future versions and
new features for HTML are under development.

HTML Documents

What an HTML Document Is

HTML documents are plain-text (also known as ASCII) files that can be created using any text
editor (e.g., Emacs or vi on UNIX machines; SimpleText on a Macintosh; Notepad on a Windows
machine). You can also use word-processing software if you remember to save your document as
"text only with line breaks".

HTML Editors

Some WYSIWYG editors are also available (e.g., Claris Home Page or Adobe PageMill, both for
Windows and Macintosh). You may wish to try one of them after you learn some of the basics of
HTML tagging. WYSIWYG is an acronym for "what you see is what you get"; it means that you
design your HTML document visually, as if you were using a word processor, instead of writing
the markup tags in a plain-text file and imagining what the resulting page will look like. It is
useful to know enough HTML to code a document before you determine the usefulness of a
WYSIWYG editor, in case you want to add HTML features that your editor doesn't support.

Getting Your Files on a Server

If you have access to a Web server at school or work, contact your webmaster (the individual who
maintains the server) to see how you can get your files on the Web. If you do not have access to a
server at work or school, check to see if your community operates a FreeNet, a community-based
network that provides free access to the Internet. Lacking a FreeNet, you may need to contact a
local Internet provider that will post your files on a server for a fee. (Check your local newspaper
for advertisements or with your Chamber of Commerce for the names of companies.)

Tags Explained

An element is a fundamental component of the structure of a text document. Some examples of
elements are heads, tables, paragraphs, and lists. Think of it this way: you use HTML tags to
mark the elements of a file for your browser. Elements can contain plain text, other elements, or
both.

To denote the various elements in an HTML document, you use tags. HTML tags consist of a left
angle bracket (<), a tag name, and a right angle bracket (>). Tags are usually paired (e.g., <H1>
and </H1>) to start and end the tag instruction. The end tag looks just like the start tag except a
slash (/) precedes the text within the brackets. HTML tags are listed below.

Some elements may include an attribute, which is additional information that is included inside
the start tag. For example, you can specify the alignment of images (top, middle, or bottom) by
including the appropriate attribute with the image source HTML code. Tags that have optional
attributes are noted below.

NOTE: HTML is not case sensitive. <title> is equivalent to <TITLE> or <TiTlE>. There are a
few exceptions noted in Escape Sequences below.

Not all tags are supported by all World Wide Web browsers. If a browser does not support a tag,
it will simply ignore it. Any text placed between a pair of unknown tags will still be displayed,
however.
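
Both points, case insensitivity and the treatment of unknown tags, can be observed with a small parser. This is a sketch only, using Python's standard html.parser module rather than any particular browser; the class name TagAndTextLogger, the function tags_and_text, and the SPARKLE tag are made up for the illustration.

```python
from html.parser import HTMLParser


class TagAndTextLogger(HTMLParser):
    """Record the start tags seen and all text between tags."""

    def __init__(self):
        super().__init__()
        self.tags_seen = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # The parser hands over tag names in lower case, whatever
        # case the author typed.
        self.tags_seen.append(tag)

    def handle_data(self, data):
        # Text is kept whether or not the enclosing tag is known.
        self.text_parts.append(data)


def tags_and_text(source):
    parser = TagAndTextLogger()
    parser.feed(source)
    parser.close()
    return parser.tags_seen, "".join(parser.text_parts)


# Mixed-case tags are reported identically.
print(tags_and_text("<h2>One</H2><H2>Two</h2>"))

# SPARKLE is an invented tag: it is reported like any other tag,
# and the text inside it is still there to be displayed.
print(tags_and_text("<P>Some <SPARKLE>shiny</SPARKLE> text.</P>"))
```

A browser that does not know SPARKLE would behave in a comparable way: it would skip the tag itself and go on to display the words "Some shiny text."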

The Minimal HTML Document

Every HTML document should contain certain standard HTML tags. Each document consists of
head and body text. The head contains the title, and the body contains the actual text that is made
up of paragraphs, lists, and other elements. Browsers expect specific information because they are
programmed according to HTML and SGML specifications.

Required elements are shown in this sample bare-bones document:

<html>
<head>
<TITLE>A Simple HTML Example</TITLE>
</head>
<body>
<H1>HTML is Easy To Learn</H1>
<P>Welcome to the world of HTML.
This is the first paragraph. While short it is
still a paragraph!</P>
<P>And this is the second paragraph.</P>
</body>
</html>

The required elements are the <html>, <head>, <title>, and <body> tags (and their corresponding
end tags). Because you should include these tags in each file, you might want to create a template
file with them. (Some browsers will format your HTML file correctly even if these tags are not
included. But some browsers won't! So make sure to include them.)

A Teaching Tool

To see a copy of the file that your browser reads to generate the information in your current
window, select View Source (or the equivalent) from the browser menu. (Most browsers have a
"View" menu under which this command is listed.) The file contents, with all the HTML tags, are
displayed in a new window.

This is an excellent way to see how HTML is used and to learn tips and constructs. Of course, the
HTML might not be technically correct. Once you become familiar with HTML and check the
many online and hard-copy references on the subject, you will learn to distinguish between
"good" and "bad" HTML.

Remember that you can save a source file with the HTML codes and use it as a template for one
of your Web pages or modify the format to suit your purposes.

Markup Tags

HTML

This element tells your browser that the file contains HTML-coded information. The file
extension .html also indicates that this is an HTML document and must be used. (If you are
restricted to 8.3 filenames (e.g., LeeHome.htm), use only .htm for your extension.)

HEAD

The head element identifies the first part of your HTML-coded document that contains the title.
The title is shown as part of your browser's window (see below).

TITLE

The title element contains your document title and identifies its content in a global context. The
title is typically displayed in the title bar at the top of the browser window, but not inside the
window itself. The title is also what is displayed on someone's hotlist or bookmark list, so choose
something descriptive, unique, and relatively short. A title is also used to identify your page for
search engines (such as HotBot or Infoseek).

For example, you might include a shortened title of a book along with the chapter contents:
NCSA Mosaic Guide (Windows): Installation. This tells the software name, the platform, and the
chapter contents, which is more useful than simply calling the document Installation. Generally
you should keep your titles to 64 characters or fewer.

BODY

The second--and largest--part of your HTML document is the body, which contains the content of
your document (displayed within the text area of your browser window). The tags explained
below are used within the body of your HTML document.

Headings

HTML has six levels of headings, numbered 1 through 6, with 1 being the largest. Headings are
typically displayed in larger and/or bolder fonts than normal body text. The first heading in each
document should be tagged <H1>.

The syntax of the heading element is:


<Hy>Text of heading </Hy>
where y is a number between 1 and 6 specifying the level of the heading.

Do not skip levels of headings in your document. For example, don't start with a level-one
heading (<H1>) and then next use a level-three (<H3>) heading.
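
A rule like this is easy to check mechanically. The following Python sketch is an assumption about how such a check might be written, not part of any HTML tool; the function name skipped_heading_levels is invented here. It scans the source for heading start tags and reports every place where the level jumps by more than one.

```python
import re

# Matches the start tags <H1> ... <H6> in any letter case,
# with or without attributes.
HEADING_RE = re.compile(r"<h([1-6])\b", re.IGNORECASE)


def skipped_heading_levels(html_source):
    """Return (previous_level, level) pairs where a level was skipped.

    The level before the first heading counts as 0, so a document
    whose first heading is not H1 is reported as well.
    """
    problems = []
    previous = 0
    for match in HEADING_RE.finditer(html_source):
        level = int(match.group(1))
        if level > previous + 1:
            problems.append((previous, level))
        previous = level
    return problems


print(skipped_heading_levels("<H1>Guide</H1><H3>Details</H3>"))
print(skipped_heading_levels("<H1>Guide</H1><H2>Part</H2><H3>Details</H3>"))
```

Moving back up the hierarchy, for example from a level-three heading to a level-two heading, is not reported, because nothing is skipped in that direction.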

Paragraphs

Unlike documents in most word processors, carriage returns in HTML files aren't significant. In
fact, any amount of whitespace -- including spaces, linefeeds, and carriage returns -- is
automatically compressed into a single space when your HTML document is displayed in a
browser. So you don't have to worry about how long your lines of text are. Word wrapping can
occur at any point in your source file without affecting how the page will be displayed.
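
The compression rule can be imitated in Python. This is a rough sketch of the idea, not the algorithm of any particular browser, and the function name collapse_whitespace is made up for the illustration. Note that the sketch also drops whitespace at the very start and end of the text.

```python
def collapse_whitespace(text):
    """Replace every run of spaces, tabs and line breaks with one space."""
    return " ".join(text.split())


# The same sentence, wrapped and spaced in two different ways.
wrapped = """Welcome to the world of HTML.
This is the first paragraph.   While short it is
still a paragraph!"""

one_line = ("Welcome to the world of HTML. This is the first paragraph. "
            "While short it is still a paragraph!")

# Both sources collapse to the same text, so they would be
# displayed the same way.
print(collapse_whitespace(wrapped) == collapse_whitespace(one_line))
```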

In the bare-bones example shown in the Minimal HTML Document section, the first paragraph is
coded as

<P>Welcome to the world of HTML.
This is the first paragraph. While short it is
still a paragraph!</P>

In the source file there is a line break between the sentences. A Web browser ignores this line
break and starts a new paragraph only when it encounters another <P> tag.

Important: You must indicate paragraphs with <P> elements. A browser ignores any
indentations or blank lines in the source text. Without <P> elements, the document becomes one
large paragraph. (One exception is text tagged as "preformatted," which is explained below.) For
example, the following would produce the same output as the first bare-bones HTML example:

<H1>Level-one heading</H1>
<P>Welcome to the world of HTML. This is the
first paragraph. While short it is still a
paragraph! </P> <P>And this is the second paragraph.</P>

To preserve readability in HTML files, put headings on separate lines, use a blank line or two
where it helps identify the start of a new section, and separate paragraphs with blank lines (in
addition to the <P> tags). These extra spaces will help you when you edit your files (but your
browser will ignore the extra spaces because it has its own set of rules on spacing that do not
depend on the spaces you put in your source file).

NOTE: The </P> closing tag may be omitted. This is because browsers understand that when
they encounter a <P> tag, it means that the previous paragraph has ended. However, since HTML
now allows certain attributes to be assigned to the <P> tag, it's generally a good idea to include it.

Using the <P> and </P> as a paragraph container means that you can center a paragraph by
including the ALIGN=alignment attribute in your source file.

<P ALIGN=CENTER>
This is a centered paragraph.
[See the formatted version below.]
</P>

This is a centered paragraph.

It is also possible to align a paragraph to the right instead, by including the ALIGN=RIGHT
attribute. ALIGN=LEFT is the default alignment; if no ALIGN attribute is included, the
paragraph will be left-aligned.

Lists

HTML supports unnumbered, numbered, and definition lists. You can nest lists too, but use this
feature sparingly because too many nested items can get difficult to follow.

Unnumbered Lists

To make an unnumbered, bulleted list,

1. start with an opening list <UL> (for unnumbered list) tag
2. enter the <LI> (list item) tag followed by the individual item; no closing </LI> tag is
needed
3. end the entire list with a closing list </UL> tag

Below is a sample three-item list:

<UL>
<LI> apples
<LI> bananas
<LI> grapefruit
</UL>

The output is:

• apples
• bananas
• grapefruit

The <LI> items can contain multiple paragraphs. Indicate the paragraphs with the <P> paragraph
tags.

Numbered Lists

A numbered list (also called an ordered list, from which the tag name derives) is identical to an
unnumbered list, except it uses <OL> instead of <UL>. The items are tagged using the same
<LI> tag. The following HTML code:

<OL>
<LI> oranges
<LI> peaches
<LI> grapes
</OL>

produces this formatted output:

1. oranges
2. peaches
3. grapes

Definition Lists

A definition list (coded as <DL>) usually consists of alternating a definition term (coded as
<DT>) and a definition description (coded as <DD>). Web browsers generally format the
definition on a new line and indent it.

The following is an example of a definition list:

<DL>
<DT> NCSA
<DD> NCSA, the National Center for Supercomputing
Applications, is located on the campus of the
University of Illinois at Urbana-Champaign.
<DT> Cornell Theory Center
<DD> CTC is located on the campus of Cornell
University in Ithaca, New York.
</DL>

The output looks like:

NCSA
NCSA, the National Center for Supercomputing Applications, is located on the campus
of the University of Illinois at Urbana-Champaign.
Cornell Theory Center
CTC is located on the campus of Cornell University in Ithaca, New York.

The <DT> and <DD> entries can contain multiple paragraphs (indicated by <P> paragraph tags),
lists, or other definition information.

The COMPACT attribute can be used routinely in case your definition terms are very short. If,
for example, you are showing some computer options, the options may fit on the same line as the
start of the definition.

<DL COMPACT>
<DT> -i
<DD>invokes NCSA Mosaic for Microsoft Windows
using the initialization file defined in the path
<DT> -k
<DD>invokes NCSA Mosaic for Microsoft Windows in
kiosk mode
</DL>

The output looks like:
-i
invokes NCSA Mosaic for Microsoft Windows using the initialization file defined in the
path.
-k
invokes NCSA Mosaic for Microsoft Windows in kiosk mode.

Nested Lists

Lists can be nested. You can also have a number of paragraphs, each containing a nested list, in a
single list item.

Here is a sample nested list:

<UL>
<LI> A few New England states:
<UL>
<LI> Vermont
<LI> New Hampshire
<LI> Maine
</UL>
<LI> Two Midwestern states:
<UL>
<LI> Michigan
<LI> Indiana
</UL>
</UL>

The nested list is displayed as

• A few New England states:
    o Vermont
    o New Hampshire
    o Maine
• Two Midwestern states:
    o Michigan
    o Indiana

Preformatted Text

Use the <PRE> tag (which stands for "preformatted") to generate text in a fixed-width font. This
tag also makes spaces, new lines, and tabs significant -- multiple spaces are displayed as multiple
spaces, and lines break in the same locations as in the source HTML file. This is useful for
program listings, among other things. For example, the following lines:

<PRE>
#!/bin/csh
cd $SCR
cfs get mysrc.f:mycfsdir/mysrc.f
cfs get myinfile:mycfsdir/myinfile
fc -02 -o mya.out mysrc.f
mya.out
cfs save myoutfile:mycfsdir/myoutfile
rm *
</PRE>

display as:

#!/bin/csh
cd $SCR
cfs get mysrc.f:mycfsdir/mysrc.f
cfs get myinfile:mycfsdir/myinfile
fc -02 -o mya.out mysrc.f
mya.out
cfs save myoutfile:mycfsdir/myoutfile
rm *

The <PRE> tag can be used with an optional WIDTH attribute that specifies the maximum
number of characters for a line. WIDTH also signals your browser to choose an appropriate font
and indentation for the text.

Hyperlinks can be used within <PRE> sections. You should avoid using other HTML tags within
<PRE> sections, however.

Note that because <, >, and & have special meanings in HTML, you must use their escape
sequences (&lt;, &gt;, and &amp;, respectively) to enter these characters. See the section Escape
Sequences for more information.
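
The points above about WIDTH, hyperlinks, and escape sequences can be combined in one listing. The sketch below is only an illustration: the file name notes.html and the program text are hypothetical.

```html
<PRE WIDTH=40>
/* max.c: see <A HREF="notes.html">notes.html</A> */
#include &lt;stdio.h&gt;

if (a &lt; b &amp;&amp; b &gt; c)
    puts("b is the largest");
</PRE>
```

A browser would show the angle brackets and ampersands as ordinary characters (for example, #include <stdio.h>), while the anchor remains a working link inside the preformatted text.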

Extended Quotations

Use the <BLOCKQUOTE> tag to include lengthy quotations in a separate block on the screen.
Most browsers change the margins for the quotation to separate it from surrounding text.

In the example:

<P>Omit needless words.</P>

<BLOCKQUOTE>
<P>Vigorous writing is concise. A sentence should
contain no unnecessary words, a paragraph no unnecessary
sentences, for the same reason that a drawing should have
no unnecessary lines and a machine no unnecessary parts.
</P>
<P>--William Strunk, Jr., 1918 </P>
</BLOCKQUOTE>

the result is:

Omit needless words.

    Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph
    no unnecessary sentences, for the same reason that a drawing should have no unnecessary
    lines and a machine no unnecessary parts.

    --William Strunk, Jr., 1918

Forced Line Breaks/Postal Addresses

The <BR> tag forces a line break with no extra (white) space between lines. Using <P> elements
for short lines of text such as postal addresses results in unwanted additional white space. For
example, with <BR>:

National Center for Supercomputing Applications<BR>
605 East Springfield Avenue<BR>
Champaign, Illinois 61820-5518<BR>

The output is:

National Center for Supercomputing Applications
605 East Springfield Avenue
Champaign, Illinois 61820-5518

Horizontal Rules

The <HR> tag produces a horizontal line the width of the browser window. A horizontal rule is
useful to separate major sections of your document.

You can vary a rule's size (thickness) and width (the percentage of the window covered by the
rule). Experiment with the settings until you are satisfied with the presentation. For example:

<HR SIZE=4 WIDTH="50%">

displays as a rule four pixels thick that spans half the width of the window.

Character Formatting

HTML has two types of styles for individual words or sentences: logical and physical. Logical
styles tag text according to its meaning, while physical styles indicate the specific appearance of a
section. For example, in the preceding sentence, the words "logical styles" were tagged as
"emphasis." The same effect (formatting those words in italics) could have been achieved via a
different tag that tells your browser to "put these words in italics."

Logical Versus Physical Styles

If physical and logical styles produce the same result on the screen, why are there both?

In the ideal SGML universe, content is divorced from presentation. Thus SGML tags a level-one
heading as a level-one heading, but does not specify that the level-one heading should be, for
instance, 24-point bold Times centered. The advantage of this approach (it's similar in concept to
style sheets in many word processors) is that if you decide to change level-one headings to be 20-
point left-justified Helvetica, all you have to do is change the definition of the level-one heading
in your Web browser. Indeed, many browsers today let you define how you want the various
HTML tags rendered on-screen using what are called cascading style sheets, or CSS. CSS is more
advanced than HTML, though, and will not be covered in this Primer. (You can learn more about
CSS at the World Wide Web Consortium CSS site.)

Another advantage of logical tags is that they help enforce consistency in your documents. It's
easier to tag something as <H1> than to remember that level-one headings are 24-point bold
Times centered or whatever. For example, consider the <STRONG> tag. Most browsers render it
in bold text. However, it is possible that a reader would prefer that these sections be displayed in
red instead. (This is possible using a local cascading style sheet on the reader's own computer.)
Logical styles offer this flexibility.

Of course, if you want something to be displayed in italics (for example) and do not want a
browser's setting to display it differently, you should use physical styles. Physical styles,
therefore, offer consistency in that something you tag a certain way will always be displayed that
way for readers of your document.

Try to be consistent about which type of style you use. If you tag with physical styles, do so
throughout a document. If you use logical styles, stick with them within a document. Keep in
mind that future releases of HTML might not support certain logical styles, which could mean
that browsers will not display your logical-style coding. (For example, the <DFN> tag -- short for
"definition", and typically displayed in italics -- is not widely supported and will be ignored if the
reader's browser does not understand it.)

Logical Styles

<DFN>
    for a word being defined. Typically displayed in italics. (NCSA Mosaic is a World Wide
    Web browser.)
<EM>
    for emphasis. Typically displayed in italics. (Consultants cannot reset your password
    unless you call the help line.)
<CITE>
    for titles of books, films, etc. Typically displayed in italics. (A Beginner's Guide to
    HTML)
<CODE>
    for computer code. Displayed in a fixed-width font. (The <stdio.h> header file)
<KBD>
    for user keyboard entry. Typically displayed in plain fixed-width font. (Enter passwd to
    change your password.)
<SAMP>
    for a sequence of literal characters. Displayed in a fixed-width font. (Segmentation fault:
    Core dumped.)
<STRONG>
    for strong emphasis. Typically displayed in bold. (NOTE: Always check your links.)
<VAR>
    for a variable, where you will replace the variable with specific information. Typically
    displayed in italics. (rm filename deletes the file.)

Physical Styles

<B>
    bold text
<I>
    italic text
<TT>
    typewriter text, i.e., a fixed-width font.
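
To see how these tags look in source form, here is a short sketch that reuses the sample sentences from the list of logical styles. The surrounding wording is only illustrative:

```html
<P>Enter <KBD>passwd</KBD> to change your password.
<STRONG>NOTE:</STRONG> Consultants <EM>cannot</EM> reset your
password unless you call the help line.</P>

<P>The command <CODE>rm <VAR>filename</VAR></CODE> deletes the file.
For details, see <CITE>A Beginner's Guide to HTML</CITE>.</P>

<!-- The physical-style equivalents name the appearance directly -->
<P><B>bold</B>, <I>italic</I>, and <TT>typewriter</TT> text</P>
```

In most browsers the <EM> and <I> text would look the same, as would the <STRONG> and <B> text; the difference lies in what the markup says about the words, not in how they are usually drawn.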

Escape Sequences (a.k.a. Character Entities)

Character entities have two functions:

• escaping special characters
• displaying other characters not available in the plain ASCII character set (primarily
characters with diacritical marks)

Three ASCII characters--the left angle bracket (<), the right angle bracket (>), and the ampersand
(&)--have special meanings in HTML and therefore cannot be used "as is" in text. (The angle
brackets are used to indicate the beginning and end of HTML tags, and the ampersand is used to
indicate the beginning of an escape sequence.) Double quote marks may be used as-is but a
character entity may also be used (&quot;).

To use one of the three characters in an HTML document, you must enter its escape sequence
instead:

&lt;
    the escape sequence for <
&gt;
    the escape sequence for >
&amp;
    the escape sequence for &

Additional escape sequences support accented characters, such as:

&ouml;
    a lowercase o with an umlaut: ö
&ntilde;
    a lowercase n with a tilde: ñ
&Egrave;
    an uppercase E with a grave accent: È

You can substitute other letters for the o, n, and E shown above. Visit the World Wide Web
Consortium for a complete list of special characters.

NOTE: Unlike HTML tag and attribute names, the escape sequences are case sensitive. You
cannot, for instance, use &LT; instead of &lt;.
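
As a small illustration of both uses, the sketch below escapes the three special characters in one paragraph and spells accented letters with entities in another. The sentences themselves are made up:

```html
<P>In C, the test <CODE>a &lt; b &amp;&amp; b &gt; 0</CODE> is true
when b is positive and larger than a.</P>

<P>Se&ntilde;or M&uuml;ller ordered cr&egrave;me br&ucirc;l&eacute;e.</P>
```

A browser would display the first paragraph with the literal text a < b && b > 0, and the second as "Señor Müller ordered crème brûlée."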

Linking

The chief power of HTML comes from its ability to link text and/or an image to another
document or section of a document. A browser highlights the identified text or image with color
and/or underlines to indicate that it is a hypertext link (often shortened to hyperlink or just link).

HTML's single hypertext-related tag is <A>, which stands for anchor. To include an anchor in
your document:

1. start the anchor with <A (include a space after the A)
2. specify the document you're linking to by entering the parameter HREF="filename"
followed by a closing right angle bracket (>)
3. enter the text that will serve as the hypertext link in the current document
4. enter the ending anchor tag: </A> (no space is needed before the end anchor tag)

Here is a sample hypertext reference in a file called US.html:

<A HREF="MaineStats.html">Maine</A>

This entry makes the word Maine the hyperlink to the document MaineStats.html, which is in the
same directory as the first document.

Relative Pathnames Versus Absolute Pathnames

You can link to documents in other directories by specifying the relative path from the current
document to the linked document. For example, a link to a file NYStats.html located in the
subdirectory AtlanticStates would be:

<A HREF="AtlanticStates/NYStats.html">New York</A>


These are called relative links because you are specifying the path to the linked file relative to the
location of the current file. You can also use the absolute pathname (the complete URL) of the
file, but relative links are more efficient in accessing a server. They also have the advantage of
making your documents more "portable" -- for instance, you can create several web pages in a
single folder on your local computer, using relative links to hyperlink one page to another, and
then upload the entire folder of web pages to your web server. The pages on the server will then
link to other pages on the server, and the copies on your hard drive will still point to the other
pages stored there.

It is important to point out that UNIX is a case-sensitive operating system where filenames are
concerned, while DOS and the MacOS are not. For instance, on a Macintosh,
"DOCUMENT.HTML", "Document.HTML", and "document.html" are all the same name. If you
make a relative hyperlink to "DOCUMENT.HTML", and the file is actually named
"document.html", the link will still be valid. But if you upload all your pages to a UNIX web
server, the link will no longer work. Be sure to check your filenames before uploading.

Pathnames use the standard UNIX syntax. The UNIX syntax for the parent directory (the
directory that contains the current directory) is "..". (For more information consult a beginning
UNIX reference text such as Learning the UNIX Operating System from O'Reilly and Associates,
Inc.)

If you were in the NYStats.html file and were referring to the original document US.html, your
link would look like this:

<A HREF="../US.html">United States</A>


In general, you should use relative links whenever possible because:

1. it's easier to move a group of documents to another location (because the relative path
names will still be valid)
2. it's more efficient connecting to the server
3. there is less to type

However, use absolute pathnames when linking to documents that are not directly related. For
example, consider a group of documents that comprise a user manual. Links within this group
should be relative links. Links to other documents (perhaps a reference to related software)
should use absolute pathnames instead. This way if you move the user manual to a different
directory, none of the links would have to be updated.
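
To make the distinction concrete, here is a sketch of one page from such a user manual. The file names (index.html, install.html) and the host www.example.org are hypothetical:

```html
<!-- chapter2.html, stored in the same directory as the other manual pages -->

<!-- Relative links: these keep working if the whole manual is moved -->
<P>Return to the <A HREF="index.html">table of contents</A> or go back
to <A HREF="install.html">Chapter 1: Installation</A>.</P>

<!-- Absolute link: the related software lives outside the manual -->
<P>This manual assumes you have already installed the
<A HREF="http://www.example.org/software/viewer.html">viewer
software</A>.</P>
```

If the manual's directory is later renamed or copied to another server, only the absolute link needs to stay correct on its own; the two relative links move with the files.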

URLs

The World Wide Web uses Uniform Resource Locators (URLs) to specify the location of files on
other servers. A URL includes the type of resource being accessed (e.g., Web, gopher, FTP), the
address of the server, and the location of the file. The syntax is:

scheme://host.domain[:port]/path/filename

where scheme is one of

file
    a file on your local system
ftp
    a file on an anonymous FTP server
http
    a file on a World Wide Web server
gopher
    a file on a Gopher server
wais
    a file on a WAIS server
news
    a Usenet newsgroup
telnet
    a connection to a Telnet-based service

The port number can generally be omitted. (That means unless someone tells you otherwise,
leave it out.)
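
For comparison, the sketch below shows the same kind of link with and without a port number, plus a link that uses a different scheme. The hosts www.example.org and ftp.example.org, the port 8080, and the paths are all hypothetical:

```html
<!-- Usual case: no port number after the host name -->
<A HREF="http://www.example.org/docs/index.html">Documentation</A>

<!-- A server you have been told runs on port 8080 -->
<A HREF="http://www.example.org:8080/docs/index.html">Documentation
(test server)</A>

<!-- A file on an anonymous FTP server -->
<A HREF="ftp://ftp.example.org/pub/README">README file</A>
```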

For example, to include a link to this primer in your document, enter:

<A HREF="http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimer.html">
NCSA's Beginner's Guide to HTML</A>

This entry makes the text NCSA's Beginner's Guide to HTML a hyperlink to this document.

There is also a mailto scheme, used to hyperlink email addresses, but this scheme is unique in that
it uses only a colon (:) instead of :// between the scheme and the address. You can read more
about mailto below.

For more information on URLs, refer to:

• WWW Names and Addresses, URIs, URLs, URNs
• A Beginner's Guide to URLs

Links to Specific Sections

Anchors can also be used to move a reader to a particular section in a document (either the same
or a different document) rather than to the top, which is the default. This type of an anchor is
commonly called a named anchor because to create the links, you insert HTML names within the
document.

This guide is a good example of using named anchors in one document. The guide is constructed
as one document to make printing easier. But as one (long) document, it can be time-consuming
to move through when all you really want to know about is one bit of information about HTML.
Internal hyperlinks are used to create a "table of contents" at the top of this document. These
hyperlinks move you from one location in the document to another location in the same
document. (Go to the top of this document and then click on the Links to Specific Sections
hyperlink in the table of contents. You will wind up back here.)

You can also link to a specific section in another document. That information is presented first
because understanding that helps you understand linking within one document.

Links Between Sections of Different Documents

Suppose you want to set a link from document A (documentA.html) to a specific section in
another document (MaineStats.html).

Enter the HTML coding for a link to a named anchor:

documentA.html:

In addition to the many state parks, Maine is also home to
<a href="MaineStats.html#ANP">Acadia National Park</a>.

Think of the characters after the hash (#) mark as a tab within the MaineStats.html file. This tab
tells your browser what should be displayed at the top of the window when the link is activated.
In other words, the first line in your browser window should be the Acadia National Park
heading.

Next, create the named anchor (in this example "ANP") in MaineStats.html:

<H2><A NAME="ANP">Acadia National Park</a></H2>

With both of these elements in place, you can bring a reader directly to the Acadia reference in
MaineStats.html.

NOTE: You cannot make links to specific sections within a different document unless either you
have write permission to the coded source of that document or that document already contains in-
document named anchors. For example, you could include named anchors to this primer in a
document you are writing because there are named anchors in this guide (use View Source in
your browser to see the coding). But if this document did not have named anchors, you could not
make a link to a specific section because you cannot edit the original file on NCSA's server.

Links to Specific Sections within the Current Document

The technique is the same except the filename is omitted.

For example, to link to the ANP anchor from within MaineStats, enter:

...More information about
<A HREF="#ANP">Acadia National Park</a>
is available elsewhere in this document.

Be sure to include the <A NAME=> tag at the place in your document where you want the link to
jump to (<A NAME="ANP">Acadia National Park</a>).

Named anchors are particularly useful when you think readers will print a document in its entirety
or when you have a lot of short information you want to place online in one file.
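
A "table of contents" built from named anchors, like the one described for this guide, might be sketched as follows. The anchor names top and coast, the second section, and the placeholder text are hypothetical; only the ANP anchor comes from the examples above:

```html
<H1><A NAME="top">Maine Statistics</A></H1>
<UL>
<LI><A HREF="#ANP">Acadia National Park</A>
<LI><A HREF="#coast">The Coastline</A>
</UL>

<H2><A NAME="ANP">Acadia National Park</A></H2>
<P>... text about the park ...</P>
<P><A HREF="#top">Back to top</A></P>

<H2><A NAME="coast">The Coastline</A></H2>
<P>... text about the coast ...</P>
<P><A HREF="#top">Back to top</A></P>
```

Each HREF that begins with # must match a NAME somewhere in the same file, including the capitalization of the name.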

Mailto

You can make it easy for a reader to send electronic mail to a specific person or mail alias by
using the mailto scheme in a hyperlink's HREF attribute. The format is:

<A HREF="mailto:emailinfo@host">Name</a>

For example, enter:

<A HREF="mailto:pubs@ncsa.uiuc.edu">
NCSA Publications Group</a>

to create a link that opens a mail window already addressed to the NCSA Publications Group
alias. (You, of course, will enter another mail address!)

Inline Images

Most Web browsers can display inline images (that is, images next to text) that are in X Bitmap
(XBM), GIF, or JPEG format. Other image formats are also being incorporated into Web
browsers [e.g., the Portable Network Graphics (PNG) format]. Each image takes additional time to
download and slows down the initial display of a document. Carefully select your images and the
number of images in a document.

To include an inline image, enter:

<IMG SRC=ImageName>

where ImageName is the URL of the image file.

The syntax for <IMG SRC> URLs is identical to that used in an anchor HREF. If the image file is
a GIF file, then the filename part of ImageName must end with .gif. Filenames of X Bitmap
images must end with .xbm; JPEG image files must end with .jpg or .jpeg; and Portable Network
Graphics files must end with .png.

Image Size Attributes

You should include two other attributes on <IMG> tags to tell your browser the size of the
images it is downloading with the text. The HEIGHT and WIDTH attributes let your browser set
aside the appropriate space (in pixels) for the images as it downloads the rest of the file. (You can
get the pixel size from your image-processing software, such as Adobe Photoshop. Some
browsers will also display the dimensions of an image file in the title bar if the image is viewed
by itself without an enclosing HTML document.)

For example, to include a self portrait image in a file along with the portrait's dimensions, enter:

<IMG SRC=SelfPortrait.gif HEIGHT=100 WIDTH=65>

NOTE: Some browsers use the HEIGHT and WIDTH attributes to stretch or shrink an image to
fit into the allotted space when the image does not exactly match the attribute numbers. Not all
browser developers think stretching/shrinking is a good idea, so don't plan on your readers having
access to this feature. Check your dimensions and use the correct ones.

Aligning Images

You have some flexibility when displaying images. You can have images separated from text and
aligned to the left or right or centered. Or you can have an image aligned with text. Try several
possibilities to see how your information looks best.

Aligning Text with an Image

By default the bottom of an image is aligned with the text that follows it. You can align images
to the top or middle of a line of text using the ALIGN attribute values TOP and MIDDLE.

This text is aligned with the top of the image (<IMG SRC="BarHotlist.gif"
ALT="[HOTLIST]" ALIGN=TOP>). Notice how the browser aligns only one line and then
jumps to the bottom of the image for the rest of the text.

And this text is centered on the image (<IMG SRC="BarHotlist.gif"
ALT="[HOTLIST]" ALIGN=MIDDLE>). Again, only one line of text is centered; the rest is
below the image.

Images without Text


To display an image without any associated text (e.g., your organization's logo), make it a
separate paragraph. Use the paragraph ALIGN= attribute to center the image or adjust it to the
right side of the window as shown below:

<p ALIGN=CENTER>
<IMG SRC="BarHotlist.gif" ALT="[HOTLIST]">
</p>

which results in a centered image; the paragraph that follows starts below it and is left
justified.

Alternate Text for Images

Some World Wide Web browsers -- primarily the text-only browsers such as Lynx -- cannot
display images. Some users turn off image loading even if their software can display images
(especially if they are using a modem or have a slow connection). HTML provides a mechanism
to tell readers what they are missing on your pages if they can't load images.

The ALT attribute lets you specify text to be displayed instead of an image. For example:

<IMG SRC="UpArrow.gif" ALT="Up">

where UpArrow.gif is the picture of an upward pointing arrow. With graphics-capable viewers
that have image-loading turned on, you see the up arrow graphic. With a text-only browser or if
image-loading is turned off, the word Up is shown in your window in place of the image.

You should try to include alternate text for each image you use in your document, which is a
courtesy for your readers -- or, for users who might be visually impaired, a necessity.
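
Putting the size and alternate-text advice together, an image tag might look like the sketch below. The pixel dimensions and the file name logo.gif are made up for illustration; use the real dimensions of your own files:

```html
<!-- A small navigation arrow: ALT describes what the image does -->
<IMG SRC="UpArrow.gif" ALT="Up" HEIGHT=32 WIDTH=32>

<!-- A logo: ALT stands in for the picture when it cannot be shown -->
<IMG SRC="logo.gif" ALT="[Organization logo]" HEIGHT=60 WIDTH=240>
```

With HEIGHT and WIDTH present the browser can lay out the text before the image arrives, and with ALT present a text-only browser still has something meaningful to display.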

Images as Hyperlinks

Inline images can be used as hyperlinks just like plain text. The following HTML code:

<A HREF="hotlist.html"><IMG SRC="BarHotlist.gif" ALT="[HOTLIST]"></A>

produces an image surrounded by a blue border. The border indicates that the image is a
clickable hyperlink. You may not always want this border to be displayed, though. In this case
you can use the BORDER attribute of the IMG tag to make the image appear as normal. Adding
the BORDER attribute and setting it to zero:

<A HREF="hotlist.html"><IMG SRC="BarHotlist.gif" BORDER=0
ALT="[HOTLIST]"></A>

produces the same linked image with no border around it.

The BORDER attribute can also be set to non-zero values, whether or not the image is used as a
hyperlink. In this case, the border will appear using the default text color for the web page. For
instance, if you wanted to give your image a plain black border to help it stand out on the page,
you might try this:

<IMG SRC="BarHotlist.gif" BORDER=6 ALT="[HOTLIST]">

and get the image surrounded by a border six pixels wide.

Background Graphics

Newer versions of Web browsers can load an image and use it as a background when displaying a
page. Some people like background images and some don't. In general, if you want to include a
background, make sure your text can be read easily when displayed on top of the image.

Background images can be a texture (linen finished paper, for example) or an image of an object
(a logo possibly). You create the background image as you do any image.

However you only have to create a small piece of the image. Using a feature called tiling, a
browser takes the image and repeats it across and down to fill your browser window. In sum you
generate one image, and the browser replicates it enough times to fill your window. This action is
automatic when you use the background tag shown below.

The tag to include a background image is included in the <BODY> statement as an attribute:

<BODY BACKGROUND="filename.gif">

Background Color

By default browsers display text in black on a gray background. However, you can change both
elements if you want. Some HTML authors select a background color and coordinate it with a
change in the color of the text.

Always preview changes like this to make sure your pages are readable. (For example, many
people find red text on a black background difficult to read!) In general, try to avoid using high-
contrast images or images that use the color of your text anywhere within the graphic.

You change the color of text, links, visited links, and active links (links that are currently being
clicked on) using further attributes of the <BODY> tag. For example:

<BODY BGCOLOR="#000000" TEXT="#FFFFFF" LINK="#9690CC">


This creates a window with a black background (BGCOLOR), white text (TEXT), and silvery
hyperlinks (LINK).

The six-digit number and letter combinations represent colors by giving their RGB (red, green,
blue) value. The six digits are actually three two-digit numbers in sequence, representing the
amount of red, green, or blue as a hexadecimal value in the range 00-FF. For example, 000000 is
black (no color at all), FF0000 is bright red, 0000FF is bright blue, and FFFFFF is white (fully
saturated with all three colors).

These number and letter combinations are generally rather cryptic. Fortunately an online resource
is available to help you track down the combinations that map to specific colors and there is
software available for you to do this on your workstation:

• VisiBone Online Color Lab for the Webmaster's Palette

For some basic colors -- typically those in the standard sixteen-color Windows 3.1 palette -- you
can also use the name of the color instead of the corresponding RGB value. For example, "black",
"red", "blue", and "cyan" are all valid for use in place of RGB values. However, not all browsers
understand all color names, whereas any browser that can display colors understands RGB
values, so use RGB values whenever possible.
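
The earlier example sets only the LINK color. As a sketch of a complete color scheme, the tag below also uses the VLINK and ALINK attributes for visited and active links; the particular colors are just one readable combination, not a recommendation from this guide:

```html
<!-- white background, black text, blue links,
     purple visited links, red active links -->
<BODY BGCOLOR="#FFFFFF" TEXT="#000000"
      LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">
```

Reading each value as three two-digit pairs makes the colors easier to check: #800080 is half-strength red (80), no green (00), and half-strength blue (80), which gives purple.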

External Images, Sounds, and Animations

You may want to have an image open as a separate document when a user activates a link on
either a word or a smaller, inline version of the image included in your document. This is called
an external image, and it is useful if you do not wish to slow down the loading of the main
document with large inline images.

To include a reference to an external image, enter:

<A HREF="MyImage.gif">link anchor</A>


You can also use a smaller image as a link to a larger image. Enter:

<A HREF="LargerImage.gif"><IMG SRC="SmallImage.gif"></A>

The reader sees the SmallImage.gif image and clicks on it to open the LargerImage.gif file.

Use the same syntax for links to external animations and sounds. The only difference is the file
extension of the linked file. For example,

<A HREF="AdamsRib.mov">link anchor</A>

specifies a link to a QuickTime movie. Some common file types and their extensions are:

File type            Extension
plain text           .txt
HTML document        .html
GIF image            .gif
TIFF image           .tiff
X Bitmap image       .xbm
JPEG image           .jpg or .jpeg
PostScript file      .ps
AIFF sound file      .aiff
AU sound file        .au
WAV sound file       .wav
QuickTime movie      .mov
MPEG movie           .mpeg or .mpg

Keep in mind your intended audience and their access to software. Most UNIX workstations, for
instance, cannot view QuickTime movies.

Tables

Before HTML tags for tables were finalized, authors had to carefully format their tabular
information within <PRE> tags, counting spaces and previewing their output. Tables are very
useful for presentation of tabular information as well as a boon to creative HTML authors who
use the table tags to present their regular Web pages. (Check out the NCSA home page for an
excellent example of using tables to control page layout.)

Think of your tabular information in light of the coding explained below. A table has heads where
you explain what the columns/rows include, rows for information, cells for each item. In the
following table, the first column contains the header information, each row explains an HTML
table tag, and each cell contains a paired tag or an explanation of the tag's function.

Table Elements

Element                     Description
<TABLE> ... </TABLE>        defines a table in HTML. If the BORDER attribute is present,
                            your browser displays the table with a border.
<CAPTION> ... </CAPTION>    defines the caption for the title of the table. The default
                            position of the title is centered at the top of the table. The
                            attribute ALIGN=BOTTOM can be used to position the caption
                            below the table.
                            NOTE: Any kind of markup tag can be used in the caption.
<TR> ... </TR>              specifies a table row within a table. You may define default
                            attributes for the entire row: ALIGN (LEFT, CENTER, RIGHT)
                            and/or VALIGN (TOP, MIDDLE, BOTTOM). See Table Attributes at
                            the end of this table for more information.
<TH> ... </TH>              defines a table header cell. By default the text in this cell is
                            bold and centered. Table header cells may contain other
                            attributes to determine the characteristics of the cell and/or
                            its contents. See Table Attributes at the end of this table for
                            more information.
<TD> ... </TD>              defines a table data cell. By default the text in this cell is
                            aligned left and centered vertically. Table data cells may
                            contain other attributes to determine the characteristics of
                            the cell and/or its contents. See Table Attributes at the end
                            of this table for more information.

Table Attributes
NOTE: Attributes defined within <TH> ... </TH> or <TD> ... </TD> cells
override the default alignment set in a <TR> ... </TR>.

Attribute                       Description
ALIGN (LEFT, CENTER, RIGHT)     Horizontal alignment of a cell.
VALIGN (TOP, MIDDLE, BOTTOM)    Vertical alignment of a cell.
COLSPAN=n                       The number (n) of columns a cell spans.
ROWSPAN=n                       The number (n) of rows a cell spans.
NOWRAP                          Turn off word wrapping within a cell.

General Table Format

The general format of a table looks like this:

<TABLE>                                     <!-- start of table definition -->
<CAPTION> caption contents </CAPTION>       <!-- caption definition -->

<TR>                                        <!-- start of header row definition -->
  <TH> first header cell contents </TH>
  <TH> last header cell contents </TH>
</TR>                                       <!-- end of header row definition -->

<TR>                                        <!-- start of first row definition -->
  <TD> first row, first cell contents </TD>
  <TD> first row, last cell contents </TD>
</TR>                                       <!-- end of first row definition -->

<TR>                                        <!-- start of last row definition -->
  <TD> last row, first cell contents </TD>
  <TD> last row, last cell contents </TD>
</TR>                                       <!-- end of last row definition -->

</TABLE>                                    <!-- end of table definition -->

You can cut-and-paste the above code into your own HTML documents, adding new rows or cells
as necessary. The above example looks like this when rendered in a browser.

The <TABLE> and </TABLE> tags must surround the entire table definition. The first item
inside the table is the CAPTION, which is optional. Then you can have any number of rows
defined by the <TR> and </TR> tags. Within a row you can have any number of cells defined by
the <TD>...</TD> or <TH>...</TH> tags. Each row of a table is, essentially, formatted
independently of the rows above and below it. This lets you easily display tables like the one
above with a single cell, such as Table Attributes, spanning columns of the table.
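
To show the attributes from the table in use, here is a small filled-in sketch. The caption, cell contents, and quantities are invented for illustration:

```html
<TABLE BORDER>
<CAPTION ALIGN=BOTTOM>A sample fruit order</CAPTION>
<TR>
  <TH>Fruit</TH>
  <TH>Quantity</TH>
</TR>
<TR>
  <TD>oranges</TD>
  <TD ALIGN=RIGHT>12</TD>
</TR>
<TR>
  <TD>peaches</TD>
  <TD ALIGN=RIGHT>6</TD>
</TR>
<TR>
  <!-- One header cell spanning both columns -->
  <TH COLSPAN=2>Delivery notes</TH>
</TR>
<TR>
  <TD COLSPAN=2 NOWRAP>Leave at the front desk.</TD>
</TR>
</TABLE>
```

The two rows that use COLSPAN=2 each contain a single cell as wide as the whole table, which is the same arrangement as the Table Attributes cell described above.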

Tables for Nontabular Information

Some HTML authors use tables to present nontabular information. For example, because links
can be included in table cells, some authors use a table with no borders to create "one" image
from separate images. Browsers that can display tables properly show the various images
seamlessly, making the created image seem like an image map (one image with hyperlinked
quadrants).
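
A rough sketch of that technique follows. The image and page names are hypothetical, and the CELLSPACING and CELLPADDING attributes, which are not covered in the table above, are used here on the assumption that your readers' browsers support them; they remove the gaps a browser would otherwise leave between cells:

```html
<TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>
<TR>
  <TD><A HREF="northwest.html"><IMG SRC="map-nw.gif" ALT="[Northwest]" BORDER=0></A></TD>
  <TD><A HREF="northeast.html"><IMG SRC="map-ne.gif" ALT="[Northeast]" BORDER=0></A></TD>
</TR>
<TR>
  <TD><A HREF="southwest.html"><IMG SRC="map-sw.gif" ALT="[Southwest]" BORDER=0></A></TD>
  <TD><A HREF="southeast.html"><IMG SRC="map-se.gif" ALT="[Southeast]" BORDER=0></A></TD>
</TR>
</TABLE>
```

Each quarter of the picture is its own image and its own link, so the four pieces behave like the quadrants of a single image map.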

Using table borders with images can create an impressive display as well. Experiment and see
what you like.

Fill-out Forms

Web forms let a reader return information to a Web server for some action. For example, suppose
you collect names and email addresses so you can email some information to people who request
it. For each person who enters his or her name and address, you need some information to be sent
and the respondent's particulars added to a data base.

This processing of incoming data is usually handled by a script or program written in Perl or
another language that manipulates text, files, and information. If you cannot write a program or
script for your incoming information, you need to find someone who can do this for you.

The forms themselves are not hard to code. They follow the same constructs as other HTML tags.
What could be difficult is the program or script that takes the information submitted in a form and
processes it. Because of the need for specialized scripts to handle the incoming form information,
fill-out forms are not discussed in this primer. Check the Additional Online Reference section for
more information.

Troubleshooting

Avoid Overlapping Tags

Consider this example of HTML:

<B>This is an example of <I>overlapping</B>
HTML tags.</I>

The word overlapping is contained within both the <B> and <I> tags. A browser might be
confused by this coding and might not display it the way you intend. The only way to know is to
check each popular browser (which is time-consuming and impractical).

In general, avoid overlapping tags. Look at your tags and try pairing them up. Tags (with the
obvious exceptions of elements whose end tags may be omitted, such as paragraphs) should be
paired without an unpaired tag in between. Look again at the example above. You cannot pair
the bold tags without another tag in the middle (the opening <I> tag). Try matching your coding
up like this to see if you have any problem areas that should be fixed before you release your files
to a server.
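
One possible repair of the overlapping example above is sketched here. It closes the italic tag before the bold tag and then opens a new italic element for the remaining words, so every pair of tags sits entirely inside or entirely outside every other pair.

```html
<!-- Each start tag is closed before its enclosing element ends -->
<B>This is an example of <I>overlapping</I></B><I> HTML tags.</I>
```

The intended appearance is unchanged: the first words are bold, the word overlapping is bold and italic, and the last two words are italic.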

Embed Only Anchors and Character Tags

HTML allows you to embed links within other HTML tags:

<H1><A HREF="Destination.html">My heading</A></H1>

Do not embed HTML tags within an anchor:

<A HREF="Destination.html">
<H1>My heading</H1>
</A>

Although most browsers currently handle this second example, the official HTML specifications
do not support this construct and your file will probably not work with future browsers.
Remember that browsers can be forgiving when displaying improperly coded files. But that
forgiveness may not last to the next version of the software! When in doubt, code your files
according to the HTML specifications (see For More Information below).

Character tags modify the appearance of the text within other elements:

<UL>
<LI><B>A bold list item</B>
<LI><I>An italic list item</I>
</UL>

Avoid embedding other types of HTML element tags. For example, you might be tempted to
embed a heading within a list in order to make the font size larger:

<UL>
<LI><H1>A large heading</H1>
<LI><H2>Something slightly smaller</H2>
</UL>

Although some browsers handle this quite nicely, formatting of such coding is unpredictable
(because it is undefined). For compatibility with all browsers, avoid these kinds of constructs.
(The Netscape <FONT> tag, which lets you specify how large individual characters will be
displayed in your window, is not currently part of the official HTML specifications.)

What's the difference between embedding a <B> within a <LI> tag as opposed to embedding an
<H1> within a <LI>? Within HTML the semantic meaning of <H1> is that it's the main heading
of a document and that it should be followed by the content of the document. Therefore it doesn't
make sense to find an <H1> within a list.
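
One way to respect that meaning, sketched here with made-up text, is to place the heading before the list and use only character tags inside the list items:

```html
<!-- The heading introduces the list instead of sitting inside it -->
<H2>Project files</H2>
<UL>
<LI><B>The most important item</B>
<LI>An ordinary item
</UL>
```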

Character formatting tags also are generally not additive. For example, you might expect that:

<B><I>some text</I></B>

would produce bold-italic text. On some browsers it does; other browsers interpret only the
innermost tag.

Do the Final Steps

Validate Your Code

When you put a document on a Web server, be sure to check the formatting and each link
(including named anchors). Ideally you will have someone else read through and comment on
your file(s) before you consider a document finished.

You can run your coded files through one of several on-line HTML validation services that will
tell you if your code conforms to accepted HTML. If you are not sure your coding conforms to
HTML specifications, this can be a useful teaching tool. Such a service typically lets you select
the level of conformance you want for your files (i.e., strict, level 2, level 3). If you want to use
some codes that are not officially part of the HTML specifications, this latitude is helpful.

Dummy Images

When an <IMG SRC> tag points to an image that does not exist, a dummy image is substituted
by your browser software. When this happens during your final review of your files, make sure
that the referenced image does in fact exist, that the hyperlink has the correct information in the
URL, and that the file permission is set appropriately (world-readable). Then check online again!
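
As a sketch of what to check, the tag below uses a hypothetical file name and directory:

```html
<!-- images/logo.gif is an assumed path, relative to the HTML file -->
<IMG SRC="images/logo.gif" ALT="Company logo">
```

Here the browser looks for logo.gif in an images directory alongside the HTML file. A misspelled name, or on many servers a difference between uppercase and lowercase letters, is enough to make the image fail to load. The ALT text gives readers something meaningful when the image cannot be displayed.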

Update Your Files

If the contents of a file are static (such as a biography of George Washington), updating is
probably not needed. But for documents that are time-sensitive or cover a field that changes
frequently, remember to update your documents!

Updating is particularly important when the file contains information such as a weekly schedule
or a deadline for a program funding announcement. Remove out-of-date files or note why
something that appears dated is still on a server (e.g., the program requirements will remain the
same for the next cycle so the file is still available as an interim reference).

Browsers Differ

Web browsers display HTML elements differently. Remember that not all codes used in HTML
files are interpreted by all browsers. Any code a browser does not understand is usually
ignored, however.

You could spend a lot of time making your file "look perfect" using your current browser. If you
check that file using another browser, it will likely display (a little or a lot) differently. Hence
these words of advice: code your files using correct HTML. Leave the interpreting to the
browsers and hope for the best.

Commenting Your Files

You might want to include comments in your HTML files. Comments in HTML are like
comments in a computer program: the browser does not use the text you enter in any formatting,
and the reader does not see it on the displayed page. The comments are accessible if a reader
views the source file, however.

Comments such as the name of the person updating a file, the software and version used in
creating a file, or the date that a minor edit was made are the norm.

To include a comment, enter:

<!-- your comments here -->


You must include the exclamation mark and the hyphens as shown.
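
For instance, comments recording who last changed a file and what software produced it might look like the sketch below. The name, date, and editor shown are invented for illustration.

```html
<!-- Last updated 1 March 1997 by J. Smith (hypothetical name and date) -->
<!-- Created with ExampleEditor 2.0 (hypothetical software) -->
<H1>My heading</H1>
```

Avoid typing two hyphens in a row inside the comment text itself, since some browsers may treat them as the end of the comment.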
