Short explanation of HTML

[page updated 11 Oct 05] ... [ garrett@umn.edu ]

The world's shortest introduction to HTML

("If it's not mentioned here, don't do it!")

  - paul garrett, GPL copyright 2000

<P> "new paragraph": adds vertical space and starts a new line.
(Repeated "<P>"'s do not reliably add extra space.) (Also, there is
an end-paragraph tag, </P>, but it is optional in HTML, and browsers
don't need it.)

<BR> "linebreak": starts a new line without any vertical space between
(Repeated "<BR>"'s will probably add extra vertical space, but the
response will vary depending upon the browser. Don't do it.)

<HR NOSHADE> "horizontal rule": creates a solid line across the
page. Without the "NOSHADE" some browsers will make dotted lines,
which are visually distracting.

<CENTER>   </CENTER> "center": anything between these will be centered

<H1>   </H1> "heading 1": anything between these will be in the
largest heading letters

<H2>   </H2> "heading 2": anything between these will be in the
second largest heading letters

<H3>, <H4>, etc do not really give progressively smaller fonts on most
browsers, so just forget it.

If you want your heading centered, the best way is
<CENTER>
<H2> This is the heading </H2>
</CENTER>
since the <H2> doesn't do the centering for you

<STRONG>   </STRONG> "strong": anything between will be in boldface
<B>   </B>           "boldface": same effect as with STRONG

<EM>   </EM> "emphasis": anything between will be in italic
<I>   </I>           "italic": same effect as with EM

(Trying to combine <H2>, <B>, and/or <I> does not have predictable
effects, so just forget it.)

<HTML>   </HTML> The first should appear at the top of the page, and
the other at the bottom

<TITLE>  </TITLE> "title": this should appear at the top, just after
the <HTML> tag, and what you put between them is what will show up on
the title bar in browsers, and is also what various search engines
see.

<BODY>   </BODY> "body": the first one should come after </TITLE>, and
the second just before </HTML>. This is the main part of an html
document.
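
Putting <HTML>, <TITLE>, and <BODY> together with a few of the tags
above, a whole page might look like the following sketch. The title
and the text are made up for illustration, and the layout follows the
order described here (title right after <HTML>, then the body).

```html
<HTML>
<TITLE> My first page </TITLE>
<BODY>

<CENTER>
<H2> My first page </H2>
</CENTER>

<P> This is the first paragraph. It has some <B>boldface</B> and
some <I>italic</I> text in it.

<P> This is the second paragraph,<BR>
with a linebreak in the middle.

<HR NOSHADE>

</BODY>
</HTML>
```

Everything the viewer sees goes between <BODY> and </BODY>; the text
between <TITLE> and </TITLE> shows up only on the title bar.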

(If you want to fool around with the background color, font colors,
etc., you can do it by adding parameters inside the <BODY> tag, as in
<BODY BGCOLOR="#ba9999">
The specification of color is a "#" followed by three 2-digit (?!)
hexadecimal numbers, each from 00 to ff (0-255 in decimal), the first
specifying how much red, the second how much green, and the third how
much blue. So #ff0000 is brightest pure red, #00ff00 is brightest pure
green, etc., while #500000 is a dim red. I
won't even tell you how to mess with font and link colors because you
shouldn't do it... ;)
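
As a worked example of reading such a color, here is the value from
above split into its three parts. The arithmetic is just hexadecimal
to decimal conversion; the color is written with the leading "#" that
the HTML specification calls for.

```html
<!-- "ba9999" splits into three 2-digit hexadecimal numbers:
       ba = 11*16 + 10 = 186   (red)
       99 =  9*16 +  9 = 153   (green)
       99 =  9*16 +  9 = 153   (blue)
     Red is a bit stronger than green and blue, which are equal,
     so the background is a pale, greyish red. -->
<BODY BGCOLOR="#ba9999">
```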

<UL>                   ("unordered list")
<LI> first list item   ("list item")
<LI> second list item
...
<LI> last list item
</UL>                  ("end unordered list") 

This construction puts _bullets_ in front of things. There's not much
vertical space between, so instead you might want to do

<UL>                   ("unordered list")
<LI><P> first list item   ("list item")
<LI><P> second list item
...
<LI><P> last list item
</UL>                  ("end unordered list") 

to have more space between the items in the list. For a _numbered_
list, do

<OL>                   ("ordered list")
<LI> first list item   ("list item")
<LI> second list item
...
<LI> last list item
</OL>                  ("end ordered list")

This will give numbers instead of bullets. There is also the
"definition list" which has no bullets, no numbers, and indents the
"definition description" for each list element:

<DL>                                    ("definition list")
<DT> a term                             ("definition term")
<DD> definition of the term             ("definition description") is _indented_!
<DT> another term
<DD> the definition of the other term
...
<DT> the last term
<DD> the last definition
</DL>                                   ("end definition list")

And lists can be nested inside each other. In unordered lists, the
bullets may get replaced by hollow squares or other things in the sub
lists. Example:

<UL>
<LI> first list element
<LI> second list element
<UL>                                (begin sublist)
<LI> first element in sublist
<LI> second element in sublist
</UL>                               (end sublist)
<LI> last list element
</UL>

Somewhere here it must be pointed out that browsers do not care about
"ordinary" linebreaks or tabs, _nor_ multiple spaces. So if you
attempt to format an HTML page by tabs and spaces the browsers will
simply ignore you. 

Much of the time this is actually ok, since the _content_ may be more
important than its shape on the page. But, for those occasions when
format really, really matters, you can enclose the whole thing in
<PRE>   </PRE> "preformatted": everything in between will be presented
in the HTML page exactly as you typed it in. (A minor negative side
effect is that a less attractive font will be used also.)

TABLEs are another choice for formatting, with their own
disadvantages. The simplest way to set up a table is
<TABLE BORDER=1>
<TR><TD>1st entry 1st row <TD> 2nd entry 1st row <TD> 3rd entry 1st row </TR>
<TR><TD>1st entry 2nd row <TD> 2nd entry 2nd row <TD> 3rd entry 2nd row </TR>
<TR><TD>1st entry 3rd row <TD> 2nd entry 3rd row <TD> 3rd entry 3rd row </TR>
</TABLE>

The "<TR>" indicates the beginning of a "table row".
Each "<TD>" indicates a new "table data" entry.
The "BORDER=1" makes lines between rows and columns. If you leave this
off, which you might in some cases, then the entries will be formatted
in the same way but with no lines between. It seems that generally it's
better to _have_ the lines.

Again, browsers do not care how you use "ordinary" linebreaks, so if
you have long or many entries in a "table row" it is ok and even
_preferable_ (for readability) to let them wrap around and put extra
lines between, etc.:

<TABLE BORDER=1>

<TR><TD>long 1st entry 1st row <TD> long 2nd entry 1st row 
<TD> long 3rd entry 1st row </TR>

<TR><TD>long 1st entry 2nd row <TD> long 2nd entry 2nd row 
<TD> long 3rd entry 2nd row </TR>

<TR><TD>even longer 1st entry 3rd row <TD> and longer 2nd entry 3rd
row <TD> 3rd entry 3rd row </TR>

</TABLE>


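You can achieve indentation of a chunk of stuff by enclosing it in
<DIR>    </DIR>
tags. (<DIR> is deprecated in HTML 4; <BLOCKQUOTE>   </BLOCKQUOTE>
indents in the same way and is the safer choice.)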


The _coolest_ part of the whole business is _links_ from one page to
another. There are two somewhat different sorts, which we might call
"local" and "long distance".

A "local" (or "relative") link is a link to another file that is in the
same directory (or possibly in a subdirectory, etc.). For example, to
link _to_ a file "another.html" from file "firstfile.html", do

	  <A HREF="another.html"> click here </A>

The "click here" will appear underlined and in blue, and when you click
on it your browser will take you to "another.html".

If the destination is in a subdirectory "subdir", then the link can be
written as

		  <A HREF="subdir/another.html"> click here </A>

and so on.

From a stylistic viewpoint, the "click here" is not so good. Rather,
simply do something like

		 <A HREF="subdir/another.html"> another.html </A>

putting the destination's name in between the tags.

The "non-local" (or "absolute") link has the whole URL of the
destination, such as

  <A HREF="http://www.math.umn.edu/~garrett/another.html"> another.html </A>

As in the case of all the other tags as well, when the line gets too
long you can choose to break it however you want, since the browser
doesn't care: for example, it would be ok to do

<A HREF="http://www.math.umn.edu/~garrett/another.html">
    another.html 
</A>

The only place that linebreaks and extra spaces are _bad_ is if they
interrupt the URL, or other stuff in the tags themselves. For example,
in making a TABLE it would be bad to do
<TAB LE> or
<TAB
LE>


It is also possible to make links to and from locations within a
single document. This is _not_ generally a good idea: if the viewer
can get to a spot by "physical" scrolling, they'll have a better
feeling for where they are. By contrast, a link effectively
short-circuits the viewer's sense of location. Nevertheless, here's
how you do it: 

	 <A HREF="#destination"> Click here </A>

will make "Click here" appear underlined and in blue, and when you
click on it you'll go to the spot where

		<A NAME="destination"></A>

appears. The "<A NAME="destination"> </A>" itself is invisible. One more
time: using a lot of this within a document can really disorient the
reader, and sacrifice the last little bits of "physical" sense that
the viewer has for location within the document.


And it is also possible to make links to specific locations within
_other_ documents, by a combination of the above: the link

<A HREF="http://www.math.umn.edu/~garrett/other.html#there"> click </A>

will take you to the spot in
"http://www.math.umn.edu/~garrett/other.html"
where "<A NAME="there"></A>" appears.


For some modest mathematical purposes superscripts and subscripts can
be achieved by, for example,
	
	x<SUP>2</SUP>    y<SUB> i <SUB> j </SUB> </SUB>

for "x-superscript-2" and "y-sub-i-sub-j". This works about 95% of the
time. In the remaining small fraction of the time (mostly when these
constructions occur at the left margin), the non-subscript/superscript
part gets pulled along with the subscript/superscript, so that "x-sup-2"
might turn into "superscript-x2". 


Of some importance, we have _special_characters_. To start,
of course the "<" and ">" should not be used except for the HTML
markup tags. If you want a literal "<", do "&lt;", and if you want ">"
do "&gt;". Since the ampersand itself starts these, a literal "&" is
"&amp;". The pattern is very consistent for these and other special
characters: the special characters begin with an ampersand, and end
with a semicolon.
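
For instance, to talk _about_ a tag on a page without the browser
treating it as a tag, something like the following sketch should do.
The sentences are made up for illustration; "&amp;" is the special
character for a literal ampersand.

```html
<P> To start a new paragraph, type &lt;P&gt; in your file.
    <!-- shows as: To start a new paragraph, type <P> in your file. -->

<P> If x &lt; y and y &lt; z, then x &lt; z.
    <!-- shows as: If x < y and y < z, then x < z. -->

<P> Tom &amp; Jerry
    <!-- shows as: Tom & Jerry -->
```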

There are many "proposed" or "inconsistently supported" characters,
but the ordinary letters with diacritical marks, useful for names in
languages other than English, are pretty consistently supported and
might really matter. The pattern of specification is not so crazy,
either: 

À = &Agrave;     capital A with grave accent
Á = &Aacute;     capital A with acute accent
Â = &Acirc;      capital A with circumflex
Ã = &Atilde;     capital A with tilde
Ä = &Auml;       capital A with umlaut
Å = &Aring;      capital A with ring
Æ = &AElig;      capital AE ligature

Ç = &Ccedil;     capital C with cedilla

È = &Egrave;     capital E with grave accent
É = &Eacute;     capital E with acute accent
Ê = &Ecirc;      capital E with circumflex
Ë = &Euml;       capital E with umlaut

Ì = &Igrave;     capital I with grave accent
Í = &Iacute;     capital I with acute accent
Î = &Icirc;      capital I with circumflex
Ï = &Iuml;       capital I with umlaut

Ñ = &Ntilde;     capital N with tilde

Ò = &Ograve;     capital O with grave accent
Ó = &Oacute;     capital O with acute accent
Ô = &Ocirc;      capital O with circumflex
Õ = &Otilde;     capital O with tilde
Ö = &Ouml;       capital O with umlaut
Ø = &Oslash;     capital O with slash

Ù = &Ugrave;     capital U with grave accent
Ú = &Uacute;     capital U with acute accent
Û = &Ucirc;      capital U with circumflex
Ü = &Uuml;       capital U with umlaut

ß = &szlig;      small sz ligature (German)

... and there are the lowercase versions of all these, denoted
analogously by

à = &agrave;     lowercase A with grave accent
á = &aacute;     lowercase A with acute accent
â = &acirc;      lowercase A with circumflex
ã = &atilde;     lowercase A with tilde
ä = &auml;       lowercase A with umlaut
å = &aring;      lowercase A with ring
æ = &aelig;      lowercase AE ligature

and so on for all the others, also.
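
As a small sketch of how these get used for names, here are a few
mathematicians' names written with the special characters. The
lowercase ones (&ouml;, &eacute;, &ccedil;, &egrave;) follow the same
pattern as their capital versions in the list above.

```html
<P> Kurt G&ouml;del                  <!-- shows as: Kurt Gödel -->
<BR> Henri Poincar&eacute;           <!-- shows as: Henri Poincaré -->
<BR> Fran&ccedil;ois Vi&egrave;te    <!-- shows as: François Viète -->
<BR> Karl Weierstra&szlig;           <!-- shows as: Karl Weierstraß -->
```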

Last, we have _comments_: things which will not appear at all in the
displayed page. These are of the form
<!-- this is a comment -->
<!--
so is this
 -->
That is, the opening of the comment is left-angle, exclamation,
hyphen, hyphen, and the closing is hyphen, hyphen, right-angle.

There is a minor peculiarity that the closing "-->" oughtn't appear at
the left margin, or some browsers will get confused.

 ********** The End ***********



Unless explicitly noted otherwise, everything here, work by Paul Garrett, is licensed under a Creative Commons Attribution 3.0 Unported License.

The University of Minnesota explicitly requires that I state that "The views and opinions expressed in this page are strictly those of the page author. The contents of this page have not been reviewed or approved by the University of Minnesota."