Empty Elements And Semantics

Now that you know how elements, tags, and attributes work, you can use any HTML element since they all have the same format: opening tag (with desired attributes), content, closing tag—well, almost all of them. There is a small group of elements known as empty elements, which have only the opening tag. There are also many elements that have the same visual effect, but are used for different purposes which are dictated by (X)HTML semantics. In this chapter, I explain both.

Empty Elements

Empty elements consist solely of an opening tag; they have neither content nor closing tag. Two examples are the horizontal rule element and the image element.

The Horizontal Rule Element

The task of the horizontal rule element, which has the element name hr, is simple: place a horizontal line on the webpage. Below is an example:

A Page That Includes A Horizontal Rule Element:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>A Webpage With A Horizontal Rule</title>
</head>
<body>
<h1>A Webpage With A Horizontal Rule</h1>
<p>This paragraph goes before the horizontal rule (the horizontal line on this page).</p>
<hr>
<p>This paragraph goes after the horizontal rule.</p>
</body>
</html>

Here is the result:

A horizontal rule between two paragraphs

The Image Element

The task of the image element, which has the element name img, is a bit more complex: place a specific graphic on the webpage. Below is an example:

A Page That Includes An Image Element:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>A Webpage With An Image</title>
</head>
<body>
<h1>A Webpage With An Image</h1>
<p>This paragraph contains the W3C Valid HTML 4.01 Strict icon: <img src="./valid-html401.png" alt="Valid HTML 4.01 Strict">.</p>
</body>
</html>

Here is the result:

A webpage with an image

Why Empty Elements Are Empty

As you know, the html, head, title, body, h1, p, em, strong, and a elements follow the start tag, content, end tag format; in Your First Webpage, I demonstrated the effects of leaving out the end tag of a non-empty element. However, these elements differ from the hr and img elements in one fundamental aspect: non-empty elements can contain text and/or other elements. In other words, non-empty elements can have content.

Empty elements can't.

The information that the browser needs from an hr element is contained in the element name itself; in other words, the hr element tells a browser "put a horizontal line here" by its very presence. When the start tag is finished, the browser is finished with the hr element.

The same goes for the img element. It tells the browser "put an image here" by its very presence. Which image to use is indicated by the src attribute (short for source), which contains the URI of the image, and the text to display if the image doesn't appear is contained in the alt attribute. Again, once the start tag is finished, the browser is done with the element.

You could, of course, give hr and img elements extra information by adding title, class, id, and/or style attributes, but we already know that attributes are always contained in the start tag.
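As a rough illustration, the fragment below adds these attributes to the hr and img elements from the earlier examples. The id, class, title, and style values are made up for this sketch; only the src and alt values come from the pages above.

```html
<!-- Fragment only: these elements belong inside the body of a page. -->
<!-- The id, class, and title values are hypothetical. -->
<hr id="section-break" class="divider" title="End of the introduction">

<!-- The img element is inline, so in a Strict page it sits inside a block element such as p. -->
<p>
  <img src="./valid-html401.png" alt="Valid HTML 4.01 Strict"
       title="This page is valid HTML 4.01 Strict"
       class="badge" style="border: 0;">
</p>
```

However many attributes are added, everything still sits inside the start tag, and there is still no content and no end tag.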

Therefore, the definition of an empty element is an element in which all possible information pertaining to that element is contained solely in its start tag. This is why there is no such thing as a </hr> or </img> tag in (X)HTML; they're needless and a browser will not recognize them.

If you have looked at the code for the page where I highlight parts of the code of the webpage demonstrated in Your First Webpage, you may have seen the following:

The Script Element:
<script type="text/javascript" src="../../hilite_script.js"></script>

Considering there is nothing between the start tag and end tag of the script element, one might think that this should be an empty element. But recall that I said all possible information. The src attribute refers to a separate file that is used as part of this page. I could just as easily place the contents of that file within the script element, and it would work equally well. Not all possible information is contained within the script start tag, so script is not an empty element.
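To make the comparison concrete, here is a sketch of the two forms side by side. The inline statement is a made-up placeholder, since the actual contents of hilite_script.js are not shown in this chapter.

```html
<!-- External form: the code lives in a separate file named by src. -->
<script type="text/javascript" src="../../hilite_script.js"></script>

<!-- Inline form: the code is the content of the script element. -->
<!-- The statement below is a hypothetical stand-in for the file's contents. -->
<script type="text/javascript">
  var note = "This code is written directly inside the script element.";
</script>
```

Because the second form is possible, the script element needs an end tag to mark where its content stops, even when that content happens to be empty.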

There are 10 empty elements in HTML 4.01 Strict and XHTML 1.0 Strict, and I'll explain each one when we get to it.

XML And Empty Elements

Perhaps the most well-known difference between HTML and XHTML is the handling of empty elements. In HTML, an empty element is assumed to be closed when its tag ends. That's never the case in any XML language; the tag must be explicitly closed. The way to do this is to put a forward slash (/) just before the greater-than character (>); this is the extra use of the forward slash I mentioned back in The Basics of Markup. Very often, a space is placed before the slash, which allows that slash to be read as a simple and minor syntax error when the page is read as HTML (which is, remember, the only way Internet Explorer and very old browsers can read XHTML). The space is not required when the page is read as an XHTML document, but it doesn't hurt.

The example below shows the hr and img elements as they would appear in XHTML.

Sample XHTML Empty Elements:
<hr />
<img src="./valid-html401.png" alt="Valid HTML 4.01 Strict" />

Of course, this syntax does mean that in XHTML any element with no content can technically be written as an empty element (for example, <p /> instead of <p></p>), but this can cause problems if the page is being read as HTML.
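Here is a sketch of the kind of problem that can arise, assuming an otherwise valid XHTML page. The class name spacer is invented for the example.

```html
<!-- Well-formed XHTML: a div with no content, written as an empty element. -->
<!-- Read as XML, the div is closed before the paragraph starts. -->
<!-- Read as HTML, browsers typically ignore the slash and treat this as an -->
<!-- ordinary start tag, so the paragraph below ends up inside the div. -->
<div class="spacer" />
<p>This paragraph is meant to follow the div, not sit inside it.</p>

<!-- Safer when the page may be read as HTML: give the div an explicit end tag. -->
<div class="spacer"></div>
<p>This paragraph follows the div in both HTML and XHTML.</p>
```

A reasonable habit is to keep the slash syntax for the elements that are empty by definition, such as hr and img, and to write an explicit end tag for everything else.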

Semantics

Question:
When you build a webpage, why do you use a p element for a paragraph?
Answer:
That's what it's for.

Semantics refers to how elements are meant to be used, and it matters most for elements whose visual effect is similar to that of other elements. The a, title, and hr elements obviously have specialized roles and no real substitutes, but the same is not true for the em element and another element called cite.

Below is a webpage showing a quote from Harry S Truman, former President of the United States of America:

A page using the <em> and <cite> elements

Note that the words cite and emphasize, as well as Truman's name, are all in italics. The reasons are different: cite and emphasize are given extra emphasis because they name the elements being demonstrated, while the name tells you who is being quoted. Therefore, we use different elements to denote the difference:

The Code Of The Above:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Cite And Emphasize</title>
</head>
<body>
<h1>The Difference between <em>Cite</em> and <em>Emphasize</em></h1>
<p>"The buck stops here." <cite>(Harry S Truman)</cite></p>
</body>
</html>

Yes, the visual effect is identical, but there is a rule that I must emphasize right now: it is unwise to use an element solely for its visual effect. Visual effect is also known as presentation, and that was never part of the original intent of (X)HTML.
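As a small sketch of that rule, compare the two ways of marking up the Truman quote below. The last paragraph is a made-up sentence, added only to show em used for its intended purpose.

```html
<!-- Unwise: em chosen only because it produces italics. -->
<!-- The name is a source, not a stressed phrase. -->
<p>"The buck stops here." <em>(Harry S Truman)</em></p>

<!-- Better: cite says what the text is, and italics are a side effect. -->
<p>"The buck stops here." <cite>(Harry S Truman)</cite></p>

<!-- em used for what it is for: stressing a word (hypothetical sentence). -->
<p>You <em>must</em> close every non-empty element.</p>
```

In a typical graphical browser the first two paragraphs look the same; the difference is in what the markup says about the text.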

One of the reasons semantics is so important is that you never know who will be reading your webpage or which browser they will be using, and there are browsers in use that ignore all visual effects a page has to offer. Some are text-only browsers (which need no further description). There are also browsers for blind people, which include audio browsers and tactile browsers. Unlike graphical browsers (like Internet Explorer or Firefox), they do treat cite and em elements differently!

In the following chapters, I will explain most of the elements of (X)HTML as well as I can. I'll start with the head elements, then move on to the most numerous group of elements: the inline elements. It is here that semantics is most important.