Webpages often have images or bits of text you can click on to access another online resource. These are known as hyperlinks. Normally that resource is another webpage but hyperlinks can also be used for email, online graphics, downloadable files, animations, and so on.
The very purpose of hyperlinks is to associate text with a Uniform Resource Identifier (URI for short), which is also known as a Uniform Resource Locator (URL for short). A URI is simply text that describes where a resource is on the Internet.
You've likely seen URIs before; they appear in the address bar of the browser, where you can type in the address of the page you want to go to. What you type in is an example of a URI. In fact, you used one in your first page:
The part in bold (http://www.w3.org/TR/html4/strict.dtd) is an excellent example of a URI.
There are two types of URIs on the World Wide Web: absolute and relative.
An absolute URI is a web address with the exact location of the resource explicitly stated. It contains the following parts (which I will demonstrate using the URI in the Doctype): the protocol, the domain, the path, and the file name.
In a network, a protocol is a standardized means of communication between computers. (The Internet, of course, is the largest network in the world.) The http at the beginning of the sample URI stands for HyperText Transfer Protocol, the protocol of the World Wide Web, where webpages reside. For obvious reasons, it is the most common protocol in URIs on the World Wide Web.
If you're viewing the webpages on a CD, or viewing a webpage you've stored on your own computer, you may see the URI begin with file://, file:///, or just the drive letter. This is fine; it means you're getting it from your own computer's file system, not from a webserver.
The domain is the location of the website.
Domains are actually read backwards by the computers that run the World Wide Web. For example, when they read www.w3.org, they check the following:
The top-level domain (.org) exists.
w3 exists within the .org top-level domain.
www exists within the w3.org domain.
This backwards checking is why servers are able to offer subdomains.
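The reading order above can be sketched in a few lines of Python. This is only an illustration of the right-to-left order, not a real domain lookup, and the function name reading_order is made up for this example.

```python
# Illustration only: mirrors the right-to-left reading order described
# above. It does not contact any server or perform a real lookup.
def reading_order(domain: str) -> list:
    """Return the names checked, starting from the top-level domain."""
    labels = domain.split(".")  # "www.w3.org" -> ["www", "w3", "org"]
    steps = []
    # Walk from the last label to the first, growing the name each time.
    for i in range(len(labels) - 1, -1, -1):
        steps.append(".".join(labels[i:]))
    return steps


print(reading_order("www.w3.org"))
# ['org', 'w3.org', 'www.w3.org']
```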
All websites have at least one folder that stores files and other folders, much like folders on your computer. The part of the URI that describes the sequence of folders to look in is the path.
Think of a website as a filing cabinet. The root folder (which, like the root element, stores everything else) would be the cabinet itself. And while you can store all your files in the root folder itself, it would be like having a stack of paper in a filing cabinet with no drawers: the bigger the stack (or the more files in the website), the harder it is to keep track of things.
Folders in the website would be like the drawers in that filing cabinet, and the file folders in those drawers. The "path" for a real-life filing cabinet may be "Second drawer from the bottom, third folder from the front."
By the way, the / at the start of the path stands for the root folder, which has no name.
Here, the path says "Start in the root folder. Go to the folder named TR/. Inside that is the folder html4/. Inside that is the desired file."
Last comes the file name, which is, of course, the file you want.
In this case, it's strict.dtd. You may go to the URI http://www.w3.org/TR/html4/strict.dtd if you like; it's simply the HTML 4.01 Document Type Definition.
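As a rough check of the four parts, here is a small Python sketch that takes the Doctype URI apart with the standard library's urlsplit. The variable names are my own, and splitting the path from the file name at the last slash is a simplification that suits this URI.

```python
from urllib.parse import urlsplit

uri = "http://www.w3.org/TR/html4/strict.dtd"
parts = urlsplit(uri)

protocol = parts.scheme  # the text before "://"
domain = parts.netloc    # the location of the website

# Everything up to the last "/" is the path; what follows is the file name.
folders, _, file_name = parts.path.rpartition("/")
path = folders + "/"

print(protocol)   # http
print(domain)     # www.w3.org
print(path)       # /TR/html4/
print(file_name)  # strict.dtd
```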
A relative URI finds files relative to the current page you are looking at. It consists of two parts: the path and the file name.
For this reason, it only works within the same domain. Since the URI used in the Doctype is not an actual webpage, I'll use another URI for examples: http://www.w3.org/TR/html401/about.html, which is the URI pointing to a webpage explaining the HTML specification.
The paths for relative URIs use three special folder names:
./
This refers to the folder containing the webpage you are looking at. Omitting this usually has no effect on the URI, but it depends on the webserver you are using. It's generally a wise idea to include it.
With the webpage I suggested, ./ would refer to http://www.w3.org/TR/html401/.
../
This refers to the parent folder of the folder containing the current page. With the sample URI, this would refer to http://www.w3.org/TR/.
Should you want the parent folder of the parent folder of the current folder, you would repeat ../ like this: ../../, which would refer to http://www.w3.org/.
/
This refers to the root folder. With the sample URI, this would refer to http://www.w3.org/.
"Parent" and "root" folders are analogous to parent and root elements in an HTML document.
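To see how the special folder names resolve against the sample page, here is a minimal Python sketch using the standard library's urljoin, which applies the usual rules for relative URIs. The names current_page and resolved are my own.

```python
from urllib.parse import urljoin

# The sample page used in the text above.
current_page = "http://www.w3.org/TR/html401/about.html"

# Resolve each special folder name relative to the current page.
resolved = {
    relative: urljoin(current_page, relative)
    for relative in ("./", "../", "../../", "/")
}

for relative, absolute in resolved.items():
    print(relative, "->", absolute)
# ./ -> http://www.w3.org/TR/html401/
# ../ -> http://www.w3.org/TR/
# ../../ -> http://www.w3.org/
# / -> http://www.w3.org/
```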
Some final notes on URIs.
When working within the same domain, relative URIs are indispensable. With them, you can keep identical copies of your website on your server and on your own computer without having to change anything.
When linking to a resource on another domain, however, absolute URIs are the only type you can use.
There is one, and only one, part of a URI that decides whether it is absolute or relative: the protocol. If it is present, the URI is treated as absolute; if absent, then relative. It is also a mistake to have the protocol twice in a single URI. Keep this in mind to avoid mistakes with hyperlinks.
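The protocol rule can be expressed as a one-line test. The sketch below uses Python's urlsplit to look for a protocol; the helper name is_absolute is hypothetical, and it only checks for the presence of a protocol, not whether the URI points anywhere.

```python
from urllib.parse import urlsplit


def is_absolute(uri: str) -> bool:
    """Treat a URI as absolute only if it starts with a protocol."""
    return bool(urlsplit(uri).scheme)


print(is_absolute("http://www.w3.org/TR/html401/about.html"))  # True
print(is_absolute("../about.html"))                            # False
# Without a protocol, even a domain-like string counts as relative.
print(is_absolute("www.w3.org/TR/"))                           # False
```

The last case shows a common hyperlink mistake: leaving out http:// makes the browser look for a folder named www.w3.org next to the current page.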
It is possible to write out a URI using character references. There are two ways to do this.
You simply write the code as if you were going to display it on the screen, beginning with the ampersand (&) and ending with the semicolon. But this causes things to get sticky: some URIs include the ampersand itself! In such cases, it is necessary to write out the ampersand using its character reference (&amp;); otherwise, the browser might try to interpret the letters that follow the ampersand as a character reference.
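As an illustration of the ampersand problem, the following Python sketch escapes a URI the way it would need to be written inside a page. The URI itself is a made-up example on example.com, and escape comes from Python's standard html module.

```python
from html import escape, unescape

# Hypothetical URI containing a raw ampersand between two parameters.
uri = "http://www.example.com/search?q=links&page=2"

# escape() writes the ampersand as its character reference.
escaped = escape(uri)
print(escaped)
# http://www.example.com/search?q=links&amp;page=2

# Reading the reference back gives the original URI again.
print(unescape(escaped) == uri)  # True
```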
You've likely seen this in the address bar of your browser from time to time, particularly the code %20, which shows up if the URI has a space in it. Figuring out which characters are which is easy for basic characters: the code is the character's Unicode code point in hexadecimal (for example, the space is U+0020). However, each percent code holds only two hexadecimal digits, so a single code cannot go above FF in hexadecimal (255 in decimal); characters beyond the basic ASCII range are normally written as several percent codes, one for each byte of the character's UTF-8 encoding. These codes are further restricted by whether or not they are valid characters, as explained in Special Characters.
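Here is a short Python sketch of percent-encoding, using quote and unquote from the standard library. The file name is a made-up example. Note that Python's quote encodes text as UTF-8 by default, so a character outside the basic ASCII range comes out as more than one percent code.

```python
from urllib.parse import quote, unquote

# Hypothetical file name containing a space.
encoded = quote("my file.html")
print(encoded)           # my%20file.html
print(unquote(encoded))  # my file.html

# The two hexadecimal digits match the code point of the space, U+0020.
space_code = format(ord(" "), "02X")
print(space_code)        # 20

# A character outside ASCII becomes one percent code per UTF-8 byte.
accented = quote("é")
print(accented)          # %C3%A9
```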
There is a special element-attribute combination required for creating a hyperlink:
a element
href attribute
a Element
To create a hyperlink, you need to use the anchor element, which has the element name a. Like em and strong, a is an inline element. Unlike em and strong, you may not nest one a element inside another, although you may nest em and strong elements inside it and vice versa.
href Attribute
Above, I described URIs. This is the attribute which contains them in a hyperlink.
Technically, you can put in whatever string of text you like without raising an error while validating, but be warned: your browser will treat it like a URI, your browser will request the resource specified, and your most likely error will be the well-known "404: Page Not Found" error.
A link would look like this:
You may be wondering why I have given the element and attribute such short shrift. The reason is simple: you've seen it all before. The a element is an inline element, the href attribute requires text in a specific format to work, and that's all there is to it.
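To make the element-attribute pairing concrete, here is a small Python sketch that assembles the markup for a link. The helper make_link and the link text are hypothetical; the URI is the sample page used earlier. Escaping the values guards against the ampersand problem described above.

```python
from html import escape


def make_link(uri: str, text: str) -> str:
    """Build an a element whose href attribute holds the given URI."""
    return '<a href="{}">{}</a>'.format(escape(uri, quote=True), escape(text))


link = make_link(
    "http://www.w3.org/TR/html401/about.html",
    "About the HTML specification",  # hypothetical link text
)

# The a element is inline, so it sits inside a block element such as p.
paragraph = "<p>" + link + "</p>"
print(paragraph)
# <p><a href="http://www.w3.org/TR/html401/about.html">About the HTML specification</a></p>
```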
There is a question that might arise, and rightly so, about links that don't use the http:// protocol. What about, for example, e-mail? Or an internet resource that has nothing to do with the World Wide Web?
Hyperlinks will indeed work with other protocols, but the system of folders I described above may not apply to them; you have to know and follow the rules of each protocol you use. Such protocols can include:
https://
The secure counterpart of the http:// protocol.
ftp://
File Transfer Protocol, which allows files to be uploaded to a server as well as downloaded. Like http:// and https://, URIs using this protocol require that you include a path.
furc://
The protocol used by the game Furcadia, as in furc://naiagreen/, which is a link to the game's help area. The furc:// protocol and Furcadia are © Dragon's Eye Productions (<http://www.furcadia.com>) and are mentioned with permission.
Each of these protocols ends in ://.
When a user clicks on a link that uses a protocol that is not meant for a browser, the browser will tell the computer to launch the program with which the protocol is associated. If the browser can find no such program, the user will get a warning to that effect.
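A browser decides what to do with a link by looking at its protocol first. The Python sketch below picks the protocol out of a few URIs; the ftp address is a made-up example on example.com, and protocol_of is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def protocol_of(uri: str) -> str:
    """Return the protocol name at the start of a URI, if any."""
    return urlsplit(uri).scheme


for uri in (
    "http://www.w3.org/TR/html401/about.html",
    "ftp://ftp.example.com/pub/readme.txt",  # hypothetical address
    "furc://naiagreen/",
):
    print(protocol_of(uri))
# http
# ftp
# furc
```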
Amongst hyperlinks, the e-mail hyperlink is something of an oddball. An e-mail address is not exactly a URI, but the hyperlink treats it like one. The syntax is in two parts and very straightforward.
The viewer's user agent will decide how to handle the link. The usual result is to launch a program associated with e-mail.
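As a sketch of the two-part syntax, assuming the usual form of mailto: followed by the address, the Python below splits such a link apart. The address is a made-up example.

```python
from urllib.parse import urlsplit

# Assumed form: "mailto:" followed by a (hypothetical) e-mail address.
link = "mailto:someone@example.com"
parts = urlsplit(link)

print(parts.scheme)  # mailto
print(parts.path)    # someone@example.com

# Unlike http:// links, there is no domain part introduced by "//".
print(parts.netloc == "")  # True
```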
I included p tags as a reminder that a elements cannot be child elements of the body element; they must be contained within a block element such as p or h1.