Introduction to XHTML
What is XHTML?
Why XHTML?
Differences Between XHTML and HTML
XHTML Syntax
XHTML Document Type Definitions [DTD]
XHTML How To
What is XHTML?
XHTML is a reformulation of HTML 4.01 in XML, and can be put to immediate
use with existing browsers by following a few simple guidelines. XML is
a markup language in which everything has to be marked up correctly, which
results in "well-formed" documents. XML was designed to describe
data, and HTML was designed to display data.
Today's market consists of different browser technologies: some browsers
run on desktop computers, and others run on mobile phones and handheld
devices. The latter often lack the resources or processing power to
interpret "bad" markup. By combining the strengths of HTML and XML, we
get a markup language that is useful now and in the future: XHTML.
XHTML pages can be read by all XML-enabled devices. While the rest of the
world upgrades to browsers that support XML, XHTML lets you write
"well-formed" documents now that work in all browsers and are backward
compatible.
- XHTML is a Web Standard
- XHTML 1.0 became an official W3C Recommendation on January 26, 2000.
- A W3C Recommendation means that the specification is stable, that it has been reviewed by the W3C membership, and that the specification is now a Web standard.
Why XHTML?
The web has existing backward- and cross-compatibility issues relating to legacy browsers. During the web ‘boom’, Microsoft and Netscape, among other smaller developers, produced many browser versions that varied in the HTML specifications they supported and, importantly, in how they interpreted the structure of HTML documents. At present the job of the web developer is becoming a little easier, in that the major vendors of user agents, Microsoft and Netscape, have begun to adopt the common standards of the W3C. This is fine for modern browsers, but what if we have to consider older browsers in our design phase? We have to make websites compatible with different old and current user agents and, at the same time, plan for the future.
We cannot know exactly what will happen in the future, but it seems quite clear that the user agents of the future [future web browsers, handheld devices, mobile phones etc.] will adopt XML as a standard technology. XHTML builds on the latest HTML 4.01 standard, but XML makes it a lot more powerful in terms of what it can do. For example, using supporting technology, an XML document can automatically be converted to a PDF document.
Support for XML is still very much in its development phase. For now we have XHTML.
The W3C’s ideas on the subject:
“Developers who migrate their content to XHTML 1.0 will realize the following benefits:
- XHTML documents can be written to operate as well or better than they did before in existing HTML 4-conforming user agents as well as in new, XHTML 1.0 conforming user agents.
- As the XHTML family evolves, documents conforming to XHTML 1.0 will be more likely to interoperate within and among various XHTML environments.
- The XHTML family is the next step in the evolution of the Internet. By migrating to XHTML today, content developers can enter the XML world with all of its attendant benefits, while still remaining confident in their content's backward and future compatibility.”
Differences Between XHTML and HTML
- Elements Must Be Properly Nested
<b><i>This text is bold and italic</i></b>
- Documents Must Be Well-formed
<html>
<head> ... </head>
<body> ... </body>
</html>
- All Tag Names Must Be in Lower Case
<body>
<p>A paragraph</p>
</body>
- All XHTML Elements Must Be Closed
<p>A paragraph</p>
- Empty Elements Must also Be Closed
Empty elements must either have an end tag, or the start tag must end with />.
To make your XHTML compatible with today's browsers, you should add an extra space before the "/" symbol, like this: <br />, and this: <hr />.
This is a break<br />
Here comes a horizontal rule:<hr />
Here's an image <img src="happy.gif" alt="Happy face" />
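The nesting and closing rules above are XML well-formedness rules, so any XML parser can check them. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree parser; the helper name is_well_formed and the sample fragments are illustrative and not part of XHTML itself.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Illustrative fragments, one per rule discussed above.
samples = {
    "properly nested": "<p><b><i>bold and italic</i></b></p>",
    "overlapping tags": "<p><b><i>bold and italic</b></i></p>",
    "unclosed paragraph": "<div><p>A paragraph</div>",
    "closed empty element": "<p>This is a break<br /></p>",
    "unclosed empty element": "<p>This is a break<br></p>",
}

for label, markup in samples.items():
    print(f"{label}: {is_well_formed(markup)}")
```

A well-formedness check of this kind does not cover every XHTML rule. For instance, upper-case tag names such as <P>...</P> are well-formed XML but are still not valid XHTML, which only a validation against the DTD would report.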
XHTML Syntax
- Attribute Names must be in Lower Case
<table width="100%">
- Attribute Values must be Quoted
<table width="100%">
- Attribute Minimization is Forbidden
This is wrong: <frame noresize>
This is correct: <frame noresize="noresize">
- The id Attribute replaces the Name Attribute
This is wrong: <img src="picture.gif" name="picture1" />
This is correct: <img src="picture.gif" id="picture1" />
Note: To interoperate with older browsers for a while, you should use both name and id, with identical attribute values, like this:
<img src="picture.gif" id="picture1" name="picture1" />
- The Lang Attribute
The lang attribute applies to almost every XHTML element. It specifies the language of the content within an element. If you use the lang attribute in an element, you must add the xml:lang attribute, like this:
<div lang="no" xml:lang="no">Heia Norge!</div>
- Mandatory XHTML Elements
All XHTML documents must have a DOCTYPE declaration. The html, head and body elements must be present, and the title must be present inside the head element.
This is a minimum XHTML document template:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Title goes here</title>
</head>
<body>
<p>Body text goes here</p>
</body>
</html>
Note: The DOCTYPE declaration is not a part of the XHTML document itself. It is not an XHTML element, and it should not have a closing tag.
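Several of the syntax rules in this section can be checked mechanically. The sketch below is one possible approach in Python, assuming the standard library's xml.etree.ElementTree parser. The function names check_document and local_name are hypothetical helpers, and the check is far from a full DTD validation: it only looks for a DOCTYPE, the mandatory elements, and lower-case attribute names. Unquoted or minimized attributes are reported by the parser itself as well-formedness errors.

```python
import xml.etree.ElementTree as ET


def local_name(name: str) -> str:
    """Drop the '{namespace}' prefix ElementTree puts on qualified names."""
    return name.rsplit("}", 1)[-1]


def check_document(source: str) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # ElementTree discards the DOCTYPE, so look for it in the raw text.
    if not source.lstrip().startswith("<!DOCTYPE"):
        problems.append("missing DOCTYPE declaration")
    try:
        root = ET.fromstring(source)
    except ET.ParseError as err:
        # Unquoted or minimized attributes end up here.
        problems.append(f"not well-formed: {err}")
        return problems
    if local_name(root.tag) != "html":
        problems.append("root element is not html")
    children = {local_name(child.tag): child for child in root}
    for required in ("head", "body"):
        if required not in children:
            problems.append(f"missing {required} element")
    head = children.get("head")
    if head is not None and not any(local_name(c.tag) == "title" for c in head):
        problems.append("missing title element inside head")
    for element in root.iter():
        for attribute in element.attrib:
            name = local_name(attribute)
            if name != name.lower():
                problems.append(f"attribute {name} is not lower case")
    return problems


# Illustrative input: a minimized attribute is rejected by the parser.
print(check_document("<html><body><frame noresize /></body></html>"))
```

An empty list only means that none of these particular checks failed. A document that passes can still be invalid against its DTD.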
XHTML DTDs
An XHTML document is validated against a Document Type Definition (DTD). Before an XHTML file can be properly validated, a DOCTYPE declaration that references the correct DTD must be added as the first line of the file. The <!DOCTYPE> declaration is mandatory.
An XHTML document consists of three main parts:
- the DOCTYPE
- the Head
- the Body
The DOCTYPE declaration should always be the first line in an XHTML document.
A DTD specifies the syntax of a type of document. DTDs are used by SGML
applications, such as HTML, and by XML applications, such as XHTML, to
specify rules that apply to the markup of documents of a particular type,
including a set of element and entity declarations.
XHTML is specified in an XML document type definition, or "DTD".
An XHTML DTD describes in precise, computer-readable language the allowed
syntax and grammar of XHTML markup.
There are currently 3 XHTML document types:
- XHTML 1.0 Strict
Use this when you want really clean markup, free of presentational clutter. Use this together with Cascading Style Sheets. The Strict DTD includes elements and attributes that have not been deprecated or do not appear in framesets.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- XHTML 1.0 Transitional
Use this when you need to take advantage of HTML's presentational features and when you want to support browsers that don't understand Cascading Style Sheets. The Transitional DTD includes everything in the strict DTD plus deprecated elements and attributes.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
- XHTML 1.0 Frameset
Use this when you want to use HTML Frames to partition the browser window into two or more frames. The Frameset DTD includes everything in the transitional DTD plus frames.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
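When pages are generated by a script, it can help to keep the three declarations in one place rather than retyping them. The following Python sketch assumes nothing beyond the public and system identifiers published for XHTML 1.0; the names DOCTYPES and doctype_declaration are hypothetical and chosen for illustration.

```python
# Public identifier and DTD URL for each XHTML 1.0 document type.
DOCTYPES = {
    "strict": (
        "-//W3C//DTD XHTML 1.0 Strict//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
    ),
    "transitional": (
        "-//W3C//DTD XHTML 1.0 Transitional//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd",
    ),
    "frameset": (
        "-//W3C//DTD XHTML 1.0 Frameset//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd",
    ),
}


def doctype_declaration(kind: str) -> str:
    """Build the DOCTYPE line for 'strict', 'transitional' or 'frameset'."""
    public_id, system_id = DOCTYPES[kind]
    return f'<!DOCTYPE html PUBLIC "{public_id}" "{system_id}">'


for kind in DOCTYPES:
    print(doctype_declaration(kind))
```

Looking the declaration up by name avoids the easy mistake of pasting the declaration of one document type under the heading of another.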
XHTML How To
A Note About the DOCTYPE - Your pages must have a DOCTYPE declaration if you want them to validate as correct XHTML.
Be aware, however, that newer browsers (like Internet Explorer 6) might
treat your document differently depending on the <!DOCTYPE> declaration.
If the browser reads a document with a DOCTYPE, it might treat the document
as "correct" and render it according to the standards. Malformed XHTML
might then break, or display differently than it would without a DOCTYPE.
Web Site Validation
Validate XHTML documents against the official W3C DTD. The W3C's online
validation service is a good way of learning what is and is not allowed
in your markup; it is available on the W3C website.
You may wish to consider using a markup tidy tool such as Dave Raggett’s
HTML ‘TIDY’. Links to this can be found at the W3C website.
