Structuring Information

The Structuring Information course is concerned primarily with xml, since this has become the standard method for structuring information for use on the web and other media platforms. In order to use xml we will also work with DTDs (document type definitions) and schemas, and with XSLT, the means by which you can output an xml document as html, as a pdf or Word file, or as almost any text-based file type.

The course will include regular weekly assignments, as well as three longer projects. The assignments will be individual but the longer projects will be carried out in small groups.

The course will be in three three-week sections during which we will cover the basic principles and techniques. By the end of the course you will know enough to create an xml schema that will be implemented in the new student portfolios the media department will be introducing at the start of the next academic year.

Current course times

In Period 3 the course is intended for Year 1 and Year 4 multimedia students.

The course is taught in the Media Lab (Room 323) on Monday and Wednesday mornings, beginning on February 5th, 2007. Sessions run from 9.15 to 12.00. Please do not be late.

1. Understanding and writing XML

These sessions will take place in weeks 1 to 3.

Session 1: structured information, html and xml

This session will look at the difference between unstructured and structured information. We will then revise the structure of html and compare it to xml. Finally we will begin to construct small, well-formed xml documents.
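As a rough illustration of what "well-formed" means, the following Python sketch uses the standard library parser to test three small documents. The sample markup and the function name `is_well_formed` are made up for this example and are not part of the course materials. The third sample shows why html habits, such as an unclosed `<br>`, do not carry over to xml.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as xml, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, invented for illustration.
GOOD = "<note><to>Anna</to><body>See you on Monday</body></note>"
BAD = "<note><to>Anna<body>See you on Monday</note>"  # <to> and <body> are never closed
HTML_STYLE = "<p>Line one<br>Line two</p>"  # acceptable in html, not well-formed xml

if __name__ == "__main__":
    for sample in (GOOD, BAD, HTML_STYLE):
        print(is_well_formed(sample), sample)
```

A parser only checks that tags are properly nested and closed. It says nothing about whether the document uses the right elements, which is the job of a schema or DTD.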

Self study: read and study the first three sections of XML Basics in the XML Files, which is an authoritative online reference to creating and manipulating xml. You may also have a look at the official W3C specifications for xml, although these are very detailed and most useful for reference later.

Session 2: validating xml with schemas

This session will begin with a more complex example of an xml document. We will consider how to store turkey recipes as xml. We will think about how to make sure that the xml that we write matches the specifications it is supposed to follow. How do we know if the xml we are writing is valid? Schemas and DTDs provide two solutions to this problem and we shall begin by making a schema.
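To make the idea of validity concrete, here is a minimal Python sketch that checks a recipe document against a few hand-written rules. It is a stand-in for the kind of constraint a schema or DTD expresses, not a schema validator. The element names (`recipe`, `title`, `ingredient`, `step`) and the rules themselves are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe markup; the element names are assumptions for this example.
RECIPE = """
<recipe>
  <title>Roast turkey</title>
  <ingredient>1 turkey</ingredient>
  <ingredient>2 tablespoons butter</ingredient>
  <step>Heat the oven.</step>
  <step>Roast until cooked through.</step>
</recipe>
"""


def check_recipe(text):
    """Return a list of rule violations; an empty list means the document passed."""
    root = ET.fromstring(text)  # raises ParseError if the text is not well-formed
    problems = []
    if root.tag != "recipe":
        problems.append("root element must be <recipe>")
    if len(root.findall("title")) != 1:
        problems.append("exactly one <title> is required")
    if not root.findall("ingredient"):
        problems.append("at least one <ingredient> is required")
    if not root.findall("step"):
        problems.append("at least one <step> is required")
    return problems


if __name__ == "__main__":
    print(check_recipe(RECIPE))
    print(check_recipe("<recipe><title>Plain turkey</title></recipe>"))
```

A schema states rules like these declaratively, in one file that any validating tool can read, instead of burying them in program code.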

Self study: describe the structure of an address book in a sequence of simple sentences. Create an xml document that contains four or five entries for this address book in the format you have described. Create a schema that will validate the xml file and make certain that each entry fits the format.

Session 3: validating xml with DTDs

This session will concentrate on the structure of DTDs. We will examine an existing DTD, look at how it works, and see if we can improve it. Can we improve it while still leaving it backward compatible?

Assignment 1: updating the RexStream validation

In 2005 students developed an xml-based system, called RexStream, designed for marking up old books. We downloaded several files from Project Gutenberg, marked them up, and then output them to the web, or to Palm-powered pdas. This specification uses a DTD. Your assignment is in two parts. Firstly you must download a Gutenberg file, and mark it up using the RexStream specifications. Secondly you should replace the supplied DTD with an xml schema that does the same job. The necessary files and instructions will be made available for download in week 3.

2. Transforming and outputting XML

These sessions will take place in weeks 4 to 6.

Session 4: Reading what we have written

This session will begin by working through the first assignment. Writing xml is often a matter of interpretation, and there may be more than one correct solution. Students will therefore present their work and we will discuss the differences between the examples. We will look at what we can do with the xml document once we have got it. We will begin to examine XSLT.

Session 5: Writing XSLT

XSLT files enable you to turn xml into any suitable output format. We will begin by turning the file from the first assignment into an html file. We will then look at other possibilities, such as outputting the xml as palm markup language, or as a pdf file. To do this we will look at how Open Office works.
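The idea of turning one xml tree into an html tree can be sketched in Python as below. This is not XSLT: it is a hand-written transformation using the standard library, shown only to illustrate the mapping from source elements to output elements that an XSLT file would describe with templates. The `book`, `title` and `para` elements are invented for the example and are not the RexStream markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; these element names are not from RexStream.
BOOK = (
    "<book>"
    "<title>Example</title>"
    "<para>First paragraph.</para>"
    "<para>Second paragraph.</para>"
    "</book>"
)


def book_to_html(text):
    """Build an html tree from the source xml and return it as a string."""
    source = ET.fromstring(text)

    html = ET.Element("html")
    body = ET.SubElement(html, "body")

    # Rule 1: the book title becomes a top-level heading.
    heading = ET.SubElement(body, "h1")
    heading.text = source.findtext("title", default="")

    # Rule 2: every <para> becomes a <p>, in document order.
    for para in source.findall("para"):
        paragraph = ET.SubElement(body, "p")
        paragraph.text = para.text

    return ET.tostring(html, encoding="unicode", method="html")


if __name__ == "__main__":
    print(book_to_html(BOOK))
```

Each "rule" in the comments corresponds roughly to one template in an XSLT file. Changing the rules while keeping the same input is what makes one source document usable for many outputs.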

Session 6: XSLT and style sheets

This session will concentrate on using style sheets with XSLT.

Assignment 2: one input, many outputs

Details of this project, including instructions for downloading the material, will be posted here in week 5.

3. Creating and testing an XML schema

These sessions will take place in weeks 7 to 9. Details will be posted in week 5.

This section of the course will include a major project, aimed at strengthening the online portfolios for students at Arcada.

Assignment 3: the definition of a peerticle

Details of this project, including instructions for downloading the material, will be posted here in week 7.

4. Final presentations

Presentations of the results of the major project will be held during week 10.