February 13, 2003
SMIL. You are next!
From: Computerworld Singapore, Singapore - 13 Feb 2003
The Synchronised Multimedia Integration Language aims to bring the web towards rich media presentation … again.
By Louis Chua
The Synchronised Multimedia Integration Language (SMIL, pronounced "smile"), a little-known specification from the World Wide Web Consortium (W3C), is designed to enable simple authoring of interactive audio-visual presentations on multiple devices.
SMIL is typically used for "rich media" or multimedia presentations, which integrate streaming audio and video with images, text or any other media type. It is an HTML-like language that aims to be easy to learn, and many SMIL presentations can be written using a simple text editor.
SMIL is an application of XML (Extensible Markup Language) rather than a subset of it. Simply put, it lets authors specify what should be presented and when, so that, for example, the moment a sentence is spoken can be made to coincide with the display of a given image on the screen.
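As a rough illustration of that "what and when" idea, the fragment below is a minimal sketch in SMIL 1.0 syntax. The file names and the four-second offset are hypothetical, and the fragment would sit inside the body of a complete SMIL document.

```xml
<!-- Children of <par> play in parallel on a shared timeline. -->
<par>
  <!-- Hypothetical image: appears 4 seconds in and stays for 6 seconds. -->
  <img src="chart.jpg" begin="4s" dur="6s"/>
  <!-- Hypothetical narration: starts at the same 4-second mark. -->
  <audio src="sentence.wav" begin="4s"/>
</par>
```

Because both media objects carry the same begin value, a player should start the spoken sentence at the moment the image is shown.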
SMIL is the language used to support W3C's Synchronised Multimedia Activity, which focuses on the choreographing of multimedia presentations where audio, video, text and graphics are combined in real-time on the web.
Currently, W3C's Synchronised Multimedia Activity is focusing on the design of a language to cover all necessary aspects of timed text on the web. Typical applications of timed text are the real-time subtitling of foreign-language movies on the web, captioning for people lacking audio devices or with hearing impairment, karaoke, scrolling news items or teleprompter applications.
Within the W3C's organisational structure, work on Synchronised Multimedia is managed as part of the Interaction Domain.
The Interaction Domain seeks to improve and evolve web user interface technologies. Work includes formats and languages that add new interaction methods to the web such as speech recognition, multi-modal access, as well as mechanisms for handling the increasing number of new web access devices such as mobile phones, personal digital assistants and interactive television sets.
A frequently asked question is … Why SMIL? After all, do we not already have Flash from Macromedia, which can achieve similar results? The difference is that Flash remains essentially an animation data type – it is a content type rather than pure content itself. Furthermore, as an open recommendation from W3C, SMIL can be used by anyone to create compliant software.
MPEG (Moving Picture Experts Group) also looks at content and coding, but SMIL is more web-centric, whereas MPEG is more media-centric and involves more than just content and coding. A close comparison would be D-HTML (dynamic hypertext markup language). However, D-HTML uses scripted definitions of local behaviours, without a notion of the presentation's context. Actions such as timed events are therefore difficult to co-ordinate.
Then there are W3C technologies such as cascading style sheets (CSS), which are compatible with SMIL: style sheets can be created for media-based SMIL, with CSS code complementing SMIL layout. So why not simply use CSS?
There is a difference between CSS's text-flow and SMIL's time-flow documents. Although the XML nesting tells a lot about text layout, it tells very little about temporal layout, which is what SMIL is good for. The non-modularised nature of CSS also induces too much overhead and there are conceptual limitations to CSS for time-based presentations.
SMIL, on the other hand, is basically an XML document with a defined XML DTD (document type definition) and schema. It is a declarative integration language in which the media elements are referred to and not included. This allows SMIL documents to be hand-authored, though probably few would try since there are already many tools available, such as a SMIL authoring tool known as Fluition and RealNetworks' RealProducer G2 Authoring Kit.
SMIL can be used for a variety of applications, such as interactive video, video on demand, online training, audio, animation, and in other ways that were previously unavailable. With SMIL, the ability to create rich media presentations is simplified and is available to any user with a computer and an Internet connection.
SMIL allows the creator to synchronise the timing of media playback within a presentation. One key concept is the use of Regions to divide up a presentation. With Regions defined, a video can play in one Region while another Region is synchronised with it and brought into the presentation at specific, designated time points to enhance the impact.
SMIL is an XML-based language, which means that it is like HTML except that it is extensible. The SMIL document starts off with the smil element, which branches off into head and body elements. Within the head and body elements, further element types can be defined. To build a simple presentation, the elements needed are meta, layout, root-layout, region, switch, anchor, the media object elements, par (parallel) and seq (sequence).
A SMIL document consists of a collection of elements that are placed in a certain order. The head element contains information describing the document. Within the head element, the layout element contains the positioning information of the presentation, while root-layout determines the properties of the presentation window. The region element controls the properties of the rendering surfaces for the media, and regions lie on top of the root-layout.
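The structure described above can be sketched as a small, complete document. This is only an illustrative example in SMIL 1.0 syntax: the region names, pixel sizes, title and media file names are all hypothetical, and a real presentation would point at the author's own media.

```xml
<smil>
  <head>
    <!-- Information describing the document. -->
    <meta name="title" content="Hypothetical product demo"/>
    <layout>
      <!-- Properties of the presentation window. -->
      <root-layout width="320" height="300" background-color="black"/>
      <!-- Rendering surfaces placed on top of the root-layout. -->
      <region id="video_area" left="0" top="0" width="320" height="240"/>
      <region id="caption_area" left="0" top="240" width="320" height="60"/>
    </layout>
  </head>
  <body>
    <!-- par: its children play at the same time. -->
    <par>
      <!-- The video plays in one region from the start. -->
      <video src="demo.mpg" region="video_area"/>
      <!-- The caption image is brought into the second region
           5 seconds in and stays for 10 seconds. -->
      <img src="caption.gif" region="caption_area" begin="5s" dur="10s"/>
    </par>
  </body>
</smil>
```

Replacing par with seq would play the two media objects one after the other instead of together. Note that the media files are only referred to by name and are not embedded in the document.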
The body contains the content of the document. The elements inside a
SMIL creation is usually the last step in the authoring process. All of the artwork, encoding and basic interactions have to be mapped out before authoring the SMIL document. As with any media production process, preparation is the key to a smooth authoring workflow.
Copyright IDG Communications (S) Pte. Ltd.