CPSC 461: Copyright (C) 2003 Katrin Becker Last Modified May 26, 2003 07:30 PM
FILE FORMATS
Document Formats

Main Data File Format Types:
Documents (text; word-processor; LaTeX; PostScript; etc.)
Electronic Data Interchange (EDI : formats for e-commerce)
Scientific
Graphics (raster; vector)
Audio (sound & voice)
Animation (moving pictures)
Multimedia (video; combined)
SGML
(Standard Generalized Markup Language) An ISO standard for defining the format in a text document. An SGML document uses a separate Document Type Definition (DTD) file that defines the format codes, or tags, embedded within it. Since SGML describes its own formatting, it is known as a meta-language. SGML is a very comprehensive language that includes hypertext links. The HTML format used on the Web is an SGML document that uses a fixed set of tags.***
XML
(EXtensible Markup Language) An open standard for describing data from the W3C. It is used for defining data elements on a Web page and business-to-business documents. It uses a similar tag structure as HTML; however, whereas HTML defines how elements are displayed, XML defines what those elements contain. HTML uses predefined tags, but XML allows tags to be defined by the developer of the page. Thus, virtually any data items, such as product, sales rep and amount due, can be identified, allowing Web pages to function like database records. By providing a common method for identifying data, XML supports business-to-business transactions and is expected to become the dominant format for electronic data interchange (see EDI).***
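To make the "Web pages function like database records" idea concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The tag names (invoice, product, sales_rep, amount_due) and the sample values are invented for illustration; they are not part of any standard, which is exactly the point: the developer defines them.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: these tags are defined by us, not by XML itself.
INVOICE_XML = """
<invoice>
  <product>Widget</product>
  <sales_rep>J. Smith</sales_rep>
  <amount_due>149.95</amount_due>
</invoice>
""".strip()


def invoice_to_record(xml_text):
    """Map each child element of the root to a field, like a database row."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


record = invoice_to_record(INVOICE_XML)
print(record["product"], record["amount_due"])
```

Because the tags say what each element contains rather than how to display it, a program on the receiving end can pick out the fields by name.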
EDI
(Electronic Data Interchange) The electronic communication of business transactions, such as orders, confirmations and invoices, between organizations. Third parties provide EDI services that enable organizations with different equipment to connect. Although interactive access may be a part of it, EDI implies direct computer to computer transactions into vendors' databases and ordering systems.

The Internet is expected to give EDI quite a boost, but not by using private networks and the traditional EDI data formats (X12, EDIFACT and TRADACOMS). Rather, XML is expected to be the glue that connects businesses together using the Web as the communications vehicle. Check out X12, EDIFACT, TRADACOMS, extranet and XML.***
DSSSL
(Document Style Semantics and Specification Language) A style sheet and transformation language for SGML documents. It allows an SGML document to be formatted for presentation or converted into another structure. Jade is an example of a DSSSL processor written by James Clark. For information, visit www.jclark.com. ***

A word about PostScript (and the like):
Programming languages have instructions. To do what? Define variables and data structures; read and write data; control loops; do arithmetic; etc. High-level programming languages are useful because we can write the same instructions once, give them to a compiler or interpreter, and let IT deal with the low-level, machine-specific instructions. There are many instruction sets for many different machines. These instruction sets do many similar things, but the exact incantation for a specific machine will vary. The kinds of instructions include arithmetic; logical operations; branches; reading and writing memory; sending interrupts; etc.

We already know all this.

Printers also have low-level instructions. We must have some way to tell the printer what to do - it is a computer of sorts, after all, so it must have an instruction set. The instructions for a printer are likely to be somewhat different from those for a general purpose machine. These instructions are likely to say things about colours; pixels; etc.

PostScript is a "page description language". It is also a full-blown programming language. It has an interpreter. Applications can generate output, in PostScript, that describes what a particular document should look like. The PostScript interpreter then takes that description and translates it into the machine instructions required by the specific printer.
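As a rough illustration of an application generating PostScript, the Python sketch below builds a one-page PostScript program as text. The function name page_with_text and its default values are assumptions made up for this example; the PostScript operators it emits (findfont, scalefont, setfont, moveto, show, showpage) are standard ones. Coordinates are in points (72 per inch), measured from the lower-left corner of the page.

```python
def page_with_text(message, x=72, y=720, font="Helvetica", size=24):
    """Return a one-page PostScript program that draws `message` at (x, y).

    Hypothetical helper for illustration only.
    """
    # Backslash and parentheses are special inside a PostScript string literal.
    escaped = (
        message.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )
    lines = [
        "%!PS",                                       # marks the file as PostScript
        f"/{font} findfont {size} scalefont setfont", # choose and scale a font
        f"{x} {y} moveto",                            # set the current point
        f"({escaped}) show",                          # paint the string
        "showpage",                                   # emit the finished page
    ]
    return "\n".join(lines) + "\n"


ps = page_with_text("Hello (world)")
print(ps)
```

The application only says what the page should look like; sending the resulting text to a PostScript interpreter (in a printer or in software) is what turns it into device-specific instructions.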

*** Computer Desktop Encyclopedia, Alan Freedman (c) 1981-2002 The Computer Language Company Inc. All rights reserved. Ver. 15.1, 1st Quarter 2002
