A Bi-Monthly Newsletter
Volume 6, Issue 2, March 2003

Where Online Documentation and Global Audiences Meet

By John Watkins

Software packages use various forms of online documentation for user support. At the very least there is normally a “README” or release notes file. The advent of multimedia has prompted many software manufacturers to include other forms of documentation with their software; in fact, some manufacturers have completely eliminated printed materials and are relying solely on online documentation.

The two most common examples of online documentation are Help files (typically accessed through the program itself) and online manuals (user manuals, installation guides, and so on). A third area of increasing importance is the use of Web-based content for distributing all online documentation.

Help Files

Help files are the most common form of online documentation. Microsoft has two standards for Windows Help files: the RTF-based WinHelp and the HTML-based HTML Help. WinHelp has been used since Windows 3.0, while HTML Help was released in August 1997 and is becoming increasingly popular for the Windows 98, Windows 2000, Windows Me, and Windows XP operating systems. Most WinHelp files are written using tools like RoboHelp (by eHelp Corporation).

When translating Help files, the localizer needs access to the files used to build the Help:

  • Source content files (.RTF, .DOC, or .HTML), containing the bulk of the Help file text.
  • Project files (used to compile the Help content).
  • Bitmaps or other graphics that are used in the Help files (often containing text that requires translation).
  • Other content files (such as the .CNT file used to create the table of contents for the Help file).

It is also possible to use macros in Help files. These macros should be examined to see whether they work on the native-language operating system. For example, the following macro opens the Printers control panel:

ExecProgram("control.exe printers", 0)

When used in Windows 9X, nothing needs to be changed to make this work on non-English operating systems. But for Windows 3.1, the “printers” part of the macro needs to match Microsoft’s translation of that section of the control panel:

  • French ExecProgram("control.exe Imprimantes", 0)
  • German ExecProgram("control.exe drucker", 0)
  • Dutch ExecProgram("control.exe printers", 0)
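The per-language variation above can be kept in one lookup table so that the Help build picks the right macro text for each target. The following is a minimal Python sketch under that assumption; the names PRINTER_SECTION and printers_macro are hypothetical, and the section names are simply the ones listed above.

```python
# Control-panel section names for Windows 3.1, copied from the list above.
# The dictionary and function names are illustrative, not part of any Help tool.
PRINTER_SECTION = {
    "en": "printers",
    "fr": "Imprimantes",
    "de": "drucker",
    "nl": "printers",
}


def printers_macro(language: str, windows_3x: bool) -> str:
    """Return the ExecProgram macro text for the given target language.

    Only Windows 3.1 needs the translated section name; on Windows 9X the
    English name works on non-English systems, as described above.
    """
    section = PRINTER_SECTION[language] if windows_3x else "printers"
    return f'ExecProgram("control.exe {section}", 0)'


if __name__ == "__main__":
    for lang in ("fr", "de", "nl"):
        print(lang, printers_macro(lang, windows_3x=True))
```

Keeping the translated names in one place makes it easier to review them against Microsoft's own translations before the Help file is compiled.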

Adobe Acrobat (PDF) Files for Online Manuals

Another very popular form of online documentation uses Adobe Acrobat to create PDF files. PDF is a cross-platform format for distributing electronic documents that preserves graphics, fonts, and page layout on a variety of operating systems. With Acrobat, documents are viewed as the authors originally intended on virtually any computer platform. It is also possible to add functionality to a PDF file, such as hyperlinks and bookmarks, to enhance the user experience.

Adobe Acrobat consists of three programs: Reader, Exchange, and Distiller. Reader allows a user to view, search, and print documents but not to create them. Exchange offers all the features of Reader plus the ability to edit and annotate documents. Distiller allows authors to produce Acrobat documents as Portable Document Format (PDF) files. A PDF file is created from a PostScript printer file originally generated in some other application (such as Word, PageMaker, or Quark), so a PDF file can be generated from any program that can produce a PostScript file.

For a translator to be able to work with online documentation using Acrobat, the source files are needed. The source file is translated and then converted into a PDF afterward. The PDF file is then sent to the copy editor and proofreader. Using Acrobat Reader and Exchange, the copy editor and proofreader can edit the translation even if they do not own the source application.

Single Source Considerations

Just as single source content management strategies have affected the documentation development world, so too have they impacted the world of online documentation. Today, it is possible to create content using a desktop publishing software package and then, with third-party software, convert this source content to Web site files and Windows Help files—all from the same source. The localization of single source content requires a hybrid approach, including both documentation localization and engineering localization. Typically, the source content can be localized directly, then the final file formats, whether PDF, HTML, or Help files, are “engineered” to verify compatibility on native operating systems.

Web Sites

Despite the burst of the Internet “bubble” in 2001, the use of Web sites continues to grow steadily. Their importance in establishing business markets, generating sales, and setting up hosted application services is well established. Even within an organization, the use of Web-based technologies for information management is now commonplace.

The Web has provided tremendous technological opportunities for providing timely information to your colleagues and your customers around the world. That, in fact, is the rub: people anywhere in the world can look up your company and product information on your Web site. That information should be available to them in their own language if you rely upon them as customers (or colleagues in the case of intranets).

It is easy to say that Web site information should be provided in your customer’s native language, but there are important issues to consider before deciding to localize your Web content. Web sites, by their very nature, encourage site hosts to update and/or modify the information frequently, since visitors to the Web site expect to see up-to-date information. It is this expectation that makes localization of Web sites a bit more challenging. A change to one Web page on the site requires changes to the same page in all of the languages supported by the site. Clearly, Web site maintenance becomes more complicated with each language supported.

Let’s look at two scenarios: the intranet for an international company and a consumer Web site for a product sold internationally. Before localizing the intranet of a company with international offices, you should consider:

  • How many foreign staff members use your intranet?
  • Do they require text in their native language?
  • Could certain key pages be localized while leaving the bulk of the site in English?

Similarly, the decision to localize your marketing and sales pages, targeted for your specific market, should be carefully evaluated. While localizing the Web site makes your product more visible in a foreign market, you should be sure that you can anticipate a return on the localization investment. As with intranet considerations, it may be possible to localize a subset of your pages to keep costs down while still acknowledging your global market.

Cost concerns aside, it is clear that some level of Web site localization is desirable for many businesses. The following subsections address the localization process for Web sites.

Web Site Localization Process

The complexity of localizing a Web site falls roughly between that of document and software localization. That is, more engineering support is required than is typically needed in document localization, but a bit less than that required for software localization.

Before localization can proceed, the Web site must be evaluated for complexity. Web pages comprise content (text), graphic objects, hyperlinks, and advanced engineering features. Each of these components requires consideration in the localization process. The content may be part of the page construction or dynamically loaded through scripts or a database interface.

Web Text and Graphics

Fortunately, most of the content of a Web page is typically text and graphics. As with the preceding discussion on documentation localization, the same rules apply. It is important to remember that HTML pages have some text that is not immediately apparent, for example:

  • Page titles, which appear at the top of the browser interface.
  • Graphic titles, the ALT attributes that appear when graphics are loading or when users choose not to download the graphics.
  • Hyperlink titles.

Graphic objects on a Web site that contain text are also normally localized. To avoid having to edit the graphics objects (a more complicated process than text editing), text objects should be separated from the graphics objects. Text can always be superimposed on a graphic using absolute positioning for graphics and text under DHTML.
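One way to make sure the less visible text is not missed is to pull it out of each page automatically before translation begins. The following Python sketch uses the standard html.parser module to collect page titles, ALT attributes, and TITLE attributes; the class and function names are hypothetical, and a production tool would need to handle more cases than this.

```python
from html.parser import HTMLParser


class HiddenTextExtractor(HTMLParser):
    """Collect translatable text that is not visible in the page body."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.found = []  # list of (kind, text) pairs, in document order

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "title":
            self.in_title = True
        for name, value in attrs:
            if not value:
                continue
            if name == "alt":
                self.found.append(("alt", value))
            elif name == "title":
                self.found.append(("title attribute", value))

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title and data.strip():
            self.found.append(("page title", data.strip()))


def extract_hidden_text(html_source):
    """Return the (kind, text) pairs found in one HTML page."""
    parser = HiddenTextExtractor()
    parser.feed(html_source)
    parser.close()
    return parser.found


if __name__ == "__main__":
    sample = (
        "<html><head><title>Wool Shirts</title></head><body>"
        '<a href="shirts.html" title="See all shirts">'
        '<img src="shirt.gif" alt="Large wool shirt"></a>'
        "</body></html>"
    )
    for kind, text in extract_hidden_text(sample):
        print(f"{kind}: {text}")
```

The resulting list can be handed to the translator alongside the visible page text so that both are translated in the same pass.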

Hyperlinks

Hyperlinks on a Web page have the potential to take users to regions of your Web site or to Web sites of others that are not localized. It may be necessary to modify these hyperlinks so that alternative sites written in the appropriate language are selected instead. Or, an explanation can be given in the target language stating that these hyperlinks lead to sites written in English.

Advanced Web Features

Many Web sites use features that provide more dynamic Web pages. As the pages become more dynamic, the potential for complications in localization increases. The latest trend in this regard is the interface of Web sites to content management systems that store content in XML “chunks.” These chunks are then displayed on the Web site through templates that control their look and feel. Other technologies, such as scripts and embedded programs (CGI, Perl, Java, and ActiveX controls), provide dynamic functionality to Web site displays and content.

As Web sites become more dynamic in nature, their functionality must be considered during the localization process. Ideally, the same standards for software internationalization are applied to any code or scripts that are included on the Web site (simplifying the localization process). As with software projects, any text strings used in the script must be identified for the localization process. Fortunately, the amount of text in these code modules is normally quite small and therefore localization is straightforward. Still, these modules must be tested on native language operating systems to ensure that they function properly.

The most complex addition to the Web arsenal is the database interface. It is the database that makes content management systems work effectively. Here, much of the page content is stored in a database and then displayed on the Web page as needed. For example, if a user asks to see all of the large wool shirts that you sell, your database would serve up a list on the Web page of all of the large wool shirts you have in inventory. If this list is localized, then an added level of complexity is introduced to the Web page: the database not only must serve up the list of shirts, but the right localized list of shirts, and that list must fit correctly onto the page. This is typically accomplished by designing the database to handle this added level of complexity with a “by language” table structure. Similarly, the Web style sheet is modified to handle the “by language” text expansion requirements so that the localized content looks correct on the screen.
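A “by language” table structure can take several forms. The following Python sketch, using the standard sqlite3 module, shows one possibility: language-neutral product data in one table and translated names in a second table keyed by product and language. The table names, column names, sample rows, and the fallback to English when a translation is missing are all assumptions made for illustration.

```python
import sqlite3


def build_catalog():
    """Create an in-memory catalog with a separate per-language text table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE product (
            id INTEGER PRIMARY KEY,
            size TEXT,
            material TEXT
        );
        CREATE TABLE product_text (
            product_id INTEGER REFERENCES product(id),
            language TEXT,
            name TEXT,
            PRIMARY KEY (product_id, language)
        );
        """
    )
    conn.executemany(
        "INSERT INTO product VALUES (?, ?, ?)",
        [(1, "L", "wool"), (2, "M", "cotton")],
    )
    conn.executemany(
        "INSERT INTO product_text VALUES (?, ?, ?)",
        [
            (1, "en", "Large wool shirt"),
            (1, "fr", "Grande chemise en laine"),
            (2, "en", "Medium cotton shirt"),
        ],
    )
    return conn


def list_products(conn, size, material, language, fallback="en"):
    """Return product names in the requested language.

    Falls back to the fallback language when no translation exists
    (an assumption of this sketch, not a requirement of the approach).
    """
    rows = conn.execute(
        """
        SELECT COALESCE(loc.name, fb.name)
        FROM product AS p
        LEFT JOIN product_text AS loc
               ON loc.product_id = p.id AND loc.language = ?
        LEFT JOIN product_text AS fb
               ON fb.product_id = p.id AND fb.language = ?
        WHERE p.size = ? AND p.material = ?
        ORDER BY p.id
        """,
        (language, fallback, size, material),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    catalog = build_catalog()
    print(list_products(catalog, "L", "wool", "fr"))
```

With this layout, adding a language means adding rows rather than changing the schema, and the query that serves the page stays the same for every language.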

Extended and Double-byte Characters On the Web

Web pages must be able to display the characters of languages from all over the world. The accented characters that are found in Western European languages (French, Spanish, Italian, German, and Portuguese) are relatively easy to display. Most personal computers support the extended ASCII character set required to represent these letters. For HTML, these special characters may be represented either by specific HTML codes or by setting the language encoding for the page. For example, an é is represented in HTML as &eacute; (the letter e with an acute accent). These special codes are generated automatically if you use an HTML generator with a WYSIWYG (what you see is what you get) interface.
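For pages that are not produced by a WYSIWYG generator, the same conversion can be scripted. The following Python sketch uses the standard html.entities table to replace accented characters with their named HTML codes; the function name to_html_entities is hypothetical, and the numeric fallback for characters without a named code is an assumption of this sketch.

```python
from html.entities import codepoint2name


def to_html_entities(text):
    """Replace non-ASCII characters with HTML character references.

    Named codes (such as &eacute;) are used where HTML defines one;
    other characters get a numeric reference. ASCII characters,
    including markup characters, are left untouched.
    """
    pieces = []
    for ch in text:
        code = ord(ch)
        if code <= 127:
            pieces.append(ch)
        elif code in codepoint2name:
            pieces.append(f"&{codepoint2name[code]};")
        else:
            pieces.append(f"&#{code};")
    return "".join(pieces)


if __name__ == "__main__":
    print(to_html_entities("café"))
```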

Localizing Web sites into Eastern and Central European languages, and into double-byte languages, is slightly more complicated because their character sets are completely different from those of the Western European languages. Fortunately, the Internet Explorer and Netscape browsers, version 4 and higher, support language-encoding META tags. As long as the user’s PC has the appropriate language support (available as multi-language support on Windows and Mac platforms), or the native language operating system, the extended character sets appear correctly. Both the Web pages and the browser must be configured to support the desired character set. Fonts must also be installed on the computer to view these double-byte languages. Using the HTML <META> tag element, the character encoding necessary to view a particular page can be set automatically. For example, <META http-equiv="content-type" content="text/html; charset=big5"> indicates that the page is encoded for Traditional Chinese.
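The encoding declared in the META tag has to match the bytes that are actually saved in the file. The following Python sketch shows one way to keep the two in step by generating both from a single charset value; the function name build_page is hypothetical, and the sketch assumes the charset label (here big5) is also a valid Python codec name.

```python
def build_page(body_text, charset):
    """Return the bytes of a minimal HTML page encoded in the given charset.

    The same charset value is written into the META tag and used to
    encode the page, so the declaration and the bytes cannot disagree.
    """
    meta = (
        f'<META http-equiv="content-type" '
        f'content="text/html; charset={charset}">'
    )
    page = f"<HTML><HEAD>{meta}</HEAD><BODY>{body_text}</BODY></HTML>"
    return page.encode(charset)


if __name__ == "__main__":
    data = build_page("中文", "big5")
    # Each of the two Chinese characters occupies two bytes in Big5.
    print(len("中文".encode("big5")))
    print(data.decode("big5"))
```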

Next Step

With a better understanding of the issues involved in the localization and translation of various forms of online documentation, technical writers are better able to develop source content for the global marketplace. The benefits of exploiting online documentation strategies are just as applicable to the foreign marketplace. For additional information, see the award-winning Guide to Localization and Translation, available from Lingo Systems (www.lingosys.com).

As Vice President of Operations at Lingo Systems, John Watkins is responsible for all aspects of localization production. With over 15 years’ experience in the international software development industry, John has led the expansion of Lingo Systems’ production department to meet our increasing client demand for full-scale localization. John is a recognized expert in international software development and is often invited to present at symposia on the topics of software localization and international e-commerce.

Copyright © 1998-2003 Willamette Valley Chapter. All rights reserved.