
A Bi-Monthly Newsletter
Volume 6, Issue 2, March 2003
Where Online Documentation
and Global Audiences Meet
By John Watkins
Software packages use various forms of online documentation for user
support. At the very least there is normally a “README” or
release notes file. The advent of multimedia has prompted many software
manufacturers to include other forms of documentation with their software;
in fact, some manufacturers have completely eliminated printed materials
and are relying solely on online documentation.
The two most common examples of online documentation are Help files (typically
accessed through the program itself), and online manuals (for user manuals,
installation guides, etc.). A third area of increasing importance is the
use of Web-based content for distributing all online documentation.
Help Files
Help files are the most common form of online documentation. Microsoft
has two standards for Windows Help files: the RTF-based WinHelp and the
HTML-based HTMLHelp. WinHelp has been used since Windows 3.0, while HTMLHelp
was released in August 1997 and is becoming increasingly popular on the
Windows 98, Windows 2000, Windows Me, and Windows XP operating systems.
Most WinHelp files are written using tools like RoboHelp (by eHelp Corporation).
When translating Help files, the localizer needs access to the files
used to build the Help:
- Source content files (.RTF, .DOC, or .HTML), containing the bulk of the
Help file text.
- Project files (used to compile the Help content).
- Bitmaps or other graphics that are used in the Help files (often containing
text that requires translation).
- Other content files (such as the .CNT file used to create the table of
contents for the Help file).
It is also possible to use macros in Help files. These macros should
be examined to see if they work on the native-language operating system. For example,
the following macro opens the Printers control panel:
ExecProgram("control.exe printers", 0)
When used in Windows 9x, nothing needs to be changed to make this work
on non-English operating systems. But for Windows 3.1, the “printers”
part of the macro needs to match Microsoft’s translation of that
section of the control panel:
- French: ExecProgram("control.exe Imprimantes", 0)
- German: ExecProgram("control.exe drucker", 0)
- Dutch: ExecProgram("control.exe printers", 0)
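The per-language substitution can be sketched as a simple lookup. The following Python fragment is only an illustration under stated assumptions: the names `PRINTERS_APPLET_NAME` and `printers_macro` are hypothetical, and the table holds only the three translations listed above.

```python
# Hypothetical lookup of the Windows 3.1 control panel section name per language.
# Only the three translations listed in the article are included.
PRINTERS_APPLET_NAME = {
    "fr": "Imprimantes",
    "de": "drucker",
    "nl": "printers",
}


def printers_macro(language: str, needs_translated_name: bool) -> str:
    """Build the ExecProgram macro text that opens the printers control panel."""
    if needs_translated_name:
        # Windows 3.1 case: a KeyError here signals a language whose
        # translated section name has not been looked up yet.
        applet = PRINTERS_APPLET_NAME[language]
    else:
        # Windows 9x case: the English name works unchanged.
        applet = "printers"
    return f'ExecProgram("control.exe {applet}", 0)'
```

Failing loudly on an unknown language is a deliberate choice in this sketch: a macro that silently keeps the English name would only be caught later, during testing on the target operating system.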
Adobe Acrobat (PDF) Files for Online Manuals
Another very popular form of online documentation uses Adobe Acrobat
to create PDF files. Acrobat is a cross-platform electronic documentation
distribution format, providing 100% graphic, font, and page layout fidelity
on a variety of operating systems. With Acrobat, documents are viewed
as the authors originally intended on virtually any computer platform.
It is also possible to add functionality to a PDF file, providing hyperlinks,
bookmarks, and the like to enhance the user experience.
Adobe Acrobat consists of three programs: Reader, Exchange, and Distiller.
Reader allows a user to view, search, and print but not to create documents.
Exchange offers the user all the features of Reader plus the ability to
edit and annotate documents. Distiller allows authors to produce Acrobat
documents as Portable Document Format (PDF) files. A PDF file is created
from PostScript printer files originally generated in some other application
(such as Word, PageMaker, or Quark). A PDF file can be generated from
any program that can produce a PostScript file.
To work with Acrobat-based online documentation, a translator needs
the source files. The source file is translated and then converted
into a PDF. The PDF file is then sent to the copy editor and
proofreader. Using Acrobat Reader and Exchange, the copy editor and proofreader
can edit the translation even if they do not own the source application.
Single Source Considerations
Just as single source content management strategies have affected the
documentation development world, so too have they impacted the world of
online documentation. Today, it is possible to create content using a
desktop publishing software package and then, with third-party software,
convert this source content to Web site files and Windows Help files—all
from the same source. The localization of single source content requires
a hybrid approach, including both documentation localization and engineering
localization. Typically, the source content can be localized directly,
then the final file formats, whether PDF, HTML, or Help files, are “engineered”
to verify compatibility on native operating systems.
Web Sites
Despite the bursting of the Internet “bubble” in 2001, the use
of Web sites continues to grow steadily. Their importance in establishing
business markets, generating sales, and setting up hosted application
services is well established. Even within an organization, the use of
Web-based technologies for information management is now commonplace.
The Web has provided tremendous technological opportunities for providing
timely information to your colleagues and your customers around the world.
That, in fact, is the rub: people anywhere in the world can look up your
company and product information on your Web site. That information should
be available to them in their own language if you rely upon them as customers
(or colleagues in the case of intranets).
It is easy to say that Web site information should be provided in your
customer’s native language, but there are important issues to consider
before deciding to localize your Web content. Web sites, by their very
nature, encourage site hosts to update and/or modify the information frequently,
since visitors to the Web site expect to see up-to-date information. It
is this expectation that makes localization of Web sites a bit more challenging.
A change to one Web page on the site requires changes to the same page
in all of the languages supported by the site. Clearly, Web site maintenance
becomes more complicated with each language supported.
Let’s look at two scenarios: the intranet for an international
company and a consumer Web site for a product sold internationally. Before
localizing the intranet of a company with international offices,
you should consider:
- How many foreign staff members use your intranet?
- Do they require text in their native language?
- Could certain key pages be localized while leaving the bulk of the site
in English?
Similarly, the decision to localize your marketing and sales pages, targeted
at a specific market, should be carefully evaluated. While localizing
the Web site makes your product more visible in a foreign market, you
should be sure that you can anticipate a return on the localization investment.
As with intranet considerations, it may be possible to localize a subset
of your pages to keep costs down while still acknowledging your global
market.
Cost concerns aside, it is clear that some level of Web site localization
is desirable for many businesses. The following subsections address the
localization process for Web sites.
Web Site Localization Process
The complexity of localizing a Web site falls roughly between that of
document and software localization. That is, more engineering support
is required than is typically needed in document localization, but a bit
less than that required for software localization.
Before localization can proceed, the Web site must be evaluated for complexity.
Web pages are composed of content (text), graphic objects, hyperlinks,
and advanced engineering features. Each of these components requires consideration
in the localization process. The content may be part of the page construction
or dynamically loaded through scripts or a database interface.
Web Text and Graphics
Fortunately, most of the content of a Web page is typically text and
graphics. The rules from the preceding discussion on documentation localization
apply here as well. It is important to remember that HTML pages have
some text that is not immediately apparent, for example:
- Page titles, which appear at the top of the browser interface.
- Graphic titles, the ALT attributes that appear when graphics are loading
or when users choose not to download the graphics.
- Hyperlink titles.
Graphic objects on a Web site that contain text are also normally localized.
To avoid having to edit the graphics objects (a more complicated process
than text editing), text objects should be separated from the graphics
objects. Text can always be superimposed on a graphic using absolute positioning
for graphics and text under DHTML.
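As a rough illustration of superimposing text on a graphic, the Python sketch below generates an HTML fragment in which the translatable caption sits in an absolutely positioned element over a text-free image. The function name `overlay_caption`, its parameters, and the pixel offsets are assumptions made for the example.

```python
from html import escape


def overlay_caption(image_url: str, caption: str, left_px: int, top_px: int) -> str:
    """Return an HTML fragment that places translatable text over a text-free graphic."""
    safe_caption = escape(caption)
    return (
        # The relatively positioned wrapper is the reference point for the overlay.
        '<div style="position: relative;">\n'
        f'  <img src="{escape(image_url, quote=True)}" alt="{escape(caption, quote=True)}">\n'
        # Only this text changes per language; the image file stays the same.
        f'  <span style="position: absolute; left: {left_px}px; top: {top_px}px;">'
        f"{safe_caption}</span>\n"
        "</div>"
    )
```

With this layout, a translator edits the caption string, which also serves as the ALT text, and the graphic itself never has to be reopened in an image editor.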
Hyperlinks
Hyperlinks on a Web page have the potential to take users to regions
of your Web site or to Web sites of others that are not localized. It
may be necessary to modify these hyperlinks so that alternative sites
written in the appropriate language are selected instead. Or, an explanation
can be given in the target language stating that these hyperlinks lead
to sites written in English.
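The two options for hyperlinks can be sketched as follows. This is a minimal Python example with hypothetical data: the `LOCALIZED_LINKS` and `ENGLISH_ONLY_NOTE` tables, the example.com addresses, and the function `localize_link` are all invented for illustration, and the link text is assumed to be HTML-safe already.

```python
# Hypothetical table of localized alternatives for link targets.
LOCALIZED_LINKS = {
    ("http://www.example.com/support.html", "fr"): "http://www.example.com/fr/support.html",
}

# Hypothetical translated notices warning that the target site is in English.
ENGLISH_ONLY_NOTE = {
    "fr": "(en anglais)",
    "de": "(auf Englisch)",
}


def localize_link(href: str, text: str, language: str) -> str:
    """Point the link at a localized page if one exists, else flag it as English."""
    localized_href = LOCALIZED_LINKS.get((href, language))
    if localized_href is not None:
        return f'<a href="{localized_href}">{text}</a>'
    # No localized target: keep the original link and add a note in the target language.
    return f'<a href="{href}">{text}</a> {ENGLISH_ONLY_NOTE[language]}'
```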
Advanced Web Features
Many Web sites use features that provide more dynamic Web pages. As the
pages become more dynamic, the potential for complications in localization
increases. The latest trend in this regard is to connect Web sites
to content management systems that store content in XML “chunks.”
These chunks are then displayed on the Web site through templates that
control their look and feel. Other technologies, such as programs and
scripts (CGI, Perl, Java, and ActiveX controls), provide dynamic functionality
to Web site displays and content.
As Web sites become more dynamic in nature, their functionality must
be considered during the localization process. Ideally, the same standards
for software internationalization are applied to any code or scripts that
are included on the Web site (simplifying the localization process). As
with software projects, any text strings used in the script must be identified
for the localization process. Fortunately, the amount of text in these
code modules is normally quite small and therefore localization is straightforward.
Still, these modules must be tested on native-language operating systems
to ensure that they function properly.
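One common way to identify the text strings in a script is to move them out of the code into a per-language table. The Python sketch below assumes a hypothetical `MESSAGES` table and `translate` function; the message keys and sample strings are invented for the example.

```python
# Hypothetical string table: all user-visible text lives here, not in the code.
MESSAGES = {
    "en": {"greeting": "Welcome, {name}!", "cart_empty": "Your cart is empty."},
    "de": {"greeting": "Willkommen, {name}!"},
}


def translate(key: str, language: str, **values: str) -> str:
    """Look up a string for the language, falling back to English if it is missing."""
    catalog = MESSAGES.get(language, {})
    template = catalog.get(key, MESSAGES["en"][key])
    # Placeholders keep word order under the translator's control.
    return template.format(**values)
```

For instance, `translate("greeting", "de", name="Anna")` uses the German entry, while `translate("cart_empty", "de")` falls back to the English text because that key has not been translated in this sample table.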
The most complex addition to the Web arsenal is the database interface.
It is the database that makes content management systems work effectively.
Here, much of the page content is stored in a database and then displayed
on the Web page as needed. For example, if a user asks to see all of the
large wool shirts that you sell, your database would serve up a list on
the Web page of all of the large wool shirts you have in inventory. If
this list is localized, then an added level of complexity is introduced
to the Web page: the database not only must serve up the list of shirts,
but the right localized list of shirts, and that list must fit correctly
onto the page. This is typically accomplished by designing the database
to handle this added level of complexity with a “by language”
table structure. Similarly, the Web style sheet is modified to handle
the “by language” text expansion requirements so that the
localized content looks correct on the screen.
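A “by language” table structure might look like the following sketch, which uses Python's built-in sqlite3 module with an in-memory database. The table names, column names, and sample rows are assumptions for illustration and are not taken from any particular content management system.

```python
import sqlite3


def build_catalog() -> sqlite3.Connection:
    """Create a tiny in-memory catalog with language-neutral and per-language tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE product (
            id INTEGER PRIMARY KEY,
            size TEXT NOT NULL,
            material TEXT NOT NULL
        );
        -- One row per product per language holds the translatable text.
        CREATE TABLE product_text (
            product_id INTEGER NOT NULL REFERENCES product(id),
            language TEXT NOT NULL,
            name TEXT NOT NULL,
            PRIMARY KEY (product_id, language)
        );
        """
    )
    conn.executemany(
        "INSERT INTO product VALUES (?, ?, ?)",
        [(1, "L", "wool"), (2, "L", "cotton")],
    )
    conn.executemany(
        "INSERT INTO product_text VALUES (?, ?, ?)",
        [
            (1, "en", "Wool shirt"),
            (1, "fr", "Chemise en laine"),
            (2, "en", "Cotton shirt"),
            (2, "fr", "Chemise en coton"),
        ],
    )
    return conn


def product_names(conn: sqlite3.Connection, size: str, material: str, language: str) -> list[str]:
    """Return the localized names of the products matching the size and material."""
    rows = conn.execute(
        """
        SELECT t.name
        FROM product AS p
        JOIN product_text AS t ON t.product_id = p.id
        WHERE p.size = ? AND p.material = ? AND t.language = ?
        ORDER BY t.name
        """,
        (size, material, language),
    )
    return [name for (name,) in rows]
```

Keeping size and material in the language-neutral table means the search logic is written once, and adding a language only adds rows to `product_text`.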
Extended and Double-byte Characters On the Web
Web pages must be able to display the characters of languages from all
over the world. The accented characters that are found in Western European
languages (French, Spanish, Italian, German, and Portuguese) are relatively
easy to display. Most personal computers support the extended ASCII character
set required to represent these letters. For HTML, these special characters
may be represented either by specific HTML codes or by setting the language
encoding for the page. For example, an é (the letter e with an acute accent)
is represented in HTML as &eacute;. These special codes
are generated automatically if you use an HTML generator with a WYSIWYG
(what you see is what you get) interface.
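For pages that are assembled by a script rather than a WYSIWYG tool, the same conversion can be done in code. The sketch below uses the `codepoint2name` table from Python's standard `html.entities` module; the function name `to_html_entities` is hypothetical, and the sketch handles only non-ASCII characters.

```python
from html.entities import codepoint2name


def to_html_entities(text: str) -> str:
    """Replace non-ASCII characters with named or numeric HTML character references."""
    pieces = []
    for char in text:
        code = ord(char)
        if code < 128:
            # Plain ASCII passes through unchanged.
            pieces.append(char)
        elif code in codepoint2name:
            # Named reference, such as &eacute; for é.
            pieces.append(f"&{codepoint2name[code]};")
        else:
            # No named reference is defined: use the numeric form.
            pieces.append(f"&#{code};")
    return "".join(pieces)
```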
Localizing Web sites into Eastern and Central European languages, and
into double-byte languages, is slightly more complicated as their character
sets are completely different from that of the Western European languages.
Fortunately, the Internet Explorer and Netscape browsers, version
4 and higher, support language-encoding META tags. As long as the user’s
PC has the appropriate language support (available as multi-language support
on Windows and Mac platforms), or the native language operating system,
the extended character sets appear correctly. Both the Web pages and the
browser must be configured to support the desired character set. Fonts
must also be installed on the computer to view these double-byte languages.
Using the HTML <META> element, the character encoding necessary
to view a particular page can be set automatically. For example,
<META http-equiv="content-type" content="text/html; charset=big5">
indicates that the page is encoded for Traditional Chinese.
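The declared encoding must match the bytes actually served. The following Python sketch, with a hypothetical function `build_page`, writes the META declaration and encodes the page with the same character set, so a mismatch is caught when the page is generated rather than in the browser.

```python
def build_page(body_text: str, charset: str) -> bytes:
    """Return an HTML page encoded in `charset`, with a matching META declaration."""
    page = (
        "<html>\n<head>\n"
        f'<META http-equiv="content-type" content="text/html; charset={charset}">\n'
        "</head>\n"
        f"<body>{body_text}</body>\n</html>\n"
    )
    # Raises UnicodeEncodeError if the text cannot be represented in the
    # declared character set.
    return page.encode(charset)
```

Calling `build_page("中文", "big5")` yields bytes that decode correctly only as Big5, which is what the META tag tells the browser to expect.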
Next Step
With a better understanding of the issues involved in the localization
and translation of various forms of online documentation, technical writers
are better able to develop source content for the global marketplace.
The benefits of exploiting online documentation strategies are just as
applicable to the foreign marketplace. For additional information, see
the award-winning Guide to Localization and Translation, available
from Lingo Systems (www.lingosys.com).
As Vice President of Operations at Lingo Systems, John Watkins is
responsible for all aspects of localization production. With over 15 years
of experience in the international software development industry, John has
led the expansion of Lingo Systems’ production department to meet
our increasing client demand for full-scale localization. John is a recognized
expert in international software development and is often invited to present
at symposia on the topics of software localization and international e-commerce.