
Communicating with Text on Screen: Character Sets and Typology
JR Hines, Software Consulting Engineer, Inc., 1125 Country Club, Allen, TX 75002
(JohnRichardHines@Yahoo.com)

Introduction
We communicate information to others with drawings, gestures or words (either spoken
or written). Most intentional communication is done with written words. We use
handwritten notes, short text messages or hasty emails when communicating informally:
spelling, grammar and presentation are not usually important. But when we have important
information to communicate, we take care to present it in formal documents
that make the information easier to understand.
Correct spelling and good grammar are important to your documents. However, proper
appearance is also important. You must use characters your reader recognizes and omit
those he does not recognize. You must also use characters shaped to make reading the
information pleasant. And you must follow the Three Bears Rule, using characters that
are neither too large nor too small.
This document (Communicating with Text on Screen: Character Sets and Typology)
discusses how to improve the appearance of documents, especially documents displayed
on screens.
Useful Terms
The following terms will be used in the rest of this note.

Term                   Definition

Alphabet               Group of related characters. Generally, alphabets are associated
                       with a culture.

Font                   An alphabet with a unique graphical size, weight and style. Size
                       is given either as a percentage or as a number of pixels. Weight
                       is the boldness of the alphabet. Style is normal, italic or
                       oblique (similar to italic).

Monospace alphabet     All characters are the same width. Documents created by a
                       typewriter (or a line printer) use Monospace fonts. Reading
                       Monospace documents is difficult.

Proportional alphabet  Each character has a different but appropriate width: think of
                       a lowercase i next to a lowercase w. Documents created by a print
                       shop (or an ink jet or laser printer) can use either Monospace or
                       Proportional alphabets. Reading Proportional documents is easy.

Typeface               The general appearance of a character set. The typeface is the
                       characteristic that groups characters of a character set together.
                       Fonts that share a common appearance but have different size,
                       weight and/or style have the same typeface.

Typography             Arranging characters, words, lines and groups of lines to make
                       text easier to read. It involves the selection of a typeface and
                       character size, the spacing of characters in words, and the
                       length and spacing of lines.
The Right Characters
The right characters (the alphabet you use) are defined by the culture of your readers.
Three hundred years ago, the printers of each culture determined the set of characters (a
character set) of that culture. A hundred years ago, the manufacturers of typewriters
determined the character set. Fifty years ago, the manufacturers of computers and printers
determined the default character set. Today, many trade groups, countries and groups of
countries set up standards organizations to enforce standardization among computers and
printers.
ASCII, the 7-bit, 128-character set used in computers and printers in the 1970s,
became the default Anglosphere character set. Later, Extended ASCII, the 8-bit,
256-character set used in IBM-compatible personal computers in the 1980s, became the
default in most Anglosphere and Western European countries. Unfortunately, 256 characters
are not enough to meet the needs of the non-Anglosphere, non-Western European world.
Microsoft, Apple and the web community each introduced their own character set:
Microsoft's is called Windows, Apple's is called MacRoman and the web's is called Unicode.
Windows and MacRoman are 8-bit sets that each fill the upper 128 positions differently,
while Unicode, originally a 16-bit design supporting up to 65,536 characters, was built to
serve the world market. (Unfortunately, even the first 256 characters of the three
character sets are not the same.) All three work fine for printed documents. However,
custom code, a browser or other generic viewer displaying a document that uses the
Windows or MacRoman character sets may show funny characters (usually the outline of a
square).
There are only two ways to be sure your document will look the same no matter what your
users use to view it: (1) don't use characters that are not supported by all three
alphabets (easy); or (2) change the default text encoding (hard).
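To see why a character can turn into something funny, here is a minimal Python sketch that decodes one byte under three encodings. It assumes Python's standard codec names cp1252 (the Windows set), mac_roman (MacRoman) and utf-8 (the usual Unicode encoding on the web). The helper name decode_all is made up for this illustration.

```python
# Byte 0x93 is the left "smart" double quote in the Windows (cp1252) set.
SMART_QUOTE_BYTE = b"\x93"


def decode_all(raw):
    """Decode the same bytes under three encodings.

    Bytes that are invalid in an encoding become U+FFFD, the replacement
    character that many viewers draw as a box or a question mark.
    """
    return {
        name: raw.decode(name, errors="replace")
        for name in ("cp1252", "mac_roman", "utf-8")
    }


results = decode_all(SMART_QUOTE_BYTE)
for name, text in results.items():
    # Print the code point so the difference is visible on any terminal.
    print(f"{name:10} -> U+{ord(text):04X}")
```

The same byte yields a different character under each encoding, which is why a viewer that guesses the wrong character set shows the wrong glyph.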
Turn Off Unsupported Characters in Your Word Processor
The default setting for Microsoft Word and other word processors is to insert unsupported
characters that supposedly promote readability. (BTW: Characters like Microsoft's "Smart
Quotes" that have the word "Smart" in front of them are almost always unsupported.)
Turning off the use of so-called smart characters will eliminate most problems. In
Microsoft Word and LibreOffice Writer, you turn off these unsupported characters by
turning off the AutoCorrect/AutoFormat As You Type/Replace As You Type options. A
similar process should work with Mac word processors, but I have not tried it.

Manually Remove Unsupported Characters If Your Document Already Has Them
If you have already created a document with unsupported characters, there is no easy way
to remove them. Change AutoCorrect as discussed above to avoid creating new issues,
then do a global search and replace that swaps smart quotes (both “ and ”), fraction
characters, superscript ordinals and dashes for plain equivalents.
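If the text can be exported as plain text, the search and replace can be scripted. The following is a minimal Python sketch, not a complete tool: the replacement table is my own hypothetical choice of plain stand-ins, and it does not handle superscript ordinals, which have no entry in the table.

```python
# Hypothetical replacement table: each "smart" character maps to a plain
# ASCII stand-in. Extend it to suit your own documents.
REPLACEMENTS = {
    "\u201c": '"',    # left double smart quote
    "\u201d": '"',    # right double smart quote
    "\u2018": "'",    # left single smart quote
    "\u2019": "'",    # right single smart quote (also the smart apostrophe)
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u00bc": "1/4",  # fraction one quarter
    "\u00bd": "1/2",  # fraction one half
    "\u00be": "3/4",  # fraction three quarters
}
TABLE = str.maketrans(REPLACEMENTS)


def plain_text(text):
    """Replace every character in the table with its plain stand-in."""
    return text.translate(TABLE)


def leftovers(text):
    """List characters that are still outside ASCII after replacement."""
    return sorted({ch for ch in plain_text(text) if ord(ch) > 127})


sample = "\u201cHi\u201d \u2014 \u00bd"
print(plain_text(sample))
```

Calling leftovers on the whole document afterwards gives a short list of any remaining characters to fix by hand.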

You Can Change the Default Character Set, Sort Of
You can't really change the character set, at least in Word. But you can change the font.
A few fonts in Windows have the word "Unicode" in their name, and these fonts can display
non-standard characters when you use them.
Character Shapes
Character shape is specified by the character's typeface.
Choosing a Typeface
The typefaces you choose determine the general appearance of characters in your
documents. Generally you choose one typeface for headers and another for text.

Monospace and Proportional Typefaces


Typefaces are either monospace (every character the same width) or proportional (every
character has a different width: think of a lowercase i next to a lowercase w, where the i
does not need to be as wide as the w). Columns of numbers should use a monospace typeface,
but a proportional typeface should be used for every other purpose. Note: The most
commonly used monospace typeface is Courier. The most commonly used proportional
typefaces are Times Roman and Helvetica.
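Columns of numbers are the one case where monospace wins, and a short Python sketch shows why. The function name format_column and the field width of 10 are assumptions made for this illustration.

```python
def format_column(values, width=10):
    """Right-justify each number in a fixed-width field, two decimals."""
    return [f"{value:>{width},.2f}" for value in values]


rows = format_column([3.5, 1250.0, 42.25])
print("\n".join(rows))
```

Every string has the same number of characters, so the decimal points line up only when each character is the same width, that is, in a monospace typeface such as Courier. In a proportional typeface the padding spaces are narrower than the digits and the column drifts.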

Serif and Sans Serif Typefaces


Typefaces are either serif (characters have small lines at the left and right edges of letters
to draw the eye from left to right) or sans serif (characters have smooth left and right
edges that slow down the eye's movement from left to right, which emphasizes words
instead of lines). Traditionally, serif typefaces have been used for blocks of text while
sans serif typefaces have been used for headers. Recent versions of Microsoft Word have
reversed this: a serif typeface is used for headers while a sans serif typeface is used for
blocks of text.
I follow the traditional rule. Note: Courier and Times Roman are serif typefaces;
Helvetica is a sans serif typeface.

Typeface Copyrights
If Solomon were creating documents on a computer, he might have written something like
this in Ecclesiastes: "Of making many typefaces there is no end, and much study wearies
the body." There are many, many typefaces. Many typefaces are attractive and have a
proper place. However, most of the attractive typefaces have been copyrighted, and
software that implements them must be licensed.
The cost of some typeface licenses is quite high, so custom code, browsers and viewers do
not implement them. Instead, they use a typeface mapping system that automatically
replaces unlicensed typefaces with a similar, less expensive typeface they have already
licensed. This keeps you and your readers out of legal problems. Specifying
commonly used typefaces in your documents will usually keep you away from this
problem altogether.
Warning: If you use an uncommon typeface that is mapped to some other typeface,
WYSIWYG (What You See Is What You Get) becomes a hope rather than a fact.
Warning: If the custom code, browser or viewer chosen by your users does not have a
license to use a copyrighted typeface, you and your users may end up the legitimate prey
of some copyright attorney.

Selecting a Specific Typeface


Microsoft Word offers hundreds of typefaces but only five are widely used in web
applications: Courier New, Times New Roman, Arial, Georgia and Verdana.
The most common Microsoft typefaces used by web applications are Courier New (a
traditional monospace serif typeface much like Courier), Times New Roman (a traditional
proportional serif typeface much like Times Roman) and Arial (a traditional proportional
sans serif typeface much like Helvetica). If custom code, a browser or a viewer does not
have licenses for these typefaces, documents using them are still WYSIWYG when the
custom code, browser or viewer maps them to another form of Courier, Times Roman, or
Helvetica.
Note: It's safe to assume that all custom code, browsers and viewers support something
much like Courier New, Times New Roman and Arial.
Less common Microsoft typefaces used by web applications are Georgia (a transitional
serif typeface somewhat like Times Roman) and Verdana (a humanist sans serif typeface
somewhat like Helvetica). If custom code, a browser or a viewer does not have licenses for
these typefaces, documents using them are often not truly WYSIWYG if the browser or viewer
maps them to some form of Times Roman or Helvetica.
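The mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the FALLBACKS table uses the pairings named in this note, and the function resolve_typeface is hypothetical, not how any particular browser or viewer works.

```python
# Hypothetical fallback table built from the pairings named in this note.
FALLBACKS = {
    "Courier New": "Courier",
    "Times New Roman": "Times Roman",
    "Arial": "Helvetica",
    "Georgia": "Times Roman",
    "Verdana": "Helvetica",
}


def resolve_typeface(requested, available):
    """Return (typeface actually used, True if it is the one requested)."""
    if requested in available:
        return requested, True
    substitute = FALLBACKS.get(requested)
    if substitute in available:
        return substitute, False
    raise LookupError(f"no usable mapping for {requested!r}")


# A viewer that has only the classic families installed.
viewer_fonts = {"Courier", "Times Roman", "Helvetica"}
print(resolve_typeface("Georgia", viewer_fonts))
```

When the second value is False, the reader sees a substitute typeface; how closely the result matches your layout depends on how similar the substitute is to the typeface you asked for.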
My suggestion is that you only use Courier New, Times New Roman and Arial if you care
about WYSIWYG.
Choose One or More Fonts from the Typefaces
This note has all its headers in Arial typeface and all its body in Times New Roman
typeface. I use three header fonts: the main headers are Arial 16 Bold; the subheaders are
Arial 13 Bold Italic; and the subsubheaders are Arial 12 Italic. I use a single body font:
Times New Roman 11.
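For documents that end up on the web, the same font choices can be written down as data. The sketch below is one possible way to do it in Python. It assumes the sizes are points, and the Font class, the STYLES names and the css method (which builds a CSS font shorthand value) are all made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Font:
    typeface: str
    size: int            # assumed to be in points
    bold: bool = False
    italic: bool = False

    def css(self):
        """Build a CSS font shorthand value: style, weight, size, family."""
        parts = []
        if self.italic:
            parts.append("italic")
        if self.bold:
            parts.append("bold")
        parts.append(f"{self.size}pt")
        # Family names that contain spaces are quoted.
        if " " in self.typeface:
            parts.append(f'"{self.typeface}"')
        else:
            parts.append(self.typeface)
        return " ".join(parts)


# The four fonts used in this note.
STYLES = {
    "header": Font("Arial", 16, bold=True),
    "subheader": Font("Arial", 13, bold=True, italic=True),
    "subsubheader": Font("Arial", 12, italic=True),
    "body": Font("Times New Roman", 11),
}

for name, font in STYLES.items():
    print(f"{name}: {font.css()}")
```

Keeping the fonts in one table makes it easy to see that the note uses two typefaces and four fonts.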
Note: These are custom settings. The default Heading 1, 2, and 3 were difficult to
distinguish so I made changes.
I use proportional typefaces because proportional typefaces are easier to read than
monospace typefaces. I use a sans serif typeface for my headers because sans serif
typefaces make the reader focus on the words in the headers, so the reader will slow down
and carefully read them. I use a serif typeface for my body text because serif typefaces
make the reader focus on the lines of text in the body, so the reader will quickly read the
body text.
If I had columns of numbers where numbers need to be lined up one under another, I
would need a second body font, probably Courier New 11.
Font Sizes
As I mentioned above, this note uses header fonts between 12 and 16. These sizes are
widely used for headers because most people can easily read them. Larger header fonts
are only useful if you are announcing topics like the end of the world or large tax cuts.
As I mentioned above, this note uses body fonts that are 11. Young writers frequently use
smaller fonts to put more information on each page. An unexpected consequence of
smaller fonts is that older readers stop reading. If your target market is those over forty, I
recommend body fonts that are 12 instead of 11. If your target market is those over sixty, I
recommend body fonts that are 14 instead of 12.
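The size recommendations above reduce to a small rule. Here is a minimal Python sketch of it; the function name recommended_body_size is my own, and treating "over forty" and "over sixty" as strict comparisons is an assumption.

```python
def recommended_body_size(reader_age):
    """Return the body font size this note suggests for a reader's age."""
    if reader_age > 60:
        return 14
    if reader_age > 40:
        return 12
    return 11


for age in (25, 45, 70):
    print(age, recommended_body_size(age))
```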
