Beyond Pangrams?

macaroon
7.Feb.2005 2.34am

Whilst the Quick Brown Fox... is good for testing individual glyphs, what do people use to test their fonts extensively, beyond the scope of a single pangram sentence, say with a large body of text? Whilst I could cut 'n' paste any old text, could anyone recommend a specific string to check a newly created font?



Nick Shinn
7.Feb.2005 3.31am

I try it out at random on various brochures, magazines, ads, etc., that I've worked on and have Quark/InDesign files for. Even Word! Recipes are good.

I've also written my own text, to include the more infrequently used glyphs and marks. Caesar salad.

"five truffles offer sufficient flavo(u)r" has all the f and f-ligature glyphs. Can't remember where I found that.

Here is something I invented:
537+489=1026

...by trial and error. Luc Devroye subsequently calculated it mathematically -- there are over a hundred of these equations.
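
For anyone who wants to hunt for more sums like this, here is a minimal Python sketch. It assumes the same shape as the example (three digits plus three digits giving four digits) and treats a+b and b+a as one equation, so its count need not match Luc Devroye's figure. The function name `pandigital_sums` is made up for illustration.

```python
def pandigital_sums():
    """Find a + b = c where a, b, c together use each digit 0-9 exactly once.

    Assumes a and b have three digits and c has four, with a < b so that
    swapped operands are not listed twice.
    """
    results = []
    for a in range(100, 1000):
        for b in range(a + 1, 1000):
            c = a + b
            if c < 1000:
                continue  # the sum must have four digits
            digits = f"{a}{b}{c}"
            # ten characters with no repeats means every digit appears once
            if len(set(digits)) == 10:
                results.append((a, b, c))
    return results


sums = pandigital_sums()
print(len(sums), "equations found, for example:", sums[0])
```

The tuple `(489, 537, 1026)` is among the results, which is the sum above with the operands in ascending order.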


smarks
7.Feb.2005 10.31pm

There's always the pseudo-Latin Lorem Ipsum stuff. Go here for an explanation and a web-based text generator. Some applications like InDesign have placeholder text generators built-in.


macaroon
8.Feb.2005 12.55am

That's the ticket. Just the kind of thing that I was after.

@Nick
I take it your mysterious sum is a way of using digits 0-9.


Nick Shinn
8.Feb.2005 3.26am

>I take it your mysterious sum is a way of using digits 0-9.

Right. And like a word pangram, it is not an arbitrary sequence, but has a proper syntax. (Although it really doesn't "mean" anything, but then neither does "Waltz nymph...")


macaroon
8.Feb.2005 4.03am

Ah, that's why I use "Foxy diva Jennifer Lopez wasn't baking my quiche" instead.

All we need now is a text generator that spews out several paragraphs including the 676 odd kerning pairs. Not that you'd ever kern such an odd combination of less commonly used letters, such as "Kzquex". Unless it's a comic book font and Kzquex is leader of the Yvgionz from the Planet Xizql VII!
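
A rough sketch of the most compact version of such a generator, in Python: a single string in which every lowercase pair appears exactly once. It is not paragraphs of readable text, only the shortest possible proof sheet. The construction is the standard de Bruijn sequence for words of length two, built by concatenating Lyndon words; the function name `all_pairs_string` is an invention for this example.

```python
from string import ascii_lowercase


def all_pairs_string(alphabet=ascii_lowercase):
    """Return a string containing every ordered pair of letters exactly once."""
    parts = []
    for i, first in enumerate(alphabet):
        parts.append(first)                # the one-letter Lyndon word
        for second in alphabet[i + 1:]:
            parts.append(first + second)   # two-letter Lyndon words, first < second
    cyclic = "".join(parts)
    # The sequence is cyclic, so repeat the first letter to close the loop.
    return cyclic + cyclic[0]


proof = all_pairs_string()
print(len(proof))   # 677 characters hold all 676 pairs
print(proof[:40])
```

Each pair turns up once, Kzquex-style oddities included, so this suits a quick visual scan rather than judging rhythm in running text.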


Nick Shinn
8.Feb.2005 4.54am

Nonetheless, I never cease to be amazed at the "unusual" character combinations one comes across in everyday words.

"savvy" and "keyword" really test the fit of one's "v" glyphs.

Yv is a common combo in French, as well as amongst the Yvgionz.

Don't forget Steve Yzerman, even though the hockey season is stranded.

Recently, it's been interesting to see which newspaper/magazine faces have kerned "Ts".


grod
8.Feb.2005 7.39am

> 676 odd kerning pairs.
And these would be?


macaroon
8.Feb.2005 9.30am

aa, ab, ac, ad...
ba, bb, bc, bd...
Continued on to zz.

That's assuming lower case only. That figure would be much larger if you combined kerning pairs of upper and lower case.

I think that's 676 (my maths may be wrong - it has been in the past)
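
The arithmetic can be checked with a few lines of Python. This is only a counting sketch; the variable names are made up for the example.

```python
from itertools import product
from string import ascii_letters, ascii_lowercase

# Every ordered pair of lowercase letters: aa, ab, ac, ... zz
lowercase_pairs = ["".join(pair) for pair in product(ascii_lowercase, repeat=2)]

# The same with upper and lower case combined (52 letters)
mixed_case_pairs = ["".join(pair) for pair in product(ascii_letters, repeat=2)]

print(len(lowercase_pairs))   # prints 676, i.e. 26 * 26
print(len(mixed_case_pairs))  # prints 2704, i.e. 52 * 52
```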


hrant
8.Feb.2005 11.13am

You don't want to worry about all possible combinations!

But certainly sample/test text that includes frequently occurring pairs (and words) would be wonderful. Especially if it's in "normal" English.

I actually did something along these lines once:
http://www.microsoft.com/typography/links/news.aspx?NID=2454
But I realize now that it's neither here nor there: too short to be very useful, too long to be memorable.

hhp


miles
8.Feb.2005 2.22pm

I've come to the conclusion that I may as well check all possible letter combinations, including lower to upper. The combination will occur in use some time or other.


as8
8.Feb.2005 2.27pm


andreas
8.Feb.2005 2.44pm

BTW: It's best to design the lowercase letters in such a way that you don't need any kerning.


Stephen Coles
8.Feb.2005 2.54pm

Miles' conclusion is also Font Bureau's and Bitstream's method.


hrant
8.Feb.2005 3.07pm

Checking all possible combinations* (manually) is one thing, setting up elaborate sample texts to do that is another, and actually putting in kerning for everything yet another.

* Except things that should be totally handled by the default spacing (in fact during the determination of the base spacing), like "no".

Andreas: Spacing your fonts intelligently to reduce kerning is good practice*, but trying to avoid kerning outright is a bit... quixotic! Unless there's a peculiar technical -or user- limitation.

* Except for the rare case where leaning heavily on kerning is required, like for the highest quality Armenian setting.

hhp


macaroon
9.Feb.2005 1.11am

Okay then, what kind of sample text do people use to check spacing or kerning?


alfabet
9.Feb.2005 6.13am

I use this sample text supplied by GarageFonts

http://www.alphabet-design.com/temporary/GFkern.txt


twardoch
9.Feb.2005 8.14am

Is there a reason why you guys are using texts written in just one language?

Adam


Miss Tiffany
9.Feb.2005 8.18am

They were waiting for you, Adam, to mix it up for them.


raph
10.Feb.2005 9.51pm

In case anyone cares, here are the 118 lowercase digraphs (out of 676 total possible) that do not occur in the English language (or at least /usr/share/dict/words distributed with Red Hat Linux 9.0, which in this day and age is basically the same thing):

bq bx bz cf cj cp cv cw cx dx fb fj fp fq fv fx fz gc gq gv gx hj hx hz jb jc jd jf jg jh jj jl jm jn jp jq jr js jt jv jw jx jy jz kq kx kz lx mg mj mx mz pq pv px qa qb qc qd qe qf qg qh qj qk ql qm qn qo qp qq qr qs qt qv qw qx qy qz sx tj tq tx vb vc vf vg vh vj vk vm vn vp vq vt vv vw vx vz wj wq wv ww wx wz xd xj xk xr xs xz yy zf zh zj zn zq zx

It's particularly gratifying to see 'hz' in the list :)
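
A list like this can be rebuilt from any word list with a short Python script. What follows is a sketch, not the script used for the list above: the function names are invented, and it assumes one word per line in a file such as /usr/share/dict/words. Only pairs where both letters are lowercase count, so a capitalised name does not cover its first pair.

```python
from itertools import product
from string import ascii_lowercase


def missing_digraphs(words):
    """Return the lowercase letter pairs that occur in none of the words."""
    seen = set()
    for word in words:
        for left, right in zip(word, word[1:]):
            seen.add(left + right)
    every = {a + b for a, b in product(ascii_lowercase, repeat=2)}
    # Pairs involving capitals or punctuation are in 'seen' but not in
    # 'every', so they drop out of the difference.
    return sorted(every - seen)


def load_words(path="/usr/share/dict/words"):
    """Read one word per line; the default path is an assumption about your system."""
    with open(path, encoding="utf-8", errors="ignore") as handle:
        return [line.strip() for line in handle]


# Small demonstration without touching the file system:
demo = missing_digraphs(["fjord", "halfback", "Zhao"])
print("fj" in demo, "fb" in demo, "zh" in demo)  # prints False False True
```

To run it against a real dictionary, call `missing_digraphs(load_words())`; the result depends entirely on which word list is installed.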


Thomas Phinney
10.Feb.2005 10.17pm

Hmmm. Is "fjord" (fj) not an English word? I find that rather "offputting" (fp). I'm not much of a football fan, but "halfback" (fb) is still a word in English. Finally, I work with somebody named "Zhao" (zh).

The other key consideration is that very few fonts we make will only have to set English. For most fonts, it's very anglo-centric to do kerning and such as if that were the case.

Cheers,

T


raph
10.Feb.2005 11.31pm

Thomas: yeah, that wordlist certainly has some gaps; fj popped out at me too. Good catch on the fp and fb!

Zhao wouldn't have qualified because these are lc kerns only, but there is always Brezhnev.

Part of my point in posting this is to show that the coverage of kern pairs by English words is in fact quite spotty--even just counting lc pairs, you're missing almost a fifth of the possible combinations. Once you add in uppercase, you get four times the number of digraphs that need coverage, and the rising popularity of CamelCase makes it even more likely to actually run into those.

So yeah, I'm going to make sure to cover all possible digraphs when I actually get around to releasing my fonts, not just the ones that seem likely in English.


Nick Shinn
11.Feb.2005 4.02am

I wonder what the word is with qi in it?

The list could probably be reduced to single figures if you merged and purged with a database of all the words -- including proper names -- that occur in a typical English-language newspaper over the course of a year.


Nick Shinn
11.Feb.2005 4.05am

Adam, what are the common characters that appear adjacent to the l-slash? -- and are there characters which never do? This affects the design of the glyph, not just its kerning.


William Berkson
11.Feb.2005 4.38am

Nick, 'qi' is used in pinyin, the system of romanized Chinese. The q is used for the 'ch' sound. Pinyin is used to input Chinese on computers, and is also the PRC's preferred way to transliterate Chinese. FYI, qi (pronounced chee) means air, atmosphere, or spirit, and is also the term for the mysterious energy that is supposed to flow through the body and the landscape. It is a key term in both acupuncture and feng shui, as well as being part of many, many Chinese words.


Nick Shinn
11.Feb.2005 4.53am

Yes, but Raph's list was supposed to be for English, and excluded the Scandinavian "fj", so I was wondering if there is an English word with "qi" in it.


William Berkson
11.Feb.2005 5.30am

Iraqi


twardoch
11.Feb.2005 7.35am

Nick,

the most problematic pairs are lslash-y and lslash-w. I'm working on a more comprehensive overview.

Adam


hrant
11.Feb.2005 10.05am

> Is there a reason why you guys are using
> texts written in just one language?

The same reason you wrote that in that language? :-)
But you already know, I do think we need a lot more.

--

BTW, if you want to find words with a certain digraph, check out R Eckler's "Making the Alphabet Dance", pp 65-68. William's "Iraqi" is better than what Eckler has ("Qiviut") but generally he lists the most frequent word (or at least a highly frequent word) with that pair - quite useful.

hhp


raph
11.Feb.2005 10.32am

The two qi words in /usr/share/dict/words (on my Mac) are Iraqi and qintar.
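
A lookup like that takes very little Python. This is a sketch with a made-up helper name and a hand-typed sample list standing in for the dictionary file; substitute the lines of /usr/share/dict/words to search a real list.

```python
def words_containing(digraph, words):
    """Return the words that contain the given letter pair, in their original order."""
    return [word for word in words if digraph in word]


sample = ["Iraqi", "qintar", "quiche", "fjord"]
print(words_containing("qi", sample))  # prints ['Iraqi', 'qintar']
```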

I'm interested enough in this to make an all-digraph generator. My gut feeling is that making it look something like normal text will make it easier to see spacing problems than, say, a big table.

What should be the goals for the resulting text? Obviously, containing all digraphs is the most important, but what other criteria are there? Should it try to hit real English words when possible? Should letter frequencies be totally uniform, or closer to real distributions?
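
To make those questions concrete, here is one possible starting point in Python. It settles them in one particular way and is not a finished design: it prefers real words, picks greedily whichever word covers the most pairs still missing, and then appends any pair no word supplies as a bare two-letter token. Letter frequencies are ignored. The names `digraphs` and `covering_text` are hypothetical.

```python
from itertools import product
from string import ascii_lowercase


def digraphs(word):
    """Lowercase letter pairs that occur inside a single word."""
    return {a + b for a, b in zip(word, word[1:])
            if a in ascii_lowercase and b in ascii_lowercase}


def covering_text(words):
    """Greedily pick words until every lowercase pair is covered.

    Pairs that no word contains are appended as bare two-letter tokens.
    """
    remaining = {a + b for a, b in product(ascii_lowercase, repeat=2)}
    # Sorting makes ties resolve the same way on every run.
    pool = {word: digraphs(word) for word in sorted(set(words))}
    chosen = []
    while remaining and pool:
        best = max(pool, key=lambda w: len(pool[w] & remaining))
        gain = pool.pop(best) & remaining
        if not gain:
            break  # no word left adds a new pair
        chosen.append(best)
        remaining -= gain
    return " ".join(chosen + sorted(remaining))


text = covering_text(["savvy", "keyword", "fjord"])
print(text[:60])
```

With only three words in the pool, nearly all of the output is bare pairs. A full dictionary would move most of the coverage into real words, leaving roughly the never-occurring pairs as filler.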