Beyond Pangrams?

macaroon's picture

Whilst the Quick Brown Fox... is good for testing individual glyphs, what do people use to test their fonts extensively, beyond the scope of a single pangram sentence, say with a large body of text? Whilst I could cut 'n' paste any old text, could anyone recommend a specific string to check a newly created font?

Nick Shinn's picture

I try it out at random on various brochures, magazines, ads, etc., that I've worked on and have Quark/InDesign files for. Even Word! Recipes are good.

I've also written my own text, to include the more infrequently used glyphs and marks. Caesar salad.

"five truffles offer sufficient flavo(u)r" has all the f and f-ligature glyphs. Can't remember where I found that.

Here is something I invented:
537+489=1026, found by trial and error. Luc Devroye subsequently calculated it mathematically -- there are over a hundred of these equations.
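
For anyone who wants to hunt for more of these, here is a minimal brute-force sketch. It assumes the pattern is "three digits + three digits = four digits" with each of 0-9 used exactly once, which is my reading of the example rather than a stated rule. The function name `pandigital_sums` is made up for illustration, and this is not the method Luc Devroye used.

```python
from itertools import permutations

def pandigital_sums():
    """Yield (a, b, c) with a + b == c and a < b, where a and b have three
    digits, c has four, and the digits 0-9 each appear exactly once."""
    for digits in permutations("0123456789", 6):
        # Skip operands with a leading zero.
        if digits[0] == "0" or digits[3] == "0":
            continue
        a = int("".join(digits[:3]))
        b = int("".join(digits[3:]))
        if a >= b:
            continue  # count a+b and b+a only once
        c = a + b
        # The six operand digits plus the digits of the sum must be 0-9.
        if c >= 1000 and sorted("".join(digits) + str(c)) == list("0123456789"):
            yield a, b, c

solutions = list(pandigital_sums())
print((489, 537, 1026) in solutions)  # the sum above, smaller operand first
```

Because each pair is listed with the smaller number first, 537+489 shows up as 489+537.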

smarks's picture

There's always the pseudo-Latin Lorem Ipsum stuff. Go here for an explanation and a web-based text generator. Some applications like InDesign have placeholder text generators built-in.

macaroon's picture

That's the ticket. Just the kind of thing that I was after.

I take it your mysterious sum is a way of using digits 0-9.

Nick Shinn's picture

>I take it your mysterious sum is a way of using digits 0-9.

Right. And like a word pangram, it is not an arbitrary sequence, but has a proper syntax. (Although it really doesn't "mean" anything, but then neither does "Waltz nymph...")

macaroon's picture

Ah, that's why I use "Foxy diva Jennifer Lopez wasn't baking my quiche" instead.

All we need now is a text generator that spews out several paragraphs including the 676 odd kerning pairs. Not that you'd ever kern such an odd combination of less commonly used letters, such as "Kzquex". Unless it's a comic book font and Kzquex is leader of the Yvgionz from the Planet Xizql VII!

Nick Shinn's picture

Nonetheless, I never cease to be amazed at the "unusual" character combinations one comes across in everyday words.

"savvy" and "keyword" really test the fit of one's "v" glyphs.

Yv is a common combo in French, as well as amongst the Yvgionz.

Don't forget Steve Yzerman, even though the hockey season is stranded.

Recently, it's been interesting to see which newspaper/magazine faces have kerned "Ts".

grod's picture

676 odd kerning pairs.
And these would be?

macaroon's picture

aa, ab, ac, ad...
ba, bb, bc, bd...
Continued on to zz.

That's assuming lower case only. That figure would be much larger if you combined kerning pairs of upper and lower case.

I think that's 676 (my maths may be wrong - it has been in the past)
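
The arithmetic checks out: 26 x 26 = 676 lowercase pairs, and 52 x 52 = 2704 once upper and lower case are mixed. A quick sketch to confirm it, using only the Python standard library (the variable names are just for illustration):

```python
from itertools import product
import string

# Every ordered pair of lowercase letters: aa, ab, ac ... zz
lower_pairs = ["".join(p) for p in product(string.ascii_lowercase, repeat=2)]

# Every ordered pair drawn from a-z and A-Z together
mixed_pairs = ["".join(p) for p in product(string.ascii_letters, repeat=2)]

print(lower_pairs[:4])   # ['aa', 'ab', 'ac', 'ad']
print(len(lower_pairs))  # 676
print(len(mixed_pairs))  # 2704
```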

hrant's picture

You don't want to worry about all possible combinations!

But certainly sample/test text that includes frequently occurring pairs (and words) would be wonderful. Especially if it's in "normal" English.

I actually did something along these lines once:
But I realize now that it's neither here nor there: too short to be very useful, too long to be memorable.


miles's picture

I've come to the conclusion that I may as well check all possible letter combinations, including lower to upper. The combination will occur in use some time or other.

as8's picture

andreas's picture

BTW: It's best to design the lowercase letters in such a way that you don't need any kerning.

hrant's picture

Checking all possible combinations* (manually) is one thing, setting up elaborate sample texts to do that is another, and actually putting in kerning for everything yet another.

* Except things that should be totally handled by the default spacing (in fact during the determination of the base spacing), like "no".

Andreas: Spacing your fonts intelligently to reduce kerning is good practice*, but trying to avoid kerning outright is a bit... quixotic! Unless there's a peculiar technical -or user- limitation.

* Except for the rare case where leaning heavily on kerning is required, like for the highest quality Armenian setting.


macaroon's picture

Okay then, what kind of sample text do people use to check spacing or kerning?

alfabet's picture

I use this sample text supplied by GarageFonts

twardoch's picture

Is there a reason why you guys are using texts written in just one language?


raph's picture

In case anyone cares, here are the 118 lowercase digraphs (out of 676 total possible) that do not occur in the English language (or at least /usr/share/dict/words distributed with Red Hat Linux 9.0, which in this day and age is basically the same thing):

bq bx bz cf cj cp cv cw cx dx fb fj fp fq fv fx fz gc gq gv gx hj hx hz jb jc jd jf jg jh jj jl jm jn jp jq jr js jt jv jw jx jy jz kq kx kz lx mg mj mx mz pq pv px qa qb qc qd qe qf qg qh qj qk ql qm qn qo qp qq qr qs qt qv qw qx qy qz sx tj tq tx vb vc vf vg vh vj vk vm vn vp vq vt vv vw vx vz wj wq wv ww wx wz xd xj xk xr xs xz yy zf zh zj zn zq zx

It's particularly gratifying to see 'hz' in the list :)
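
A list like this can be reproduced roughly as follows. This is a minimal sketch, not the script actually used for the list above: the function name `missing_digraphs` and the tiny sample word list are invented for illustration, and matching is case-sensitive, so only lowercase-to-lowercase pairs count.

```python
import string
from itertools import product

def missing_digraphs(words, alphabet=string.ascii_lowercase):
    """Return the sorted two-letter pairs over `alphabet` that never occur
    as adjacent letters in any of `words` (case-sensitive)."""
    letters = set(alphabet)
    seen = set()
    for word in words:
        # Walk each pair of neighbouring characters in the word.
        for a, b in zip(word, word[1:]):
            if a in letters and b in letters:
                seen.add(a + b)
    return sorted(a + b for a, b in product(alphabet, repeat=2)
                  if a + b not in seen)

# Tiny illustrative sample; a real run would use a full word list.
sample = ["fjord", "offputting", "halfback", "keyword", "savvy"]
missing = missing_digraphs(sample)
print("fj" in missing, "qq" in missing)
```

To run it against a system dictionary, pass something like `open("/usr/share/dict/words").read().split()` as `words`. The result depends entirely on the word list, so expect different counts from different systems.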

Thomas Phinney's picture

Hmmm. Is "fjord" (fj) not an English word? I find that rather "offputting" (fp). I'm not much of a football fan, but "halfback" (fb) is still a word in English. Finally, I work with somebody named "Zhao" (zh).

The other key consideration is that very few fonts we make will only have to set English. For most fonts, it's very anglo-centric to do kerning and such as if that were the case.



raph's picture

Thomas: yeah, that wordlist certainly has some gaps; fj popped out at me too. Good catch on the fp and fb!

Zhao wouldn't have qualified because these are lc kerns only, but there is always Brezhnev.

Part of my point in posting this is to show that the coverage of kern pairs by English words is in fact quite spotty--even just counting lc pairs, you're missing almost a fifth of the possible combinations. Once you add in uppercase, you get four times the number of digraphs that need coverage, and the rising popularity of CamelCase makes it even more likely to actually run into those.

So yeah, I'm going to make sure to cover all possible digraphs when I actually get around to releasing my fonts, not just the ones that seem likely in English.

Nick Shinn's picture

I wonder what the word is with qi in it?

The list could probably be reduced to single figures if you merged and purged with a database of all the words -- including proper names -- that occur in a typical English-language newspaper over the course of a year.

Nick Shinn's picture

Adam, what are the common characters that appear adjacent to the l-slash? -- and are there characters which never do? This affects the design of the glyph, not just its kerning.

William Berkson's picture

Nick, 'qi' is used in pinyin, the system of romanized Chinese. The q is used for the 'ch' sound. Pinyin is used to input Chinese on computers, and is also the PRC's preferred way to transliterate Chinese. FYI, qi (pronounced chee) means air, atmosphere, or spirit, and is also the term for the mysterious energy that is supposed to flow through the body and the landscape. It is a key term in both acupuncture and feng shui, as well as being part of many, many Chinese words.

Nick Shinn's picture

Yes, but Raph's list was supposed to be for English, and excluded the Scandinavian "fj", so I was wondering if there is an English word with "qi" in it.

twardoch's picture


The most problematic pairs are lslash-y and lslash-w. I'm working on a more comprehensive overview.


hrant's picture

> Is there a reason why you guys are using
> texts written in just one language?

The same reason you wrote that in that language? :-)
But you already know, I do think we need a lot more.


BTW, if you want to find words with a certain digraph, check out R Eckler's "Making the Alphabet Dance", pp 65-68. William's "Iraqi" is better than what Eckler has ("Qiviut") but generally he lists the most frequent word (or at least a highly frequent word) with that pair - quite useful.


raph's picture

The two qi words in /usr/share/dict/words (on my Mac) are Iraqi and qintar.

I'm interested enough in this to make an all-digraph generator. My gut feeling is that making it look something like normal text will make it easier to see spacing problems than, say, a big table.

What should be the goals for the resulting text? Obviously, containing all digraphs is the most important, but what other criteria are there? Should it try to hit real English words when possible? Should letter frequencies be totally uniform, or closer to real distributions?
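
As a baseline for the first goal only (every digraph present), here is one possible sketch. It treats each letter as a node and each ordered pair as an edge, then walks an Eulerian circuit with Hierholzer's algorithm, so every digraph appears exactly once. This is a technique swapped in for illustration, not the generator proposed above; the function name `all_digraph_string` is hypothetical, and the output looks nothing like normal text.

```python
import string

def all_digraph_string(alphabet=string.ascii_lowercase):
    """Return a string in which every ordered pair of letters from
    `alphabet` occurs exactly once as adjacent characters."""
    # For each letter, the successors not yet used after it.
    # Reversed so that pop() hands them out in alphabetical order.
    remaining = {a: list(reversed(alphabet)) for a in alphabet}
    stack = [alphabet[0]]
    circuit = []
    while stack:
        node = stack[-1]
        if remaining[node]:
            # Follow an unused edge node -> next letter.
            stack.append(remaining[node].pop())
        else:
            # Dead end: this letter is final in the circuit.
            circuit.append(stack.pop())
    # Hierholzer's algorithm builds the circuit back to front.
    return "".join(reversed(circuit))

text = all_digraph_string()
print(len(text))  # 677: one character per digraph, plus the starting letter
```

Uniform coverage like this is the opposite extreme from real letter frequencies, so it could serve as a skeleton to weigh against word-based text rather than as a finished specimen.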

Stephen Coles's picture

Miles' conclusion is also Font Bureau's and Bitstream's method.

Miss Tiffany's picture

They were waiting for you, Adam, to mix it up for them.
