Kerning the Impossible

ec429's picture

I've been experimenting with a technique I call "deviation kerning", that allows one to kern monospace fonts while retaining the essential vertical alignment property that monospace provides.

I've created a Unix terminal emulator that implements this technique (and also contextual glyph substitution, which I've used chiefly for ligatures) and am currently living with it as my terminal; it hasn't exhibited any eye-melting tendencies yet :)

Details are at http://jttlov.no-ip.org/projects/monokern.htm, source code is at http://github.com/ec429/monokern. The font is based on xterm's default (12x6) font, but I'm also working on a slightly larger (18x8) Humanist font; I've drawn the glyphs but I haven't finished the kerning tables yet.

What do you think of deviation kerning? And is the 18x8 font readable? I'm aware that it's a mish-mash; that's because I tried to differentiate the characters (in code, legibility can be more important than readability). Does the 'e' fit?

AttachmentSize
A demonstration of the ligature substitutions (yes, there are too many)6.27 KB
C code (of the kerning procedure) displayed in termk10.54 KB
The 18x8 font (not kerned yet)20.31 KB
hrant's picture

Very cool idea.

hhp

sim's picture

Which typeface did you use in your sample?

BTW, to be more readable be careful using light blue text on dark blue background ;-)

sgh's picture

This is a cool idea! However, I'm not completely sure I understand how it works. I compiled the code, and ran termk, but I don't quite understand how the spacing is adjusted. Does this work as a combination of kerning and ligatures? For ligatures (assuming ligatures are only made up of 2 characters), how do you decide whether to combine on the left or the right side of a character?

Somewhere you mention that the calculation of spacing is done using dynamic programming and runs in O(n) time. Could you explain it more explicitly?

One thing that I noticed using termk on the command line and in editors is that changing one character can have an effect very far away---for instance, when typing "hhhhhhhhhhhhhhhhhhhi", when the 'i' is typed, all of the 'h's shift right one pixel. For editors, it might be worth considering how to force changes to be local to minimize visual disturbance. I presume this is why many WYSIWYG editors use ragged right, instead of full justification. ;)

Best wishes, Stephen

ec429's picture

sim: the typeface for the samples (except for the 18x8 one) is the xterm 'Default' font, with some modifications (the ligatures, for instance, I drew from scratch).

sgh: The basic procedure is the kerning; the ligatures are done afterwards as contextual glyph substitutions conditional upon spacing. For instance, one of the rules in the 'ligatures' file is "f-i** fi", meaning "an 'i' preceded by an 'f' kerned close, followed by anything, is replaced by the glyph from 'li_fi.pbm'". Some of the ligatures are three-character, and the ligatures file consists of a series of rules whose order in the file determines precedence. For instance, higher up the file is the rule "f-f-i ffi", which uses a different glyph for the middle 'f' of 'ffi' to that which would be used for the 'f' of 'fi' otherwise. When making the ligatures, I have been careful to ensure that any two compatible ligatures have a combined three-character ligature (not always the same as the ligatures that make it up), such as fth. There is one exception, Fth, because the 't' of Ft is smaller than normal; in Fth only the th is ligated. I should add that technically many of the so-called 'ligatures' are really only glyph substitutions, because the characters that comprise them are not joined together.

The algorithm to calculate the spacing makes use of the fact that the only thing that 'matters' to a character spacing-wise is the position of the character before it. So, it works along from left to right, at each point holding a table of the best score for each of the three positions of the last character (-1, 0, and +1 px). Then, for each of the three possible positions of the next character, it works out the new score (so that's 9 combinations, and for each it just adds one kerning-pair-score to one score from the table), and finds which score is best for each next-character-position, and those new best scores go in the table. All of this takes constant time per character, so traversing the whole string should take O(n) time.

When it gets to the end, it just has to pick whichever of the three scores is best, and work backwards along the string reconstructing the deviation of each character. The current implementation does this by attaching the deviation values for the whole string to each of the three table entries, which means those values have to be copied for the table updates - which means the current implementation is actually O(n^2) - but it should be possible to store a kind of 'trellis' that can be traced backwards in O(n) time. I just haven't got around to putting in the effort since it's already fast enough for 80 columns.

The action-at-a-distance effect is difficult to get rid of - or rather, forcing changes to be local would tend to give very poor kerning. For instance, when you start typing your hhhh... it doesn't know whether you're going to follow that with iiii... or mmmm..., so it has no idea whether to give you plenty of space (by pushing all the h left) or plenty of anti-space (by pushing them right). So, you find you've typically only got about half as much 'flexibility' to work with. Certainly, it could be done, but I find that one rapidly gets used to bits of the display 'wobbling' as you type - it's not as distracting as you'd think. (Playing NetHack with it, on the other hand... now /that's/ impossible)

sgh's picture

ec429: Thanks for the explanation! I think I understand now.

I wonder if a similar effect can be achieved by using (lots of) ligatures (or contextual substitutions) in a monospaced font. Then it could be used by standard terminals and editors. The downside of this approach is that it doesn't have the global optimization that you currently are doing with deviation kerning.

(Playing NetHack with it, on the other hand... now /that's/ impossible)

:-)

ec429's picture

sgh: My guess is, without the global optimisation you'd really struggle to get anything good, because most of what you want to do is narrowing: most monospace fonts are designed around the em, with anything narrower being stretched (hence the huge serifs on i and l). Consequently, if you can't stretch other spaces out, you're going to be unable to have a decent kerning of things like i, l, f; they'll simply end up with too much space around them.

As for use in standard terminals and editors, the current method could be implemented in existing font rendering engines (by people who understand how those engines work), though anything that only re-renders part of the line when things change will encounter problems.
If you wanted monokerning in a GUI text editor, you'd probably have to modify the editor to render whole lines at once, in which case you may as well just make it use monokern directly like termk does.

Anyway, if you know how to code, I suggest you try experimenting with ways to force locality and see whether it works; personally, I don't think locality matters that much, since when I'm editing text (or code) I have a lot of regressive eye movement anyway (because I'm constantly re-evaluating what I'm typing).

I suppose it might help if there were some way to smoothly move or fade the text to the new position when it moves, but that would be rather CPU-intensive (and might cause problems with small fonts which would just blur into a grey mush).

Syndicate content Syndicate content