Subsetting a TTF font - How to do it efficiently.

Richard Fink's picture

How do I subset a font down to a custom codepage of glyphs? In other words, if I've got a font with 393 glyphs, and I want to bring it down to a particular subset of 228, how do I delete the glyphs I don't want efficiently? (That is, without saving to a new vfb file and manually deleting them, which is very laborious and prone to error.)

Doesn’t seem to be anything in the FontLab manual about it and I've searched here on this site, too.
I have FontLab, FOG, TTX, OTMaster, FontForge. (Anything else I should acquire for this purpose?)

Is there some kind of export/import procedure that will get me there?

Thanks.

Rich

John Hudson's picture

FontLab doesn't provide a means to do this, other than as you describe (manual deletion). To avoid possible errors in the deletion, you can define a custom encoding, sort glyphs by encoding, and delete any glyph that falls outside the encoding.

Ideally, you want a dedicated subsetting function, that will not only subset the glyph set but will also recompile GSUB and GPOS and other tables.
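For readers looking for such a dedicated function today, here is a minimal sketch that assumes the fontTools library (the package that provides TTX) is installed and that its `subset` module is available. The helper name `codepoints_from_spec`, the function `subset_font` and the file paths are illustrative, not part of any tool mentioned in this thread.

```python
def codepoints_from_spec(spec):
    """Turn a string such as 'U+0020-007E, U+00A0' into a set of integers."""
    codepoints = set()
    for part in spec.replace(",", " ").split():
        part = part.upper()
        if part.startswith("U+"):
            part = part[2:]
        start, _, end = part.partition("-")
        first = int(start, 16)
        last = int(end, 16) if end else first
        codepoints.update(range(first, last + 1))
    return codepoints


def subset_font(in_path, out_path, unicodes):
    """Subset in_path to the given codepoints and write the result to out_path."""
    # Imported lazily so the helper above works without fontTools installed.
    from fontTools import subset

    options = subset.Options()
    # Assumption: keep every layout feature instead of the library's default list.
    options.layout_features = ["*"]
    font = subset.load_font(in_path, options)
    subsetter = subset.Subsetter(options)
    subsetter.populate(unicodes=unicodes)
    subsetter.subset(font)  # prunes glyphs and rewrites GSUB, GPOS and other tables
    subset.save_font(font, out_path, options)


wanted = codepoints_from_spec("U+0020-007E, U+00A0")
# Hypothetical paths, shown for illustration only:
# subset_font("MyFont.ttf", "MyFont-subset.ttf", wanted)
```

Whether the rewritten layout tables come out exactly the way you want is still something to proof by hand.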

Mitternacht's picture

I'm not sure this is exactly what you're looking for, but FontSquirrel's @font-face kit generator lets you create subsets from font files.
FontSquirrel Generator : http://www.fontsquirrel.com/fontface/generator

The tool is designed to create full @font-face kits but you can upload your TTF, choose TTF only as output format and use the “Expert” subset creation tool (which is very handy). It only takes 2 minutes so it won't be a huge waste of time if it's not what you're looking for :) Please note you have to tick the “Yes, the fonts I'm uploading are legally eligible for web embedding” check box even if it's not the case and you don't intend to use it on the Web. Otherwise it won't let you do what you want to do.

Good luck!

Richard Fink's picture

@jh

To avoid possible errors in the deletion, you can define a custom encoding, sort glyphs by encoding, and delete any glyph that falls outside the encoding.

This would not be so bad and had occurred to me. But newbie me seems to be missing a way to spot what lies outside the encoding.
Yes, the glyphs line up with the encoding, but where exactly does it end?
Plus, there are usually some glyphs - spacing chars and such - that I want to preserve.
Not straightforward, unfortunately. But I'll take a second and third look at it.
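One way to take the guesswork out of "where does it end" is to compute the deletion list instead of eyeballing it. The sketch below is plain Python over glyph names and makes no assumption about any font editor's scripting API. The glyph names and the `always_keep` list are made up for illustration.

```python
def glyphs_to_delete(font_glyphs, encoding, always_keep=(".notdef", "space")):
    """Return the glyphs that are neither in the encoding nor explicitly kept."""
    keep = set(encoding) | set(always_keep)
    # Preserve the font's glyph order so the result is easy to check by eye.
    return [name for name in font_glyphs if name not in keep]


# Illustrative data only; a real list would come from the font and the encoding file.
font_glyphs = [".notdef", "space", "A", "B", "Aacute", "uni00A0", "fi"]
encoding = ["A", "B", "fi"]

doomed = glyphs_to_delete(
    font_glyphs,
    encoding,
    always_keep=(".notdef", "space", "uni00A0"),  # spacing glyphs to preserve
)
print(doomed)  # ['Aacute']
```

The spacing characters go in `always_keep`, so they survive even though they sit outside the encoding.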

@mitternacht

Yes. Overall, the Generator does a quite excellent job. And you can customize it with a glyph list, too.
However, in some cases it doesn't reconcile the other tables - like those that John Hudson mentioned - in exactly the way I want. That's why I'm looking for an alternate solution. And just to have it handy, offline, too.

Clearly, fonts as a technology are designed with a bias towards growing larger, not scaling down!

A friend of mine sent me a FontForge script that I'm going to play with today and will probably get working.

If anybody else has any ideas, or scripts, or anything, I'm all ears.

Jack B. Nimblest Jr.'s picture

>...I'm all ears

Subsetting is why I wanted to burn the former Appendix B & C of the WOFF spec.

Under what used to be best practices for WOFF producers, Appendix B said, WOFF makers must make WOFFs that are 100% compatible with the original font, or be outside of the W3C specification. (I'm not sure how this has changed since it was absorbed into other parts of this and other documentation).

Then, under the best practices for WOFF-using user agents, Appendix C said these agents could subset the font, leaving out whatever might not be needed. (And I'm not sure how this has changed since it was absorbed into other parts of this and other documentation.)

The sad thing is, that UA's doing the subsetting on the user's machine is like refining petroleum in each homeowner's boiler. Even stranger, is that the W3C does not seem to appreciate, and RF, you obviously don't, that automatically subsetting a font, with its compressed kerning, glyphs and hints, will require a complete TT interpreter.

I know I'm a complete TT interpreter and can subset any font accurately, but I wonder how many others there are out there.

John Hudson's picture

David: Under what used to be best practices for WOFF producers, Appendix B said, WOFF makers must make WOFFs that are 100% compatible with the original font, or be outside of the W3C specification.

That's an understandable misinterpretation, and the text has been clarified. In this context, 'original font' meant the font that is placed into the WOFF wrapper, which must be losslessly wrappable and unwrappable. A tool that produces WOFF files might perform other functions to a font file, such as subsetting, before wrapping the resulting font data, but for WOFF purposes it is the font that goes into the wrapper that matters, not the font at the beginning of the process. This, as I say, has been clarified, and I believe the term 'original font' is avoided.

Richard Fink's picture

@db
and RF, you obviously don't, that automatically subsetting a font, with its compressed kerning, glyphs and hints, will require a complete TT interpreter.

No, actually I *do* know. Surprise!
There's no way in hell that user agents will be able to sub-set TTF files on the fly.
They could do it badly, I suppose, and break them.

But I don't think that's the intention of the WOFF spec. And I can't see anything like that being implemented or even entertained in the near term.
The browser will process whatever is in the wrapper, and that's that.

Keeping track of @font-face bug reports, browser makers seem to be having enough trouble as it is.

rich

Jack B. Nimblest Jr.'s picture

a>FontForge scripting does this in a snap:

Cool! (Who knew it'd be so simple!) It says in the notes that it's updated to grab the glyphs used by features, which means it's doing or getting an interpretation of those. In previous developments, did subsetting of hinting and kerning get proven in proofs?
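"Grabbing the glyphs used by features" amounts to computing a closure: start from the glyphs you asked for and keep adding any glyph that a substitution rule can produce from glyphs already in the set. The sketch below uses a toy rule table, not real GSUB parsing, and does not describe what the FontForge script actually does. The rule format and glyph names are assumptions for illustration.

```python
def glyph_closure(wanted, substitutions):
    """Expand `wanted` with every glyph reachable through the substitution rules.

    `substitutions` maps a tuple of input glyph names to the glyph they produce.
    """
    closed = set(wanted)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for inputs, output in substitutions.items():
            if output not in closed and all(g in closed for g in inputs):
                closed.add(output)
                changed = True
    return closed


# Toy rules: two ligatures and an alternate that is only reachable via a ligature.
rules = {
    ("f", "i"): "f_i",
    ("f_i",): "f_i.alt",
    ("f", "l"): "f_l",
}
closure = glyph_closure({"f", "i"}, rules)
print(sorted(closure))  # ['f', 'f_i', 'f_i.alt', 'f', 'i'] minus duplicates: see below
```

With these rules the result holds `f`, `i`, `f_i` and `f_i.alt`; `f_l` is left out because `l` was never requested. Kerning and hinting are a separate problem that this sketch does not touch.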

J> ...the text has been clarified...

Much to my delight. Now, is the dsig "in-scope", and if so how?

J>...but for WOFF purposes it is the font that goes into the wrapper that matters...

I can see that point of view, as far as the wrapper-lover should be concerned the feather has landed, mission accomplished!

All that remains now is the actual functioning of multi-dimensional industrial-strength typography amongst this herd of boons.

John Hudson's picture

David: Now, is the dsig "in-scope", and if so how?

The dsig table is preserved in the WOFF-wrapped font data just like all the other tables.

This does mean that if you want a valid dsig in your WOFF'd fonts, then you need to sign or re-sign the font that goes into the wrapper after you have performed any other operations, e.g. subsetting or adding custom data such as serialisation, permissions, etc.

The workflow would be:

1. Input font, i.e. the TTF master from which the specific WOFF'd font file will be derived.

2. Font data processing, manipulation, augmentation, e.g. subsetting, serialisation, etc.

3. Digital (re)signing.

4. WOFF
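The reason signing has to come after every other operation can be shown with a toy model. In the sketch below a SHA-256 digest stands in for a real digital signature, and `process` stands in for subsetting or any other step that changes the font data. Neither is how a DSIG is actually produced; the point is only the ordering.

```python
import hashlib


def sign(font_bytes):
    # Stand-in for a real signature: just a digest of the exact bytes.
    return hashlib.sha256(font_bytes).hexdigest()


def is_valid(font_bytes, signature):
    return sign(font_bytes) == signature


def process(font_bytes):
    # Placeholder for subsetting, serialisation, etc.: anything that changes the data.
    return font_bytes[: len(font_bytes) // 2]


master = b"pretend this is the TTF master"

# Wrong order: sign first, then process. The signature no longer matches.
early_signature = sign(master)
processed = process(master)
signed_too_early = is_valid(processed, early_signature)

# Order from the list above: process first, then sign what goes into the wrapper.
late_signature = sign(processed)
signed_last = is_valid(processed, late_signature)

print(signed_too_early, signed_last)  # False True
```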

All that remains now is the actual functioning of multi-dimensional industrial-strength typography amongst this herd of boons.

Indeed.

Richard Fink's picture

>did subsetting of hinting and kerning get proven in proofs?
Proven in proofs. Sounds like something my Dad would have said.

Is there anything further, father? - No, that can't be right...
Is there anything father, further?

As DSigmund Freud said, "Never judge a cigar by its wrapper."

Thomas Phinney's picture

Kerning and (other) OT layout features are the tough parts for subsetting.

Cheers,

T

Jack B. Nimblest Jr.'s picture

TP>Kerning and (other) OT layout features are the tough parts for subsetting.

Isn't there a biblical story where two mothers are trying to cut a baby in half, and the king comes along and shows them how to write "accredited" dsigs instead, or something?

RF>Sounds like something my Dad would have said.

Mine too. Coincidentally, he worked in the baby-saving business.

>The workflow would be:

A work-flow, maybe — but I think you are taking a tiny bit of an industrial strength work-flow out of context to unintentionally misunderstand the issue.

I think, WOFF needs its own dsig, or to shut up about those of another format.

Richard Fink's picture

>Kerning and (other) OT layout features are the tough parts for subsetting.

Yup.

blokland's picture

Thomas: Kerning and (other) OT layout features are the tough parts for subsetting.

We have put subsetting on the list of planned functionality for (one of) the next release(s) of OTM.

In an FM-based workflow, subsetting is not very complex because the exported characters are defined by a .cha file and –as everybody knows by now– the character set does not have to cover the stuff listed in the linked OT features file (because the rewritten HOT tool removes the obsolete features from the font during generation). Of course, everything can be done in batch via command files.

FEB

blokland's picture

FYI, the FM-based workflow I built over time for DTL looks as follows:

http://www.fonttools.org/downloads/DTL_workflow.pdf

This is the result of building, i.e., stacking elements over time and it is quite possible that it could be made more compact if re-built from scratch, but it works fine and reliably for us.

FEB

Jack B. Nimblest Jr.'s picture

Type design tools can do all the subletting. Why didn't I think of that.

Richard Fink's picture

@blokland
thanks for the info. Will look it over.

John Hudson's picture

David: I think, WOFF needs its own dsig

Can you explain why?

Jack B. Nimblest Jr.'s picture

JH> Can you explain why (you think, WOFF needs its own dsig)?

I finally got to read posts beyond September on the pertinent W3C lists. Whew!

(Lol...What made you all hate so much the TrueType upon which the web was launched and rides today: Bad Acid?;))

To explain why WOFF needs a dsig? I rewrote its FAQ instead. I will say this, the "division" between "web fonts" and "print fonts", is really a W3C & UA construed faux-tech solution to get the foundries on board. The solution goes against the font behavior required by many of the coolest needs of modern publishing.

This is Håkon's apparent, and repeatedly implied direction, and I agree completely. So if you have the time, talk to him about his vision and then what's the big deal!? put a damn dsig in and try to make the WOFF spec ignore the payload's dsig entirely, please. Pretty please :)

Jens Kutilek's picture

David, isn’t the WOFF private data block a possible replacement for a dsig?

Richard Fink's picture

Curious, 'cause I don't know: What's the benefit of a DSIG?
I know that a lot of fonts have them. I know they add some bloat to the font.
If you're looking to cut down on download size, getting rid of the DSIG, if there is one, is step #1.
What is or was its intended purpose?

Jack B. Nimblest Jr.'s picture

JK>David, isn’t the WOFF private data block a possible replacement for a dsig?

That is an excellent point, the WOFF private data block is a possible functional replacement for a dsig. But every founder would then need to write an app to expose it to the user, or write what would amount to a standard dsig in the pdb. A WOFF-specific dsig, however, would open up all sorts of possibilities for expressions of information.

Khaled Hosny's picture

Curious, 'cause I don't know: What's the benefit of a DSIG?

The only use that I have for DSIG is to force Windows to use the OpenType icon for my TTF-flavoured OpenType fonts, and I have been told some other dumb MS app checks for DSIG to decide whether the font is OpenType capable or not. Of course I just add a dummy DSIG table (using FontForge).
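For the curious, a dummy DSIG of this kind is tiny. The sketch below builds only the table's bytes, assuming the header layout given in the OpenType DSIG description (a 32-bit version, a 16-bit signature count and 16-bit flags) with zero signatures. It does not insert the table into a font, which also means updating the table directory and checksums, a job best left to a font tool.

```python
import struct


def dummy_dsig_table():
    """Return the bytes of a DSIG table that contains no signatures."""
    version = 1          # the table version
    num_signatures = 0   # no signature records follow the header
    flags = 0
    # Big-endian: uint32 version, uint16 signature count, uint16 flags.
    return struct.pack(">LHH", version, num_signatures, flags)


table = dummy_dsig_table()
print(len(table), table.hex())  # 8 0000000100000000
```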

John Hudson's picture

A WOFF-specific dsig, however, would open up all sorts of possibilities for expressions of information.

David, what do you imagine that a digital signature is or does?

Jack B. Nimblest Jr.'s picture

Lol, I'll take that as a "No!" a digital signature Is to WOFF as Mini-Me is to Dr. Evil? What it does is a smaller version of nothing?
