The "name" OpenType table for the "Cambria Math" font has no encodingID = 10 entries for the Windows platform. Is this a bug?

Belloc's picture

In the figure below, you'll see a screenshot of two windows showing the OpenType "name" tables for the "MS Mincho" and "Cambria Math" Microsoft fonts.

If you look at platformID = 3 (Windows) in both results, you'll see that the "MS Mincho" font correctly shows names for encodingID = 1 (Unicode BMP, UCS-2) and encodingID = 10 (Unicode UCS-4). The "Cambria Math" font, however, only shows names for encodingID = 1 and none for encodingID = 10. I know for sure the "Cambria Math" font has both encodings. Is this a bug?
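For anyone who wants to reproduce this kind of listing, here is a minimal Python sketch of reading the records of a format 0 "name" table and collecting the encodingIDs that occur for platformID 3. It is not the tool used for the screenshots. The function names and the `build_name_table` helper are hypothetical, the sample data is synthetic rather than read from either font, and locating the table inside a real font file (via the table directory) is left out.

```python
import struct

def parse_name_records(table: bytes):
    """Return (platformID, encodingID, languageID, nameID, raw bytes) tuples."""
    # Header of a format 0 table: format, record count, offset to string storage.
    _format, count, string_offset = struct.unpack_from(">HHH", table, 0)
    records = []
    for i in range(count):
        # Each NameRecord is six big-endian uint16 values (12 bytes).
        pid, eid, lang, nid, length, offset = struct.unpack_from(
            ">6H", table, 6 + 12 * i
        )
        start = string_offset + offset
        records.append((pid, eid, lang, nid, table[start:start + length]))
    return records

def windows_encoding_ids(records):
    """Sorted encodingIDs found in records with platformID 3 (Windows)."""
    return sorted({rec[1] for rec in records if rec[0] == 3})

def build_name_table(entries):
    """Hypothetical helper: pack (pid, eid, lang, nid, text) into a format 0 table."""
    strings = b""
    recs = b""
    for pid, eid, lang, nid, text in entries:
        data = text.encode("utf-16-be")
        recs += struct.pack(">6H", pid, eid, lang, nid, len(data), len(strings))
        strings += data
    header = struct.pack(">HHH", 0, len(entries), 6 + len(recs))
    return header + recs + strings

# Synthetic tables for illustration only.
only_1 = build_name_table([(3, 1, 0x409, 4, "Cambria Math")])
both = build_name_table([(3, 1, 0x409, 4, "Sample"), (3, 10, 0x409, 4, "Sample")])
print(windows_encoding_ids(parse_name_records(only_1)))  # [1]
print(windows_encoding_ids(parse_name_records(both)))    # [1, 10]
```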

Theunis de Jong's picture

How do you know "for sure" Cambria Math has Encoding 10 names? I can't see those in the version I have.

Perhaps it's not necessary for a font to contain encoding 10 names even if it does contain an encoding 10 subtable in the cmap. About the cmap, the specs say:

... If the font is meant to support supplementary Unicode characters, it will additionally need a Format 12 subtable with a platform encoding ID 10. ...

... and these supplementary characters are typical for a CJK font but not for a math font. Hold on, I'm going to check what cmap encodings are in Cambria Math.

(Edit) Okay, Cambria Math also contains a 3/10 encoding, so that's not the reason.
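A check like this one can be sketched in a few lines of Python: the "cmap" table starts with a version and a count, followed by encoding records (platformID, encodingID, 32-bit offset), and each subtable begins with its format number. The function name is hypothetical and the sample bytes are synthetic stubs that contain only the format fields, not data from Cambria Math.

```python
import struct

def cmap_encoding_records(cmap: bytes):
    """Return (platformID, encodingID, subtable format) for each encoding record."""
    _version, num_tables = struct.unpack_from(">HH", cmap, 0)
    result = []
    for i in range(num_tables):
        # Encoding record: uint16 platformID, uint16 encodingID, uint32 offset.
        pid, eid, offset = struct.unpack_from(">HHI", cmap, 4 + 8 * i)
        # The first uint16 of every subtable is its format number.
        (fmt,) = struct.unpack_from(">H", cmap, offset)
        result.append((pid, eid, fmt))
    return result

# Synthetic cmap: a 3/1 record pointing at a format 4 stub and a 3/10 record
# pointing at a format 12 stub. Real subtables carry much more data.
sample = (
    struct.pack(">HH", 0, 2)
    + struct.pack(">HHI", 3, 1, 20)
    + struct.pack(">HHI", 3, 10, 22)
    + struct.pack(">H", 4)
    + struct.pack(">H", 12)
)
print(cmap_encoding_records(sample))  # [(3, 1, 4), (3, 10, 12)]
```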

Well, I cannot recall anything in the specs that says every cmap encoding needs to have a corresponding name entry.

John Hudson's picture

No, this isn't a bug. My understanding is that the font would only require an encodingID 10 name table if one or more name entries required use of supplementary plane Unicode characters and hence a format 12 subtable. The different encoding options for the name table cover the encoding needs of the name table entries, which are independent of the cmap subtables that cover the encoding needs of text set in the font. So in the case of a font like Cambria Math, the character set of the font includes supplementary plane characters (hence, cmap format 12 subtable (it also has a format 14 subtable, the first I've had to make)) but the name table entries do not make use of any of these characters.
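To illustrate the distinction between what the name strings need and what the cmap covers, here is a small Python sketch that tests whether a given string contains any code point above U+FFFF. The function names and sample strings are illustrative assumptions; in particular, the CJK sample is just a pair of BMP ideographs and says nothing about what the encodingID 10 entries in MS Mincho actually contain.

```python
def has_supplementary_chars(text: str) -> bool:
    """True if any character lies outside the Basic Multilingual Plane."""
    return any(ord(ch) > 0xFFFF for ch in text)

def utf16be_units(text: str) -> int:
    """Number of 16-bit code units needed to store the text as UTF-16BE."""
    return len(text.encode("utf-16-be")) // 2

samples = {
    "latin": "Cambria Math",        # plain Latin, BMP only
    "cjk_bmp": "\u660e\u671d",      # two CJK ideographs, still BMP only
    "math_bold_a": "\U0001D400",    # MATHEMATICAL BOLD CAPITAL A, plane 1
}
for label, text in samples.items():
    print(label, has_supplementary_chars(text), len(text), utf16be_units(text))
```

Only the last sample needs a surrogate pair (two code units for one character); the first two are representable in plain UCS-2, which is the situation described for the Cambria Math name entries.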

Or am I misunderstanding your question?

PS. What tool are you using to examine the table entries? Are you able to look within a TTC, or are you breaking the Cambria TTC apart somehow?

Theunis de Jong's picture

>> The different encoding options for the name table cover the encoding needs of the name table entries, which are independent of the cmap subtables that cover the encoding needs of text set in the font. <<

John, does that mean that the Encoding 10 name entry in MS Mincho actually contains these supplementary characters?

>> PS. What tool are you using to examine the table entries? Are you able to look within a TTC, or are you breaking the Cambria TTC apart somehow? <<

I don't recognize the output format of Belloc's tool. Me, I used my own homegrown tool -- I needed it for Something Completely Different ;-)

Belloc's picture

John Hudson and Theunis de Jong,

First let me thank both of you for your replies.

@John Hudson

>> No, this isn't a bug. My understanding is that the font would only require an encodingID 10 name table if one or more name entries required use of supplementary plane Unicode characters and hence a format 12 subtable. The different encoding options for the name table cover the encoding needs of the name table entries, which are independent of the cmap subtables that cover the encoding needs of text set in the font. <<

I have to disagree with you on this, John. Take for example the fonts "MT Extra" and "Symbol". These are symbol fonts, as you probably know, so none of their "characters" will ever be present in the name table. Nevertheless, encodingID = 0 is the only one present in this table for both fonts, for all platforms. Also, if you consider the fonts that have Unicode code points above 0xFFFF, all of them, as far as I know, have encodingID = 10 present in their name tables for the Windows platform, with the exception of the "Cambria Math" font. My feeling is that this is a bug, although a minor one, as you really don't need this information in the name table.

>> PS. What tool are you using to examine the table entries? Are you able to look within a TTC, or are you breaking the Cambria TTC apart somehow? <<

I have developed a small C++ program to parse this name table for any font specified by the user.

@Theunis de Jong

>> John, does that mean that the Encoding 10 name entry in MS Mincho actually contains these supplementary characters? <<

If you look at the names for the MS Mincho font in the screenshot above, you'll notice that some characters were printed as invalid. I presume those characters correspond to this encoding in the MS Mincho font. But there is one thing here I don't understand: I used the Arial font to print the reports shown in those windows. As far as I understand, the TextOut() function that I used to print those lines should print the correct Japanese characters, just by using the font fallback mechanism present in the Windows API. But apparently that didn't happen! Why?

John Hudson's picture

>> Also, if you consider the fonts that have Unicode code points above 0xFFFF, all of them, as far as I know, have encodingID = 10 present in their name tables for the Windows platform, with the exception of the "Cambria Math" font. My feeling is that this is a bug, although a minor one, as you really don't need this information in the name table. <<

I classify bugs as things that are contrary to the spec and/or produce unwanted behaviour. To my knowledge, neither is the case here. If other fonts containing supplementary plane coverage in their cmap tables do also provide encodingID 10 name table entries, I'm much more inclined to see this as unnecessary bytes in those fonts than as a bug in Cambria Math.

Belloc's picture

You may be right. After thinking a little more about what you said ("My understanding is that the font would only require an encodingID 10 name table if one or more name entries required use of supplementary plane Unicode characters and hence a format 12 subtable"), I have to agree that this is more reasonable. But then, how do you justify the encodingID = 0 for the MT Extra and Symbol fonts?

There is one other thing that surprised me about "msmincho.ttc", the TTC font file for the MS Mincho font. The TTC header in this file is as follows:

TTC Tag          ttcf
version          0001 0000
numFonts         0002
OffsetTables[0]  0000 0020
OffsetTables[1]  0000 018C
ulDsigTag        DSIG
ulDsigLength     0000 1B98
ulDsigOffset     0099 59FC

In the OpenType spec you'll find this statement: "There are two versions of the TTC Header: Version 1.0 has been used for TTC files without digital signatures. Version 2.0 can be used for TTC files with or without digital signatures -- if there's no signature, then the last three fields of the version 2.0 header are left null."

Shouldn't the version number for this font collection then be 0002 0000, instead of 0001 0000?
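The layout in question can be sketched in Python as follows. The sample bytes are reconstructed from the dump above, assuming numFonts is stored as a 32-bit value as the spec defines it; the function name and the "trailing" key are hypothetical. The sketch reads the three DSIG fields speculatively, whatever the version says, so that a 1.0 header which still carries them can be spotted.

```python
import struct

def parse_ttc_header(data: bytes):
    """Parse the fixed part of a TTC header plus any 12 bytes that follow it."""
    tag, major, minor, num_fonts = struct.unpack_from(">4sHHI", data, 0)
    if tag != b"ttcf":
        raise ValueError("not a TTC header")
    offsets = list(struct.unpack_from(f">{num_fonts}I", data, 12))
    header = {"version": (major, minor), "numFonts": num_fonts, "offsets": offsets}
    end = 12 + 4 * num_fonts
    # Only a version 2.0 header defines the DSIG tag, length and offset here.
    # They are read regardless of version, purely for inspection.
    if len(data) >= end + 12:
        header["trailing"] = struct.unpack_from(">4sII", data, end)
    return header

# Reconstructed from the dump in this post (assumption: numFonts is uint32).
SAMPLE_HEADER = (
    b"ttcf"
    + struct.pack(">HHI", 1, 0, 2)
    + struct.pack(">II", 0x20, 0x18C)
    + b"DSIG"
    + struct.pack(">II", 0x1B98, 0x9959FC)
)
info = parse_ttc_header(SAMPLE_HEADER)
print(info["version"], [hex(o) for o in info["offsets"]], info["trailing"])
print(hex(len(SAMPLE_HEADER)))
```

With these values the header, including the three DSIG fields, is 0x20 bytes long, which is exactly where the first offset table starts. That is at least consistent with the file carrying the version 2.0 fields under a 1.0 version number, though the sketch cannot say why it was built that way.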

John Hudson's picture

>> But then, how do you justify the encodingID = 0 for the MT Extra and Symbol fonts? <<

I don't justify anything that was ever done in a symbol font made by anyone. It looks, based on your description of these and other fonts, as if either Monotype or Microsoft may at one time have had a policy to include a name table encodingID to match cmap coverage, regardless of whether this was actually needed in the name table entries.
