Modifying Apple's Myriad Pro

acm's picture

Hello,

I'm planning to work on my website design, and I'm going to use a dynamic text replacement technique that automatically replaces the headlines with images. That way I can use fonts other than the usual Arial, Verdana, Georgia, Courier and Times New Roman and, so far, I'm going for the Myriad Pro available in Mac OS X.

However, there's a slight problem: Romanian, the language I use on my website, has some characters not available in the default Myriad Pro like Ş and Ţ and I thought I could try and add those by hand.

Am I allowed to do that? And am I allowed to upload a font from the operating system to the server so I can use it to generate images with it?

Thank you.

Michel Boyer's picture

I tried other things and here is what I found. The characters whose glyphs are


are in 0x0162 and 0x021A for the majuscule, 0x0163 and 0x021B for the minuscule, and here is what they are named by Adobe in Myriad Pro:

     uni0162 : Tcommaaccent
     uni0163 : tcommaaccent
     uni021A : uni021A
     uni021B : uni021B

It is the characters in 0x0162 and 0x0163 that are not recognized by the Mac. If we rename the above four characters as follows:

     uni0162 : Tcedilla
     uni0163 : tcedilla
     uni021A : Tcommaaccent
     uni021B : tcommaaccent

then all the characters in the resulting font are recognized by the Mac, be they called *commaaccent or *cedilla. So it is not the names that are causing a problem but what they are naming. Is Tcommaaccent uni0162 or is it uni021A? As pointed out by Adam, in a quite different style, one way to lift the disagreement is not to use the names Tcommaaccent and tcommaaccent and use uni0162 and uni0163, on which everyone agrees.
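As a minimal sketch of that naming scheme in Python: the helper `uni_name` and the table `RENAMED` are hypothetical names introduced only for illustration. The sketch assumes the usual convention that a uniXXXX glyph name is "uni" followed by four uppercase hexadecimal digits, and it only prints names; it does not touch a font file.

```python
def uni_name(code_point: int) -> str:
    """Return the 'uniXXXX' glyph name for a code point in U+0000..U+FFFF."""
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("uniXXXX names are only formed here for U+0000..U+FFFF")
    return f"uni{code_point:04X}"


# The renaming described above: code point -> descriptive glyph name
# that was recognized on the Mac in this experiment.
RENAMED = {
    0x0162: "Tcedilla",
    0x0163: "tcedilla",
    0x021A: "Tcommaaccent",
    0x021B: "tcommaaccent",
}

for code_point, descriptive in RENAMED.items():
    # Same layout as the lists above: "uniXXXX : name"
    print(f"     {uni_name(code_point)} : {descriptive}")
```

The uniXXXX form is derived purely from the code point, which is why it sidesteps the question of what Tcommaaccent is supposed to name.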

Michel

PS. Notice that if we execute the command

curl -s http://www.unicode.org/Public/UNIDATA/NamesList.txt | egrep '^0162|^0163|^021A|^021B'

to get the names in Unicode's NamesList we get this:

    0162 LATIN CAPITAL LETTER T WITH CEDILLA *
    0163 LATIN SMALL LETTER T WITH CEDILLA *
    021A LATIN CAPITAL LETTER T WITH COMMA BELOW *
    021B LATIN SMALL LETTER T WITH COMMA BELOW *

and those are the names displayed by the Macintosh character palette.
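The same lookup can be sketched offline with Python's standard `unicodedata` module. This assumes the Unicode database bundled with the interpreter carries the same character names as NamesList.txt for these four code points; the function name `unicode_names` is only illustrative.

```python
import unicodedata


def unicode_names(code_points):
    """Return 'XXXX NAME' lines for the given code points."""
    return [f"{cp:04X} {unicodedata.name(chr(cp))}" for cp in code_points]


for line in unicode_names([0x0162, 0x0163, 0x021A, 0x021B]):
    print(line)
```

Unlike the egrep on NamesList.txt, this prints only the formal character names, without the annotations that file carries.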

Michel Boyer's picture

I must add that I fail to understand by what mechanism such a disagreement would cause characters not to be accessible. If you have any idea, please tell me.

Michel

Michel Boyer's picture

And I am still more puzzled when, after a curl -s and a join, I get the following comparative table of Adobe names and the names in Unicode's NamesList.txt file:

  0122  Gcommaaccent  LATIN CAPITAL LETTER G WITH CEDILLA
  0123  gcommaaccent  LATIN SMALL LETTER G WITH CEDILLA
  0136  Kcommaaccent  LATIN CAPITAL LETTER K WITH CEDILLA
  0137  kcommaaccent  LATIN SMALL LETTER K WITH CEDILLA
  013B  Lcommaaccent  LATIN CAPITAL LETTER L WITH CEDILLA
  013C  lcommaaccent  LATIN SMALL LETTER L WITH CEDILLA
  0145  Ncommaaccent  LATIN CAPITAL LETTER N WITH CEDILLA
  0146  ncommaaccent  LATIN SMALL LETTER N WITH CEDILLA
  0156  Rcommaaccent  LATIN CAPITAL LETTER R WITH CEDILLA
  0157  rcommaaccent  LATIN SMALL LETTER R WITH CEDILLA
  015E  Scedilla      LATIN CAPITAL LETTER S WITH CEDILLA *
  015F  scedilla      LATIN SMALL LETTER S WITH CEDILLA *
  0162  Tcommaaccent  LATIN CAPITAL LETTER T WITH CEDILLA *
  0163  tcommaaccent  LATIN SMALL LETTER T WITH CEDILLA *
  0218  Scommaaccent  LATIN CAPITAL LETTER S WITH COMMA BELOW *
  0219  scommaaccent  LATIN SMALL LETTER S WITH COMMA BELOW *
  021A  uni021A       LATIN CAPITAL LETTER T WITH COMMA BELOW *
  021B  uni021B       LATIN SMALL LETTER T WITH COMMA BELOW *

If the other "commaaccent" names are recognized, why not [T/t]commaaccent as well?

Michel

k.l.'s picture

Don't spend too much thought on this. Since it is a Mac OS bug, the question is not 'why?' but 'when will it be fixed?'  :)  Follow Adam's and John's advice as regards glyph naming and the locl feature, and the font will work fine. At least in ≥ 10.4.

Michel Boyer's picture

Here is something else that does not quite fit the world I am used to. It comes from this chart of Unicode's standard:


The name mentions a cedilla, and the definition in BNF style says that it is built with a G and a character 0327 which is indeed a cedilla, yet the Gcedilla they display is with a comma. I must confess that I don't like this. This obviously confirms Miguel's comment [edit] about my too strict interpretation of what looked to me like a BNF definition. The same holds for g, K, k, L, l, N, n, R, and r. Only S, s, T and t are shown with a cedilla in that chart.
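The "built with" line in the chart corresponds to the character's canonical decomposition, which can be inspected with Python's standard `unicodedata` module. This is only a sketch: the function name `decomposition_parts` is illustrative, and it assumes the interpreter's bundled Unicode data matches the chart for these characters.

```python
import unicodedata


def decomposition_parts(code_point):
    """Return the names of the characters in the canonical decomposition."""
    fields = unicodedata.decomposition(chr(code_point)).split()
    return [unicodedata.name(chr(int(field, 16))) for field in fields]


# The "with cedilla" letters decompose with U+0327 COMBINING CEDILLA,
# the "with comma below" letters with U+0326 COMBINING COMMA BELOW.
for cp in (0x0122, 0x0162, 0x0218, 0x021A):
    print(f"{cp:04X}", unicodedata.decomposition(chr(cp)), decomposition_parts(cp))
```

The decomposition says nothing about how the mark is drawn, which is why the chart can show a comma for a character defined with 0327.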

Michel

Michel Boyer's picture

> Don’t spend too much thought on this.

I am learning and I have no need to follow advice because I am not a developer. My problem is with Unicode's definition itself and is probably somewhat "Academic" for the time being.

Michel

Michel Boyer's picture

> Don’t spend too much thought on this. (again)

I am spending too much time indeed, but it is quite fascinating. For instance, I have a Teach Yourself Romanian that dates back to 1970. People did not have computers at home back then. I don't know when fonts started to be digitized. Well, in that book all the t "cedilla" have a comma below. As for the s, on the very same page, very close to one another, I can see one with a comma, one with a hook and one with the cedilla of Times New Roman.

John Hudson's picture

The name mentions a cedilla, and the definition in BNF style says that it is built with a G and a character 0327 which is indeed a cedilla, yet the Gcedilla they display is with a comma. I must confess that I don’t like this.

Go back and read my long post again. All the 'with cedilla' characters in Unicode except C/c cedilla and S/s cedilla are properly displayed with the comma accent form in the European orthographies that use these characters. The unification of cedilla and comma accent under the name cedilla was an early error in Unicode, and one which for stability reasons they chose not to correct except in the case of the S/s and T/t comma accent for Romanian (and given the massive confusion and conflicting text encodings that that correction has produced, one can see why they would avoid throwing the Baltic languages that use the other 'cedilla' characters into the same mess). It was a mistake to conflate these two diacritic marks and a mistake to call them 'with cedilla' in the formal names, but it is an old mistake and one that we have to live with.

Michel Boyer's picture

> it is an old mistake and one that we have to live with.

Thanks for clarifying. Are there other instances in the Unicode "specification" that require such an exegesis?

Michel

Michel Boyer's picture

This does not answer all my questions. If I look at the "cedillas" in Times New Roman, I see this:


All the "cedillas", whether attached or detached, match, except those under the Romanian "t"; this inconsistency must have a justification. Are they all "commas below" that merely look different (except, of course, for S and s cedilla)?

Michel

[added] You mention "the other ’cedilla’ characters". Maybe you are expecting too much from me, taking for granted a background I don't have. For me a cedilla may be detached, and when I write a c cedilla in French, it will most probably not be connected with the c, even if it is printed connected. So, for me something that looks like a cedilla even if it does not look like an attached cedilla is a cedilla; mixing detached cedillas with attached cedillas is no problem for me. But mixing a comma with a detached cedilla feels really weird.

Michel Boyer's picture

Here is a (rare) example of a detached cedilla in my old Teach Yourself Romanian.


Could this be a good example of a 'scommaaccent'? All the T and t have real commas underneath, like those of Times New Roman.

Michel Boyer's picture

> It should be noted that Unicode also encoded a number of other characters nominally ’with cedilla’, but for which a comma accent form is preferred in all the European orthographies that use these diacritic letters: K/k, R/r and, importantly it turned out, T/t.

Should I conclude that the above R/r characters in Times New Roman are wrong?

Michel

John Hudson's picture

Are there other instances in the Unicode “specification” that require such an exegesis?

Yes, quite a few. Perversely, the relative messiness of Unicode as a standard is a testament to its success: it was willing to accept dubious encodings and politically motivated proposals (e.g. the Arabic presentation forms and the composed Hangul syllables), at least during the early years, in order to get the standard off the ground.

So, for me something that looks like a cedilla even if it does not look like an attached cedilla is a cedilla; mixing detached cedillas with attached cedillas is no problem for me. But mixing a comma with a detached cedilla feels really weird.

An unattached cedilla is probably pretty acceptable to a Romanian or Baltic reader. Indeed, there have been attempts to design what Chuck Bigelow calls a 'commadilla', a deliberately ambiguous, disconnected form that could be read as either a cedilla or a comma accent according to the preference of the reader.

John Hudson's picture

Should I conclude that the above R/r characters in Times New Roman are wrong?

Not in themselves: as I just wrote, this kind of unattached curved shape is probably an acceptable 'commaaccent', but it would be better if all the commaaccent glyphs were consistent. It isn't crucial for European languages, because the T/t commaaccent is only used alongside S/s commaaccent, not alongside the Baltic diacritics.

By the way, at which version of Times New Roman are you looking? The Windows Vista version distinguishes T/t with cedilla from T/t with commaaccent.

Michel Boyer's picture

> which version of Times New Roman are you looking

It is Monotype Version 3.05 that probably came with Microsoft Office 2004. My PC is not working at the moment but in any case, it does not run Vista. Vista is not supported by our staff. [added] I am working almost all the time on my mac.

Michel Boyer's picture

My disk is dead; I couldn't even try installing Office 2007. I presume that when the Unicode glyph name contained CEDILLA they chose one of the CEDILLA subglyphs under and when it contained the word COMMA they chose the COMMA subglyph under:


[added] ... except, of course, for g.

Michel

twardoch's picture

Fontlab Ltd.'s current recommendation is to design four glyphs using a cedilla accent, and to give the S with cedilla glyphs the *cedilla names and the T with cedilla glyphs uniXXXX names or *cedilla names. The notes that follow the glyph names are not the Unicode character names but actual descriptive names:

U+015E "Scedilla" Latin capital S with cedilla
U+015F "scedilla" Latin small s with cedilla
U+0162 "uni0162" or "Tcedilla" Latin capital T with cedilla
U+0163 "uni0163" or "tcedilla" Latin small t with cedilla

The remaining glyphs in question should include glyphs with the commaaccent diacritic and should use uniXXXX names, not *commaaccent names.

U+0122 "uni0122" Latin capital G with commaaccent below
U+0123 "uni0123" Latin small g with turned commaaccent above
U+0136 "uni0136" Latin capital K with commaaccent below
U+0137 "uni0137" Latin small k with commaaccent below
U+013B "uni013B" Latin capital L with commaaccent below
U+013C "uni013C" Latin small l with commaaccent below
U+0145 "uni0145" Latin capital N with commaaccent below
U+0146 "uni0146" Latin small n with commaaccent below
U+0156 "uni0156" Latin capital R with commaaccent below
U+0157 "uni0157" Latin small r with commaaccent below
U+0218 "uni0218" Latin capital S with commaaccent below
U+0219 "uni0219" Latin small s with commaaccent below
U+021A "uni021A" Latin capital T with commaaccent below
U+021B "uni021B" Latin small t with commaaccent below
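As a rough sketch, the two lists above can be turned into a small name check in Python. The table `ALLOWED` and the function `nonconforming` are hypothetical names for illustration, and the input is assumed to be a plain mapping from code point to glyph name that has already been extracted from a font by some other tool.

```python
# Glyph names accepted for each code point, transcribed from the lists above.
ALLOWED = {
    0x015E: {"Scedilla"},
    0x015F: {"scedilla"},
    0x0162: {"uni0162", "Tcedilla"},
    0x0163: {"uni0163", "tcedilla"},
}
# The commaaccent glyphs should use uniXXXX names only.
for cp in (0x0122, 0x0123, 0x0136, 0x0137, 0x013B, 0x013C, 0x0145,
           0x0146, 0x0156, 0x0157, 0x0218, 0x0219, 0x021A, 0x021B):
    ALLOWED[cp] = {f"uni{cp:04X}"}


def nonconforming(glyph_names):
    """Return {code point: name} for names that differ from the lists above."""
    return {
        cp: name
        for cp, name in glyph_names.items()
        if cp in ALLOWED and name not in ALLOWED[cp]
    }


# Example input: three of the Myriad Pro names reported earlier in this thread.
reported = {0x015E: "Scedilla", 0x0162: "Tcommaaccent", 0x021A: "uni021A"}
for cp, name in nonconforming(reported).items():
    print(f"U+{cp:04X} is named {name}; expected one of {sorted(ALLOWED[cp])}")
```

Code points outside the two lists are ignored, so the check says nothing about the rest of a font's glyph names.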
