Hello,
I'm planning to work on my website design and I'm going to use a dynamic text replacement technique that will automatically replace the headlines with images. That way I can use fonts other than the usual Arial, Verdana, Georgia, Courier and Times New Roman and, so far, I'm going for the Myriad Pro available in Mac OS X.
However, there's a slight problem: Romanian, the language I use on my website, has some characters not available in the default Myriad Pro like Ş and Ţ and I thought I could try and add those by hand.
Am I allowed to do that? And am I allowed to upload a font from the operating system to the server so I can use it to generate images with it?
Thank you.
31 Aug 2007 — 4:59pm
I tried other things and here is what I found. The characters in question are at 0x0162 and 0x021A for the majuscule, 0x0163 and 0x021B for the minuscule, and here is how Adobe names them in Myriad Pro:
uni0162 : Tcommaaccent
uni0163 : tcommaaccent
uni021A : uni021A
uni021B : uni021B
It is the characters in 0x0162 and 0x0163 that are not recognized by the Mac. If we rename the above four characters as follows:
uni0162 : Tcedilla
uni0163 : tcedilla
uni021A : Tcommaaccent
uni021B : tcommaaccent
then all the characters in the resulting font are recognized by the Mac, be they called *commaaccent or *cedilla. So it is not the names that are causing a problem but what they are naming. Is Tcommaaccent uni0162 or is it uni021A? As pointed out by Adam, in a quite different style, one way to resolve the disagreement is to avoid the names Tcommaaccent and tcommaaccent and use uni0162 and uni0163 instead, on which everyone agrees.
Michel
PS. Notice that if we execute the command
curl -s http://www.unicode.org/Public/UNIDATA/NamesList.txt | egrep '^0162|^0163|^021A|^021B'
to get the names in Unicode's NamesList we get this:
0162 LATIN CAPITAL LETTER T WITH CEDILLA *
0163 LATIN SMALL LETTER T WITH CEDILLA *
021A LATIN CAPITAL LETTER T WITH COMMA BELOW *
021B LATIN SMALL LETTER T WITH COMMA BELOW *
and those are the names displayed by the Macintosh character palette.
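For readers without curl at hand, the same lookup can be sketched with Python's standard unicodedata module. It reads the character names from the Unicode database bundled with the interpreter, so the result reflects that bundled version rather than the current NamesList.txt. The helper name unicode_names is made up for this illustration.

```python
import unicodedata

# The four code points discussed above.
CODE_POINTS = [0x0162, 0x0163, 0x021A, 0x021B]

def unicode_names(code_points):
    """Return (hex code, Unicode character name) pairs for the given code points."""
    return [(f"{cp:04X}", unicodedata.name(chr(cp))) for cp in code_points]

for code, name in unicode_names(CODE_POINTS):
    print(code, name)
```

Unlike the NamesList.txt lines, the output carries no trailing annotations; it only gives the formal character names.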
31 Aug 2007 — 5:46pm
I must add that I fail to understand by what mechanism such disagreement would cause characters not to be accessible. If you have any idea, please tell me.
Michel
3 Sep 2007 — 5:13am
And I am still more puzzled when, after a curl -s and a join, I get the following comparative table of Adobe names and names in Unicode's file NamesList.txt:
0122 Gcommaaccent LATIN CAPITAL LETTER G WITH CEDILLA
0123 gcommaaccent LATIN SMALL LETTER G WITH CEDILLA
0136 Kcommaaccent LATIN CAPITAL LETTER K WITH CEDILLA
0137 kcommaaccent LATIN SMALL LETTER K WITH CEDILLA
013B Lcommaaccent LATIN CAPITAL LETTER L WITH CEDILLA
013C lcommaaccent LATIN SMALL LETTER L WITH CEDILLA
0145 Ncommaaccent LATIN CAPITAL LETTER N WITH CEDILLA
0146 ncommaaccent LATIN SMALL LETTER N WITH CEDILLA
0156 Rcommaaccent LATIN CAPITAL LETTER R WITH CEDILLA
0157 rcommaaccent LATIN SMALL LETTER R WITH CEDILLA
015E Scedilla LATIN CAPITAL LETTER S WITH CEDILLA *
015F scedilla LATIN SMALL LETTER S WITH CEDILLA *
0162 Tcommaaccent LATIN CAPITAL LETTER T WITH CEDILLA *
0163 tcommaaccent LATIN SMALL LETTER T WITH CEDILLA *
0218 Scommaaccent LATIN CAPITAL LETTER S WITH COMMA BELOW *
0219 scommaaccent LATIN SMALL LETTER S WITH COMMA BELOW *
021A uni021A LATIN CAPITAL LETTER T WITH COMMA BELOW *
021B uni021B LATIN SMALL LETTER T WITH COMMA BELOW *
If the other "commaaccent" are recognized, why not also [T/t]commaaccent ?
Michel
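The comparison in the table above can be sketched in Python as follows. The glyph names are copied from the table; the Unicode names come from the standard unicodedata module instead of NamesList.txt, and the function name mismatches is hypothetical. The sketch only lists where an Adobe *commaaccent name sits on a character that Unicode names WITH CEDILLA; it does not explain why the Mac treats T/t differently.

```python
import unicodedata

# Adobe glyph names as listed in the table above (code point -> glyph name).
ADOBE_NAMES = {
    0x0122: "Gcommaaccent", 0x0123: "gcommaaccent",
    0x0136: "Kcommaaccent", 0x0137: "kcommaaccent",
    0x013B: "Lcommaaccent", 0x013C: "lcommaaccent",
    0x0145: "Ncommaaccent", 0x0146: "ncommaaccent",
    0x0156: "Rcommaaccent", 0x0157: "rcommaaccent",
    0x015E: "Scedilla",     0x015F: "scedilla",
    0x0162: "Tcommaaccent", 0x0163: "tcommaaccent",
    0x0218: "Scommaaccent", 0x0219: "scommaaccent",
    0x021A: "uni021A",      0x021B: "uni021B",
}

def mismatches(adobe_names):
    """List entries whose glyph name says 'commaaccent' while Unicode says 'WITH CEDILLA'."""
    result = []
    for cp, glyph in sorted(adobe_names.items()):
        unicode_name = unicodedata.name(chr(cp))
        if glyph.endswith("commaaccent") and unicode_name.endswith("WITH CEDILLA"):
            result.append((f"{cp:04X}", glyph, unicode_name))
    return result

for code, glyph, unicode_name in mismatches(ADOBE_NAMES):
    print(code, glyph, unicode_name)
```

With the table's data, T/t turn up in the same list as G, K, L, N and R, which is exactly what makes the question above puzzling.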
3 Sep 2007 — 5:59am
Don't spend too much thought on this. Since it is a Mac OS bug, the question is not 'why?' but 'when will it be fixed?' :) Follow Adam's and John's advice as regards glyph naming and the locl feature, and the font will work fine. At least in ≥ 10.4.
3 Sep 2007 — 6:40am
Here is something else that does not quite fit the world I am used to. It comes from this chart of Unicode's standard:
The name mentions a cedilla, and the definition in BNF style says that it is built with a G and a character 0327 which is indeed a cedilla, yet the Gcedilla they display is with a comma. I must confess that I don't like this. This obviously confirms Miguel's comment. [edit] about my too strict interpretation of what looked to me like a BNF definition. The same holds for g, K, k, L, l, N, n, R, and r. Only S, s, T and t are shown with a cedilla in that chart.
Michel
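The "definition in BNF style" in the chart corresponds to what Unicode calls the canonical decomposition. As a minimal sketch, Python's standard unicodedata module exposes it; U+0327 is COMBINING CEDILLA and U+0326 is COMBINING COMMA BELOW. The helper name decomposition is made up for this illustration.

```python
import unicodedata

def decomposition(cp):
    """Canonical decomposition of a code point, as a list of hex strings."""
    return unicodedata.decomposition(chr(cp)).split()

for cp in (0x0122, 0x0162, 0x0218, 0x021A):
    parts = " + ".join(decomposition(cp))
    print(f"{cp:04X}", unicodedata.name(chr(cp)), "=", parts)
```

The decomposition only says which combining mark is encoded; it does not say how a font should draw it, which is why G "with cedilla" can still be displayed with a comma.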
3 Sep 2007 — 6:51am
> Don’t spend too much thought on this.
I am learning and I have no need to follow advice because I am not a developer. My problem is with Unicode's definition itself and is probably somewhat "Academic" for the time being.
Michel
3 Sep 2007 — 7:15am
> Don’t spend too much thought on this. (again)
I am spending too much time indeed, but it is quite fascinating. For instance, I have a Teach Yourself Romanian that dates back to 1970. People did not have computers at home then. I don't know when fonts started to be digitized. Well, in that book all the t "cedilla" have a comma below. As for the s, on the very same page, very close to one another, I can see one with a comma, one with a hook and one with the cedilla of Times New Roman.
3 Sep 2007 — 10:57am
> The name mentions a cedilla, and the definition in BNF style says that it is built with a G and a character 0327 which is indeed a cedilla, yet the Gcedilla they display is with a comma. I must confess that I don’t like this.
Go back and read my long post again. All the 'with cedilla' characters in Unicode except C/c cedilla and S/s cedilla are properly displayed with the comma accent form in the European orthographies that use these characters. The unification of cedilla and comma accent under the name cedilla was an early error in Unicode, and one which for stability reasons they chose not to correct except in the case of the S/s and T/t comma accent for Romanian (and given the massive confusion and conflicting text encodings that that correction has produced, one can see why they would avoid throwing the Baltic languages that use the other 'cedilla' characters into the same mess). It was a mistake to conflate these two diacritic marks and a mistake to call them 'with cedilla' in the formal names, but it is an old mistake and one that we have to live with.
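One practical consequence of the S/s and T/t correction is that Romanian text may arrive with either set of code points. A minimal sketch of folding the cedilla forms onto the comma-below forms follows. It assumes the text is known to be Romanian (Turkish, for instance, uses S with cedilla legitimately, so this must not be applied blindly), and the names CEDILLA_TO_COMMA and to_comma_below are made up for this illustration.

```python
# Cedilla code points -> comma-below code points.
# Assumption: the input is Romanian text; other languages need the cedilla forms.
CEDILLA_TO_COMMA = {
    0x015E: 0x0218,  # S with cedilla -> S with comma below
    0x015F: 0x0219,  # s with cedilla -> s with comma below
    0x0162: 0x021A,  # T with cedilla -> T with comma below
    0x0163: 0x021B,  # t with cedilla -> t with comma below
}

def to_comma_below(text):
    """Replace the four cedilla characters by their comma-below counterparts."""
    return text.translate(CEDILLA_TO_COMMA)

sample = "\u015Etiin\u0163\u0103"  # spelled with the cedilla code points
print([f"{ord(c):04X}" for c in to_comma_below(sample)])
```

Whether to normalize in this direction, or in the opposite one for fonts that only cover U+015E to U+0163, depends on the fonts and systems the text has to pass through.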
3 Sep 2007 — 11:51am
> it is an old mistake and one that we have to live with.
Thanks for clarifying. Are there other instances in the Unicode "specification" that require such an exegesis?
Michel
3 Sep 2007 — 12:48pm
This does not answer all my questions. If I look at the "cedillas" in Times New Roman, I see this
All the "cedillas", whether attached or detached, match, except those under the Romanian "t"; this inconsistency must have a justification. Are they all "commas below" that merely look different (except of course for S and s cedilla)?
Michel
[added] You mention "the other ’cedilla’ characters". Maybe you are expecting too much from me, taking for granted a background I don't have. For me a cedilla may be detached, and when I write a c cedilla in French, it will most probably not be connected with the c, even if it is printed connected. So, for me something that looks like a cedilla even if it does not look like an attached cedilla is a cedilla; mixing detached cedillas with attached cedillas is no problem for me. But mixing a comma with a detached cedilla feels really weird.
3 Sep 2007 — 1:17pm
Here is a (rare) example of a detached cedilla in my old Teach Yourself Romanian.
Could this be a good example of a 'scommaaccent' ? All the T and t have real commas under, like those of Times New Roman.
3 Sep 2007 — 2:29pm
> It should be noted that Unicode also encoded a number of other characters nominally ’with cedilla’, but for which a comma accent form is preferred in all the European orthographies that use these diacritic letters: K/k, R/r and, importantly it turned out, T/t.
Should I conclude that the above R/r characters in Times New Roman are wrong?
Michel
3 Sep 2007 — 5:29pm
> Are there other instances in the Unicode “specification” that require such an exegesis?
Yes, quite a few. Perversely, the relative messiness of Unicode as a standard is a testament to its success: it was willing to accept dubious encodings and politically motivated proposals (e.g. the Arabic presentation forms and the composed Hangul syllables), at least during the early years, in order to get the standard off the ground.
> So, for me something that looks like a cedilla even if it does not look like an attached cedilla is a cedilla; mixing detached cedillas with attached cedillas is no problem for me. But mixing a comma with a detached cedilla feels really weird.
An unattached cedilla is probably pretty acceptable to a Romanian or Baltic reader. Indeed, there have been attempts to design what Chuck Bigelow calls a 'commadilla', a deliberately ambiguous, disconnected form that could be read as either a cedilla or a comma accent according to the preference of the reader.
3 Sep 2007 — 5:35pm
> Should I conclude that the above R/r characters in Times New Roman are wrong?
Not in themselves: as I just wrote, this kind of unattached curved shape is probably an acceptable 'commaaccent', but it would be better if all the commaaccent glyphs were consistent. It isn't crucial for European languages, because the T/t commaaccent is only used alongside S/s commaaccent, not alongside the Baltic diacritics.
By the way, at which version of Times New Roman are you looking? The Windows Vista version distinguishes T/t with cedilla from T/t with commaaccent.
3 Sep 2007 — 6:50pm
> which version of Times New Roman are you looking
It is Monotype Version 3.05 that probably came with Microsoft Office 2004. My PC is not working at the moment but in any case, it does not run Vista. Vista is not supported by our staff. [added] I am working almost all the time on my Mac.
4 Sep 2007 — 6:36pm
My disk is dead; I couldn't even try installing Office 2007. I presume that when the Unicode character name contained CEDILLA they chose one of the CEDILLA subglyphs underneath, and when it contained the word COMMA they chose the COMMA subglyph underneath:
[added] ... except, of course, for g.
Michel
18 Aug 2008 — 10:13am
Fontlab Ltd.'s current recommendation is to design four glyphs using a cedilla accent, giving the S with cedilla glyphs the *cedilla names and the T with cedilla glyphs uniXXXX names or *cedilla names. The notes that follow the glyph names are not the Unicode character names but actual descriptive names:
U+015E "Scedilla" Latin capital S with cedilla
U+015F "scedilla" Latin small s with cedilla
U+0162 "uni0162" or "Tcedilla" Latin capital T with cedilla
U+0163 "uni0163" or "tcedilla" Latin small t with cedilla
The remaining glyphs in question should include glyphs with the commaaccent diacritic and should use uniXXXX names, not *commaaccent names.
U+0122 "uni0122" Latin capital G with commaaccent below
U+0123 "uni0123" Latin small g with turned commaaccent above
U+0136 "uni0136" Latin capital K with commaaccent below
U+0137 "uni0137" Latin small k with commaaccent below
U+013B "uni013B" Latin capital L with commaaccent below
U+013C "uni013C" Latin small l with commaaccent below
U+0145 "uni0145" Latin capital N with commaaccent below
U+0146 "uni0146" Latin small n with commaaccent below
U+0156 "uni0156" Latin capital R with comma below
U+0157 "uni0157" Latin small r with commaaccent below
U+0218 "uni0218" Latin capital S with commaaccent below
U+0219 "uni0219" Latin small s with commaaccent below
U+021A "uni021A" Latin capital T with commaaccent below
U+021B "uni021B" Latin small t with commaaccent below
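The naming scheme in the list above can be sketched as a small lookup. This is only an illustration of the recommendation as quoted, restricted to the code points it lists; the function names recommended_glyph_name and code_point_from_uni_name are made up, and for T with cedilla the sketch picks the uniXXXX form although the *cedilla form is also allowed.

```python
# Only the S with cedilla glyphs keep a descriptive name in the list above.
DESCRIPTIVE = {0x015E: "Scedilla", 0x015F: "scedilla"}

HEX_DIGITS = set("0123456789ABCDEF")

def recommended_glyph_name(cp):
    """Glyph name for one of the listed code points: descriptive for S cedilla, uniXXXX otherwise."""
    return DESCRIPTIVE.get(cp, f"uni{cp:04X}")

def code_point_from_uni_name(name):
    """Parse a 'uniXXXX' glyph name back to its code point; return None for other names."""
    digits = name[3:]
    if name.startswith("uni") and len(digits) == 4 and set(digits) <= HEX_DIGITS:
        return int(digits, 16)
    return None

for cp in (0x015E, 0x0162, 0x0122, 0x021A):
    print(f"U+{cp:04X}", recommended_glyph_name(cp))
```

Because a uniXXXX name states the code point directly, it sidesteps the question of whether Tcommaaccent means U+0162 or U+021A.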