Romanian T comma Vs. T cedilla

eolson's picture

What is the status of the T and t comma (021A & 021B)
versus the T and t cedilla (0162 and 0163) for Romanian?

The Unicode documentation for Latin Extended-A mentions the comma
being "preferred" over the cedilla. Adobe seems to be going
with the comma. Has the cedilla version fallen out of use and/or
been replaced?

Any help is greatly appreciated.

twardoch's picture

The form "t with comma" is strongly recommended in all cases.

You can include one lowercase glyph "tcommaaccent" with the Unicode codepoints U+0163 and U+021B, and one uppercase glyph "Tcommaaccent" with the Unicode codepoints U+0162 and U+021A.

Alternatively, you might include one lowercase glyph "tcommaaccent" (U+0163), one identical glyph "uni021B" (U+021B), one uppercase glyph "Tcommaaccent" (U+0162) and one identical uppercase glyph "uni021A" (U+021A). The latter is practical if, on principle, you would rather not double-map Unicode characters to a single glyph. Otherwise, the first approach is the way to go.
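
To make the difference between the two approaches concrete, here is a minimal sketch in Python that models each one as a toy character-to-glyph table. The glyph names are the ones quoted above; the table layout and the helper names (codepoints_per_glyph, has_double_mapping) are made up for illustration and do not come from any font tool.

```python
# Toy character-to-glyph tables (cmap-like), for illustration only.
# Glyph names are the ones quoted in the post; nothing here reads a real font.

# Approach 1: one glyph per case, each reached from two codepoints.
DOUBLE_MAPPED = {
    0x0162: "Tcommaaccent",
    0x021A: "Tcommaaccent",
    0x0163: "tcommaaccent",
    0x021B: "tcommaaccent",
}

# Approach 2: identical duplicate glyphs, one codepoint each.
SEPARATE_GLYPHS = {
    0x0162: "Tcommaaccent",
    0x021A: "uni021A",
    0x0163: "tcommaaccent",
    0x021B: "uni021B",
}


def codepoints_per_glyph(cmap):
    """Invert a table: glyph name -> sorted list of codepoints that reach it."""
    inverted = {}
    for codepoint, glyph in cmap.items():
        inverted.setdefault(glyph, []).append(codepoint)
    return {glyph: sorted(cps) for glyph, cps in inverted.items()}


def has_double_mapping(cmap):
    """True if any glyph is reached from more than one codepoint."""
    return any(len(cps) > 1 for cps in codepoints_per_glyph(cmap).values())


if __name__ == "__main__":
    for label, cmap in (("double-mapped", DOUBLE_MAPPED),
                        ("separate glyphs", SEPARATE_GLYPHS)):
        print(label, "-> double mapping:", has_double_mapping(cmap))
        for glyph, cps in codepoints_per_glyph(cmap).items():
            print("   ", glyph, [f"U+{cp:04X}" for cp in cps])
```

Either way, all four codepoints end up showing the comma form; the only difference is whether the font carries two glyphs or four.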

Forget about the variant "t with cedilla".

Regards,
Adam

John Hudson's picture

I'll second what Adam has written: to date I have not found a single language that uses T/t with cedilla. The only likely candidate was Gagauzi, which is a Turkic language spoken in Romania, and which uses the S/s with cedilla following Turkish orthography, but the T/t with 'comma accent' following the Romanian orthography.

Note, however, this important issue: the S/s and T/t with cedilla and comma accent were only disunified in Unicode and ISO 10646 after the need to distinguish in plain text for bilingual documents became clear. The older Windows codepage for Romanian (CP 1250) uses the Unicode codepoints for the cedilla diacritics, not the newer comma diacritics. This means that the vast majority of Romanian documents are encoded using the cedilla diacritics, not the preferred comma forms. This needs to be addressed at the glyph level in OpenType fonts, using the Romanian language system tag ROM and the Localised Forms <locl> feature to map the S/s cedilla diacritic character codes to the comma diacritic glyphs.
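
As a rough illustration of the point above, the following Python sketch decodes two CP 1250 bytes and then simulates what a language-specific substitution would do. It is only a model of the idea, not real OpenType shaping: the CMAP and LOCL tables and the shape function are hypothetical, the glyph names "Scedilla", "scedilla", "Scommaaccent" and "scommaaccent" are assumed, and the T/t entries follow the double-mapping Adam described. The only real API used is Python's built-in "cp1250" codec.

```python
# Simplified model of a language-specific glyph substitution.
# CMAP, LOCL and shape() are hypothetical; this is not a shaping engine.

# Character -> default glyph. T/t cedilla already point at comma-form
# glyphs (the double mapping suggested earlier in the thread).
CMAP = {
    0x015E: "Scedilla",
    0x015F: "scedilla",
    0x0218: "Scommaaccent",
    0x0219: "scommaaccent",
    0x0162: "Tcommaaccent",
    0x0163: "tcommaaccent",
    0x021A: "Tcommaaccent",
    0x021B: "tcommaaccent",
}

# Per-language glyph substitutions, keyed by language system tag.
LOCL = {
    "ROM": {"Scedilla": "Scommaaccent", "scedilla": "scommaaccent"},
}


def shape(text, language=None):
    """Map characters to glyph names, then apply the language's substitutions."""
    glyphs = [CMAP.get(ord(ch), ".notdef") for ch in text]
    substitutions = LOCL.get(language, {})
    return [substitutions.get(glyph, glyph) for glyph in glyphs]


if __name__ == "__main__":
    # Bytes 0xBA and 0xFE in CP 1250 decode to the cedilla codepoints.
    legacy = bytes([0xBA, 0xFE]).decode("cp1250")
    print([f"U+{ord(ch):04X}" for ch in legacy])

    # Tagged as Romanian, the s cedilla code is shown with the comma glyph.
    print(shape(legacy, "ROM"))

    # With any other language tag (Turkish here), the cedilla glyph is kept.
    print(shape(legacy, "TRK"))
```

The text itself is left untouched, so a Turkish run in the same document still gets its S/s with cedilla while the Romanian run gets the comma forms.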

I'm afraid I don't know how Gagauzi is typically encoded, so can't advise on what manner of OpenType gymnastics might be required.

eolson's picture

Thanks guys.
My hunch has been confirmed. I'll stick with the t + comma but
take your advice and double up with codepoints for safe keeping.
