Unicode, Hebrew & Mistakes

david h's picture

Unicode 5.2, Hebrew & Mistakes. What are the mistakes? Posting sooooooon.......

John Hudson's picture

Let me guess, does it have anything to do with atnah hafukh and yerah ben yomo?

david h's picture

... also other things :) but since you mentioned it.

The name Atnah Hafukh was given by Prof. Yeivin. He said that the distinction between atnah hafukh and yerah ben yomo 'serves as an indicator of the accuracy and purity of the tradition of the accentuation in a manuscript.'

See the editions by Prof. Dotan (Adi, 1973) and Prof. Breuer -- there's a clear distinction between atnah hafukh and yerah ben yomo.

I don't understand why we waited so long; every decade they add a couple of glyphs :)
Now this is funny:
Unicode, page 237: "but some users in recent decades have begun to reintroduce this distinction"
'Some users'?! What do they mean by 'some users'?

BTW, there's a draft by the SII (2/2010) without this distinction! (I emailed them, so let's wait and see what the answer is.)

John Hudson's picture

> See the editions by Prof. Dotan (Adi, 1973) and Prof. Breuer -- there's a clear distinction between atnah hafukh and yerah ben yomo.

Yes, but the way in which Unicode has chosen to encode atnah hafukh is problematic, because this is a character disunification: atnah hafukh was previously presumed to be encoded as the same character as yerah ben yomo; now it is encoded distinctly, but with the necessary caveat that existing documents will not make this distinction. However, because it is the form of yerah ben yomo that changes between when the distinction is not made and when it is made (the atnah hafukh form is the typical one when no distinction is made), there is an inevitable incompatibility between existing documents and new fonts. Simply put, an existing document using U+05AA for both atnah hafukh and yerah ben yomo will display with the wrong glyph, wherever atnah hafukh was intended, in a font that supports this character distinction with appropriate glyphs for atnah hafukh and yerah ben yomo. There's no easy way around this, which is why the current build of the SBL Hebrew font still does not make a visual distinction between these two characters: the atnah hafukh glyph is used for both. I'm willing to change this, but have been waiting to see what feedback there is from users.
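To make the compatibility problem concrete, here is a minimal Python sketch that reports which of the two accent characters a text actually uses. The code points and character names are those in the Unicode character database (U+05AA HEBREW ACCENT YERAH BEN YOMO, U+05A2 HEBREW ACCENT ATNAH HAFUKH); the function names and the "no U+05A2 means legacy text" rule are illustrative assumptions only, not a method anyone in this thread has proposed.

```python
import unicodedata

# Code points as named in the Unicode character database.
YERAH_BEN_YOMO = "\u05AA"  # long-standing character, used for both accents in older texts
ATNAH_HAFUKH = "\u05A2"    # the later, disunified character


def accent_usage(text):
    """Count occurrences of each of the two accents, keyed by Unicode name."""
    return {
        unicodedata.name(ch): text.count(ch)
        for ch in (YERAH_BEN_YOMO, ATNAH_HAFUKH)
    }


def may_predate_disunification(text):
    """Hypothetical heuristic: the text uses U+05AA but never U+05A2.

    Such a text may be using U+05AA for both accents, so a font with
    distinct glyphs could show the wrong shape for some occurrences.
    """
    return YERAH_BEN_YOMO in text and ATNAH_HAFUKH not in text


# Alef carrying U+05AA, as a legacy document would encode either accent.
sample = "\u05D0\u05AA"
print(accent_usage(sample))
print(may_predate_disunification(sample))
```

A check like this can only flag a text as possibly ambiguous; it cannot tell which accent was intended at each position, which is exactly why there is no easy automatic fix.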

> I don't understand why we waited so long; every decade they add a couple of glyphs

Frankly, I think the Israeli standards body must bear a lot of blame for the problems in Unicode encoding of Hebrew. They appear to have had almost no interest at all in anything except the encoding of modern standard Hebrew, and gave only cursory and unsystematic attention to Biblical text. These are the people who encoded the Biblical upper punctum extraordinarium but not the lower one, on the grounds that the latter was very rare. Even if a character is used only once, it is still a character.

Regarding the text that ends up in the Unicode Standard, e.g. "some users in recent decades have begun to reintroduce this distinction", this generally reflects whatever was said in the proposal(s) for the new characters. In the case of vowel and accent distinctions, e.g. qamats qatan vs qamats gadol or atnah hafukh vs yerah ben yomo, these were not as uncontroversial as one might have expected, hence 'some users', acknowledging that at least some members of the Unicode Hebrew community argued that these disunifications were unnecessary and would break existing practices.

david h's picture

Whether the blame goes to the Israeli standards body, or the Unicode Consortium, or both of them -- the problem still remains. Your description -- "There's no easy way around this...the atnah hafukh glyph is used for both" -- is basically the 'by-product' that could have been avoided a long time ago.

Established editors and scholars such as Prof. Dotan, Prof. Breuer and Prof. Yeivin cared about that distinction. When the information is reachable I don't think we should wait a long time to add one glyph or two.

I don't know whether or not these different proposals are hot topics of debate, but not everything should be based on 'popular' demand, or 'popular' agreement. For example, I've been working for a l-o-n-g time on the Babylonian vocalization & masorah. There's no doubt that the average user & publisher would not need that, but the academic world would. But if adding one dot or two depends on how rare it is, then I think that the Babylonian vocalization will be added when the Messiah is here.

BTW, to paraphrase Prof. Yeivin, this distinction serves as an indicator of the accuracy of the bible's editor.

david h's picture

Not Hebrew.... but close. A little typo:


John Hudson's picture

David: I've been working for a l-o-n-g time on the Babylonian vocalization & masorah

Great, do you want to put a formal encoding proposal together for this? I can help put it together in appropriate form for submission to Unicode and ISO 10646, and can also get the Canadian JTC1 SC2 standards body to add their name to it for greater weight.

Characters are added to Unicode when people document them and submit proposals. If no one is doing this work, then it doesn't happen. The Israeli standards body shows no interest at all in anything beyond standard modern Hebrew and Tiberian vocalisation, so waiting for them to do anything is pointless. The recent additions have all been as a result of individual submissions.

[I have been in contact with one other person interested in encoding Babylonian vowel marks, but didn't receive a full and systematic list from her suitable for a proposal.]

John Hudson's picture

I'll drop a note to the editors re. the Samaritan typo.

david h's picture

> Great, do you want to put a formal encoding proposal together for this?

Why not.

> I have been in contact with one other person interested in encoding
> Babylonian vowel marks, but didn't receive a full and systematic list...

Is there a sample, or something? When was that?

John Hudson's picture

That was a few years ago. I probably have some notes kicking around in an email archive, but as I said, this wasn't full and systematic documentation, which is why it didn't progress to a formal proposal.

gohebrew's picture

The Unicode Consortium has failed even after all this time to present Hebrew correctly and completely.

For example, there is plenty of easily accessible material on the Internet about the shva-na symbol or glyph, and its grammatical character. Yet, to this day, they sleep.

John, do you want to wake them up?

(Btw, thank you for your prayers.)
