Streamlining Arabic script text and font technology

behnam's picture

I throw some ideas here, to whom it may concern!

The current font and text rendering technology has been developed with Roman script fine typography in mind. Naturally, when exposed to the expansion and international needs, it has developed additional capabilities to respond to those needs.
The OpenType and AAT technology which were initially designed for 'fine typography' of Roman script, have become indispensable tools, not for 'fine' typography, but for mere readability of the Arabic text.
As a result, when it comes to 'fine typography' of Arabic script, the technological requirement (knowledge as well as support) has become so vast that 'fine typography' of Arabic script has remained in the domain of feasibility, not usability.

I'm not a programer nor a computer savvy but looking at what has already been achieved in Arabic fine typography, it seems to me that an Arabic text engine, optimized for Arabic script needs, is not really a complicated affair. All it needs to do is to follow Arabic text rendering the same way Arabic words are drawn on paper. You draw the word with connected characters, in their contextual variations, and then you place dots and signs on top of it.

The complication comes from the fact that with current technology, the priorities are not straight. We do not need the capability of dragging and repositioning a character on top of the previous letters. These are the capabilities of DTP applications which are so well demonstrated in Tasmeem of DecoType which has been also achieved in a different manner in MaryamSoft.

What I think would be a streamlined Arabic text engine, is first to rid of horizontal line state of mind for joining characters. The glyphs join together where their positioning anchors tell them to join. It might be horizontal, might not. This should be irrelevant.
Also, the characters should be first drawn, then identified by positioning dots and signs and diacritics. As I understand this is the way Tasmeem does the text. (excellent idea)
The third point which belongs to a streamlined basic Arabic text engine in my view, and *not* to sophisticated DTPs, is extended variants of characters.

What is commonly known as initial, medial and final forms of Arabic letters, and are partially and uselessly encoded in Unicode as well, are in fact the contextual variants of a letter as first priority. Contextualization of Arabic script is not limited to this.There are other contextual variants of a character, some in relation to the character before or after it, and some others -and this is my focus- in relation to alignment and justification of the text.
A streamlined Arabic text engine should understand and apply substitutions in relation to extension and justification of the text. It can be established with an algorithm for three level of priorities.
First priority would be the basic initial, medial and final forms. The second and third priorities are for their semi-extended and extended alternatives (which may or may not be provided by the font). Adjusting the extension of 'space' can then be harmonized within that algorithm.

In a streamlined Arabic text engine, these extended variants are an integrated part of contextualization process therefore the repositioning the character identifiers such as dots and signs comes after this whole contextualization process.

When we rid of 'horizontal line' state of mind, we rid of the character kashide which was the product of this state of mind as well.
Kashide (in Persian means 'extended') or Arabic Tatweel, encoded as U+0640 in Unicode, is in fact a product of technological limitations which shouldn't continue in digital age. It is the *character* that get extended in Arabic script not a horizontal line after that.
It wasn't even like that in early stage of metal typesetting. Typesetters painstakingly, were choosing the right shape for each and every character. They were not using kashide. This need became apparent over time and facing massive amount of text processing in a short deadline.

So the way I see it, extended alternates of characters are an indispensable part of basic Arabic contextualization. It should be integrated in Arabic text engine, flawlessly and as a matter of routine.
It is for font maker to designate level 1, 2 and 3 priorities of alternates (and 4, 5...?) but it is for the text engine to use an algorithm to deploy them based on these priorities.

This streamlining based on the way Arabic text is written (very close to Tom Milo project, minus the calligraphic side) will reduce the need for exorbitant number of glyphs for a fine typography in Arabic script, while it increases the capability of language support of an Arabic font exponentially.
It does need an extensive amount of anchors for each glyph for different positioning tasks but I don't think they are outside of the scope of current technology. They just need to be 'packaged' properly.
It also allows the designers, the artist designers and not necessarily computer nerds, to take part in the process and bring the Arabic script fine typography to a totally another level.

At the end, I should say that I didn't say anything particularly new. These ideas are already out there. Let's do it.
Let's provide an Arabic script text rendering package that could be used in any platform, any browser and any DTP and get out of this misery.


John Hudson's picture

Behnam, you've largely described Tom's ACE approach. Bear in mind that Tasmeem is a specific implementation of ACE, with a tool set appropriate to professional publication design. The ACE approach can be applied equally well to the general case of Arabic script typography.

So the way I see it, extended alternates of characters are an indispensable part of basic Arabic contextualization. It should be integrated in Arabic text engine, flawlessly and as a matter of routine. It is for font maker to designate level 1, 2 and 3 priorities of alternates (and 4, 5...?) but it is for the text engine to use an algorithm to deploy them based on these priorities.

I don't agree with this, or at least I have reservations. The number and variety of contextual forms are style-specific. This means that they are well-established in the classical styles, and these can be analysed in a way that produces a workable algorithm such as you describe. But this simply replaces the limitations of the Unicode etc. 4-form description of Arabic with a more generous set of limitations derived from the classical styles (so much for 'minus the calligraphic side'). It would be a definite improvement, but locating the layout based on this set of forms in the engine, rather than in the font, can limit the development of new styles. Is this a serious issue of concern? I don't know, because I can't anticipate what designers will do. But that's sort of the point: we can't anticipate the future forms of the Arabic script, so we should seek flexible layout technologies that can accommodate new kinds of shapes and connections.

behnam's picture

I understand your reservation. My concern is to 'streamline' the process. In a way that the technology can easily be adopted in various mediums, from text messaging devices to sophisticated DTPs, without being an obstacle to specialty applications for complex rendering. Also in a way that fine typography doesn't get too complicated for everyday use.
Maybe a more elaborated 'dialogue' between font and text engine can help creating a better algorithm?
In AAT, there is a 'just' feature which has not been developed in years. But it had an active dialogue with text engine in terms of when and where and with what priority and in which context kashide is deployed. Perhaps this kind of dialogue can be established between font and the text engine for extended alternates?
There is of-course another feature of Arabic typesetting which relates to the approximation of specific characters. This has been already answered in fonts with producing ligatures. Some of these ligatures in turn may have extended variants like individual characters.
If ACE can be this on one hand, and Tasmeem on the other, then I don't see where would be the problem.

k.l.'s picture

As John said, the work has been done already. There is ACE and it does what you ask for. It does layout automatically without need for manual drag-and-drop, that's optional. Just to clarify: ACE is the layout engine, while Tasmeem is an InDesign plug-in that makes the ACE layout engine available in InDesign. In so far I am not sure what you mean by "If ACE can be this on one hand, and Tasmeem on the other".

behnam's picture

Ha! I mean let's put ACE everywhere, from text messaging devices to browsers to operating systems... everywhere.
btw, how ACE does the justification?

behnam's picture

So I go back to my question because I didn't get the answer I wanted or it wasn't clear enough for me.
Justification of a streamlined Arabic technology is a part of basic contextualization of Arabic words. It can not be delegated to a host application. A host application can have adds-on to the basic process, but Arabic text, when justified, say in a small text message device, should take instructions from the font the same way it takes instruction for other contextualization process. It's not the device that decides how to proceed with basic justification (suppose that the device is set to justify). Any Arabic text process not doing so, is not doing the basic.

On a minor note, automatic kerning on the other hand, is not a very good idea for the basic process. Kerning in my view should be done in the font, tailored for the typeface. Unless there is something about automatic kerning that I don't understand. But there is something about it that doesn't sound right to me.

Syndicate content Syndicate content