Wednesday, September 9, 2009

Designing an Alphabet or Writing System

I love alphabets. When I say this, I include writing systems generally (it just rings better if I say "alphabets," though). I started creating them when I was a kid, and have always loved looking at foreign writing. In high school I created at least two code alphabets that I used with various friends, and I used a Greek-alphabet transliteration of English to trade notes with one of my boyfriends. In college I asked a friend to teach me about Arabic, and I had yet another personal alphabet that I used in my journal to make sure no one could peek and read. I also discovered some alphabets I'd never seen before, such as the loopy script of Malayalam. Great stuff.

I've also encountered a goodly number of fantasy alphabets, including the elvish and dwarvish scripts of Tolkien, the Kzinti script of Paul Chafe, and numerous others.

After all of this, I thought I'd try to distill a few thoughts here that might be useful to would-be creators of alphabets and other character systems.

Thought #1: Before you start creating arcane symbols, decide exactly what it is you're representing.

Any alphabet that simply replaces the English alphabet is not really an alphabet in its own right, but a code. It's cool - and goodness knows I've made a lot of these - but it probably won't be the best match for a really original alien or fantasy language.
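To make that distinction concrete, here is a minimal sketch of a "code alphabet" as a one-to-one letter substitution. The `GLYPHS` string is a made-up stand-in for invented symbols (ordinary letters in keyboard order, purely an assumption for illustration), and `encode`/`decode` are hypothetical helper names.

```python
import string

# Hypothetical stand-in "glyphs": one invented symbol per English letter.
# Ordinary letters in keyboard order are used here so the example prints anywhere.
GLYPHS = "qwertyuiopasdfghjklzxcvbnm"

ENCODE_TABLE = str.maketrans(string.ascii_lowercase, GLYPHS)
DECODE_TABLE = str.maketrans(GLYPHS, string.ascii_lowercase)


def encode(text: str) -> str:
    """Swap each lowercase English letter for its stand-in glyph."""
    return text.lower().translate(ENCODE_TABLE)


def decode(text: str) -> str:
    """Reverse the substitution to recover the English text."""
    return text.translate(DECODE_TABLE)


if __name__ == "__main__":
    secret = encode("hello world")
    print(secret)          # spaces, word lengths, and the doubled letter all survive
    print(decode(secret))
```

Because every symbol stands for exactly one English letter, the spelling, word lengths, and doubled letters of English all show through, which is why a system like this reads as a code rather than as a script belonging to another language.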

It's good to ask yourself whether your symbols will be representing sounds, syllables, or meanings. English roughly represents sounds, while the Japanese hiragana and katakana systems represent syllables, and the Japanese Kanji, like the original Chinese characters, represent meanings. Any one of these can work, but a system that represents meanings is going to require a lot more complexity than one that only represents sounds, because the sounds of a language are a finite list, while the meanings just go on and on.

Thought #2: Don't just ask what you're representing, ask also how this writing system will be used.

I bring this up because I think it's important for language designers to consider how often, and how quickly, the symbols they create must be written. Japanese Kanji are brutally hard to dash off a quick note in, although people do it regularly. I've seen fantasy character systems so complex that I can't imagine how people would be able to write them in any practical fashion. Contrast that situation, though, with the writing system used by Ursula K. Le Guin in her novel The Telling. That system was intricately related to a whole belief system and sacred meanings were part of it; a lot of time and effort can be invested in writing when the final product is believed to have greater than everyday significance. For dashing off quick notes, though, simpler is probably better.

Thought #3: Think through the basic visual elements of your script, including stroke types and points or axes of orientation.

The English alphabet uses a finite number of stroke types: vertical and horizontal lines; two types of diagonal lines; curves; and dots. It orients to a primary axis located at the bottom of all of the characters - "writing on the line," so to speak. The characters then vary based on which strokes occur in which orientations to one another, to the axis, and to three different distance points measured in the vertical dimension off that axis (the horizontal bars of "e," "t"/"f," and "I").

Why is it worth thinking this stuff through? Because for ease of writing, you probably want to minimize the number of stroke types, keeping maximal simplicity while at the same time maintaining maximal difference between the different characters. Put it this way:

If the characters are too complex, you get screaming - but if all the characters look the same? More screaming.
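One rough way to check that balance is to describe each character as a set of strokes and count how many strokes separate any two characters. The sketch below is only an illustration: the stroke names, the four sample glyphs, and the `min_diff` threshold are all assumptions, not a real script.

```python
from itertools import combinations

# Hypothetical glyphs, each described by the set of strokes it uses.
GLYPHS = {
    "A": {"diag_up", "diag_down", "bar_mid"},
    "B": {"vert", "curve_top", "curve_bottom"},
    "C": {"vert", "curve_top"},
    "D": {"diag_up", "dot"},
}


def stroke_inventory(glyphs):
    """All distinct stroke types the script needs (fewer is easier to write)."""
    return set().union(*glyphs.values())


def difference(a, b):
    """Number of strokes found in one glyph but not the other."""
    return len(a ^ b)


def confusable_pairs(glyphs, min_diff=2):
    """Pairs of glyphs that differ by fewer than min_diff strokes."""
    return [
        (x, y)
        for x, y in combinations(sorted(glyphs), 2)
        if difference(glyphs[x], glyphs[y]) < min_diff
    ]


if __name__ == "__main__":
    print(len(stroke_inventory(GLYPHS)), "stroke types in use")
    print("easily confused:", confusable_pairs(GLYPHS))
```

With these sample glyphs, "B" and "C" differ by a single curve, so they would be flagged as easy to confuse; adding or changing one stroke on either would pull them apart without enlarging the stroke inventory much.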

Okay, great. Now let's assume you've got the basic characters sketched out. Do you want to add additional complexity, like capitalization, or cursive forms?

Answer: maybe. Additional complexity has its uses. Cursive (I was always told) was designed for the sake of speed, and it certainly has a sense of style to it. Capitalization helps a lot because it provides visual orientation for a reader, effectively saying, "Look here! It's the beginning of the sentence!" or "Look here! It's a name!" In German, it says something different: "Here's a noun!" Similar to this, though far more complicated, is the use of Kanji in Japanese. Kanji say "Here's a piece of meaning!" And given that Japanese is written without spaces between words, that piece of meaning generally also allows a reader to separate the beginning of a new word from the function words around it, and from any suffixes appended to previous words. Arabic has a different kind of complexity in its script: the "letters" take different forms depending on whether they occur at the start, in the middle, or at the end of a word. Again, this provides orientation on a larger level - and it reminds me to point out that empty spaces between words are another highly useful feature of script, used for general orientation to the language being represented.

Finally, I would be remiss if I didn't mention punctuation, but I don't want to go into much detail there, except to say that it is another type of orientation device. It works on the sentence level, but also within the sentence, to help clarify syntactic structures. For more fun with punctuation I'll direct you to Eats, Shoots & Leaves by Lynne Truss, as she handles the discussion in much more depth - and far more amusingly - than I can.

At this point I must bemoan the fact that it's so difficult to render a created alphabet into computerized blog form, because I would love to give examples. Suffice it to say that a character system with a deliberate balance between simplicity and complexity (differentiation), and one that uses appropriate cues to the beginnings and ends of words, will strike a viewer's eye as more "real" than one that doesn't. And just so I'm not completely without examples, I've written a sentence in Japanese for you:


I invite anyone who is able to speak a language using a different character system (and enter it into their computer) to volunteer examples in my comments area. I - and my readers - would love to see them.


  1. 感じ should be 漢字 if I'm not mistaken.

    I suppose this shows that same-sounding words might be avoided in a fictional language, but using them would make it more realistic.

    But terribly confusing. Like a novel where every character has the same name. And every place name also.

  2. Wow, Malale, thanks! Of course, you're right. I'll go fix it at once - chalk it up to writing my blog very late at night and trusting my computer to get those pesky characters right for me. I appreciate you correcting my error.

  3. Heh. In high school I taught Tolkien's dwarfrunes to my girlfriend in order to pass notes. :)

    Currently I'm learning the devanagari (used in Hindi and Nepali and others), where the vowels are not "inline" as in European languages, but written as inflections above and below, much as in Hebrew (and I think Arabic, not sure.)

    -- CWJ

  4. Thanks for coming by, CWJ. The devanagari sound really interesting. I remember really being amazed when I first heard of languages like Hebrew, where the word roots were made up of several consonants that had their vowels inserted between them. There's fascinating diversity in language forms throughout the world.

  5. You could scan them in, if you have a scanner. In fact, I think I might do that with mine.

  6. ये है देवनागरी लिपि का एक नमूना।

    Here is an example of the devanagari script.