Tuesday, August 5, 2008

Writing your language down

Bill Moonroe over at the Analog forum asked me to talk about writing systems, so I thought I'd do a bit of that tonight, starting with the linguistic characteristics of writing systems, moving through a few real world examples I'm familiar with, and finally taking a look at fitting writing systems to a created language and writing technologies.  It's quite a list, so here goes.

Some peoples don't write their language down at all.  Those who do tend to use one (or more!)  of the following three strategies.

1.  An alphabet.  The symbols of an alphabetic writing system are intended to depict the sounds of a language.  Alphabets generally start out as systems with roughly one-to-one correspondence between sounds (phonemes) and symbols - but anyone who has struggled with English spelling knows that this correspondence isn't always clean.  This is due primarily to two factors:  first, the fact that language sounds change more quickly than written spellings, and second, the fact that languages borrow words from other languages that may not be easily rendered in the alphabet (but must be rendered somehow!).

2.  A syllabary.  The symbols of a syllabic writing system are intended to depict chunks of sounds, usually the syllables of a language (though in the case of Japanese, the unit of sound that corresponds to a character, the mora, can actually be less than one syllable).  What I said about language change applies here too, but at least in the case of Japanese, there has been official reform of the syllabary to try to bring "spelling" more into line with sound.

3. A set of pictographs or ideographs.  The symbols of an ideographic writing system are intended to depict units of meaning rather than units of sound.  Chinese is the classic world-language example of such a system, where there is a character for "I" and another for "you," etc.  Complex concepts can be depicted in such a system by putting two characters together, such as "electricity" and "talk" for "telephone."  And in this case, since no correspondence between sound and symbol is expected, changes in sound and changes in the character system occur independently.

On to examples.  Alphabets that I know about include the Roman alphabet used in different variations for English, French, Indonesian, Dutch, and many others; also the Greek alphabet, the Cyrillic alphabet, the Hebrew alphabet, etc.  Feel free to comment listing any others you know about, and if anyone has further questions about alphabets do let me know.  But for now I'll assume this is a type of writing system anyone reading my blog has to be pretty familiar with.

The syllabaries I know best are those of Korean (Hangul) and Japanese (Hiragana and Katakana).  Hangul is actually more properly representative of syllables than either of the kana systems.  Interestingly, each character is made up of subparts that represent sounds - but they're arranged as parts of a single more complex character.  Korean symbols can show either open syllables like "a" and "ka," or closed syllables like "kan", just by including one, two, or three sound parts in a single character.  The Japanese kana symbols represent only open syllables like "a" or "ka" and have two separate symbols that are used for closing syllables (one doubles the following consonant, and the other is roughly "N").  The reason there are two types of kana has nothing to do with sound, and everything to do with function; hiragana is used for core Japanese vocabulary, and katakana for foreign-derived words.
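For anyone who likes to see the "subparts in a single character" idea spelled out, here is a small Python sketch.  It isn't anything from Korean schooling or typesetting - it just leans on the way the Unicode standard numbers its precomposed Hangul syllables, where each block is computed from an initial consonant, a vowel, and an optional final consonant.  The names `LEADS`, `VOWELS`, `TAILS`, and `compose` are my own, made up for illustration.  One thing to note: in this scheme a vowel-only syllable like "a" is still written with a silent placeholder consonant in the initial slot.

```python
# Hangul sound parts (jamo), listed in the order Unicode uses for its syllable blocks.
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"        # 19 initial consonants
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"    # 21 vowels
TAILS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
         "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
         "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]  # 27 finals, plus "" for none

HANGUL_BASE = 0xAC00  # code point of the first precomposed syllable block


def compose(lead, vowel, tail=""):
    """Combine sound parts into one syllable block (hypothetical helper)."""
    index = (LEADS.index(lead) * len(VOWELS) + VOWELS.index(vowel)) * len(TAILS)
    return chr(HANGUL_BASE + index + TAILS.index(tail))


print(compose("ㅇ", "ㅏ"))        # "a":   silent placeholder + vowel
print(compose("ㅋ", "ㅏ"))        # "ka":  consonant + vowel (open syllable)
print(compose("ㅋ", "ㅏ", "ㄴ"))  # "kan": consonant + vowel + final (closed syllable)
```

The point of the sketch is only that a closed syllable costs nothing extra in symbols: the same handful of parts gets stacked into a new block, which is quite different from kana, where each symbol is its own indivisible unit.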

In the realm of ideographs I'll look at the Chinese symbols, because they are used by the Chinese, the Koreans, and the Japanese.  Many of these symbols began as pictographs, or picture-symbols of recognizable objects, and then became abstracted and complicated over time.  In much the same way as Hangul, they have recognizable subparts that can be recombined - but all these subparts are meaning-based, and none sound-based.  In a language with the non-conjugating structure of Chinese, such a system can be used alone.  But in Korean and Japanese, both of which have conjugations and small function words, they can be extremely inconvenient.  This is where the "one or more systems" part comes in.  Korean uses both Hangul and Chinese characters, while Japanese uses both kana systems and Chinese characters - all mixed together by function.

Okay, now for created languages.  Most created languages I have seen use alphabets, but if you're going to put your language in written form, I'd encourage you to think through three things.  

1.  While you're free to pick any type of system you want, it's helpful to consider language structure, as I mentioned for Chinese above, in choosing which system to use (unless you want to design more than one!).

2.  If you're going for an alphabet, you'll get a much more alien or world-local feel if you work directly with the sound system of your language, assigning symbols directly to sounds rather than using a code that corresponds roughly to the Roman alphabet.

3.  Consider writing technologies when you design your symbols.  Also known as, not everybody uses pencils!  Cuneiform was written with a reed on clay; runes were scratched on stone and wood; Chinese and Japanese began with brushes of bamboo and horsehair.  People will first begin to write with the materials available to them, on the materials available to them, and this will have an enormous influence on the appearance of the symbols.  I challenge anyone with a bioluminescent species to think about how that species would first begin to make recordings of its language (Wow, that's tough - and cool!).  Real world writing systems generally need three things: the symbols can be written relatively quickly, they have a systematic design built from shared parts, and they are easily differentiated from one another.

That's it for now!  I'll try to come back to worldbuilding tomorrow.


  1. Another thing to consider is the transition from ideograms to a syllabary or alphabet using the rebus principle, in which pictograms stand for sounds: a picture of a bee and a leaf together becomes the word "belief," which has nothing to do with either a bee or a leaf meaning-wise. Also, a lot of early scripts, like those for Hebrew and Egyptian, didn't write vowels. Also, some of the earlier experiments in writing, like Egyptian and Mayan, tended to mix syllabaries with a few pictograms. Another important thing to consider in designing a syllabary is exactly how many syllables a language has. English has a lot, hundreds, which means hundreds of symbols. Egyptian is that way, with hundreds of symbols. Japanese, at least when you're just dealing with hiragana and katakana rather than kanji, has far fewer symbols, though. Also, what constitutes a syllable might be a little different from one language to another.

  2. Another thing that's interesting to think about on this subject is that alphabets, syllabaries, and ideographs seem to use different parts of the brain. Some people with brain damage of one kind or another have lost the ability to process alphabets, for example, yet still retained the ability to read something like kanji. It's kind of scary, actually. I've thought it might just be possible to create a virus which wipes out the ability to read alphabets and syllabaries yet leaves intact the ability to read something like traditional Chinese. What would happen as the English-speaking world tries to create an ideogram-based writing system of its own to restore literacy to its people? The only reason I haven't written this story is that I'm just not interested in turning China, which would probably be the prime suspect if a virus started wiping out literacy in the Western world, into the evil empire.

    Syllabaries also tend to be the easiest to learn to read, as long as it's a language without a massive number of syllables. It's just far easier to put syllables together to form words than to go "d-o-g dog?"

  3. Thanks for your comments, Byron. Actually, when the Japanese first adapted the kanji system from the Chinese, they used some of the characters for their meaning value and others for their sound. As you might guess, this was very confusing. So at some point, someone (some say it was Kobodaishi) invented the kana system(s) by simplifying the kanji that had been used for the sounds.

    The processing of alphabets, syllabaries, and ideographs in the brain is definitely different. It always amazes me that nobody ever worried about how many characters would be necessary to make ideographic systems work. Though Japanese might seem simpler because it has the kana systems, it actually used to use thousands of kanji in addition to those. At this point, the law requires that newspapers stick to a "short" list of about 2000.