Character Structure

How Characters Are Built.

Chinese characters are not arbitrary drawings. Every character can be classified into one of six formation types — and once you understand the dominant type, vocabulary acquisition changes fundamentally.

~82%

Phono-Semantic

One part = meaning, one part = sound hint

~13%

Compound Ideographic

Two or more meaning components combined

~4%

Pictographic

Visual representation of the concept

~1%

Indicative

Abstract marks indicating position or quantity

Of the ~13,000 standard Traditional Chinese characters, over four-fifths follow the phono-semantic pattern — meaning a single learnable strategy applies to most of what you will encounter.

六書

The Six Formation Types

01 ~4%

xiàngxíng

Pictographic

象形

Characters that visually represent their meaning. The oldest and most intuitive type — the origin of the writing system.

rì

sun

A circle with a central dot — the sun as it appears in the sky

yuè

moon

A crescent shape — the moon's most common visible form

shān

mountain

Three peaks rising from a base line

mù

tree / wood

A trunk with branches above and roots below

shuǐ

water

A flowing central current with flanking streams

huǒ

fire

Flames rising from a base

Pure pictographs are rare in modern Chinese — only about 4% of characters. But they form the building blocks from which all other types are constructed.

02 ~1%

zhǐshì

Indicative

指事

Characters that represent abstract concepts through position, number, or symbolic marks. Where pictographs draw objects, indicatives signal ideas.

shàng

above / up

A mark above a horizontal baseline

xià

below / down

A mark below a horizontal baseline

yī

one

A single horizontal stroke

èr

two

Two horizontal strokes

sān

three

Three horizontal strokes

běn

root / origin

木 (tree) with a mark at the base — pointing to the root

Indicatives are the smallest category. Their logic is transparent once explained, and they often appear as components within more complex characters.

03 ~13%

huìyì

Compound Ideographic

會意

Characters formed by combining two or more meaningful components whose combination creates a new meaning. The logic is often poetic.

xiū

to rest

人 (person) + 木 (tree) — a person leaning against a tree

míng

bright / clear

日 (sun) + 月 (moon) — the two brightest objects together

sēn

forest

木 + 木 + 木 (three trees): many trees together

hǎo

good

女 (woman) + 子 (child) — a classic (if historically loaded) compound

nán

man / male

田 (field) + 力 (strength) — one who works the fields

xìn

trust / letter

人 (person) + 言 (speech) — a person's word

Not all compound ideographs have transparent logic today — some meanings have shifted over millennia. But many remain intuitively readable once the components are known.

04 ~82%

xíngshēng

Phono-Semantic

形聲

Characters combining a semantic component (indicating meaning category) with a phonetic component (hinting at pronunciation). This is the dominant formation type — the key that unlocks efficient character learning.

qīng

clear / clean

氵(water radical) + 青 (phonetic qīng) — something clean and water-related

qǐng

please / to request

訁(speech radical) + 青 (phonetic) — a speech-related action, pronounced like qīng

qíng

emotion / feeling

忄(heart radical) + 青 (phonetic) — something felt in the heart

qíng

sunny / clear weather

日 (sun radical) + 青 (phonetic) — sunny, bright weather

jīng

eye / pupil

目 (eye radical) + 青 (phonetic) — part of the eye

jīng

essence / refined

米 (rice/grain radical) + 青 (phonetic) — refined, pure grain

Learning one phonetic component like 青 (qīng) gives you a pronunciation hint for the characters that contain it. Tones and initials have shifted over the centuries (qīng, qíng, qǐng, jīng), but the shared final -ing keeps the relationship recognisable.
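As a minimal sketch, the six 青 examples above could be recorded as meaning-plus-sound pairs. The `Compound` class, its field names, and the `describe` helper are assumptions introduced here for illustration; they do not come from any dictionary format or library. The full characters are filled in from the component breakdowns given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Hypothetical record for one phono-semantic character."""
    char: str      # the full character
    pinyin: str    # modern Mandarin reading
    semantic: str  # component hinting at the meaning category
    phonetic: str  # component hinting at the sound
    gloss: str     # English gloss from the examples above


QING_SERIES = [
    Compound("清", "qīng", "氵", "青", "clear / clean"),
    Compound("請", "qǐng", "訁", "青", "please / to request"),
    Compound("情", "qíng", "忄", "青", "emotion / feeling"),
    Compound("晴", "qíng", "日", "青", "sunny / clear weather"),
    Compound("睛", "jīng", "目", "青", "eye / pupil"),
    Compound("精", "jīng", "米", "青", "essence / refined"),
]


def describe(c: Compound) -> str:
    """Spell out which part carries meaning and which carries sound."""
    return (f"{c.char} ({c.pinyin}) = {c.semantic} [meaning] "
            f"+ {c.phonetic} [sound]: {c.gloss}")


for c in QING_SERIES:
    print(describe(c))
```

Keeping the semantic and phonetic parts in separate fields means the same records can later be grouped either way: by radical for meaning families, or by phonetic for sound families.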

05 Rare

zhuǎnzhù

Derivative Cognates

轉注

Characters with shared origins that have diverged in form and/or meaning over time. The most theoretically debated category — scholars disagree on which characters qualify.

kǎo

to examine / to test

Originally related to 老 (old) — both derived from the same ancient form

lǎo

old / elderly

The source from which 考 diverged — both once represented 'old person'

Derivative cognates are primarily of interest to scholars of historical linguistics. For practical learners, awareness that some character pairs share ancient origins is sufficient.

06 Occasional

jiǎjiè

Phonetic Loan

假借

Characters borrowed for their sound to write a different, similar-sounding word, typically an abstract concept for which no character yet existed. The original meaning may be displaced entirely.

lái

to come

Originally meant 'wheat' (a pictograph of wheat stalks) — borrowed for the similar-sounding word 'to come'

wǒ

I / me

Originally depicted a weapon — borrowed for the first-person pronoun, which had no character

běi

north

Originally depicted two people back-to-back — borrowed for 'north' based on sound

Phonetic loans explain why some characters seem to have no logical connection between their form and their meaning — the original meaning was something else entirely.

The Practical Payoff

Phonetic Series: One Component, Many Characters

When you encounter a new phono-semantic character, the phonetic component is a pronunciation hint. Learning common phonetic components gives you a head start on dozens of characters at once.

Phonetic Component

青 (qīng)

清 qīng: clear
請 qǐng: to request
情 qíng: emotion
晴 qíng: sunny
靜 jìng: quiet
精 jīng: essence
睛 jīng: eye/pupil

Phonetic Component

工 (gōng)

江 jiāng: river
空 kōng: empty/sky
功 gōng: achievement
攻 gōng: to attack
紅 hóng: red
鴻 hóng: swan/great

Phonetic Component

方 (fāng)

房 fáng: room/house
放 fàng: to release
防 fáng: to defend
訪 fǎng: to visit
仿 fǎng: to imitate

Phonetic Component

各 (gè)

格 gé: standard/grid
客 kè: guest
路 lù: road
落 luò: to fall
絡 luò: to connect
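To make "pronunciation hint" concrete, the four series above can be checked mechanically. The sketch below strips tone marks, splits each syllable into initial and final, and reports how closely each character's reading matches its phonetic component. The function names, the three labels, and the `SERIES` table are assumptions made for this example, and the full characters are filled in from the readings and glosses listed above.

```python
import unicodedata
from collections import Counter

# Combining marks for the four Mandarin tones: macron, acute, caron, grave.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}

# Pinyin initials; two-letter initials come first so "zh" wins over "z".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def strip_tone(pinyin: str) -> str:
    """Remove tone diacritics but keep other marks such as the dots on ü."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def split_syllable(pinyin: str) -> tuple[str, str]:
    """Split a toneless syllable into (initial, final)."""
    syllable = strip_tone(pinyin)
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    return "", syllable  # zero-initial syllable such as "an"


def classify(phonetic_reading: str, char_reading: str) -> str:
    """Compare a character's reading with its phonetic component's reading."""
    p_initial, p_final = split_syllable(phonetic_reading)
    c_initial, c_final = split_syllable(char_reading)
    if (p_initial, p_final) == (c_initial, c_final):
        return "same syllable"   # differs at most in tone
    if p_final == c_final:
        return "same final"      # rhymes, but the initial has shifted
    return "different"


# The four series listed above, keyed by (component, reading).
SERIES = {
    ("青", "qīng"): [("清", "qīng"), ("請", "qǐng"), ("情", "qíng"),
                     ("晴", "qíng"), ("靜", "jìng"), ("精", "jīng"),
                     ("睛", "jīng")],
    ("工", "gōng"): [("江", "jiāng"), ("空", "kōng"), ("功", "gōng"),
                     ("攻", "gōng"), ("紅", "hóng"), ("鴻", "hóng")],
    ("方", "fāng"): [("房", "fáng"), ("放", "fàng"), ("防", "fáng"),
                     ("訪", "fǎng"), ("仿", "fǎng")],
    ("各", "gè"): [("格", "gé"), ("客", "kè"), ("路", "lù"),
                   ("落", "luò"), ("絡", "luò")],
}

tally = Counter()
for (component, base), members in SERIES.items():
    for char, reading in members:
        verdict = classify(base, reading)
        tally[verdict] += 1
        print(f"{component} {base} -> {char} {reading}: {verdict}")

print(dict(tally))
```

On these 23 examples the sketch counts 12 same-syllable matches, 7 that share only the final, and 4 that differ. That describes this hand-picked sample only and is not a statistic about the language as a whole.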

Frequently Asked Questions

Do I need to memorise the six formation types?

Not as an exam subject, but understanding them transforms how you approach unfamiliar characters. The most practically important insight is that about 82% of characters are phono-semantic: one part signals meaning, one part signals sound. Once you can identify radicals and phonetic components, you can make intelligent predictions about characters you have never studied.

Do phonetic components still predict modern pronunciation reliably?

Partially — and decreasingly so. Phonetic components were accurate when they were created, but centuries of sound change (vowel shifts, tone mergers, regional divergence) have degraded the correspondence. In modern Mandarin, roughly 66% of phono-semantic characters have phonetic components that still predict the syllable fairly reliably. Another 20% give partial hints. The remaining 14% are misleading or opaque. This makes phonetic components a useful prediction tool, not a reliable rule.
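The figures above can be combined with the ~82% phono-semantic share to estimate what they mean across all characters. The sketch below simply multiplies the percentages quoted in this article. It assumes the reliability split applies evenly across phono-semantic characters and that the other formation types give no sound hint at all; the constant and function names are illustrative.

```python
# Approximate figures quoted in this article.
PHONO_SEMANTIC_SHARE = 0.82
RELIABILITY = {"reliable": 0.66, "partial": 0.20, "misleading": 0.14}


def share_of_all_characters(reliability_share: float) -> float:
    """Scale a share of phono-semantic characters to a share of all characters."""
    return PHONO_SEMANTIC_SHARE * reliability_share


overall = {label: share_of_all_characters(p) for label, p in RELIABILITY.items()}

for label, share in overall.items():
    print(f"{label}: {share:.1%} of all characters")
```

Under those assumptions, a fairly reliable sound hint is available for a little over half of all characters (0.82 × 0.66 ≈ 0.54), which fits the description of the phonetic component as a prediction tool rather than a rule.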

Does Simplified Chinese preserve the phono-semantic structure?

Sometimes yes, sometimes no. Many simplifications keep the radical-plus-phonetic structure and only reduce the stroke count. Others replace the entire character with a phonetic loan or a merged form that eliminates the original components. For example, 聽 (tīng, to listen) is a phono-semantic compound in Traditional with the 耳 (ear) radical; Simplified replaced it with 听, which was originally an unrelated character. Traditional forms therefore tend to preserve the semantic logic better than Simplified ones.

Why are there only a few pictographic characters if that's where writing started?

Pictographs were adequate for concrete nouns — sun, moon, tree, water — but limited for abstract concepts, actions, and grammatical particles. Language expands; pictographic writing doesn't scale. The other five formation types emerged as solutions to this problem. By the Han dynasty, the system had become predominantly phono-semantic, which is both more flexible and more compact. The ~4% of characters that remain pure pictographs are mostly the ones that started the whole enterprise — the simplest, most universal objects.