Written Cantonese

Languages: Yue Chinese
Time period: Republic of China (or earlier) until now
Sister systems: Written Hokkien

Written Cantonese is the written form of Cantonese, the most complete written form of Chinese after that for Mandarin Chinese and Classical Chinese. Written Chinese was originally developed for Classical Chinese,[citation needed] and was the main literary language of China until the 19th century. Written vernacular Chinese first appeared in the 17th century and a written form of Mandarin became standard throughout China in the early 20th century.[1] While the Mandarin form can in principle be read and spoken word for word in other Chinese varieties, its intelligibility to non-Mandarin speakers is poor to incomprehensible because of differences in idioms, grammar and usage. Modern Cantonese speakers have therefore developed their own written script, sometimes creating new characters for words that either do not exist or have been lost in standard Chinese.

With the advent of computers and the standardization of character sets that cover Cantonese-specific characters, many printed materials in predominantly Cantonese-speaking areas are now written with these characters to cater to the local population.

Written Cantonese on the packaging of Hong Kong beverage brand Vitasoy


Before the 20th century, the standard written language of China was Classical Chinese, whose grammar and vocabulary are based on Old Chinese, the language used in ancient China. However, while this written standard remained essentially static for over two thousand years, the spoken language diverged further and further from it. Some writings based on local vernacular speech did exist, but these were rare. In the early 20th century, Chinese reformers like Hu Shih saw the need for language reform and championed the development of a written vernacular that would allow modern Chinese speakers to write the language the way they speak it. The vernacular language movement took hold, and the written language was standardised as vernacular Chinese. Mandarin was chosen as the basis for the new standard.

The standardisation and adoption of written Mandarin pre-empted the development and standardisation of vernaculars based on other varieties of Chinese. No matter which dialect one spoke, one still wrote in standardised Mandarin for everyday writing. However, Cantonese is unique amongst the non-Mandarin varieties in having a widely used written form. Hong Kong, which is Cantonese-speaking, was a British colony isolated from mainland China until 1997, so most Hong Kong citizens do not speak Mandarin. Written Cantonese was developed there as a means of informal communication. Still, Cantonese speakers must use standard written Chinese, or even literary Chinese, in most formal written communications, since written Cantonese may be unintelligible to speakers of other varieties of Chinese.

Historically, written Cantonese has been used in Hong Kong for legal proceedings in order to write down the exact spoken testimony of a witness, instead of paraphrasing spoken Cantonese into standard written Chinese. However, its popularity and usage have been rising in the last two decades; the late Wong Jim was one of the pioneers of its use as an effective written language. Written Cantonese has become quite popular in certain tabloids, online chat rooms, instant messaging, and even social networking websites. This has become even more evident since the rise of localism in Hong Kong in the 2010s, with localist media publishing articles written in Cantonese. Although most foreign movies and TV shows are subtitled in Standard Chinese, some, such as The Simpsons, are subtitled using written Cantonese. Newspapers have the news section written in Standard Chinese, but they may have editorials or columns that contain Cantonese discourses, and Cantonese characters are increasing in popularity on advertisements and billboards.

Written Cantonese advertising banner in Mainland China

It has been stated that written Cantonese remains limited outside Hong Kong, including other Cantonese-speaking areas in Guangdong Province.[2] However, colloquial Cantonese advertisements are sometimes seen in Guangdong, suggesting that written Cantonese is widely understood and is regarded favourably, at least in some contexts.

Some sources will use only colloquial Cantonese forms, resulting in text similar to natural speech. However, it is more common to use a mixture of colloquial forms and standard Chinese forms, some of which are alien to natural speech. Thus the resulting "hybrid" text lies on a continuum between two norms: standard Chinese and colloquial Cantonese as spoken.

Cantonese characters

Early sources

A good source of well-documented written Cantonese words is the scripts for Cantonese opera. Readings in Cantonese colloquial: being selections from books in the Cantonese vernacular with free and literal translations of the Chinese character and romanized spelling (1894) by James Dyer Ball has a bibliography of printed works available in Cantonese characters in the last decade of the nineteenth century. A few libraries have collections of so-called "wooden fish books" written in Cantonese characters. Facsimiles and plot précis of a few of these have been published in Wolfram Eberhard's Cantonese Ballads. See also Cantonese love-songs, translated with introduction and notes by Cecil Clementi (1904), or a newer translation of these by Peter T. Morris in Cantonese love songs: an English translation of Jiu Ji-yung's Cantonese songs of the early 19th century (1992). Cantonese character versions of the Bible, Pilgrim's Progress, and Peep of Day, as well as simple catechisms, were published by mission presses. The special Cantonese characters used in all of these were not standardized and show wide variation.

Characters today

Written Cantonese contains many characters not used in standard written Chinese, both to transcribe words absent from the standard lexicon and to write some words from Old Chinese whose original forms have been forgotten. The government of Hong Kong attempted to standardize this character set in the 1990s, culminating in the release of the Hong Kong Supplementary Character Set (HKSCS) for use in electronic communication. Despite this, there is still significant disagreement about which characters are correct in written Cantonese: many Cantonese words descend from Old Chinese words, but are being replaced by newly invented Cantonese words owing to the Hong Kong Government's lack of knowledge about some of the Cantonese words.


General estimates of vocabulary differences between Cantonese and Mandarin range from 30 to 50 percent.[citation needed] Donald B. Snow, the author of Cantonese as Written Language: The Growth of a Written Chinese Vernacular, wrote that "It is difficult to quantify precisely how different" the two vocabularies are.[3] Snow wrote that the different vocabulary systems are the main difference between written Mandarin and written Cantonese.[3] Ouyang Shan made a corpus-based estimate concluding that one third of the lexical items used in regular Cantonese speech do not exist in Mandarin, but that between the formal registers the differences were smaller. He analyzed a radio news broadcast and concluded that of its lexical items, 10.6% were distinctly Cantonese.[3] Here are examples of differing lexical items in a sentence:

Written Cantonese and standard written Chinese equivalents, with Cantonese Yale romanization
Gloss | Written Cantonese | Standard Written Chinese
is | 係 haih | 是 sih (Mandarin: shì)
not | 唔 m̀h | 不 bāt (Mandarin: bù)
they/them | 佢哋 kéuih-deih | 他們 tā-mùhn (Mandarin: tāmen)
(possessive marker) | 嘅 ge | 的 dīk (Mandarin: de)
Is it theirs? | 係唔係佢哋嘅? haih-m̀h-haih kéuih-deih ge? | 是不是他們的? Sih-bāt-sih tā-mùhn dīk? (Mandarin: Shì bùshì tāmen de?)

In the above table the two Chinese sentences are grammatically identical, using an A-not-A question to ask "Is it theirs?" (referring to some prior mentioned thing). But the characters are all different, though they correspond 1:1.
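The 1:1 correspondence described here can be illustrated with a small lookup. The following is only a sketch: the dictionary `CANTONESE_TO_STANDARD` and the function `to_standard` are hypothetical names, and the mapping covers just the four items in the table (係/是, 唔/不, 佢哋/他們, 嘅/的). Real text would need a far larger lexicon and would not always substitute item for item.

```python
# Hypothetical lookup built only from the four rows of the table above.
CANTONESE_TO_STANDARD = {
    "係": "是",      # haih -> "is"
    "唔": "不",      # m̀h -> "not"
    "佢哋": "他們",  # kéuih-deih -> "they/them"
    "嘅": "的",      # ge -> possessive marker
}


def to_standard(text: str) -> str:
    """Swap each known Cantonese item for its standard written counterpart.

    Longer keys are tried first so that 佢哋 is matched as one unit.
    Anything not in the lookup (punctuation, shared characters) is kept.
    """
    keys = sorted(CANTONESE_TO_STANDARD, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(CANTONESE_TO_STANDARD[key])
                i += len(key)
                break
        else:
            # No lexical item starts here; copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(to_standard("係唔係佢哋嘅?"))  # 是不是他們的?
```

Because the grammar of the two sentences is identical, a plain substitution is enough for this example; sentences whose word order or idiom differs would not convert this way.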


There are certain words that share a common root with standard written Chinese words. However, because they have diverged in pronunciation, tone, and/or meaning, they are often written using a different character. One example is the doublet 來 lòih (standard) and 嚟 lèih (Cantonese), meaning "to come." Both share the same meaning and usage, but because the colloquial pronunciation differs from the literary pronunciation, they are represented using two different characters. Some people argue that representing the colloquial pronunciation with a different (and often extremely complex) character is superfluous, and would encourage using the same character for both forms since they are cognates (see Derived characters below).

Native words

Some Cantonese words have no equivalents in Mandarin, though equivalents may exist in classical or other varieties of Chinese. Cantonese writers have from time to time reinvented or borrowed a new character if they are not aware of the original one. For example, some suggest that the common word 靚 (leng), meaning "pretty" in Cantonese but also "looking into the mirror" in Mandarin, is in fact the character .[4]

Today those characters can mainly be found in ancient rime dictionaries such as Guangyun. Some scholars have made some "archaeological" efforts to find out what the "original characters" are. Often, however, these efforts are of little use to the modern Cantonese writer, since the characters so discovered are not available in the standard character sets provided to computer users, and many have fallen out of usage.

In Southeast Asia, Cantonese speakers may adopt local Malay words into their daily speech, for example using the term 鐳 /lɵy/ for "money" instead of 錢 /tsʰiːn˨˥/, the term used in Hong Kong Cantonese.


Cantonese particles may be added to the end of a sentence or suffixed to verbs to indicate aspect. There are many such particles; here are a few.

  • 咩 – "mē" is placed at the end of a sentence to indicate disbelief, e.g., 乜你花名叫八兩金咩? Is your nickname really Raymond Lam?
  • 呢 – "nē" is placed at the end of a sentence to indicate a question,[5] e.g., 你叫咩名呢? What is your name?
  • 未 – "meih" is placed at the end of a sentence to ask if an action is done yet, e.g., 你做完未? Are you done yet?
  • 吓 – "háh" is placed after a verb to indicate a little bit, e.g., 食吓 "eat a little bit"; "há" is used on its own to show uncertainty or disbelief, e.g., 吓?乜係咁㗎? What? Is it really like that?
  • 緊 – "gán" is placed after a verb to indicate a progressive action, e.g., 我食緊蘋果。 I'm eating an apple.
  • 咗 – "jó" is placed after a verb to indicate a completed action, e.g., 我食咗蘋果。 I ate an apple.
  • 晒 – "saai" is placed after a verb to indicate an action applied to all of the targets, e.g., 我食晒啲蘋果。 I ate all the apples.
  • 埋 – "màaih" is placed after a verb to indicate an expansion of the target of action, or that the action is an addition to the one(s) previously mentioned, e.g., 我食埋啲嘢就去。 I'll go after I finish eating the rest. ("eating the rest" is an expansion of the target of action from the food eaten to the food not yet eaten); 你可以去先,我食埋嘢先去。 You can go first. I'll eat before going. (The action "eating" is an addition to the action "going" which is previously mentioned or mutually known.)
  • 哇/嘩 – "wā" is an exclamation, e.g., 嘩! Wow!
  • 㗎啦 – "ga lā" is used when the context seems to be commonplace, e.g., 個個都係咁㗎啦。 Everyone is like that.
  • 啫嘛 – "jē ma" translates as "just", e.g., 我做剩兩頁功課啫嘛。 I just have two pages of homework left to do.


Some Cantonese loanwords are written in existing Chinese characters.

Written form of Cantonese[6] | Jyutping | Cantonese pronunciation | English word | English pronunciation | Written form of Mandarin
巴士 | baa1 si2 | /paː˥ɕiː˧˥/ | bus | /bʌs/ | 公車 (Taiwan); 公共汽車、公交车 (Mainland China)
的士 | dik1 si2 | /tɪk˥ɕiː˧˥/ | taxi | /ˈtæksi/ | 計程車 (Taiwan); 出租車 (Mainland China); 德士 (Singapore/Malaysia)
多士 | do1 si6 | /tɔ́ːɕìː/ | toast | /ˈtɘʊst/ | 吐司
朱古力 | zyu1 gu1 lik1 | /tɕyː˥kuː˥lɪk˥/ | chocolate | /ˈtʃɒklɪt/ | 巧克力
三文治 | saam1 man4 zi6 | /saːm˥mɐn˨˩tɕiː˨/ | sandwich | /ˈsænwɪdʒ/ | 三明治
士多 | si6 do1 | /ɕiː˨tɔː˥/ | store | /stɔː/ | 商店
士巴拿 | si6 baa1 naa2 | /ɕìːpáːnǎː/ | spanner (wrench) | /ˈspæn.ə(ɹ)/ | 扳手
士多啤梨 | si6 do1 be1 lei2 | /ɕiː˨tɔː˥pɛː˥lei˧˥/ | strawberry | /ˈstrɔːbəri/ | 草莓
啤梨 | be1 lei2 | /pɛː˥lei˧˥/ | pear | /peər/ | 梨子
沙士 | saa1 si6 | /saː˥ɕiː˨/ | SARS | /sɑːz/ | 嚴重急性呼吸道症候群; 非典 (Mainland China)
拜拜 | baai1 baai3 | /paːi˥paːi˧/ | bye bye | /ˈbaɪbaɪ/ | 再見
BB | bi4 bi1 | /piː˨˩piː˥/ | baby | /ˈbeɪbi/ | 嬰兒
菲林 | fei1 lam2 | /fei˥lɐm˧˥/ | film | /fɪlm/ | 膠卷
菲屎 | fei1 si2 | /fei˥ɕiː˧˥/ | face (reputation) | /feɪs/ | 面子
三文魚 | saam1 man4 jyu4 | /saːm˥mɐn˨˩jyː˨˩/ | salmon | /ˈsæmən/ | 鮭魚
沙律 | saa1 leot6 | /sáːlɵ̀t̚/ | salad | /ˈsæləd/ | 沙拉
呔 | taai1 | /tʰáːi/ | 1. tire; 2. tie | 1. /ˈtaɪ̯ə/; 2. /taɪ/ | 1. 輪胎; 2. 領帶
褒呔 | bou1 taai1 | /póutʰáːi/ | bowtie | /bəʊˈtaɪ/ | 蝴蝶型領結
飛 | fei1 | /féi/ | fee (ticket) | /fiː/ |
波 | bo1 | /pɔ́ː/ | ball | /bɔːl/ |
哈囉 | haa1 lou3 | /háːlōu/ | hello | /həˈləʊ/ | 您好
迷你 | mai4 nei2 | [mɐ̏i.něi] | mini | /ˈmɪni/ |
摩登 | mo1 dang1 | /mɔ́ːtɐ́ŋ/ | modern | /ˈmɒdən/ | 時尚、現代
肥佬 | fei4 lou2 | [fȅilǒu] | fail | /feɪl/ | 不合格
咖啡 | gaa3 fe1 | /kāːfɛ́ː/ | coffee | /ˈkɒfi/ | 咖啡
OK | ou1 kei1 | /ʔóukʰéi/ | okay | /ˌəʊˈkeɪ/ | 可以
咭 | kaak1 | /kʰáːk̚/ | card | /kɑːd/ |
啤牌 | pe1 paai2 | /pʰɛ́ː pʰǎːi/ | poker | /ˈpəʊkə/ | 樸克
基 | gei1 | /kéi/ | gay | /ɡeɪ/ | 同性戀
(蛋)撻 | (daan6) taat1 | (/tàːn/) /tʰáːt̚/ | (egg) tart | /tɑːt/ | (蛋)塔
可樂 | ho2 lok6 | /hɔ̌ː.lɔ̀ːk̚/ | cola | /ˈkəʊ.lə/ | 可樂
檸檬 | ning4 mung1 | [nȅŋméŋ] | lemon | /ˈlɛmən/ | 檸檬
扑成 | buk1 sing4 | [pók̚.sȅŋ] | boxing | /ˈbɒksɪŋ/ | 拳擊
刁時 | diu1 si2 | [tíːu.sǐː] | deuce (before the final game of tennis) | | 平分
干邑 | gon1 jap1 | [kɔ́ːn.jɐ́p̚] | cognac | | 法國白蘭地酒
沙展 | saa1 zin2 | [sáː.tsǐːn] | sergeant | | 警長
士碌架 | si3 luk1 gaa2 | [sīːlók̚.kǎː] | snooker | | 彩色檯球
士撻(打) | si3 taat1 (daa2) | [sīː.tʰáːt̚ tǎː] | starter | | 啟輝器
士啤 | si3 be1 | [sīː.pɛ́ː] | spare | | 後備,備用
士啤呔 | si3 be1 taai1 | [sīː.pɛ́ː tʰáːi] | spare tire | | 備用輪胎 (often used to describe people with waist and abdomen fat)
士的 | si3 dik1 | [sīː.ték̚] | stick | | 手杖,拐杖
士多房 | si3 do1 fong4 | [sīː.tɔ́ː fɔ̏ːŋ] | storeroom | | 貯藏室
山埃 | saan1 aai1 | [sáːn ʔáːi] | cyanide | | 氰化物
叉(電) | caa1 (din3) | [tsʰáː.tīːn] | (to) charge | | 充電
六式碼 | luk3 sik1 maa2 | [lōk̚.sék̚ mǎː] | Six Sigma | | 六西格瑪
天拿水 | tin1 naa4 seoi2 | [tʰíːnnȁː sɵ̌y] | (paint) thinner | | 稀釋劑,溶劑
比高 | bei2 gou1 | [pěikóu] | bagel | | 過水麵包圈 (Mainland China); 貝果 (Taiwan)
比堅尼 | bei2 gin1 nei4 | [pěikíːnnȅi] | bikini | | 比基尼泳裝
巴士德消毒 | baa1 si1 dak1 siu1 duk6 | /páː.sí tɐ́k̚.siːú.tʊ̀k̚/ | pasteurized | | 用巴氏法消毒過的
巴打 | baa1 daa2 | [páː.tǎː] | brother | | 兄弟
巴黎帽 | baa1 lai4 mou2 | [páːlɐ̏imǒu] | beret | | 貝雷帽
巴仙 | baa1 sin1 / pat6 sen1 | [páːsíːn] / /pʰɐ̀t̚.sɛ́ːn/ | percent | | 百分之
古龍水 | gu2 lung4 seoi2 | [kǔː.lȍŋ sɵ̌y] | cologne | | 科隆香水 (Mainland China)
布冧 | bou3 lam1 | [pōulɐ́m] | plum | | 洋李,李子,梅
布甸 | bou3 din1 | [pōu.tíːn] | pudding | | 布丁
打令 | daa1 ling2 | [táː.lěŋ] | darling | | 心愛的人
打比(打吡) | daa2 bei2 | [tǎː.pěi] | derby | | 德比賽馬
卡 | kaa1 | [kʰáː] | car | | (火車)車廂
卡式機 | kaa1 sik1 gei1 | [kʰáː.sék̚ kéi] | cassette | | 盒式錄音機
卡士 | kaa1 si2 | [kʰáː.sǐː] | 1. cast; 2. class | | 1. 演員陣容; 2. 檔次,等級;上品,高檔,有品味
卡通 | kaa1 tung1 | [kʰáː.tʰóŋ] | cartoon | | 動畫片,漫畫
卡巴 | kaa1 baa1 | [kʰáː.páː] | kebab | | 烤腌肉串
甲巴甸 | gaap3 baa1 din1 | [kāːp̚.páː.tíːn] | gabardine | | 華達呢
 | le1 | [lɛ́ː] | level | | 級,級別
叻㗎 | lek1 gaa4 | [lɛ́ːk̚.kȁː] | lacquer | | 清漆
仙 | sin1 | [síːn] | cent | |
他菲亞酒 | taa1 fei1 aa3 zau2 | [tʰáː.féi ʔāː.tsɐ̌u] | tafia | | 塔非亞酒
冬甩 | dung1 lat1 | [tóŋ.lɐ́t̚] | doughnut | | 炸麵餅圈 (Mainland China)
奶昔 | naai2 sik1 | [nǎːi.sék̚] | milkshake | | 牛奶冰淇淋
安士 | on1 si2 | [ʔɔ́ːnsǐː] | ounce | | 盎司,英兩,啢
安哥 | on1 go1 | /ʔɛ́ːn.kʰɔ́/ | encore | | 再來一個,再演奏(Song)一次

Cantonese character formation

Cantonese characters, as with regular Chinese characters, are formed in one of several ways:


Some characters already exist in standard Chinese, but are simply reborrowed into Cantonese with new meanings. Most of these tend to be archaic or rarely used characters. An example is the character 子, which means "child". The Cantonese word for child is represented by 仔(jai), which has the original meaning of "young animal".

Marked phonetic loans

Many characters used in Cantonese writing are formed by putting a mouth radical (口) on the left-hand side of another better-known character, usually a standard Chinese character (e.g. 㗎, formed from 架). This indicates that the new character sounds like the standard character, but is only used phonetically in the Cantonese context. (An exception is 咩, which does not sound like 羊 (sheep), but sounds like the sound that sheep make.) The characters which are commonly used in Cantonese writing include:

Character | Romanization | Notes | Standard Chinese equivalent
 | gaa | function word |
吓 | háah/háa | function word |
 | yaa/yaah | function word |
呃 | āak | v. cheat, hoax |
噉 | gám | function word: like this, e.g., 噉就死喇 | 這樣
咁 | gam | function word: like this, e.g., 咁大件 | 這麼
咗 | jó | function word; indicates past tense |
咩 | mē | function word; also a contraction of 乜嘢 |
嗮 | saai | function word; indicates completion, e.g., 搬嗮 "moved all, finished moving" |
哋 | deih | function word; indicates plural form of a pronoun |
呢 | nī/nēi | adv. this, these |
唔 | m̀h | adv. not, no, cannot; originally a function word |
 | lāang | function word |
啱 | āam[7] | adv. just, nearly |
啱 | āam | adv. correct, suitable |
啲 | dī/dīt | genitive, similar to 's but pluralizing, e.g., 呢個 "this" → 呢啲 "these"; 快點 = 快啲 = "hurry!" |
喐 | yūk | v. to move |
喺 | hái | prep. at, in, during (time); at, in (place) |
嗰 | gó | adv. that, those |
嘅 | ge | genitive, similar to 's; sometimes function word |
嘜 | māk | n. mark, trademark; transliteration of "mark" |
 | laak | function word |
喇 | laa | function word |
嘢 | yéh | n. thing, stuff | 東西, 事物
嘥 | sāai | v. to waste | 浪費
嚟 | lèih/làih | v. to come; sometimes function word |
 | háaih | function word |
嚿 | gauh | function word: a piece of |
 | lō/lo | function word |
唞 | táu | v. to rest |
喊 | haam | v. to cry |
咪 | maih/máih | v. not be; contraction of 唔係 m̀h haih, used following 係 in yes-no questions; also other uses |
 | | final particle expressing consent and denial, liveliness and irritation, etc. |

There is evidence that the mouth radical in such characters can, over time, be replaced by a signific, which indicates the meaning of the character. The new character is then a semantic compound. For instance, 冧 (lām, "bud"), written with the signific 冖 ("cover"), is instead written in older dictionaries as 啉, with the mouth radical.

Derived characters

Other common characters are unique to Cantonese or are different from their Mandarin usage, including: 乜, 冇, 仔, 佢, 佬, 俾, 靚 etc. The characters which are commonly used in Cantonese writing include:

  • 冇 móuh (v. not have). Originally 無. Standard written Mandarin: 沒有
  • 係 haih (v. be). Standard written Mandarin: 是
  • 佢 kéuih (pron. he/she/it). Originally 渠. Standard written Mandarin: 他, 她, 它, 牠, 祂
  • 乜 māt (pron. what) often followed by 嘢 to form 乜嘢. Originally 物也. Standard written Mandarin: 什麼
  • 仔 jái (n. son, child, small thing). Originally 子.
  • 佬 lóu (n. guy, dude). Originally .[8]
  • 畀/俾 béi (v. give). Standard written Mandarin: 給
  • 靚 leng (adj. pretty, handsome). Standard written Mandarin: 漂亮
  • 晒/曬 saai (adv. completely; v. bask in sun)
  • 瞓 fan (v. sleep). Originally . Standard written Mandarin: ,
  • 攞/拎 ló/ling (v. take, get). Standard written Mandarin: 拿
  • 脷 leih (n. tongue). Standard written Mandarin: 舌
  • 攰 guih (adj. tired). Standard written Mandarin: 累
  • 埞 dehng (n. place) often followed by 方 to form 埞方. Standard written Mandarin: 地方

The words represented by these characters are sometimes cognates with pre-existing Chinese words. However, their colloquial Cantonese pronunciations have diverged from formal Cantonese pronunciations. For example, 無 ("without") is normally pronounced mòuh in literature. In spoken Cantonese, 冇 (móuh) has the same usage, meaning, and pronunciation as 無, except for tone. 冇 represents the spoken Cantonese form of the word "without", while 無 represents the word used in Classical Chinese and Mandarin. However, 無 is still used in some instances in spoken Cantonese, such as 無論如何 ("no matter what happens"). Another example is the doublet 來/嚟, which means "come". 來 (lòih) is used in literature; 嚟 (lèih) is the spoken Cantonese form.


Though most Cantonese words can be found in the current encoding system, input workarounds are commonly used by those not familiar with them. Some Cantonese writers use simple romanization (e.g., D for 啲), symbols (a Latin letter "o" placed in front of another Chinese character; e.g., 㗎 is defined in Unicode but will not display if a supporting font is not installed on the device in use, hence the proxy o架 is often used), homophones (e.g., 果 for 嗰), and Chinese characters which have different meanings in Mandarin (e.g., 乜, 係, 俾). For example:

Character | 你喺嗰喥好喇, 千祈咪搞佢啲嘢。
Substitution | 你o係果度好la, 千祈咪搞佢D野。
Gloss | you being there good (final particle), thousand pray don't mess with he/she (genitive particle) things/stuff.
Translation | You'd better stay there, and under no circumstances mess with his/her stuff.
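Some of the workarounds described above can be undone mechanically. The sketch below is illustrative only: `O_PREFIX_PROXIES` and `normalize` are hypothetical names, and the table holds just the proxies that appear in this section (o架 for 㗎, o係 for 喺, and D for 啲). It deliberately leaves homophone substitutions such as 果 for 嗰 alone, since 果 is also an ordinary character in its own right and cannot be replaced blindly.

```python
import re

# Hypothetical proxy table: Latin "o" stands in for the mouth radical 口.
# Only pairs that appear in this section are listed.
O_PREFIX_PROXIES = {
    "o架": "㗎",
    "o係": "喺",
}


def normalize(text: str) -> str:
    """Rewrite a few common input workarounds as the intended characters."""
    for proxy, char in O_PREFIX_PROXIES.items():
        text = text.replace(proxy, char)
    # A lone "D" stands for 啲; a "D" inside a longer Latin word is kept.
    text = re.sub(r"(?<![A-Za-z])D(?![A-Za-z])", "啲", text)
    return text


print(normalize("搞佢D野"))   # 搞佢啲野
print(normalize("係咁o架"))   # 係咁㗎
```

Handling the remaining substitutions (romanized particles such as "la", or homophones such as 野 for 嘢) would need context or a word list, which is beyond what a fixed replacement table can do.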

  • Snow, Donald B. Cantonese as Written Language: The Growth of a Written Chinese Vernacular. Hong Kong University Press, 2004. ISBN 962209709X, 9789622097094.


  1. Mair, Victor. "How to Forget Your Mother Tongue and Remember Your National Language".
  2. e.g., (Snow, 2004)
  3. Snow, Cantonese as Written Language: The Growth of a Written Chinese Vernacular, p. 49.
  4. cantonese.org.cn
  5. ctcfl.ox.ac.uk
  6. A list compiled by lbsun
  7. Wikipedia:粵語本字表 - 維基百科,自由嘅百科全書
  8. Zhifu Yu. 粵講粵過癮[100601][細路]. Foshan TV. Retrieved 3 September 2013.

Further reading

  • Cheung, Kwan-hin 張系顯; Bauer, Robert S. (2002). The Representation of Cantonese with Chinese Characters. Journal of Chinese Linguistics Monograph Series. 18. Chinese University Press. JSTOR 23826027. OCLC 695438049.
  • Li, David C.S. (2000). "Phonetic Borrowing: Key to the vitality of written Cantonese in Hong Kong". Written Language & Literacy. 3 (2): 199–233. doi:10.1075/wll.3.2.02li.
  • Snow, Donald (1991). Written Cantonese and the culture of Hong Kong: the growth of a dialect literature (PhD thesis). Indiana University. OCLC 1070381666.
  • ——— (1993). "A short history of published Cantonese: what is dialect literature?". Journal of Asian Pacific Communication. 4 (3): 127–148. ISSN 0957-6851. OCLC 43573899.
  • ——— (2004). Cantonese as Written Language: The Growth of a Written Chinese Vernacular. Hong Kong University Press. ISBN 978-962-209-709-4.
