The following four lines are from the Dream of the Red Chamber (紅樓夢), a famous and important 18th-century Chinese novel. Each line contains one or more characters (two in the first line) from CJK Unified Ideographs Extension B, highlighted in red. These characters lie outside the Basic Multilingual Plane, so they are encoded as surrogate pairs in UTF-16 and as four-byte sequences in UTF-8. The table below shows an image of the text on the left and the actual text, encoded in UTF-8, on the right. Some browsers will fail on these characters, though they fail in different ways.
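As a rough illustration of why these characters are harder for software to handle, the sketch below shows how a code point above U+FFFF maps to a UTF-16 surrogate pair and to four UTF-8 bytes. The function names `surrogate_pair` and `supplementary_chars` are made up for this example, and the sample character U+20000 (the first code point in Extension B) is an assumption chosen for convenience; it does not come from the quoted lines.

```python
def surrogate_pair(code_point):
    """Return the (high, low) UTF-16 surrogates for a supplementary code point."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF use surrogate pairs")
    offset = code_point - 0x10000          # 20-bit value to split in two
    high = 0xD800 + (offset >> 10)         # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits -> low surrogate
    return high, low


def supplementary_chars(text):
    """List the characters in text that lie outside the Basic Multilingual Plane."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


# Illustrative sample: U+20000, the first code point of Extension B.
sample = "\U00020000"
high, low = surrogate_pair(ord(sample))
utf8_hex = sample.encode("utf-8").hex(" ").upper()
print(f"U+{ord(sample):04X}: UTF-16 {high:04X} {low:04X}, UTF-8 {utf8_hex}")
# prints "U+20000: UTF-16 D840 DC00, UTF-8 F0 A0 80 80"
```

Passing any of the four lines below to `supplementary_chars` should return just the Extension B characters, which is a quick way to check that a copy of the text has survived intact.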


What you should see:   Your browser:
 
薛蟠又道:“女兒樂,一根𣬠𣬶往里戳。”
(Chap. 28) My apologies for starting out with one of Xue Pan's more indelicate lines, but it has the advantage of containing two characters from Ext. B!
     
 
花鸂𪄠,彩鴛鴦
(Chap. 30)    
     
 
只聽得屋內嘻𠺕嘩喇的亂響
(Chap. 64)    
     
 
還是這樣胡鬧,𠳹嗓了黃湯,折磨人家。
(Chap. 79) This is a nice foil to the first line, for here Xue Pan gets his comeuppance.
 

The font in the image is Simsun (Founder Extended) from Microsoft, which covers most of the Supplementary Ideographic Plane (SIP) characters needed for Chinese. Currently, the browsers that can display these characters in Mac OS X are: Omniweb 4.2 beta 1, Safari 1.0 beta, Opera, and Mozilla! On Windows XP and Windows 2000, you can use Mozilla. This is a vast improvement over the situation when no browsers could display these characters. I would strongly recommend that anyone working with ancient or classical Chinese texts now do so in Unicode: almost all the characters you would need for these texts are now encoded in Unicode and can be displayed on the web!