
BIG-5 Character Set
BIG-5, commonly known as BIG-5 Code, is an encoding scheme for the traditional Chinese characters used in Taiwan and Hong Kong. It is not an official encoding scheme, has some inherent defects, and is not highly regarded within the industry. Nevertheless, it is widely used in the computer industry, especially on the Internet, and has thus become the de facto industry standard.

No detailed account of BIG-5's background has been published. The following introduction is given to the best of my knowledge:

In October 1983, Taiwan's National Science Council, the Mandarin Promotion Council of the Ministry of Education, the Central Standard Bureau, and the Electronic Information Processing Center of the Major Planning Office under the Administrative Department jointly drew up the Chinese Ideographic Standard Code for Information Interchange (abbreviated as CISCII). After trial use and amendment, Taiwan's Central Standard Bureau declared it the official standard on August 4, 1986, under the number CNS 11643. The standard was revised and republished on May 21, 1992, when it was renamed the Chinese Standard Interchange Code. On January 4, 1995, Taiwan's Central Standard Bureau published CNS 11643-1, "Methods for Using the Chinese Standard Interchange Code".

BIG-5 Code is an encoding scheme developed in 1984 by Taiwan Information Industry Promotion Council, based on CISCII. I do not know the origin of the name "BIG-5".

BIG-5 is a two-byte encoding scheme. In hexadecimal, the first byte ranges from A0 to FE, while the second byte ranges from 40 to 7E or from A1 to FE. The highest-order bit of the first byte is therefore always 1, whereas that of the second byte can be either 0 or 1.
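The byte ranges above can be expressed as a small validity check. The following is a minimal sketch in Python that uses only the ranges stated in this article; the function names are hypothetical and chosen for illustration, and other descriptions of BIG-5 may give slightly different first-byte limits.

```python
# Minimal sketch of the BIG-5 byte ranges as stated in this article.
# All function names here are illustrative, not part of any library.

def is_big5_lead(b: int) -> bool:
    """First byte: A0..FE (the highest-order bit is always 1)."""
    return 0xA0 <= b <= 0xFE

def is_big5_trail(b: int) -> bool:
    """Second byte: 40..7E (high bit 0) or A1..FE (high bit 1)."""
    return 0x40 <= b <= 0x7E or 0xA1 <= b <= 0xFE

def is_big5_pair(lead: int, trail: int) -> bool:
    """True if (lead, trail) is a structurally valid two-byte position."""
    return is_big5_lead(lead) and is_big5_trail(trail)

def high_bit(b: int) -> int:
    """Return the highest-order bit of a byte (0 or 1)."""
    return (b >> 7) & 1
```

For example, `is_big5_pair(0xA4, 0x40)` holds, while `is_big5_pair(0xA4, 0x7F)` does not, because 7F falls in the gap between the two second-byte ranges. Since a second byte may have its high bit clear, it can coincide with an ASCII value, so software scanning BIG-5 text byte by byte has to track whether it is at the first or the second byte of a character.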

The graphical symbols and Hanzi of the BIG-5 Code are basically consistent with the first and second planes of the CNS 11643 standard. BIG-5 specifies 13461 symbols and Hanzi in total, comprising:

1. 408 symbols: The encoding positions are from A140 through A3FE (actually ending at A3BF, with vacant positions at the end).

2. 13053 Hanzi: These are divided into two portions, the frequently used Hanzi and the less frequently used Hanzi. Within each portion, the Hanzi are ordered by stroke count and then by radical. Of these,

a. 5401 frequently used Hanzi: the encoding positions range from A440 to C67E. They include all 4808 Hanzi in the "Table of Standard Font of Frequently Used Chinese Characters" issued by Taiwan's Ministry of Education, 587 Hanzi frequently used in textbooks for Taiwan's primary and secondary schools, and 6 Hanzi with variant forms.

b. 7652 less frequently used Hanzi: the encoding positions range from C940 to F9FE (actually ending at F9D5, with vacant positions at the end). They include all 6341 Hanzi in the "Table of Standard Font of Less Frequently Used Chinese Characters" issued by Taiwan's Ministry of Education, and the 1311 more commonly used Hanzi from the "Table of Standard Font of Rarely Used Chinese Characters".

The remaining ranges A040~A0FE, C6A1~C8FE and FA40~FEFE are blank areas. Some of these blank positions are often used as a user-defined area, mostly for storing frequently used Hong Kong characters and Cantonese dialect characters.
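The layout described above can be sketched as a lookup from a two-byte code to its region. This is an illustrative sketch only: the region table reproduces the ranges given in this article, while the function name, the region labels, and the choice to treat each code as a 16-bit integer are assumptions made for the example.

```python
# Illustrative region lookup based on the ranges given in this article.
# Names and labels are hypothetical; codes are 16-bit integers such as 0xA440.

REGIONS = [
    (0xA140, 0xA3BF, "symbol"),
    (0xA440, 0xC67E, "frequently used Hanzi"),
    (0xC940, 0xF9D5, "less frequently used Hanzi"),
]

def classify(code: int) -> str:
    """Classify a two-byte BIG-5 code by the region it falls in."""
    lead, trail = code >> 8, code & 0xFF
    lead_ok = 0xA0 <= lead <= 0xFE
    trail_ok = 0x40 <= trail <= 0x7E or 0xA1 <= trail <= 0xFE
    if not (lead_ok and trail_ok):
        return "invalid"
    # Codes sort by first byte and then second byte, so a plain integer
    # comparison is enough once the byte pair is known to be valid.
    for low, high, name in REGIONS:
        if low <= code <= high:
            return name
    return "unassigned (blank, vendor or user-defined)"
```

With this table, `classify(0xA440)` gives "frequently used Hanzi" and `classify(0xC6A1)` falls through to the unassigned label, matching the blank areas listed above.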

Most BIG-5 fonts in current use also place 7 frequently used characters at positions F9D6~F9DC. If these 7 characters are included, the totals become 13060 Hanzi and 13468 Hanzi and symbols. In addition, some BIG-5 fonts, such as the TrueType Mingxi Style font in the Traditional Chinese version of Windows, have 33 table-drawing symbols and 1 "■" symbol at positions F9DD to F9FE.
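The character counts quoted in this article can be cross-checked by counting the valid code positions in each range. The sketch below does so under the byte ranges stated earlier (second byte 40~7E or A1~FE, that is, 157 positions per first byte); the helper name `count_positions` is hypothetical.

```python
# Cross-check of the counts in this article by enumerating code positions.
# count_positions is an illustrative helper, not part of any library.

# Valid second bytes: 40..7E (63 values) and A1..FE (94 values), 157 in total.
TRAIL_BYTES = list(range(0x40, 0x7F)) + list(range(0xA1, 0xFF))

def count_positions(first: int, last: int) -> int:
    """Count valid two-byte positions from first to last, inclusive."""
    count = 0
    for lead in range(first >> 8, (last >> 8) + 1):
        for trail in TRAIL_BYTES:
            if first <= ((lead << 8) | trail) <= last:
                count += 1
    return count

symbols = count_positions(0xA140, 0xA3BF)      # 408
frequent = count_positions(0xA440, 0xC67E)     # 5401
less_frequent = count_positions(0xC940, 0xF9D5)  # 7652
extra_hanzi = count_positions(0xF9D6, 0xF9DC)  # 7
font_symbols = count_positions(0xF9DD, 0xF9FE)  # 34 (33 + 1)

print(symbols + frequent + less_frequent)                # 13461
print(symbols + frequent + less_frequent + extra_hanzi)  # 13468
```

The counts agree with the figures given in the text, which suggests that each listed range is fully occupied up to its stated actual end position.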




Founder Electronics Co.,Ltd. All Rights Reserved.