
DRAFT MALAYSIAN STANDARD

11G0XXR0-Version2.0

STAGE : Deliberation DATE : 11/08/2011

Information technology - Jawi coded character set for information interchange

OFFICER/SUPPORT STAFF: MMS

ICS:
Descriptors: Jawi, coded character set

Copyright 2011 DEPARTMENT OF STANDARDS MALAYSIA

MS XXXX:XXXX

CONTENTS

Committee representation
Foreword
0 Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Name and meanings
5 Jawi coded character set

Table 1 Jawi coded character set
Table 2 Example of ZWNJ in writing a Jawi word
Table 3 Coded character for ZWNJ
Table 4 Example of the usage of Arabic-Indic numeral in writing Jawi word
Table 5 Coded character for Arabic-Indic numeral
Table 6 The usage of Arabic hamzah three-quarter in indicating phonetic sound
Table 7 Coded character for Arabic hamzah three-quarter

Bibliography

STANDARDS MALAYSIA 2011 - All rights reserved


Committee representation
The Industry Standards Committee on Information Technology, Communications and Multimedia (ISC G) under whose authority this Malaysian Standard was developed, comprises representatives from the following organisations:

Association of Consulting Engineers Malaysia
Department of Standards Malaysia
Federation of Malaysian Manufacturers
Institut Tadbiran Awam Negara, Malaysia
Malaysian Administrative, Modernisation and Management Planning Unit
Malaysian International Chamber of Commerce and Industry
Malaysian National Computer Confederation
Malaysian Technical Standards Forum Bhd
MIMOS Berhad
Ministry of Domestic Trade, Co-operatives and Consumerism
Ministry of Information, Communication and Culture
Ministry of International Trade and Industry
Ministry of Science, Technology and Innovation
Multimedia Development Corporation Sdn Bhd
Multimedia University
Persatuan Industri Komputer dan Multimedia Malaysia
Science and Technology Research Institute for Defence
SIRIM Berhad (Secretariat)
Suruhanjaya Komunikasi dan Multimedia Malaysia
Telekom Malaysia Berhad
The Institution of Engineers, Malaysia
Universiti Teknologi Malaysia

The Technical Committee on Multilingual Information Technology which developed this Malaysian Standard consists of representatives from the following organisations:

.my DOMAIN REGISTRY
Dewan Bahasa dan Pustaka Malaysia
Jabatan Kemajuan Islam Malaysia
Kementerian Pelajaran Malaysia
Persatuan Pencinta Tulisan Jawi Selangor
SIRIM Berhad (Secretariat)
Universiti Kebangsaan Malaysia
Universiti Malaya
Universiti Putra Malaysia
Universiti Teknologi Malaysia
Universiti Teknologi MARA


FOREWORD
This Malaysian Standard was developed by the Technical Committee on Multilingual Information Technology under the authority of the Industry Standards Committee on Information Technology, Communications and Multimedia. Compliance with a Malaysian Standard does not of itself confer immunity from legal obligations.


Information technology - Jawi coded character set for information interchange

0 Introduction

Jawi is one of the two official writing scripts in Malaysia for the Malay language. It consists of 37 characters: 31 adopted from Arabic, 5 adopted from Persian and 1 new character unique to the Malay language.
NOTE. In ISO-10646(E):2003, Jawi is also known as old Malay script.

No standard coded character set for Jawi is currently available. Hence, there is an urgent need for a standard coded character set for Jawi for information interchange, text processing applications, information archiving and data entry.

1 Scope

This Malaysian Standard specifies a set of 37 coded graphic characters identified as the Jawi alphabet. The standard also specifies 3 additional characters needed for writing Jawi. This set of coded graphic characters is intended for use in data and text processing applications and also for information interchange. The set contains graphic characters used for general purpose applications in typical office environments in the Malay language using Jawi script. Some of the characters in this set are based on ISO 8859-6:1999.
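
As a rough illustration of the last point, the sketch below checks whether a character has a position in ISO 8859-6, using Python's built-in `iso8859_6` codec. The helper name `in_iso_8859_6` and the choice of three sample characters are assumptions made for this illustration only; the code points themselves are taken from Table 1.

```python
def in_iso_8859_6(ch: str) -> bool:
    """Return True if ch can be encoded in ISO 8859-6 (Latin/Arabic)."""
    try:
        ch.encode("iso8859_6")
        return True
    except UnicodeEncodeError:
        return False


# Sample characters only (chosen for this sketch, not the full Jawi set).
samples = {
    "ARABIC LETTER BEH": "\u0628",
    "ARABIC LETTER TCHEH": "\u0686",
    "ARABIC LETTER NG": "\u06AD",
}

for name, ch in samples.items():
    status = "available" if in_iso_8859_6(ch) else "not available"
    print(f"U+{ord(ch):04X} {name}: {status} in ISO 8859-6")
```

Characters that the 8-bit set cannot represent are one reason an interchange format built on MS ISO/IEC 10646 code points is needed for Jawi.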

2 Normative references

The following normative references are indispensable for the application of this standard. For dated references, only the edition cited applies. For undated references, the latest edition of the normative reference (including any amendments) applies.

MS ISO/IEC 10646, Information technology - Universal Multiple-Octet Coded Character Set (UCS)
ISO/IEC 8859-6, Information technology - 8-bit single-byte coded graphic character sets - Part 6: Latin/Arabic alphabet

3 Terms and definitions

For the purposes of this standard, the following terms and definitions apply.


3.1

character

Character is a member of a set of elements used for the organisation, control or representation of data. A character conveys distinctions in meaning or sounds. A character has no intrinsic appearance.
EXAMPLE ب is ARABIC LETTER BEH.

3.2

coded character set

A standard for assigning numeric values, character names, and representative (sample) images to each character contained in a coded character set. Typically, a character is given a name, which also serves to differentiate it from the other characters of the coded character set.

3.3

font

Font is a collection of glyph images having the same basic design.


EXAMPLE Courier Bold Oblique, Arabic Typesetting, Arial, Adobe Arabic and Times New Roman.

3.4

glyph

Glyph is a recognisable abstract graphic symbol which is independent of any specific design. For example, the character Arabic letter beh can have many different glyphs. A glyph conveys distinctions in form or appearance. A glyph has no intrinsic meaning.
EXAMPLE

4 Name and meanings

This standard assigns a unique name and a unique identifier to each coded Jawi character. These names and identifiers have been taken from MS ISO/IEC 10646. This standard specifies a graphic symbol for each Jawi character. However, this standard does not specify a particular style or font design for imaging Jawi characters.

5 Jawi coded character set

This part of the standard specifies 42 characters for the Jawi coded character set (see Table 1).


Table 1. Jawi coded character set

MS ISO/IEC 10646 code point    Name
U+0627                         Arabic letter alef
U+0628                         Arabic letter beh
U+062A                         Arabic letter teh
U+0629                         Arabic letter teh marbuta
U+062B                         Arabic letter theh
U+062C                         Arabic letter jeem
U+0686                         Arabic letter tcheh
U+062D                         Arabic letter hah
U+062E                         Arabic letter khah
U+062F                         Arabic letter dal
U+0630                         Arabic letter thal
U+0631                         Arabic letter reh
U+0632                         Arabic letter zain
U+0633                         Arabic letter seen
U+0634                         Arabic letter sheen
U+0635                         Arabic letter sad
U+0636                         Arabic letter dad
U+0637                         Arabic letter tah
U+0638                         Arabic letter zah
U+0639                         Arabic letter ain
U+063A                         Arabic letter ghain
U+06A0                         Arabic letter ain with three dots above
U+0641                         Arabic letter feh
U+06A4                         Arabic letter veh
U+0642                         Arabic letter qaf
U+06A9                         Arabic letter keheh
U+0762                         Arabic letter keheh with dot above
U+0644                         Arabic letter lam
U+0645                         Arabic letter meem
U+0646                         Arabic letter noon
U+0648                         Arabic letter waw
U+06CF                         Arabic letter waw with dot above
U+0647                         Arabic letter heh
U+0621                         Arabic letter hamza
U+064A                         Arabic letter yeh
U+0649                         Arabic letter alef maksura
U+06BD                         Arabic letter noon with three dots above
U+068E                         Arabic letter dul
U+06AD                         Arabic letter ng
U+06AC                         Arabic letter kaf with dot above
U+06D1                         Arabic letter yeh with three dots below
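
The code points in Table 1 can be checked mechanically. The following sketch, written in Python with the standard `unicodedata` module, lists them and tests whether a given character belongs to the set. The names `JAWI_CODE_POINTS`, `JAWI_LETTERS` and `is_jawi_letter` are illustrative and not part of this standard; the character names printed come from the Unicode data shipped with the Python interpreter, which is assumed here to agree with MS ISO/IEC 10646.

```python
import unicodedata

# Code points transcribed from Table 1, in table order.
JAWI_CODE_POINTS = [
    0x0627, 0x0628, 0x062A, 0x0629, 0x062B, 0x062C, 0x0686, 0x062D,
    0x062E, 0x062F, 0x0630, 0x0631, 0x0632, 0x0633, 0x0634, 0x0635,
    0x0636, 0x0637, 0x0638, 0x0639, 0x063A, 0x06A0, 0x0641, 0x06A4,
    0x0642, 0x06A9, 0x0762, 0x0644, 0x0645, 0x0646, 0x0648, 0x06CF,
    0x0647, 0x0621, 0x064A, 0x0649, 0x06BD, 0x068E, 0x06AD, 0x06AC,
    0x06D1,
]

JAWI_LETTERS = frozenset(chr(cp) for cp in JAWI_CODE_POINTS)


def is_jawi_letter(ch: str) -> bool:
    """Return True if ch is one of the letters listed in Table 1."""
    return ch in JAWI_LETTERS


# Print each code point with the character name known to the interpreter.
for cp in JAWI_CODE_POINTS:
    print(f"U+{cp:04X}  {unicodedata.name(chr(cp))}")
```

A membership test of this kind could be used, for example, to validate input in a data entry application before the text is stored or exchanged.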

5.1

Coded character for ZWNJ (Zero-Width Non-Joiner)

This standard also specifies the non-printing character Zero-Width Non-Joiner (ZWNJ). ZWNJ is used in writing systems that make use of cursive joining. When placed between two characters that would otherwise be joined, a ZWNJ causes them to be printed in their final and initial forms respectively. In Jawi, ZWNJ is used in writing Malay words such as teks, sains, and golf. Table 2 shows an example for the word sains. The coded character for ZWNJ is shown in Table 3.


Table 2. Example of ZWNJ in writing a Jawi word

Romanised word    Incorrect (without ZWNJ)    Correct (with ZWNJ)
Sains             (sinus)                     (sains)

NOTE. With ZWNJ, the final two characters are not joined. The Jawi word sains, when written without ZWNJ, will mean sinus.

Table 3. Coded character for ZWNJ

MS ISO/IEC 10646 code point    Name
U+200C                         Zero Width Non-Joiner
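
The effect described in the note to Table 2 can be sketched at the code point level. The Python fragment below inserts U+200C between the last two characters of a string. The function name `split_final_pair` is hypothetical, and the two-letter input (noon followed by seen) is only a sample sequence, not an authoritative Jawi spelling of any word.

```python
ZWNJ = "\u200C"  # Zero Width Non-Joiner, see Table 3


def split_final_pair(word: str) -> str:
    """Insert ZWNJ between the last two characters so they are not joined."""
    if len(word) < 2:
        return word
    return word[:-1] + ZWNJ + word[-1]


# Sample input only: ARABIC LETTER NOON followed by ARABIC LETTER SEEN.
joined = "\u0646\u0633"
separated = split_final_pair(joined)

# Show the resulting code point sequence.
print([f"U+{ord(c):04X}" for c in separated])
```

Because ZWNJ is non-printing, the two strings differ only in how a rendering system joins the letters; a comparison or search routine that should treat them alike would need to remove U+200C first.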

5.2

Coded character set for extended Arabic-Indic digit two

This standard also specifies the extended Arabic-Indic digit two. The digit is used in the Jawi spelling system to indicate repetition of a Jawi word, and it is an important part of Jawi. For instance, the word rumah-rumah is written in Jawi as the word rumah followed by the digit two. Table 4 shows examples of the usage of the Arabic-Indic digit two. The coded character for the extended Arabic-Indic digit two is shown in Table 5.

Table 4. Example of the usage of Arabic-Indic numeral in writing Jawi words

Duplicate word       Duplicate word in Jawi
Rama-rama
Pelajar-pelajar
Pesakit-pesakit

Table 5. Coded character for Arabic-Indic numeral

ISO/IEC 10646 code point    Name
U+06F2                      Extended Arabic-Indic digit two
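
A minimal sketch of how an application might produce and detect this repetition marker is given below in Python. The function names `reduplicate` and `is_reduplicated` are hypothetical, and the spelling used for rumah (reh, waw, meem, heh) is an assumption made only to provide sample input.

```python
import unicodedata

EXTENDED_ARABIC_INDIC_DIGIT_TWO = "\u06F2"  # see Table 5


def reduplicate(jawi_word: str) -> str:
    """Append the digit two to mark that the word is repeated."""
    return jawi_word + EXTENDED_ARABIC_INDIC_DIGIT_TWO


def is_reduplicated(jawi_text: str) -> bool:
    """Return True if the text is a word followed by the repetition digit."""
    return len(jawi_text) > 1 and jawi_text.endswith(EXTENDED_ARABIC_INDIC_DIGIT_TWO)


# Assumed Jawi spelling of "rumah", used only as sample input.
rumah = "\u0631\u0648\u0645\u0647"
rumah_rumah = reduplicate(rumah)

print([f"U+{ord(c):04X}" for c in rumah_rumah])
print(unicodedata.name(rumah_rumah[-1]))
```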


5.3

Coded character set for Arabic hamzah three-quarter

This standard also specifies Arabic hamzah three-quarter. The Arabic hamzah three-quarter character is used to indicate a specific phonetic sound, called the gliding phonetic sound, in certain Malay words. Without this character, Malay words in Jawi would be pronounced incorrectly. The Arabic hamzah three-quarter is positioned above the baseline. Examples of Malay words and their phonetic sounds are shown in Table 6.

Table 6. The usage of Arabic hamzah three-quarter in indicating phonetic sound

Malay word    Form                                     Jawi    Phonetic
buih          with Arabic hamzah three-quarter                 buih (glottal stop)
              without Arabic hamzah three-quarter              buih (gliding)
mulai         with Arabic hamzah three-quarter                 mulai (glottal stop)
              without Arabic hamzah three-quarter              mulai (gliding)

The gliding phonetic sound involves Malay words with the diphthongs au, ai and ui. Examples of such words are buih, kuih, kuil, tuil, laut, paut, sauh, kain and baik. Currently, the character Arabic hamzah three-quarter is not coded in MS ISO/IEC 10646 or ISO/IEC 8859-6.

Table 7. Coded character for Arabic hamzah three-quarter

MS ISO/IEC 10646 code point    Name
(not yet coded)                Arabic hamzah three-quarter

6 Jawi coded character glyph

The following table shows a representative glyph for each Jawi coded character. Each glyph is only a representative image of a possible style and type for the coded character.

Table 8. A representative glyph of each Jawi coded character

Name    Standalone shape    Initial shape    Middle shape    Final shape
ARABIC LETTER ALEF
ARABIC LETTER BEH
ARABIC LETTER TEH
ARABIC LETTER TEH MARBUTA
ARABIC LETTER THEH
ARABIC LETTER JEEM
ARABIC LETTER TCHEH
ARABIC LETTER HAH
ARABIC LETTER KHAH
ARABIC LETTER DAL
ARABIC LETTER THAL
ARABIC LETTER REH
ARABIC LETTER ZAIN
ARABIC LETTER SEEN
ARABIC LETTER SHEEN
ARABIC LETTER SAD
ARABIC LETTER DAD
ARABIC LETTER TAH
ARABIC LETTER ZAH
ARABIC LETTER AIN
ARABIC LETTER GHAIN
ARABIC LETTER AIN WITH THREE DOTS ABOVE
ARABIC LETTER FEH
ARABIC LETTER VEH
ARABIC LETTER QAF
ARABIC LETTER KEHEH
ARABIC LETTER KEHEH WITH DOT ABOVE
ARABIC LETTER LAM
ARABIC LETTER MEEM
ARABIC LETTER NOON
ARABIC LETTER WAW
ARABIC LETTER WAW WITH DOT ABOVE
ARABIC LETTER HEH
ARABIC LETTER HAMZA
ARABIC LETTER YEH
ARABIC LETTER ALEF MAKSURA
ARABIC LETTER NOON WITH THREE DOTS ABOVE
ARABIC LETTER DUL
ARABIC LETTER NG
ARABIC LETTER KAF WITH DOT ABOVE
ARABIC LETTER YEH WITH THREE DOTS BELOW
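
The four shape columns correspond to the contextual forms of a single coded character. As an illustration that lies outside the scope of this standard, Unicode's Arabic Presentation Forms-B block encodes such shapes separately for compatibility, and compatibility normalisation maps each of them back to the one character listed in Table 1. The Python sketch below shows this for ARABIC LETTER BEH using the standard `unicodedata` module; the dictionary name `BEH_FORMS` is hypothetical.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, the coded character

# Compatibility presentation forms of beh, keyed by the shape names
# used in the table above.
BEH_FORMS = {
    "standalone": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "middle": "\uFE92",
}

for shape, form in BEH_FORMS.items():
    # NFKC normalisation folds each presentation form back to U+0628.
    base = unicodedata.normalize("NFKC", form)
    print(f"{shape:10s} U+{ord(form):04X} {unicodedata.name(form)} -> U+{ord(base):04X}")
```

This reflects the distinction drawn in clause 3: text is interchanged as characters, while the choice among the four shapes is left to the rendering system and the font.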


Bibliography

[1] MS 1368:1994, Information processing - Jawi character set for information interchange
[2] MS 2396:2011, Information technology - Keyboard layout for Jawi characters
[3] MS ISO/IEC 9541-1:1995, Information technology - Font information: Part 1: Architecture
[4] MS ISO/IEC 9995-1, Information technology - Keyboard layouts for text and office systems - Part 1: General principles governing keyboard layouts
[5] ISO/IEC TR 15285, Information technology - An operational model for characters and glyphs
[6] Dewan Bahasa dan Pustaka, Daftar Kata Bahasa Melayu Rumi - Sebutan - Jawi, edisi kedua, 2008

Acknowledgements

Members of Technical Committee on Multilingual Information Technology

Dr Rohana Mahmud (Chairperson)                         Universiti Malaya
Mr Mohd Zamri Murah (Deputy Chairman)                  Universiti Kebangsaan Malaysia
Mr Muhaimin Mat Salleh (Secretary)                     SIRIM Berhad
Ms Noor Harnidda Khalid                                .my DOMAIN REGISTRY
Mr Wan Mohd Saophy Amizul Wan Mansor                   Dewan Bahasa dan Pustaka Malaysia
Hajjah Mariam Abdullah                                 Jabatan Kemajuan Islam Malaysia
Hajjah Che Siah Che Man/ Dr Janudin Sardi              Persatuan Pencinta Tulisan Jawi Selangor
Dr Abdul Azim Abd Ghani/ Dr Rabiah A Kadir             Universiti Putra Malaysia
Assoc Prof Muhammad Mun'im Ahmad Zabidi                Universiti Teknologi Malaysia
Prof Dr Zainab Abu Bakar/ Assoc Prof Dr Mazani Manaf   Universiti Teknologi MARA
