Abstract
Many dictionaries are available on the market. However, it is hard to find a Malay-Arabic
dictionary that comes with pictures or video to visualize what a word represents, and with
voice pronunciation. Thus, the researchers developed and implemented a Malay-Arabic
online multimedia dictionary for beginners (OMMAr). The user enters a Malay word, and if
the word matches any entry in the database, its meaning in Arabic is displayed together with
the picture or video and the sound of the word's pronunciation. If there is no match, the
system proposes similar words. The system is also capable of doing the same for Arabic
words. The technology used is client-server: the mobile device (Smartphone) is the client,
and a web server hosts the lookup table (database) and the storage for the audio-visual aids.
This paper presents the design of OMMAr on Smartphones, which uses the UTF-8 character
encoding to store the Arabic words. The web pages, the web server, and the database all
require special treatment. The web pages use UTF-8 HTML encoding, while the web server
needs a special function to translate the HTML code into UTF-8. To store the Arabic
characters, the database's collation also needs to be set to UTF-8. The authors use the Flash
format for the videos and the sounds, while the images are in JPEG, GIF or PNG format.
This paper reports the design, implementation and testing of the system's user interface on
several popular Smartphone devices.
1. Introduction
A dictionary is a book that lists the words of a language in alphabetical order and gives their
meaning, or their equivalent in a different language (Oxford Dictionary). Since a dictionary is
defined as a book, a dictionary that is available online, such as Merriam-Webster.com, is an
online dictionary. Because an online dictionary runs on a computer, it is also expected to
offer fast searching. The researchers capitalize on the strength of computer processing to
enhance the traditional paper dictionary into a rich multimedia dictionary. The application is
equipped with audio of each word's pronunciation to help the user pronounce the word, and
with pictures or video to visualize what the word represents.
The evolution of information and communication technology (ICT) has influenced the
development of many dictionaries that can be installed and used on mobile devices. Several
dictionaries can be accessed through a mobile device, for example the LookWAYup mobile
dictionary (http://lookwayup.com) and the SABC Mobile Dictionary (Mobidic).
Besides that, some dictionaries need to be installed on the mobile device as a program, such
as MOT Mobile Dictionary. This dictionary was developed to help business users use
language in their daily work. MSDict English Phrases Dictionary (S60) 4.10 contains many
phrases, word arrangements and basic quotes.
2. The challenges
We faced difficulties in manipulating the Arabic words, especially when saving and
searching the word entries. We therefore use UTF-8, one of the Unicode encodings, to
represent the Arabic words. However, UTF-8 text needs some special treatment to be
handled correctly in the Web environment. The problem and its solution are discussed
thoroughly by Khirulnizam et al., 2008 [CAMP2008].
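As an illustration of the kind of treatment involved, the following is a minimal Python sketch (the system itself is written in PHP). It assumes, as one plausible reading of "translating the HTML code into UTF-8", that an Arabic word can reach the server as HTML numeric character references and must be decoded before it is stored as UTF-8. The function name entities_to_utf8 and the sample word are illustrative and not taken from OMMAr.

```python
import html


def entities_to_utf8(text: str) -> bytes:
    """Decode HTML character references, then encode the result as UTF-8 bytes."""
    return html.unescape(text).encode("utf-8")


# The Arabic word for "book", written as decimal character references
# (hypothetical sample input, not taken from the OMMAr database).
encoded = "&#1603;&#1578;&#1575;&#1576;"

word = html.unescape(encoded)      # a 4-letter Arabic string
raw = entities_to_utf8(encoded)    # each of these Arabic letters takes 2 bytes in UTF-8

print(word, len(word), len(raw))
```

Each letter in this word falls in the Arabic Unicode block, so the four letters occupy eight bytes once encoded.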
The researchers produce the images and the sound manually. The sound is produced in a
recording studio, where the voice is recorded and edited and the noise is filtered manually.
Technologies are available to convert text (Malay or Arabic) into sound automatically, such
as Malay TTS by Tan (2004), SMaTTS by Othman et al. (2007) and ATTS by Elshafei et al.
(2002). However, we have not yet investigated them in depth. The images are also produced
manually.
The previous system is available through a regular web interface. For the mobile device
version, however, the system needs to be scaled down to suit the small screens. The
JavaScript on-screen keyboard application is also customized. Another problem is the Flash
application that plays the sound. Several smartphone devices, such as the iPhone, do not
support Flash. This is the main challenge that has not been resolved yet.
The system is developed using PHP scripts as the middleware, running on an Apache web
server. The interfaces are written in HTML and the database resides on a MySQL 5 database
server.
Figure 1: The schematic diagram of the application (client side and server side, connected
through the network).
The entity relationship diagram in Figure 1 is the database design. We use MySQL version 5
because it can handle the UTF-8 character set, which is a must-have feature since we store
Arabic characters.
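The idea of a lookup table that holds Arabic text next to the Malay entry can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL 5, and the table name, column names and file names are hypothetical because the actual schema is not reproduced here.

```python
import sqlite3


def create_dictionary(conn: sqlite3.Connection) -> None:
    """Create a hypothetical lookup table for word pairs and their media files."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS entry (
               id         INTEGER PRIMARY KEY,
               malay      TEXT NOT NULL,
               arabic     TEXT NOT NULL,
               image_file TEXT,
               audio_file TEXT
           )"""
    )


def add_entry(conn, malay, arabic, image_file=None, audio_file=None):
    """Insert one word pair; parameters are bound, so Unicode text passes through unchanged."""
    conn.execute(
        "INSERT INTO entry (malay, arabic, image_file, audio_file) VALUES (?, ?, ?, ?)",
        (malay, arabic, image_file, audio_file),
    )


conn = sqlite3.connect(":memory:")
create_dictionary(conn)
# Sample entry: Malay "buku" (book); the media file names are made up.
add_entry(conn, "buku", "كتاب", "buku.jpg", "buku.swf")

row = conn.execute(
    "SELECT arabic, image_file, audio_file FROM entry WHERE malay = ?", ("buku",)
).fetchone()
print(row)
```

In MySQL the corresponding step would be to declare a UTF-8 character set and collation for the table, as described above; SQLite stores text as Unicode without extra configuration.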
This system has two major search features: the Malay word search and the Arabic word
search. Figure 4 shows the algorithm for the Malay word search. This search displays the
Arabic word, or a suggested similar Arabic word, to the user. It also provides the image and
the audio aid for the word's pronunciation.
Figure 5 is the algorithm used to search for an Arabic word. The user gets the Malay word as
the output, along with all the audio and visual aids.
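The two searches can be sketched in Python as below. This is a simplified illustration, not the algorithm of Figures 4 and 5: the way OMMAr chooses similar words is not specified here, so the sketch assumes the standard-library difflib.get_close_matches as the similarity measure. The function name lookup and the sample entries are hypothetical.

```python
from difflib import get_close_matches

# Hypothetical sample entries: book, door, house.
MALAY_TO_ARABIC = {"buku": "كتاب", "pintu": "باب", "rumah": "بيت"}
ARABIC_TO_MALAY = {arabic: malay for malay, arabic in MALAY_TO_ARABIC.items()}


def lookup(word, table, max_suggestions=3):
    """Return (translation, []) on an exact match, else (None, similar_words)."""
    key = word.strip()
    if key in table:
        return table[key], []
    # No exact match: propose entries whose spelling is close to the request.
    suggestions = get_close_matches(key, list(table), n=max_suggestions, cutoff=0.6)
    return None, suggestions


print(lookup("buku", MALAY_TO_ARABIC))   # exact match
print(lookup("buka", MALAY_TO_ARABIC))   # no match, suggestions instead
print(lookup("باب", ARABIC_TO_MALAY))    # Arabic-to-Malay direction
```

The same function serves both directions because only the lookup table changes; the image and audio aids would be fetched by the matched entry in a fuller version.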
4. Testing
These are the screenshots taken from the application prototype that has been developed
(Figures 6, 7, 8 and 9). Figure 7 is the interface that receives the user's request to search for
a Malay word. The user has the option of matching the exact word, words that contain the
search request, or words that begin with the search request.
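The three matching options map naturally onto SQL LIKE patterns. The sketch below assumes that approach and uses Python's sqlite3 module in place of PHP and MySQL. The table, the mode names and the sample words are hypothetical.

```python
import sqlite3

# One LIKE pattern per search option; "%" matches any run of characters.
PATTERNS = {"exact": "{}", "contains": "%{}%", "begins": "{}%"}


def find_words(conn, term, mode="exact"):
    """Return the Malay entries that match term under the chosen mode."""
    # Note: a "%" or "_" typed by the user would also act as a wildcard here.
    pattern = PATTERNS[mode].format(term)
    rows = conn.execute(
        "SELECT malay FROM entry WHERE malay LIKE ? ORDER BY malay", (pattern,)
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (malay TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO entry (malay) VALUES (?)",
    [("buku",), ("buku teks",), ("kedai buku",), ("pintu",)],
)

for mode in ("exact", "contains", "begins"):
    print(mode, find_words(conn, "buku", mode))
```

Binding the pattern as a query parameter keeps the user's text out of the SQL statement itself.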
The result of the search is displayed in Figure 8 if the word is found. Click 'lanjut' (details)
to view the multimedia content. The voices and images are displayed in Figure 9.
An on-screen keyboard is provided on the Arabic search request page, Figure 10. This
keyboard helps users who do not have a physical Arabic keyboard.
Figure 7: The user interface for entering a Malay word.
Figure 10: The user interface for entering an Arabic word.
The testing was done only in the mobile device environment, using three types of mobile
platforms: Symbian, iPhone, and Windows Mobile. The results are shown in Table 1. The
symbol "√" means the platform is capable of the function, while "X" means it is unable to
perform it.
Functions / Platform | Symbian (Nokia E63) | iPhone (using Safari on iPhone 3Gs) | Windows Mobile (Mobile IE on HTC Touch)
Display Arabic words | √ | √ | √
Receive Arabic words | √ | √ | √
Search Arabic words | √ | √ | √
Arabic on-page keyboard | √ | √ | √
Pronunciation | √ | X | √
Most Smartphone browsers that support Unicode and Flash will have no problems accessing
the application. By default, the iPhone does not support Flash. This is the main reason why
the pronunciation does not work properly on the iPhone: the pronunciation interface is
developed using Flash.
The testing provides a positive outcome for this prototype application. There are many more
platforms to test. Further future work is to add more entries and to develop more audio and
visual aids (animation and video) for more Arabic and Malay words. The prototype is
accessible from http://bit.ly/mommar/ .
6. Acknowledgement
This research is funded under the Research and Development grant, provided by Kolej
Universiti Islam Antarabangsa Selangor, Bandar Seri Putra, Bangi, Selangor.
References
Elshafei, Moustafa, Al-Muhtaseb, Husni and Al-Ghamdi, Mansour. 2002. Techniques for
high quality Arabic speech synthesis. Journal of Information Sciences—Informatics and
Computer Science, Volume 140, Issue 3, pages 255-267.
Khirulnizam A. Rahman, Syuria Amiruddin, Che Wan Shamsul Bahri Che Wan Ahmad, Wan
Harun Hussaini, and Siti Zaharah Mohid. 2008. Solving the Arabic UTF-8 Characters
Transaction Issues in an Online Malay-Arabic Dictionary. Proceeding presented in National
Conference on Information Retrieval and Knowledge Management (CAMP08), 18 March
2008, Kuala Lumpur, Malaysia.
Khirulnizam Abd Rahman, Che Wan Shamsul Bahri Che Wan Ahmad, Juzlinda Mohd
Ghazali, Syuria Amirrudin, Siti Zaharah Mohid, Dr. Muhammad Haron Husaini (2008)
Othman Khalifa, and Zakiah Hanim Ahmad. 2007. SMaTTS: Standard Malay Text to Speech
System. International Journal of Computer Science, 2 (4). pp. 285-293. ISSN 2070-3856.
Accessed on 28 December 2008, from http://www.waset.org/ijcs/v2/v2-4-40.pdf.
Tan, Tian Swee. 2004. The design and verification of Malay text to speech synthesis
system. Masters thesis, Universiti Teknologi Malaysia. Accessed on 28 December 2008 from
http://eprints.utm.my/4000/1/TanTianSweeMED2004TTT.pdf.