Web Internationalization
Standards and Practice

29th Internationalization and Unicode Conference
Tex Texin (XenCraft/Yahoo)
Yves Savourel (ENLASO Corporation)
Copyright © 2002-2006 Tex Texin and Yves Savourel.

Objectives
• Describe the standards that define the architecture & principles for I18N on the web
• Scope limited to markup languages
• Provide practical advice for working with international data on the web, including the design and implementation of multilingual web sites and localization considerations
• Be introductory level
Web Internationalization – Standards and Practice Slide 2
Caution
Highlights a note for users or developers to be careful.
• Many user agents (browsers) support a user override for charset (highest priority)
• Note:
– Transcoders do not generally correct charset ID
Identifiers
Form Data Set - GET Method Submission
<form name="input" method="GET"
action="http://www.xencraft.com/cgitest"
enctype="application/x-www-form-urlencoded">
Name: <input type="text" name="Name" size="10" />
<input type="radio" name="sex" value="m"> Male
<input type="radio" name="sex" value="f"> Female
<input type="submit" value="Send">
</form>
This simple form will submit an HTTP GET with:
http://www.xencraft.com/cgitest?Name=Tex&sex=m

Form Data Set Encoding
application/x-www-form-urlencoded
Name=Value&Name2=Value2&Name3=Value3
– Control names/current values listed in the order they appear in the document.
– Names separated from values by =
– Name/value pairs separated by &
– Spaces replaced by +
– Line breaks represented as CR LF: %0D%0A
– Non-alphanumeric and non-ASCII characters and '+', '&', '=', are replaced by %HH
– Browsers map current encoding byte values to %HH
– If the server doesn't know the browser's character encoding, it may decode form data incorrectly.
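The encoding rules above can be tried out with a short Python sketch. It uses the standard library's urllib.parse.urlencode as a stand-in for the browser; the "note" field and its value are made up for illustration, and urlencode leaves a few punctuation characters (such as "_", "." and "-") unescaped, so it is only an approximation of what a given browser sends.

```python
from urllib.parse import urlencode

# Fields are listed in document order, as (name, value) pairs.
simple_fields = [("Name", "Tex"), ("sex", "m")]
simple_query = urlencode(simple_fields)

# Hypothetical field containing the characters the rules single out:
# '+', '&', '=', a space, and a CR LF line break.
tricky_fields = [("note", "a+b & c=d\r\nend")]
tricky_query = urlencode(tricky_fields)

# The GET request URL is the form action plus '?' plus the encoded data set.
url = "http://www.xencraft.com/cgitest?" + simple_query

print(url)           # http://www.xencraft.com/cgitest?Name=Tex&sex=m
print(tricky_query)  # note=a%2Bb+%26+c%3Dd%0D%0Aend
```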
Example comparing two character encodings:
[Diagram: the browser submits the form (GET or POST) to the server. Labels: O/S Charset = z, CHARSET = x.]
Charset=ISO-8859-1
encoding=x-www-form-urlencoded
Name=Fran%E7ois+Ren%E9+Strau%DF
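To see why the server needs to know the browser's encoding, the following Python sketch encodes the same name under two charsets and then decodes the ISO-8859-1 bytes both correctly and incorrectly. The UTF-8 case is an assumption added for comparison; only the ISO-8859-1 string appears in the example above.

```python
from urllib.parse import unquote_plus, urlencode

name = "Fran\u00e7ois Ren\u00e9 Strau\u00df"  # François René Strauß

# Same characters, different %HH bytes depending on the page encoding.
latin1_query = urlencode([("Name", name)], encoding="iso-8859-1")
utf8_query = urlencode([("Name", name)], encoding="utf-8")

# Server side: decode the ISO-8859-1 submission with the right
# and with the wrong assumption about the charset.
raw_value = "Fran%E7ois+Ren%E9+Strau%DF"
decoded_ok = unquote_plus(raw_value, encoding="iso-8859-1")
decoded_wrong = unquote_plus(raw_value, encoding="utf-8", errors="replace")

print(latin1_query)  # Name=Fran%E7ois+Ren%E9+Strau%DF
print(utf8_query)    # Name=Fran%C3%A7ois+Ren%C3%A9+Strau%C3%9F
print(ascii(decoded_ok))
print(ascii(decoded_wrong))  # contains U+FFFD replacement characters
```

Nothing in the submitted string says which charset produced it, which is why a mismatch surfaces as corrupted text instead of an error.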
• Adopters: XLink, XPointer, URN, XML, XML Schema
• IE, Firefox, Opera, Safari, and others
– http://www.w3.org/International/O-URL-and-ident.html

[Diagram labels: Application, Resolver, DNS Servers, Servers]
ACE (Punycode, profile of Bootstring)
Convert to ASCII, prepend "xn--"
http://xn--wgv71a119e.jp
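The "convert to ASCII, prepend xn--" step can be sketched in Python with the built-in punycode codec. The helper names to_ace and from_ace are hypothetical, and the sketch skips the Nameprep normalization that a full IDNA implementation performs before encoding, so treat it as an illustration of the ACE form only.

```python
ACE_PREFIX = "xn--"

def to_ace(label: str) -> str:
    """Return the ASCII-compatible form of one host name label."""
    if label.isascii():
        return label  # plain ASCII labels are left untouched
    return ACE_PREFIX + label.encode("punycode").decode("ascii")

def from_ace(label: str) -> str:
    """Return the Unicode form of one host name label."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label

host = "xn--wgv71a119e.jp"
unicode_host = ".".join(from_ace(part) for part in host.split("."))
round_trip = ".".join(to_ace(part) for part in unicode_host.split("."))

print(ascii(unicode_host))  # escaped so it prints on any console
print(round_trip)           # xn--wgv71a119e.jp
```

Only the labels that contain non-ASCII characters change; "jp" passes through as-is.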
<xsl:template match="Text">
<xsl:if test="lang($Language)">
<p><xsl:value-of select="."/>
(<xsl:value-of select="@xml:lang"/>)</p>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
Message 100 in English. (en)
Message 200 [insertion in French] in American English. (en-us)
Message 400 in British English. (EN-GB)

Message 100 in English.
Message 200 [insertion in French] in American English.
Message 200 en Québecquois.
Message 300 en français.
Message 400 in British English.
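The output above keeps the messages tagged en, en-us, and EN-GB, which is what one would expect if $Language is "en" (an assumption; the parameter's value is not shown here). The matching rule of the XPath lang() function can be sketched in Python as follows; xpath_lang is a hypothetical helper name, not part of any XSLT library.

```python
def xpath_lang(node_lang: str, wanted: str) -> bool:
    """Approximate XPath 1.0 lang(): case-insensitive match on the
    whole tag, or on the tag with a '-...' suffix ignored."""
    node_lang = node_lang.lower()
    wanted = wanted.lower()
    return node_lang == wanted or node_lang.startswith(wanted + "-")

# Language tags as they appear in the example output.
tags = ["en", "en-us", "EN-GB", "fr", "fr-CA"]
selected = [tag for tag in tags if xpath_lang(tag, "en")]
print(selected)  # ['en', 'en-us', 'EN-GB']
```

The comparison is on whole subtags, so a tag such as "eng" does not match "en".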
→ Example: LanguagesCSS.htm
• The <q> element for in-line quotations (auto-quotation marks expected).
• The <blockquote> element for paragraph-type quotations (indented, and no auto-quotation marks expected).
→ Example: Input, Output: Quotes.htm.

• CSS allows control of the type of quote to use according to the language.
*[lang|=fr] { quotes:'\ab\a0' '\a0\bb' }
qo:before { content:open-quote }
qo:after { content:close-quote }
• Examples
→ HTML: Input, CSS, Output: QuotesWithCSS.htm.
→ XML: Input, CSS File, Output: Quotes.xml.
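What the style sheet does declaratively can be mimicked in a few lines of Python, which may help when quotes have to be generated outside a browser. The French marks are taken from the CSS rule above (guillemets with no-break spaces); the English entry, the fallback, and the function name quote are assumptions for illustration.

```python
# Opening and closing marks per primary language subtag.
QUOTES = {
    "fr": ("\u00ab\u00a0", "\u00a0\u00bb"),  # as in the CSS rule: « NBSP ... NBSP »
    "en": ("\u201c", "\u201d"),              # assumed for illustration
}
DEFAULT_QUOTES = ('"', '"')                  # assumed fallback

def quote(text: str, lang: str) -> str:
    """Wrap text in the quotation marks chosen for lang.
    Like the |= selector, 'fr' also covers 'fr-CA', 'fr-FR', ..."""
    primary = lang.lower().split("-")[0]
    open_mark, close_mark = QUOTES.get(primary, DEFAULT_QUOTES)
    return f"{open_mark}{text}{close_mark}"

french = quote("Bonjour", "fr-CA")
english = quote("Hello", "en")
print(ascii(french), ascii(english))
```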
Transformation applied:
Original = THIS TEXT SHOULD BE ALL LOWERCASED.
Transformed = this text should be all lowercased.
Original 1 = tHIS tEXT sHOULD bE cAPITALIZED.
Transformed = THIS TEXT SHOULD BE CAPITALIZED.
Original 2 = this text should be capitalized.
Transformed = This Text Should Be Capitalized.

Transformation not applied:
Original = THIS TEXT SHOULD BE ALL LOWERCASED.
Transformed = THIS TEXT SHOULD BE ALL LOWERCASED.
Original 1 = tHIS tEXT sHOULD bE cAPITALIZED.
Transformed = tHIS tEXT sHOULD bE cAPITALIZED.
Original 2 = this text should be capitalized.
Transformed = this text should be capitalized.
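The transformed lines above can be reproduced with a small Python sketch. It assumes that "capitalize" only uppercases the first character of each whitespace-separated word and leaves the rest alone, which is consistent with the "Original 1" result; the function name text_transform is hypothetical, and Python's default case mappings stand in for whatever a browser applies.

```python
import re

def text_transform(text: str, mode: str) -> str:
    """Rough imitation of lowercase / uppercase / capitalize."""
    if mode == "lowercase":
        return text.lower()
    if mode == "uppercase":
        return text.upper()
    if mode == "capitalize":
        # Uppercase any non-space character that starts a word,
        # i.e. one not preceded by another non-space character.
        return re.sub(r"(?<!\S)\S", lambda m: m.group(0).upper(), text)
    return text  # unknown mode: leave the text unchanged

lowered = text_transform("THIS TEXT SHOULD BE ALL LOWERCASED.", "lowercase")
capitalized_1 = text_transform("tHIS tEXT sHOULD bE cAPITALIZED.", "capitalize")
capitalized_2 = text_transform("this text should be capitalized.", "capitalize")
print(lowered)
print(capitalized_1)
print(capitalized_2)
```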
Text Flow – Bidi Example Source (1/2)
<p style="direction:rtl; unicode-bidi:embed">
Using CSS:<br/>
חברת Pepper Creek LLC,
שנוסדה זה-עתה, מונה יותר מ-550 עובדים.</p>

<p dir="rtl">
Using dir="rtl":<br/>
חברת Pepper Creek LLC,
שנוסדה זה-עתה, מונה יותר מ-550 עובדים.</p>

Text Flow – Bidi Example Source (2/2)
<p dir="rtl">
<span dir="ltr">Using dir="ltr-span":</span> <br/>
חברת Pepper Creek LLC,
שנוסדה זה-עתה, מונה יותר מ-550 עובדים.</p>

<p dir="ltr">Using dir="ltr" (wrong):<br/>
חברת Pepper Creek LLC,
שנוסדה זה-עtה, מונה יותר מ-550 עובדים.</p>
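When the direction of a piece of text is not known in advance, one common heuristic is to look at its first strong directional character. The Python sketch below does this with the standard unicodedata module; base_direction is a hypothetical helper, and the sketch is a simplification that ignores explicit directional controls, so it is a hint for choosing a dir value and not a replacement for the bidi algorithm.

```python
import unicodedata

def base_direction(text: str):
    """Return 'ltr' or 'rtl' from the first strong character, else None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):  # Hebrew, Arabic, ...
            return "rtl"
    return None  # only digits, punctuation, spaces, etc.

# First words of the Hebrew sample sentence, written with escapes:
# HET, BET, RESH, TAV followed by the Latin company name.
hebrew_sample = "\u05d7\u05d1\u05e8\u05ea Pepper Creek LLC"

print(base_direction(hebrew_sample))       # rtl
print(base_direction("Pepper Creek LLC"))  # ltr
print(base_direction("550-"))              # None
```

For the sample sentence the heuristic suggests rtl, in line with the examples that set dir="rtl" on the paragraph.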
→ Example: Ruby.htm
– Impossible to add needed bidi tags in an attribute.
– Cause segmentation issues in many tools.
– Much more difficult to have metadata for attributes than for elements.
– You cannot set different languages for two attributes in the same element.
– More tricky to set unique IDs for attributes.

– Keep them outside of the document if possible (e.g. using include mechanisms).
– At least, make sure elements with such data are identified for the localizer (who might need to apply a process different than for the rest of the document content).
– Internationalize your scripts/queries/etc.