Foreign Language customization

Setting up support for your language on the Vitual World servers is easy. Vitual World has taken great care to provide the latest and most powerful tools available for this purpose, and we comply with the norms, procedures and standards set to this end on the Internet.

Human beings on our planet have, past and present, used a number of languages.  There are many reasons why one would want to identify the language used when presenting information.

To configure your account for a specific language or languages, we must work on 4 areas: the web server configuration, CGI scripts, static HTML pages, and email.

Depending on your language, you will first need to find its corresponding language tag and character set name (for example ISO-xxxx-xx), so that the encodings are announced correctly to visiting browsers and the characters specific to your language are processed automatically for your users.

1.- Web Server configuration (Apache)

In order to configure your local language or provide automatic support for more than one language you need to work with your .htaccess file.

The .htaccess file is a tool for configuring Apache's behavior locally for your account. It allows you to set the operational environment of the web server to meet your requirements, one of which is language negotiation.

There are 3 directives that relate directly to language negotiation, and they are available to you at any time. As mentioned before, you first need to find the language tag that is specific to your needs and the ISO-xxxx-xx character set that belongs to it.

So for example if your language is Japanese then you have access to 3 character sets:

Charset         File extension

EUC-JP             .euc
ISO-2022-JP        .jis
SHIFT_JIS          .sjis

The language definition in Apache is usually a 2-letter code (taken from ISO 639) that identifies the language in question. Following with the Japanese example, the language tag for Apache is ja and its file extension is .ja. Note that jp is the country code for Japan, not the language code; browsers request Japanese as ja.

Language        Code    File Extension

Japanese        ja      .ja

Now that we have the charset for encoding documents in your language, and we also know how to tell Apache which is our default language tag, we only need to set these parameters in the .htaccess file so that the new settings take effect.

Add the following lines to the .htaccess file:

LanguagePriority  ja en

AddCharset EUC-JP          .euc
AddCharset ISO-2022-JP     .jis
AddCharset SHIFT_JIS       .sjis

AddLanguage ja             .ja

These settings add the information needed to support the Japanese language together with the character sets and encodings needed to present the information correctly.

The LanguagePriority directive tells Apache that "ja" (Japanese) is to be set as the default language and "en" (English) as the secondary language.

The AddCharset directive maps a file extension to the character set used to encode files with that extension, so that Apache can announce the correct encoding to the browser when serving Japanese data.

The AddLanguage directive maps the .ja file extension to the Japanese language, so that Apache can offer those files to visitors who request Japanese.

By doing this you make sure that your static content managed by Apache is processed as you expect.

If a visitor has not set Japanese as a language in his/her browser, he/she will still be served the site's default language, and the browser will normally offer to add any software needed to support the character encoding and, if allowed, install the necessary files automatically. Of course, you can always prepare an "en" (English) page that is displayed automatically when the visitor has English as his/her default browser language.

Thus the use of the LanguagePriority directive becomes evident at this point.
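
The selection described above can be pictured with a small Python sketch. It is a simplified model written for illustration only and is not Apache's actual negotiation algorithm: the function names are hypothetical, wildcard entries are ignored, and the fallback to the first LanguagePriority entry when the visitor accepts none of the available languages is an assumption (what the server really does in that case depends on its configuration).

```python
def parse_accept_language(header):
    """Return a list of (language, q) pairs from an Accept-Language header."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        lang, _, params = item.partition(";")
        q = 1.0  # a language listed without a q value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        prefs.append((lang.strip().lower(), q))
    return prefs


def choose_variant(accept_language, available, priority):
    """Pick one language from `available` (simplified illustration)."""
    candidates = {}
    for lang, q in parse_accept_language(accept_language):
        primary = lang.split("-")[0]  # "ja-JP" is matched as "ja"
        if q > 0 and primary in available:
            candidates[primary] = max(q, candidates.get(primary, 0.0))
    if candidates:
        best_q = max(candidates.values())
        tied = [lang for lang in candidates if candidates[lang] == best_q]
        # Equal preferences are resolved in LanguagePriority order.
        for lang in priority:
            if lang in tied:
                return lang
        return tied[0]
    # Assumed fallback: first priority language that the site offers.
    for lang in priority:
        if lang in available:
            return lang
    return None
```

With available languages {"ja", "en"} and the priority list ["ja", "en"], a browser sending "en-US,en;q=0.9" would be given the English page, while a browser sending only "fr" would be given the Japanese one under the fallback assumed here.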

2.- CGI Configuration (scripts, mainly Perl)

If your site uses CGI scripts written in Perl, you need to send, before any actual content, a header that includes the content type. This helps both the server and the browser process the information according to the type of data being sent. For an HTML page generated by a CGI script the type is text/html, and it needs the character set added to it, separated from the type by a semicolon. The header block must end with a blank line. An example would be:

print "Content-Type: text/html; charset=ISO-2022-JP\r\n\r\n";

Or, if you are using CGI.pm (where $in is your CGI object):

print $in->header(-type=>'text/html', -charset=>'ISO-2022-JP');

For any other programming language or CGI you already use, or will use, you need to set the charset in the same way: at the stage where your program generates the headers sent to the browser, before the dynamic content that ends up as an HTML page, as in the previous Perl examples.
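
As an illustration for another language, here is a minimal Python sketch of the same idea. The function names cgi_header and render_page are hypothetical, and the page body is a placeholder; the point it shows is that the header names the charset and the body is then actually encoded in that same charset.

```python
def cgi_header(charset="ISO-2022-JP", content_type="text/html"):
    """Build the CGI header block, ending with the required blank line."""
    return "Content-Type: {}; charset={}\r\n\r\n".format(content_type, charset)


def render_page(body_text, charset="ISO-2022-JP"):
    """Return the full CGI response as bytes.

    The header is plain ASCII; the HTML body is encoded in the same
    charset that the header announces.
    """
    header = cgi_header(charset).encode("ascii")
    html = "<html><body><p>{}</p></body></html>\n".format(body_text)
    return header + html.encode(charset)


# In a real CGI script the bytes would be written to standard output:
#   import sys
#   sys.stdout.buffer.write(render_page("こんにちは"))
```

Writing bytes rather than text keeps the language runtime from re-encoding the page in a different character set on its way to the browser.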

3.- Static HTML pages with META tags

As discussed earlier, if you set the language priority correctly and add support for the corresponding character sets, you can benefit from automatic content negotiation just by using the correct extension. Following the previous examples, you would name your HTML files index.html.jis, index.html.euc or index.html.sjis according to their encoding, or use the language extension registered with AddLanguage (index.html.ja for Japanese), and Apache will serve the correct one to your visitors' browsers.
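
To make the naming scheme concrete, the following Python sketch reads the language and character set off a file name the way the extension method implies. It is only an illustration: the two tables are hypothetical stand-ins for the AddLanguage and AddCharset lines in your .htaccess file, and the describe function is not part of Apache.

```python
# Hypothetical tables mirroring AddLanguage / AddCharset in .htaccess.
LANGUAGES = {"ja": "ja", "en": "en"}
CHARSETS = {"euc": "EUC-JP", "jis": "ISO-2022-JP", "sjis": "SHIFT_JIS"}


def describe(filename):
    """Return (language, charset) implied by the file's extensions.

    Every dot-separated part after the base name is treated as an
    extension; a value stays None when no extension supplies it.
    """
    language = None
    charset = None
    for ext in filename.lower().split(".")[1:]:
        if ext in LANGUAGES:
            language = LANGUAGES[ext]
        elif ext in CHARSETS:
            charset = CHARSETS[ext]
    return language, charset
```

For example, describe("index.html.ja.euc") yields ("ja", "EUC-JP"), while a plain index.html carries neither piece of information.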

Of course there is always more than one way to do things. If you do not want to use the extensions method and you wish to keep the normal .html extension, you can add the following meta tag to the head section of each of your documents:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-2022-JP">

This instructs the browser to switch to the character encoding specified in the meta tag.
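
The meta tag only helps if the file really is saved in the encoding it names. The short Python sketch below, with hypothetical helper names build_page and declared_charset, writes a page in the declared charset and then reads the declaration back to decode the bytes, which is roughly what a browser has to do.

```python
import re


def build_page(text, charset="ISO-2022-JP"):
    """Return an HTML page as bytes, encoded in the charset it declares."""
    meta = ('<META HTTP-EQUIV="Content-Type" '
            'CONTENT="text/html; charset={}">'.format(charset))
    html = "<html><head>{}</head><body>{}</body></html>\n".format(meta, text)
    return html.encode(charset)


def declared_charset(page_bytes):
    """Find the charset named in the page by scanning its ASCII part."""
    match = re.search(rb"charset=([A-Za-z0-9_\-]+)", page_bytes)
    return match.group(1).decode("ascii") if match else None
```

If the declared charset and the real encoding differ, decoding with the declared one produces garbled characters, which is the usual symptom of a mismatched meta tag.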

Finally, if your language is not Japanese and you wish to benefit from this feature, just replace the ISO character set, file extensions and language tags where appropriate to customize your site for your own language.

4.- Email

When sending email with content encoded in a language other than English, you must add the correct Content-Type: text/plain; charset="iso-xxxx-xx" string to the header portion of your mailing script, the same as for static HTML. This needs to be done whether the email is a single-part message or a MIME multipart one; in the latter case you need to specify it on each part.

This line tells the receiving email client exactly what MIME type or types are included in the mail message. As long as the MIME type referenced is supported by the mail program, it should have no problems decoding the content automatically. For example, [text/plain; charset="ISO-2022-JP"] tells us that the message is plain text encoded in the Japanese ISO-2022-JP character set.

The implementation depends on the language you choose for your CGI. Commonly there is a mail function that helps set the headers and other needed items of an email; you should consult your reference manual to use the correct one.
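
As one possible illustration, this Python sketch uses the standard email package to build both kinds of message. The helper names single_part and multi_part are hypothetical, and sending the message is left out; the sketch only shows where the charset ends up.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

CHARSET = "iso-2022-jp"


def single_part(body):
    """Single-part message: one Content-Type header carries the charset."""
    return MIMEText(body, "plain", CHARSET)


def multi_part(text_body, html_body):
    """MIME multipart message: the charset is set on each part separately."""
    msg = MIMEMultipart("alternative")
    msg.attach(MIMEText(text_body, "plain", CHARSET))
    msg.attach(MIMEText(html_body, "html", CHARSET))
    return msg
```

Printing single_part("こんにちは").as_string() shows the Content-Type line that the receiving mail program uses to decode the body.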

A final word of advice: if you find yourself using a combination of static HTML pages and CGI scripts, for example a plain HTML form processed by a CGI script, then you need to combine methods. This means adding the charset by means of a meta tag in your static HTML form, adding it in the CGI script at the point where your script sends the content type string to the browser, as discussed earlier, and of course setting up your .htaccess file as explained before.
