Free Software

Software Engineering with FOSS and Linux

Selecting the language of multilingual web sites

Multilingual sites usually let users switch between content languages, either through a link on each page or through user preferences. For first-time visitors, however, a site needs a way to determine their preferred language(s). The standard way to do this is to inspect the Accept-Language HTTP header sent to the site by the user’s browser.

According to the HTTP/1.1 specification, the Accept-Language header can assign a weight (a q value between 0 and 1) to each language, which determines the user’s preferred order of natural languages for multilingual content. A language listed without an explicit weight defaults to 1. For example,

Accept-Language: el,en;q=0.5,fr;q=0.4

means that the user prefers Greek content, but if it is not available then English and French are also acceptable, with English having a higher priority.
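To make the weights concrete, here is a minimal sketch that turns the example header into a language-to-weight map. The helper name parse_accept_language() is hypothetical (it is not part of any library), and the sketch assumes a reasonably well-formed header.

```php
<?php
// Hypothetical helper for illustration: maps each language tag in an
// Accept-Language value to its weight. A missing q parameter means 1.0.
function parse_accept_language(string $header): array
{
    $weights = [];
    foreach (explode(',', $header) as $item) {
        $parts = explode(';', trim($item));
        $lang = strtolower(trim($parts[0]));
        if ($lang === '') {
            continue; // skip empty items such as a trailing comma
        }
        $q = 1.0;
        foreach (array_slice($parts, 1) as $param) {
            $param = trim($param);
            if (stripos($param, 'q=') === 0) {
                $q = (float) substr($param, 2);
            }
        }
        $weights[$lang] = $q;
    }
    return $weights;
}

foreach (parse_accept_language('el,en;q=0.5,fr;q=0.4') as $lang => $q) {
    printf("%s => %.1F\n", $lang, $q);
}
```

Greek has no explicit q value, so it gets the default weight of 1.0 and ranks ahead of English (0.5) and French (0.4).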

You can parse the Accept-Language header to determine the appropriate language. Simple parsing is enough in most cases, but handling the finer details of the specification can be tricky. Here is an implementation of the parsing algorithm in PHP. It may be overkill, but it gives you a good idea of what is involved:


<?php
$default_lang = "en";

function sort_descending_weights( $a, $b )
{
    # The default language is defined outside this function
    global $default_lang;

    # Each array element is a (lang,weight) pair
    if ( $a[1] != $b[1] )
    {
        return ( $a[1] < $b[1] ) ? 1 : -1;
    }

    # If two languages have the same weight, then we might want to impose
    # our own precedence. Put your own ordering code here. For simplicity, we
    # just assume that the default language takes priority.
    if ( $a[0] == $default_lang ) return -1;
    else if ( $b[0] == $default_lang ) return 1;
    else return 0;
}

function is_language_available( $lang )
{
    # Stub: replace with a check against the languages your site offers
    return true;
}

function get_prefered_language( )
{
    global $default_lang;

    # If no Accept-Language header exists, use the site's default language
    if ( !array_key_exists( 'HTTP_ACCEPT_LANGUAGE', $_SERVER ) )
    {
        return $default_lang;
    }

    # Parse the header. A * indicates any language not explicitly specified
    $h = $_SERVER[ 'HTTP_ACCEPT_LANGUAGE' ];
    $list = explode( ",", $h );
    if ( count( $list ) == 0 ) return $default_lang;
    $prefs = array();
    foreach ( $list as $langs )
    {
        $tmp = explode( ";q=", $langs );
        # Trim the whitespace that may follow the commas in the header
        $lang = trim( $tmp[0] );
        if ( $lang == "" ) continue;
        # A language without an explicit weight defaults to 1.0
        $weight = ( count( $tmp ) == 1 ) ? 1.0 : (float) $tmp[1];
        array_push( $prefs, array( $lang, $weight ) );
    }

    # The specification doesn't enforce weight to be in descending order.
    # Sort the parsed values.
    usort( $prefs, "sort_descending_weights" );

    # Pick an available language. is_language_available() is a stub.
    foreach ( $prefs as $pref )
    {
        list( $lang, $weight ) = $pref;
        if ( is_language_available( $lang ) ) return $lang;
    }
    return $default_lang;
}

echo get_prefered_language();
?>
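
The is_language_available() stub above always returns true. As a rough sketch of what a real check might look like, the following matches a requested tag against a site's languages. The AVAILABLE_LANGS list and the match_available_language() name are assumptions made for illustration. Falling back from a regional tag such as en-US to its primary language en is a lenient choice rather than something the specification requires.

```php
<?php
// Hypothetical list of languages the site offers; replace with your own.
const AVAILABLE_LANGS = ['en', 'el', 'fr'];

// Returns the site language that matches a requested tag, or null if none.
// "*" maps to the first available language; "en-US" falls back to "en".
function match_available_language(string $tag, array $available = AVAILABLE_LANGS): ?string
{
    $tag = strtolower(trim($tag));
    if ($tag === '*') {
        return $available[0] ?? null;
    }
    if (in_array($tag, $available, true)) {
        return $tag;
    }
    // Lenient fallback (an assumption): try the primary language subtag.
    $primary = explode('-', $tag)[0];
    return in_array($primary, $available, true) ? $primary : null;
}

foreach (['el', 'en-US', 'de', '*'] as $tag) {
    $match = match_available_language($tag);
    printf("%s => %s\n", $tag, $match === null ? '(none)' : $match);
}
```

Returning the matched site language instead of a boolean lets the caller serve en when the browser asks for en-US, which a plain true/false check cannot express.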


In Firefox 3.0, users can specify their preferred content languages under Preferences / Content / Languages:

[Screenshot: Firefox language preferences]

For ease of use, exact weights don’t need to be specified. When users change the order of the languages in the list, the browser will calculate and send the appropriate weights in the HTTP request.
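
As an illustration of how an ordered list could be turned into weights, here is a small sketch. The even spacing used here is an assumption chosen for simplicity and is not necessarily how Firefox or any other browser calculates its values. The build_accept_language() name is hypothetical.

```php
<?php
// Hypothetical scheme for illustration: the first language gets the implicit
// weight 1.0 and each later one gets an evenly decreasing weight.
function build_accept_language(array $langs): string
{
    $langs = array_values($langs);
    $n = count($langs);
    $items = [];
    foreach ($langs as $i => $lang) {
        if ($i === 0) {
            $items[] = $lang; // no q parameter means a weight of 1.0
            continue;
        }
        // Never go below 0.1, because q=0 would mean "not acceptable".
        $q = max(0.1, 1.0 - $i / $n);
        $items[] = sprintf('%s;q=%.1F', $lang, $q);
    }
    return implode(',', $items);
}

echo build_accept_language(['el', 'en', 'fr']), "\n"; // el,en;q=0.7,fr;q=0.3
```

Only the relative order of the weights matters to a parser like the one above, so any strictly decreasing sequence would express the same preference.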


Posted September 3, 2009 | Programming