- //
- // Copyright (c) 2009-2011 Artyom Beilis (Tonkikh)
- //
- // Distributed under the Boost Software License, Version 1.0. (See
- // accompanying file LICENSE_1_0.txt or copy at
- // http://www.boost.org/LICENSE_1_0.txt)
- //
- // vim: tabstop=4 expandtab shiftwidth=4 softtabstop=4 filetype=cpp.doxygen
- /*!
- \page messages_formatting Messages Formatting (Translation)
- - \ref messages_formatting_into
- - \ref msg_loading_dictionaries
- - \ref message_translation
- - \ref indirect_message_translation
- - \ref plural_forms
- - \ref multiple_gettext_domain
- - \ref direct_message_translation
- - \ref extracting_messages_from_code
- - \ref custom_file_system_support
- - \ref msg_non_ascii_keys
- - \ref msg_qna
- \section messages_formatting_into Introduction
- Message formatting is probably the most important part of
- localization: making your application speak the user's language.
- Boost.Locale uses the <a href="http://www.gnu.org/software/gettext/">GNU Gettext</a> localization model.
- We recommend you read the general <a href="http://www.gnu.org/software/gettext/manual/gettext.html">documentation</a>
- of GNU Gettext, as it is outside the scope of this document.
- The model is as follows:
- - First, our application \c foo is prepared for localization by calling the \ref boost::locale::translate() "translate" function
- for each message used in the user interface.
- \n
- For example:
- \code
- cout << "Hello World" << endl;
- \endcode
- Is changed to
- \n
- \code
- cout << translate("Hello World") << endl;
- \endcode
- - Then all messages are extracted from the source code and a special \c foo.po file is generated that contains all of the
- original English strings.
- \n
- \verbatim
- ...
- msgid "Hello World"
- msgstr ""
- ...
- \endverbatim
- - The \c foo.po file is translated for the supported locales. For example, \c de.po, \c ar.po, \c en_CA.po , and \c he.po.
- \n
- \verbatim
- ...
- msgid "Hello World"
- msgstr "שלום עולם"
- \endverbatim
- And then compiled to the binary \c mo format and stored in the following file structure:
- \n
- \verbatim
- de
- de/LC_MESSAGES
- de/LC_MESSAGES/foo.mo
- en_CA/
- en_CA/LC_MESSAGES
- en_CA/LC_MESSAGES/foo.mo
- ...
- \endverbatim
- \n
- When the application starts, it loads the required dictionaries. Then when the \c translate function is called and the message is written
- to an output stream, a dictionary lookup is performed and the localized message is written out instead.
- \section msg_loading_dictionaries Loading dictionaries
- All the dictionaries are loaded by the \ref boost::locale::generator "generator" class.
- Using localized strings in the application requires specifying
- the following parameters:
- -# The search path of the dictionaries
- -# The application domain (or name)
- This is done by calling the following member functions of the \ref boost::locale::generator "generator" class:
- - \ref boost::locale::generator::add_messages_path() "add_messages_path" - add the root path to the dictionaries.
- \n
- For example: if the dictionary is located at \c /usr/share/locale/ar/LC_MESSAGES/foo.mo, then the path should be \c /usr/share/locale.
- \n
- - \ref boost::locale::generator::add_messages_domain() "add_messages_domain" - add the domain (name) of the application. In the above case it would be "foo".
- \note At least one domain and one path should be specified in order to load dictionaries.
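- As an illustration of how the search path and the domain relate to the file layout, the following sketch builds a catalog file name from its parts. The \c catalog_path helper is hypothetical and not part of Boost.Locale; the library's actual lookup is not described here and may differ.

```cpp
#include <string>

// Hypothetical helper, not a Boost.Locale function: joins the root path,
// the locale name and the domain into <root>/<locale>/LC_MESSAGES/<domain>.mo
std::string catalog_path(std::string const &root,
                         std::string const &locale_name,
                         std::string const &domain)
{
    return root + "/" + locale_name + "/LC_MESSAGES/" + domain + ".mo";
}
```

- With the values from the example above, \c catalog_path("/usr/share/locale", "ar", "foo") gives \c /usr/share/locale/ar/LC_MESSAGES/foo.mo.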
- This is an example of our first fully localized program:
- \code
- #include <boost/locale.hpp>
- #include <iostream>
- using namespace std;
- using namespace boost::locale;
- int main()
- {
-     generator gen;
-     // Specify location of dictionaries
-     gen.add_messages_path(".");
-     gen.add_messages_domain("hello");
-     // Generate locales and imbue them to iostream
-     locale::global(gen(""));
-     cout.imbue(locale());
-     // Display a message using current system locale
-     cout << translate("Hello World") << endl;
- }
- \endcode
- \section message_translation Message Translation
- There are two ways to translate messages:
- - Using the \ref boost_locale_translate_family "boost::locale::translate()" family of functions:
- \n
- These functions create a special proxy object \ref boost::locale::basic_message "basic_message"
- that can be converted to a string according to a given locale or written to a \c std::ostream,
- formatting the message in the \c std::ostream's locale.
- \n
- This is very convenient for working with \c std::ostream objects and for postponing message
- translation.
- - Using the \ref boost_locale_gettext_family "boost::locale::gettext()" family of functions:
- \n
- These functions are used for direct message translation: they receive an original message
- or a key as a parameter and convert it to a \c std::basic_string in the given locale.
- \n
- These functions have names similar to those used in the GNU Gettext library.
- \subsection indirect_message_translation Indirect Message Translation
- The basic way to translate a message is the \ref boost_locale_translate_family "boost::locale::translate()" family of functions.
- These functions use a character type \c CharType as template parameter and receive either <tt>CharType const *</tt> or <tt>std::basic_string<CharType></tt> as input.
- These functions receive an original message and return a special proxy
- object - \ref boost::locale::basic_message "basic_message<CharType>".
- This object holds all the required information for the message formatting.
- When this object is written to an output \c ostream, it performs a dictionary lookup of the message according to the locale
- imbued in \c iostream.
- If the message is found in the dictionary it is written to the output stream,
- otherwise the original string is written to the stream.
- For example:
- \code
- // Translate a simple message "Hello World!"
- std::cout << boost::locale::translate("Hello World!") << std::endl;
- \endcode
- This allows the program to postpone translation of the message until the translation is actually needed, even to different
- locale targets.
- \code
- // Several output streams that we write a message to:
- // English, Japanese, Hebrew, etc.
- // Each of them has a std::locale object installed that represents
- // its specific locale
- std::ofstream en,ja,he,de,ar;
- // Send a single message to multiple streams
- void send_to_all(message const &msg)
- {
-     // In each of the cases below
-     // the message is translated to a different
-     // language
-     en << msg;
-     ja << msg;
-     he << msg;
-     de << msg;
-     ar << msg;
- }
- int main()
- {
-     ...
-     send_to_all(translate("Hello World"));
- }
- \endcode
- \note
- - \ref boost::locale::basic_message "basic_message" can be implicitly converted
- to an appropriate std::basic_string using
- the global locale:
- \n
- \code
- std::wstring msg = translate(L"Do you want to open the file?");
- \endcode
- - \ref boost::locale::basic_message "basic_message" can be explicitly converted
- to a string using the \ref boost::locale::basic_message::str() "str()" member function for a specific locale.
- \n
- \code
- std::locale ru_RU = ... ;
- std::string msg = translate("Do you want to open the file?").str(ru_RU);
- \endcode
- \subsection plural_forms Plural Forms
- GNU Gettext catalogs have simple, robust and yet powerful plural forms support. We recommend reading the
- original GNU documentation <a href="http://www.gnu.org/software/gettext/manual/gettext.html#Plural-forms">here</a>.
- Let's try to solve a simple problem, displaying a message to the user:
- \code
- if(files == 1)
-     cout << translate("You have 1 file in the directory") << endl;
- else
-     cout << format(translate("You have {1} files in the directory")) % files << endl;
- \endcode
- This very simple task becomes quite complicated when we deal with languages other than English. Many languages have more
- than two plural forms. For example, in Hebrew there are special forms for single, double, plural, and plural above 10.
- They can't be distinguished by the simple rule "is n 1 or not".
- The correct solution is to let the translator choose the plural form. Thus the \c translate
- function can receive two additional parameters, the English plural form and a number: <tt>translate(single,plural,count)</tt>.
- For example:
- \code
- cout << format(translate( "You have {1} file in the directory",
- "You have {1} files in the directory",
- files)) % files << endl;
- \endcode
- A special entry in the dictionary specifies the rule to choose the correct plural form in the target language.
- For example, the Slavic language family has 3 plural forms, which can be chosen using the following expression:
- \code
- plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
- \endcode
- Such an expression is stored in the message catalog itself and is evaluated during translation to supply the correct form.
- So the code above would display 3 different forms in a Russian locale for values of 1, 3 and 5:
- \verbatim
- У вас есть 1 файл в каталоге
- У вас есть 3 файла в каталоге
- У вас есть 5 файлов в каталоге
- \endverbatim
- For Japanese, which does not have plural forms at all, it would display the same message
- for any numeric value.
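- To see how the rule selects a form, the expression quoted above can be written as an ordinary C++ function. This is only a sketch: the function names are hypothetical, and the three word forms are hard-coded here, whereas a real catalog stores them as \c msgstr[0], \c msgstr[1] and \c msgstr[2].

```cpp
#include <string>

// The Slavic rule quoted above, written as a C++ function.
// Returns the index of the plural form (0, 1 or 2) for the count n.
unsigned slavic_plural_index(unsigned long n)
{
    return n % 10 == 1 && n % 100 != 11 ? 0
         : n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20) ? 1
         : 2;
}

// Illustrative only: picks one of three hard-coded Russian forms of "file"
std::string russian_file_form(unsigned long n)
{
    static char const *const forms[3] = { "файл", "файла", "файлов" };
    return forms[slavic_plural_index(n)];
}
```

- For the counts 1, 3 and 5 the function returns the indexes 0, 1 and 2, which matches the three messages shown above. Counts such as 11 and 12 fall into the last form, while 21 uses the first one again.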
- For more detailed information please refer to GNU Gettext: <a href="http://www.gnu.org/software/gettext/manual/gettext.html#Plural-forms">11.2.6 Additional functions for plural forms</a>.
- \subsection adding_context_information Adding Context Information
- In many cases it is not sufficient to provide only the original English string to get the correct translation.
- You sometimes need to provide some context information. In German, for example, a button labeled "open" is translated to
- "öffnen" in the context of "opening a file", or to "aufbauen" in the context of opening an internet connection.
- In these cases you must add some context information to the original string, by adding a comment.
- \code
- button->setLabel(translate("File","open"));
- \endcode
- The context information is provided as the first parameter to the \ref boost::locale::translate() "translate"
- function in both singular and plural forms. The translator would see this context information and would be able to translate the
- "open" string correctly.
- For example, this is how the \c po file would look:
- \code
- msgctxt "File"
- msgid "open"
- msgstr "öffnen"
- msgctxt "Internet Connection"
- msgid "open"
- msgstr "aufbauen"
- \endcode
- \note Context information requires more recent versions of the gettext tools (>=0.15) for extracting strings and
- formatting message catalogs.
- \subsection multiple_gettext_domain Working with multiple messages domains
- In some cases it is useful to work with multiple message domains.
- For example, if an application consists of several independent modules, it may
- have several domains - a separate domain for each module.
- For example, developing a FooBar office suite we might have:
- - a FooBar Word Processor, using the "foobarwriter" domain
- - a FooBar Spreadsheet, using the "foobarspreadsheet" domain
- - a FooBar Spell Checker, using the "foobarspell" domain
- - a FooBar File handler, using the "foobarodt" domain
- There are three ways to use non-default domains:
- - When working with \c iostream, you can use the parameterized manipulator \ref
- boost::locale::as::domain "as::domain(std::string const &)", which allows switching domains in a stream:
- \n
- \code
- cout << as::domain("foo") << translate("Hello") << as::domain("bar") << translate("Hello");
- // First translation is taken from dictionary foo and the other from dictionary bar
- \endcode
- - You can specify the domain explicitly when converting a \c message object to a string:
- \code
- std::wstring foo_msg = translate(L"Hello World").str("foo");
- std::wstring bar_msg = translate(L"Hello World").str("bar");
- \endcode
- - You can specify the domain directly using a \ref direct_message_translation "convenience" interface:
- \code
- MessageBox(dgettext("gui","Error Occurred"));
- \endcode
- \subsection direct_message_translation Direct translation (Convenience Interface)
- Many applications do not write messages directly to an output stream or use only one locale in the process, so
- calling <tt>translate("Hello World").str()</tt> for a single message would be annoying. Thus Boost.Locale provides
- GNU Gettext-like localization functions for direct translation of the messages. However, unlike the GNU Gettext functions,
- the Boost.Locale translation functions provide an additional optional parameter (locale), and support wide, u16 and u32 strings.
- The prototypes of the GNU Gettext-like functions can be found \ref boost_locale_gettext_family "in this section".
- All of these functions can have different prefixes for different forms:
- - \c d - translation in specific domain
- - \c n - plural form translation
- - \c p - translation in specific context
- \code
- MessageBoxW(0,pgettext(L"File Dialog",L"Open?").c_str(),gettext(L"Question").c_str(),MB_YESNO);
- \endcode
- \section extracting_messages_from_code Extracting messages from the source code
- There are many tools to extract messages from the source code into the \c .po file format. The most
- popular and "native" tool is \c xgettext which is installed by default on most Unix systems and freely downloadable
- for Windows (see \ref gettext_for_windows).
- For example, we have a source file called \c dir.cpp that prints:
- \code
- cout << format(translate("Listing of catalog {1}:")) % file_name << endl;
- cout << format(translate("Catalog {1} contains 1 file","Catalog {1} contains {2,num} files",files_no))
- % file_name % files_no << endl;
- \endcode
- Now we run:
- \verbatim
- xgettext --keyword=translate:1,1t --keyword=translate:1,2,3t dir.cpp
- \endverbatim
- And a file called \c messages.po is created that looks like this (approximately):
- \code
- #: dir.cpp:1
- msgid "Listing of catalog {1}:"
- msgstr ""
- #: dir.cpp:2
- msgid "Catalog {1} contains 1 file"
- msgid_plural "Catalog {1} contains {2,num} files"
- msgstr[0] ""
- msgstr[1] ""
- \endcode
- This file can be given to translators to adapt it to specific languages.
- We used the \c --keyword parameter of \c xgettext to make it suitable for extracting messages from
- source code localized with Boost.Locale, searching for <tt>translate()</tt> function calls instead of the default <tt>gettext()</tt>
- and <tt>ngettext()</tt> ones.
- The first parameter <tt>--keyword=translate:1,1t</tt> provides the template for basic messages: a \c translate function that is
- called with 1 argument (1t) and the first message is taken as the key. The second one <tt>--keyword=translate:1,2,3t</tt> is used
- for plural forms.
- It tells \c xgettext to use a <tt>translate()</tt> function call with 3 parameters (3t) and take the 1st and 2nd parameter as keys. An
- additional marker \c Nc can be used to mark context information.
- The full set of xgettext parameters suitable for Boost.Locale is:
- \code
- xgettext --keyword=translate:1,1t --keyword=translate:1c,2,2t \
- --keyword=translate:1,2,3t --keyword=translate:1c,2,3,4t \
- --keyword=gettext:1 --keyword=pgettext:1c,2 \
- --keyword=ngettext:1,2 --keyword=npgettext:1c,2,3 \
- source_file_1.cpp ... source_file_N.cpp
- \endcode
- Of course, if you do not use "gettext"-like translation, you
- may ignore some of these parameters.
- \subsection custom_file_system_support Custom Filesystem Support
- When access to the actual file system is limited, as in ActiveX controls, or
- when the developer wants to ship an all-in-one executable file,
- it is useful to be able to load \c gettext catalogs from a custom location -
- a custom file system.
- Boost.Locale provides an option to install the boost::locale::message_format facet
- with customized options provided in the boost::locale::gnu_gettext::messages_info structure.
- This structure contains a \c boost::function based
- \ref boost::locale::gnu_gettext::messages_info::callback_type "callback"
- that allows the user to provide custom functionality to load message catalog files.
- For example:
- \code
- // Configure all options for message catalog
- namespace blg = boost::locale::gnu_gettext;
- blg::messages_info info;
- info.language = "he";
- info.country = "IL";
- info.encoding="UTF-8";
- info.paths.push_back(""); // At least one path is required, even an empty one
- info.domains.push_back(blg::messages_info::domain("my_app"));
- info.callback = some_file_loader; // Provide a callback
- // Create a basic locale without messages support
- boost::locale::generator gen;
- std::locale base_locale = gen("he_IL.UTF-8");
- // Install messages catalogs for "char" support to the final locale
- // we are going to use
- std::locale real_locale(base_locale,blg::create_messages_facet<char>(info));
- \endcode
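- The \c some_file_loader callback is not defined in the example above. A minimal sketch of such a loader is shown below, assuming the callback receives a file name and an encoding and returns the file contents as \c std::vector<char>, with an empty vector meaning that the file does not exist. The \c embedded_catalogs container is hypothetical, and the exact file names the library requests are not specified here.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory "file system": maps a catalog file name to the
// raw bytes of a compiled .mo file. How it is filled is up to the application.
std::map<std::string, std::vector<char> > &embedded_catalogs()
{
    static std::map<std::string, std::vector<char> > catalogs;
    return catalogs;
}

// Candidate for info.callback (signature assumed, see the text above):
// returns the bytes of the requested file, or an empty vector if it is
// not available. The encoding of the file name is ignored in this sketch.
std::vector<char> some_file_loader(std::string const &file_name,
                                   std::string const & /*encoding*/)
{
    std::map<std::string, std::vector<char> >::const_iterator p =
        embedded_catalogs().find(file_name);
    if(p == embedded_catalogs().end())
        return std::vector<char>();
    return p->second;
}
```

- Keeping the catalogs in a map makes it possible to embed them in the executable, for instance as byte arrays registered at startup.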
- To set up \ref boost::locale::gnu_gettext::messages_info::language "language", \ref boost::locale::gnu_gettext::messages_info::country "country" and other members, you may use the \ref boost::locale::info facet for convenience:
- \code
- // Configure all options for message catalog
- namespace blg = boost::locale::gnu_gettext;
- blg::messages_info info;
- info.paths.push_back(""); // At least one path is required, even an empty one
- info.domains.push_back(blg::messages_info::domain("my_app"));
- info.callback = some_file_loader; // Provide a callback
- // Create an object with the default locale
- boost::locale::generator gen;
- std::locale base_locale = gen("");
- // Use boost::locale::info to configure all parameters
- boost::locale::info const &properties = std::use_facet<boost::locale::info>(base_locale);
- info.language = properties.language();
- info.country = properties.country();
- info.encoding = properties.encoding();
- info.variant = properties.variant();
- // Install messages catalogs to the final locale
- std::locale real_locale(base_locale,blg::create_messages_facet<char>(info));
- \endcode
- \section msg_non_ascii_keys Non US-ASCII Keys
- Boost.Locale assumes that you use English for original text messages, and the best
- practice is to use US-ASCII characters for original keys.
- However, in some cases it is useful to insert some Unicode characters in the text, such as
- the copyright character "©".
- As long as your narrow character string encoding is UTF-8, nothing further needs to be done.
- Boost.Locale assumes that your sources are encoded in UTF-8 and that the input narrow
- strings use UTF-8, which is the default for most compilers (with the notable
- exception of Microsoft Visual C++).
- However, if the encoding of the narrow strings in the source file is not UTF-8 but some other
- encoding like windows-1252, the strings would be misinterpreted.
- You can specify the character set of the original strings when you specify the
- domain name for the application.
- \code
- #include <boost/locale.hpp>
- #include <iostream>
- using namespace std;
- using namespace boost::locale;
- int main()
- {
-     generator gen;
-     // Specify location of dictionaries
-     gen.add_messages_path(".");
-     // Specify the encoding of the source string
-     gen.add_messages_domain("copyrighted/windows-1255");
-     // Generate locales and imbue them to iostream
-     locale::global(gen(""));
-     cout.imbue(locale());
- 
-     // In windows-1255 the (C) symbol is encoded as 0xA9
-     cout << translate("© 2001 All Rights Reserved") << endl;
- }
- \endcode
- Thus, if the program runs in a UTF-8 locale, the copyright symbol would
- be automatically converted to an appropriate UTF-8 sequence if the
- key is missing in the dictionary.
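- The <tt>domain/encoding</tt> notation used in the example can be pictured as a simple split at the slash. The helper below is a hypothetical sketch, not the parser used by Boost.Locale; it assumes that a domain without a slash uses UTF-8 keys, in line with the default described above.

```cpp
#include <string>
#include <utility>

// Hypothetical helper, not Boost.Locale's own parser: splits a
// "domain/encoding" string into its two parts. When no encoding is
// given, UTF-8 is assumed for the keys.
std::pair<std::string, std::string> split_domain(std::string const &id)
{
    std::string::size_type pos = id.find('/');
    if(pos == std::string::npos)
        return std::make_pair(id, std::string("UTF-8"));
    return std::make_pair(id.substr(0, pos), id.substr(pos + 1));
}
```

- For <tt>"copyrighted/windows-1255"</tt> this yields the domain \c copyrighted and the key encoding \c windows-1255, while a plain <tt>"hello"</tt> keeps UTF-8.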
- \subsection msg_qna Questions and Answers
- - Do I need GNU Gettext to use Boost.Locale?
- \n
- Boost.Locale provides a run-time environment to load and use GNU Gettext message catalogs, but it does
- not provide tools for generation, translation, compilation and management of these catalogs.
- Boost.Locale only reimplements the GNU Gettext libintl.
- \n
- You would probably need:
- \n
- -# Boost.Locale itself -- for runtime.
- -# A tool for extracting strings from source code, and managing them: GNU Gettext provides good tools, but other
- implementations are available as well.
- -# A good translation program like <a href="http://userbase.kde.org/Lokalize">Lokalize</a>, <a href="http://www.poedit.net/">Poedit</a> or <a href="http://projects.gnome.org/gtranslator/">GTranslator</a>.
-
- - Why doesn't Boost.Locale provide tools for extracting and managing message catalogs? Why should
- I use GPL-ed software? Are my programs or message catalogs affected by its license?
- \n
- -# Boost.Locale does not link to or use any of the GNU Gettext code, so you need not worry about your code as
- the runtime library is fully reimplemented.
- -# You may freely use GPL-ed software for extracting and managing catalogs, the same way as you are free to use
- a GPL-ed editor. It does not affect your message catalogs or your code.
- -# I see no reason to reimplement well debugged, working tools like \c xgettext, \c msgfmt, \c msgmerge that
- do a very fine job, especially as they are freely available for download and support almost any platform.
- All Linux distributions, BSD flavors, Mac OS X and other Unix-like operating systems provide GNU Gettext tools
- as a standard package.\n
- Windows users can get GNU Gettext utilities via the MinGW project. See \ref gettext_for_windows.
- - Is there any reason to prefer the Boost.Locale implementation to the original GNU Gettext runtime library?
- In either case I would probably need some of the GNU tools.
- \n
- There are three important differences between the GNU Gettext runtime library and the Boost.Locale implementation:
- \n
- -# The GNU Gettext runtime supports only one locale per process. It is not thread-safe to use multiple locales
- and encodings in the same process. This is perfectly fine for applications that interact directly with
- a single user like most GUI applications, but is problematic for services and servers.
- -# The GNU Gettext API supports only 8-bit encodings, making it irrelevant in environments that natively use
- wide strings.
- -# The GNU Gettext runtime library is distributed under the LGPL license, which may not be convenient for some users.
- */