The European Commission is taking another step in its efforts to foster multilingualism. The commission's collection of about 1 million sentences and their high-quality translations in 22 of the 23 official European Union languages, including those of the new member states, is now freely available. It is the largest such collection covering so many languages.
This kind of data is highly sought after by developers of machine translation systems, in which software "learns" from manually translated texts how words and phrases are translated correctly in context. The data can also aid the development of other linguistic software tools such as grammar and spell checkers, online dictionaries, and multilingual text classification systems.
In a press release, Janez Potocnik, European commissioner for science and research, said, "This unique collection of language data contributes to the creation of a new generation of software tools for human language processing and helps foster the competitiveness of the language industry, which is already one of the fastest growing industries in the European Union."
The E.U. institutions have more multilingual texts than any other organization because of the. …