Tuesday, March 9, 2010

Un articulo in le “New York Times” compara le traductor de Google con altere simile traductores electronic.


(Languages of this post: Interlingua, English)


Hodie (9 martio 2010) ha apparite in le “New York Times” un articulo sur le systema de traduction electronic de Google. Illo notava que le servitores in su centrales informatic pote functionar como un sol computator gigantic que pote attaccar ingente collectiones de datos. Google ha usate iste capacite, inter altere cosas, pro disveloppar su servicio de traduction. Illo nunc include 52 linguas differente, plus que ulle altere systema simile. Ma alicun gruppos de linguas, como le gruppo romanic, le systema de Google manipula melio que alteres.

Il ha duo manieras de construer systemas de traduction electronic. Un methodo de attaccar le problema es construer systemas pro le analyse informatic del syntaxe de linguas differente e facer traductiones inter illos con referentia principalmente al grammatica de iste linguas, sia directemente de un lingua a un altere, sia de un lingua a un “interlingua electronic” e postea al lingua final del traduction.

Un altere maniera es usar technicas statistic. Isto es lo que face Google, e illo ha besonio de ingente quantitates de datos, e le computatores de Google, combinate in un sol systema pro processar documentos in un varietate de linguas, pote comparar billiones de documentos e lor traductiones pro disveloppar un systema electronic inductive que pote facer traductiones de nove textos. “Nostre infrastructura es ideal pro tal applicationes”, diceva Vic Gundotra, un vice presidente de Google. “Nos pote usar lo pro projectos que altere personas non pote mesmo imaginar.”

Franz Och, un ancian professor del University of Southern California in Los Angeles, qui es le chef del equipa que disveloppa le systema de traduction informatic de Google, diceva que primo ille non voleva travaliar pro Google proque ille non credeva que Google vermente voleva dedicar le ressources necesse pro disveloppar un systema de traduction que esserea vermente utile. Ma Larry Page, un del fundatores de Google, le assecurava que le compania esseva completemente preste a dedicar omne su ressources informatic a iste projecto.

Durante que multe systemas de traduction simile a illo de Google usa forsan un billion parolas de texto pro crear le modelo de un lingua, Google usa centenares de billiones pro su systema. Illo ha processate omne le documentos del Nationes Unite con traductiones in su six linguas official e anque documentos del Parlamento Europee, que traduce multes de su documentos in 23 linguas. Pro linguas plus obscur, Google ha disveloppate un collection de utensilios pro traductores human que les adjuta durante que illes travalia. Le systema de Google tunc conserva lor textos original e lor traductiones pro su continue analyse statistic de iste linguas.

Durante que le professor Och concede que le systema de traduction de Google non es perfecte, ille nota que illo se meliora rapidemente durante que le systema analysa crescente quantitates de texto, e io personalmente ha potite observar su capacitates augmentante durante que io ha preparate articulos multilingue pro iste blog. Mesmo con su imperfectiones, Google me ha adjutate multo in le preparation de iste articulos, ben que io debe dedicar multe travalio al correction del traductiones producite per su systema, que anque me ha inseniate a exprimer me de un maniera plus simple e directe durante que io prepara le textos de iste sito.

Ecce un comparation de duo textos in francese e espaniol e lor traductiones per un traductor human, le systema de Google, le systema de Yahoo Babel Fish via Systran, e le traductor Bing de Microsoft:


Texto del “Petit Prince” de Antoine de Saint-Exupéry:

“Le premier soir je me suis donc endormi sur le sable à mille milles de toute terre habitée. J’étais bien plus isolé qu’un naufragé sur un radeau au milieu de l’océan.”

Traduction human: “On the first night, I fell asleep on the sand, a thousand miles from any human habitation. I was far more isolated than a shipwrecked sailor on a raft in the middle of the ocean.” (Wordsworth Children’s Classics, 1995)

Traduction per Google Translate: “The first night I went to sleep on the sand a thousand miles from any human habitation. I was far more isolated than a shipwrecked sailor on a raft in the middle of the ocean.”

Traduction per Yahoo Babel Fish via Systran: “The first evening I thus fell asleep on sand with thousand miles of any inhabited ground. J’ stays much more insulated qu’ a shipwrecked man on a raft in the middle of l’ ocean.”

Traduction per Microsoft Bing: “The first evening I thus fell asleep one thousand miles of any ground inhabited with sand. I stays much more insulated was shipwrecked man on a raft in the middle of the ocean.”


Texto de “Cien años de soledad” de García Márquez:

“Muchos años después, frente al pelotón de fusilamiento, el coronel Aureliano Buendía había de recordar aquella tarde remota en que su padre lo llevó a conocer el hielo.”

Traduction human: “Many years later, as he faced the firing squad, Colonel Aureliano Buendía was to remember that distant afternoon when his father took him to discover ice.” (Harper Collins, 2003)

Traduction per Google Translate: “Many years later, he faced the firing squad. Colonel Aureliano Buendía was to remember that distant afternoon when his father took him to discover ice.”

Traduction per Yahoo Babel Fish via Systran: “Many years later, in front of the firing squad, colonel Aureliano Buendía had to remember that one behind schedule remote one in which to his he took it father to know the ice.”

Traduction per Microsoft Bing: “Many years later, at the front of the firing squad, Colonel Aureliano Buendia was remember that remote afternoon his father took him to the ice.”

---

An article in the “New York Times” compares Google’s translator with other similar electronic translators.

Today (March 9, 2010) an article on Google’s electronic translation system appeared in the “New York Times.” It noted that the servers in Google’s data centers can function like a single gigantic computer that can attack huge collections of data. Google has used this capacity, among other things, to develop its translation service. The service now covers fifty-two different languages, more than any other similar system. But Google’s system handles some groups of languages, such as the Romance group, better than others.

There are two ways of constructing electronic translation systems. One method of attacking the problem is to construct systems for the computerized analysis of the syntax of different languages and to make translations between them by referring principally to the grammar of these languages, whether directly from one language to another or from one language to an “electronic interlingua” and then to the final language of the translation.

Another way is to use statistical techniques. This is what Google does, and it requires enormous quantities of data. Google’s computers, combined into a single system for processing documents in a variety of languages, can compare billions of documents and their translations to develop an inductive electronic system that can translate new texts. “Our infrastructure is ideal for such applications,” said Vic Gundotra, one of Google’s vice presidents. “We can use it for projects that others can’t even imagine.”
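To make the statistical idea concrete, here is a minimal sketch of how word correspondences can be induced from nothing but paired sentences. It is a simplified version of the classic IBM Model 1 word-alignment procedure (without the usual NULL word), not a description of Google's actual system. The three-sentence corpus and the names `PARALLEL`, `train_word_model`, and `best_translation` are my own assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical toy parallel corpus (Spanish, English); not data from the article.
PARALLEL = [
    ("la casa", "the house"),
    ("la flor", "the flower"),
    ("una flor", "a flower"),
]


def train_word_model(pairs, iterations=20):
    """Estimate t[(english_word, spanish_word)] with expectation-maximization."""
    pairs = [(src.split(), tgt.split()) for src, tgt in pairs]
    tgt_vocab = {word for _, tgt in pairs for word in tgt}
    uniform = 1.0 / len(tgt_vocab)
    t = defaultdict(lambda: uniform)  # start with no preference at all

    for _ in range(iterations):
        count = defaultdict(float)  # expected co-translation counts
        total = defaultdict(float)  # expected counts per source word
        for src, tgt in pairs:
            for e in tgt:
                # Share the credit for e among the source words of this sentence.
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    share = t[(e, f)] / norm
                    count[(e, f)] += share
                    total[f] += share
        # Re-estimate the probabilities from the expected counts.
        t = defaultdict(float, {(e, f): c / total[f] for (e, f), c in count.items()})
    return t


def best_translation(t, src_word):
    """Return the English word with the highest probability for src_word."""
    candidates = {e: p for (e, f), p in t.items() if f == src_word}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    model = train_word_model(PARALLEL)
    for word in ("la", "casa", "flor", "una"):
        print(word, "->", best_translation(model, word))
```

No dictionary or grammar rule appears anywhere in the sketch. The pairings emerge only from which words keep turning up together across the sentence pairs, which is why such methods improve as the amount of translated text grows.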

Franz Och, a former professor of the University of Southern California in Los Angeles, who is the head of the team that is developing Google’s electronic translation system, said that at first he did not want to work for Google because he did not believe that Google really wanted to dedicate the resources needed to develop a translation system that would be really useful. But Larry Page, one of the founders of Google, assured him that the company was completely ready to dedicate all its computer resources to this project.

While many translation systems similar to Google’s use perhaps a billion words of text to model a language, Google uses hundreds of billions for its system. It has processed all the documents of the United Nations with translations into its six official languages and also documents of the European Parliament, which translates many of its documents into twenty-three languages. For more obscure languages, Google has developed a collection of tools for human translators that helps them while they work. Google’s system then preserves their original texts and their translations for its ongoing statistical analysis of these languages.

While Professor Och concedes that Google’s translation system is not perfect, he notes that it is improving rapidly as it analyzes growing amounts of text, and I have personally been able to observe its growing capabilities as I have prepared multilingual articles for this blog. Even with its imperfections, Google has helped me a lot in the preparation of these articles, though I have to dedicate a lot of work to correcting the translations produced by its system. That work has also taught me to express myself in a simpler and more direct way as I prepare texts for this site.

Here is a comparison of two texts in French and Spanish and their translations by a human translator, the Google system, Yahoo’s Babel-Fish-via-Systran system, and Microsoft’s Bing translator:


Text from “The Little Prince” by Antoine de Saint-Exupéry:

“Le premier soir je me suis donc endormi sur le sable à mille milles de toute terre habitée. J’étais bien plus isolé qu’un naufragé sur un radeau au milieu de l’océan.”

Human translation: “On the first night, I fell asleep on the sand, a thousand miles from any human habitation. I was far more isolated than a shipwrecked sailor on a raft in the middle of the ocean.” (Wordsworth Children’s Classics, 1995)

Translation by Google Translate: “The first night I went to sleep on the sand a thousand miles from any human habitation. I was far more isolated than a shipwrecked sailor on a raft in the middle of the ocean.”

Yahoo Babel Fish’s translation via Systran: “The first evening I thus fell asleep on sand with thousand miles of any inhabited ground. J’ stays much more insulated qu’ a shipwrecked man on a raft in the middle of l’ ocean.”

Microsoft Bing’s translation: “The first evening I thus fell asleep one thousand miles of any ground inhabited with sand. I stays much more insulated was shipwrecked man on a raft in the middle of the ocean.”


Text from “One Hundred Years of Solitude” by García Márquez:

“Muchos años después, frente al pelotón de fusilamiento, el coronel Aureliano Buendía había de recordar aquella tarde remota en que su padre lo llevó a conocer el hielo.”

Human translation: “Many years later, as he faced the firing squad, Colonel Aureliano Buendía was to remember that distant afternoon when his father took him to discover ice.” (Harper Collins, 2003)

Translation by Google Translate: “Many years later, he faced the firing squad. Colonel Aureliano Buendía was to remember that distant afternoon when his father took him to discover ice.”

Yahoo Babel Fish’s translation via Systran: “Many years later, in front of the firing squad, colonel Aureliano Buendía had to remember that one behind schedule remote one in which to his he took it father to know the ice.”

Microsoft Bing’s translation: “Many years later, at the front of the firing squad, Colonel Aureliano Buendia was remember that remote afternoon his father took him to the ice.”
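For readers who want to put rough numbers on such a comparison, the sketch below scores each machine translation of the “Little Prince” passage by the share of distinct words it has in common with the human translation (a Jaccard overlap). This is a crude stand-in that I am assuming for illustration, not a standard evaluation metric. It ignores word order entirely, so a scrambled sentence can still score fairly well. The names `words` and `overlap` are hypothetical helpers.

```python
import re

# The human reference and the three machine outputs quoted above.
HUMAN = ("On the first night, I fell asleep on the sand, a thousand miles "
         "from any human habitation. I was far more isolated than a "
         "shipwrecked sailor on a raft in the middle of the ocean.")

CANDIDATES = {
    "Google": ("The first night I went to sleep on the sand a thousand miles "
               "from any human habitation. I was far more isolated than a "
               "shipwrecked sailor on a raft in the middle of the ocean."),
    "Babel Fish": ("The first evening I thus fell asleep on sand with thousand "
                   "miles of any inhabited ground. J’ stays much more insulated "
                   "qu’ a shipwrecked man on a raft in the middle of l’ ocean."),
    "Bing": ("The first evening I thus fell asleep one thousand miles of any "
             "ground inhabited with sand. I stays much more insulated was "
             "shipwrecked man on a raft in the middle of the ocean."),
}


def words(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"\w+", text.lower()))


def overlap(candidate, reference):
    """Jaccard overlap: shared distinct words divided by all distinct words."""
    cand, ref = words(candidate), words(reference)
    return len(cand & ref) / len(cand | ref)


if __name__ == "__main__":
    for name, text in CANDIDATES.items():
        print(f"{name}: {overlap(text, HUMAN):.2f}")
```

A score like this can only rank the outputs roughly. Judging whether a translation actually reads well still takes a human reader.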
