At one point in The Imitation Game (again), Commander Denniston enters Alan Turing’s workshop, shuts down Christopher, the code-breaking machine, and orders Turing off the premises. The machine is not quite complete. Turing, terrified, protects it with his own body and insists that it will work. He and his work are saved by fellow code-breakers who stand up for him. Then Denniston gives him one more month to make Christopher work.
Aspiring machine translation research teams weren’t so lucky after the US government decided, in 1964, to set up a committee to look into their progress. The committee, pompously named the Automatic Language Processing Advisory Committee, or ALPAC for short, was active for two years, engaged in discussions, heard testimonies – and, in 1966, produced a report that many regarded as the nemesis of machine translation research.
I don’t want to analyze the ALPAC report in detail – it is far too elaborate to judge in a single blog post. Besides, greater experts have done that already. You can read the report itself by clicking here – and here is a quite well-known analysis from the point of view of a machine translation researcher, penned by John Hutchins in 1996.
At the same time, I think the ALPAC report brought about substantial changes in the way we look at translation technology – and no, not because the US government used it as a pretext to cut the funding of machine translation research.
True, the report points out on page 32:
[…] while we have machine-aided translation of general scientific text, we do not have useful machine translation. Further, there is no immediate or predictable prospect of useful machine translation.
I’d like to call your attention to the first phrase, the one about machine-aided translation, which is overlooked by many who find the ALPAC report altogether evil – it turns out that many who think so haven’t actually read the report itself. Which is quite a shame, because the report goes on:
Machine-aided translation may be an important avenue toward better, quicker, and cheaper translation.
Then the report gets technical (again), and in Appendix 12, it describes a machine-assisted translation effort at the US Armed Forces base in Mannheim, Germany. This, to my knowledge, is the first account of using a proper translation memory. But that is not my point today.
My point is that in the aftermath of the ALPAC report, we witness a true paradigm shift: while machine translation used to aspire to replace humans, machine-assisted translation aims to enhance their abilities.
Throughout my engineering studies, there was an understanding – mostly implied, seldom stated explicitly – that we create technology for human users. To me, that implied two things: technology created for the benefit of human beings on the one hand, and controlled by human beings on the other.
Unattended machine translation, at least in the form seen by ALPAC, seemed to violate the first principle. The report looks at MT from a utilitarian perspective, and concludes that it fails to serve the benefit of human beings well enough.
At the same time, unattended machine translation is probably the closest thing we have to autonomous machine intelligence – dreamed about, applauded, and dreaded by many. It seems that these days we use MT in ways where the problems of both benefit and control combine in the worst possible manner.
The moment we ask how technology serves the benefit of humankind, and what possible adverse effects it has, we move from the technical ground to the moral one.
There seems to be no question whether translation benefits humankind, and so any technology that makes translation quicker, better, and more accessible, also benefits humankind. Bad translations, on the other hand, can cause damage, as can any technology done – or used – badly.
After the ALPAC report, researchers caught their breath and slowly resumed their work, mainly from private funds. Public funds became abundant again in the 1980s and 1990s, this time on the other side of the Atlantic, in the emerging European Union.
There was another change: practitioners of machine translation slowly relinquished the idea that unattended machine translation could replace human translators anytime soon. Some scholars voiced this very clearly. My favorite is a 1980 paper called The Proper Place of Men and Machines in Language Translation, written by Martin Kay of Xerox PARC. In his reality check, Kay once and for all demotes Bar-Hillel’s notion of fully automatic high-quality translation (FAHQT) – disproved by Bar-Hillel himself – to the realm of desirable but unattainable ideals. At the same time, he introduces the idea of an interactive translation environment that offers in-place help for translators in a graphical user interface. Graphical user interfaces, also pioneered by Xerox, were the thing in 1980s software development.
In fact, Martin Kay does little more than reiterate the findings of the ALPAC report in a more accessible and practical manner. The timing of this reiteration couldn’t have been better, though: the 1980s saw computers becoming everyday tools for non-computer people, too. Computer software could extend the abilities of people sometimes termed ‘knowledge workers’: interactive interfaces no longer required the computer to perform a long sequence of operations unattended, without human intervention.
The late 1980s and the early 1990s saw the ascent of the first commercial interactive – human-centered – translation environments, offering resources such as translation memories (to recycle earlier translations) and term bases (glossaries or dictionaries translators could set up for themselves), to be used while typing the translation.
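The translation-memory idea is simple enough to sketch in a few lines. Below is a minimal, hypothetical illustration (the segment pairs, the threshold, and the function names are my own inventions, and no real CAT tool works exactly this way): the memory stores source–target segment pairs, and a fuzzy lookup returns the most similar stored segment for the translator to edit.

```python
# A minimal sketch of a translation-memory lookup.
# The segment pairs, threshold, and names are illustrative assumptions,
# not any real CAT tool's data or API.
from difflib import SequenceMatcher

translation_memory = [
    ("The printer is out of paper.", "Der Drucker hat kein Papier mehr."),
    ("Turn off the printer.", "Schalten Sie den Drucker aus."),
]

def best_match(segment, memory, threshold=0.75):
    """Return (score, source, target) for the most similar stored
    segment, or None if nothing clears the fuzzy-match threshold."""
    scored = [
        (SequenceMatcher(None, segment, src).ratio(), src, tgt)
        for src, tgt in memory
    ]
    score, src, tgt = max(scored)
    return (score, src, tgt) if score >= threshold else None

# A near-identical sentence yields a high-scoring "fuzzy match";
# the translator edits the stored target instead of starting from scratch.
match = best_match("The printer was out of paper.", translation_memory)
```

In a real tool the memory holds millions of pairs and the similarity metric is more elaborate, but the principle – recycle the closest earlier translation – is the same.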
In the late 1990s, though, unattended machine translation found its way back to the mainstream with a lot of help from the web and its search engines. Remember AltaVista coming up with a translation service called BabelFish in late 1997? This facility offered almost instant translation of web pages, but back then it wasn’t ingenious enough to derive translations from pairs of web pages. Instead, it used a then well-known rule-based machine translation engine working with pre-compiled linguistic rules, mainly in the fashion of generative linguistics – the approach I mentioned just two posts ago. It took a new decade and a new idea for Internet companies to start using actual parallel texts (source-language texts together with their translations), harvested from the very web they searched, to set up a statistical method for translation.
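The statistical idea is easy to illustrate with a toy example. The sentence pairs below are invented, and real systems (IBM Model 1 and its successors) refine these counts iteratively with expectation-maximization rather than using them raw – this is just the intuition: pair every source word with every target word it co-occurs with, count, and turn the counts into translation probabilities.

```python
# Toy sketch of learning word-translation probabilities from parallel
# text by co-occurrence counting. Invented data; real statistical MT
# refines such counts with EM instead of using them directly.
from collections import Counter, defaultdict

parallel = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a house", "ein haus"),
]

counts = defaultdict(Counter)
for src, tgt in parallel:
    for s in src.split():       # pair every source word...
        for t in tgt.split():   # ...with every co-occurring target word
            counts[s][t] += 1

def translation_prob(s, t):
    """Relative co-occurrence frequency of target word t with source word s."""
    return counts[s][t] / sum(counts[s].values())

# "haus" appears alongside "house" in every pair containing "house",
# so it collects the most probability mass for that word.
best = max(counts["house"], key=lambda t: translation_prob("house", t))
```

With enough parallel text, these lopsided counts are exactly what lets the machine guess that “house” translates to “haus” without ever being given a dictionary.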
People also started to use machine translation differently. In fact, the number of people with access to machine translation grew enormously. Not only did engineers relinquish the dream of replacing humans in translation; humans also became more satisfied with imperfect but very quick translations. When browsing through a list of materials (mostly web search results), it’s enough to get the gist of a page to find out whether it’s worth reading. In the early 2000s, we technologists even tended to call these machine translation programs gisting engines.
In the meantime, the then-already-flourishing professional translation industry needed more and more help – from larger numbers of translators, and from technology, too. As early as the 1970s, industrial companies began to produce actual translations by machine-translating text and then post-editing the output. These texts were often written in a controlled language, so that the rule-based machine translation programs of the period, quite limited in their abilities, could produce usable translations.
Creating a rule-based machine translation program is costly and difficult: it’s not for the faint of heart, so to say. You spend years and millions of dollars just to get your translation rules – well, not quite right, but somewhat acceptable.
Statistical machine translation makes it sort of easy. You just need to collect a lot of text – but it had better be in the range of tens of millions of words in every language pair: eventually, you can end up spending years and millions of dollars all the same. Except when your subject matter is fairly restricted and the translations at hand are reliably good – then, in principle, you can work off some tens of thousands of words.
Around 2010, commercial and open-source packages (the link leads to the most popular example) became available to the public that anyone could use to build their own machine translation systems. This is the same sort of popularization that happened to computers in the early 1980s, and to publishing in the late 1980s: everyone could own their own computers, then everyone could produce printing-quality documents. Then, come the Internet, everyone could start dumping their writings, good and bad alike, on everyone else, just like I’m doing now.
Expertise doesn’t come bundled with the equipment. As a result, most (but not all!) customized machine translation systems were, and still are, worthless. Not to mention the general – non-customized – systems, which are not intended for document translation at all. The inexperienced use and hasty deployment of custom engines, and laymen’s unrealistic expectations of general systems, have done great damage to the perception of machine translation. Many translation professionals now see machine translation as a grossly overrated tool used by various industries for one purpose: to force their prices down.
This is not the fault of machine translation as such. Some customized machine translation programs – and some frameworks to customize them – are making real progress towards producing genuinely useful translations. These systems all assume that source-language texts will belong to a well-defined, restricted subject field, and they train on very carefully selected, high-quality translations. In addition, the best of them can easily be retrained when new translations arrive or terminology changes.
These latter systems represent the best in machine translation, and you must look to them if you really want to find out what machine translation is like.
At the same time, the makers of general machine translation systems are also doing honest and useful research, but these systems are simply not intended for translating documents. Their purpose is to help human users read foreign-language texts, or to translate short snippets of texts – such as tweets or messages in a chat box.
In these last paragraphs, I deliberately avoided linking so that I don’t advertise any commercial players inadvertently.
What is the difference between worthless and usable machine translation if humans get the last word? A few percent. If humans judge even ten percent of the machine translation output usable (after editing), that is already a gain in translation time and productivity. The editable output can often be as good as recycled human translations – translation memory matches, as the profession calls them. This is because the recycled translation comes from a different text, and needs editing anyway.
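The arithmetic behind that claim can be made explicit. A back-of-the-envelope sketch with purely illustrative numbers – the per-segment times and the assumption that editing takes half the time of translating from scratch are mine, not measurements:

```python
# Illustrative numbers only: if editing a usable MT segment takes half
# the time of translating it from scratch, even a 10% usable rate
# yields a measurable saving on the whole job.
segments = 100
from_scratch_min = 10.0   # assumed minutes per segment without MT
edit_min = 5.0            # assumed minutes to edit a usable MT segment
usable_rate = 0.10        # 10% of MT output judged usable by the editor

baseline = segments * from_scratch_min
with_mt = (segments * usable_rate * edit_min
           + segments * (1 - usable_rate) * from_scratch_min)

saving = 1 - with_mt / baseline   # a few percent off the total time
```

A few percent, as promised – small per segment, but real money across a large translation volume.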
From various forums of professional translators, I sense increasing hostility towards post-edited machine translation, and the organizations that use it. I think this hostility is morally justified; but not because machine translation is inherently flawed or machine-translation post-editing is unfit for use in professional translation. No: although I still very much doubt that we’ll ever have human-equivalent machine translation, I find machine translation a very useful piece of technology, capable of contributing a lot to professional translations.
The problem runs deeper than that: it all boils down to how developers of technology look at human beings. I don’t believe technology providers look much beyond the practical problem at hand: I don’t know if they have a conscious view of humanity – which, ultimately, technology is supposed to serve.
As technology providers, on the whole, we have two choices. Either we believe that human users can be made part of the technology – that is, they can be rationally talked into complying with whatever the particular technology requires. Or we accept that humans are sentient beings who can – and prefer to – make decisions for themselves, and who will go a long way to protect their liberty to do so. We also have to accept that their decisions and preferences cannot always be explained rationally. Like I said, this runs deep: when we feel that someone attempts to take control of us in one way or another, we give a very strong emotional response, and we usually act to avert the attempt.
You can see I’m biased, but compelling evidence seems to be mounting against the rationalist view of humanity. The hostility we see on translation forums is not scientifically acceptable as proof, but some social-science research on denial provides sufficient evidence. In short, if you try combating denial, justified or not, all attempts at rational explanation will actually reinforce the denial. This leads far away from the point, but it tells us that humans will, for better or worse, refuse to become part of the machine.
Having a freshly machine-translated document thrown at you, with the assignment to edit it, creates a feeling of constraint – of being controlled, overridden. Not to mention that the costliest way to correct errors in a text – errors in a translation – is to leave all corrections until after the text is produced (the link requires subscription). So post-editing seems to have an inherent efficiency problem and, what’s worse, most of its present forms give all the wrong ideas to those who are supposed to do the actual work.
Being an engineer myself, I find it hard to refrain from making specific technology suggestions, but that would really be off-topic for this blog – plus, it would go against the very sort of caution I advise here. Suffice it to say that machine translation is a very valuable resource that helps with language understanding to a great extent. But it does require human attention when it’s used to produce publication-ready documents – and technology providers must exercise caution when they invite human participants in the process.
It’s not wrong to research and use machine translation. It’s not wrong to seek to improve it, and attempt to use it unattended wherever possible (provided it does not do harm). But when humans are involved in reviewing machine translation, technology must treat them with respect, and not suggest that it aspires to replace or displace humans. You see, technology is a form of communication: indirect as it may be, it reflects the attitude of its makers. When you create technology, always remember that you are actually talking to your users – and to some extent, to their clients, too.
In 1964, Stanisław Lem wrote – in his book called Summa Technologiae – that technology is the natural continuation of human evolution: “Using technologies as its [sic!] organs, man’s homeostatic activity has turned him into the master of the Earth; yet he is only powerful in the eyes of an apologist such as himself.” Many share this view these days, although I chose to quote it for a different reason today: if makers of technology are aware of this, they can more consciously dream up machines and software that extend humans’ abilities in a more natural way.
While writing this post, I came across an interesting moral paradox. It seems we have a moral obligation to do our best to think and act rationally, to understand the behavior of nature and other people – because we have the ability to do so. But at the same time, we mustn’t expect others to be rational – and we ourselves must be aware that we will fail at it, too. However, this seems to be a topic for my other blog called The Third Tower, so let me follow up this thought over there someday.