Pedantry - Moved to http://pedantry.fistfulofeuros.net

Tuesday, December 23, 2003
 
Déménagement - Verhuizing - Moved Address

Yes, it's been too long, but the move, the job, my dysfunctional computer and the transition from a furnished apartment in a cozy college town to an unfurnished apartment across the street from the Georgian embassy in Brussels have taken quite a toll.

But now it's Christmas. I'm finished moving, and so is my blog. We are now at:

http://pedantry.fistfulofeuros.net



Update your bookmarks appropriately.
 

Friday, November 14, 2003
 
Blame Canada - still

This was too good not to blog, and it doesn't really fit A Fistful of Euros. It's in today's LA Times.

Canada Doesn't Deserve These Vial Accusations
Peter McKnight of the Vancouver Sun

Blame Canada? I always thought that was just a satirical song from a postmodern TV cartoon.

Sure, I always figured we probably foisted a few too many comedians and journalists on to the U.S., but of late, the days of being pilloried because of Jim Carrey and Peter Jennings appear positively halcyon.

Seems Canada's now being blamed for exporting everything from terrorism to gay marriage, from lax laws on illegal drugs to "B.C. bud" — the best marijuana the American dollar isn't supposed to buy.

So it's not surprising U.S. drug czar John Walters described the Great White North as "the one place in the hemisphere where things are going the wrong [way] rapidly." And it's also not surprising that the vitriol aimed at Canada isn't limited to illegal drugs. Prescription drugs, too, are going the wrong way — from Canada to the U.S. — as American consumers now spend about $1 billion a year at Canadian pharmacies.

That's not Canada's fault, of course. It's simply because drug prices in the U.S. are so exorbitant — the highest in the world — that those who need them most, such as seniors on fixed incomes, have to truck up to the land of igloos several times a year to get their fixes.

This is a problem of American apathy, not Canadian kindness, toward the underprivileged. But you'd never gather that from the rhetoric of some American politicians, who say the problem stems from the fact that Canada isn't paying "its fair share."

Let's consider that criticism. Sure, Canadians pay anywhere from 30% to 80% less than Americans for the same drugs, but only because the Canadian pharmaceutical industry agreed to price controls in exchange for increased patent protection. Specifically, the government agreed to extend the length of time before generic versions of patented drugs could come on the market, and the industry agreed to invest more money in research and development. So Canadians are getting what we bargained for and are, therefore, paying our fair share.

Those opposed to the export of Canadian drugs also suggest that they really have the best interests of Americans at heart. After all, they're not concerned about the profits of drug companies, they insist, but rather about safety, because, you know, those beer-swilling Canadians just can't be trusted to produce good stuff.

That position, which is held by the Food and Drug Administration, pharmaceutical firms and some senators and House members, was concisely articulated by National Assn. of Chain Drug Stores President Craig Fuller, who said, "Importation of prescription drugs is illegal because it's unsafe."

That one statement represents a masterful example of backward logic. There's no evidence that Canadian drug manufacturing and labeling standards are any less rigorous than those of the U.S. — in fact, Canada tends to be the more cautious of the two countries in approving and marketing drugs. Several American studies, including one by the state of Illinois, confirm that Americans face no increased health risks in consuming Canadian pharmaceuticals.

Nevertheless, by making importation illegal, agencies like the FDA are prevented from ensuring such things as proper handling during personal importation. So importation may be unsafe, but only because it's illegal, which turns Fuller's statement on its head.

All of this is not to suggest that selling drugs to Americans is good for Canada. On the contrary, the Canadian health-care system is beginning to suffer from the mass exports.

Some pharmaceutical companies, including Bayer and Eli Lilly, are taking advantage of loopholes in their price control agreements and are beginning to raise the prices of drugs like Cipro and Zantac. In the last few months, some drug prices have risen almost 10%. And some companies are threatening to limit Canada's supply of drugs, which isn't good for anyone.

The present state of affairs can't continue. And it's up to the U.S. to find a solution because this is an American problem, even though it's causing problems in Canada. Canadian pharmaceutical manufacturers could stop providing drugs to Americans, of course, but that would violate the spirit of free trade, something Americans have championed for years.

So maybe the United States is the place where things are rapidly going wrong.

Blame America? No, that will never catch on.

 

Monday, November 10, 2003
 
Hi, Pietro!

An old friend from an earlier incarnation of the web has tracked me down to here, saying "hi" by sending me something from my Amazon wish list. I don't know when he sent it - my address at Amazon.com goes to my wife's APO box, which hasn't been checked in a month. So, Pietro, if you've been wondering why I haven't said anything, it's because I only got it today. I apologise for not having much to blog for the moment - I'm moving in the next three weeks and I'm desperately trying to actually get everything together for the move. In December I go back to daily blogging. Anyway, it's good to hear from you. Drop me a line - if you're still in Europe it'd be nice to actually meet you.
 


Friday, November 07, 2003
 
Relocating myself and my blog

Busy, busy, busy...

I'm almost at the end - I think - of the writing work that's been so instrumental in giving me writer's block, and I'm moving to Woluwe-Saint-Pierre/Sint-Pieters-Woluwe at the end of the month - a part of greater Brussels. Back to terre francophone.

At the same time, I will be moving Pedantry to a new format, a new URL and relaunching it. For those of you who have sat patiently waiting for the last two months as I haven't written squat, I apologise. I should be back to writing code for a living forthwith, a trade that lends itself far better to blogging.

Blog moving news will be posted here as it becomes available.
 


Saturday, October 18, 2003
 
Going to America on Tuesday

I'm off to Idaho until the 30th. I have a huge writing load for work that I'm taking with me, but I intend to blog, either here or on A Fistful of Euros, something about going back to the States after two years in Europe. Then, when I get back, I'm going to resume regular blogging both here and at A Fistful of Euros. I've breached the simplest rule of all writing, and especially of blogging, by not doing it frequently. It's been a tough couple of months, but I'm still planning to move to Moveable Type and resume regular blogging.

The last few weeks, I haven't even been reading the blogs much. I hope no one is feeling neglected or slighted through my failure to keep up with the blogosphere or link to their blogs lately. I've had a request to syndicate one of my posts in print, so that provides some motivation to work.

Edward's comment in the post below makes me think a little presentation of computer assisted translation might be in order, since that's what has been on my mind lately anyway. Anyway, I hope everybody is having a good fall.
 

Friday, October 10, 2003
 
Yet another fascinating extract from my current masterpiece

II. Measuring Generated Text Quality

Quality assessment is a recurring problem in the translation industry, predating the arrival of computers by many centuries. The Koine Greek translation of the Torah, prepared in the second and third centuries B.C.E., is reputed to have relied on either divine inspiration or peer review to guarantee quality, depending on which apocryphal account you prefer. Despite these efforts, it is regarded as a quite poor translation overall, and Hebrew scholars brought at least six centuries of correction to it. (Catholic Encyclopaedia 1911) Peer review, being somewhat easier to arrange than divine inspiration, is now widely accepted as the best general approach to ensuring translation quality. However, it is no less labour intensive than translation itself and consequently tends to be reserved for literary and scholarly works. It is generally minimal or non-existent in common commercial practice except in areas of unusually high liability like medical documentation and legal materials.

Evaluating translation quality has not, on the whole, changed in quite a long time. The SAE J2450 translation quality standard, which was only finalised in 2001, is little more than a way for a human reviewer to assign a number to their evaluation of the translated text. Efforts to evaluate MT quality have largely followed traditional practice by using a human evaluator to assess the quality of the output. These methods, however, have very serious limitations. They are quite labour intensive, so they are difficult to provide on a large enough basis to establish overall quality. The subjective nature of these evaluations makes uniformity a serious problem. Lastly, in an environment where the machine translation product is not expected to stand alone but is used as an aid to the human translator, it is very difficult to ensure that these quality evaluations genuinely reflect translator labour.

However, using generated texts to write final translations offers us an obvious way to evaluate their quality: We can compare the generated output to the final, human-produced translation. It is by comparison of the two that we can tell how much and how little we are actually assisting the translator. Evaluating machine translation quality cheaply and comprehensively means deploying an algorithm to evaluate the difference between the generated text segment and the final translation.

Fortunately, this problem has been addressed by a group of algorithms usually referred to as edit distance metrics by mathematicians and fuzzy string matching by computer scientists. These kinds of algorithms are already deployed in most translation memory systems under the label fuzzy matches. However, fuzzy matching has such a poor reputation among translators that we would prefer to use the more concise technical term edit distance to describe it.
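For the curious: the textbook example of such a metric is the Levenshtein distance, computed by dynamic programming. This is just my own illustrative sketch of the standard algorithm, not the particular metric our system uses, and applying it word-by-word rather than character-by-character is an obvious variation.

```python
def edit_distance(source, target):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions needed to turn
    `source` into `target`."""
    m, n = len(source), len(target)
    # prev[j] holds the distance between source[:i-1] and target[:j]
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n]

# A small distance between the generated segment and the final
# translation means the generator saved the translator real work.
print(edit_distance("kitten", "sitting"))  # 3
```

The same function works unchanged on lists of tokens, which is usually the more meaningful granularity for comparing a generated segment against a finished translation.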

Hey, I managed to get a reference to the Septuagint into a computer science grant proposal. I think I deserve some credit for that. :^)


Wednesday, October 08, 2003
 
Using the lazy web as a proofreader

I've been off-line for a week. Things have been a bit messy lately in real life. Along with a number of other complications to my life, my wife has gone back to the States for a month, and I expect to join her for a week at the end of the month. I also managed to get my old cellphone number back, so that's at least one good thing.

And, I have a new post up on A Fistful of Euros for anyone suffering from withdrawal due to my lack of blogging. :^)

Anyway, I have to make up some new, more technical materials for my company's grant proposal, and since some of my readers are also translators, I thought I might put up the first part - the section which offers no real clue how to clone our work. I'm open to reactions.

The research programme that we are advancing is motivated by a number of practical considerations as well as a particular theoretical model of the translation process.

Of the novel tools that the '90s introduced to the translation industry, it is apparent that only one has enjoyed genuine success and acceptance by translators: translation memory. We believe that the failure of machine translation to gain acceptance, despite being an older and far more ambitious technology that has absorbed far more time and funding, is substantially the failure of the cognitive models that have driven it.

The promoters of machine translation have traditionally viewed MT not as a labour saving device for translators but as a partial replacement of them. This sort of thinking continues to permeate discussions of MT within the translation industry, where the term "post-editing" is still used to describe the task of human translation in conjunction with MT systems. In this model, translation is a process driven by the MT system, and the translator is understood as a post-editor who adds value to a machine translated text. Translation memory, in contrast, is a translator driven system. It is nothing but a database of existing translations and its contents are entirely determined by translators. It is a genuine labour-saving device, since it minimises the translator's workload by making sure that for any particular segment of text the translator need only translate it once. The translator is neither replaced nor reduced to a lesser role, because every sentence in the translated text is still the work of a translator.
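To illustrate what "nothing but a database of existing translations" means in practice, here is a toy exact-match sketch of my own - the class name and example segments are invented, and real translation memory systems add fuzzy matching on top of this:

```python
class TranslationMemory:
    """A translation memory at its simplest: a lookup table of
    previously translated segments, whose contents are entirely
    determined by translators."""

    def __init__(self):
        self._store = {}  # source segment -> target segment

    def add(self, source, target):
        # Record a translator-produced pair, so no segment ever
        # needs to be translated more than once.
        self._store[source] = target

    def lookup(self, source):
        # Return the stored translation for an exact match,
        # or None if this segment has never been translated.
        return self._store.get(source)

tm = TranslationMemory()
tm.add("Close the door.", "Fermez la porte.")
print(tm.lookup("Close the door."))  # Fermez la porte.
print(tm.lookup("Open the window."))  # None
```

Every target segment that comes out of the lookup went in as a translator's own work, which is exactly why the tool saves labour without displacing the translator.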

We believe that translation should remain a process driven by translators, who remain the focus of all translation activity. The mechanical aids placed at translators' disposal should not be imagined as doing translation, but as devices designed to enhance the productive power of individuals. We contend that the task of these systems is to offer the translator a packet of information which is easily absorbed and which minimises the cognitive load of composing new translations. In this way, the computational apparatus which surrounds the translator acts as an extension of his or her own cognitive apparatus. Successful automation in the translation industry will be built on gains in machine aided translation, not automated translation, for the foreseeable future.

This distinction between machine-driven and translator-driven work lends itself to a family of models of activity and cognition generally known as distributed cognition. We are using a particular framework called Sociocultural Activity Theory to give our efforts a theoretical basis. (Vygotsky 1932/1986, Cole & Engeström 1993) This theory is increasingly important in the software design industry, which has long confronted difficulties in building software that enhances productivity. (Nardi, et al 1995, and Walenstein 2002) It advances a number of theoretical constructs that are useful in analysing the translation process, but we will only look at two of them here. The first of these is the idea that artefacts of some sort always stand between people and the objects of their activity. Second, artefacts in conjunction with human knowledge and abilities can form a single system, termed a "functional organ", in which the tool is adapted to the person and the person to the tool, enabling the whole to function better than the parts.

This sort of analysis suits the translation process quite well. Translating is a very information intensive process which, even in the pre-computer era, made heavy use of tools external to the translator. In the classical context, these were usually printed reference materials, such as dictionaries and glossaries as well as translations of related materials, and mechanical text production devices like typewriters. The typewriter in conjunction with the manual skills of the translator is a functional organ for the production of written texts. In the same sense, reference materials in conjunction with the linguistic capabilities of the translator are a functional organ for transforming information from one language into another. It is primarily this latter functional organ which is the object of our research, and we are largely concerned with the functioning and enhancement of those cognitive supports which are external to the translator.

Translators are human. They have limited memories, limited attention spans and suffer from fatigue and other performance-limiting phenomena. We cannot realistically change this property of human bodies, and some linguists believe that even if we could, our ability to learn and manipulate language might well be damaged rather than enhanced. (See Newport 1990, for example.) Yet, the qualities that we would most like to see in a translation are the very ones that the human translator is least naturally suited to give us: completeness, accuracy and consistency. Thus, the translator is compelled to use cognitive supports like dictionaries and term lists during translation.

This human frailty was a major motivation behind early MT. (Although admittedly the labour-intensive nature of translation was a more important motivator.) Machines are well suited to ensuring completeness, accuracy and consistency. However, despite over fifty years of effort, the core process of uncontrolled natural language translation still cannot be genuinely automated. The form and complexity of the information involved requires an authentically human knowledge of the world. (Bar Hillel 1960 is the classical source of this claim.) Even if we could construct computers potentially capable of storing and manipulating this encyclopaedic information about the world, it is not clear that there is any way to acquire this data except by embedding the computer in a slowly maturing human body.

Computers are, therefore, not well suited to the very human problem of constructing good translations. Consequently, enhancing the productivity of translators through automation means using computers to create better functional organs for translation. We must pay a great deal more attention to the interface between human translators and the machines that support their activity, and, although the translator must adapt to the machine, it is far more important for us to adapt the machine to the translator.

Machine translation, while it may not offer us much hope of substantially replacing the translator, does offer us the prospect of a very convenient interface between automated systems and the translator. We want, ideally, to generate a text that encapsulates the information that the translator would ordinarily be forced to search out in reference books and previous translations. This sort of comprehensive search and consistent result is the domain in which the computer excels, but where the human translator often fails. By putting the result in the form of a readable text, we minimise the additional cognitive load of interpreting this information. Where the text diverges only slightly from being a correct translation, the work of fixing it is quite simple. Where it diverges sharply, if it remains a readily comprehensible text which has, to the degree possible, used the terminology and usages which we would expect to find in a good translation, we believe that we have still made translators' work much easier by reducing the need to laboriously look up terms and check with previous translations.

Update: They move quickly over at Taccuino di Traduzione, where not only is this post linked to, but there is also a link to an article on machine translation in Italian. Alas, my Italian is not too good, so I used Babelfish as an aid in reading it. However, the stripped-down Systran code that powers Babelfish translated the title as "The bacon of the translator automatic rifle", which, I think, neatly demonstrates the point that the article is trying to make.