Category: translate korean

Issues in Calculating Rates for KO>EN Translation Jobs, Revisited

Several years ago I posted an article about why I don't generally offer per-word rates for Korean>English translation. The following is from a recent email to a client, explaining things in a bit more detail.

======

Dear <Client>,
 
Here are the issues I can think of now which make it hard to use source word/character rates on Korean>English work.
  1. The majority of the work I get for KO>EN is scanned source files in PDF format, which can't be analyzed precisely until the translation is complete. On those jobs, fixed quotes in advance or target word billing are the most reasonable. Sometimes these PDFs can be converted to Word through OCR or the native Adobe Acrobat conversion. However, for various reasons, these word counts are extremely unreliable.
  2. Even if the files are editable, I find that it takes an extra measure of care to ensure everyone's talking about the same thing when referring to Korean words/characters. To make matters worse, if the language settings in Word aren't set right, the software will count Korean words as characters (or vice versa, I can't remember which right now) and that creates confusion. At least until a few years ago, Excel also didn't count Korean words and characters correctly.
  3. Korean does not have a long tradition of using words (or even writing left-to-right), and I find that Koreans are not as consistent in their use of spacing as we are in English. Therefore, what you find is that different writing styles yield different Korean word counts, even as the final English translated word count remains unchanged. Furthermore, when clients equate Korean with Chinese and Japanese which don't use words, it adds another layer of confusion. Your colleague mentioned that internally you are assuming two Korean characters to be one word, but that is arbitrary. Korean words are calculated based on discrete units of meaning, and separated by spaces. 
  4. Different types of content return different word count expansions. For example, Korean word lists will translate to English almost at one for one. However, because Korean grammar attaches tags to words and those tags are then translated to English as separate words, the expansion rate increases the more "prose-y" a text is. The expansions vary depending on subject matter, too.
  5. As with the current job, many Korean writers, especially on technical documents, mix a lot of English words into the text. These are embedded in the Korean grammar though and can't be excluded from the word count. However, if the letters of the English words are counted as characters (which is what happens if not analyzed separately), it runs the word count way up. On today's job, there were 1,500 English words mixed in with some 4,000 Korean words. That means rejigging the word counting formula to avoid overcharging. Counting source characters also means having to do something extra with numbers, since that also runs up the count. 
For all these reasons, it is so much easier to just use the English word counts, which are predictable and universally understood. But of course, it is true that this makes it hard to quote projects in advance. One solution is simply to ask me to quote projects first, if you have the time to wait. But as I mentioned to your colleague today, I've also started offering a character rate to clients that just have to have a source-based billing structure. But since it's imprecise, it's still best if I can analyze, adjust and quote the work in advance to take account of the various issues mentioned above. Keep in mind though that if I'm taking the risks of all these unknown factors with an advance quote, I'm also going to aim a bit high; generally, my most competitive pricing is available on English-word rates.
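To make the counting problem in points 2 and 5 concrete, here is a minimal Python sketch of how one might analyze an editable source file before quoting. It is an illustration only, not the formula I actually use for billing: the function name `analyze_source` and the regular expressions are assumptions made up for this example, and it only recognizes precomposed Hangul syllables (U+AC00 to U+D7A3).

```python
import re

# Assumed patterns for this sketch; real documents need more careful rules.
HANGUL_SYLLABLE = re.compile(r"[\uac00-\ud7a3]")   # precomposed Hangul syllables only
ENGLISH_WORD = re.compile(r"[A-Za-z]+")            # runs of Latin letters
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")            # 95, 1,500, 3.14, ...


def analyze_source(text):
    """Return several competing 'counts' for the same mixed Korean/English text."""
    english_words = ENGLISH_WORD.findall(text)
    return {
        # What a word processor would report as "words": space-delimited chunks.
        "space_delimited_words": len(text.split()),
        # What a naive character count reports: every non-space character.
        "non_space_chars": sum(1 for ch in text if not ch.isspace()),
        # Korean characters only, excluding embedded English and digits.
        "hangul_chars": len(HANGUL_SYLLABLE.findall(text)),
        # Embedded English, counted as words rather than as letters.
        "english_words": len(english_words),
        "english_letters": sum(len(word) for word in english_words),
        "numbers": len(NUMBER.findall(text)),
    }


sample = "CPU 사용률이 95%를 넘으면 log file을 확인합니다."
for name, value in analyze_source(sample).items():
    print(f"{name}: {value}")
```

On the sample sentence, the naive character count is double the Hangul-only count, which is the kind of gap that leads to overcharging when English letters and digits are billed as Korean characters.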
 

Korean Translation Tip: Why You Can’t Translate Phrase-by-Phrase Between English and Korean

We frequently get translation requests for content where the source text has been chopped up into sentence fragments. This is especially common with captions for video, since the content needs to show up on-screen in bite-sized pieces. But sometimes clients even send such requests because they want to be able to rearrange words themselves later, or because they sent over bilingual files for translation in a CAT tool which were improperly translated.

In the first case, as long as the source text forms complete thoughts and the translation doesn't have to correspond 1-for-1 by sentence fragment, we can translate it. But the "mix-and-match" approach is a recipe for disaster. 

Here's a video I put together to illustrate how structurally different Korean and English are and to show why the translation of complete thoughts must be done at the sentence level.

 

 

Korean Translation Tip – If you're ever tempted to ask that English sentences and phrases be translated into Korean in the order the words appear in the English (or vice versa), please watch this video again to remind yourself that English and Korean can't be connected in such a linear way.

Korean Translation Tip: Spacing Around Parentheses in Korean Looks Funky and Inconsistent

This tip is based on a reader question about the following graphic in last month's message.

image from koreanconsulting.typepad.com

The reader asked why there isn't a space before or after the parentheses… Good question!

The answer may surprise you, but no, there should not be spaces there. The reason is hard to explain clearly without getting into complicated grammar, but the basic idea is that Korean is made up of character units functioning as standalone words, and also of character tags/markers attached to standalone words to indicate various grammatical meanings (such as subject, object, etc.).

In the case above, without the words in parentheses, the phrase would read "Study Hard" Campaign은, where 은 is a topic marker attached to the word Campaign. Therefore, the added parenthetical text is stuck right in between the word and its tag and no space is added on either side.

Keep in mind that spaces should be added around parentheses when additional text is not being stuffed between a word and its tag. This phenomenon seems to occur almost exclusively when English and/or numbers are inserted into Korean text. The spacing around parentheses within pure Korean text generally follows the same rules as we use in English (though Koreans get used to such usage and often go without spacing even when it should be there).

Korean Translation Tip – Correct spacing in Korean around parentheses often looks funky and inconsistent to English speakers. Feel free to bug your linguist for confirmation, but expect to get a response back saying it’s OK.

Microsoft Thanked Me for Renewing My Subscription to the Magazine “Office 365 Small Business Premium”

I have been using the Korean version of Office 365 Small Business Premium for a year and it's time to renew. A couple days ago, Microsoft sent me an email thanking me for renewing.

Only problem….

They used the word for "subscribe" that is only used when subscribing to things to read, such as newspapers and magazines.

And they didn't mess it up once… They used the wrong word four times in one email.

Check it out:

2015-07-26_20-18-43

In every case, the indicated Korean word should be changed to "사용권". In Korean, there is no straight translation for "subscription" in the English sense here. The correct word means "right to use", which, if you think about it, means exactly what it should.

Perhaps the linguists who worked on the job didn't know that they were translating for a software subscription, rather than a magazine subscription. Or perhaps they just reused old TM segments which had been translated for a magazine subscription situation. Or maybe they just weren't paying attention.

Whichever it was, the QA processes failed. 

For lots more of these, check out A Collection of Korean Translation Errors in the User Interfaces of Leading Software.

Explaining My Request for Feedback on Using Machine Translation and Cross-Project Translation Memory Leverage in High-Quality Translation Workflows

I recently asked my agency clients for their feedback on offering a rate discount in exchange for the right to use machine translation and termbase leveraging on certain projects. Machine translation brings up ideas of low-quality output, and cross-client TM sharing is generally off-limits due to confidentiality issues and rights to use content. Machine translation approaches are not without confidentiality and content rights complications, either. Here is a detailed explanation of my thoughts on the approach, as explained in an email to a client.

======

Dear [Client] – Thanks for the response and opportunity to explain what I'm thinking. (I'm afraid the following is longer than I expected when I started typing.)

You're right that MT has generally been considered just a cheap way to deliver low-quality output. And I agree that Google Translate and the others are currently useless as resources by themselves on high-quality translation projects. But I think there may be a way that MT can be utilized within a range of high-quality workflows. 
 
MT is now being used on high-quality workflows within certain VERY narrow sub-topics. One of my clients is apparently even making it work for Chinese and Japanese, so Korean is clearly not impossible. For example, the translation of a 500-page cutting robot manual can be used to train an MT engine to produce a very good job on another cutting robot manual. Apparently the process breaks down pretty quickly though. The text for a cutting robot manual may not be a good enough match even for training an MT engine to generate high quality output on an assembly robot manual.
 
If we bring the human translator into the process, along with a properly prepared TB, the MT may be able to bring in a few value-added suggestions right away that the professional can finalize efficiently.
 
In addition, it appears to me that the CAT tools are on the verge of getting MT, TMs and TBs to work together, so that if the software finds, say, an 80% match in the TM, it can identify what's different, replace words from the TB, and then machine translate ONLY the sub-segments that are different. I don't know why this couldn't easily start moving fuzzy matches up 10-20% right off the bat. And there are various algorithms for figuring out how good the MT match is likely to be. That means these high-quality segments could be removed from the translation step and only included for proofreading (i.e. post-MT edited, but to perfection, not the conventional "good enough" level). In this way, the translator is still doing his/her job on the segments where the MT/TM/TB combination doesn't get to the required threshold, and the entire document is still proofread and/or linguistically QAd, and the final product is possibly of a higher quality and consistency than otherwise. Over time, this approach should yield increasingly higher efficiency.
 
There can be no doubt that this is the direction things are moving in our industry and I would like to start experimenting with it now. However, to plug in the MT functionality to my CAT tool generally requires special client permission, and so I've never done it, not even once. And to use the TM from one job to leverage it to build up seed TMs in various fields to apply to other client projects is also out of the question without specific approval.
 
Therefore, my idea is to start by offering a penny discount on projects where the client gives me, in effect, an "indefinite, irrevocable right to use" their content for such an approach. It would never involve revealing full coherent documents in public, but would mean that the segments/TB entries would go into various reference TMs/TBs/corpora to be applied to other projects and/or that the translations we do with that content may be used to train MT engines which may also be used on other projects.
 
I would then watch where things go from there. If I'm just hitting dead-ends and this content isn't useful to my bottom line and doesn't look like it will in the future, I could stop offering the discount. On the other hand, if the approach works, I could even increase the discounts over time. 
 
I see it as a long-term approach. About ten years ago, I was worried that the technology would replace me eventually; I'm cautiously taking the position that the technology is creating more opportunities than it is killing. For example, I remember Google Adwords about 10 years ago… It was doable as a layperson. Today, Google has built in so much complexity that I can't even find an expert who can do it properly and affordably. You would have thought that by now the process would have been automated, but it's gone in the opposite direction, and changes so fast that one cannot stay up on it without investing huge amounts of time in continuously learning.
 
I suspect translation is also going to unfold this way. At memoQfest in May, I realized that some of the approaches I use in translation with my team are unique, but that the software is changing so fast that I can't hope to use all the functionality that's available. Not only learning the software, but also figuring out how to apply it and then continuously updating those approaches to the changing landscape is a process that creates barriers to entry which are likely greater than they've ever been. You may have noticed that late last year I updated my email footer greeting to say "translation technologist". That's still a bit more of a "hopeful" title than it is in reality, but I also see a new role opening up even in the freelance side of things, which is the role that bridges project managers with translators. Project managers rarely have the time or inclination to really extract all the value and ensure all the quality that exists in the project stages between end-client and translator, especially if translators aren't using CAT tools. And this is even before the MT/TM/TB combination hits its stride. With the right skills and tools, a translation technologist could help achieve all kinds of benefits in the production chain. This part of my thinking is still in development, but being able to use MT and leveraging TMs on a cross-client basis is surely a place to start.
 
Let me know your thoughts on this.
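
For readers who want to see the sub-segment idea from the email in concrete terms (find a fuzzy TM match, identify what's different, and send only the differing pieces on to the TB or MT), here is a rough Python sketch. It is not how memoQ or any other CAT tool actually implements fuzzy matching; it just uses the standard library's `difflib.SequenceMatcher` on space-delimited tokens, and the function name `changed_subsegments` is made up for this example.

```python
from difflib import SequenceMatcher


def changed_subsegments(tm_source, new_source):
    """Compare a new source segment to a TM segment, token by token.

    Returns (similarity, changed), where similarity is difflib's ratio
    between 0 and 1, and changed lists the token runs in the new segment
    that have no counterpart in the TM segment. Tokens that were simply
    deleted from the TM segment are not reported in this sketch.
    """
    tm_tokens = tm_source.split()
    new_tokens = new_source.split()
    matcher = SequenceMatcher(None, tm_tokens, new_tokens, autojunk=False)

    changed = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" mark tokens that only exist in the new segment.
        if tag in ("replace", "insert"):
            changed.append(" ".join(new_tokens[j1:j2]))
    return matcher.ratio(), changed


similarity, changed = changed_subsegments(
    "Press the red button to stop the cutting robot",
    "Press the red button to stop the assembly robot",
)
print(f"match: {similarity:.0%}, needs new translation: {changed}")
```

In this toy case only the word "assembly" would need to come from the termbase or the MT engine; the rest of the segment could be reused from the TM and passed straight to proofreading.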

Korean Translation Tip: Correct Font Handling in Korean Layout

I've posted several short articles on Korean layout in this set of tips, including Cardinal Rules of Korean-Language Layout, Korean Layout Rules for MS PowerPoint and Spacing Issues in MS Word.

In this post, I introduce some font handling advice to improve the way Korean-language layout looks to Korean readers. Of course, this won't get you the same font and layout sensitivity my team delivers, but it will help you avoid a couple no-nos.

So here's the scoop…

The Korean fonts come with a set of double-byte punctuation marks to which spaces are added before or after. These extra spaces aren't needed in Korean; indeed, they look bad! You should use single-byte punctuation (i.e. the same marks we use in English).

In addition, Korean fonts include a collection of English letters. However, don't use these, either! Proper font mapping avoids this issue, but if you find that the fonts still aren't right, switch the English text back to an English font (most likely the font of the source document).

Here's what a string of text might look like if punctuation and English fonts are handled incorrectly:

2015-07-15_15-26-28

And here's how it should look:

2015-07-15_15-29-47

 Korean Translation Tip A – Don't use double-byte punctuation in a Korean translation.

 Korean Translation Tip B – Don't use Korean fonts for text that remains in English.
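
If you have the translated text in a plain-text or exported form, Tip A can be applied in bulk. The following Python sketch is one possible approach, not a tool my team uses: it relies on the fact that Unicode's full-width punctuation (U+FF01 to U+FF5E) sits at a fixed offset from the ordinary ASCII marks, and the function name `to_single_byte_punctuation` is made up for this example. Full-width letters and digits are deliberately left alone, and the result should still be reviewed by a linguist, since spacing after the converted marks may need adjusting.

```python
# Full-width forms U+FF01..U+FF5E mirror ASCII U+0021..U+007E at a fixed offset.
FULLWIDTH_OFFSET = 0xFEE0

# Map only punctuation and symbols; skip full-width letters and digits.
PUNCT_TABLE = {
    code: chr(code - FULLWIDTH_OFFSET)
    for code in range(0xFF01, 0xFF5F)
    if not chr(code - FULLWIDTH_OFFSET).isalnum()
}
PUNCT_TABLE[0x3000] = " "  # ideographic (full-width) space becomes a normal space


def to_single_byte_punctuation(text):
    """Replace full-width punctuation with the single-byte marks used in English."""
    return text.translate(PUNCT_TABLE)


print(to_single_byte_punctuation("안녕하세요！ （테스트）"))
```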

Korean Translation Tip: Applying the Cardinal Rules of Korean-Language Layout to Microsoft PowerPoint Files

I've previously written about the Cardinal Rules of Korean-Language Layout. When clients handle layout on their end, I frequently send them the link to this article so that they or their layout expert can brush up quickly on the rules.

Of course, to implement those correctly, you've got to know how to use the software. For readers working in advanced design programs, I assume you know how.

This tip though is for people dealing with Korean in PowerPoint files who don't feel like paying somebody to fix things, but still want to deliver a good job.

—–

Occasionally a Korean PowerPoint slide will end up with text like the following excerpt from my master's thesis.

2014-04-06_4-50-54

See those ugly red lines under the text marked by the red arrow? That's PowerPoint's quality checker indicating non-existent language problems. And the blue arrow shows a Korean word split incorrectly at the end of the line.

How do you fix these two issues if you don't know Korean?

The answer is going to seem too easy, but it's amazing how hard it was for me to figure it out. (In fact, I didn't figure it out; I had to ask my super-smart layout guy Xiang for the answer!)

To fix things, select the text and then change the language to Korean as follows:

2014-04-06_4-52-00

Doing so produces this correctly formatted text:

2014-04-06_4-53-58

The above is still not my preferred style though. If you really want to make it look nice, right- AND left-justify the text to get this perfect specimen:

2014-04-06_4-54-15

Korean Translation Tip – Follow the above procedure on PowerPoint slides to make the text look like it should; otherwise, if the language settings aren't right, your great Korean translation may still look terrible.

——-

** By the way, setting the language correctly solves problems in PowerPoint files that contain other languages, too!

Korean Translation Tip: Three (3) Number-Related Tips in One (1) Easy Article

In previous posts, I've written about number units unique to Korean and how Korean prose doesn't include a lot of spelled-out numbers.

Here are three more tips…

———-

Many Koreans handwrite their numbers differently than we do in English. I used to think this was a generational thing, but I occasionally see funny number writing from young people, too. 

Translation Tip #1 – When translating handwritten Korean text, watch out for these variants on the numerals "9" and "8".

2014-04-06_3-19-12

———-

Most Koreans know how Roman numerals work, but it's not a normal system for writing in Korean. Why risk it when translating?

Korean Translation Tip #2 –  It's generally safer to change Roman numerals to Arabic numerals (ex: "Stage IV" >>  "4 단계") when translating from English to Korean.
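
For large English source files, it can help to find the Roman numerals before translation starts. Here is a small Python sketch along those lines. The list of labels (Stage, Phase, Part, Chapter) and both function names are assumptions for this example; the label restriction is there so that ordinary words such as the pronoun "I" aren't flagged, and the converter does not check whether a numeral is well-formed.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Only look at Roman numerals that follow one of these (assumed) labels.
LABELLED_NUMERAL = re.compile(r"\b(Stage|Phase|Part|Chapter)\s+([IVXLCDM]+)\b")


def roman_to_int(numeral):
    """Convert a Roman numeral such as 'IV' to an integer (no validation)."""
    total = 0
    previous = 0
    # Read right to left: a smaller value before a larger one is subtracted.
    for letter in reversed(numeral.upper()):
        value = ROMAN_VALUES[letter]
        if value < previous:
            total -= value
        else:
            total += value
        previous = value
    return total


def find_roman_numerals(text):
    """Return (matched text, Arabic value) pairs for labelled Roman numerals."""
    return [
        (match.group(0), roman_to_int(match.group(2)))
        for match in LABELLED_NUMERAL.finditer(text)
    ]


print(find_roman_numerals("Stage IV follows Stage III; see Chapter IX."))
```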

———-

There are two ways of spelling and pronouncing numbers in Korean: the native-Korean way and the Chinese-derived way. Usage depends on context and/or what's being counted, and the correct approach is typically one or the other; not a choice of whichever the speaker prefers. This can be a point of confusion for late learners of Korean like yours truly, but it's second-nature to native Korean speakers.

The issue is particularly relevant for the numbers 1 through 99, which is the same range of numbers Koreans prefer to write as Arabic numerals, instead of spelling them out like we often do in English. (See link in first sentence above for details.)

Since the same Arabic numerals are used regardless of pronunciation or spelling and a Korean translation is likely to use those numerals, this issue normally remains invisible to non-Korean speakers. However, when the numbers are spelled out (which does happen, though not often), there could be situations where they appear to have been done so inconsistently even though they are correct.

Here's an example. The number five written out in native Korean is 다섯, but in Chinese-derived Korean is 오. "Five hours" is commonly written as "5시간" but could be spelled out as "다섯 시간". On the other hand, "five minutes" is best translated as "5분" but might also be written as "오분".
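
The same point can be shown as a small lookup in Python. This is only an illustration covering the digits 1 through 9 and a handful of counters. The counter table is a tiny sample chosen for this example (not a complete rule set), and the function name `number_word` is made up. Note that the native numbers 1 through 4 take a shortened form in front of a counter.

```python
# Native Korean number words as used in front of a counter (1-4 are shortened).
NATIVE = {1: "한", 2: "두", 3: "세", 4: "네", 5: "다섯",
          6: "여섯", 7: "일곱", 8: "여덟", 9: "아홉"}

# Chinese-derived (Sino-Korean) number words.
SINO = {1: "일", 2: "이", 3: "삼", 4: "사", 5: "오",
        6: "육", 7: "칠", 8: "팔", 9: "구"}

# A small sample of counters and the number system each one takes.
COUNTER_SYSTEM = {
    "시간": NATIVE,  # hours (duration)
    "시": NATIVE,    # o'clock
    "개": NATIVE,    # general items
    "분": SINO,      # minutes
    "년": SINO,      # years
    "원": SINO,      # won (currency)
}


def number_word(number, counter):
    """Return the spelled-out Korean word for 1-9 that fits the given counter."""
    return COUNTER_SYSTEM[counter][number]


print(number_word(5, "시간"))  # five hours uses the native word
print(number_word(5, "분"))    # five minutes uses the Chinese-derived word
```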

Korean Translation Tip #3 – You're unlikely to get confused by this when reviewing a Korean translation since these numbers will usually be written as Arabic numerals. But just keep in mind that it's possible the same numbers (especially smaller ones like the digits 1 through 9) may appear to be written out inconsistently even in cases where they are correct.

Yet Another Korean Translation Mistake in Google Android

My tablet recently upgraded itself and the following message appeared during the process:

20150123_2309235

It says:

Android is upgrading… 

Optimizing 92 of 116 apps

So the device is only optimizing 92 apps and it is not going to optimize the other 24? 

What Google meant to say was:

Android is upgrading…

Optimizing the 92nd of 116 apps

So, the correct translation should be

Android 업그레이드 중

앱 116개 중 92번째 것을 최적화 중

 

============

 

A Korean Translation Error in Microsoft Word 2013

This is the standard print dialogue in the Korean version of Microsoft Word:

2015-02-05_13-41-45

The text in the red circle says "Number of pages" (as in one page, 20 pages, etc.). However, as explained by the pop-up tip in the blue box, it's the spot for entering which pages to print (such as pages 5-10, or pages 5, 7 and 8).

The original English version would have said something like "Page numbers". But in Korean the correct translation here should be "페이지 번호/범위", which literally means "Page numbers/range".

 

========