Nojeok Hill: My View from the Top

Korean Translation Tip: Ornery Koreans Write Things Backward

In spite of the title of this article, most Koreans are not ornery, nor do they do things backward. They just write differently than we do in English.

Here are some examples.

Fractions and page numbers

Koreans don’t say “two-thirds” or “page two of three”; they say “of three, two” and “of three pages, the second page”. Fortunately, this applies only when the numbers are spoken or written out in long form. If you’re just writing numerals, then nothing changes.

This means the simplest solution when translating is to add a forward slash. In other words, translate both “Page 3 of 5” and "three-fifths" to "3/5". Otherwise, you'll have to write it as "5 페이지 중 3 페이지" and "5분의 3".
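For anyone generating these strings in software, here is a minimal Python sketch of the two options just described. The function names are hypothetical, made up for illustration; the output patterns are the ones quoted in this tip.

```python
def numeric_form(part, whole):
    """Numerals with a slash: same order in English and Korean."""
    return f"{part}/{whole}"


def korean_fraction(numerator, denominator):
    """Long-form Korean fraction: the denominator comes first ("of five, three")."""
    return f"{denominator}분의 {numerator}"


def korean_page_of(page, total):
    """Long-form Korean page reference: the total comes first."""
    return f"{total} 페이지 중 {page} 페이지"


print(numeric_form(3, 5))     # 3/5
print(korean_fraction(3, 5))  # 5분의 3
print(korean_page_of(3, 5))   # 5 페이지 중 3 페이지
```

Note how the arguments swap places in the two long-form functions. That swap is exactly what an auto-updating page-number field cannot do on its own.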

Korean Translation Tip – If it’s imperative that numbers from an English source stay in the same order in Korean for fractions and pages, then convert them to numerals. This is especially relevant with codes that auto-update, such as page numbering in Word. Otherwise, you'll find yourself making this Google-esque mistake!

Dates

Korean dates are written "year/month/day". It’s not usually a big deal to switch things around during translation, but in some cases, this can get complicated. We recently had to translate the following:

"Dates should be entered as ddmmmyyyy (Example: 14SEP2016)"

Unfortunately, we had no choice but to translate this with a long explanation that reads in English as:

"Dates must be entered in the day/month/year format, where the date is entered with two digits, the month with the three-letter English abbreviation in capital letters and the year with four digits (14SEP2016)."

Whew… That was a mouthful!

Korean Translation Tip – It’s easy to understand and translate Korean dates if you know the sequence, but don’t take it for granted that your Korean audience will be used to the English format for filling out forms.
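If a form's dates end up being processed in software, the reordering can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are hypothetical, and the month table covers only the three-letter English abbreviations in capitals, as in the example above.

```python
import re
from datetime import date

# Three-letter English month abbreviations, capitals only (as in 14SEP2016).
MONTHS = {"JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
          "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12}


def parse_ddmmmyyyy(text):
    """Parse a ddmmmyyyy string such as 14SEP2016 into a date."""
    match = re.fullmatch(r"(\d{2})([A-Z]{3})(\d{4})", text)
    if match is None or match.group(2) not in MONTHS:
        raise ValueError(f"not in ddmmmyyyy format: {text!r}")
    day, month, year = int(match.group(1)), MONTHS[match.group(2)], int(match.group(3))
    return date(year, month, day)


def year_month_day(d):
    """Numeric year/month/day order, as Korean dates are written."""
    return d.strftime("%Y/%m/%d")


def korean_long_date(d):
    """Long form with the Korean year (년), month (월) and day (일) markers."""
    return f"{d.year}년 {d.month}월 {d.day}일"


d = parse_ddmmmyyyy("14SEP2016")
print(year_month_day(d))    # 2016/09/14
print(korean_long_date(d))  # 2016년 9월 14일
```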

Addresses

Korean addresses are written in Korean starting from the largest units (country, province, city…) and moving to the smallest units (…street, building, house or office number), but the other way around in English.

Here’s how our address in Korea looks when written in English:

#2406 Chungang Heightsville, 23, Ansancheonseo-Ro

Danwon-Gu, Ansan-Si, Gyeonggi-Do 15361 Republic of Korea

This is the English rendering of it from Korean:

Republic of Korea, Gyeonggi-Do, Ansan-Si, Danwon-Gu

Ansancheonseo-Ro 23, Chungang Heightsville #2406 (15361)

Kind of weird, huh? Here's an article on it.

Korean Translation Tip – When translating English business cards to Korean, if your client wants the address translated to Korean (and most Western clients do!), then turn the order around.
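Turning the order around is mechanical once the address is split into its units. Here is a rough Python sketch, assuming the units have already been separated by hand; the function name is hypothetical, and the postal code is left out because it sits in a different position in each rendering above.

```python
def korean_unit_order(units_smallest_first):
    """Reverse address units from smallest-first (English) to largest-first (Korean)."""
    return list(reversed(units_smallest_first))


# Our address, split into units in English order (smallest unit first).
english_order = ["#2406", "Chungang Heightsville", "23", "Ansancheonseo-Ro",
                 "Danwon-Gu", "Ansan-Si", "Gyeonggi-Do", "Republic of Korea"]

print(korean_unit_order(english_order))
```

The result starts with "Republic of Korea" and ends with "#2406", matching the order of the second rendering above.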

AM/PM

The Korean for "AM" is "오전" and for "PM" is "오후", but these are added before the number, not after. So "8 o'clock AM" is written "오전 8시" and "8 o'clock PM" is "오후 8시".

Korean Translation Tip – You can get away without translating AM and PM to Korean; they are understandable by many Koreans in English. However, if you do translate them, then you have to put the Korean equivalents IN FRONT of the numbers, not AFTER.
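For anyone formatting times in software, the rule can be sketched in Python as follows. The function name is hypothetical. Rendering midnight as "오전 12시" and noon as "오후 12시" is my own assumption for the sketch, since Korean writers also use other expressions for those two times.

```python
def korean_clock_time(hour, minute=0):
    """Format a 24-hour time with the AM/PM word BEFORE the number."""
    marker = "오전" if hour < 12 else "오후"  # 오전 = AM, 오후 = PM
    hour12 = hour % 12 or 12                  # 0 and 12 both display as 12
    text = f"{marker} {hour12}시"
    if minute:
        text += f" {minute}분"
    return text


print(korean_clock_time(8))       # 오전 8시
print(korean_clock_time(20))      # 오후 8시
print(korean_clock_time(14, 30))  # 오후 2시 30분
```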

Sentence Structure

Considering how different the sentence structures are between Western languages and Korean, is it any wonder that Korean is written the other way around in the above examples? In fact, sometimes it seems Korean and English are polar opposites. If you need a refresher on this point, check out these two one-minute videos from past tips.

Korean Translation Tip: Handle Korean Line Breaks Like a Pro

With the robust multilingual support in Adobe InDesign and recent versions of other design packages, many clients are opting to handle Korean layout in-house.

Unfortunately, people with absolutely no knowledge of Korean can really butcher a layout job.

My Korean Translation Tips have addressed some of the most egregious mistakes and easy-to-fix issues, including Tip #16 (Cardinal Rules of Layout), #26 (Korean Font Differences), #31 (PowerPoint Tips), #29 (Spacing Issues in Word) and #32 (More Font Handling).

But line breaks are also a point of concern.

Suppose you've got this source text:

[Image: 2015-12-17_9-55-44]

And your Korean translation team delivers this fill-in-the-blank translation of it in Word:

[Image: 2015-12-17_10-20-33]

Don't lay it out into your design program like this:

[Image: 2015-12-17_9-57-23]

Or like this:

[Image: 2015-12-17_10-00-52]

Or even like this:

[Image: 2015-12-17_10-03-30]

These are not uncommon issues; they happen all the time, especially when Korean text is mixed with punctuation and English.

Korean Translation Tip, Part I – Hire us to do the most professional layout for you, or at least have us do an in-context proof of the text after you do the layout.

Korean Translation Tip, Part II – If you ignore the first half of this tip, be sure after layout to check all lines that start or end with punctuation and/or English to verify that the text matches the way the translation was delivered to you.
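If you can export the laid-out text as plain text with one layout line per text line (an assumption; not every design package makes this easy), the check in Part II can be partly automated. The Python sketch below only flags lines for a human to compare against the delivered translation. It cannot tell you whether a break is actually wrong. The function names are hypothetical.

```python
import unicodedata


def is_punctuation_or_english(ch):
    """True for ASCII letters and for any Unicode punctuation character."""
    if ch.isascii() and ch.isalpha():
        return True
    return unicodedata.category(ch).startswith("P")


def lines_to_check(laid_out_text):
    """Return 1-based numbers of lines that start or end with punctuation or English."""
    flagged = []
    for number, line in enumerate(laid_out_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue
        if is_punctuation_or_english(stripped[0]) or is_punctuation_or_english(stripped[-1]):
            flagged.append(number)
    return flagged


sample = "이름: ____\n주소는 다음과 같습니다\n(Name) 입력\nWord 파일로 전달했습니다"
print(lines_to_check(sample))  # [1, 3, 4]
```

Line 1 ends with a fill-in-the-blank underscore, line 3 starts with a parenthesis and line 4 starts with English, so all three get flagged. Line 2 starts and ends with Korean and is skipped.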

** BONUS – Do you see above that there are four fill-in-the-blank lines in both the source English and translated Korean? They aren't in the same sequence in the two languages! Want to know why? Check out this article and you'll understand: Tip #34 (Why You Can't Translate Phrase-by-Phrase Between English and Korean)

Korean Translation Tip: Why You Can’t Translate Phrase-by-Phrase Between English and Korean, Part II

Last month I posted a short video illustrating how you can't mix-and-match sentence fragments to make proper sentences between English and Korean.

A few people pointed out that this often works with English and Western languages, and so they weren't sure why I would trouble them with a video like this for Korean.

In response, I've put together another one here to emphasize how different Korean is from English and to put this matter to rest, once and for all.

Korean Translation Tip – Just because you can translate phrase-by-phrase between Western languages does not mean you can do it between Western and Asian languages.

Issues in Calculating Rates for KO>EN Translation Jobs, Revisited

Several years ago I posted an article about why I don't generally offer per-word rates for Korean>English translation. The following is from a recent email to a client, explaining things in a bit more detail.

======

Dear <Client>,
 
Here are the issues I can think of now which make it hard to use source word/character rates on Korean>English work.
  1. The majority of the work I get for KO>EN is scanned source files in PDF format, which can't be analyzed precisely until the translation is complete. On those jobs, fixed quotes in advance or target word billing are the most reasonable. Sometimes these PDFs can be converted to Word through OCR or the native Adobe Acrobat conversion. However, for various reasons, these word counts are extremely unreliable.
  2. Even if the files are editable, I find that it takes an extra measure of care to ensure everyone's talking about the same thing when referring to Korean words/characters. To make matters worse, if the language settings in Word aren't set right, the software will count Korean words as characters (or vice versa, I can't remember which right now) and that creates confusion. At least until a few years ago, Excel also didn't count Korean words and characters correctly.
  3. Korean does not have a long tradition of using words (or even writing left-to-right), and I find that Koreans are not as consistent in their use of spacing as we are in English. Therefore, what you find is that different writing styles yield different Korean word counts, even as the final English translated word count remains unchanged. Furthermore, when clients equate Korean with Chinese and Japanese, which don't use words, it adds another layer of confusion. Your colleague mentioned that internally you are assuming two Korean characters to be one word, but that is arbitrary. Korean words are calculated based on discrete units of meaning and are separated by spaces.
  4. Different types of content return different word count expansions. For example, Korean word lists will translate to English almost at one for one. However, because Korean grammar attaches tags to words and those tags are then translated to English as separate words, the expansion rate increases the more "prose-y" a text is. The expansions vary depending on subject matter, too.
  5. As with the current job, many Korean writers, especially on technical documents, mix a lot of English words into the text. These are embedded in the Korean grammar though and can't be excluded from the word count. However, if the letters of the English words are counted as characters (which is what happens if not analyzed separately), it runs the word count way up. On today's job, there were 1,500 English words mixed in with some 4,000 Korean words. That means rejigging the word counting formula to avoid overcharging. Counting source characters also means having to do something extra with numbers, since that also runs up the count. 
For all these reasons, it is so much easier to just use the English word counts, which are predictable and universally understood. But of course, it is true that this makes it hard to quote projects in advance. One solution is simply to ask me to quote projects first, if you have the time to wait. But as I mentioned to your colleague today, I've also started offering a character rate to clients that just have to have a source-based billing structure. But since it's imprecise, it's still best if I can analyze, adjust and quote the work in advance to take account of the various issues mentioned above. Keep in mind though that if I'm taking the risks of all these unknown factors with an advance quote, I'm also going to aim a bit high; generally, my most competitive pricing is available on English-word rates.
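To show how far the different counting methods can drift apart on the same text, here is a rough Python sketch. It is not how Word, Excel or any CAT tool counts; the function name, the category names and the sample sentence are all my own assumptions for illustration. A space-separated token containing any Hangul is treated as a Korean word, even when English or numbers are embedded in it.

```python
import re

HANGUL = re.compile(r"[\uac00-\ud7a3]")  # precomposed Hangul syllables
LATIN = re.compile(r"[A-Za-z]")


def analyze_source(text):
    """Count a Korean source text several ways to expose the discrepancies."""
    counts = {"korean_words": 0, "english_words": 0, "number_tokens": 0}
    for token in text.split():
        if HANGUL.search(token):
            counts["korean_words"] += 1   # includes mixed tokens like "API는"
        elif LATIN.search(token):
            counts["english_words"] += 1  # standalone English word
        elif any(ch.isdigit() for ch in token):
            counts["number_tokens"] += 1  # standalone number
    counts["hangul_chars"] = len(HANGUL.findall(text))
    counts["all_chars"] = sum(1 for ch in text if not ch.isspace())
    return counts


sample = "신규 API는 2015년 version 2.0 기준으로 작성되었습니다"
print(analyze_source(sample))
```

On this sample there are 5 Korean words, 1 English word and 1 number, but 32 non-space characters, of which only 15 are Hangul. A per-character rate applied to all 32 would bill the English letters and digits as if they were Korean characters.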
 

Korean Translation Tip: Why You Can’t Translate Phrase-by-Phrase Between English and Korean

We frequently get translation requests for content where the source text has been chopped up into sentence fragments. This is especially common with captions for video, since the content needs to show up on-screen in bite-sized pieces. But sometimes clients even send such requests because they want to be able to rearrange words themselves later, or because they sent over bilingual files for translation in a CAT tool which were improperly translated.

In the first case, as long as the source text forms complete thoughts and the translation doesn't have to correspond 1-for-1 by sentence fragment, we can translate it. But the "mix-and-match" approach is a recipe for disaster. 

Here's a video I put together to illustrate how structurally different Korean and English are and to show why the translation of complete thoughts must be done at the sentence level.

 

 

Korean Translation Tip – If you're ever tempted to ask that English sentences and phrases be translated into Korean in the order the words appear in the English (or vice versa), please watch this video again to remind yourself that English and Korean can't be connected in such a linear way.

Korean Translation Tip: Spacing Around Parentheses in Korean Looks Funky and Inconsistent

This tip is based on a reader question about the following graphic in last month's message.

[Image from koreanconsulting.typepad.com]

The reader asked why there isn't a space before or after the parentheses… Good question!

The answer may surprise you, but no, there should not be spaces there. The reason is hard to explain clearly without getting into complicated grammar, but the basic idea is that Korean is made up of character units functioning as standalone words, and also of character tags/markers attached to standalone words to indicate various grammatical meanings (such as subject, object, etc.).

In the case above, without the words in parentheses, the phrase would read "Study Hard" Campaign은, where 은 is a topic marker attached to the word Campaign. Therefore, the added parenthetical text is stuck right in between the word and its tag and no space is added on either side.

Keep in mind that spaces should be added around parentheses when additional text is not being stuffed between a word and its tag. This phenomenon seems to occur almost exclusively when English and/or numbers are inserted into Korean text. The spacing around parentheses within pure Korean text generally follows the same rules as we use in English (though Koreans get used to such usage and often go without spacing even when it should be there).

Korean Translation Tip – Correct spacing in Korean around parentheses often looks funky and inconsistent to English speakers. Feel free to bug your linguist for confirmation, but expect to get a response back saying it’s OK.

Microsoft Thanked Me for Renewing My Subscription to the Magazine “Office 365 Small Business Premium”

I have been using the Korean version of Office 365 Small Business Premium for a year and it's time to renew. A couple days ago, Microsoft sent me an email thanking me for renewing.

Only problem…

They used the word for "subscribe" that is only used when subscribing to things to read, such as newspapers and magazines.

And they didn't mess it up once… They used the wrong word four times in one email.

Check it out:

[Image: 2015-07-26_20-18-43]

In every case, the indicated Korean word should be changed to "사용권". In Korean, there is no straight translation for "subscription" in the English sense here. The correct word means "right to use", which, if you think about it, means exactly what it should.

Perhaps the linguists who worked on the job didn't know that they were translating for a software subscription, rather than a magazine subscription. Or perhaps they just reused old TM segments which had been translated for a magazine subscription situation. Or maybe they just weren't paying attention.

Whichever it was, the QA processes failed. 

For lots more of these, check out A Collection of Korean Translation Errors in the User Interfaces of Leading Software.

Explaining My Request for Feedback on Using Machine Translation and Cross-Project Translation Memory Leverage in High-Quality Translation Workflows

I recently asked my agency clients for their feedback on offering a rate discount in exchange for the right to use machine translation and termbase leveraging on certain projects. Machine translation brings up ideas of low-quality output, and cross-client TM sharing is generally off-limits due to confidentiality issues and rights to use content. Machine translation approaches are not without confidentiality and content rights complications, either. Here is a detailed explanation of my thoughts on the approach, as explained in an email to a client.

======

Dear [Client] – Thanks for the response and opportunity to explain what I'm thinking. (I'm afraid the following is longer than I expected when I started typing.)

You're right that MT has generally been considered just a cheap way to deliver low-quality output. And I agree that Google Translate and the others are currently useless as resources by themselves on high-quality translation projects. But I think there may be a way that MT can be utilized within a range of high-quality workflows. 
 
MT is now being used on high-quality workflows within certain VERY narrow sub-topics. One of my clients is apparently even making it work for Chinese and Japanese, so Korean is clearly not impossible. For example, the translation of a 500-page cutting robot manual can be used to train an MT engine to produce a very good job on another cutting robot manual. Apparently the process breaks down pretty quickly though. The text for a cutting robot manual may not be a good enough match even for training an MT engine to generate high quality output on an assembly robot manual.
 
If we bring the human translator into the process, along with a properly prepared TB, the MT may be able to bring in a few value-added suggestions right away that the professional can finalize efficiently.
 
In addition, it appears to me that the CAT tools are on the verge of getting MT, TMs and TBs to work together, so that if the software finds, say, an 80% match in the TM, it can identify what's different, replace words from the TB, and then machine translate ONLY the sub-segments that are different. I don't know why this couldn't easily start moving fuzzy matches up 10-20% right off the bat. And there are various algorithms for figuring out how good the MT match is likely to be. That means these high-quality segments could be removed from the translation step and only included for proofreading (i.e. post-MT edited, but to perfection, not the conventional "good enough" level). In this way, the translator is still doing his/her job on the segments where the MT/TM/TB combination doesn't get to the required threshold, and the entire document is still proofread and/or linguistically QAd, and the final product is possibly of a higher quality and consistency than otherwise. Over time, this approach should yield increasingly higher efficiency.
 
There can be no doubt that this is the direction things are moving in our industry and I would like to start experimenting with it now. However, to plug in the MT functionality to my CAT tool generally requires special client permission, and so I've never done it, not even once. And to use the TM from one job to leverage it to build up seed TMs in various fields to apply to other client projects is also out of the question without specific approval.
 
Therefore, my idea is to start by offering a penny discount on projects where the client gives me, in effect, an "indefinite, irrevocable right to use" their content for such an approach. It would never involve revealing full coherent documents in public, but would mean that the segments/TB entries would go into various reference TMs/TBs/corpora to be applied to other projects and/or that the translations we do with that content may be used to train MT engines which may also be used on other projects.
 
I would then watch where things go from there. If I'm just hitting dead-ends and this content isn't useful to my bottom line and doesn't look like it will in the future, I could stop offering the discount. On the other hand, if the approach works, I could even increase the discounts over time. 
 
I see it as a long-term approach. About ten years ago, I was worried that the technology would replace me eventually; now I'm cautiously taking the position that the technology is creating more opportunities than it is killing. For example, I remember Google AdWords about 10 years ago… It was doable as a layperson. Today, Google has built in so much complexity that I can't even find an expert who can do it properly and affordably. You would have thought that by now the process would have been automated, but it's gone in the opposite direction, and changes so fast that one cannot stay up on it without investing huge amounts of time in continuously learning.
 
I suspect translation is also going to unfold this way. At memoQfest in May, I realized that some of the approaches I use in translation with my team are unique, but that the software is changing so fast that I can't hope to use all the functionality that's available. Not only learning the software, but also figuring out how to apply it and then continuously updating those approaches to the changing landscape is a process that creates barriers to entry which are likely greater than they've ever been. You may have noticed that late last year I updated my email footer greeting to say "translation technologist". That's still a bit more of a "hopeful" title than it is in reality, but I also see a new role opening up even in the freelance side of things, which is the role that bridges project managers with translators. Project managers rarely have the time or inclination to really extract all the value and ensure all the quality that exists in the project stages between end-client and translator, especially if translators aren't using CAT tools. And this is even before the MT/TM/TB combination hits its stride. With the right skills and tools, a translation technologist could help achieve all kinds of benefits in the production chain. This part of my thinking is still in-development, but being able to use MT and leveraging TMs on a cross-client basis is surely a place to start.
 
Let me know your thoughts on this.
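To make the MT/TM/TB combination from the email a little more concrete, here is a toy Python sketch of the routing logic. It is only my own illustration under stated assumptions, not how memoQ or any other CAT tool works. Similarity is a simple word-level ratio from Python's difflib, the function names and sample data are hypothetical, and no actual machine translation is performed; the sketch only identifies which sub-segments would need it.

```python
from difflib import SequenceMatcher


def best_tm_match(source, tm):
    """Return (score, tm_source) for the TM entry most similar to source."""
    best_score, best_source = 0.0, None
    for tm_source in tm:
        score = SequenceMatcher(None, tm_source.split(), source.split()).ratio()
        if score > best_score:
            best_score, best_source = score, tm_source
    return best_score, best_source


def changed_subsegments(tm_source, source):
    """List the word runs in source that differ from the TM source."""
    a, b = tm_source.split(), source.split()
    opcodes = SequenceMatcher(None, a, b).get_opcodes()
    return [" ".join(b[j1:j2]) for tag, i1, i2, j1, j2 in opcodes
            if tag != "equal" and j2 > j1]


def plan_segment(source, tm, termbase, threshold=0.8):
    """Route a segment: full translation, or proofreading of a patched TM match."""
    score, tm_source = best_tm_match(source, tm)
    if tm_source is None or score < threshold:
        return {"route": "translate", "score": score}
    from_tb, needs_mt = [], []
    for sub in changed_subsegments(tm_source, source):
        if sub in termbase:
            from_tb.append((sub, termbase[sub]))  # termbase covers the change
        else:
            needs_mt.append(sub)                  # only this part would go to MT
    return {"route": "proofread", "score": score, "tm_target": tm[tm_source],
            "from_tb": from_tb, "needs_mt": needs_mt}


# Hypothetical sample data.
tm = {"Press the red button to stop the robot": "로봇을 정지하려면 빨간색 버튼을 누르십시오"}
termbase = {"green": "녹색"}

print(plan_segment("Press the green button to stop the robot", tm, termbase))
```

Here seven of the eight words match, giving a score of 0.875, and the one changed word is covered by the termbase, so nothing is left for MT and the segment goes to proofreading. A segment below the threshold stays with the translator.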

Korean Translation Tip: Correct Font Handling in Korean Layout

I've posted several short articles on Korean layout in this set of tips, including Cardinal Rules of Korean-Language Layout, Korean Layout Rules for MS PowerPoint and Spacing Issues in MS Word.

In this post, I introduce some font handling advice to improve the way Korean-language layout looks to Korean readers. Of course, this won't get you the same font and layout sensitivity my team delivers, but it will help you avoid a couple no-nos.

So here's the scoop…

The Korean fonts come with a set of double-byte punctuation marks to which spaces are added before or after. These extra spaces aren't needed in Korean; indeed, they look bad! You should use single-byte punctuation (i.e. the same marks we use in English).

In addition, Korean fonts include a collection of English letters. However, don't use these, either! Proper font mapping avoids this issue, but if you find that the fonts still aren't right, switch the English text back to an English font (most likely the font of the source document).

Here's what a string of text might look like if punctuation and English fonts are handled incorrectly:

[Image: 2015-07-15_15-26-28]

And here's how it should look:

[Image: 2015-07-15_15-29-47]

Korean Translation Tip A – Don't use double-byte punctuation in a Korean translation.

Korean Translation Tip B – Don't use Korean fonts for text that remains in English.
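Tip A can be partly automated if the text passes through a script before layout. The Python sketch below assumes that "double-byte punctuation" corresponds to Unicode's fullwidth forms of the ASCII punctuation marks, plus the ideographic space. That mapping is my own assumption, and other CJK punctuation marks are not covered. The function name is hypothetical. Tip B is a font issue and can't be fixed this way.

```python
import string

# Fullwidth forms sit exactly 0xFEE0 above their ASCII counterparts.
FULLWIDTH_TO_ASCII = {ord(ch) + 0xFEE0: ch for ch in string.punctuation}
FULLWIDTH_TO_ASCII[0x3000] = " "  # ideographic (fullwidth) space


def to_single_byte_punctuation(text):
    """Replace fullwidth punctuation with the marks we use in English."""
    return text.translate(FULLWIDTH_TO_ASCII)


print(to_single_byte_punctuation("안녕하세요！（Hello）"))  # 안녕하세요!(Hello)
```

Only punctuation is touched; Korean characters and ordinary English letters pass through unchanged.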