1/17/2025 at 6:07:43 PM
This was a fun read, as someone who's both a Korean BW player and a speech recognition researcher.
It's interesting to note that the original Korean transcription already has many errors, seemingly (and impressively) corrected by LLMs later on. For example, 12 안마당 빌드 (12 courtyard build) is actually 12 앞마당 빌드 (12 frontyard build), which might have been more understandable to BW players. Similarly, 투에처리 빌드 (processing-at-two build? makes no sense lol) should have been transcribed as 투해처리 빌드 (two-Hatchery build).
Therefore it may also be helpful to directly feed the slang dictionary into Whisper's inference process using contextual biasing. There are lots of ways to do this, but the simplest would be to increase the probability of slang words in the dictionary in the final prediction layer of Whisper by a constant factor. This is fairly easy to implement, for example by using HuggingFace's library: https://huggingface.co/docs/transformers/en/internal/generat...
by jaeyounkg
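A minimal sketch of the constant-factor biasing described above, added for illustration and not taken from the commenter or from any library. It assumes a toy vocabulary in which each word is a single token (real Whisper tokens are subword pieces, so a slang term would span several of them), and the names `bias_logits`, `vocab`, `slang_ids`, and the logit values are all hypothetical. Multiplying a token's probability by a constant is the same as adding the log of that constant to its logit before the softmax:

```python
import math

def bias_logits(logits, boosted_ids, factor):
    """Scale the unnormalized probability of each boosted token by `factor`.

    Adding log(factor) to a logit multiplies exp(logit) by factor, so the
    boost is applied before the softmax renormalizes everything.
    """
    bonus = math.log(factor)
    return [x + bonus if i in boosted_ids else x for i, x in enumerate(logits)]

def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy vocabulary: two near-homophones plus an unrelated token.
vocab = ["안마당", "앞마당", "빌드"]
logits = [2.0, 1.5, 0.1]             # made-up scores; the wrong word leads
slang_ids = {vocab.index("앞마당")}   # entries taken from the slang dictionary

before = softmax(logits)
after = softmax(bias_logits(logits, slang_ids, factor=3.0))

best_before = vocab[before.index(max(before))]
best_after = vocab[after.index(max(after))]
print(best_before, "->", best_after)  # prints: 안마당 -> 앞마당
```

In a real decoder the same adjustment would be applied to the model's output logits at every decoding step, and the factor would need tuning: too small and the slang term still loses, too large and it gets hallucinated where it was never spoken.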
1/17/2025 at 11:06:21 PM
Thanks for the added context on the builds! As a "foreign" BW player and fellow speech processing researcher, I agree shallow contextual biasing should help. While it's not difficult to implement, most generally available ASR solutions don't make it easy to use. There's a PR in ctranslate2 implementing the same feature so that it could be exposed in faster-whisper: https://github.com/OpenNMT/CTranslate2/pull/1789
by woodson
1/17/2025 at 8:32:10 PM
I am a StarCraft fan and I have no idea what a courtyard or a frontyard is supposed to be! However, I do know that the names of buildings, units, technologies, and strategies are usually heavily abbreviated in English. Perhaps the same is true in Korean? A 12 barracks build would usually just be called "12 rax", a two hatchery mutalisk build would be called "2 hatch muta", and a three hatchery hydralisk timing attack / all-in would be called "3 hatch hydra bust".
by chongli
1/17/2025 at 8:57:51 PM
I believe the equivalent term used in English (exhibited in the new translation) is "natural", short for "natural expansion", which refers to the obvious location where the player should build their first expansion. It sounds like the term used in Korean for this concept literally means "front yard" rather than matching the English term.
by rcthompson
1/17/2025 at 9:33:34 PM
Makes sense. And presumably the 12 means that you expand to your natural ("courtyard") with your 12th worker unit (probe, in the case of Protoss).
by Reason077
1/17/2025 at 11:03:11 PM
Not the parent commenter, but not always. 9 pool just means you build a spawning pool at your main, for instance. This worker-count-prefix naming system for build orders also breaks down once people start referencing builds like 2 rax academy, 3 hatch muta, etc.
by sushid
1/17/2025 at 11:07:01 PM
Right, "9 pool" means build a spawning pool when you have 9 workers. So "12 courtyard" means build an expansion when you have 12 workers.
by Reason077
1/17/2025 at 11:15:00 PM
I think strictly "9 pool" means you build the pool when you have 9 supply. However, before you build a spawning pool, the only thing you can build that consumes supply is workers.
by thaumasiotes
1/17/2025 at 9:13:48 PM
A lot of Korean slang is a little different. Source: not Korean but have been in the English community a long time and picked some stuff up.
"1rax double" is equivalent to "1rax expand" or "1rax CC". They use multi or double to mean expand in the early game. Instead of "cheese" or "all-in" they use "pil-sal-gi" which means ace/joker card or "han-bang" which means an army or attack on few resources.
I am not sure what short-hand they use for barracks, gateway, etc.
by starcraftgamer
1/17/2025 at 10:01:06 PM
> Instead of "cheese" or "all-in" they use "pil-sal-gi" which means ace/joker card
That’s a really interesting one to me! One thing I’ve noticed is that Koreans do not seem to have the same hangups / negative attitude towards cheese strategies as westerners do!
by chongli
1/17/2025 at 6:17:26 PM
Do they actually use the Korean word for, like, tossing something to refer to the Protoss? That’s a pretty funny cross-language pun if so.
by bee_rider
1/17/2025 at 6:23:51 PM
Half of the words in the Korean blurb are just romanizations. Even build is just bil-deu
by asdasdsddd
1/17/2025 at 10:53:52 PM
No, Protoss is just 토스, which is just a hangulization of "Toss", aka Protoss.
by sushid
1/17/2025 at 6:19:46 PM
Haha, no, I actually never associated this with the English word toss lol.
by jaeyounkg