• September 6, 2021

Technical localization challenges and how to solve them

Stranger of Sword City Revisited (SoSCR) and Saviors of Sapphire Wings (SoSW) are two dungeon-crawling JRPGs that were previously released for PlayStation Vita, in 2016 and 2019 respectively.

In late 2019, NIS America reached out to us to discuss porting these two games to Nintendo Switch and PC. This article describes the technical localization challenges we encountered in the two games and how we solved them. It does not cover the actual translation of the text in either game, which was handled by NIS America.

The original situation

The original Vita version of SoSCR was already localized into English, so you might be wondering how difficult it could be to create English localized versions for the new platforms.

Well, the Japanese release and the English release of this game were two completely separate SKUs, with only the Japanese or English text and no in-game option to switch between the two languages. This was something that had to be implemented for a global release.

As for SoSW, the original game had only had a Japanese release, under the name 蒼き翼のシュバリエ (Aoki Tsubasa no Chevalier). There was no English text or release of any kind.

Japanese text encoding

One thing that applies to most Japanese games is that text and source code tend to be encoded in Shift-JIS rather than UTF-8, which is much more commonly used in software. This can cause syntax errors when the compiler interprets the source code in a different encoding and encounters unexpected characters. Our solution was to create a script that converts the source code to UTF-8.
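A conversion script of this kind can be quite small. The following is a minimal sketch rather than the script we actually used: the file suffixes, the function names, and the choice of Python's `cp932` codec (Microsoft's Shift-JIS variant) are all assumptions made for illustration. Files that already decode as UTF-8 are skipped, which is only a heuristic.

```python
from pathlib import Path

# Assumed set of source file types; adjust for the project at hand.
SOURCE_SUFFIXES = {".c", ".cpp", ".h", ".hpp"}


def convert_file_to_utf8(path: Path, source_encoding: str = "cp932") -> bool:
    """Re-encode one file to UTF-8. Returns True if the file was rewritten."""
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8 (this includes pure ASCII files)
    except UnicodeDecodeError:
        pass
    text = raw.decode(source_encoding)  # raises if the file is not Shift-JIS either
    path.write_bytes(text.encode("utf-8"))  # bytes in, bytes out: newlines untouched
    return True


def convert_tree_to_utf8(root: Path) -> list[Path]:
    """Convert every source file under root and return the files that changed."""
    changed = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in SOURCE_SUFFIXES:
            if convert_file_to_utf8(path):
                changed.append(path)
    return changed
```

Running `convert_tree_to_utf8(Path("src"))` a second time should report no changes, which makes it easy to check that a conversion was complete.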

However, there is a catch to doing this. If the source code contains hard-coded strings, they will be converted as well, which causes problems when these strings are passed to systems that still expect Shift-JIS. Fortunately for us, this was not an issue in either SoSCR or SoSW.
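To make the catch concrete, here is a small illustration with a made-up label (the string itself is hypothetical and not taken from either game). It shows that the ASCII part of a string is byte-for-byte identical in both encodings, while the Japanese part is not, so a consumer that still decodes as Shift-JIS no longer gets the original text back.

```python
label = "スタート START"  # hypothetical hard-coded string

shift_jis_bytes = label.encode("shift_jis")  # what a Shift-JIS-only system expects
utf8_bytes = label.encode("utf-8")           # what the literal becomes after conversion

# The ASCII tail is identical in both encodings...
ascii_tail_matches = (
    shift_jis_bytes.endswith(b" START") and utf8_bytes.endswith(b" START")
)

# ...but the Japanese part is not, so the two byte strings differ.
japanese_part_differs = shift_jis_bytes != utf8_bytes

# A consumer that still decodes as Shift-JIS will not recover the original text.
# errors="replace" only keeps this demonstration from raising.
misread = utf8_bytes.decode("shift_jis", errors="replace")
text_is_garbled = misread != label
```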

Misinterpreted text appears as seemingly random characters. Note that the Latin characters remain the same, as the two encodings overlap for those characters.

Asset text data

In both games, the text displayed to the player was spread across several different files and file types. This split is determined by the content creation tools, which decide which feature or screen ends up displaying the text in the game.

So if you ever find yourself porting a game, never assume you’ve found all of the text once you’ve come across the largest file that contains text data. A good practice is to search all files for common words and phrases, using a text editor or a custom script. This will help you discover every file in the project that contains text. Just be sure to check the binaries too!
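A custom search script along these lines might look like the sketch below. It reads every file as raw bytes, so binary files are covered too, and it tries each phrase in several encodings. The list of encodings and the function name are assumptions; which encodings matter depends on the project.

```python
from pathlib import Path

# Encodings worth trying (an assumption; extend as needed for the project).
CANDIDATE_ENCODINGS = ("shift_jis", "utf-8", "utf-16-le")


def find_files_containing(root: Path, phrases: list[str]) -> dict[Path, set[str]]:
    """Map each file under root to the phrases found in it, in any candidate encoding."""
    needles = []
    for phrase in phrases:
        for encoding in CANDIDATE_ENCODINGS:
            try:
                needles.append((phrase, phrase.encode(encoding)))
            except UnicodeEncodeError:
                continue  # phrase cannot be represented in this encoding
    hits: dict[Path, set[str]] = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()  # raw bytes, so binaries are searched as well
        for phrase, needle in needles:
            if needle in data:
                hits.setdefault(path, set()).add(phrase)
    return hits
```

Searching for a handful of words you know appear on screen, in both languages if available, is usually enough to reveal which files deserve a closer look.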

For this project, the text was divided into:

  • Text that is only displayed in the user interface.
  • Text that the characters say.
  • Text associated with objects, generally referred to as “type data” (items, spells, quests, etc.).
  • Text shown in a game manual.

The source of these different types of text was Excel spreadsheets. These spreadsheets were then used as input for a number of custom tools, which generated different files such as C++ headers, plain text files, or custom binary formats. We had to take all of these different sources of text data into account when we worked on the localization systems for the ports.

The tutorial text for both games is contained in a separate file

Localization solutions for SoSCR

For SoSCR, English and Japanese versions of these texts already existed. The solution to make the game support both languages at the same time was to build the compiled resources for both languages and load the correct file based on the language setting. So, for example, where we previously had a single file “dialog.dat”, we now had two files called “dialog_en.dat” and “dialog_jp.dat”.
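The file selection itself can be expressed in a few lines. This is only a sketch of the naming scheme described above; the function name and the set of language codes are assumptions, and the games themselves are not written in Python.

```python
from pathlib import Path

SUPPORTED_LANGUAGES = ("en", "jp")


def localized_resource_path(resource: str, language: str) -> Path:
    """Turn 'dialog.dat' plus 'en' into 'dialog_en.dat'."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    path = Path(resource)
    # Insert the language suffix between the file stem and its extension.
    return path.with_name(f"{path.stem}_{language}{path.suffix}")
```

The loading code stays the same for every resource type; only the path it is given changes with the language setting.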

This was not the only solution to the problem: there was also the option to combine both languages in the same file. This option ends up using less space on the hard drive since there is only one file, but it requires changes to both the tools that generate the file and the code that loads it. We found this to be the less ideal solution, since JRPGs like SoSCR have many different types of data; the amount of code that needed to be adapted was simply too great a risk. So in the end we opted for the first solution.

One case where this split file approach didn’t work was the custom script file used for the dialog. This file contained the text spoken by the characters, as well as game-specific commands, such as giving the party a key item or starting a new mission. Because of this, the file could not be reloaded while in-game. Attempting to do so would invalidate references to internal data structures and cause the game to crash. The tool that creates this custom script file lets you choose whether to use the English or Japanese text in the generated file. It then uses the text IDs from the script source files to determine which line of text to insert into the output file.

The solution we used here was to decouple the text substitution from the script tool. The custom script file now only contains the text IDs and no longer has the final text inserted. Instead, the game looks up each ID in the English or Japanese dialog text at run time, depending on the language setting.
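The idea can be sketched as follows. Everything here is illustrative: the command names, the text IDs, and the sample lines are placeholders, not data from the game. The point is that the script holds only IDs, so it is language-independent and never needs to be reloaded when the language changes.

```python
from dataclasses import dataclass

# Hypothetical per-language dialog tables keyed by text ID (placeholder text).
DIALOG_TEXT = {
    "en": {"MSG_0001": "Welcome to the guild.", "MSG_0002": "You received a key."},
    "jp": {"MSG_0001": "ギルドへようこそ。", "MSG_0002": "鍵を手に入れた。"},
}


@dataclass(frozen=True)
class ScriptCommand:
    opcode: str    # e.g. "say" or "give_item" (illustrative names)
    argument: str  # a text ID for "say", an item name for "give_item"


def resolve_text(text_id: str, language: str) -> str:
    """Look up the line for the current language at the moment it is shown."""
    return DIALOG_TEXT[language][text_id]


# The compiled script contains IDs only, never the final text.
SCRIPT = [
    ScriptCommand("say", "MSG_0001"),
    ScriptCommand("give_item", "old_key"),
    ScriptCommand("say", "MSG_0002"),
]


def spoken_lines(script: list[ScriptCommand], language: str) -> list[str]:
    """Resolve every 'say' command in the script for the given language."""
    return [resolve_text(c.argument, language) for c in script if c.opcode == "say"]
```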

One of the most common forms of type data in RPGs is items.

Localization solutions for SoSW

In the case of SoSW, making a version in two languages was more complicated. It also has a custom script file like SoSCR, but unlike the first game, the tool that generates the custom script file does not perform text substitution using text IDs. All spoken text is present directly in the source files of the custom script file.

Obviously, files like this cannot be handed to a localization team, because they also contain the lines of code used to select the required text. So the problem we faced was extracting these strings into an Excel spreadsheet, so that the workflow would be the same as for SoSCR.

To do this, we created a Python script that would iterate over the custom script file, gathering all the text together while replacing that text with an ID so that it could be used to look up a string in English or Japanese at run time.
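The sketch below shows the general shape of such an extraction pass. The script syntax is an assumption made up for this example (a `say "..."` command per line); the real script format is not shown in this article, and the ID scheme is likewise hypothetical.

```python
import re

# Assumed syntax: optional indentation, the command `say`, then a quoted string.
SAY_LINE = re.compile(r'^(\s*say\s+)"([^"]*)"(.*)$')


def extract_dialog(script_lines: list[str], id_prefix: str = "DLG"):
    """Return (rewritten script lines, {text ID: original text})."""
    rewritten = []
    table = {}
    for line in script_lines:
        match = SAY_LINE.match(line)
        if match is None:
            rewritten.append(line)  # not a dialog line: keep it untouched
            continue
        head, text, tail = match.groups()
        # Every occurrence gets its own ID, even if the text repeats, because
        # identical Japanese lines may need different translations in context.
        text_id = f"{id_prefix}_{len(table) + 1:04d}"
        table[text_id] = text
        rewritten.append(f"{head}{text_id}{tail}")
    return rewritten, table
```

The resulting table can then be written out in whatever format the translators work with, with one column for the ID, one for the Japanese source text, and an empty one for the English translation.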

Because English text generally takes up more space than Japanese, some UI elements need to be resized

Another text source that caused problems was the type data text. These entries tend to have short names for identification, and since Japanese requires fewer characters for these words, the corresponding members of the type data structures were not long enough to hold the English text.

At first, we tried to increase the byte size of these fields both in the game and in the tool that compiled the type data assets. Unfortunately, the tool turned out to be very fragile. So we ended up avoiding the problem by exporting the English type data text to a completely new file and referring to it from the game only if the language was set to English. Although this is not an ideal solution, it does not have any drawbacks for the game itself, so we were happy with it.
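As a rough illustration of both the problem and the workaround, consider the sketch below. The record layout, the 16-byte field size, and the item names are all invented for this example; the real type data formats are different. The English names live in a separate table that is consulted only when the language is English, so the original record format stays untouched.

```python
import struct

NAME_FIELD_SIZE = 16  # bytes reserved for a name in the record (assumed value)
RECORD = struct.Struct(f"<H{NAME_FIELD_SIZE}s")  # item ID + fixed-size name field


def fits_in_field(name: str, encoding: str) -> bool:
    """True if the encoded name, plus a terminating NUL, fits the fixed field."""
    return len(name.encode(encoding)) + 1 <= NAME_FIELD_SIZE


def display_name(record: bytes, language: str, english_names: dict[int, str]) -> str:
    """Use the separate English table when English is selected, else the record."""
    item_id, raw_name = RECORD.unpack(record)
    if language == "en" and item_id in english_names:
        return english_names[item_id]
    # The field is NUL-padded; keep everything before the first NUL byte.
    return raw_name.split(b"\x00", 1)[0].decode("shift_jis")
```

With these assumed sizes, a three-character Japanese name takes 6 bytes in Shift-JIS and fits easily, while an English name such as "Greater Healing Potion" needs 22 bytes and does not.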

One of the last issues we addressed was that, because English text generally takes up more space than Japanese, some UI elements had to be resized to fit. Shrinking the text is also an option here, but it tends to look pretty bad, especially when several pieces of the same type of text are on screen at once, such as in a list.

Lastly, there were also some images with text and videos with text. The solution here was to just use the multiple files approach again and load them based on language.

The location name background had to be changed to fit all location names correctly.

In the end, each project is a learning opportunity with its own challenges to overcome. This was the first time we’ve brought a dungeon crawling RPG to modern platforms and we’ve learned a lot from it.

Thomas Jongman was one of the leads on Codeglue’s Stranger of Sword City Revisited / Saviors of Sapphire Wings project and has been a game programmer at the company since 2019.
