Foreign Characters for the Eclipse Build System
Having a problem with Eclipse and building files with foreign characters in the file name? If you are developing software, then read and follow this advice:
“Do NOT use foreign characters in file names, paths or for anything else!”
By ‘foreign characters’ I mean things like éöüàäü, or simply anything outside the 7-bit ASCII range, even if the file system of your operating system (e.g. Windows) allows it. Many of these characters do exist in the Windows-1252 code page, but other code pages encode them differently, which is where the trouble starts.
Or in other words: only use these characters for file or directory names:
Following that advice will keep you out of a lot of trouble, because many tool chains simply do not handle anything else well. You might be able to use spaces in file names, but to stay on the safe side: don’t.
If you follow that rule, you are fine and you can stop reading this article now.
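As a minimal sketch of that rule, the following Python check flags file names that contain anything outside a conservative character set. The allowed set used here (ASCII letters, digits, underscore, hyphen and dot) is an assumption for illustration, and the function name is hypothetical.

```python
import re

# Assumed "safe" set for illustration: ASCII letters, digits, '_', '-' and '.'.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def is_safe_name(name: str) -> bool:
    """Return True if the name uses only the assumed safe characters."""
    return _SAFE_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    # An umlaut and a space are both rejected by this conservative check.
    for candidate in ["main.c", "Übung.c", "my file.c"]:
        print(candidate, is_safe_name(candidate))
```

Running such a check over a project folder before a build is one way to catch a problematic name early, before the tool chain trips over it.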
Eclipse and Foreign Characters
You are still reading? So if you have foreign characters in your file or directory names, then here is how to work around at least some of the Eclipse (and Windows!) issues with them.
Eclipse itself deals pretty well with foreign characters. The issue is with Windows and the command prompt/DOS shell.
The issue can be observed in Eclipse, e.g. in Texas Instruments Code Composer Studio v5. A file name with an umlaut fails the build:
CCS Build and File with Umlaut
Obviously, the ‘Ü’ character of the source file is handled properly by Eclipse, but not by the build (make) system, which runs command line tools at the DOS/cmd level.
Same thing with CodeWarrior and ARM gcc: the compiler is using a wrong file name:
GNU gcc build failure
Inspecting the make files shows that things are ok here:
So something is going wrong when make and the compiler are called. It looks like a wrong character code translation happens on the way from Eclipse to the command prompt (DOS command line) level, and that the code pages do not match on my machine.
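To illustrate what such a mismatch does to a file name, here is a small Python sketch. It assumes the two code pages involved are Windows-1252 and the OEM code page 850; the helper name is hypothetical. The same character ‘Ü’ is stored as a different byte in each code page, so bytes written with one and read with the other no longer spell the original name.

```python
def reinterpret(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one code page and decode the bytes with another."""
    return text.encode(written_as).decode(read_as, errors="replace")


if __name__ == "__main__":
    name = "Ü.c"
    # The byte used for 'Ü' differs between the two code pages.
    print(name.encode("cp1252"))
    print(name.encode("cp850"))
    # Written as cp1252 but read as cp850: the name no longer matches.
    print(reinterpret(name, "cp1252", "cp850") == name)
    # Plain ASCII names survive the round trip unchanged.
    print(reinterpret("main.c", "cp1252", "cp850") == "main.c")
```

This is also why sticking to 7-bit ASCII avoids the problem: those characters have the same byte values in both code pages.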
How do I find out which code page cmd.exe uses? This excellent article shows that the command chcp (for change code page), called without arguments, shows the active code page:
chcp in the cmd.exe
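The same query can be scripted. The Python sketch below is only an illustration with hypothetical function names: it runs chcp through cmd.exe and pulls the number out of the reply. It assumes the reply ends with the code page number (for example "Active code page: 850" on an English system); the surrounding text is localized, so only the digits are parsed.

```python
import re
import subprocess
import sys
from typing import Optional


def parse_chcp_output(output: str) -> Optional[int]:
    """Extract the trailing number from chcp output such as 'Active code page: 850'."""
    match = re.search(r"(\d+)\D*$", output.strip())
    return int(match.group(1)) if match else None


def active_code_page() -> Optional[int]:
    """Ask cmd.exe for its active code page; returns None when not on Windows."""
    if sys.platform != "win32":
        return None
    result = subprocess.run(["cmd", "/c", "chcp"], capture_output=True, check=False)
    # Only the digits matter, so any non-ASCII bytes in the localized text are replaced.
    return parse_chcp_output(result.stdout.decode("ascii", errors="replace"))


if __name__ == "__main__":
    print(active_code_page())
```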
Eclipse Code Page
But which encoding does Eclipse use? Presumably it is the code page set by the Java environment. I find the setting under the menu Window > Preferences:
Eclipse Text File Encoding
For me it shows the default Windows code page 1252. It is possible to change the default encoding of Eclipse (for the workspace) using the drop-down box:
Changing Default Code Page
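Eclipse shows an encoding name (such as Cp1252), while chcp reports a number, so comparing the two takes a small translation step. The following Python sketch is one way to do that, under the assumption that both can be resolved through Python's codec registry; the function name is hypothetical.

```python
import codecs


def same_code_page(eclipse_encoding: str, console_code_page: int) -> bool:
    """Return True if an encoding name and a code page number refer to the same codec.

    Raises LookupError if Python does not know one of the two encodings.
    """
    eclipse_codec = codecs.lookup(eclipse_encoding).name
    console_codec = codecs.lookup(f"cp{console_code_page}").name
    return eclipse_codec == console_codec


if __name__ == "__main__":
    # Eclipse set to Cp1252 while the console reports 850: a mismatch.
    print(same_code_page("Cp1252", 850))
    print(same_code_page("Cp1252", 1252))
```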
For CodeWarrior, it is possible to use an Eclipse command line argument to define the code page. This is set in the ‘cwide.ini’ file inside the Eclipse installation folder. Trying to fix my problem, I added this line to it and restarted Eclipse:
I say ‘trying’ because it fixed the error message reported back by the compiler, but the build still failed:
Build still fails with Code Page 850
Well, there must be something more. So I decided to revert my change in the cwide.ini file, and asked around for thoughts and help. And yes, someone came to the rescue and explained what is happening (Sluvy: thank you, thank you, thank you!).
The thing is that GNU make and even the compiler/linker internally call their own programs and batch files, invisible to me. And it looks like these executables use a different code page, thus failing the build. But Sluvy has found a fix, which requires a Windows registry change.
The trick is to permanently set the code page used by the Windows Command Processor (DOS Shell, cmd.exe) using a small ‘autorun’ command. Whenever something is using the command processor, it will execute my command, which is to set the code page to the same one I’m using in Eclipse.
For this, I run regedit.exe and go to this setting:
Here I use the context menu to add a new Multi-String Value:
Note: in case that value already exists, I do not need to add it, of course.
Adding Registry Multi-String Value
I name the new value ‘Autorun’ and assign the command ‘chcp 1252’ to change the code page:
This is how it should look:
Autorun in registry
To make that ‘autorun’ command invisible, I can use ‘@chcp 1252>nul’, see this link.
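The registry step can also be scripted. The Python sketch below is an illustration under several assumptions, not the exact setting used above: it writes to the per-user key HKEY_CURRENT_USER\Software\Microsoft\Command Processor, which is one of the two AutoRun locations listed by "cmd /?", and it stores the command as a plain string value (REG_SZ), the type that help text mentions, rather than as a Multi-String value. The function names are hypothetical, and an existing AutoRun value would be overwritten.

```python
import sys

# Per-user AutoRun location as listed by "cmd /?" (assumed here; the
# machine-wide variant lives under HKEY_LOCAL_MACHINE instead).
AUTORUN_KEY = r"Software\Microsoft\Command Processor"
AUTORUN_VALUE = "AutoRun"


def build_autorun_command(code_page: int, quiet: bool = True) -> str:
    """Build the chcp command; the quiet form hides the echo and the output."""
    command = f"chcp {code_page}"
    return f"@{command}>nul" if quiet else command


def set_autorun(code_page: int) -> None:
    """Write the AutoRun value for the current user (overwrites an existing one)."""
    if sys.platform != "win32":
        raise OSError("The AutoRun registry value only exists on Windows")
    import winreg  # Windows-only standard library module

    with winreg.CreateKeyEx(
        winreg.HKEY_CURRENT_USER, AUTORUN_KEY, 0, winreg.KEY_SET_VALUE
    ) as key:
        winreg.SetValueEx(
            key, AUTORUN_VALUE, 0, winreg.REG_SZ, build_autorun_command(code_page)
        )


if __name__ == "__main__":
    print(build_autorun_command(1252))
```

Calling set_autorun(1252) on Windows would then make every new cmd.exe instance switch to code page 1252 on startup. Since this affects every program that starts the command processor, it is worth checking the key by hand in regedit.exe afterwards.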
Building with Foreign Characters
Now it is time to try it out.
After changing the code page, it is advised to rebuild all the make files and do a clean build (menu Project > Clean).
And indeed: my project now compiles properly in Eclipse/CodeWarrior:
Building with Eclipse and Foreign Characters in File Names
I learned a lot about Windows and code pages. And as always, the shortcomings of the past (7-bit ASCII code, etc.) echo into our world today and make things fail. Luckily there is a way to overcome this if necessary: if the code page of the Windows command processor (cmd.exe) does not match the one used by Eclipse, it can be changed with a registry setting.
But it reinforces my rule even more: “Do not use foreign characters in file names”. Simply stick with normal characters and letters. It will avoid a lot of trouble.
Happy Code Paging