Thesaurus import with German umlauts

Started by RobiWan, November 24, 2017, 09:13:19 PM


RobiWan

Hello,

Has anyone tried to import a thesaurus that contains German umlauts? The import itself works for me, but the umlauts all come out broken.


thrinn

Works for me. Just make sure that your input file is properly encoded (UTF-8).
Thorsten
Win 10 / 64, IMatch 2018, IMA

sinus

Quote from: thrinn on November 24, 2017, 09:24:42 PM
Works for me. Just make sure that your input file is properly encoded (UTF-8).

Yes, correct, it works for me too.
Best wishes from Switzerland! :-)
Markus

RobiWan

Quote from: thrinn on November 24, 2017, 09:24:42 PM
Just make sure that your input file is properly encoded (UTF-8).

And how can I do that? In LR I can only choose "Export Keywords", and LR creates a simple text file. I can open this file in editors like Notepad, and all the umlauts are displayed correctly.


Mario

Save it in Windows Notepad and choose the Encoding "UTF-8" at the bottom.
Usually Lr saves the file with UTF-8 encoding automatically, but maybe not in your case.
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

RobiWan

I can't believe it. I have tried it three times, on Windows 10 and OS X, each time with the latest Lightroom Classic CC version.

Mario

Lr does not write a UTF-8 BOM to mark the file as UTF-8 encoded, so IMatch assumes it is ANSI-encoded.
I will change that for the next release so that IMatch assumes UTF-8 instead.

Open the file in Notepad and save it as Unicode. This writes the file as Unicode with a BOM, and the import should then work.
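
If you would rather not re-save the file by hand each time, the same fix can be scripted. The following is only a minimal sketch in Python, not anything IMatch or Lr ships: the function names and the file names in the usage note are made up for illustration, and it assumes that the exported keyword file is valid UTF-8 and that the import accepts a UTF-8 BOM as the encoding marker (the workaround confirmed in this thread is Notepad's "Unicode" option).

```python
import codecs
from pathlib import Path


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def add_utf8_bom(src: Path, dst: Path) -> None:
    """Copy src to dst, prepending a UTF-8 BOM if it is missing.

    Hypothetical helper: assumes src is a UTF-8 encoded keyword export.
    """
    data = src.read_bytes()
    # Fail early (UnicodeDecodeError) if the file is not valid UTF-8,
    # instead of stamping a BOM onto an ANSI-encoded file.
    data.decode("utf-8")
    if not has_utf8_bom(data):
        data = codecs.BOM_UTF8 + data
    dst.write_bytes(data)
```

Usage would be something like add_utf8_bom(Path("keywords.txt"), Path("keywords_bom.txt")), followed by importing the second file. Running it on a file that already has a BOM leaves the content unchanged.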

RobiWan

Quote from: Mario on November 25, 2017, 01:54:15 PM
Open the file in Notepad and save it as Unicode. This writes the file as Unicode with a BOM, and the import should then work.

Yes this works. Thank you.

Quote from: Mario on November 25, 2017, 01:54:15 PM
Lr does not write a UTF-8 BOM to mark the file as UTF-8 encoded.

That is true, but I think a BOM is not necessary to tell applications that a file is UTF-8 encoded.
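
To illustrate the point about BOM-less files: an importer can usually tell UTF-8 from ANSI without a BOM by simply trying a strict UTF-8 decode first, because ANSI text with umlauts is almost never valid UTF-8. The sketch below is a generic heuristic in Python, not how IMatch actually works; the function name is hypothetical, and Windows-1252 is assumed as the "ANSI" code page.

```python
import codecs
from typing import Tuple


def decode_keyword_bytes(data: bytes) -> Tuple[str, str]:
    """Decode keyword-file bytes and report which encoding was used.

    Order of checks: BOM first, then strict UTF-8, then an ANSI fallback
    (Windows-1252 is an assumption, the real code page may differ).
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order and strips it.
        return data.decode("utf-16"), "utf-16"
    try:
        # Strict decoding raises on byte sequences that are not valid UTF-8.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return data.decode("cp1252", errors="replace"), "cp1252"
```

With this order of checks, a BOM-less UTF-8 export with umlauts is read correctly, and a real ANSI file still falls back to the legacy code page.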