Romanizations...
Posted: 05 Dec 2024, 15:44
by motorpsycho
I have this valuable script that someone on this forum kindly provided me years ago.
With it, I save the names of movies with a path like:
"DIRECTOR'S LAST NAME, Director's First Name/Italian title (year) [original title]"
It works pretty well, but I’d like to make some changes.
When there are names or titles in Asian languages (or any that use non-Latin characters), I’d like them to be romanized.
The template should be like this:
"DIRECTOR'S LAST NAME, Director's First Name/Italian title <if it exists> <otherwise international title> (Original title <romanized if non-Latin characters are used>) [year]"
I’m not sure if I’ve managed to explain myself or if this modification is even possible.
In any case, thanks for your attention!
Re: Romanizations...
Posted: 06 Dec 2024, 01:10
by rednoah
FileBot has built-in support for ICU script transliteration.
e.g. use the built-in ascii() function to take care of everything:
e.g. if you need Latin (as opposed to ASCII, which does not allow diacritics) then you can do this:
Groovy:
primaryTitle.transliterate('Any-Latin')
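To illustrate the Latin vs. ASCII distinction outside FileBot, here is a minimal plain-Groovy sketch. It does not use FileBot's ascii() or transliterate(); the helper name stripDiacritics is hypothetical, and treating "ASCII" as "Latin with the accents dropped" is an assumption for illustration, not FileBot's actual implementation.

```groovy
import java.text.Normalizer

// Hypothetical helper (not part of FileBot): decompose accented letters (NFD),
// then drop the combining marks so only the base letters remain.
String stripDiacritics(String s) {
    Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll(/\p{M}+/, '')
}

def latin = 'Ningen gōkaku'          // Latin script, with a macron
def ascii = stripDiacritics(latin)   // same letters, accents removed
println "${latin} -> ${ascii}"
```

This only folds accented Latin letters; it does not transliterate non-Latin scripts, which is what the 'Any-Latin' step is for.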
Re: Romanizations...
Posted: 07 Dec 2024, 15:10
by motorpsycho
Thanks for replying!
I'm trying to dig into it...
Something actually changes if I add ".transliterate('Any-Latin')" to "primaryTitle".
The transliteration works, but maybe it's not exactly what I'm looking for.
E.g., for the title "ニンゲン合格" I get the transliteration "ningen hé gé", but the correct one should be "Ningen gōkaku".
Something also goes wrong with the director's name: I get "Qīng, zé, hēi" (which makes no sense at all) instead of "KUROSAWA, Kiyoshi".
I'll keep studying this thing.
Moreover, I would like to change the output to:
"DIRECTOR'S LAST NAME, Director's First Name/Italian title <if it exists> <otherwise international title> (Original title <if different from Italian><romanized if non-Latin characters are used>) [year]"
Thanks again.
Re: Romanizations...
Posted: 07 Dec 2024, 15:49
by rednoah
合格 are Chinese characters, used in Mandarin Chinese, Korean, Japanese, etc.
hé gé is the Pinyin romanisation of these two characters in Mandarin Chinese.

Japanese is hard... even for computers... each Kanji can have somewhere between 2 and 20 readings, plus special readings for exceptions like 大人, so romanisation requires a complete dictionary at the very least. That might explain why ICU script transliteration doesn't seem to support Japanese Kanji at all.

Looks like you're out of luck if you need client-side romanisation for Japanese Kanji. You cannot compute it from the Kanji that you have, and TMDB generally doesn't have a dedicated "Romanised Japanese" field for each movie. Your best bet is the list of alternative titles, but your code might have to guess the correct entry (if any) from a list that may contain several alternative titles.
Console Output:
$ filebot -list --db TheMovieDB --q 48341 --format "{ ny } | { movie.alternativeTitles.JP }"
License to Live (1999) | [Ningen gôkaku]
Re: Romanizations...
Posted: 07 Dec 2024, 21:21
by motorpsycho
Yep, got it (or... at least I think so).
Maybe the best way is having different scripts.
The one you've shared seems very useful for some files, thanks a lot for your time and your help!
Re: Romanizations...
Posted: 08 Dec 2024, 03:23
by rednoah
motorpsycho wrote: ↑07 Dec 2024, 21:21
Maybe the best way is having different scripts.
You could certainly have a single script that decides (just as you would manually) what to do depending on the movie at hand, perhaps based on the country of origin, original language, and so on.
e.g. if the movie is from Japan, search all the Japanese alternative titles for one that consists of Latin characters only (i.e. the Romaji title) and then pick that one:
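A minimal sketch of that selection step in plain Groovy, assuming the Japanese alternative titles are already available as a list of strings (for example from { movie.alternativeTitles.JP } as in the console output above). The helper name pickRomaji and the fallback argument are hypothetical, and this is not a ready-made FileBot format expression.

```groovy
// Hypothetical helper: return the first title written only in Latin letters,
// combining marks, digits, punctuation and whitespace, or the fallback if none matches.
String pickRomaji(List<String> titles, String fallback) {
    titles.find { it ==~ /[\p{IsLatin}\p{M}\p{N}\p{P}\s]+/ } ?: fallback
}

// Sample data: the original title plus the alternative title from the console output above
def alternativeTitles = ['ニンゲン合格', 'Ningen gôkaku']

// Fall back to the international title when no Latin-only entry exists
println pickRomaji(alternativeTitles, 'License to Live')
```

With the sample list this prints the Latin-only entry, Ningen gôkaku. If the list holds several Latin-only entries, the first one wins, so the guess can still be wrong for some movies.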