Romanizations...

motorpsycho
Posts: 7
Joined: 23 Nov 2021, 14:38

Post by motorpsycho »

I have this valuable script that someone on this forum kindly provided me years ago.
With it, I save the names of movies with a path like:
"DIRECTOR'S LAST NAME, Director's First Name/Italian title (year) [original title]"

It works pretty well, but I’d like to make some changes.
When there are names or titles in Asian languages (or any that use non-Latin characters), I’d like them to be romanized.

The template should be like this:
"DIRECTOR'S LAST NAME, Director's First Name/Italian title <if it exists> <otherwise international title> (Original title <romanized if non-Latin characters are used>) [year]"

I’m not sure if I’ve managed to explain myself or if this modification is even possible.
In any case, thanks for your attention!

Format: Select all

{director.split(/\s/).reverse().join(', ').replaceFirst(/\w+/) { it.upper() }}/{ny} { allOf{ primaryTitle }{ localize.it.n }.joiningDistinct(', ', '[', ']'){ n.contains(it) ? null : it } }
rednoah
The Source
Posts: 24030
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Romanizations...

Post by rednoah »

FileBot has built-in support for ICU script transliteration.


e.g. use the built-in ascii() function to take care of everything:

Groovy: Select all

primaryTitle.ascii()

e.g. if you need Latin (as opposed to ASCII, which does not allow diacritics), then you can do this:

Groovy: Select all

primaryTitle.transliterate('Any-Latin')
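
To illustrate the difference between Latin output and ASCII output, here is a minimal plain-Groovy sketch that folds Latin text with diacritics down to ASCII using `java.text.Normalizer` from the JDK. It is not how FileBot implements `ascii()`, and the `stripDiacritics` helper is a hypothetical name used only for this example.

```groovy
import java.text.Normalizer

// Hypothetical helper (not a FileBot function): decompose each accented letter
// into base letter + combining mark (NFD), then drop the combining marks.
def stripDiacritics = { String text ->
    Normalizer.normalize(text, Normalizer.Form.NFD).replaceAll(/\p{M}/, '')
}

// Latin with diacritics on the left, plain ASCII on the right
assert stripDiacritics('Ningen gôkaku') == 'Ningen gokaku'
assert stripDiacritics('hé gé') == 'he ge'
```

This only handles text that is already in Latin script; converting other scripts to Latin is the separate step that `transliterate('Any-Latin')` performs.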
motorpsycho

Re: Romanizations...

Post by motorpsycho »

Thanks for replying!
I'm trying to dig it...
Something actually changes if I add ".transliterate('Any-Latin')" to "primaryTitle"
The transliteration works but maybe it's not exactly what I'm looking for.
E.g., for the title "ニンゲン合格" I get the transliteration "ningen hé gé", but the correct one should be "Ningen gōkaku".
Something also goes wrong with the director name: I get "Qīng, zé, hēi" (which makes no sense at all) instead of "KUROSAWA, Kiyoshi".

I'll keep studying this thing.
Moreover, I would like to change the output to:
"DIRECTOR'S LAST NAME, Director's First Name/Italian title <if it exists> <otherwise international title> (Original title <if different from italian><romanized if non-Latin characters are used>) [year]"

Thanks again.
rednoah

Re: Romanizations...

Post by rednoah »

:idea: 合格 are Chinese characters, used in Mandarin Chinese, Korean, Japanese, etc. hé gé is the Pinyin romanisation of these two characters in Mandarin Chinese.


:!: Japanese is hard... even for computers... each Kanji can have somewhere between 2 and 20 readings plus special readings for exceptions like 大人 so romanisation requires a complete dictionary at the very least. That might explain why ICU script transliteration doesn't seem to support Japanese Kanji at all.
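
A toy Groovy sketch of why character-by-character romanisation breaks down for Kanji. The reading tables below are tiny hand-written assumptions for illustration only, not a real dictionary, and `naive` / `withDictionary` are hypothetical helper names.

```groovy
// Hypothetical, heavily abbreviated reading tables (illustration only)
def charReadings = ['大': ['dai', 'tai', 'oo'], '人': ['jin', 'nin', 'hito']]
def wordReadings = ['大人': 'otona']   // special whole-word reading

// Naive approach: take the first listed reading of each character
def naive = { String word ->
    word.toList().collect { (charReadings[it] ?: [it])[0] }.join()
}

// Dictionary-first approach: prefer a whole-word entry when one exists
def withDictionary = { String word -> wordReadings[word] ?: naive(word) }

assert naive('大人') == 'daijin'          // plausible-looking, but wrong for this word
assert withDictionary('大人') == 'otona'  // only a word-level lookup gets it right
```

Each character on its own has several valid readings, so picking one without knowing the word is a guess.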


:arrow: Looks like you're out of luck if you need client-side Romanisation for Japanese Kanji. You cannot compute it from the Kanji that you have, and TMDB generally doesn't have a dedicated "Romanised Japanese" field of information for each movie. Your best bet is the list of alternative titles, but your code might have to guess the correct entry (if any) from a list that may contain more than one alternative title.

Console Output: Select all

$ filebot -list --db TheMovieDB --q 48341 --format "{ ny } | { movie.alternativeTitles.JP }"
License to Live (1999) | [Ningen gôkaku]
motorpsycho

Re: Romanizations...

Post by motorpsycho »

Yep, got it (or... at least I think so).
Maybe the best way is having different scripts.

The one you've shared seems really useful for some files. Thanks a lot for your time and your help!
rednoah

Re: Romanizations...

Post by rednoah »

motorpsycho wrote: 07 Dec 2024, 21:21 Maybe the best way is having different scripts.
You could certainly have a single script that decides (just as you would manually) on what to do depending on the movie at hand, perhaps based on the country of origin, original language, and so on.


e.g. if the movie is from Japan, search all the Japanese alternative titles for one that consists of Latin characters only (i.e. the Romaji title) and then pick that one:

Format: Select all

{ if (country =~ /JP/) ' [' + movie.alternativeTitles.JP.find{ it.isLatin() } + ']' }

Code: Select all

 [Ningen gôkaku]
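
For readers curious what a Latin-only check might involve, here is a plain-Groovy sketch of the same selection logic using `Character.UnicodeScript` from the JDK. This is an assumption about how such a check could work, not FileBot's actual `isLatin()` implementation; `isLatinOnly`, `romajiSuffix` and the sample title lists are hypothetical. It also shows one way to yield an empty string when no Latin-only title exists, since concatenating the result of a failed `find` would otherwise put the text `null` between the brackets.

```groovy
import java.lang.Character.UnicodeScript

// Hypothetical helper: true when every letter in the text is in the Latin script.
// Digits, whitespace and punctuation are not letters, so they are ignored.
def isLatinOnly = { String text ->
    text.codePoints().toArray().every { cp ->
        !Character.isLetter(cp) || UnicodeScript.of(cp) == UnicodeScript.LATIN
    }
}

// Hypothetical helper: format the first Latin-only title, or nothing if there is none.
def romajiSuffix = { List<String> titles ->
    def romaji = titles.find(isLatinOnly)
    romaji ? ' [' + romaji + ']' : ''
}

// Sample lists standing in for movie.alternativeTitles.JP
assert romajiSuffix(['ニンゲン合格', 'Ningen gôkaku']) == ' [Ningen gôkaku]'
assert romajiSuffix(['ニンゲン合格']) == ''
```

If several Latin-only alternative titles exist, this simply takes the first one, so the result may still need a manual check.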