Full roman numerals support
Posted: 09 May 2016, 23:59
by ZGab
Hello,
Here is a full Roman numeral regular expression for the FileBot fn:amc script.
{n.replaceAll(/\b(?i:M{1,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})|M{0,4}(?:CM|C?D|D?C{1,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})|M{0,4}(?:CM|CD|D?C{0,3})(?:XC|X?L|L?X{1,3})(?:IX|IV|V?I{0,3})|M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|I?V|V?I{1,3}))\b/, { it.upper() })}
I found the regular expression here:
http://stackoverflow.com/questions/2673 ... expression (comment from Corin - 5 votes)
I just adapted it for fn:amc scripting after testing it on
http://www.regexplanet.com/advanced/java/index.html
Note: in FileBot fn:amc, replaceAll works correctly only if every group is non-capturing, written either as (?:...) or with an inline modifier such as ignore-case (?i:...).
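To make the structure of the pattern above easier to follow, here is a minimal sketch in plain Java (FileBot format expressions are Groovy on top of java.util.regex, so the pattern syntax is the same). It rebuilds the same regex from its parts: each of the four alternatives forces one position (thousands, hundreds, tens or units) to be non-empty, which is why the pattern as a whole can never match an empty string. The class name, the helpers upperRoman and isRoman, and the sample titles are assumptions for illustration only, and toUpperCase stands in for FileBot's it.upper().

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RomanNumeralsV1 {
    // Digit groups that are allowed to match nothing.
    static final String H = "(?:CM|CD|D?C{0,3})"; // hundreds
    static final String T = "(?:XC|XL|L?X{0,3})"; // tens
    static final String U = "(?:IX|IV|V?I{0,3})"; // units
    // Variants of the same groups that must consume at least one character.
    static final String H1 = "(?:CM|C?D|D?C{1,3})";
    static final String T1 = "(?:XC|X?L|L?X{1,3})";
    static final String U1 = "(?:IX|I?V|V?I{1,3})";

    // One alternative per position that is forced to be non-empty.
    static final String CORE =
          "M{1,4}" + H  + T  + U
        + "|M{0,4}" + H1 + T  + U
        + "|M{0,4}" + H  + T1 + U
        + "|M{0,4}" + H  + T  + U1;

    // Same pattern as in the post: word boundaries on both sides, case-insensitive body.
    static final Pattern ROMAN = Pattern.compile("\\b(?i:" + CORE + ")\\b");

    // Hypothetical helper: true if the whole string is a numeral according to CORE.
    static boolean isRoman(String s) {
        return Pattern.compile("(?i:" + CORE + ")").matcher(s).matches();
    }

    // Hypothetical helper: uppercase every match, like replaceAll(...) { it.upper() }.
    static String upperRoman(String name) {
        Matcher m = ROMAN.matcher(name);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out, Matcher.quoteReplacement(m.group().toUpperCase(Locale.ROOT)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperRoman("rocky iv"));
        System.out.println(upperRoman("final fantasy xiii"));
        System.out.println(upperRoman("the matrix"));
    }
}
```

Splitting the pattern this way is only a readability choice; the concatenated string is character-for-character the same expression as the one-liner.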
Re: Full roman numerals support
Posted: 19 May 2016, 20:30
by ZGab
For anyone interested in this regex, here is a new version:
{n.replaceAll(/\b(?i:M{1,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})|M{0,4}(?:CM|C?D|D?C{1,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})|M{0,4}(?:CM|CD|D?C{0,3})(?:XC|X?L|L?X{1,3})(?:IX|IV|V?I{0,3})|M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|I?V|V?I{1,3}))(?:[^\w']|\Z)/, { it.upper() })}
Compared to the previous version, it uppercases a valid Roman numeral string only if its last character is not followed by an apostrophe (single quote).
This is useful for French and similar cases, to avoid unexpected uppercase:
"À l'aveugle" instead of "À L'aveugle"
"La vie d'Adèle" instead of "La vie D'Adèle"
etc...
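For anyone who wants to check the difference between the two endings outside FileBot, here is a small Java sketch under the same assumptions as the format above (java.util.regex syntax, toUpperCase standing in for it.upper()). The class and method names are hypothetical. It runs the first version (trailing \b) and the new version (trailing (?:[^\w']|\Z)) on the two French titles from this post.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RomanApostropheDemo {
    // The numeral body shared by both versions, copied from the post.
    static final String CORE =
          "M{1,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})"
        + "|M{0,4}(?:CM|C?D|D?C{1,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})"
        + "|M{0,4}(?:CM|CD|D?C{0,3})(?:XC|X?L|L?X{1,3})(?:IX|IV|V?I{0,3})"
        + "|M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|I?V|V?I{1,3})";

    // First version: plain word boundary after the numeral.
    static final Pattern V1 = Pattern.compile("\\b(?i:" + CORE + ")\\b");
    // New version: next character must be neither a word character nor an apostrophe, or end of input.
    static final Pattern V2 = Pattern.compile("\\b(?i:" + CORE + ")(?:[^\\w']|\\Z)");

    // Hypothetical helper: uppercase every match of the given pattern.
    static String upper(Pattern p, String s) {
        Matcher m = p.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out, Matcher.quoteReplacement(m.group().toUpperCase(Locale.ROOT)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String adele = "La vie d'Ad\u00e8le";   // "La vie d'Adèle"
        String aveugle = "\u00c0 l'aveugle";    // "À l'aveugle"
        System.out.println(upper(V1, adele) + " / " + upper(V2, adele));
        System.out.println(upper(V1, aveugle) + " / " + upper(V2, aveugle));
        System.out.println(upper(V2, "rocky iv"));
    }
}
```

Note that the new ending consumes the character after the numeral, so that character is passed through the uppercase step as well; for spaces and punctuation this changes nothing.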
Re: Full roman numerals support
Posted: 19 May 2016, 20:51
by kim
Just use {n} (99.99% of the time it's better than any format you use to replace it)

For the filename, use: {"La vie d'Adèle".upperInitial()} (only because it's easy to read)
btw:
(original title) La vie d'Adèle - Chapitres 1 et 2
France La vie d'Adèle
France (French title) La vie d'Adèle
http://www.imdb.com/title/tt2278871/rel ... tt_ql_dt_2
https://www.themoviedb.org/movie/152584 ... uage=fr-FR
Re: Full roman numerals support
Posted: 19 May 2016, 21:35
by ZGab
You're right, but to be able to use the main title with special characters, I'm using the following format:
...{norm =
    {
        it
            .transliterate(defines.mylang.lower() + ';')
            .replaceAll(/[`´‘’?""“”]/, "'")
            .replaceAll(/[:|]/, " - ")
            .replaceAll(/[?]/, "!")
            .replaceAll(/[*\s]+/, " ")
            .replaceAll(/\b(?i:M{1,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})|M{0,4}(?:CM|C?D|D?C{1,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})|M{0,4}(?:CM|CD|D?C{0,3})(?:XC|X?L|L?X{1,3})(?:IX|IV|V?I{0,3})|M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|I?V|V?I{1,3}))(?:[^\w']|\Z)/, { it.upper() })
            .replaceAll(/\b[0-9]+(?i:st|nd|rd|th)\b/, { it.lower() })
    };
norm(n)}...
I think it's useful in cases such as:
https://www.themoviedb.org/movie/813-ai ... uage=fr-FR
https://www.themoviedb.org/movie/61346- ... uage=fr-FR
Or with transliteration from a non-Roman alphabet: there the letter case may change, and I want to force the correct case for Roman numerals.
It may be unnecessary; you're probably right.
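As a rough illustration of what the punctuation steps in that format do, here is a hedged Java sketch of just those replaceAll calls on a plain string. It leaves out the FileBot-specific transliterate call and the Roman numeral step, and the first character class only lists the quote characters that are unambiguous in the post. The class name, method name and sample titles are made up for the example.

```java
class TitlePunctuation {
    // Hypothetical helper mirroring the punctuation steps of the format, in the same order.
    static String normalize(String title) {
        return title
            // backtick, acute accent and curly quotes become a plain apostrophe
            .replaceAll("[`\u00b4\u2018\u2019\u201c\u201d]", "'")
            // colon and pipe become a spaced dash
            .replaceAll("[:|]", " - ")
            // question mark becomes an exclamation mark
            .replaceAll("[?]", "!")
            // asterisks and runs of whitespace collapse to a single space
            .replaceAll("[*\\s]+", " ");
    }

    public static void main(String[] args) {
        System.out.println(normalize("Alien: Resurrection"));
        System.out.println(normalize("L\u2019aveugle"));
        System.out.println(normalize("Who*is   it?"));
    }
}
```

The whitespace collapse runs last on purpose: replacing ":" with " - " in front of an existing space leaves a double space, which the final step cleans up.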
Re: Full roman numerals support
Posted: 19 May 2016, 21:58
by ZGab
I'll stop using this Roman numeral regex because of Unicode characters:
.replaceAll(/\b(?i:M{1,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})|M{0,4}(?:CM|C?D|D?C{1,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})|M{0,4}(?:CM|CD|D?C{0,3})(?:XC|X?L|L?X{1,3})(?:IX|IV|V?I{0,3})|M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|I?V|V?I{1,3}))(?:[^\w']|\Z)/, { it.upper() })
On:
Le désastre
it gives:
Le DÉsastre
This happens because the accented character é that follows the matched D is not an ASCII word character, so it matches the final [^\w'] and is uppercased along with the D.
Note: the regex could work with the ending
(?:[^\p{L}']|\Z)
but in the end I found this too complicated for little benefit.
ref:
http://www.regular-expressions.info/unicode.html#prop
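The "Le désastre" case can be reproduced in plain Java, where \w covers only ASCII letters, digits and underscore by default. The sketch below is an illustration under that assumption, with hypothetical class and method names and toUpperCase standing in for it.upper(). It compares the [^\w'] ending with the \p{L} ending mentioned in the note.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RomanUnicodeDemo {
    // The numeral body, copied from the post.
    static final String CORE =
          "M{1,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})"
        + "|M{0,4}(?:CM|C?D|D?C{1,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})"
        + "|M{0,4}(?:CM|CD|D?C{0,3})(?:XC|X?L|L?X{1,3})(?:IX|IV|V?I{0,3})"
        + "|M{0,4}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|I?V|V?I{1,3})";

    // Ending from the format in use: an accented letter is not in \w, so it is accepted and consumed.
    static final Pattern ASCII_END = Pattern.compile("\\b(?i:" + CORE + ")(?:[^\\w']|\\Z)");
    // Ending from the note: any Unicode letter after the numeral blocks the match.
    static final Pattern LETTER_END = Pattern.compile("\\b(?i:" + CORE + ")(?:[^\\p{L}']|\\Z)");

    // Hypothetical helper: uppercase every match of the given pattern.
    static String upper(Pattern p, String s) {
        Matcher m = p.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out, Matcher.quoteReplacement(m.group().toUpperCase(Locale.ROOT)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String title = "Le d\u00e9sastre"; // "Le désastre" with a precomposed é
        System.out.println(upper(ASCII_END, title));
        System.out.println(upper(LETTER_END, title));
        System.out.println(upper(LETTER_END, "rocky iv"));
    }
}
```

One caveat with the \p{L} ending as written: it no longer excludes digits or underscores after the numeral, which [^\w'] did.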
Re: Full roman numerals support
Posted: 19 May 2016, 22:01
by kim
{n.ascii()}
Re: Full roman numerals support
Posted: 26 May 2016, 09:40
by rednoah
This looks like fun. Please post your test data filenames so we can play with that as well.
I'd start with this:
{n.replaceAll(/(?i)\b[XVI]+\b/){it.upper()}}
Maybe not super-complete, but probably good enough for 99.99% of use cases and a bit easier to read.
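For comparison, here is the same short pattern as a plain Java sketch (hypothetical class and method names, toUpperCase standing in for it.upper()). Because the character class has no D, L, C or M, the French elisions discussed earlier are left alone, while numerals that use those letters are not touched either.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimpleRomanDemo {
    // Short pattern from the post: any standalone run of X, V and I, case-insensitive.
    static final Pattern SIMPLE = Pattern.compile("(?i)\\b[XVI]+\\b");

    // Hypothetical helper: uppercase every match.
    static String upperSimple(String s) {
        Matcher m = SIMPLE.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out, Matcher.quoteReplacement(m.group().toUpperCase(Locale.ROOT)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperSimple("rocky iv"));
        System.out.println(upperSimple("final fantasy xiii"));
        System.out.println(upperSimple("La vie d'Ad\u00e8le"));
        System.out.println(upperSimple("mcmxcix"));
    }
}
```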

Re: Full roman numerals support
Posted: 03 Jun 2016, 23:27
by ZGab
Sure, you're right, I did the same before. But I'd like to avoid the 0.01% of exceptions.

In the end, I consider {n} to be enough, and TheMovieDB always returns the correct name.
I'm a fan of regex, if needed.
