Subtitles, check for 2 letter country code and then download

Posted: 02 Apr 2014, 13:50
by exaunique
Hi there,

I'd like to see some way to check for subtitles that already exist for a certain language.
Since all programs that download subtitles use the 2 letter country codes in their naming convention, this seems to be a problem for filebot when using -get-missing-subtitles.

When using, for example:
filebot -get-missing-subtitles -r "Y:\Castle (2009)\Season 04" --lang nl -non-strict

This results in 2 subtitles for episodes that already had one:
Castle - S04E01 - Rise.nl.srt
Castle - S04E01 - Rise.nld.srt

Is there an option or a way to fix this?

Re: Subtitles, check for 2 letter country code and then down

Posted: 02 Apr 2014, 14:42
by rednoah
Just run some regex renamer first to fix the 2-letter codes to 3-letter codes.

You could use this script here:
http://www.filebot.net/forums/viewtopic ... &t=5#p2100

And request that the other tools you may be using also switch to 3-letter codes.
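
For illustration only, a regex rename along those lines could look like the Python sketch below. This is not the linked script. The TWO_TO_THREE mapping, the list of subtitle extensions, and the function names are assumptions made up for this example, and only a handful of languages are covered.

```python
import re
from pathlib import Path

# Assumed mapping from 2-letter (ISO 639-1) to 3-letter (ISO 639-3) codes.
# Only a few languages are listed here; extend as needed.
TWO_TO_THREE = {"nl": "nld", "en": "eng", "de": "deu", "fr": "fra"}

# Matches "<base>.<2 letters>.<subtitle extension>"; the extension list is an assumption.
SUB_PATTERN = re.compile(
    r"^(?P<base>.+)\.(?P<lang>[A-Za-z]{2})\.(?P<ext>srt|sub|ass|ssa)$",
    re.IGNORECASE,
)


def normalized_name(filename: str) -> str:
    """Return the filename with a known 2-letter code replaced by its 3-letter code."""
    match = SUB_PATTERN.match(filename)
    if match is None:
        return filename
    code = TWO_TO_THREE.get(match.group("lang").lower())
    if code is None:
        # Unknown 2-letter tag (could be something like .HI): leave it alone.
        return filename
    return f"{match.group('base')}.{code}.{match.group('ext')}"


def rename_subtitles(folder: Path, dry_run: bool = True) -> list:
    """Collect (old, new) name pairs under folder; rename only when dry_run is False."""
    changes = []
    for path in sorted(folder.rglob("*")):
        if not path.is_file():
            continue
        new_name = normalized_name(path.name)
        if new_name == path.name:
            continue
        changes.append((path.name, new_name))
        target = path.with_name(new_name)
        # Never overwrite an existing file, e.g. when both .nl.srt and .nld.srt exist.
        if not dry_run and not target.exists():
            path.rename(target)
    return changes
```

Running rename_subtitles(Path(...)) with the default dry_run=True only lists what would change, which is a safer first step on a real library.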

Re: Subtitles, check for 2 letter country code and then down

Posted: 02 Apr 2014, 18:29
by exaunique
I disagree. A workaround like that could work, but it is in principle a bad solution...
Keep in mind that much bigger players and the vast majority (XBMC, Sick Beard, CouchPotato, etc.) are using 2 letter codes as a standard.

Re: Subtitles, check for 2 letter country code and then down

Posted: 02 Apr 2014, 19:00
by rednoah
Bigger players, yes, but there wouldn't be a need for FileBot if they were smarter. :P FileBot can read both, but outputs the better one. And of course you can pick between ISO2/3 in the format when doing -rename --format, but as for downloading, it's gonna use the most precise and unambiguous standard.

FYI, OpenSubtitles and Sublight, the big players that actually matter, all use 3-letter codes. ;)

Re: Subtitles, check for 2 letter country code and then down

Posted: 03 Apr 2014, 15:00
by exaunique
Can you show me a link on OpenSubtitles where they officially state a naming convention?
Because I can't find it anywhere... it looks like you made that one up :)

The only direction OpenSubtitles gives is an alternative to FileBot (http://www.opensubtitles.org/upload),
which also makes use of 2 letters...

Anyway, that's beside the point: it should be checking for all "ISO 639" language codes or the more proper "IETF language tag".
I know for sure it can be fixed with minimal effort, and I can help if needed.

Re: Subtitles, check for 2 letter country code and then down

Posted: 03 Apr 2014, 15:30
by rednoah
Read the API docs:
http://trac.opensubtitles.org/projects/ ... hSubtitles

Search, upload, language detection, etc.: all methods work with SubLanguageID, which is the 3-letter code.

That nicely avoids issues like .HI, which can mean either Hearing-Impaired or the Hindi language. There are a lot more cases where 2-letter codes cause issues, or where certain languages are not even defined.


As for your example that uses 2-letter language codes... as far as I can tell, it proves that OpenSubtitles uses 3-letter codes for all uploads made through the website.

Open & View source:
http://www.opensubtitles.org/upload

I see 3 letter language codes:

<option value="alb">Albanian</option>

Re: Subtitles, check for 2 letter country code and then down

Posted: 03 Apr 2014, 22:28
by exaunique
Cheers, but you are sidetracking the issue, since we are now talking about "sub"language IDs, which is something totally different.
Also, it shows that your program is using country codes instead of the ISO 639 language codes used on, for example, OpenSubtitles...

Take the example from my first post here, which shows a subtitle named by FileBot:
Castle - S04E01 - Rise.nld.srt <-- notice the nld?

If you were using the ISO naming convention, it should have been named:
Castle - S04E01 - Rise.dut.srt <-- it should be dut....

Again, though, that's beside the point: it should be checking for all "ISO 639" language codes or the more proper "IETF language tag",
or at least the 2 letter and the 3 letter language codes.
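
As a rough sketch of that check (not FileBot's actual code), one could treat every known code for a language as equivalent when looking for an existing subtitle. LANGUAGE_ALIASES and find_existing_subtitle are hypothetical names invented for this example, the alias table covers only two languages, and only .srt files are considered.

```python
from typing import Iterable, Optional

# Assumed alias sets: ISO 639-1, ISO 639-2/T (same as 639-3 here) and ISO 639-2/B.
LANGUAGE_ALIASES = {
    "nld": {"nl", "nld", "dut"},
    "eng": {"en", "eng"},
}


def find_existing_subtitle(
    video_name: str, filenames: Iterable[str], lang: str
) -> Optional[str]:
    """Return the first subtitle in filenames that matches the video under any alias of lang."""
    stem = video_name.rsplit(".", 1)[0].lower()
    # Fall back to the requested code alone when the language is not in the table.
    codes = LANGUAGE_ALIASES.get(lang, {lang})
    wanted = {f"{stem}.{code}.srt" for code in codes}
    for name in filenames:
        if name.lower() in wanted:
            return name
    return None
```

With a check like this, a downloader would skip an episode that already has Rise.nl.srt instead of adding Rise.nld.srt next to it.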

Re: Subtitles, check for 2 letter country code and then down

Posted: 04 Apr 2014, 06:47
by rednoah
SubLanguageID IS ISO 639-3 (as far as I know, and you obviously didn't bother to read any of the links I sent you)

Note that there are also ISO 639-2/T and ISO 639-2/B 3-letter codes, which seem to be confusing you.

@see
http://en.wikipedia.org/wiki/List_of_IS ... _639_table
http://en.wikipedia.org/wiki/ISO_639-2#B_and_T_codes
http://en.wikipedia.org/wiki/ISO_639-3# ... _ISO_639-3
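
To make the B/T distinction concrete, here is a small Python lookup sketch. The rows are taken from the ISO 639 tables linked above; the names CODES, TO_T and to_terminology_code are made up for this example, and the table is nowhere near complete.

```python
# Each row: (ISO 639-1, ISO 639-2/B "bibliographic", ISO 639-2/T "terminology").
# For these languages the T code is also the ISO 639-3 code.
CODES = [
    ("nl", "dut", "nld"),  # Dutch
    ("de", "ger", "deu"),  # German
    ("fr", "fre", "fra"),  # French
    ("sq", "alb", "sqi"),  # Albanian
    ("en", "eng", "eng"),  # English: B and T codes are identical
]

# Map every alias (2-letter, B and T) to the T code.
TO_T = {alias: t for one, b, t in CODES for alias in (one, b, t)}


def to_terminology_code(code: str):
    """Return the ISO 639-2/T code for any known alias, or None if the code is unknown."""
    return TO_T.get(code.lower())
```

So both "nl" and "dut" resolve to "nld", which is the form used in the Castle example earlier in the thread.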

If things are badly named, well it's not like it's not working? Wanna normalize your structure? Use FileBot -rename, the {lang} binding will detect any language code and always force ISO 639-3 for your convenience.


PS: So I'll refine my statement from before. Any software that uses codes for language (as opposed to locale) should use ISO 639-3 codes.

PS2: Original issue fixed with r2120