External subtitle languages?
Hi
It seems that textLanguages only includes subtitle streams embedded in the video (probably provided by MediaInfo). I haven't checked yet, but the same might be true for audioLanguages and external audio tracks.
Has anyone written a script to collect languages from external subtitle (*.idx, *.ass) and audio files that they would be willing to share?
Is it possible to detect which additional subtitles FileBot was able to download?
Cheers!
Re: External subtitle languages?
If you have an srt file with the same name as the video file and in the same folder, then you can use e.g.
Code: Select all
{fn}{subt}
to rename the srt file with the language, e.g. "movie.eng.srt"
Re: External subtitle languages?
What I have is a collection of files, eg:
Code: Select all
movie (1234).mp4 <-- includes embedded English audio and German subtitle streams
movie (1234).idx <-- includes Italian and Spanish subs
movie (1234).sub
movie (1234).eng.srt <-- automatically downloaded by FileBot
movie (1234).fr.mp3 <-- French audio stream
I have all the groovy code working to pick up the embedded audio ('EN') and subtitle streams ('de') and move all files to a folder "movie (1234) [EN de]".
However, I also want to add tags for the external audio track ('FR'), the titles in the idx/sub ('it' and 'sp') and the newly downloaded srt ('en') to come up with a folder named "movie (1234) [EN FR de en it sp]".
As these files have already been picked up and analyzed by FileBot, I wonder if it's possible to access this information - without re-inventing the wheel - and use it to format the folder name.
Re: External subtitle languages?
The {lang} binding should work for companion files:
Code: Select all
$ ls
Avatar.2009.de.mp3
Avatar.2009.fr.mp3
Avatar.2009.mp4
Code: Select all
$ filebot -rename * --db TheMovieDB -non-strict --format "{ny}{'.'+lang.name}" --action TEST
[TEST] From [Avatar.2009.mp4] to [Avatar (2009).mp4]
[TEST] From [Avatar.2009.de.mp3] to [Avatar (2009).German.mp3]
[TEST] From [Avatar.2009.fr.mp3] to [Avatar (2009).French.mp3]
Since {lang} works for individual items, it can also be accessed via {model} for all items from each individual item:
Code: Select all
$ filebot -rename * --db TheMovieDB -non-strict --format "{ny}{'.'+model.lang.name}" --action TEST
[TEST] from [Avatar.2009.mp4] to [Avatar (2009).[German, French].mp4]
[TEST] from [Avatar.2009.de.mp3] to [Avatar (2009).[German, French].mp3]
[TEST] from [Avatar.2009.fr.mp3] to [Avatar (2009).[German, French].mp3]
{model} is probably one of the most advanced bindings, so you won't find many examples. It's about taking into account all matches when formatting an individual match. It might be very useful for what you're trying to do though.
For SUB/IDX you'll have to read the IDX file (text file) and check the language there. The {lang} binding doesn't do that.
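For the SUB/IDX case mentioned in this reply, here is a minimal Groovy sketch of pulling the language codes out of the IDX text. It assumes the usual VobSub layout, where each subtitle stream is declared on a line like "id: en, index: 0". The helper name idxLanguages is made up for illustration and is not a FileBot function.

```groovy
// Hypothetical helper: collect the language codes declared in a VobSub .idx file.
// Assumption: every subtitle stream has a line of the form "id: <code>, index: <n>".
def idxLanguages(String idxText) {
    def codes = []
    idxText.eachLine { line ->
        def m = line =~ /^id:\s*([A-Za-z]{2,3})\b/
        if (m.find()) {
            codes << m.group(1).toLowerCase()
        }
    }
    // drop repeated declarations of the same language
    return codes.unique()
}

def sample = '''# VobSub index file, v7 (do not modify this line!)
id: it, index: 0
timestamp: 00:00:01:000, filepos: 000000000
id: es, index: 1
timestamp: 00:00:01:000, filepos: 000001000
'''
println idxLanguages(sample)
```

With a real file you would pass in its contents, for example idxLanguages(new File(pathToIdx).text), where pathToIdx is whatever path your script already has.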

Re: External subtitle languages?
Thanks, {model} seems to be what I was looking for!
Re: External subtitle languages?
I still seem to have issues with FileBot picking up audio tracks (and pure audio files ending in .ac3 or .dts seem to be recognized as movies). Apart from that, here is the language recognition code so far, if anyone has similar needs:
Code: Select all
// LANGUAGE PROCESSING ------------------------------------------------------------------------------------------------------
// production language from the movie DB, not necessarily related to any spoken language in the movie!
def dbOriginalLanguage = call{info.OriginalLanguage};
// spoken languages from the movie DB, sorted by language name
def dbSpokenLanguages = call{languages}?:[];
// try to guess the primary language of the movie
def primaryLanguage = (1 == dbSpokenLanguages.size()) ? dbSpokenLanguages[0] : Language.findLanguage(dbOriginalLanguage);
// helper classes to enumerate embedded and external streams
enum StreamType{ AUDIO, TEXT, HARD }
class Stream {
    StreamType type
    Language lang
    Stream(StreamType t, Language l) { type=t; lang=l; }
    Stream(StreamType t, Locale l) { type=t; lang=Language.getLanguage(l); }
    Stream(StreamType t, String l) { type=t; lang=Language.findLanguage(l); }
    // note: two streams compare equal if they share a language, regardless of type
    boolean equals(Object obj) { return lang.equals( (obj as Stream).lang); }
    String toString() {
        switch (type) {
            case StreamType.AUDIO: return lang.ISO2.upper();
            case StreamType.TEXT: return lang.ISO2;
            case StreamType.HARD: return lang.ISO2+"-hard";
            default: break;
        }
        return null;
    }
}
// Embedded audio streams, provided by mediainfo
// TODO: access mediainfo data and find the first 'default' audio track, if available?
def embAudioStreams = (call{audioLanguages}?:[]) .collect{ new Stream( StreamType.AUDIO, it ) };
// External audio streams (the alternation is grouped so that ^ and $ apply to every extension)
def extAudioStreams = model.findAll{it.ext=~/(?i)^(mp3|m4a|dts|ac3|wav|ogg)$/} .collect{
    new Stream( StreamType.AUDIO, it.lang?:"und" )
};
// Embedded text/subtitle streams, provided by mediainfo
def embTextStreams = (call{textLanguages}?:[]) .collect{ new Stream( StreamType.TEXT, it ) };
// External text/subtitle streams
// TODO: which ones are recognized by FileBot ???
def extTextStreams = model.findAll{it.ext=~/(?i)^(ass|psb|srt|ssa|ssf|sub)$/} .collect{
    new Stream( StreamType.TEXT, it.lang?:"und" )
};
// External VobSub titles (.idx and .sub) supporting multiple languages
// read the IDX text and keep only the language code from each "id: xx" line
def extVobSubStreams = model.findAll{it.ext=='idx'} .collectMany{
    it.file.text.findAll(/(?m)^id: ([a-zA-Z]+)/) { match, code -> code }
} .collect{
    new Stream( StreamType.TEXT, it )
};
// try to match patterns indicating hard subs
// (myGrepContext and RE_HARDSUBS are defined elsewhere in the script, not shown)
def embHardSubs = myGrepContext.findAll(RE_HARDSUBS) .collect{ new Stream( StreamType.HARD, it ) };
// if no other information available, assume the default audio stream is in the primary language
if( ! embAudioStreams && primaryLanguage ) embAudioStreams += [ new Stream( StreamType.AUDIO, primaryLanguage) ];
def myStreams = embAudioStreams+extAudioStreams+embHardSubs+embTextStreams+extVobSubStreams+extTextStreams;
Just do myStreams.unique() before tagging your file.
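The de-duplicate-then-tag step can be tried outside FileBot with a simplified stand-in. This is only a sketch under stated assumptions: Kind, Tag and folderTag are hypothetical names, and plain ISO code strings take the place of FileBot's Language objects. It de-duplicates on the printed label rather than on the language alone, so an audio "EN" and a subtitle "en" both survive.

```groovy
// Simplified stand-ins for StreamType/Stream (hypothetical names, no FileBot classes needed)
enum Kind { AUDIO, TEXT }

class Tag {
    Kind kind
    String code   // ISO 639-1 code, e.g. 'en'
    // audio tracks print in upper case, subtitle tracks in lower case
    String toString() { kind == Kind.AUDIO ? code.toUpperCase() : code.toLowerCase() }
}

// Build the bracketed folder suffix from unique labels.
// Plain string sorting puts upper case (audio) before lower case (subtitles).
def folderTag(List tags) {
    def labels = tags.collect { it.toString() }.unique().sort()
    return '[' + labels.join(' ') + ']'
}

def tags = [
    new Tag(kind: Kind.AUDIO, code: 'en'),   // embedded audio
    new Tag(kind: Kind.AUDIO, code: 'fr'),   // external mp3
    new Tag(kind: Kind.TEXT,  code: 'de'),   // embedded subtitle
    new Tag(kind: Kind.TEXT,  code: 'it'),   // from idx
    new Tag(kind: Kind.TEXT,  code: 'es'),   // from idx
    new Tag(kind: Kind.TEXT,  code: 'en'),   // downloaded srt
    new Tag(kind: Kind.TEXT,  code: 'de'),   // duplicate, dropped by unique()
]
println folderTag(tags)   // prints [EN FR de en es it]
```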
Re: External subtitle languages?
Yes, secondary files (i.e. any non-video files that match a video file by name) will automatically get matched to the same match as the primary file. Unfortunately for your use case, that also means that all MediaInfo bindings will get redirected and retrieve values from the primary video file.
For example, all three of these files end up in the same match:
Code: Select all
Movie (2000).mkv
Movie (2000).ac3
Movie (2000).dts
Re: External subtitle languages?
In general, that's what I want, as long as the mkv/mp4/avi/divx etc is the 'primary file', and external ac3/dts/mp3 tracks are always secondaries.
Are there any hooks to modify the matches before moving on to actions? Is it possible to cache some data between invocations of my script, for example in the 'model' object?
I also noticed that matching seems to happen between sibling folders, i.e. I have two movies in different versions (different codecs/resolutions/etc). This is often due to one being e.g. the German release and the other the original, differing in source material, run-time, etc:
Code: Select all
Movies/(R)/Redacted (2000) [720p DTS 5.1 EN de]/Redacted (2000).nfo
Movies/(R)/Redacted (2000) [720p DTS 5.1 EN de]/Redacted (2000).mkv
Movies/(R)/Redacted (2000) [720p DTS 5.1 EN de]/Redacted (2000).de.srt
Movies/(R)/Redacted (2000) [704x384 DE]/Redacted (2000).nfo
Movies/(R)/Redacted (2000) [704x384 DE]/Redacted (2000).CD1.avi
Movies/(R)/Redacted (2000) [704x384 DE]/Redacted (2000).CD2.avi
As a workaround, I can process these separately I guess. Haven't fully investigated this yet due to the xattr issues.
Re: External subtitle languages?
1.
model can be particularly inefficient, because you'll access all files for each individual file each time, so it doesn't scale well. FileBot should be caching some stuff internally. There's a few crazy people that use internal APIs in custom formats. That's for another thread though.
2.
The --filter option should give you some control as to what gets matched. However, you don't get any hooks for checking the file/movie matches before processing. If you have the same movie file multiple times, then it will get matched to the same movie object multiple times. Your format will need to account for that if that's something that can happen in your case.
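One way a format could account for the same movie showing up in several folders is to look only at the matches that sit next to the current file. The sketch below shows the idea with plain java.io.File objects. siblingsOf is a hypothetical helper, and how you would feed it from {model} depends on how your own format is organised.

```groovy
// Hypothetical helper: keep only the files that share a parent folder with the current file
def siblingsOf(File current, List<File> all) {
    all.findAll { it.parentFile == current.parentFile }
}

def files = [
    new File('Movies/(R)/Redacted (2000) [720p DTS 5.1 EN de]/Redacted (2000).mkv'),
    new File('Movies/(R)/Redacted (2000) [720p DTS 5.1 EN de]/Redacted (2000).de.srt'),
    new File('Movies/(R)/Redacted (2000) [704x384 DE]/Redacted (2000).CD1.avi'),
    new File('Movies/(R)/Redacted (2000) [704x384 DE]/Redacted (2000).CD2.avi'),
]

// only the mkv and its srt are considered for the first release
println siblingsOf(files[0], files)*.name
```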