					
				External subtitle languages?
				Posted: 09 Nov 2017, 13:08
				by thielj
				Hi
It seems that textLanguages only includes subtitle streams embedded in the video (probably provided by mediainfo). I haven't checked yet, but the same might be true for audioLanguages and external audio tracks.
Has anyone written a script to collect languages from external subtitle (*.idx, *.ass) and audio files that they would be willing to share?
Is it possible to detect which additional subtitles FileBot was able to download?
Cheers!
			 
			
					
				Re: External subtitle languages?
				Posted: 09 Nov 2017, 15:26
				by kim
				If you have a srt file with same name as video file and in same folder, then you can use e.g.
 to rename srt file with the lang e.g. "movie.eng.mkv"
 
			
					
				Re: External subtitle languages?
				Posted: 09 Nov 2017, 17:37
				by thielj
				What I have is a collection of files, eg:
Code:
movie (1234).mp4             <-- includes embedded English audio and German subtitle streams
movie (1234).idx             <-- includes Italian and Spanish subs
movie (1234).sub
movie (1234).eng.srt         <-- automatically downloaded by FileBot
movie (1234).fr.mp3          <-- French audio stream
I have all the groovy code working to pick up the embedded audio ('EN') and subtitle streams ('de') and move all files to a folder "movie (1234) [EN de]".
However, I also want to add tags for the external audio track ('FR'), the titles in the idx/sub ('it' and 'sp') and the newly downloaded srt ('en') to come up with a folder named "movie (1234) [EN FR de en it sp]".
As these files have already been picked up and analyzed by FileBot, I wonder if it's possible to access this information - without re-inventing the wheel - and use it to format the folder name.
 
			
					
				Re: External subtitle languages?
				Posted: 09 Nov 2017, 18:28
				by rednoah
				The {lang} binding should work for companion files:
Code:
$ ls
Avatar.2009.de.mp3
Avatar.2009.fr.mp3
Avatar.2009.mp4
Code:
$ filebot -rename * --db TheMovieDB -non-strict --format "{ny}{'.'+lang.name}" --action TEST
[TEST] From [Avatar.2009.mp4] to [Avatar (2009).mp4]
[TEST] From [Avatar.2009.de.mp3] to [Avatar (2009).German.mp3]
[TEST] From [Avatar.2009.fr.mp3] to [Avatar (2009).French.mp3]
Since {lang} works for individual items, it can also be accessed via {model}, which makes all items available while formatting each individual item:
Code:
$ filebot -rename * --db TheMovieDB -non-strict --format "{ny}{'.'+model.lang.name}" --action TEST
[TEST] from [Avatar.2009.mp4] to [Avatar (2009).[German, French].mp4]
[TEST] from [Avatar.2009.de.mp3] to [Avatar (2009).[German, French].mp3]
[TEST] from [Avatar.2009.fr.mp3] to [Avatar (2009).[German, French].mp3]
{model} is probably one of the most advanced bindings, so you won't find many examples. It's about taking into account all matches when formatting an individual match. It might be very useful for what you're trying to do though.
 

For SUB/IDX you'll have to read the IDX file (text file) and check the language there. The {lang} binding doesn't do that.
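Reading the IDX file could look roughly like the sketch below. It assumes that each subtitle track in the IDX text is introduced by a line of the form "id: en, index: 0", and that tracks without a usable code (e.g. "id: --") should be skipped. The helper name idxLanguages is made up for this example and is not a FileBot function.

```groovy
// Hypothetical helper (not part of FileBot): collect the language codes
// declared in the text of a VobSub .idx file.
// Assumption: tracks are declared on lines such as "id: en, index: 0".
List<String> idxLanguages(String idxText) {
    def codes = []
    idxText.eachLine { line ->
        // capture a 2 or 3 letter code directly after "id:"
        def m = (line =~ /^id:\s*([A-Za-z]{2,3})\b/)
        if (m.find()) codes << m.group(1).toLowerCase()
    }
    return codes.unique()   // drop repeated codes, keep first-seen order
}

def sample = '''# VobSub index file, v7 (do not modify this line!)
id: it, index: 0
timestamp: 00:00:01:000, filepos: 000000000
id: es, index: 1
'''
println idxLanguages(sample)   // prints [it, es]
```

For a real file the text could be read with Groovy's File.text (e.g. new File(path).text) and passed to the helper.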
 
			
					
				Re: External subtitle languages?
				Posted: 09 Nov 2017, 19:45
				by thielj
				Thanks, {model} seems to be what I was looking for!
			 
			
					
				Re: External subtitle languages?
				Posted: 11 Nov 2017, 14:50
				by thielj
				I still seem to have issues with FileBot picking up audio tracks (and pure audio files ending in .ac3 or .dts seem to be recognized as movies). Apart from that, here is the language recognition code so far, if anyone has similar needs:
Code:
  // LANGUAGE PROCESSING ------------------------------------------------------------------------------------------------------
  // production language from the movie DB, not necessarily related to any spoken language in the movie!
  def dbOriginalLanguage = call{info.OriginalLanguage};
  // spoken languages from the movie DB, sorted by language name
  def dbSpokenLanguages  = call{languages}?:[];
  // try to guess the primary language of the movie
  def primaryLanguage    = (1 == dbSpokenLanguages.size()) ? dbSpokenLanguages[0] : Language.findLanguage(dbOriginalLanguage);
  // helper classes to enumerate embedded and external streams
  enum StreamType{ AUDIO, TEXT, HARD }
  class Stream {
      StreamType  type
      Language    lang
      Stream(StreamType t, Language l) { type=t; lang=l; }
      Stream(StreamType t, Locale l)   { type=t; lang=Language.getLanguage(l); }
      Stream(StreamType t, String l)   { type=t; lang=Language.findLanguage(l); }
      boolean equals(Object obj)       { return lang.equals( (obj as Stream).lang); }
      String toString() {
         switch (type) {
           case StreamType.AUDIO: return lang.ISO2.upper();
           case StreamType.TEXT:  return lang.ISO2;
           case StreamType.HARD:  return lang.ISO2+"-hard";
           default: break;
         }
         return null;
      }
  }
  // Embedded audio streams, provided by mediainfo
  // TODO: access mediainfo data and find the first 'default' audio track, if available?
  def embAudioStreams    = (call{audioLanguages}?:[]) .collect{ new Stream( StreamType.AUDIO, it ) };
  // External audio streams
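  // ==~ matches the whole extension; in ^mp3|...|ogg$ the anchors bind only to the first and last alternatives
  def extAudioStreams    = model.findAll{it.ext ==~ /(?i)mp3|m4a|dts|ac3|wav|ogg/} .collect{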
                             new Stream( StreamType.AUDIO, it.lang?:"und" )
                           };
  // Embedded text/subtitle streams, provided by mediainfo
  def embTextStreams     = (call{textLanguages }?:[]) .collect{ new Stream( StreamType.TEXT, it ) };
  // External text/subtitle streams
  // TODO: which ones are recognized by FileBot ???
  // ==~ requires the whole extension to match one of the alternatives
  def extTextStreams     = model.findAll{it.ext ==~ /(?i)ass|psb|srt|ssa|ssf|sub/} .collect{
                             new Stream( StreamType.TEXT, it.lang?:"und" )
                           };
  // External VobSub titles (.idx and .sub) supporting multiple languages
  def extVobSubStreams   = model.findAll{it.ext=='idx'} .collectMany{
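                             // read the IDX text; (?m) lets ^ match at every line start, the closure keeps only the captured code
                             it.file.text.findAll(/(?m)^id: ([a-zA-Z]+)/){ match, code -> code }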
                           } .collect{
                             new Stream( StreamType.TEXT, it )
                           };
  // try to match patterns indicating hard subs
  def embHardSubs        = myGrepContext.findAll(RE_HARDSUBS) .collect{ new Stream( StreamType.HARD, it ) };
  // if no other information available, assume the default audio stream is in the  primary language
  if( ! embAudioStreams && primaryLanguage ) embAudioStreams += [ new Stream( StreamType.AUDIO, primaryLanguage) ];
  def myStreams          = embAudioStreams+extAudioStreams+embHardSubs+embTextStreams+extVobSubStreams+extTextStreams;
Just do myStreams.unique() before tagging your file.
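One thing to watch: the equals() in the Stream class above compares only the language, so unique() would treat an English audio stream and an English subtitle stream as the same entry. If both tags should survive, as in "[EN FR de en it sp]", equality probably needs to include the stream type as well. A minimal standalone sketch of that idea follows; Kind and Tag are hypothetical stand-ins for StreamType and Stream, and the language is a plain string here instead of FileBot's Language class.

```groovy
import groovy.transform.EqualsAndHashCode

enum Kind { AUDIO, TEXT }

// Hypothetical stand-in for the Stream class; equals/hashCode are generated
// from both properties, so audio 'en' and text 'en' count as different tags.
@EqualsAndHashCode
class Tag {
    Kind kind
    String lang
    String toString() { kind == Kind.AUDIO ? lang.toUpperCase() : lang.toLowerCase() }
}

def tags = [
    new Tag(kind: Kind.AUDIO, lang: 'en'),
    new Tag(kind: Kind.TEXT,  lang: 'de'),
    new Tag(kind: Kind.TEXT,  lang: 'en'),
    new Tag(kind: Kind.AUDIO, lang: 'en'),   // exact duplicate of the first entry
]

// LinkedHashSet drops exact duplicates and keeps first-seen order
def label = new LinkedHashSet(tags).join(' ')
println label   // prints EN de en
```

Whether audio and subtitle entries for the same language should be merged is a matter of taste; the point is only that the choice is made in equals() and hashCode().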
 
			
					
				Re: External subtitle languages?
				Posted: 11 Nov 2017, 15:16
				by rednoah
				Yes, secondary files (i.e. any non-video files that match a video file by name) will automatically get matched to the same match as the primary file. Unfortunately for your use case, that also means that all MediaInfo bindings will get redirected and retrieve values from the primary video file.
i.e.
Code:
Movie (2000).mkv
Movie (2000).ac3
Movie (2000).dts
 
			
					
				Re: External subtitle languages?
				Posted: 11 Nov 2017, 16:17
				by thielj
				In general, that's what I want, as long as the mkv/mp4/avi/divx etc is the 'primary file', and external ac3/dts/mp3 tracks are always secondaries.
Are there any hooks to modify the matches before moving on to actions? Is it possible to cache some data between invocations of my script, for example in the 'model' object?
I also noticed that matching seems to happen between sibling folders, i.e. I have two movies in different versions (different codecs/resolutions/etc). This is often due to one being e.g. the German release and the other the original, differing in source material, run-time, etc:
Code:
Movies/(R)/Redacted (2000) [720p DTS 5.1 EN de]/Redacted (2000).nfo
Movies/(R)/Redacted (2000) [720p DTS 5.1 EN de]/Redacted (2000).mkv
Movies/(R)/Redacted (2000) [720p DTS 5.1 EN de]/Redacted (2000).de.srt
Movies/(R)/Redacted (2000) [704x384 DE]/Redacted (2000).nfo
Movies/(R)/Redacted (2000) [704x384 DE]/Redacted (2000).CD1.avi
Movies/(R)/Redacted (2000) [704x384 DE]/Redacted (2000).CD2.avi
As a workaround, I can process these separately I guess. Haven't fully investigated this yet due to the xattr issues.
 
			
					
				Re: External subtitle languages?
				Posted: 11 Nov 2017, 17:48
				by rednoah
				1. {model} can be particularly inefficient, because you'll access all files for each individual file each time, so it doesn't scale well. FileBot should be caching some stuff internally. There are a few crazy people that use internal APIs in custom formats. That's for another thread though.
2. The --filter option should give you some control as to what gets matched. However, you don't get any hooks for checking the file/movie matches before processing. If you have the same movie file multiple times, then it will get matched to the same movie object multiple times. Your format will need to account for that if that's something that can happen in your case.