
AMC: Existing ./. downloaded subs

Posted: 20 Nov 2017, 10:59
by thielj
Hi

assuming I have

Code: Select all

Some Movie (2017).mkv
Some Movie (2017).srt
The srt file contains English subs. With the parameter subtitles=en, FileBot fetches English subtitles:

Code: Select all

Some Movie (2017).mkv
Some Movie (2017).eng.srt
Some Movie (2017).srt
When entering the renaming phase, both subtitles end up with the same target name, and the second copy is discarded. However, as the subtitles seem to be processed in sort order, it's the original, previously existing English subtitles that are discarded.

Code: Select all

Some Movie (2017).mkv        --> Some Movie (2017).mkv
Some Movie (2017).eng.srt    --> Some Movie (2017).eng.srt
Some Movie (2017).srt        --> Some Movie (2017).eng.srt         CONFLICT!!
Suggested solution: Detect existing subtitle languages first (embedded, idx, srt, ...) and only fetch missing languages.
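
To make the suggestion concrete, here is a minimal Python sketch of "detect first, fetch only what is missing". It is not FileBot code: the function names, the tiny tag table and the `detect_untagged` callback (a stand-in for text-based language detection) are all assumptions for illustration, and embedded or idx subtitle streams are not covered.

```python
from pathlib import Path

SUBTITLE_EXTENSIONS = {".srt", ".ass", ".ssa", ".sub", ".idx"}

# Deliberately tiny, illustrative tag table; a real one would cover ISO 639.
LANGUAGE_TAGS = {
    "en": "en", "eng": "en",
    "fr": "fr", "fre": "fr", "fra": "fr",
    "de": "de", "ger": "de", "deu": "de",
}


def existing_languages(filenames, detect_untagged=None):
    """Return the set of subtitle languages already present in a file group."""
    found = set()
    for name in filenames:
        path = Path(name)
        if path.suffix.lower() not in SUBTITLE_EXTENSIONS:
            continue  # not a subtitle file
        # "Movie.eng.srt" -> "eng"; "Movie.srt" -> ""
        tag = Path(path.stem).suffix.lstrip(".").lower()
        if tag in LANGUAGE_TAGS:
            found.add(LANGUAGE_TAGS[tag])
        elif detect_untagged is not None:
            # Hypothetical hook: inspect the subtitle text and return a code.
            language = detect_untagged(name)
            if language:
                found.add(language)
    return found


def missing_languages(wanted, filenames, detect_untagged=None):
    """Languages from `wanted` (in order) that still need to be fetched."""
    present = existing_languages(filenames, detect_untagged)
    return [language for language in wanted if language not in present]
```

With the two files from the example and a detector that reports English for the untagged srt, `missing_languages(["en", "fr"], files, detect_untagged=lambda name: "en")` would leave only `["fr"]` to download. Without a detector the untagged file is invisible and both languages are reported as missing, which matches the behaviour described above.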

Re: AMC: Existing ./. downloaded subs

Posted: 20 Nov 2017, 12:08
by rednoah
It's up to your format to not yield conflicting file paths. Improved support for existing subtitles in the amc script is not planned. You'll have more control by using the suball script separately from the amc script to fetch subtitles as necessary.

:idea: If the subtitles you get from OpenSubtitles aren't good, then that's your cue to upload better ones, especially if you happen to already have them. ;)

Re: AMC: Existing ./. downloaded subs

Posted: 20 Nov 2017, 16:01
by thielj
I disagree:

1. "My format" isn't aware that FileBot has added an additional English subtitle. Short of implementing heuristics, there's no way to figure this out.
2. Even the default or built-in {plex} formats will not resolve these conflicts.
3. Keeping functional, existing files should have the highest priority. At a minimum, newly downloaded subs should be processed at the very end of the file group.
4. FileBot, on the other hand, is aware that I already have an English subtitle and should not have downloaded another.
5. Consistency: If my original subtitle was named *.eng.srt instead of just *.srt, FileBot wouldn't have downloaded another English sub.
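
Point 3 can be sketched in a few lines of Python. This is only an illustration under assumptions (the `(source, target, downloaded)` tuple layout and the function name are made up here, not how the amc script works): pre-existing files are ordered ahead of freshly downloaded ones, and the first file to claim a target path keeps it.

```python
def resolve_rename_plan(plan):
    """Resolve target collisions in favour of pre-existing files.

    `plan` is a list of (source, target, downloaded) tuples, where
    `downloaded` is True for files fetched during this run.
    Returns (kept, skipped): a target -> source mapping and the list
    of sources that lost a collision.
    """
    # False sorts before True, and sorted() is stable, so existing files
    # come first while the original order is otherwise preserved.
    ordered = sorted(plan, key=lambda entry: entry[2])
    kept, skipped = {}, []
    for source, target, downloaded in ordered:
        if target in kept:
            skipped.append(source)  # target already claimed
        else:
            kept[target] = source
    return kept, skipped


plan = [
    ("Some Movie (2017).mkv", "Some Movie (2017).mkv", False),
    ("Some Movie (2017).eng.srt", "Some Movie (2017).eng.srt", True),
    ("Some Movie (2017).srt", "Some Movie (2017).eng.srt", False),
]
kept, skipped = resolve_rename_plan(plan)
```

For the example group, the original `Some Movie (2017).srt` ends up as the `.eng.srt` target and the downloaded copy is the one that gets skipped, regardless of how the file names sort.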

Re: AMC: Existing ./. downloaded subs

Posted: 20 Nov 2017, 17:48
by rednoah
1. Formats that don't use {pi} aren't aware of multi-part movies. Formats that use {subt} aren't aware of multiple subtitles per language. Unfortunately, there's no particular built-in support for multiple subtitles for the same language.

2. The {plex} format doesn't support multiple subtitles per language. It also doesn't support multiple video files of the same movie with different resolutions.

3. That's up to you. You can't use --def subtitles then. I recommend fetching subtitles independently of the amc script. That makes sense for many reasons. Firstly the subtitles will be well-named by the amc script so you won't get duplicates. Secondly, you'll be able to add subtitles later, as they're not always available immediately.

4. Not at that time. FileBot is running language detection on the subtitle text when you use {subt} or {lang} on subtitle files that aren't tagged. I'd prefer not to add unnecessary complexity to the amc script for this.

5. That's correct. Things work better if files are well-named. If the subtitle file was called subs/English.srt then it would be ignored completely. The amc script does keep things simple.

Re: AMC: Existing ./. downloaded subs

Posted: 20 Nov 2017, 17:49
by rednoah
(3) solves all your concerns as far as I can tell. Just remove --def subtitles and add a suball script call afterwards. You even get to filter out files that already contain embedded subtitles via the --def ignoreTextLanguage option.

Re: AMC: Existing ./. downloaded subs

Posted: 20 Nov 2017, 21:06
by thielj
(1) I haven't even talked about multi-part movies...
(1)/(2) ... and neither about multiple subtitles per language (although LANG.specialsomething.srt usually works with players I have been using).

I have an issue with FileBot getting the additional unnecessary subtitle in the first place! And then either losing the existing subtitle (for all languages in sort order < SRT) or losing the new subtitle (if you happen to be looking for Swedish or Turkish texts for example).

I would even suggest unpacking the downloaded subs into a temporary folder rather than into the INPUT directory.
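
A rough Python sketch of that staging idea, assuming the download is already available as bytes (the function name and parameters are hypothetical, not a FileBot API): the file is first written to a temporary folder and only moved into place if nothing already occupies the destination.

```python
import shutil
import tempfile
from pathlib import Path


def place_download(name, data, destination_dir):
    """Stage a downloaded subtitle in a temp folder, then move it into place.

    Returns True if the file was placed, False if an existing file
    already occupies the destination (the download is then dropped).
    """
    destination = Path(destination_dir) / name
    with tempfile.TemporaryDirectory() as staging:
        staged = Path(staging) / name
        staged.write_bytes(data)
        if destination.exists():
            return False  # never touch a file the user already has
        shutil.move(str(staged), str(destination))
        return True
```

The input folder is only ever modified once the decision to keep the download has been made; a discarded download disappears together with the temporary folder.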

(4) How often does it happen that someone interested in processing subtitles does NOT need either {subt} or {lang} later in their format? I can't imagine any performance benefits from delay-loading this information, just added complexity and inconsistency.

(5) No need to discuss that things are better when files are well named (that's why I'm doing this here), but FileBot's behaviour is still inconsistent.

Re: AMC: Existing ./. downloaded subs

Posted: 20 Nov 2017, 21:09
by thielj
(4) also, the unnecessary download will probably be 10x slower than a language detection of an untagged subtitle!

Re: AMC: Existing ./. downloaded subs

Posted: 20 Nov 2017, 21:36
by rednoah
Sorry, but that's just how it works. As far as subtitles are concerned, there are many limitations. The current solution is to always fetch them online. It's not ideal, but it works the same regardless of whether you already have subs, how they're named, or whether language detection actually yielded the correct language, etc.

The --def subtitles=en amc script option just isn't the right solution for your particular requirements.

Please use the suball script instead. You can just && it to your amc script call. That'll give you lots of options.

Re: AMC: Existing ./. downloaded subs

Posted: 21 Nov 2017, 16:24
by thielj
The suball script doesn't give me the option to make the available languages part of the movie (folder) name, unless I re-run the amc script a second time (and create a real mess in the FileBot history).

How about a different approach, providing the movie script an API like this instead of using the subtitles= parameter:

Code: Select all

  ['en', 'fr', 'de'].each { lang ->
    FileBot.Whatever.getDownloadableSubs(lang, movieFile)
      .findAll { whateverCriteriaWeWant(it) } // pick only those we want to download
      .each { it.registerTargetLocation(whateverLocationWeWantThem(it)) }
  }

Re: AMC: Existing ./. downloaded subs

Posted: 21 Nov 2017, 16:38
by rednoah
That's essentially writing your own pre-processing script where you download subtitles yourself, before calling the amc script. That's always an option.

e.g. you could pre-process your files and make sure that each subtitle file is named and language-tagged correctly:

Code: Select all

filebot -rename -r /path/to/files --db xattr -non-strict --filter 'f.subtitle' --format '{fn}{subt}'
That way you'll add the language code to each file, which should prevent new subtitles from being downloaded when you run the amc script later on.

Re: AMC: Existing ./. downloaded subs

Posted: 21 Nov 2017, 17:35
by thielj
The advantage of the API suggestion is that I have access to ALL the relevant information I need to make decisions and generate non-conflicting filenames: all files in the batch, original movie language(s), available audio languages, hard-coded subs if I have previously tagged my files, subtitles in external files, multiple subtitle languages in idx files, languages available on OpenSubtitles, etc.

With a similar API, I would even be able to process orphans (and leverage FileBot to help me detect languages etc) and check if they match any pattern I have previously used to store certain files (everything in an ./Extras folder).

Re: AMC: Existing ./. downloaded subs

Posted: 21 Nov 2017, 17:44
by thielj
Another API advantage: if the script registered additional files, orphans, etc., they would also become part of the group/batch of files processed together and would all end up in history, with proper xattr, etc.

You could at some point even consider every group/batch as a single transaction that - like in SQL - will either fully complete or rollback. This would give users some peace of mind.
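
As a rough sketch of what "all or nothing" could look like for a batch of moves (plain Python with a made-up `move_batch` helper, not anything FileBot provides): every completed move is recorded, and if any step fails, the recorded moves are undone in reverse order before the error is re-raised.

```python
import shutil
from pathlib import Path


def move_batch(moves):
    """Move every (source, target) pair, or leave all files where they were.

    Refuses to overwrite existing targets. On any failure, the moves
    already performed are reverted in reverse order and the original
    exception is re-raised. Directories created along the way are
    not removed again.
    """
    done = []
    try:
        for source, target in moves:
            source, target = Path(source), Path(target)
            if target.exists():
                raise FileExistsError(str(target))
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(source), str(target))
            done.append((source, target))
    except Exception:
        # Roll back: put every moved file back at its original path.
        for source, target in reversed(done):
            shutil.move(str(target), str(source))
        raise
```

This is best effort rather than a true transaction: the rollback itself can still fail (for example if a disk disappears mid-batch), so a real implementation would probably also want a journal on disk.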