Feature Request: Improved subtitle matching

xaeiou
Posts: 15
Joined: 22 Oct 2019, 06:24

Feature Request: Improved subtitle matching

Post by xaeiou »

Hi,

This is a feature request (or possibly a bug report) to improve the matching logic when a folder contains multiple subtitle files for the same language.

To distill the situation to a simple example, consider this tree:

Code: Select all

$ find Seinfeld.S01-S09/                                        
Seinfeld.S01-S09/                                               
Seinfeld.S01-S09/Seinfeld.S01                                   
Seinfeld.S01-S09/Seinfeld.S01/Subs                              
Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01              
Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/5_Chinese.srt
Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/6_Chinese.srt
Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/3_English.srt
Seinfeld.S01-S09/Seinfeld.S01/Seinfeld.S01E01.mp4               
 
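Using the latest release, FileBot correctly matches files by gleaning information from the file path, but gets confused if a language appears more than once. In the test run below, 6_Chinese.srt is matched to "6x01 The Chaperone" instead of "1x01 The Seinfeld Chronicles":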

Code: Select all

$ filebot --action test -rename -non-strict -r Seinfeld.S01-S09 --format "{sxe} {episode.title}{subt}"                                      
Classify media files                                                                                                                                                        
* Consider specifying --db TheTVDB or --db TheMovieDB explicitly                                                                                                            
Rename episodes using [TheTVDB] with [Airdate]                                                                                                                              
Lookup via [Seinfeld]                                                                                                                                                       
Fetching episode data for [Seinfeld]                                                                                                                                        
[TEST] from [Seinfeld.S01-S09/Seinfeld.S01/Seinfeld.S01E01.mp4] to [Seinfeld.S01-S09/Seinfeld.S01/1x01 The Seinfeld Chronicles.mp4]                                         
[TEST] from [Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/3_English.srt] to [Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/1x01 The Seinfeld Chronicles.eng.srt] 
[TEST] from [Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/5_Chinese.srt] to [Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/1x01 The Seinfeld Chronicles.chi.srt] 
[TEST] from [Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/6_Chinese.srt] to [Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/6x01 The Chaperone.chi.srt]           
Processed 4 files                                                                                                                                                           
(I've removed absolute paths from the above to make it easier to read)

I've also tested this with multiple seasons. FileBot correctly identifies the season, but always trips up when there is more than one subtitle file for a language.

Maybe this can be handled with a script, but I'm not sure, because the problem is within FileBot's episode matching logic.

Anyway, if the episode matching logic can be improved, it might be worth a dedicated command-line option or, even better, be done seamlessly if possible.

Handling this situation would require adding the subtitle index to the file name, but I notice a lot of people in the forums are already doing this in their custom scripts (I am), e.g.:

Code: Select all

{if (f.subtitle) '(' + { fn.matchAll(/\d+/).last() } + ')' }
{ if (f.subtitle) '(' + fn.match(/_(\d+)$/) + ')' }
So what I'm thinking is something like this (faked) result:

Code: Select all

$ filebot --action test -rename -non-strict -r Seinfeld.S01-S09 --format "{sxe} {episode.title}{subt}"  --sub_index 
Classify media files                                                                                                                                                        
* Consider specifying --db TheTVDB or --db TheMovieDB explicitly                                                                                                            
Rename episodes using [TheTVDB] with [Airdate]                                                                                                                              
Lookup via [Seinfeld]                                                                                                                                                       
Fetching episode data for [Seinfeld]                                                                                                                                        
[TEST] from [Seinfeld.S01-S09/Seinfeld.S01/Seinfeld.S01E01.mp4] to [Seinfeld.S01-S09/Seinfeld.S01/1x01 The Seinfeld Chronicles.mp4]                                                  
[TEST] from [Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/3_English.srt] to [Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/1x01 The Seinfeld Chronicles.eng(3).srt] 
[TEST] from [Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/5_Chinese.srt] to [Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/1x01 The Seinfeld Chronicles.chi(5).srt] 
[TEST] from [Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/6_Chinese.srt] to [Seinfeld.S01-S09/Seinfeld.S01/Subs/Seinfeld.S01E01/1x01 The Seinfeld Chronicles.chi(6).srt]           
Processed 4 files    
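
The naming scheme in the faked output above can be sketched outside FileBot. This is not FileBot code, only a minimal Python illustration. It assumes the index is the last run of digits in the subtitle's base name (the same idea as the matchAll expression above) and that the language codes are hard-coded. The names LANG_CODES and sub_suffix are hypothetical and exist only for this example.

```python
import re
from pathlib import PurePosixPath

# Hypothetical mapping for this example only; FileBot detects languages itself.
LANG_CODES = {"English": "eng", "Chinese": "chi"}

def sub_suffix(path: str) -> str:
    """Build a suffix such as '.chi(5)' from a name like '5_Chinese.srt'."""
    stem = PurePosixPath(path).stem        # e.g. '5_Chinese'
    numbers = re.findall(r"\d+", stem)     # every run of digits in the base name
    # Fall back to 'und' (undetermined) when no known language name is found.
    lang = next((code for name, code in LANG_CODES.items() if name in stem), "und")
    index = f"({numbers[-1]})" if numbers else ""
    return f".{lang}{index}"

for name in ["3_English.srt", "5_Chinese.srt", "6_Chinese.srt"]:
    print("1x01 The Seinfeld Chronicles" + sub_suffix(name) + ".srt")
```

With the three subtitle files from the example tree, each file gets a distinct target name because the index is part of the suffix.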
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Feature Request: Improved subtitle matching

Post by rednoah »

This particular issue might have been fixed already in the latest beta.


:arrow: Please try the latest revision and see if the issue can still be reproduced:
viewtopic.php?t=1609


:idea: The latest revision should fix matching, but custom formatting with subtitle index is a related but separate issue. The {plex} format would yield the same path for all subtitles of the same language, and thus only retain one of them in the target folder.
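
The collision described here can be illustrated with a small sketch. This is not FileBot code, just a hedged Python example that assumes a rename plan is a simple mapping from source name to target name. The function find_collisions and the plan dictionary are hypothetical names for this illustration.

```python
from collections import defaultdict

def find_collisions(plan: dict[str, str]) -> dict[str, list[str]]:
    """Return each target name claimed by more than one source, with its sources."""
    by_target = defaultdict(list)
    for source, target in plan.items():
        by_target[target].append(source)
    return {target: sources for target, sources in by_target.items() if len(sources) > 1}

# Targets as they would look without a subtitle index in the format.
plan = {
    "5_Chinese.srt": "1x01 The Seinfeld Chronicles.chi.srt",
    "6_Chinese.srt": "1x01 The Seinfeld Chronicles.chi.srt",
    "3_English.srt": "1x01 The Seinfeld Chronicles.eng.srt",
}
print(find_collisions(plan))
```

Both Chinese subtitles map to the same target name, so only one of them could survive in the destination folder unless the format adds something that distinguishes them, such as the subtitle index.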
xaeiou
Posts: 15
Joined: 22 Oct 2019, 06:24

Re: Feature Request: Improved subtitle matching - RESOLVED in BETA FileBot 4.9.5 (r9099)

Post by xaeiou »

rednoah wrote: 10 Feb 2022, 11:49 This particular issue might have been fixed already in the latest beta.
Yup, it is fixed in BETA FileBot 4.9.5 (r9099). I tested it on the full 9 seasons of Seinfeld (over 5800 files with the full sub pack) and it worked perfectly. Thanks very much!

I've made a note to check the beta next time I find something :)
rednoah wrote: 10 Feb 2022, 11:49 :idea: The latest revision should fix matching, but custom formatting with subtitle index is a related but separate issue. The {plex} format would yield the same path for all subtitles of the same language, and thus only retain one of them in the target folder.
No worries, adding the subtitle index seems to work fine if I add the lines I mentioned previously to my "--format" file:

Code: Select all

{if (f.subtitle) '(' + { fn.matchAll(/\d+/).last() } + ')' }
{ if (f.subtitle) '(' + fn.match(/_(\d+)$/) + ')' }