Limit Suball Script to only missing subtitles

kaneelias
Posts: 16
Joined: 06 Aug 2018, 10:14

Limit Suball Script to only missing subtitles

Post by kaneelias »

Hi all,

I have been using FileBot for some time and have only just started playing around with scripting.

I use the AMC script to pull video information, and it also pulls English subs if it can find them, awesome.

My issue is that often, when I run this script, no subs have been uploaded yet.

So I then periodically run a suball script, which reads:

Code: Select all

filebot -script fn:suball w:\ -non-strict --def maxAgeDays=7 
And this works fine, but it overwrites any .srt that has previously been downloaded.

Is there a command I can use to tell the script to skip a file if it already has a .srt file by the same name?

Essentially, the script should only look at files that are at most 7 days old and do not already have subs.

Cheers
rednoah
The Source
Posts: 23513
Joined: 16 Nov 2011, 08:59
Location: Taipei
Contact:

Re: Limit Suball Script to only missing subtitles

Post by rednoah »

kaneelias wrote: 09 Aug 2018, 11:41 Is there a command I can use to tell the script to skip a file if it already has a .srt file by the same name already?
I'm fairly sure that that's default behaviour. What do the logs say?
:idea: Please read the FAQ and How to Request Help.
kaneelias
Posts: 16
Joined: 06 Aug 2018, 10:14

Re: Limit Suball Script to only missing subtitles

Post by kaneelias »

rednoah wrote: 09 Aug 2018, 12:10
kaneelias wrote: 09 Aug 2018, 11:41 Is there a command I can use to tell the script to skip a file if it already has a .srt file by the same name already?
I'm fairly sure that that's default behaviour. What do the logs say?
Thanks for the reply.

The script was saying it found a match and was redownloading.

I have managed to skip the files by adding --action duplicate --conflict skip to the command. So all good now.
starlo
Posts: 6
Joined: 11 May 2014, 11:11

Re: Limit Suball Script to only missing subtitles

Post by starlo »

I want to run this on a large TV and movie collection which currently does not have ANY subtitles (except for the files which already have them embedded). Doing so will exceed my daily download limit considerably (even as a VIP member), so I must download the subtitles over a period of a few days.

However, calling the script daily comes with this requirement:
:!: If you call this script repeatedly on the same folders or files then you MUST SET --def maxAgeDays to 30 days or less and call it no more than once per day.

The problem I have is that all my shows have the file creation date set to the show's air date, so if I set --def maxAgeDays to 30 days or less then it won't download any subtitles for my older shows, e.g. Star Trek.

If I have understood this thread correctly, then this command might be the answer I'm looking for, as the script would scan the /path/to/media folder and then only download an EXACT match for MISSING subtitles, skipping embedded and existing subtitles (both subtitles downloaded by previous runs of the script and any new media which might already have existing .srt files):

Shell: Select all

filebot -script fn:suball /path/to/media --action duplicate --conflict skip

If that is the case, would you still need to add a --def maxAgeDays value? It would only ever download subtitles that were missing; if the script can't find an exact match, or the subtitle is embedded or already exists, then nothing would be downloaded?


My thinking is:

The script COULD run daily (without setting --def maxAgeDays) and, over a period of a few days, download all the subtitles for all the shows and films, skipping subtitles I already downloaded the day before.

Then, once that has been done, the daily cron run of the script would simply find any subtitles for newly downloaded content which were not available on release (download) day. If no new subtitles were found, or nothing was missing, the script would just exit without downloading anything.

So, long post short: would setting the following 2 cron jobs to run once a day achieve my goals of

1st downloading all missing subtitles for my TV and movie collection over a period of a few days, and then
2nd checking once a day for any missing subtitles (i.e. if new content was added recently and no subtitles were available on release/download day), and
3rd not getting banned?

Shell: Select all

filebot -script fn:suball /mnt/user/TV --action duplicate --conflict skip

Shell: Select all

filebot -script fn:suball /mnt/user/Films --action duplicate --conflict skip
Assuming for a moment that the above cron jobs are safe to add, I would also like exact-match subtitles to be downloaded at rename time if they are available. If they are not available, download nothing at all and wait until the daily cron finds them in x days' time.

I use uTorrent to download, but because it runs as nobody it sets some weird file permissions, so I call FileBot from a custom postprocess.sh script, which basically runs chmod on uTorrent's completed folder and then calls amc with $1 $2 $3 etc.

Shell: Select all

sudo -su root chmod -R 777 "/mnt/cache/completed"
sudo -su root chown -R nobody:users "/mnt/cache/completed"
/usr/bin/docker exec filebot /opt/filebot/filebot -script fn:amc --output "/mnt/cache/" --log-file "/mnt/appdata/logs/amc.log" --action move --conflict auto -non-strict --def movieDB=TheMovieDB seriesDB=TheTVDB movieFormat=@/mnt/appdata/config/Filebot/MovieFormat.groovy seriesFormat=@/mnt/appdata/config/Filebot/SeriesFormat.groovy animeFormat=@/mnt/appdata/config/Filebot/AnimeFormat.groovy musicFormat=@/mnt/appdata/config/Filebot/MusicFormat.groovy plex=192.168.0.250:################### minFileSize=0 deleteAfterExtract=y clean=y "ut_dir=$1" "ut_file=$2" "ut_kind=$3" "ut_title=$4" "ut_label=$5" "ut_state=$6" --apply date
Now, when I was searching the forums for help, I read that if you want to ignore embedded and existing subtitles on the CLI, you must first disable amc --def subtitles=en and then call filebot -script fn:suball before you call the amc script.

Since I am using a custom script to call FileBot, this should be easy. My question is how to pass the input files in my custom script, as some downloads will be single files and others folders (ut_kind=single/multi):

Shell: Select all

/usr/bin/docker exec filebot /opt/filebot/filebot -script fn:suball "ut_dir=$1" "ut_file=$2" "ut_kind=$3" "ut_title=$4" "ut_label=$5" "ut_state=$6" --action duplicate --conflict skip --log-file "/mnt/appdata/logs/suball.log"
or

Shell: Select all

/usr/bin/docker exec filebot /opt/filebot/filebot -script fn:suball "$1" "$2" "$3" "$4" "$5" "$6" --action duplicate --conflict skip --log-file "/mnt/appdata/logs/suball.log"
starlo
Posts: 6
Joined: 11 May 2014, 11:11

Re: Limit Suball Script to only missing subtitles

Post by starlo »

Thinking a bit more on this:

Does the suball script count the number of successful downloads? If not, could it be updated to do so, e.g. with --def limit=x?
Once this limit is reached, the suball script would exit and not request any more subtitles.

The reason being:

If suball is called daily with the --action duplicate --conflict skip suggestion above on a large TV directory with no existing subtitle files, say 10,000 episodes across multiple shows and seasons, it would successfully download up to the user's download limit, which in my case is 1,000. That leaves 9,000 files for the following days, but it would still request the remaining 9,000 subtitles and get "download request denied" 9,000 times.

Day 2: it would ignore the 1,000 files downloaded the day before and request 9,000, successfully download 1,000, and still request the remaining 8,000.

Day 3: it would ignore the 2,000 files previously downloaded and request 8,000, successfully download 1,000, and still request the remaining 7,000.
etc.
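
To put numbers on this, here is a quick back-of-the-envelope simulation. It assumes every file has a subtitle available and that exactly the daily limit succeeds each day; simulate_runs is just an illustrative name, not anything FileBot provides:

```shell
# simulate_runs TOTAL LIMIT: every day all remaining files are requested,
# but only LIMIT downloads succeed; the rest count as denied requests.
# Assumption: a subtitle exists for every file (worst case for denials).
simulate_runs() {
  remaining=$1
  limit=$2
  day=0
  denied=0
  [ "$limit" -gt 0 ] || return 1   # a limit of 0 would never finish
  while [ "$remaining" -gt 0 ]; do
    day=$((day + 1))
    ok=$limit
    if [ "$remaining" -lt "$limit" ]; then
      ok=$remaining
    fi
    denied=$((denied + remaining - ok))
    remaining=$((remaining - ok))
  done
  echo "$day days, $denied denied requests"
}

simulate_runs 10000 1000   # prints: 10 days, 45000 denied requests
```

So under these assumptions the whole collection would take 10 daily runs, with tens of thousands of denied requests along the way.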

I'm guessing all the repeated failed requests each day would get you a ban?

So if the suball script could be stopped with an option like --def limit=x after the first x successful downloads, it would solve the problem.


The reason I need to download so many subtitles is that my partner has moved in with me and needs subtitles, and previously I did not download them. But I can easily see how this would be of use to someone who is new to FileBot and wants to download subtitles for their existing collection.

Using the proposed limit option, they could simply run the following daily on their TV or movie folder. It would scan the files, skipping any embedded or existing subtitles, and abort the script once the user's download limit has been reached. As it skips existing subtitles, the list of files it requests subtitles for would be different each day, and it would exit once the download limit has been reached, so OpenSubtitles would never receive the same request twice despite the script running on the same directory daily without the age limit being set:

Shell: Select all

filebot -script fn:suball /path/to/media --action duplicate --conflict skip --limit 1000
starlo
Posts: 6
Joined: 11 May 2014, 11:11

Re: Limit Suball Script to only missing subtitles

Post by starlo »

I realise my last two posts were rather long, so to sum it all up:

1st
Basically, I just need a way to download 10,000+ subtitles for an existing collection over a period of days, ignoring any embedded or existing subtitles, and then check said collection once a day for any missing subtitles, i.e. newly downloaded files which did not have subtitles available at the time they were released.

I think this can be solved by adding these two cron jobs (although I may need the limit option I suggested):

Shell: Select all

filebot -script fn:suball /mnt/user/TV --action duplicate --conflict skip

Shell: Select all

filebot -script fn:suball /mnt/user/Films --action duplicate --conflict skip
2nd
Set uTorrent to check for subtitles when a download completes, which I believe is as easy as adding either

Shell: Select all

/usr/bin/docker exec filebot /opt/filebot/filebot -script fn:suball "ut_dir=$1" "ut_file=$2" "ut_kind=$3" "ut_title=$4" "ut_label=$5" "ut_state=$6" --action duplicate --conflict skip --log-file "/mnt/appdata/logs/suball.log"
or

Shell: Select all

/usr/bin/docker exec filebot /opt/filebot/filebot -script fn:suball "$1" "$2" "$3" "$4" "$5" "$6" --action duplicate --conflict skip --log-file "/mnt/appdata/logs/suball.log"
as the first line of the custom uTorrent postprocess script I use.
rednoah
The Source
Posts: 23513
Joined: 16 Nov 2011, 08:59
Location: Taipei
Contact:

Re: Limit Suball Script to only missing subtitles

Post by rednoah »

You can use the filebot -get-subtitles command to get subtitles for all your files:

Shell: Select all

filebot -get-subtitles -r "/path/to/files" -non-strict
:arrow: https://www.filebot.net/cli.html


:idea: filebot -get-subtitles is not well-tested with surpassing your limits, especially the new OpenSubtitles REST API. Presumably it'll just error out, or retry a few times and then error out, when the limit is reached. You'll want to do a few test runs and observe the log. Please share your findings.


:idea: You may want to combine this with a find -exec or bash script of some kind to limit the number of files that FileBot is called with each time, i.e. you'll want to write your own code that takes care of the done / todo accounting.
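
A minimal sketch of that done / todo accounting, assuming a sidecar subtitle is named like the video plus .srt or .<lang>.srt (e.g. .eng.srt). The function names has_sidecar and list_missing_subs and the list of video extensions are illustrative, not part of FileBot. Embedded subtitles are not detected here, file names containing newlines are not handled, and a release whose name is a dotted prefix of another release in the same folder may be wrongly treated as done:

```shell
#!/bin/sh
# has_sidecar VIDEO: succeed if NAME.srt or NAME.<lang>.srt exists next to VIDEO.
# Assumption: subtitles sit in the same folder and share the video's base name.
has_sidecar() {
  base=${1%.*}
  # If the glob matches nothing it stays literal, and the -e test simply fails.
  for f in "$base".srt "$base".*.srt; do
    [ -e "$f" ] && return 0
  done
  return 1
}

# list_missing_subs DIR LIMIT: print up to LIMIT videos without a sidecar,
# one path per line, in sorted order so that each run is predictable.
list_missing_subs() {
  find "$1" -type f \( -name '*.mkv' -o -name '*.mp4' -o -name '*.avi' \) |
    sort |
    while IFS= read -r video; do
      has_sidecar "$video" || printf '%s\n' "$video"
    done |
    head -n "$2"
}

# Hypothetical daily cron usage (not executed here): hand today's batch to
# FileBot one file at a time; </dev/null keeps filebot from eating the list.
# list_missing_subs /path/to/media 1000 | while IFS= read -r video; do
#   filebot -get-subtitles "$video" </dev/null
# done
```

Because files that received a subtitle yesterday now have a sidecar, each run naturally moves on to the next batch. Files for which no subtitle exists will be retried every day, so you may still want to track those separately.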


:idea: You can remove the -non-strict bit if you want to limit yourself to exact matches.


:idea: Excluding files that already have embedded subtitles could be done via the --file-filter option, or your custom bash script that selects files based on your custom code.


:!: The suball script is very counterproductive in your specific use case, as it was designed for an automated setup where lots of safety checks are useful and desirable, before eventually selecting a few files and just making a simple -get-subtitles call.


:!: The --action duplicate --conflict skip options do nothing in filebot -get-subtitles or filebot -script fn:suball calls. I would refrain from using options that do nothing, so as to avoid confusion.




EDIT:

Once you're at the point where you only need new subtitles, and if the --def subtitles=en option of the amc script is somehow not suitable for your use case, then you can do something like this:

Shell: Select all

filebot -script fn:suball /path/to/media -non-strict --def minAgeDays=1 maxAgeDays=7
If you want to use this in your custom post-processing bash script, where $1 is the input folder, then you will want to write the command like this:

Shell: Select all

filebot -script fn:suball "$1" -non-strict --def minAgeDays=1 maxAgeDays=7
** You may want to make this call before processing your files with the amc script so that your files still have the original file dates and file names.
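
For torrents that are a single file rather than a folder, a wrapper could pick the input path from ut_kind first. A rough sketch, assuming the argument order of the post-processing script quoted earlier in the thread ($1 = ut_dir, $2 = ut_file, $3 = ut_kind) and that a single-file download sits directly inside ut_dir; resolve_input is a made-up helper name:

```shell
# resolve_input UT_DIR UT_FILE UT_KIND: print the path to hand to FileBot.
# Assumption: ut_kind is "single" for one-file torrents and "multi" otherwise.
resolve_input() {
  case $3 in
    single) printf '%s/%s\n' "$1" "$2" ;;  # one file inside the download dir
    *)      printf '%s\n' "$1" ;;          # multi-file torrent: the folder
  esac
}

# Hypothetical use inside the post-processing script (not executed here):
# filebot -script fn:suball "$(resolve_input "$1" "$2" "$3")" -non-strict --def minAgeDays=1 maxAgeDays=7
```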
starlo
Posts: 6
Joined: 11 May 2014, 11:11

Re: Limit Suball Script to only missing subtitles

Post by starlo »

Thanks for the reply, but I'm still struggling.

I don't want amc to get subtitles because it downloads .srt files even if the video already has embedded subtitles, but I think that part of the problem is solved by editing my custom post-processing bash script to have this as the first line, before amc is called:

Shell: Select all

filebot -script fn:suball "$1" --def minAgeDays=1 maxAgeDays=7
This will try to download subtitles and, if an exact match is not found, skip the file. amc would then kick in and rename, move and change the date of the file, so I would also need to add the following as a daily cron job so that it keeps searching for a match for the next 7 days:

Shell: Select all

filebot -script fn:suball /path/to/media -non-strict --def minAgeDays=1 maxAgeDays=7
Perfect. That's me sorted moving forward: anything new will get subtitles, and the script will skip any files with embedded subtitles or external subtitles.

But after spending all day googling how to use find -exec and --file-filter, I'm even more confused than I was in the beginning.

Somehow I need to use find to find all video files that don't have a matching .eng.srt file (coping with any white space in the filenames), then -exec get subtitles with --file-filter to exclude embedded subtitles, and finally limit this to 1,000 results and exit.

At least that's what I think I need to do after reading your post, but I have no idea how to achieve it.

Is there any chance you could help me out? My head is going to explode!

Just for clarity:

I am trying to create a one-line filebot command that I can run as a daily cron job and which will:

1st . Search my entire TV and movie folders for any files without subtitles (embedded or .eng.srt file)
2nd. Limit the result of this search to 1,000
3rd. Download exact-match subtitles for the search results

I really thought suball would have been perfect in my case, as it would skip existing subtitles and embedded subtitles by default, so I could simply run it once a day on the TV folder and it would just chip away at it, skipping the files already done and just grabbing the missing subtitles.
rednoah
The Source
Posts: 23513
Joined: 16 Nov 2011, 08:59
Location: Taipei
Contact:

Re: Limit Suball Script to only missing subtitles

Post by rednoah »

starlo wrote: 04 Nov 2024, 17:27 1st . Search my entire tv and movie folders for any files without subtitles (embedded or .eng.srt file)
2nd. Limit the result of this search to 1,000
3rd. Download exact match subtitles for the search results
What you are asking for is default behaviour:
  1. The embedded bit is default for suball script calls. The .eng.srt file bit is default for both -get-subtitles and suball script calls. It may not work well if files are not well-named. Please post logs (the bits that show the file names) so we can see what you can see.
  2. FileBot will likely just crash and abort when requests fail. Thus this behaviour is default. Please paste logs (the last few lines) so we can see what happens when the limit is reached.
  3. This is default behaviour, unless -non-strict is specified.

:arrow: Please paste console logs if you're having issues so that we can see what you can see. You can paste console logs on Pastebin and then paste a link here.