[AMC] --filter "age < 7 || !model.any{ it.age < 7 }" is extremely slow for shows with a large number of episodes

miggy
Posts: 8
Joined: 09 Jun 2016, 19:57

Post by miggy »

Hello,

When I run

Code: Select all

 filebot -script fn:amc --filter "age < 7 || !model.any{ it.age < 7 }" --output "D:\Finished DLs" --action move --conflict index -non-strict "D:\Finished DLs\TV" --log-file "D:\Finished DLs\amc_TV.log"  --def skipExtract=y clean=y 
against a show with a lot of episodes, it takes longer than I have the patience for.

For example, against "the.daily.show.2021.09.28" I get:

Code: Select all

Fetching episode data for [The Daily Show]
Fetching episode data for [The Daly Show]
Fetching episode data for [The Nightly Show with Larry Wilmore]
Fetching episode data for [The D'Amelio Show]
Fetching episode data for [The Lucy Show]
Apply filter [age < 7 || !model.any{ it.age < 7 }] on [4513] options
It hangs there until I quit. I've run it against shows with 843 options and it finishes within 30 seconds, so I expect it would finish eventually. Is this slowness expected behavior?
rednoah
The Source
Posts: 20271
Joined: 16 Nov 2011, 08:59

Re: filter episodes by age < 7 takes too long

Post by rednoah »

age < 7 is a linear-time check (i.e. each episode is checked once) and thus pretty fast (i.e. 4,513 * 1 = 4,513 operations).

model.any{ it.age < 7 } makes the filter quadratic-time (i.e. for each episode, every episode is checked again). That works well for small numbers, but it doesn't scale to large ones (i.e. 4,513 * 4,513 = 20,367,169 operations).


TL;DR age < 7 || !model.any{ it.age < 7 } is extremely slow for TV shows with a large number of episodes.
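
As a rough sanity check on those numbers, here is a minimal shell sketch of the two cost models. The function names linear_ops and quadratic_ops are made up for illustration, and the counts are predicate evaluations only, not a measurement of what FileBot does internally.

```shell
#!/bin/sh
# Rough cost model only: counts age checks, not real FileBot work.

# One age check per episode.
linear_ops() {
  echo "$1"
}

# One full scan of the model for every episode. This is the worst case,
# since model.any can stop early once it finds a match.
quadratic_ops() {
  echo $(( $1 * $1 ))
}

# The two option counts mentioned in this thread.
for n in 843 4513; do
  echo "$n options: linear=$(linear_ops "$n") quadratic=$(quadratic_ops "$n")"
done
```

For 843 options that gives 843 vs 710,649 checks, and for 4,513 options 4,513 vs 20,367,169. If run time scaled with the quadratic count and nothing else, the 30-second run at 843 options would extrapolate to something on the order of 14 minutes at 4,513, though that is only a back-of-the-envelope estimate.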


You may prefer two separate amc script calls: one with --filter "age < 7" for newly released episodes, and another without --filter for everything else.
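
A minimal sketch of what the two-call approach could look like, assuming a POSIX shell. The INPUT and OUTPUT paths are placeholders and the run helper is hypothetical. The filebot options are copied from the command in the first post, with --log-file left out for brevity. By default the script only prints the two commands; it executes them only when RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: paths are placeholders, adjust before use.
INPUT="/path/to/TV"        # hypothetical input folder
OUTPUT="/path/to/output"   # hypothetical output folder

# Print the command (dry run) unless RUN=1 is set, in which case execute it.
# Note that the printed form loses the original quoting.
run() {
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

# Pass 1: only match newly released episodes.
run filebot -script fn:amc --filter "age < 7" --output "$OUTPUT" \
  --action move --conflict index -non-strict "$INPUT" \
  --def skipExtract=y clean=y

# Pass 2: no --filter, for whatever the first pass left behind.
run filebot -script fn:amc --output "$OUTPUT" \
  --action move --conflict index -non-strict "$INPUT" \
  --def skipExtract=y clean=y
```

Because the first pass uses --action move, anything it matches is gone from the input folder by the time the second pass runs, so the second call only sees the leftovers.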




EDIT:

FileBot r8833 now caches the {model} value in memory for each unique match context. This makes {model} access constant time, and thus takes care of most of the slowness:

Code: Select all

$ time filebot -list --q "The Daily Show" --log INFO --filter 'age < 5 || 5 <= model.age.min()'
The Daily Show with Trevor Noah - 27x01 - Neal Brennan
The Daily Show with Trevor Noah - 27x02 - Davido

real	0m13.518s
user	0m18.088s
sys	0m0.613s

Evaluating model.age.min() for each episode is still a bit slow, though. At this point the only remaining optimization is in the --filter code itself: run the hasNoRecentEpisode check only once and then remember the result, instead of recomputing it for each episode:

Code: Select all

$ time filebot -list --q "The Daily Show" --log INFO --filter '
// true if no episode in the model is newer than 5 days
// memoized, so it is computed once per cached {model}
@groovy.transform.Memoized
static def hasNoRecentEpisode(model) {
    5 <= model.age.min()
}
age < 5 || hasNoRecentEpisode(model)
'
The Daily Show with Trevor Noah - 27x01 - Neal Brennan
The Daily Show with Trevor Noah - 27x02 - Davido

real	0m3.295s
user	0m8.687s
sys	0m0.496s
Note that @groovy.transform.Memoized only works as expected if the {model} value is cached, and thus this --filter code optimization also requires FileBot r8833 or higher.
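
To illustrate the "compute once, then remember" idea outside of Groovy, here is a small shell sketch. It is not how FileBot or @Memoized is implemented. The AGES list is made-up sample data standing in for {model}, and no_recent_episode is a hypothetical helper that mirrors 5 <= model.age.min().

```shell
#!/bin/sh
# Hypothetical episode ages in days, standing in for {model}; not real data.
AGES="2 9 40 400 4000"

SCANS=0   # number of full scans over AGES
CACHE=""  # remembered result; empty until the first call

# Sets NO_RECENT to 1 if every age is >= 5 (like 5 <= model.age.min()), else 0.
# AGES is scanned on the first call only; later calls reuse CACHE.
no_recent_episode() {
  if [ -z "$CACHE" ]; then
    SCANS=$((SCANS + 1))
    CACHE=1
    for a in $AGES; do
      if [ "$a" -lt 5 ]; then CACHE=0; fi
    done
  fi
  NO_RECENT=$CACHE
}

KEPT=""
for age in $AGES; do
  # Called directly, not inside $(...), so the cached value survives.
  no_recent_episode
  if [ "$age" -lt 5 ] || [ "$NO_RECENT" = "1" ]; then
    KEPT="$KEPT $age"
  fi
done

echo "kept:$KEPT scans: $SCANS"
```

With this sample data only the 2-day-old episode is kept, and the list is scanned once rather than once per episode.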
miggy
Posts: 8
Joined: 09 Jun 2016, 19:57

Re: [AMC] --filter "age < 7 || !model.any{ it.age < 7 }" is extremely slow for shows with a large number of episodes

Post by miggy »

Thanks for the quick response!

As bad luck would have it, another show, "What if...?", was also having issues: a show named "What if" is currently airing as well, so this method was not working well for it.

I switched to the exclude list approach, which is working great.

Thanks again!