Extremely slow matching performance after Filebot and Java upgrade

wiifm
Posts: 4
Joined: 22 Mar 2020, 00:14

Extremely slow matching performance after Filebot and Java upgrade

Post by wiifm »

Today I updated filebot to 4.9.0 on the Synology 918+. I had to remove Java 8, and then install Java through the recommended installer (now version 14).

I primarily use filebot with the fn:amc script:

filebot -script fn:amc \
  --log fine \
  --output "/volume1/TV" \
  --conflict skip \
  --action copy \
  -non-strict "/volume1/Downloads/Synology" \
  --log-file amc.log \
  --def excludeList=amc.txt music=n \
  seriesFormat="/volume1/{vf == /2160p/ ? 'TV4K' : 'TV'}/{n}/{s00e00} - {t}" \
  movieFormat="/volume1/{vf == /2160p/ ? 'Movies4K' : 'Movies'}/{n} [{y}] {vf} {af}" \
  subtitles="en"
Now, processing new files takes a really long time. I used to be able to delete the amc.txt file and reprocess all downloads relatively quickly (say 30 minutes). Now it takes 1.5 minutes just to check a single file:

wiifm@tank:~$ time ./downloadsToTv.sh
Run script [fn:amc] at [Sun Mar 22 13:29:27 NZDT 2020]
Use excludes: /volume1/TV/amc.txt (1192)
Input: /volume1/Downloads/Synology/Star.Trek.Picard.S01E09.iNTERNAL.1080p.WEB.x264-BAMBOOZLE/star.trek.picard.s01e09.internal.1080p.web.x264-bamboozle.mkv
CmdlineException: OpenSubtitles: Please enter your login details by calling `filebot -script fn:configure`
Rename episodes using [TheTVDB] with [Airdate Order]
Auto-detected query: [Star Trek Picard]
Fetching episode data for [Star Trek: Picard]
Fetching episode data for [Star Trek]
Fetching episode data for [Star Trek: Voyager]
Fetching episode data for [Star Trek: Enterprise]
Fetching episode data for [Star Trek: Deep Space Nine]
Stripping invalid characters from new path: /volume1/TV/Star Trek: Picard/S01E09 - Et in Arcadia Ego (1)
Processed 0 files
CmdlineException: Failed to process [/volume1/Downloads/Synology/Star.Trek.Picard.S01E09.iNTERNAL.1080p.WEB.x264-BAMBOOZLE/star.trek.picard.s01e09.internal.1080p.web.x264-bamboozle.mkv] because [/volume1/TV/Star Trek Picard/S01E09 - Et in Arcadia Ego (1).mkv] is an exact copy and already exists [Last-Modified: Sat Mar 21 20:27:13 NZDT 2020]
Finished without processing any files

real	0m59.788s
user	1m31.224s
sys	0m5.929s
What I suspect is happening is that the two files are being compared with a hash or something similar, to ensure they are exactly the same. In my case, I only care that the file name is the same (and thus the file can be skipped). Can we speed up this comparison?

System information

wiifm@tank:~$ filebot -script fn:sysinfo
FileBot 4.9.0 (r7234)
JNA Native: 6.1.0
MediaInfo: 19.09
7-Zip-JBinding: 9.20
Chromaprint: java.io.IOException: Cannot run program "fpcalc": error=2, No such file or directory
Extended Attributes: OK
Unicode Filesystem: OK
Script Bundle: 2020-03-16 (r625)
Groovy: 3.0.2
JRE: OpenJDK Runtime Environment 14
JVM: 64-bit OpenJDK 64-Bit Server VM
CPU/MEM: 4 Core / 4.2 GB Max Memory / 50 MB Used Memory
OS: Linux (amd64)
HW: Linux tank 4.4.59+ #24922 SMP PREEMPT Mon Aug 19 12:13:37 CST 2019 x86_64 GNU/Linux synology_apollolake_918+
STORAGE: ext4 [/] @ 1.3 GB | btrfs [/volume1] @ 16 TB | btrfs [/volume1/@docker] @ 16 TB | btrfs [/volume1/@docker/btrfs] @ 16 TB
DATA: /volume1/@appstore/filebot/data/wiifm
Package: SPK
License: FileBot License P10597872 (Valid-Until: 2020-12-11)
Done ヾ(@⌒ー⌒@)ノ
rednoah
The Source
Posts: 22998
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Extremely slow matching performance after Filebot and Java upgrade

Post by rednoah »

This error indicates that something is going awry in your automated setup, because you somehow end up processing the same file to the same location more than once:

Failed to process [...] because [...] is an exact copy and already exists [Last-Modified: Sat Mar 21 20:27:13 NZDT 2020]
:idea: Note that FileBot 4.8.5 has the same check, so nothing new there. It prevents you from accidentally processing files in an infinite loop. In this exceptional case, it does indeed compare file contents, but this would never happen during normal usage, as you'd only ever process new files.
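
To give a rough idea of why comparing file contents is expensive for large video files, here is a minimal sketch of a size-then-content comparison. This is not FileBot's actual implementation; the function name is_exact_copy is hypothetical and only illustrates the general technique.

```shell
#!/bin/sh
# Hypothetical helper, not FileBot code: succeed only if both paths are
# regular files with identical contents.
is_exact_copy() {
  src="$1"
  dst="$2"
  # Both paths must be regular files.
  [ -f "$src" ] && [ -f "$dst" ] || return 1
  # Cheap check first: files of different size cannot be identical.
  [ $(wc -c < "$src") -eq $(wc -c < "$dst") ] || return 1
  # Expensive check: cmp reads both files until the first difference,
  # so two identical multi-gigabyte files are read in full.
  cmp -s "$src" "$dst"
}
```

Usage would look like `is_exact_copy input.mkv output.mkv && echo "exact copy"`. The size check is nearly free, while the content check has to read every byte of both files when they really are identical.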


:arrow: Presumably, this particular error is caused by deleting amc.txt without also deleting the already processed files that were listed in it, so you accidentally end up processing the exact same file in exactly the same way, which triggers the sanity check above.


:idea: If it was slow before deleting amc.txt, then we're now looking at a different issue that was caused by you trying to work around the first issue. If you can reproduce the original issue again, then the Developer Options may be able to help you narrow down where things stall.


:idea: A large amc.txt wouldn't slow down FileBot. The amc.txt exclude list size probably doesn't really matter unless we're talking about 10+ million lines.




EDIT:

On a more general note, the new -no-probe option will disable libmediainfo and ffprobe for most aspects of FileBot, which can significantly speed up processing of remote network shares, but it likely won't make much of a difference for local files, and definitely has no effect on the "exact copy and already exists" issue discussed above.
:idea: Please read the FAQ and How to Request Help.

Re: Extremely slow matching performance after Filebot and Java upgrade

Post by wiifm »

Hey @Rednoah,

Yeah, I noticed that a particular match was wrong when I last ran the script, so I killed the execution of it entirely. This however had the unfortunate side effect of adding every file to the amc.txt even though it was not actually processed.

In the past, I would just remove the amc.txt file, and re-process every download. With the older version, this would take some time, but it would complete. Now with 4.9.0 it takes an absolute age, and effectively makes this not an option.
rednoah wrote: it does indeed compare file contents
Can this be disabled? The comparison could instead be a pure filename match (or a file size match, to the byte).

rednoah wrote: A large amc.txt wouldn't slow down FileBot.
Good to know this is not the issue.

Re: Extremely slow matching performance after Filebot and Java upgrade

Post by rednoah »

1.
wiifm wrote: 22 Mar 2020, 08:32 In the past, I would just remove the amc.txt file, and re-process every download. With the older version, this would take some time, but it would complete. Now with 4.9.0 it takes an absolute age, and effectively makes this not an option.
Yes, this is the idea. Processing all files for effectively no reason is not nice.

:idea: The correct way to go about it would have been to modify amc.txt and just remove the last few lines, so you can process these files again.
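
As a minimal sketch of that approach, the helper below drops the last N lines from an exclude list and keeps a backup copy. The function name trim_exclude_list and the .bak suffix are hypothetical choices, and it assumes the list holds one entry per line with the most recent entries at the end.

```shell
#!/bin/sh
# Hypothetical helper: remove the last N lines from an exclude list so
# that those files are picked up again on the next run.
trim_exclude_list() {
  list="$1"
  count="$2"
  total=$(wc -l < "$list")
  keep=$((total - count))
  # Never ask head for a negative number of lines.
  [ "$keep" -lt 0 ] && keep=0
  # Keep a backup, then rewrite the list from the backup.
  cp "$list" "$list.bak"
  head -n "$keep" "$list.bak" > "$list"
}
```

For example, `trim_exclude_list /volume1/TV/amc.txt 5` would forget the five most recent entries while leaving the rest of the list untouched.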


2.
There are many ways to go about it after the fact.

I'll just list a few that come to mind:
* Generate a new exclude list based on the FileBot history, i.e. consisting of all files that have previously been processed, e.g. filebot -script fn:history | cut -f1
* Generate a new exclude list using the find command, i.e. consisting of all files in your input folder, and then manually remove entries that haven't actually been processed yet
* Use the duplicate script to delete all physical duplicates from either input or output folder
* Process files into a different output folder (this is not nice because you're just tricking FileBot into letting you process all your files again)
* Update the Last-Modified date on all your files in either input or output folder (this is not nice because you're just tricking FileBot into letting you process all your files again)
* Consider using a completely different approach, such as using --file-filter to select only files that haven't been processed yet, e.g. --file-filter "age < 1" will only process recently modified files, neatly excluding all the files that have been there for a long time.
* ...
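
The find-based option from the list above could be sketched roughly as follows. The function name build_exclude_list is hypothetical, and the sketch assumes the exclude list is a plain text file with one file path per line; entries for files that have not actually been processed still need to be removed by hand afterwards.

```shell
#!/bin/sh
# Hypothetical helper: write one path per line for every regular file
# below the input folder, sorted for easier manual review.
build_exclude_list() {
  input_dir="$1"
  list="$2"
  find "$input_dir" -type f | sort > "$list"
}
```

Called as `build_exclude_list /volume1/Downloads/Synology /volume1/TV/amc.txt`, this would list every file currently in the input folder. Passing an absolute input folder yields absolute paths in the list.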
Cloaky
Posts: 9
Joined: 23 Feb 2015, 13:58

Re: Extremely slow matching performance after Filebot and Java upgrade

Post by Cloaky »

Hey people,

I get the feeling that my Synology NAS is also running slower after the update. I had to follow a similar path: remove my old Java and install the Java Installer after updating to FileBot 4.9.0.

Re: Extremely slow matching performance after Filebot and Java upgrade

Post by rednoah »

Cloaky wrote: 22 Mar 2020, 14:47 I got the feeling that my Synology NAS is also running slower after the update. I had to follow a similar path and remove my old Java and install Java Installer after updating to File Bot 4.9.0.
You can use the portable packages to run side-by-side comparisons. Newer versions may very well require more CPU and RAM when they run more complex code to yield better results. Depending on the situation and the device you're running on, a newer version might be faster or slower.
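
A side-by-side comparison could be timed with a small wrapper along these lines. This is only a sketch: the function name time_run is hypothetical, and so are the install paths in the usage note.

```shell
#!/bin/sh
# Hypothetical helper: run a command, discard its output, and print the
# elapsed wall-clock time in whole seconds.
time_run() {
  start=$(date +%s)
  "$@" > /dev/null 2>&1
  end=$(date +%s)
  echo $((end - start))
}
```

For example, running `time_run /path/to/old/filebot.sh -script fn:sysinfo` and then `time_run /path/to/new/filebot.sh -script fn:sysinfo` (both paths are placeholders) gives two numbers that can be compared directly, ideally using the same input files for the real test.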


:idea: Please read How to Request Help and always include filebot -script fn:sysinfo output. ;)


TL;DR I'm happy to help you look into why it's slow for you, but it's going to take time and effort on your part too. ;)

Re: Extremely slow matching performance after Filebot and Java upgrade

Post by rednoah »

FileBot r7265 adds the -no-index option which greatly reduces CPU and RAM usage, at the expense of some higher level auto-detection features.