rednoah wrote: ↑07 Jun 2020, 04:27
If you're looking into xattr, then you're already on the wrong path. You'll want to look for binary duplicates (same bytes) and not logical duplicates (same movie id as per xattr metadata) and remove them from the input folder beforehand.
Well, I also want to deal with logical duplicates, but your post did make me check, and it turns out that I only had 4.8.5, so I upgraded to 4.9.1 and tried again.
Code: Select all
/opt/filebot/filebot.sh -script fn:duplicates --mode binary /media/incoming /media/en/serials -rename --output /media/exactdupes --format "{n}{' ('+y+')'}/{episode.special ? 'Specials' : 'Season '+s.pad(2)}/{n} - {episode.special ? 'S00E'+special.pad(2) : s00e00} - {t.replaceAll(/[\`´‘’ʻ]/, /'/).replaceAll(/[!?.]+$/)}{' '+source}{' '+vf}{' '+vc}{' '+ac}{'.'+lang}"
/media/incoming had exactly one file, which I copied from my media library to make sure it was an exact duplicate.
I started FileBot and then walked away for a bit and came back, and it was still running. So I re-ran it with 'truss' (like 'strace') to see what it was doing. FileBot was going through each media file in my library, checking for an extended attribute named "CRC32" (won't work because I'm using ZFS), then opening and reading the file (to generate a checksum?), then trying and failing to set that same "CRC32" extended attribute.
Code: Select all
30658: extattr_get_file("/media/en/serials/TVSHOW (2016)/Season 2/TVSHOW (2016) - s02e01 - EPISODETITLE 1080p WEB-DL AVC.mkv",EXTATTR_NAMESPACE_USER,"CRC32",0x0,0) ERR#87 'Attribute not found'
30607: openat(AT_FDCWD,"/media/en/serials/TVSHOW (2016)/Season 2/TVSHOW (2016) - s02e01 - EPISODETITLE 1080p WEB-DL AVC.mkv",O_RDONLY,00) = 76 (0x4c)
30672: extattr_set_file("/media/en/serials/TVSHOW (2016)/Season 2/TVSHOW (2016) - s02e01 - EPISODETITLE 1080p WEB-DL AVC.mkv",EXTATTR_NAMESPACE_USER,"CRC32","165031EE",8) = 8 (0x8)
My library has 71 TB across 81,544 files. It would take days to checksum my library. Worse, without extended attributes, it would take days every time I ran this script!
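As a rough sanity check on "days", here is the arithmetic as a small sketch. The 200 MB/s sustained read rate is purely an assumption for illustration (I have not benchmarked it), and the function name is made up:

```python
def checksum_days(total_bytes: float, read_mb_per_s: float) -> float:
    """Estimate how many days it takes to read every byte once."""
    seconds = total_bytes / (read_mb_per_s * 1_000_000)
    return seconds / 86_400  # seconds per day


LIBRARY_BYTES = 71e12    # 71 TB (decimal), from the library size above
ASSUMED_MB_PER_S = 200   # ASSUMPTION: sustained sequential read rate

print(round(checksum_days(LIBRARY_BYTES, ASSUMED_MB_PER_S), 1))  # prints 4.1
```

Even at a rate several times higher than that assumed figure, a full pass still takes the better part of a day.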
I put "net.filebot.xattr.store=.xattr" back in my system.properties, and sure enough it started creating .xattr directories to store this CRC32 extended attribute:
Code: Select all
31046: openat(AT_FDCWD,"/media/en/serials/TVSHOW (2016)/Season 2/.xattr/TVSHOW (2016) - s02e01 - EPISODETITLE 1080p WEB-DL AVC.mkv/CRC32",O_RDONLY,00) ERR#2 'No such file or directory'
31046: stat("/media/en/serials/TVSHOW (2016)/Season 2/.xattr/TVSHOW (2016) - s02e01 - EPISODETITLE 1080p WEB-DL AVC.mkv",0x7fffdfff8e58) ERR#2 'No such file or directory'
31046: mkdir("/media/en/serials/TVSHOW (2016)/Season 2/.xattr/TVSHOW (2016) - s02e01 - EPISODETITLE 1080p WEB-DL AVC.mkv",0777) ERR#2 'No such file or directory'
31046: access("/media/en/serials/TVSHOW (2016)/Season 2/.xattr",F_OK) ERR#2 'No such file or directory'
31046: mkdir("/media/en/serials/TVSHOW (2016)/Season 2/.xattr",0777) = 0 (0x0)
31046: mkdir("/media/en/serials/TVSHOW (2016)/Season 2/.xattr/TVSHOW (2016) - s02e01 - EPISODETITLE 1080p WEB-DL AVC.mkv",0777) = 0 (0x0)
31046: openat(AT_FDCWD,"/media/en/serials/TVSHOW (2016)/Season 2/.xattr/TVSHOW (2016) - s02e01 - EPISODETITLE 1080p WEB-DL AVC.mkv/CRC32",O_WRONLY|O_CREAT|O_TRUNC,0666) = 76 (0x4c)
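To make the pattern in that trace concrete, here is a minimal sketch of a checksum cache that uses the same on-disk layout (`<folder>/.xattr/<filename>/CRC32`). This is only my reading of the syscalls, not FileBot's actual implementation; the function names are hypothetical, and the sketch does not invalidate the cached value when the file changes:

```python
import zlib
from pathlib import Path


def crc32_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file and return its CRC32 as 8 uppercase hex digits."""
    crc = 0
    with path.open("rb") as fh:
        while chunk := fh.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return f"{crc:08X}"


def sidecar_path(path: Path, name: str = "CRC32") -> Path:
    """Layout seen in the trace: <folder>/.xattr/<filename>/<attribute>."""
    return path.parent / ".xattr" / path.name / name


def cached_crc32(path: Path) -> str:
    """Return the stored checksum if present; otherwise compute and store it."""
    side = sidecar_path(path)
    try:
        return side.read_text().strip()
    except FileNotFoundError:
        pass  # no cached value yet, fall through and compute it
    value = crc32_of(path)
    side.parent.mkdir(parents=True, exist_ok=True)
    side.write_text(value)
    return value
```

With a store like this, the expensive read happens once per file; later runs only open the small sidecar file.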
To me it looks like the only sane way to run this script would be to enable extended attributes, so it can store the "CRC32" attribute.
Anyways, I don't want to checksum my whole library, nor do I want to check for duplicates within my library. I purposefully hard-link files when the file has multiple languages, which would look like duplicates.
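For what it's worth, hard links can be told apart from real copies without reading any file contents, because every link to the same file reports the same device and inode number. A small sketch of that idea (the function name is hypothetical, and this is not something the duplicates script is known to do):

```python
import os
from collections import defaultdict


def group_hard_links(paths):
    """Group paths that are hard links to the same underlying file."""
    groups = defaultdict(list)
    for p in paths:
        st = os.stat(p)
        # Same (device, inode) pair means same file, not a separate copy.
        groups[(st.st_dev, st.st_ino)].append(p)
    return [g for g in groups.values() if len(g) > 1]
```

A byte-for-byte comparison would report such files as duplicates even though they occupy the disk only once.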
I just want to check the destination filepath. If FileBot knows that '/media/incoming/TVSHOW - S31E22 - EPISODETITLE 1080p x264 AAC.mkv' is going to end up as '/media/en/serials/TVSHOW (1989)/Season 31/TVSHOW - S31E22 - EPISODETITLE 1080p x264 AAC.mkv' based on the --format parameter, then there's no need to check any other file.
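The check I have in mind could be sketched roughly like this. It assumes the destination path has already been computed from the --format expression somehow; the function name and return values are made up for illustration and are not a FileBot API:

```python
import filecmp
from pathlib import Path


def check_destination(source: Path, destination: Path) -> str:
    """Compare one incoming file against its planned destination only.

    Returns "free" if nothing exists at the destination, "identical" if
    the existing file has the same bytes, and "different" otherwise.
    """
    if not destination.exists():
        return "free"
    # Cheap test first: files of different sizes cannot be identical.
    if source.stat().st_size != destination.stat().st_size:
        return "different"
    # shallow=False forces a byte-by-byte comparison of the two files.
    same = filecmp.cmp(source, destination, shallow=False)
    return "identical" if same else "different"
```

This reads at most two files per incoming file, instead of checksumming all 81,544 files in the library.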
Do FileBot scripts have access to the destination file path without actually performing the rename?