Bug in the cleaner Script

Automator
Posts: 36
Joined: 30 Jul 2015, 10:17

Post by Automator »

Hi there,

I think that since the patch for r5256, the cleaner script no longer searches through the root folder you give it.

Code: Select all

[bliblablup@whatever]$ ls -l Spider.Man.Homecoming.2017.German.DL.1080p.BluRay.x264-ENCOUNTERS/
total 7064612
-rwxr-xr-x. 1 jd jd        179 Nov 10 23:15 #1 Uploaded Premium Account.URL
-rwxr-xr-x. 1 jd jd        229 Oct 13  2016 #2 Share-Online Premium Account.URL
-rwxr-xr-x. 1 jd jd        189 Oct 13  2016 #3 Oboom Premium Account.URL
-rwxr-xr-x. 1 jd jd        210 Jul 18  2017 #4 Rapidgator Premium Account.URL
-rwxr-xr-x. 1 jd jd        451 Oct 13  2016 BITTE LESEN WICHTIG.txt
-rwxr-xr-x. 1 jd jd 7234125641 Nov 20 20:57 encounters-spmaho_1080p.mkv
-rwxr-xr-x. 1 jd jd      14195 Nov 21 07:02 encounters-spmaho_1080p.nfo
drwxr-xr-x. 2 jd jd          0 Jan 24 19:44 Subs

Code: Select all

[bliblablup@whatever]$ filebot -script fn:cleaner Spider.Man.Homecoming.2017.German.DL.1080p.BluRay.x264-ENCOUNTERS/
Delete /mystrucure/Spider.Man.Homecoming.2017.German.DL.1080p.BluRay.x264-ENCOUNTERS/Subs/encounters-spmaho_1080p_subs.sfv
Delete /mystrucure/Spider.Man.Homecoming.2017.German.DL.1080p.BluRay.x264-ENCOUNTERS/Subs
Done ヾ(@⌒ー⌒@)ノ

The script missed the URL files completely...

Please fix...

Thanks
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Post by rednoah »

1.
Did you make up the logs?

This command could not and would not yield the output you posted, since you're not passing in any input folder as argument:

Code: Select all

filebot -script fn:cleaner

Did you mean this?

Code: Select all

filebot -script fn:cleaner /path/to/input/folder

2.
Looking at the code, the extension is checked in a case-sensitive manner, so the script matches "url" but not "URL", and nothing has ever changed in this regard.

:idea: Did you test with *.url files when checking older revisions, but with *.URL files when checking newer revisions? It doesn't make sense to me that the FileBot revision would make a difference here.
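
To illustrate the case-sensitivity point: the script compares the extension against the pattern with Groovy's `==~`, which requires the whole string to match and is case-sensitive unless the pattern says otherwise. The following is a minimal Python sketch of that behavior, not the script itself; the shortened pattern and the `matches_ext` helper are made up for illustration.

```python
import re

# Shortened, hypothetical stand-in for the script's default extension pattern.
EXTS = r"nfo|sfv|txt|url"

def matches_ext(extension, pattern=EXTS, ignore_case=False):
    """Full-string match, similar in spirit to Groovy's `extension ==~ pattern`."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(pattern, extension, flags) is not None

# Prints each extension with its case-sensitive and case-insensitive result.
for ext in ["url", "URL", "txt"]:
    print(ext, matches_ext(ext), matches_ext(ext, ignore_case=True))
```

In this sketch, "url" matches and "URL" does not unless case is ignored, so covering upper-case extensions would mean either listing them in the pattern or making the match case-insensitive.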
:idea: Please read the FAQ and How to Request Help.

Post by Automator »

rednoah wrote: 24 Jan 2018, 20:42
1.
Did you make up the logs?

This command could not and would not yield the output you posted, since you're not passing in any input folder as argument:

Yes, and I helped the USA stage the moon landing...

[Image]

As you can see, if you make the browser window large enough, it is all on one line.
rednoah wrote: 24 Jan 2018, 20:42
2.
Looking at the code, the extension is checked in a case-sensitive manner, so the script matches "url" but not "URL", and nothing has ever changed in this regard.

Did you test with *.url files when checking older revisions, but with *.URL files when checking newer revisions? It doesn't make sense to me that the FileBot revision would make a difference here.

Well, as much as I like to say this, you are wrong again! If you had looked at the two outputs, or at least given it more than one second of thought, you would have seen the .txt file. "BITTE LESEN WICHTIG" is even written in capital letters, but the extension .txt is lowercase. As you can see in the output of your script, it does not consider that file. But to prove you even more wrong, once again: I quickly downloaded your script into my folder, changed url to URL, and ran it again, with the shocking result: IT IS NOT WORKING!

Code: Select all

$ cat cleaner.groovy
#!/usr/bin/env filebot -script


def deleteRootFolder = any{ root.toBoolean() }{ false }

def ignore  = any{ ignore }{ /extrathumbs/ }
def exts    = any{ exts }{ /jpg|jpeg|png|gif|ico|nfo|info|xml|htm|html|log|m3u|cue|srt|sub|idx|smi|sup|md5|sfv|txt|rtf|URL|url|db|dna|log|tgmd|json|data|ignore|srv|srr|nzb|vbs|ini|vsmeta/ }
def terms   = any{ terms }{ /sample|trailer|extras|deleted.scenes|music.video|scrapbook|DS_Store/ }
def minsize = any{ minsize.toLong() }{ 20 * 1024 * 1024 }
def maxsize = any{ maxsize.toLong() }{ 100 * 1024 * 1024 }


def testRun = _args.action.equalsIgnoreCase('test')


/*
 * Delete orphaned "clutter" files like nfo, jpg, etc and sample files
 */
def isClutter = { f ->
        // whitelist
        if (f.path.findMatch(ignore))
                return false

        // file is either too small to have meaning, or too large to be considered clutter
        def fsize = f.length()

        // path contains blacklisted terms or extension is blacklisted
        if (f.extension ==~ exts && fsize < maxsize)
                return true

        if (f.path.findMatch(/\b(/ + terms + /)\b/) && fsize < maxsize)
                return true

        // NOTE: some smb filesystem implementations are buggy and known to incorrectly return filesize 0 for all files
        if (f.isVideo() && fsize < minsize && fsize > 0)
                return true

        return false
}


def clean = { f ->
        log.info "Delete $f"

        // do a dry run via --action test
        if (testRun) {
                return false
        }

        return f.isDirectory() ? f.deleteDir() : f.delete()
}


// memoize media folder status for performance
def hasMediaFiles = { dir -> dir.isDirectory() && dir.getFiles().find{ (it.isVideo() || it.isAudio()) && !isClutter(it) } }.memoize()

// delete clutter files in orphaned media folders
args.getFiles{ isClutter(it) && !hasMediaFiles(it.dir) }.each{ clean(it) }

// delete empty folders but exclude given args
args.getFolders().sort().reverse().each{
        if (it.isDirectory() && !it.hasFile{ it.isDirectory() || !isClutter(it) }) {
                if (deleteRootFolder || !args.contains(it))
                        clean(it)
        }
}

Result:

Code: Select all

filebot -script cleaner.groovy Spider.Man.Homecoming.2017.German.DL.1080p.BluRay.x264-ENCOUNTERS/ --action test
Delete /whatever/Spider.Man.Homecoming.2017.German.DL.1080p.BluRay.x264-ENCOUNTERS/Subs/encounters-spmaho_1080p_subs.sfv
Delete /whatever/Spider.Man.Homecoming.2017.German.DL.1080p.BluRay.x264-ENCOUNTERS/Subs
Done ヾ(@⌒ー⌒@)ノ

But hey! Why not just listen to users when they tell you that your software or script is broken, instead of trying to make them look foolish?! That kind of spoils the point of support, and if you are proven wrong, it makes you look even worse!

Post by rednoah »

1.
Mea culpa. I guess I shouldn't read the forums when I'm on mobile. Oops. Sorry. :oops:

:idea: Recent changes could not possibly (read: extremely unlikely) have broken the cleaner script (which hasn't significantly changed in years), so strong proof was necessary to eliminate human error. I shall ask for clarification more politely next time things don't quite add up.


2.
rednoah wrote: 21 Jul 2012, 09:50
Delete clutter files like artwork, samples, etc. in folders that have been left over after moving video files.

Looking at the OP using a larger screen, your ls command reveals a large mkv file. By design, the cleaner script doesn't touch folders with large files in them, to avoid accidentally deleting something important.

:idea: Still pretty sure the FileBot revision has no effect on that part though, but you're welcome to prove me wrong yet again. :lol:
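
The log in the first post is consistent with this design. As a rough Python sketch (not the Groovy script itself; the helper names, the reduced extension sets, and the in-memory file list are assumptions for illustration, and the size thresholds are ignored), a clutter file is only selected when its own folder holds no media file:

```python
from pathlib import PurePosixPath

# Hypothetical, reduced extension sets for illustration only.
CLUTTER_EXTS = {"nfo", "sfv", "txt", "url"}
MEDIA_EXTS = {"mkv", "mp4", "avi"}

def ext(path):
    """Return the extension without the leading dot."""
    return PurePosixPath(path).suffix.lstrip(".")

def select_for_deletion(paths):
    """Return clutter files whose parent folder contains no media file."""
    paths = [PurePosixPath(p) for p in paths]
    media_dirs = {p.parent for p in paths if ext(p) in MEDIA_EXTS}
    return [str(p) for p in paths
            if ext(p) in CLUTTER_EXTS and p.parent not in media_dirs]

files = [
    "movie/BITTE LESEN WICHTIG.txt",
    "movie/encounters-spmaho_1080p.mkv",
    "movie/encounters-spmaho_1080p.nfo",
    "movie/Subs/encounters-spmaho_1080p_subs.sfv",
]
print(select_for_deletion(files))
# prints ['movie/Subs/encounters-spmaho_1080p_subs.sfv']
```

Only the file in Subs is selected, because the top-level folder still holds the mkv. That mirrors the two "Delete" lines in the original log, where Subs itself is removed once it is empty.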

Post by Automator »

rednoah wrote: 24 Jan 2018, 22:21
1.
Mea culpa. I guess I shouldn't read the forums when I'm on mobile. Oops. Sorry. :oops:

:idea: Recent changes could not possibly (read: extremely unlikely) have broken the cleaner script (which hasn't significantly changed in years), so strong proof was necessary to eliminate human error. I shall ask for clarification more politely next time things don't quite add up.

The problem is, it's the second time. It already happened with the "UnsupportedOperationException with the Rename function" bug, which was reported half a year before it was finally fixed, after I asked again and again... I understand that you cannot help everyone or take everyone seriously, and since this is the internet, you get tons and tons of morons who probably don't even have basic computer knowledge. But you could at least give someone the benefit of the doubt.

rednoah wrote: 24 Jan 2018, 22:21
Looking at the OP using a larger screen, your ls command reveals a large mkv file. By design, the cleaner script doesn't touch folders with large files in them, to avoid accidentally deleting something important.

Still pretty sure the FileBot revision has no effect on that part though, but you're welcome to prove me wrong yet again.

Yes, in this case it's not about the revision or whatever changed. It's more that the intention of the script is kind of questionable: there is tons and tons of trash lying around your .mkv, .avi, .mp4 or whatever files, and it's called a cleaner script. Why should it not touch exactly those files that sit in the folder where the video files are? You have put enough safeties in place: filter by extension, filter by size, filter by name.

Which line would I need to change in your script? Your code is not that easy to read for someone who has never worked with Java / Groovy before, but I have actually made good progress so far. I know you are a great coder (your code is very well written, the product is solid, and I saw your JCConf 2016 video on YouTube), and I know it's your opinion that people should learn it themselves, so you only give hints and never the solution ;-) But sometimes, only sometimes, it would make life a lot easier :-P And yes, I learned a lot about Java and Groovy over the last few days, thanks to your, to be honest, mostly useless answers :-D

Last but not least, going off topic once again.

rednoah wrote: 24 Jan 2018, 09:03 You can't use the lookup result if the lookup result hasn't been stored in xattr.

:!: Please don't abuse TheMovieDB by doing online lookup for duplicate detection repeatedly for all your files.

You can! I did :-) And again I have proven you wrong :-P And I repeat once more: I don't abuse TheMovieDB for duplicate detection. I use it only to look up one new movie and generate the folder name, and then use that result to go through all my folders for duplicate detection, without TheMovieDB, using only folder-name-to-folder-name matching.

Post by rednoah »

1.
Well, it looked really wrong on my small phone screen... Never considered the line break, just stopped reading the log after the first line that seemed to be the entire command... :oops:


2.
The cleaner script is doing the more difficult thing, hence the convoluted, hard-to-understand code. If you just want to delete files by extension, then there are probably really good and much easier-to-use tools for that (e.g. the find -delete command on Linux).
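
For the plain delete-by-extension case, a small standalone script is enough. Below is a Python sketch of that idea (an alternative to find, not part of FileBot); the function name, the default extension set, and the dry-run default are assumptions chosen for illustration. The comparison is case-insensitive on purpose, so .URL and .url are treated alike.

```python
import tempfile
from pathlib import Path

def delete_by_extension(root, extensions=frozenset({"url", "txt", "nfo", "sfv"}), dry_run=True):
    """Find files under root with a listed extension; delete them unless dry_run is set."""
    matched = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lstrip(".").lower() in extensions:
            matched.append(path)
            if not dry_run:
                path.unlink()
    return matched

# Demo on a throwaway directory, so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "#1 Uploaded Premium Account.URL").write_text("x")
    (Path(tmp) / "movie.mkv").write_text("x")
    for path in delete_by_extension(tmp):
        print("Would delete", path.name)
```

Running it with the default dry_run=True first, and only then with dry_run=False, plays the same role as --action test does for the cleaner script.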

But since it's the FileBot forums, here goes...

Replace:

Code: Select all

def hasMediaFiles = { dir -> dir.isDirectory() && dir.getFiles().find{ (it.isVideo() || it.isAudio()) && !isClutter(it) } }.memoize()

With:

Code: Select all

def hasMediaFiles = { dir -> false }

Untested, but it should trick the rest of the code into thinking that no folder ever contains any media files, so that it deletes the extra clutter files (better test it carefully first, just to make sure it doesn't accidentally also delete media files :P).
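
As a sanity check on what this change would and would not delete, here is a rough Python rendering of the extension and size checks in isClutter, using the default thresholds from the script posted above (20 MiB minimum for videos, 100 MiB maximum for clutter). It is a sketch under assumptions, not the script: the ignore and terms checks are left out, the extension pattern is shortened, and the list of video extensions is hypothetical.

```python
import re

MIN_SIZE = 20 * 1024 * 1024     # videos smaller than this are treated as samples
MAX_SIZE = 100 * 1024 * 1024    # files of this size or larger are not clutter by extension
EXTS = r"nfo|sfv|txt|url"       # shortened stand-in for the script's default pattern
VIDEO_EXTS = {"mkv", "mp4", "avi"}  # hypothetical; FileBot has its own media type detection

def is_clutter(name, size, exts=EXTS):
    """Approximate the extension and size checks of isClutter (no ignore/terms checks)."""
    extension = name.rsplit(".", 1)[-1] if "." in name else ""
    if re.fullmatch(exts, extension) and size < MAX_SIZE:
        return True
    # A reported size of 0 is skipped, mirroring the script's note about buggy SMB shares.
    if extension in VIDEO_EXTS and 0 < size < MIN_SIZE:
        return True
    return False

listing = [
    ("#1 Uploaded Premium Account.URL", 179),
    ("BITTE LESEN WICHTIG.txt", 451),
    ("encounters-spmaho_1080p.mkv", 7234125641),
    ("encounters-spmaho_1080p.nfo", 14195),
]
# Second column: default pattern; third column: pattern with URL added.
for name, size in listing:
    print(name, is_clutter(name, size), is_clutter(name, size, exts=EXTS + "|URL"))
```

Under these assumptions, the 7 GB mkv is never classified as clutter, while the .URL files are only caught once URL is part of the pattern, as in the modified script above.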


3.
That's fine. Duplicate detection based on folder names is perfectly fine, and fairly instant. As long as the online lookup happens once, and not every day or week, then that's fine.


4.
I wonder if anybody ever implemented anything like the FileBot Format Engine for in-house products. JCConf is a blast from the past. Didn't think that'd ever come back to haunt me. How did you even find that again? :D

Post by Automator »

1. ;-)

2. Thank you very much, and I mean that. I had so much trouble reading your code, since you combine && and ! with functions and I don't have a good debugger to see what you are passing where. So I put printlns everywhere to get the hang of what you are doing, but I could not figure it out 100%.

3. Works very fast and does what it should :-)

4. The internet never forgets! :-)

Post by rednoah »

I have no illusions about this script being totally unreadable. Unfortunately, I couldn't think of a more readable solution that takes care of this oddly specific use case of maybe deleting files. :D


EDIT: Funny story...
Filipe Fortes wrote: Debugging is like being the detective in a crime movie where you are also the murderer.

The "UnsupportedOperationException" bug was a really weird one though, indirectly somehow caused by myself, through changes in completely unrelated code, and masquerading as another already solved issue. The crux was that it only happened on folders, which is why I could never reproduce it and why nobody else noticed. It's actually one of my favorite bugs. I wouldn't be able to guess how that particular commit actually fixes the problem without having debugged it for 3-4 hours beforehand. :lol: