[Bug] Pattern matching worse since 4.7

Hemloc
Posts: 25
Joined: 03 Dec 2016, 11:14

Post by Hemloc »

Hi there,

Since upgrading to 4.7 I have found the pattern matching actually worse than before. So much so that I have gone back to 4.6.1 which does not have the same problems. There are several issues I have seen with the new pattern matching :

1. For some reason FileBot 4.7.5 cannot identify a series from a scene standard filename, e.g. Humans.S02E05.1080p.AMZN.WEBRip.DD2.0.x264-KiNGS_0.mkv. There is only one series on TheTVDB with this name, and yet FileBot fails to identify it. It shouldn't fail, since the name conforms to scene standards. This works fine in 4.6.1.
2. FileBot should not use the directory for pattern matching if everything is already in the filename. An example is the name above, and yet FileBot always checks the directory name no matter what. It should only do this if it cannot find a series name from the filename.
3. There should be a way of switching off checking directory names completely. This is a new feature, and unwanted by several it seems, not just me. Using a blacklist is a very inefficient design when a simple directory name like "Humans - Season 2" confuses the program. As it is, I use temporary directory names that do not conform to any standard and asking you to add them to a blacklist seems silly. A simple switch would, I am sure, solve the problem for many users.

Regards
Andrew

PS : And now that I have downgraded, every time I open the program I am asked if I want to upgrade. Why is there no option to switch this off?

C:\Program Files\FileBot>filebot -script fn:sysinfo
FileBot 4.7.5 (r4600)
JNA Native: 4.0.1
MediaInfo: 0.7.88
7-Zip-JBinding: 9.20
Chromaprint: 1.1.0
Extended Attributes: OK
Script Bundle: 2016-11-26 (r465)
Groovy: 2.4.7
JRE: Java(TM) SE Runtime Environment 1.8.0_66
JVM: 32-bit Java HotSpot(TM) Client VM
CPU/MEM: 12 Core / 247 MB Max Memory / 11 MB Used Memory
OS: Windows 7 (x86)
Package: MSI
Data: C:\Users\Hemloc\AppData\Roaming\FileBot
uname: CYGWIN_NT-6.1 Hemloc-PC 1.7.28(0.271/5/3) 2014-02-09 21:06 x86_64 Cygwin
Done ?(?????)?
rednoah
The Source
Posts: 22998
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: [Bug] Pattern matching worse since 4.7

Post by rednoah »

1.
Scene "standards" include names such as TWD and GoT so they're pretty useless. Fortunately, there are people who add these mangled names as aliases to the series record and that's why FileBot can deal with most of them.

I have no idea why this specific example doesn't work, will be doing some debugging. The full path of the file that doesn't work would be helpful.


EDIT:

Works for me. I need the full file path to run additional tests.

$ filebot -rename Humans.S02E05.1080p.AMZN.WEBRip.DD2.0.x264-KiNGS_0.mkv --db TheTVDB -non-strict --action TEST
Rename episodes using [TheTVDB]
Auto-detected query: [Humans]
Fetching episode data for [Humans]
Fetching episode data for [Real Humans]
Fetching episode data for [Humans Mutants]
Fetching episode data for [Humans & Households]
[TEST] Rename [Humans.S02E05.1080p.AMZN.WEBRip.DD2.0.x264-KiNGS_0.mkv] to [Humans - 2x05 - Episode 5.mkv]
Processed 1 files

2.
FileBot is and has always been considering the entire file path (file name, folder name, the folder's parent folder name, etc) so nothing new here.

FileBot doesn't have settings, because it needs to work out of the box. FileBot will not search for folder names that are on the blacklist, so as long as you use folders such as "Complete", "Downloads", etc it'll just work.

If in some rare cases the folder name makes more sense to FileBot than the filename, then you can post issues here in the forums and things will be fixed, possibly by adding entries to the blacklist. Certainly inconvenient for the individual user, but important for making FileBot work out of the box for most people (especially the less capable ones).
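
To illustrate the idea of looking at every path component except blacklisted folder names, here is a minimal Python sketch. It is not FileBot's actual code: the function name `candidate_names` and the three-entry blacklist are assumptions made up for this example (the real blacklist is much larger).

```python
from pathlib import PurePosixPath

# Hypothetical blacklist for illustration only; the real list ships with FileBot.
FOLDER_BLACKLIST = {"complete", "downloads", "temp"}


def candidate_names(path: str) -> list[str]:
    """Return the names worth looking up: file name first, then folders from nearest to farthest."""
    parts = PurePosixPath(path.replace("\\", "/")).parts
    *folders, file_name = parts
    candidates = [PurePosixPath(file_name).stem]
    for folder in reversed(folders):
        is_root = folder == "/" or folder.endswith(":")  # skip "/" and drive letters such as "L:"
        if not is_root and folder.lower() not in FOLDER_BLACKLIST:
            candidates.append(folder)
    return candidates
```

With this sketch, `candidate_names(r"L:\Downloads\Newsleecher\Humans.S02E05.mkv")` keeps "Newsleecher" as a candidate because it is not on the blacklist, while "Downloads" is skipped.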


PS: There is such an option: viewtopic.php?f=8&t=2112
:idea: Please read the FAQ and How to Request Help.

Post by Hemloc »

1. Actually, it is not just that specific example, I had the same problem with Criminal.Minds.S12E07.Mirror.Image.PROPER.1080p.AMZN.WEBRip.DD5.1.x264-VISUM_0.mkv. I didn't bother to try the various other ones I had as it seemed they would all have a similar problem. I didn't give the path because I tried this in "L:\Downloads" and that is a blacklisted directory name so I didn't think it would make a difference.

2. Ok... but then why, when I upgraded from 4.6.1 to 4.7, does it suddenly start complaining that it cannot identify "Newsleecher" for files I have been renaming in the same directory "L:\Downloads\Newsleecher" for over a year? Something must have changed.

3. I still believe there must be some better way of handling "strange" directory names. However, it seems I have three main directory names causing FileBot problems :
3.1. "L:\Downloads\Newsleecher"
3.2. I use a sub-directory called "L:\Downloads\Newsleecher\t" and the "t" also causes problems. I use "t" instead of "temp" for many things like this.
3.3. Finally I name my series names like "Humans - Season 2", and potentially the " - Season 2" is causing problems.
So if you were to add "Newsleecher", "t" and " - Season \d+" to your blacklist, that would probably solve most of my directory problems :)

Aha! My apologies, I never actually checked to see if there was a way of switching off updates... which if you solve the above problems I potentially won't need!

Anyway, many thanks for the time and effort you put into both the program and help on these forums.

Post by rednoah »

I've added "Newsleecher" to the blacklist. 1-2 character terms or numbers have always been blacklisted. I don't think FileBot will consider "t" a valid query or show name (that's also why the TV Series "V" won't work well).

EDIT: "T" is considered a valid TV Show name similar to "V" because there is a show called "T" (actually "T." but . is ignored). Using "temp" or any other blacklisted term as folder name would be better.

FileBot understands patterns like SeriesName - Season [0-9] so that shouldn't be an issue.
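
For readers who want to pre-clean folder names themselves, the " - Season N" convention can be stripped with a short regular expression. This is only a sketch of the pattern discussed above, assuming folders are named "<Series Name> - Season <number>"; the helper name `strip_season_suffix` is hypothetical and this is not how FileBot implements it.

```python
import re

# Assumed folder convention: "<Series Name> - Season <number>".
SEASON_SUFFIX = re.compile(r"\s*-\s*Season\s+\d+\s*$", re.IGNORECASE)


def strip_season_suffix(folder_name: str) -> str:
    """Remove a trailing " - Season N" so only the series name remains."""
    return SEASON_SUFFIX.sub("", folder_name)
```

Folder names without the suffix, such as "Newsleecher", are returned unchanged.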


Search requests for "T" are not ideal, but it does seem to work:

Input: Downloads/Newsleecher/t/Humans - Season 2/Humans.1x01.mp4
Group: [tvs:humans] => [Humans.1x01.mp4]
Rename episodes using [TheTVDB]
Auto-detected query: [Humans, T]
Fetching episode data for [Humans]
Fetching episode data for [Real Humans]
Fetching episode data for [Humans Mutants]
Fetching episode data for [Humans & Households]
Fetching episode data for [T.]
Fetching episode data for [T@gged]
Fetching episode data for [T. Bag And The Revenge Of The T. Set]
Fetching episode data for [T and T]
Fetching episode data for [T in the Park]
[COPY] Rename [Downloads/Newsleecher/t/Humans - Season 2/Humans.1x01.mp4] to [TV Shows/Humans/Season 01/Humans - S01E01 - Episode 1.mp4]
Processed 1 files

The latest revision r4643 should fix the "folder name lookup issues if folder name uses dots instead of spaces" regression bug.

Post by Hemloc »

Thanks for the update. I noticed that my 3.3. problem is not actually related to the directory name, but the actual series name, see below.

The new version seems to handle the directory names better, so I tried a general rename of a lot of files. Now the following is reported:

1. The.Librarians.2014.S03E03.And.the.Reunion.of.Evil.1080p.WEB-DL.DD5.1.H264-Oosh_0.mkv
This name is not matched uniquely, however the list FileBot brings up shows "The Librarians (2014)" as the first entry, which matches exactly to the series name and year. It looks like FileBot does not match on year as well, but should as it would find a unique match then.
Same occurs with Travelers.2016.S01E08.Donner.1080p.WEB-DL.DD5.1.H.264-TRV_0.mkv.

2. NCIS.S14E07.1080p.Home.of.the.Brave.WEB-DL.DD5.1.H264-FENIKS_0.mkv
Humans.S02E06.1080p.AMZN.WEBRip.DD2.0.x264-KiNGS_0.mkv
Gotham.S03E10.Time.Bomb.1080p.WEBRip.DD5.1.x264-CasStudio_0.mkv
These names are not matched uniquely, potentially because there are many series with the series name in them, however, there is only one series actually just named "NCIS" or "Gotham", so FileBot should match it uniquely, and there is only one series that even starts with "Humans".
Note this works correctly in 4.6.1.

3. yonderland.s03e08.720p.hdtv.x264-tla_0.mkv
This is not matched uniquely although there is only one series with this name.

4. Sweet.Vicious.S01E03.Sucker.1080p.MTV.WEBRip.AAC2.0.x264-BTW.mkv
FileBot cannot find a match on this series, probably because of the slash in the actual series name "Sweet/Vicious".

- When reporting that a series could not be identified, wouldn't it be better to display the filename that did not match instead of a list of all files, in which the one in question is not even listed (presumably it is in the "..." section)? See http://imgur.com/a/L6lDl as an example.

- And it would still be nice to have a switch to ignore directory matching, as I often play in many different directories :)

Post by rednoah »

1.
The first option is the one that FileBot thinks is most likely, so you can just hit ENTER to click through the confirmation steps.

Just auto-selecting the first result without confirmation is a feature reserved for the command-line tools.

I could easily skip that step in the GUI if there's a name + year match, but name + year isn't actually very common with TV shows, so it'll only make a slight difference (confirming a preselected option) for a small number of shows.


2.
Same issue. It might seem obvious for you testing with 3 shows, but the same logic that would make life slightly easier for you would make life quite a bit harder for other people. FileBot has to deal with possible shows for all people that have all kinds of strangely named files.

e.g.
Doctor Who - 1x01.mkv in all likelihood is Doctor Who (2005) and not Doctor Who

:idea: You may get better results (or at least fewer manual confirmation steps) if you use Strict matching instead of Opportunistic matching.
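
The name + year case from point 1 can be sketched in Python as follows. This is an illustration under assumptions, not FileBot's matcher: the regular expression only handles dot- or space-separated names in SxxEyy style, and both function names are hypothetical.

```python
import re
from typing import Optional

EPISODE_PATTERN = re.compile(
    r"^(?P<name>.+?)"                      # series name, as short as possible
    r"(?:[. ](?P<year>(?:19|20)\d{2}))?"   # optional four-digit year
    r"[. ]S(?P<season>\d{2})E(?P<episode>\d{2})",
    re.IGNORECASE,
)


def parse_release_name(file_name: str) -> Optional[tuple]:
    """Split a release name into (name, year or None, season, episode)."""
    match = EPISODE_PATTERN.match(file_name)
    if match is None:
        return None
    year = match.group("year")
    return (
        match.group("name").replace(".", " "),
        int(year) if year else None,
        int(match.group("season")),
        int(match.group("episode")),
    )


def is_name_year_match(parsed: tuple, database_title: str) -> bool:
    """True only if the file carries a year and "Name (Year)" equals the database title."""
    name, year, _, _ = parsed
    return year is not None and database_title.lower() == f"{name} ({year})".lower()
```

A file such as The.Librarians.2014.S03E03... would match the title "The Librarians (2014)" exactly, while Humans.S02E05... carries no year and would still need the usual confirmation.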

Post by Hemloc »

1. Every little bit helps in automation :)

2. Thanks, but that made no difference. I understand what you mean, but there should be a way of making a one to one match when the series name can be matched uniquely. That is why I did not mention "Once Upon a Time", which has the same year issue. However, as indicated, this worked fine in 4.6.1 but no longer in 4.7+.

3. Your explanation does not indicate why this fails as it should be a one to one match.

4. Similar to 3, there is only one series with the words "sweet" and "vicious" and FileBot does not even find it. Instead it matches only on "Vicious", for some reason ignoring "Sweet".

Post by rednoah »

Have you tried with Strict mode? That should be a solution for most of these issues.


Re: Sweet & Vicious

Without file paths that I can run some tests on, I have no way of even guessing why it may or may not work.

Post by Hemloc »

Yes, I have tried Strict mode, and unfortunately, no it makes no difference. Here are images of the attempts :
NCIS - http://imgur.com/a/X1zLW,
Humans - http://imgur.com/a/1Dehy
Yonderland - http://imgur.com/a/xdTT7
Sweet/Vicious - http://imgur.com/a/4DMuj
All files are in "L:\Downloads\Newsleecher".

Post by rednoah »

1.
The first 3 do seem odd. Maybe some fine-tuning for these kinds of cases would be good.


2.
TheTVDB has issues with shows that contain / because the query FileBot searches with won't contain slashes, so TheTVDB won't return results.

If the show was more popular, then FileBot would index it as well and it'd just work. Giving the show a good rating on TheTVDB will help the show get included in the FileBot index sooner.
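
As a rough illustration of the slash problem, the following Python sketch shows how "Sweet/Vicious" and "Sweet.Vicious" become comparable once both sides are normalized the same way. The function `normalize_title` is a hypothetical helper for this example; it does not describe what FileBot or TheTVDB do internally.

```python
import re


def normalize_title(text: str) -> str:
    """Lower-case the text and collapse every run of punctuation or whitespace into one space."""
    return re.sub(r"[\W_]+", " ", text.lower()).strip()
```

A search backend that compares raw strings would treat the two spellings as different, which is consistent with the query returning no results when only one side has the slash removed.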

Post by rednoah »

In the selection dialog, there is a toggle button with two settings, one of which is ALL. ALL allows you to remember and auto-select the same result for the same query (e.g. "NCIS" ➔ TheTVDB::72108) so it won't ask the same question again.

It works the same in strict and non-strict mode, however it's not stored forever so you may need to make the same selection every few months (and you can clear your selections by clearing the cache).
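
Conceptually, a remembered selection that expires behaves like a small time-limited lookup table. The Python sketch below is only a model of that behaviour: the class `SelectionMemory`, its method names, and the expiry value passed in are all assumptions for illustration, not FileBot's cache implementation.

```python
import time
from typing import Optional


class SelectionMemory:
    """Remember which series ID was chosen for a query, for a limited time."""

    def __init__(self, ttl_seconds: float):
        self._ttl = ttl_seconds
        self._entries = {}  # lower-cased query -> (series_id, stored_at)

    def remember(self, query: str, series_id: int, now: Optional[float] = None) -> None:
        stored_at = time.time() if now is None else now
        self._entries[query.lower()] = (series_id, stored_at)

    def recall(self, query: str, now: Optional[float] = None) -> Optional[int]:
        entry = self._entries.get(query.lower())
        if entry is None:
            return None
        series_id, stored_at = entry
        current = time.time() if now is None else now
        if current - stored_at > self._ttl:
            del self._entries[query.lower()]  # expired: the user has to choose again
            return None
        return series_id

    def clear(self) -> None:
        """Forget every selection, comparable to clearing the cache."""
        self._entries.clear()
```

Remembering "NCIS" with ID 72108 would answer later lookups for the same query until the entry expires or the memory is cleared.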

Post by Hemloc »

Hi rednoah,

Me again. I am trying out 4.7.7 (potentially with a very obscure series again, it's Swedish) and wondering what happened this time:
http://imgur.com/a/QFIWt
I am not sure how it converted "Svartsjon" to "Sweet Vicious" since they are not really that close.

Also, you will notice that Sweet.Vicious.Season.1.Special.Too.Legit.1080p.MTV.WEBRip.AAC2.0.x264-BTW_0.mkv has not been highlighted. However, in 4.6.1 (http://imgur.com/a/AWkVP) you will see that it actually does locate and rename this correctly.

At least both versions completely misnamed Svartsjon (I am not really sure why since there is exactly one series Svartsjön on TheTVDB) so no difference there :)

Andrew

Post by rednoah »

1.
"Sweet Vicious" is one of the queries that is being considered. If "Svartsjon" yields no results then "Sweet Vicious" seems to be the next best match. That's what Match Mode: Opportunistic does.

2.
Please post filenames as text.

Post by Hemloc »

1. My preferences are still set to "Strict".

2. Um, I'm not sure which filenames you would like as text. Presumably Svartsjon so examples would be :
Svartsjon.S01E01.720p.WEB-DL.AAC2.0.H.264.srt
Svartsjon.S01E01.720p.WEB-DL.AAC2.0.H.264_0 (1).mkv
Svartsjon.S01E02.SWEDiSH.720p.WEB-DL.AAC2.0.H.264-NORDiCUS.srt
Svartsjon.S01E02.SWEDiSH.720p.WEB-DL.AAC2.0.H.264-NORDiCUS_0 (1).mkv

Post by rednoah »

TheTVDB doesn't yield any results for "Svartsjon" so it doesn't work:

$ filebot -list --q "Svartsjon"
TheTVDB: no results

However, searching in Swedish seems to magically solve the issue:

$ filebot -list --q "Svartsjon" --lang Swedish
Svartsjön - 1x01 - Avsnitt 1
Svartsjön - 1x02 - Avsnitt 2
Svartsjön - 1x03 - Avsnitt 3
Svartsjön - 1x04 - Avsnitt 4
Svartsjön - 1x05 - Avsnitt 5
Svartsjön - 1x06 - Avsnitt 6
Svartsjön - 1x07 - Avsnitt 7
Svartsjön - 1x08 - Avsnitt 8

You'll probably need to add "Svartsjon" as an English alias for "Svartsjön" on TheTVDB so it'll work with all language preferences.
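
To see why "Svartsjon" and "Svartsjön" count as different names unless an alias exists or one side folds accents, here is a small Python sketch using the standard `unicodedata` module. The helper name `fold_accents` is hypothetical, and nothing here is meant to describe how TheTVDB search actually works.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Decompose accented characters and drop the combining marks, e.g. "ö" becomes "o"."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

The raw strings differ, but after folding they compare equal, which is roughly the job the English alias does on the database side.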

When testing in Strict mode, I cannot reproduce any issues with "Sweet Vicious" episode data spilling over to fill in the missing information for "Svartsjon" files.

Post by Hemloc »

Since it is Swedish and missing an English alias, it makes sense that FileBot cannot find it.

Anything I can do to help you find the issue?

Just tried "Svartsjon (Swe, Fre, Eng) S01E08.srt" and it found "Sweet Vicious S01E08 - Back to Black", so the Strict setting seems to be ignored. Are you doing this via CLI or via GUI?

Post by rednoah »

Have these files been processed with FileBot before? If yes, then they're tagged.

@see viewtopic.php?f=4&t=5#p5394

Post by Hemloc »

Tried "filebot -script fn:xattr /path/to/files" and all it returned was "Done ?(?????)?"... so apparently not. Even wiped out the "AppData\Roaming\FileBot" directory and it still matched to "Sweet Vicious".

Post by rednoah »

Fails, as expected:

$ filebot -rename "Svartsjon (Swe, Fre, Eng) S01E08.srt"
Rename episodes using [TheTVDB]
Auto-detected query: [svartsjon]
Failed to fetch episode data: [svartsjon]
Failed to match files to episode data

Succeeds, as expected:

$ filebot -rename "Svartsjon (Swe, Fre, Eng) S01E08.srt" --lang Swedish
Rename episodes using [TheTVDB]
Auto-detected query: [svartsjon]
Fetching episode data for [Svartsjön]
[MOVE] Rename [Svartsjon (Swe, Fre, Eng) S01E08.srt] to [Svartsjön - 1x08 - Avsnitt 8.srt]
Processed 1 files

I've played with the GUI and CLI and everything works as far as I can tell.

Post by Hemloc »

Hmm, ok, I removed "C:\Program Files\FileBot" (please note the installer will not install to this location even though I don't want it in (x86) and tried to install into the directory indicated), removed "AppData\Roaming\FileBot", re-installed the program to the default location, wiped out my preferences using "filebot -script fn:preferences --action clear", reset preferences to Strict and tried the filename again in the GUI... no difference, still matches to "Sweet Vicious".
However, trying your command line does return the same results as you.
So it seems there is a problem with the GUI not using the Strict preference properly, but CLI does.

I tried using TheMovieDB (which returned Black Lake) and TVmaze (which returned Svartsjön), both much better results, but unfortunately they both return way too many series results for other titles.

Post by rednoah »

1.
Program Files => 64-bit Program Files
Program Files (x86) => 32-bit Program Files

Please use 64-bit FileBot on 64-bit Windows and not 32-bit FileBot.


2.
I've also tested the GUI with the paths you provided. In strict mode, I cannot reproduce the behaviour you're getting.

Post by Hemloc »

1. Irrelevant.
1.1. You can stick any program you like in any directory you like, windoze simply does this for convenience to split the programs... which ultimately is pointless since a user will almost never install two different bit versions of a program. I never install a program into (x86) unless I have no choice (which is rare), and never have any issues (it is also much more convenient when you are trying to find program directories as you only have to look in one location instead of two).
1.2. Also, I cannot use 64-bit FileBot as I have 32-bit Java required for other programs, and 64-bit FileBot will not work with 32-bit Java. Also, this makes absolutely no difference to the general running of a program.

2. Isn't there any logging that can be checked to see what the GUI is performing in the background to see why it is not working?

Post by rednoah »

Yes, but I'm not sure if it'll be of any use in this case:
viewtopic.php?f=3&t=3913

Post by Hemloc »

Where do I find the log files?

Post by rednoah »

If you want a log file then you need to set the --log-file option. Otherwise it's all just console output.