Strict mode processing files with bad matches
Re: Strict mode processing files with bad matches
can you make it so that movie matching is in the log like the TV funnel ?
(that way I/we can better understand why and how to fix it)
the year 2 x weight, is that like 1 word = 1 x weight ?
btw: "The Bodyguard 2003"
#1 The Body 2003
#2 The Bodyguard 2004
Re: Strict mode processing files with bad matches
You can find the relevant code here:
https://github.com/filebot/filebot/blob ... .java#L696
The overall metric consists of 4 metrics that are based on the name, and 1 metric with 2x weight based on the year. The current implementation does not take year ±1 into account when ranking results.
For debugging purposes, writing a script that will give you the values for each individual metric would make sense.
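As a rough illustration of how five such values could be combined into one score, here is a standalone Groovy sketch. It assumes the overall score is the plain average of the five metric values, with the year metric already scaled to a 0..2 range to give it double weight. The helper name `overallScore` and the averaging rule are assumptions for illustration, not taken from the FileBot source, although the sample values below do reproduce the 0.6969697 total posted further down in this thread.

```groovy
// Hypothetical helper: average all metric values into one overall score.
// Assumption: the year metric has already been multiplied by its 2x weight.
def overallScore(List<Number> values) {
    values.sum() / values.size()
}

// Order: name similarity, exact-name match, year (0..2), two sequence metrics
def values = [0.6666667, 0.0, 2.0, 0.8181818, 0.0]
println overallScore(values)
```

Under this assumption a perfect year match contributes 2.0 / 5 = 0.4 to the total on its own, which would explain why a candidate with the right year but a poor name match can outrank one with the right name and the wrong year.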
Please read the FAQ and How to Request Help.
Re: Strict mode processing files with bad matches
I don't get much out of just looking at part of the code...
"For debugging purposes, writing a script that will give you the values for each individual metric would make sense."
meaning you will write it or I ?
because I have NO idea how
Re: Strict mode processing files with bad matches
Here's some inspiration for a script like that:
Code: Select all
// Score every candidate movie for one file and show what each metric contributes
def file = 'Movie/Avatar/Avatar.2009.mkv' as File
def movies = MediaDetection.detectMovie(file, TheMovieDB, Locale.ENGLISH, false)

movies.each { option ->
    println "$option <=> $file"
    // overall similarity score for this candidate
    println MediaDetection.movieMatchMetric.getSimilarity(file, option)
    // individual metrics that make up the overall score
    MediaDetection.movieMatchMetric.metrics.each { metric ->
        println "* ${metric.class}: ${metric.getSimilarity(file, option)}"
    }
}
Re: Strict mode processing files with bad matches
* class net.filebot.similarity.NameSimilarityMetric: 0.5074627 = ???
* class net.filebot.media.MediaDetection$1: 0.0 = ???
* class net.filebot.media.MediaDetection$2: 1.3333334 = Year ?
* class net.filebot.similarity.SequenceMatchSimilarity: 0.6 = ???
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0 = ???
why so different result if folder in "def file" ?
Memories of 'The Bodyguard' (2005) <=> The Bodyguard (2005)\The Bodyguard.2005.mkv
0.48815918
* class net.filebot.similarity.NameSimilarityMetric: 0.5074627
* class net.filebot.media.MediaDetection$1: 0.0
* class net.filebot.media.MediaDetection$2: 1.3333334
* class net.filebot.similarity.SequenceMatchSimilarity: 0.6
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0
Memories of 'The Bodyguard' (2005) <=> The Bodyguard.2005.mkv
0.6969697
* class net.filebot.similarity.NameSimilarityMetric: 0.6666667
* class net.filebot.media.MediaDetection$1: 0.0
* class net.filebot.media.MediaDetection$2: 2.0
* class net.filebot.similarity.SequenceMatchSimilarity: 0.8181818
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0
btw: I just learned that on TMDB, if you search with the year, e.g. "query=the+bodyguard&year=2005", then any "release information" year will be used, e.g. "query=the+bodyguard&year=2008" = The Bodyguard 1992, because of the BR "Physical" release year 2008; same with 1999 for GR
Re: Strict mode processing files with bad matches
1.
Multiple matches in file and folder name increase the total score.
2.
Mind sharing the links for any TheMovieDB query documentation you might have found?
Re: Strict mode processing files with bad matches
1. looks to me like the other way around ?:
with folder 0.48815918
without 0.6969697
2. I did not read it, I just discovered it because of the "query=the+bodyguard&year=2008" = the bodyguard 1992
what do the lines mean ?
* class net.filebot.similarity.NameSimilarityMetric: 0.5074627 = ???
* class net.filebot.media.MediaDetection$1: 0.0 = ???
* class net.filebot.media.MediaDetection$2: 1.3333334 = Year ?
* class net.filebot.similarity.SequenceMatchSimilarity: 0.6 = ???
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0 = ???
Re: Strict mode processing files with bad matches
1.
I see. It probably works differently because you have no folder structure at all, which reduces the number of components being compared, and thus the average values. This won't happen in real usage scenarios because there's always going to be a parent folder.
I'd test with something like this:
def file = '''X:/path/to/a/real/file.mkv''' as File
2.
I guess TheMovieDB will give you a result even if just the query is a perfect match. Unfortunately, there's nothing I can do about strange behaviour in the query engine. That's one of the reasons why FileBot has its own search index.
Does TheMovieDB give results for that query? Because detectMovie(...) will give you results where TheMovieDB search results are combined with local index lookup (which can definitely give you results based on the name even if the year is wrong).
3.
Hard to explain without understanding the code...
NameSimilarityMetric is fuzzy string similarity (e.g. Hallo <=> Hello is similar):
Code: Select all
* class net.filebot.similarity.NameSimilarityMetric: 0.5074627 = ???
StringEqualsMetric will give you extra points when the filename is exactly the same as the movie name:
Code: Select all
* class net.filebot.media.MediaDetection$1: 0.0 = ???
NumericSimilarityMetric will give you extra points if numeric patterns are similar (e.g. "Best Movie of 1968" is similar to "2001 Space Odyssey 1968"):
Code: Select all
* class net.filebot.media.MediaDetection$2: 1.3333334 = Year ?
SequenceMatchSimilarity is about a matching sequence of words anywhere in the word sequence (e.g. Bodyguard is similar to The Bodyguard):
Code: Select all
* class net.filebot.similarity.SequenceMatchSimilarity: 0.6 = ???
Second SequenceMatchSimilarity is about a matching sequence of words at the beginning of the sequence (e.g. The Bodyguard is similar to The Bodyguard: Part Two but not Bodyguard):
Code: Select all
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0 = ???
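To make the difference between the two sequence metrics more concrete, here is a small standalone Groovy sketch. It is not FileBot's implementation: the helper names `words`, `matchesAnywhere` and `matchesAtStart` are made up for illustration, and they return a simple yes/no instead of the fractional score that the real metrics report.

```groovy
// Split a name into lowercase words, ignoring punctuation.
def words(String s) {
    s.toLowerCase().split(/[^a-z0-9]+/).findAll { it }.toList()
}

// True if the words of `part` appear as one contiguous run anywhere in `whole`.
def matchesAnywhere(String part, String whole) {
    def p = words(part)
    def w = words(whole)
    !p.isEmpty() && Collections.indexOfSubList(w, p) >= 0
}

// True if the words of `part` appear as one contiguous run at the start of `whole`.
def matchesAtStart(String part, String whole) {
    def p = words(part)
    def w = words(whole)
    !p.isEmpty() && Collections.indexOfSubList(w, p) == 0
}

println matchesAnywhere('Bodyguard', 'The Bodyguard')
println matchesAtStart('Bodyguard', 'The Bodyguard')
println matchesAtStart('The Bodyguard', 'The Bodyguard: Part Two')
```

With these helpers, "Bodyguard" is found anywhere in "The Bodyguard" but not at its start, while "The Bodyguard" is found at the start of "The Bodyguard: Part Two".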
Re: Strict mode processing files with bad matches
I found a new one that needs a bit of work
viewtopic.php?f=4&t=5416
https://www.themoviedb.org/search/movie ... &year=1999
Query Movie => [psycho 1999]
Rank [Psycho 1999] => [The Psychotic Odyssey of Richard Chase (1999), The Masked Strangler (1999), Kiss [1999] Psycho Circus in Buenos Aires (1999), Psycho (1998), Psycho (1960), Psycho Sisters (1998), Psycho Beach Party (2000), American Psycho (2000), The Maddening (1996)]
The Psychotic Odyssey of Richard Chase (1999) <=> D:\movies\Psycho.1999.BRRip.XviD.MP3-RARBG.avi
0.4566999
* class net.filebot.similarity.NameSimilarityMetric: 0.1904762
* class net.filebot.media.MediaDetection$1: 0.0
* class net.filebot.media.MediaDetection$2: 2.0
* class net.filebot.similarity.SequenceMatchSimilarity: 0.093023255
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0
The Masked Strangler (1999) <=> D:\movies\Psycho.1999.BRRip.XviD.MP3-RARBG.avi
0.4501818
* class net.filebot.similarity.NameSimilarityMetric: 0.09090909
* class net.filebot.media.MediaDetection$1: 0.0
* class net.filebot.media.MediaDetection$2: 2.0
* class net.filebot.similarity.SequenceMatchSimilarity: 0.16
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0
Kiss [1999] Psycho Circus in Buenos Aires (1999) <=> D:\movies\Psycho.1999.BRRip.XviD.MP3-RARBG.avi
0.3457041
* class net.filebot.similarity.NameSimilarityMetric: 0.25882354
* class net.filebot.media.MediaDetection$1: 0.0
* class net.filebot.media.MediaDetection$2: 1.3333334
* class net.filebot.similarity.SequenceMatchSimilarity: 0.13636364
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0
Psycho (1998) <=> D:\movies\Psycho.1999.BRRip.XviD.MP3-RARBG.avi
0.17062938
* class net.filebot.similarity.NameSimilarityMetric: 0.30769232
* class net.filebot.media.MediaDetection$1: 0.0
* class net.filebot.media.MediaDetection$2: 0.0
* class net.filebot.similarity.SequenceMatchSimilarity: 0.54545456
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0
Psycho (1960) <=> D:\movies\Psycho.1999.BRRip.XviD.MP3-RARBG.avi
0.16293707
* class net.filebot.similarity.NameSimilarityMetric: 0.26923078
* class net.filebot.media.MediaDetection$1: 0.0
* class net.filebot.media.MediaDetection$2: 0.0
* class net.filebot.similarity.SequenceMatchSimilarity: 0.54545456
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0
Psycho Sisters (1998) <=> D:\movies\Psycho.1999.BRRip.XviD.MP3-RARBG.avi
0.10982456
* class net.filebot.similarity.NameSimilarityMetric: 0.23333333
* class net.filebot.media.MediaDetection$1: 0.0
* class net.filebot.media.MediaDetection$2: 0.0
* class net.filebot.similarity.SequenceMatchSimilarity: 0.31578946
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0
Psycho Beach Party (2000) <=> D:\movies\Psycho.1999.BRRip.XviD.MP3-RARBG.avi
0.08342391
* class net.filebot.similarity.NameSimilarityMetric: 0.15625
* class net.filebot.media.MediaDetection$1: 0.0
* class net.filebot.media.MediaDetection$2: 0.0
* class net.filebot.similarity.SequenceMatchSimilarity: 0.26086956
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0
American Psycho (2000) <=> D:\movies\Psycho.1999.BRRip.XviD.MP3-RARBG.avi
0.09934427
* class net.filebot.similarity.NameSimilarityMetric: 0.19672132
* class net.filebot.media.MediaDetection$1: 0.0
* class net.filebot.media.MediaDetection$2: 0.0
* class net.filebot.similarity.SequenceMatchSimilarity: 0.3
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0
The Maddening (1996) <=> D:\movies\Psycho.1999.BRRip.XviD.MP3-RARBG.avi
0.013559322
* class net.filebot.similarity.NameSimilarityMetric: 0.06779661
* class net.filebot.media.MediaDetection$1: 0.0
* class net.filebot.media.MediaDetection$2: 0.0
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0
* class net.filebot.similarity.SequenceMatchSimilarity: 0.0
Result: [The Psychotic Odyssey of Richard Chase (1999), The Masked Strangler (1999), Kiss [1999] Psycho Circus in Buenos Aires (1999), Psycho (1998), Psycho (1960), Psycho Sisters (1998), Psycho Beach Party (2000), American Psycho (2000), The Maddening (1996)]
http://www.imdb.com/title/tt0155975/rel ... =tt_ov_inf
USA 4 December 1998
Singapore 31 December 1998
Germany 7 January 1999
Re: Strict mode processing files with bad matches
Future revisions will take year-off-by-one into consideration when ranking options and picking the best one.
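One possible shape for such a year-off-by-one rule, as a standalone Groovy sketch. The helper name `yearScore` and the 0.5 partial credit are arbitrary choices for illustration. They are not taken from FileBot and say nothing about how a future revision will actually score this.

```groovy
// Hypothetical year score: full credit for an exact match, partial credit
// when the years differ by one, nothing otherwise.
// Assumption: the 0.5 value is illustrative only.
def yearScore(int fileYear, int movieYear) {
    def diff = Math.abs(fileYear - movieYear)
    if (diff == 0) return 1.0
    if (diff == 1) return 0.5
    return 0.0
}

println yearScore(1999, 1998)   // Psycho.1999 vs Psycho (1998): partial credit
println yearScore(1999, 1960)   // Psycho.1999 vs Psycho (1960): no credit
```

A rule like this would let Psycho (1998) pick up some year credit for a file named Psycho.1999, instead of scoring the same 0.0 on the year as Psycho (1960).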