AMC: Subs for multi-part movies

Posted: 20 Nov 2017, 11:03
by thielj
This is more of an observation for now: with a subtitles=xx parameter, FileBot fetches subtitles for multi-part movies that would need to be split and adjusted to actually work.

I've seen that net/filebot/subtitle contains classes for transcoding subtitles. Do these work, and is there any documentation?

Re: AMC: Subs for multi-part movies

Posted: 20 Nov 2017, 12:00
by rednoah
--def subtitles=en will only fetch subtitles by file hash, not by movie name, so the subtitles should match exactly that file.

@see viewtopic.php?f=3&t=2615

Re: AMC: Subs for multi-part movies

Posted: 20 Nov 2017, 14:55
by thielj
Thanks, I will try to narrow down what's happening here. Maybe the uploaded subs were already wrong.

Re: AMC: Subs for multi-part movies

Posted: 21 Nov 2017, 14:21
by thielj
Hi, below is an example of FileBot fetching the same single-CD subtitle for both CD parts:

Code:

Get [French] subtitles for 2 files
Looking up subtitles by hash via OpenSubtitles
Fetching [French] subtitles [Blow[2001]DVDrip[ENG]-MissRipZ.FR.srt] from [OpenSubtitles]
Export [Blow[2001]DVDrip[ENG]-MissRipZ.FR.srt] as [SubRip / UTF-8]
Writing [Blow[2001]DVDrip[ENG]-MissRipZ.FR.srt] to [Blow (2001).CD1.fra.srt]
Fetching [French] subtitles [Blow[2001]DVDrip[ENG]-MissRipZ.FR.srt] from [OpenSubtitles]
Export [Blow[2001]DVDrip[ENG]-MissRipZ.FR.srt] as [SubRip / UTF-8]
Writing [Blow[2001]DVDrip[ENG]-MissRipZ.FR.srt] to [Blow (2001).CD2.fra.srt]

The downloaded subtitles are identical for both CD parts and seem to be either https://www.opensubtitles.org/en/subtit ... 83/blow-fr
or https://www.opensubtitles.org/en/subtit ... 00/blow-fr

From what I understand, the hash used for the lookup is computed over the first and last 64K of the movie file, so a hash collision seems rather unlikely. This would leave us with a wrong upload on OpenSubtitles, API problems, or a bug in FileBot.

Is there any easy way to debug this / calculate the moviehashes / etc?

Re: AMC: Subs for multi-part movies

Posted: 21 Nov 2017, 14:29
by rednoah
1. AFAIK, BSPlayer uploads subtitles automatically without asking the user whether they're good or bad, so a bad upload is the likely cause here.

2. The osdb.explain script might shed some light on this issue:

Code:

filebot -script fn:osdb.explain /path/to/movie --def fetch=y
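
To double-check the hashes by hand, here is a minimal Python sketch of the OpenSubtitles moviehash as I understand it: the file size plus the 64-bit little-endian words of the first and last 64 KiB, summed modulo 2^64. The function name opensubtitles_hash is made up for this example and is not part of FileBot.

```python
import os
import struct

CHUNK = 64 * 1024  # 64 KiB read from each end of the file
MASK = 0xFFFFFFFFFFFFFFFF  # keep the running sum within 64 bits


def opensubtitles_hash(path):
    """Return the 16-digit hex moviehash for the file at `path` (sketch)."""
    size = os.path.getsize(path)
    if size < 2 * CHUNK:
        # The hash is only defined for files of at least 128 KiB.
        raise ValueError("file too small for moviehash")

    with open(path, "rb") as f:
        head = f.read(CHUNK)
        f.seek(-CHUNK, os.SEEK_END)
        tail = f.read(CHUNK)

    total = size
    for chunk in (head, tail):
        # Interpret each chunk as unsigned 64-bit little-endian integers.
        for (word,) in struct.iter_unpack("<Q", chunk):
            total = (total + word) & MASK

    return f"{total:016x}"
```

Calling opensubtitles_hash on each CD part should give two different values if the files differ, which can then be compared with the hash values printed by osdb.explain.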

Re: AMC: Subs for multi-part movies

Posted: 21 Nov 2017, 18:52
by thielj
The result is the same for both parts and, most interestingly, it's a tag match (due to MovieHash=0 maybe?) of a single-part subtitle. I would expect these to be eliminated:
  • In strict mode, due to not being a hash match
  • In non-strict mode, due to not matching the part / total parts count and being significantly longer than the matched movie part (i.e. SubLastTS > part playing time + 1 minute)

Code:

File: /volume1/_INPUT_/Blow (2001) [640x272 EN]/Blow (2001).CD1.avi
Hash/Tag Lookup (hash: 72f6d85ab3c052aa, size: 728033280, lang: fr_FR, tag: Blow (2001).CD1)
[...]
File: /volume1/_INPUT_/Blow (2001) [640x272 EN]/Blow (2001).CD2.avi
Hash/Tag Lookup (hash: f62de139e78c9c3d, size: 729657344, lang: fr_FR, tag: Blow (2001).CD2)
[...]
Best Hash Match: [IDSubtitle:3300000, IDSubtitleFile:1951888932, IDSubMovieFile:0, IDMovie:377,
  MovieHash:0, MovieByteSize:0, MovieName:Blow, MovieYear:2001, MovieTimeMS:0, MovieFPS:25.000,
  SubFileName:Blow[2001]DVDrip[ENG]-MissRipZ.FR.srt, SubLastTS:01:58:08, SubFormat:srt,
  SubLanguageID:fre, ISO639:fr, SubActualCD:1, SubSumCD:1, MatchedBy:tag,
]
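
The two elimination rules I have in mind could be sketched roughly as follows. This is only an illustration in Python, not FileBot code: the Candidate class, the keep function and the 60-second slack are hypothetical, and I'm assuming that a hash match is reported as MatchedBy:moviehash.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Subset of the fields shown in the search result above (hypothetical type)."""
    matched_by: str      # MatchedBy, e.g. "moviehash" or "tag" (values assumed)
    sub_actual_cd: int   # SubActualCD
    sub_sum_cd: int      # SubSumCD
    sub_last_ts: str     # SubLastTS as "HH:MM:SS"


def to_seconds(timestamp):
    """Convert an "HH:MM:SS" timestamp to seconds."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def keep(candidate, strict, part, total_parts, part_seconds, slack=60):
    """Return True if the candidate survives the proposed filter."""
    if strict:
        # Strict mode: accept hash matches only.
        return candidate.matched_by == "moviehash"
    # Non-strict mode: the part numbering must agree with the video ...
    if (candidate.sub_actual_cd, candidate.sub_sum_cd) != (part, total_parts):
        return False
    # ... and the last subtitle must not run far past the end of this part.
    return to_seconds(candidate.sub_last_ts) <= part_seconds + slack
```

With the result above (MatchedBy:tag, SubActualCD:1, SubSumCD:1, SubLastTS:01:58:08) and a two-part movie, keep would return False in both modes.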

Re: AMC: Subs for multi-part movies

Posted: 21 Nov 2017, 20:06
by rednoah
Interesting. Something may have changed over time here. "tag" used to refer to an exact match of the video/subtitle file names (excluding subtitle extension).

e.g.

Code:

Blow (2001).CD2.avi
Blow (2001).CD2.eng.srt               <-- TAG MATCH
Blow[2001]DVDrip[ENG]-MissRipZ.FR.srt <-- NOT SUPPOSED TO BE A TAG MATCH

@see http://trac.opensubtitles.org/projects/ ... hSubtitles

If this is not a bug, and API behaviour has indeed changed, then "tag lookup" will have to be removed. There doesn't seem to be a replacement for "exact filename lookup" as far as I can see.

Re: AMC: Subs for multi-part movies

Posted: 21 Nov 2017, 20:53
by thielj
"For perfect matches use moviehash/moviebytesize searching, for movie matches use tag/imdbid searching, if you can not use any of them, use fulltext search (least accurate)"

The way I understand it, only hash/size provides a perfect match; tag/imdbid is the next-best way to search (as you at least get the right movie), and fulltext search is the least accurate option.

The second best match on this query was even worse - I can't see **any** tag match here. Not even the movie is matching!

Code:

Result 2: [IDSubtitle:3254295, IDSubtitleFile:1951831232, IDSubMovieFile:0, IDMovie:93990, IDMovieImdb:1055795, SubFileName:terminator.the.sarah.connor.chronicles.s01e07..proper.hdtv.xvid-notv.FR.srt, SubLastTS:00:40:53, SubFormat:srt, SubEncoding:CP1252, SubHash:ca2f0d7d0e0e403922547501d42fa2cd, SubSize:44567, MovieHash:0, MovieByteSize:0, MovieName:"Terminator: The Sarah Connor Chronicles" The Demon Hand, MovieNameEng:, MovieYear:2008, MovieReleaseName:S01E07 Hdtv NoTV Proper, MovieTimeMS:0, MovieFPS:23.980, MovieImdbRating:8.0, MovieKind:episode, SeriesSeason:1, SeriesEpisode:7, SeriesIMDBParent:851851, SubLanguageID:fre, ISO639:fr, LanguageName:French, UserID:120910, UserRank:trusted, UserNickName:ninjaw, SubAddDate:2008-02-28 18:48:13, SubAuthorComment:, SubFeatured:0, SubComments:0, SubDownloadsCnt:695, SubHearingImpaired:0, SubRating:5.0, SubHD:1, SubBad:0, SubActualCD:1, SubSumCD:1, MatchedBy:tag, QueryNumber:0, ...]

Re: AMC: Subs for multi-part movies

Posted: 21 Nov 2017, 21:10
by rednoah
Alright, tag lookup seems completely broken then, so it's been removed with the latest revision. Strict match is now moviehash/filesize only.

Re: AMC: Subs for multi-part movies

Posted: 21 Nov 2017, 21:14
by thielj
What's the non-strict policy?

Re: AMC: Subs for multi-part movies

Posted: 21 Nov 2017, 23:00
by rednoah
Lookup by name, kinda like a human would do it: identify the movie, search by name/id, check the list of subtitles and pick the one that matches your files best.

@see viewtopic.php?f=3&t=2615

Note that the amc script will always force strict mode for subtitles, regardless of whether you set the -non-strict option. If -non-strict subtitle lookup is desired, then the suball script is recommended for that.
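
As a rough illustration of "pick the one that matches your files best", here is a small Python sketch that ranks candidate subtitle names by plain string similarity to the video file name. This is not FileBot's actual matching logic; normalize and pick_best are hypothetical helpers, and difflib is just a convenient stand-in for a similarity metric.

```python
import re
from difflib import SequenceMatcher
from pathlib import Path


def normalize(file_name):
    """Drop the extension, lowercase, and collapse punctuation to single spaces."""
    stem = Path(file_name).stem.lower()
    return re.sub(r"[^a-z0-9]+", " ", stem).strip()


def pick_best(video_name, subtitle_names):
    """Return the subtitle name most similar to the video name.

    Raises ValueError if `subtitle_names` is empty.
    """
    target = normalize(video_name)
    return max(
        subtitle_names,
        key=lambda name: SequenceMatcher(None, target, normalize(name)).ratio(),
    )


candidates = ["Blow.2001.CD1.srt", "Blow.2001.CD2.srt", "Other.Movie.srt"]
best = pick_best("Blow (2001).CD2.avi", candidates)
print(best)
```

A name-based ranking like this cannot tell whether the subtitle timing actually fits the file, which is why it only makes sense outside of strict mode.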

Re: AMC: Subs for multi-part movies

Posted: 22 Nov 2017, 03:41
by thielj
To confirm: FB no longer fetches the wrong subtitles with the latest jar.

Re: AMC: Subs for multi-part movies

Posted: 22 Nov 2017, 12:43
by rednoah
Thanks for checking and confirming. :)