Thanks for your detailed response. I appreciate the careful thought you've put into this.
I understand your concerns, but I'd like to directly address each of them:
Performance Concerns:
I completely agree that fetching detailed metadata (including runtime) from TMDb for every single search result would be inefficient. However, this overhead can be minimized effectively by only fetching detailed data when multiple ambiguous matches occur (same or similar titles without provided years). This targeted approach should alleviate performance impacts significantly.
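As a rough illustration of that targeted approach, here is a minimal Python sketch. The names `runtimes_if_ambiguous`, `search`, and `fetch_details` are hypothetical placeholders of my own, not FileBot or TMDb API calls. The only point is that the per-movie detail lookup is skipped whenever the filename already carries a year or the search returns a single hit.

```python
from typing import Callable, Dict, List, Optional


def runtimes_if_ambiguous(
    title: str,
    year: Optional[int],
    search: Callable[[str], List[dict]],
    fetch_details: Callable[[int], dict],
) -> Dict[int, int]:
    """Return {movie id: runtime in minutes}, fetching details only when needed.

    `search` and `fetch_details` are hypothetical stand-ins for whatever
    performs the title search and the per-movie detail lookup.
    """
    candidates = search(title)
    # A year in the filename, or at most one hit, means no extra requests.
    if year is not None or len(candidates) <= 1:
        return {}
    # Ambiguous case only: one detail request per candidate.
    return {c["id"]: fetch_details(c["id"])["runtime"] for c in candidates}
```

With this shape, the number of extra requests is zero for the common case and equals the number of same-title candidates otherwise.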
Accuracy Concerns:
I acknowledge that runtime discrepancies between editions (theatrical, extended, director's cuts) do exist, but these extreme runtime variations (±20 minutes or more) are relatively uncommon in practice. A reasonable threshold (±5 minutes, for example) could handle the vast majority of cases effectively without causing unintended mismatches.
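To make the threshold idea concrete, here is a small Python sketch. The function name, the candidate tuples, and the runtimes are made up for illustration; only the ±5 minute figure comes from my suggestion above.

```python
from typing import List, Tuple

Candidate = Tuple[int, int]  # (release year, runtime in minutes)


def within_tolerance(
    candidates: List[Candidate], local_runtime: int, tolerance: int = 5
) -> List[Candidate]:
    """Keep only candidates whose runtime is within `tolerance` minutes."""
    return [c for c in candidates if abs(c[1] - local_runtime) <= tolerance]


# Made-up example: three films sharing one title, released years apart.
same_title = [(1960, 109), (1998, 105), (2013, 92)]
print(within_tolerance(same_title, local_runtime=93))  # [(2013, 92)]
```

If more than one candidate survives the filter (for example, a local runtime of 107 keeps both the 1960 and 1998 entries here), nothing is decided and the existing logic would take over.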
Slippery Slope Concerns:
I'm not suggesting that runtime disambiguation replace or override existing year-based matching logic. Rather, I'm advocating that runtime be leveraged as an additional, secondary disambiguation factor specifically in cases where the filename lacks a provided year and multiple movies share identical or closely similar titles. This supplementary use of runtime data should improve accuracy significantly without compromising existing functionality.
My ultimate goal here is straightforward: ensuring that FileBot returns consistently formatted filenames that include the correct release year. This is critical because media software like Plex relies heavily on correctly formatted filenames, including release years, for accurate metadata retrieval.
Thanks again for being open to considering this enhancement. Here is some sample pseudocode that may help to clarify:
def find_movie_year(title, local_runtime, filename_has_year, filename):
    # A year in the filename means the existing logic applies unchanged
    if filename_has_year:
        return apply_existing_logic(title)

    candidates = tmdb_search(title)
    if len(candidates) == 1:
        return candidates[0].year

    # Fetch detailed runtimes for ambiguous cases only
    for candidate in candidates:
        detailed_info = fetch_tmdb_details(candidate.id)
        # Store the runtime so the tolerance filter below can use it
        candidate.runtime = detailed_info.runtime
        # Immediately prioritize exact runtime match (no commercials)
        if candidate.runtime == local_runtime:
            return candidate.year

    # Check if filename explicitly mentions "[Commercials]"
    if "[Commercials]" in filename:
        adjusted_runtime = local_runtime - 15  # conservative commercial estimate
    else:
        adjusted_runtime = local_runtime

    # Expanded tolerance accounts for commercials, cuts, etc.
    tolerance = 15
    filtered_candidates = [
        c for c in candidates
        if abs(c.runtime - adjusted_runtime) <= tolerance
    ]
    if len(filtered_candidates) == 1:
        return filtered_candidates[0].year
    # Zero or several matches: fall back to the existing behavior
    return apply_existing_disambiguation_logic(candidates)