[NEW FEATURE] Series Alias / Override Series Detection

Posted: 03 Oct 2012, 09:18
by rednoah

There's a common problem that keeps resurfacing in various forms: TheTVDB search not giving you any results even though the detected name makes perfect sense. In the worst case this might list the wrong series as "best match" and rename with wrong episode data. In the GUI it's easy to correct these things manually, but when running fully automated this is annoying.

I'm thinking it might be time to add a shared index of synonyms, or that kind of logic, to FileBot to work around external API deficiencies.

Rizolli and Isles => Rizolli & Isles
HIMYM => How I Met your Mother
Boss => Boss (2011) and not BOSS (2008)
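
A minimal sketch of what such an alias lookup could look like, written in Python purely for illustration. This is not FileBot code: the table name, both function names, and the normalization rule are assumptions, and the entries are just the three examples above.

```python
import re

# Hypothetical alias table; the entries are the examples from this post.
SERIES_ALIASES = {
    "rizolli and isles": "Rizolli & Isles",
    "himym": "How I Met your Mother",
    "boss": "Boss (2011)",
}


def normalize(name):
    """Lower-case the name and collapse punctuation/whitespace into single spaces."""
    return re.sub(r"[\W_]+", " ", name).strip().lower()


def resolve_series(detected_name):
    """Return the mapped series name, or the detected name unchanged if there is no alias."""
    return SERIES_ALIASES.get(normalize(detected_name), detected_name)


print(resolve_series("Rizolli.and.Isles"))
print(resolve_series("HIMYM"))
print(resolve_series("Firefly"))
```

Names without an entry fall through untouched, so the table only has to hold the handful of problem cases.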

That's my idea. I'm open to suggestions. I'm especially looking for people who can help provide these mappings, or who have a better idea that is less manual. ;)

As always, FileBot could use more people that help out and promote it. Writing reviews, showing your support financially, etc :P Running the forums is a shit-load of work as well, wouldn't mind making anybody with a bit of Groovy skills a mod right now.


Re: [NEW FEATURE] Series Alias / Override Series Detection

Posted: 04 Oct 2012, 08:43
by part timer
I haven't waded into the waters of renaming my TV episodes yet. So far I've only done my movies, and even then only finished a few days ago, finally.

Thank you for this program btw, even though I'm really only able to scratch the surface of it myself.

I think one idea that might work, but would need community involvement, and I'm sure you've thought of it already, is the same as the GROUPS posts. Start up a post for TV Shows with the filename (or at least the series portion of it, maybe the whole filename is best?) and what the results of the query SHOULD be rather than what they are now, either by a code or the name. As it's added to, let it get smarter and help the community. I think a big key for me is how quickly those lists actually get updated, always seems to be less than 24hrs to me, which makes you want to contribute. It feels like someone cares and it's worth taking the time to do it.

I wonder how hard it would be to have people send in rename histories and parse TV out of them. Probably too much to sift through. Hopefully those who already rename their shows know which ones cause the problems.

I guess there aren't really any new ideas in this post, I just thought I would throw in an opinion.

Thanks again for all your efforts!

Re: [NEW FEATURE] Series Alias / Override Series Detection

Posted: 04 Oct 2012, 09:29
by rednoah
Yep, I've thought about this many times and it's in the back of my head, but I'll never do it 'cause I'll be the one that has to maintain it. And with an open database that everybody can contribute to, how can I make sure nobody messes that up with bad data?

Not to mention that running a database like that would not just take lots of time (that I don't get paid for in any way) but also money to keep things running.

Also, while it's nice to know that some people feel it's worth the effort, 99.9% do not. OpenSubtitles has had the MovieHash system running for years, and FileBot uses that to grab the correct subtitle or identify your movie file. Do you remember when FileBot movie-mode was completely useless? That was when it was only using OSDB MovieHash for identification. It probably works alright for the popular stuff, but in normal usage it'll work for less than 10% of the stuff you wanna rename.

In the end OpenSubtitles had by far the best way of doing things, yet they're basically bankrupt while Addict7ed & Co make lots of money by making people visit the websites. OpenSubtitles can't do that with the MovieHash system as it defeats its very purpose of complete automation.

By now my logic works for almost everything that makes sense to normal people.

This project is kinda what you're suggesting (hashrenamer it's called, I believe). But to build a good database they'd have to get loads of users happy to contribute, and since 99.9% of users will throw it away once it makes them enter data, it'll never get there. OpenSubtitles never got there.

But hey, maybe I'm wrong. ;) That's just my personal experience with user contribution, no offence to the 0.1% of FileBot users that contributed in one way or another. You know who you are, also you're awesome!

Re: [NEW FEATURE] Series Alias / Override Series Detection

Posted: 04 Oct 2012, 09:43
by part timer
Ok, what about if you could let people have a file, preferably something with fields like a spreadsheet for filename title, dropdown box with the tv search engines you use, result name you'd like to get, db id (I'm just assuming they have this kind of feature like for imdb and tmdb).

Let it be local so everyone can adjust as they want, but have some kind of check that if multiple people have the same before and after names and id that it then gets added to some kind of global cache list.
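
Roughly what I mean by that check, as a Python sketch. Everything here is hypothetical: the function name, the tuple layout, the threshold of three users, and the db ids are placeholders, not anything FileBot or the databases actually provide.

```python
from collections import defaultdict


def promote_to_global(submissions, min_users=3):
    """Return the mappings that at least `min_users` distinct users agree on.

    `submissions` is an iterable of (user, filename_title, series_name, db_id).
    """
    voters = defaultdict(set)
    for user, title, series, db_id in submissions:
        # Identical before/after names and id count as the same mapping.
        voters[(title.lower(), series, db_id)].add(user)
    return sorted(key for key, users in voters.items() if len(users) >= min_users)


# Placeholder data; the ids 1 and 2 are made up.
submissions = [
    ("alice", "HIMYM", "How I Met your Mother", 1),
    ("bob", "himym", "How I Met your Mother", 1),
    ("carol", "HIMYM", "How I Met your Mother", 1),
    ("bob", "Boss", "Boss (2011)", 2),
    ("bob", "Boss", "Boss (2011)", 2),
]
print(promote_to_global(submissions))
```

Counting distinct users rather than raw submissions keeps one person from pushing a mapping into the global list alone.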

I don't know if this is possible, but I think it sounds kind of neat.

Re: [NEW FEATURE] Series Alias / Override Series Detection

Posted: 04 Oct 2012, 11:05
by rednoah
Nope, not gonna add any magic-config to the GUI, you select your db and you can force series/movie name via CTRL-clicking the db. That handles everything very efficiently.

For more automation there is Groovy, what you're suggesting is 1 line of scripting: ... =280#p1135


:idea: Here's an idea! Do this yourself! :idea:
Paste a script like that on pastebin, write a little tutorial and then keep updating the mappings. ;)

e.g. when called like this FileBot will grab the new script/mappings every day:

filebot -script ...

Re: [NEW FEATURE] Series Alias / Override Series Detection

Posted: 09 Oct 2012, 15:08
by rednoah
:: UPDATE ::

+ added new matching logic to balance file-lastModifiedDate against episode-airdate in order to fix issues related to having two TV shows with the same name from different years.

+ basic support for hard-coding filename->series lookup; designed primarily as a workaround for database search limitations and issues

@see series-mappings

This file is updated every 24h so if you find TheTVDB/TVRage rejecting the queries FileBot comes up with, I can add that anytime. Same as the release-groups list this list will be a community project.
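
For illustration only, here is a minimal Python sketch of the date-balancing idea from the first item above. It is not FileBot's actual code: the function name, the candidate structure, and the airdates are made up for the example, and it assumes that a file's last-modified date tends to fall close to the airdate of the episode it holds.

```python
from datetime import date


def pick_series_by_date(file_modified, candidates):
    """Pick the candidate series with an airdate closest to the file's last-modified date.

    `candidates` maps a series name to a list of its episode airdates.
    """
    def distance(airdates):
        return min(abs((file_modified - aired).days) for aired in airdates)

    return min(candidates, key=lambda name: distance(candidates[name]))


# Made-up airdates for two shows that share a name.
candidates = {
    "BOSS (2008)": [date(2008, 5, 1), date(2008, 5, 8)],
    "Boss (2011)": [date(2011, 10, 21), date(2011, 10, 28)],
}
print(pick_series_by_date(date(2011, 10, 30), candidates))
```

A real implementation would presumably weigh this against name similarity instead of using the date alone, since re-downloaded or copied files can carry misleading timestamps.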

Re: [NEW FEATURE] Series Alias / Override Series Detection

Posted: 11 Oct 2012, 18:50
by antekgla
First, congratulations on your program, it is A-MA-ZING.
I am a programmer myself and I wrote a much, much simpler utility named SRTFilter for processing the content of subtitle files and additionally renaming video files as well, so I know the huge effort it takes to write a program like yours. Kudos again.

Anyway... on topic now :D A more automated way to respond to failed searches in TheTVDB is to use Google's magic and look at the IMDB result, which always comes up with the right answer. I searched for the 3 lines of series-mappings.txt and the IMDB TV series result was always correct.

tv series Franklin and Bash
tv series HIMYM	
tv series Rizolli and Isles
I also ran other searches, like:

tv series colbert
tv series daily show
tv series jimmy fallon
And Google (always the IMDB result) gives the correct show name.

Re: [NEW FEATURE] Series Alias / Override Series Detection

Posted: 12 Oct 2012, 00:36
by rednoah
Scraping Google violates their terms of service, also your arse is wide open if they decide to f*ck you. :P

+ Search will get a lot smarter cause google is very very smart

- Violates Google's terms of service
- Need to work around Google's logic of blocking scrapers (the only reason you succeed at out-smarting Google is Google letting you 'cause they don't care)
- Scrapers are hell to maintain => I don't do shit-work, especially not for free
- Fragile, breaks every time Google decides to change their HTML or completely disable serving of static result pages

TheRenamer is still working so Google is actually playing nice. I checked the requests that one sends with Fiddler... It's not pretty. Just saying that scrapers are a hell to maintain in the long-term.

Also it makes things slow if I have to simulate human sluggishness to make Google not block me. Obviously I'd have to limit Google searches to a minimum, with logic for when to check Google and when not... really not as straightforward as one would think at first.

Well, writing this probably cost me more time than I will ever spend on maintaining the shared series-mappings; that's much simpler and more robust.

PS: There is no Java API for scraping Google. So I definitely can't be bothered. :P

Re: [NEW FEATURE] Series Alias / Override Series Detection

Posted: 12 Oct 2012, 01:57
by antekgla
I think that Google's Terms of Service have changed.
They have a Custom Search, but it requires a Google API key.
Free API keys are restricted to 100 queries a day, so... the only way is for every user to get their own free API key.
But that seems too complicated for an end user.
I agree, your approach is better, but it requires user contribution...

Re: [NEW FEATURE] Series Alias / Override Series Detection

Posted: 12 Oct 2012, 04:20
by rednoah
Didn't know they had a free REST API as well, but like you said, making people sign up individually wouldn't work.

How it works now is just a very minimal fix to work around stuff that's not working with TheTVDB search and there really is only a handful of problematic cases. Also it primarily helps with unattended mode, since with the GUI it'd just make you enter a valid query.

User contribution never works, except for a few individuals. Point is, whenever someone runs into one of these lookup issues I can just update that file and it'll work for everybody the next day.

Re: [NEW FEATURE] Series Alias / Override Series Detection

Posted: 14 Oct 2012, 12:48
by rednoah
:: UPDATE ::

With r1239 TheTVDB search will not just query the API but also run its own fuzzy search using a locally cached index of TheTVDB. This should fix any and all TheTVDB series lookup problems but also make search noticeably slower (if your computer is old and slow that is).
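
To give a rough idea of what fuzzy search over a local index means, here is a small Python sketch using the standard library's difflib. It only illustrates the general technique and is not how FileBot does it: the index contents, the function names, the normalization, and the 0.6 cutoff are all assumptions.

```python
import difflib
import re

# Stand-in for a locally cached index of series names.
LOCAL_INDEX = [
    "Rizolli & Isles",
    "How I Met your Mother",
    "Franklin & Bash",
    "Boss (2011)",
]


def normalize(name):
    """Lower-case, spell out '&' as 'and', and keep only letters and digits."""
    name = name.lower().replace("&", " and ")
    return " ".join(re.findall(r"[a-z0-9]+", name))


def fuzzy_search(query, index, limit=3, cutoff=0.6):
    """Return up to `limit` index entries whose normalized name is similar to the query."""
    by_key = {normalize(name): name for name in index}
    hits = difflib.get_close_matches(normalize(query), list(by_key), n=limit, cutoff=cutoff)
    return [by_key[hit] for hit in hits]


print(fuzzy_search("Rizolli and Isles", LOCAL_INDEX))
print(fuzzy_search("how i met ur mother", LOCAL_INDEX))
```

Comparing the query against every entry is what makes this slower than a plain API call, and the cost grows with the size of the cached index.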

PS: This change makes some of the stuff I posted above kinda obsolete.