Use Internal Cache

Posted: 24 Oct 2018, 03:06
by kim
Can you help me?

I'm trying to use the internal cache. The code below fetches the info and writes it to a file (the cache), but how do I do the rest of what e.g. request() does?
Read/write the data file
Check whether the file has the data and how old it is
Anything else?

I need to use text(), otherwise FileBot will not write the data.

Code:

{
import net.filebot.Cache; 
import net.sf.ehcache.Element;
import net.filebot.CacheType; 

import org.jsoup.Jsoup;

def IMDbInfo = Jsoup.connect('http://www.imdb.com/title/'+'tt0499549').userAgent('Mozilla/5.0').get().select("div.imdbRating")
def cache = Cache.getCache('IMDbInfo', CacheType.Weekly);
def key = 'IMDbInfo';
def Element putData = new Element(key, IMDbInfo.text());
cache.put(key, putData);
def Element getData = cache.get(key);
getData.getValue()
}
btw: Ehcache v3 is available, in case it would work better for FileBot.

Re: Use Internal Cache

Posted: 24 Oct 2018, 07:28
by rednoah
Here's an example of how I'd do it:

Code:

def id = 499549
def rating = Cache.getCache("IMDbInfo.Rating", CacheType.Weekly).computeIfAbsent(id) {
	println "Fetch IMDbInfo.Rating: ${id.pad(7)}"
	return org.jsoup.Jsoup.connect("http://www.imdb.com/title/tt${id.pad(7)}")
	            .userAgent('Mozilla/5.0')
	            .get()
	            .select("div.imdbRating")
	            .text()
}
That's the easiest way. All the read/write/check/date/time is abstracted away internally, so if you want fine-grained control, then things get a lot more complicated.
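
To give a rough idea of what gets abstracted away, here is a minimal sketch of a time-limited computeIfAbsent in plain Groovy. The ExpiringCache class and its in-memory map are made up for illustration; this is not FileBot's implementation, which also persists entries to disk.

```groovy
// Hypothetical stand-in for a time-limited cache; not FileBot's implementation.
class ExpiringCache {
    private final long maxAgeMillis
    private final Map<Object, List> entries = [:]   // key -> [value, timestamp]

    ExpiringCache(long maxAgeMillis) { this.maxAgeMillis = maxAgeMillis }

    def computeIfAbsent(key, Closure compute) {
        def now = System.currentTimeMillis()
        def entry = entries[key]
        // Reuse the stored value only if it exists and is still fresh
        if (entry != null && now - entry[1] < maxAgeMillis) {
            return entry[0]
        }
        // Otherwise compute the value and remember when it was stored
        def value = compute(key)
        entries[key] = [value, now]
        return value
    }
}

def fetchCount = 0
def cache = new ExpiringCache(7L * 24 * 60 * 60 * 1000)   // one week

// The closure only runs on the first call; the second call is served from the cache
def first = cache.computeIfAbsent(499549) { id -> fetchCount++; "rating for ${id}".toString() }
def second = cache.computeIfAbsent(499549) { id -> fetchCount++; "rating for ${id}".toString() }
```

The caller never checks whether the entry exists or how old it is; that decision lives in one place inside computeIfAbsent.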


ehcache v2 works best for what FileBot needs. I tried v3 some time ago and had to revert everything, because disk persistence across restarts (i.e. what FileBot needs) didn't work at all (or wasn't even supported). Most people use ehcache for servers and other long-running processes, not for desktop applications that launch and quit often.

Re: Use Internal Cache

Posted: 24 Oct 2018, 23:25
by kim
thx, you saved me a lot of time ;)

Re: Use Internal Cache

Posted: 25 Oct 2018, 03:45
by kim
This is what I have done so far. Please point out any problems or anything that could be done better...

This gets info from IMDb for TV shows and movies. If imdbid is not defined, the IMDb ID of the episode is looked up via TheMovieDB:

Code:

{def season = (any{special ? 0 : s} {s}); 
def episode = (any{special ? special : e} {e}); 
def eAddOns = ['append_to_response':'external_ids']; 
def idTest = any{imdbid}{null}

def episodeInfo = (idTest == null) ? net.filebot.WebServices.TheMovieDB.request("tv/${id}/season/${season}/episode/${episode}", eAddOns, Locale.US) : null; 
def epImdbId = any{episodeInfo.external_ids.imdb_id}{null}; 

def IMDbId = any{imdbid}{epImdbId}

// Cache by IMDbId rather than id, so that each episode gets its own entry
def data = (IMDbId != null) ? net.filebot.Cache.getCache("IMDbData_JSON", net.filebot.CacheType.Weekly).computeIfAbsent(IMDbId) {
	return org.jsoup.Jsoup.connect("http://www.imdb.com/title/${IMDbId}").userAgent('Mozilla/5.0').get().select("script[type=\"application/ld+json\"]").html().replaceAll(/\s\s+/)
} : null;

def JSON = (IMDbId != null) ? new groovy.json.JsonSlurper().parseText( data ) : null

def iTitle = any{JSON.name}{null}
def iYear = any{new Date().parse("yyyy-MM-dd", JSON.datePublished).format('yyyy')}{null}
def iPremiered = any{JSON.datePublished}{null}
def iRuntime = any{JSON.duration.replaceAll(/PT/).lower()}{null}
def iRating = any{JSON.aggregateRating.ratingValue as double}{null}
def iVotes = any{JSON.aggregateRating.ratingCount as int}{null}
def iGenre = any{JSON.genre}{null}
def iCertification = any{JSON.contentRating}{null}
def iDirector = any{JSON.director.name}{null}
def iWriters = any{JSON.creator.name.minus(null).unique()}{null}
def iActor = any{JSON.actor.name}{null}
def iKeywords = any{JSON.keywords}{null}
def iPlot = any{JSON.description}{null}
def iPoster = any{new URL(JSON.image)}{null}
//iYear and iPremiered may be wrong
allOf{iTitle}{iYear}{iPremiered}{iRuntime}{iRating}{iVotes}{iGenre}{iCertification}{iDirector}{iWriters}{IMDbId}{iPoster}

}
btw: I found one movie where datePublished is the last release date rather than the first, so iYear and iPremiered may be wrong.

Re: Use Internal Cache

Posted: 25 Oct 2018, 12:28
by rednoah
Looks good to me. Didn't know they have a kinda-JSON-API baked into the website now. That's new! :lol:

You could consider using fewer variables and more of a tree-structure approach, e.g.

Code:

allOf
    {JSON.name}
    {new Date().parse("yyyy-MM-dd", JSON.datePublished).format('yyyy')}
    {JSON.datePublished}
On the other hand, your code might be easier for others to pick and choose from, because variables make it easier to understand and reuse. Either way is fine.
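
For readers unfamiliar with any{...}{...} and allOf{...}{...}: roughly speaking, the first yields the first value that can be computed, and the second collects every value that can be computed. The sketch below approximates that behaviour in plain Groovy. The helper names firstOf and everyOf and the sample data are made up for illustration, and this is not FileBot's own code.

```groovy
// Hypothetical helpers that approximate the behaviour; not FileBot's own code.
def firstOf(Closure... candidates) {
    for (c in candidates) {
        try {
            def value = c.call()
            if (value != null) return value
        } catch (Exception ignored) {
            // a failing candidate just means: try the next one
        }
    }
    return null
}

def everyOf(Closure... candidates) {
    def results = []
    for (c in candidates) {
        try {
            def value = c.call()
            if (value != null) results << value
        } catch (Exception ignored) {
            // skip values that cannot be computed
        }
    }
    return results
}

def json = [name: 'Example Title', datePublished: '2018-01-01']   // made-up data

// aggregateRating is missing, so the first closure fails and the fallback is used
def rating = firstOf({ json.aggregateRating.ratingValue as double }, { null })

// director is missing, so only the two values that can be computed are collected
def fields = everyOf({ json.name }, { json.datePublished.split('-').first() }, { json.director.name })
```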

Re: Use Internal Cache

Posted: 25 Oct 2018, 16:46
by kim
It was news to me too. I discovered it when looking at the HTML source.

It's called "JSON-LD", here is some info:
https://en.wikipedia.org/wiki/JSON-LD
https://schema.org/
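
As a small illustration of what such a JSON-LD block looks like once parsed, here is a sketch using JsonSlurper. The payload is made up (the field names follow schema.org, the values are placeholders), so treat it as an assumption about the shape of the data rather than what IMDb actually returns.

```groovy
import groovy.json.JsonSlurper
import java.time.Duration

// Made-up JSON-LD payload; field names follow schema.org, values are placeholders.
def jsonLd = '''
{
  "@context": "http://schema.org",
  "@type": "Movie",
  "name": "Example Movie",
  "datePublished": "2018-01-01",
  "duration": "PT2H42M",
  "aggregateRating": { "@type": "AggregateRating", "ratingValue": "7.8", "ratingCount": 1000 }
}
'''

def info = new JsonSlurper().parseText(jsonLd)

def title = info.name
def rating = info.aggregateRating.ratingValue as double
def votes = info.aggregateRating.ratingCount as int
// schema.org durations use the ISO 8601 format, which java.time can parse directly
def minutes = Duration.parse(info.duration).toMinutes()
```

Parsing the duration with java.time.Duration is one alternative to stripping the PT prefix by hand.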

Re: Use Internal Cache

Posted: 27 Oct 2018, 00:38
by kim
For some reason this no longer works in the new FileBot:

Code:

{new Date().parse("yyyy-MM-dd", JSON.datePublished).format('yyyy')}
Whether it's an internal change or a Java change, I don't know.

Re: Use Internal Cache

Posted: 27 Oct 2018, 10:43
by rednoah
I dunno. Date utility functions seem to be in flux with recent Groovy releases.

Might be easier to just stick to something dead simple like this:

Code:

{
	def date = '2018-01-01'
	def year = date.split('-').first()
}
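
If splitting strings feels too crude, another option that does not depend on Groovy's Date extensions is java.time.LocalDate. This is only a sketch, and it assumes the date string is always in yyyy-MM-dd format; anything else makes LocalDate.parse throw a DateTimeParseException.

```groovy
import java.time.LocalDate

def date = '2018-01-01'
// LocalDate.parse expects the ISO format (yyyy-MM-dd) by default
def parsed = LocalDate.parse(date)
def year = parsed.year
```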