Custom Page Scraper (e.g. Rename audiobooks via Audible)

Post by rednoah »

Plain File Mode allows you to rename files based on information collected from arbitrary websites (e.g. an HTML webpage, a JSON API, or an XML API) via your own custom code. This is an advanced feature and requires some coding skills to develop and adapt to your needs. The examples below will get you started, though, and may work for your use case with little or no customization.


e.g. Audible

Send HTTP requests to audible.com/search using the file name as the search query, then scrape information from the first result in the HTML response.

Groovy:

{
	// use the file name as search query
	def query = fn.space(' ')
	def url = 'https://www.audible.com/search'.toURL(keywords: query, ipRedirectOverride: true, overrideBaseCountry: true)

	// locate the first author label in the search results and treat its parent element as the first result
	def ul = html(url).select('li.authorLabel').first().parent()

	def title = ul.select('h3 a').text()
	def author = ul.select('li.authorLabel a').text()
	def narrator = ul.select('li.narratorLabel a').text()

	// series name and book number (last token of the series label), if any
	def series = ul.select('li.seriesLabel a').text()
	def book = any{ ul.select('li.seriesLabel').text().tokenize().last() }{ null }

	// books that are part of a series go into a series folder
	if (series) {
		"~/Audible/$author/$series/$book - $title (narrated by $narrator)"
	} else {
		"~/Audible/$author/$title (narrated by $narrator)"
	}
}

Note: The example code above assumes that the files at hand are already well-named, like the file names listed below. The file name is used verbatim as the search query for audible.com/search, so if your files are named differently, you will need to modify the code and add some logic that extracts an appropriate search query for each file.

Code:

The Hobbit by J. R. R. Tolkien.mp3
The Hobbit.mp3




e.g. Custom Format Server

Send HTTP requests to a custom format server and use the JSON response as the target file path:

Format:

{ json('http://localhost:8080/'.toURL(f:f)) }
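
For reference, here is a minimal sketch of what the server side of that format could look like. It is not part of FileBot. It is written in Python purely as an assumption, since the server can be any program that answers the GET request with a JSON value. The sketch assumes that the format sends the file path as query parameter f and that a JSON string in the response body is used as the target path. The naming rule in target_path, the Audiobooks folder and the serve helper are hypothetical placeholders for your own logic.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import PureWindowsPath
from urllib.parse import parse_qs, urlparse


def target_path(file_path):
    """Hypothetical naming rule: Audiobooks/<file name>/<file name>.

    PureWindowsPath accepts both / and \\ as separators, so the same
    code handles Windows and Unix style input paths.
    """
    title = PureWindowsPath(file_path).stem  # file name without extension
    # The extension is left off on the assumption that FileBot adds it.
    return f"Audiobooks/{title}/{title}"


class FormatHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read the f parameter from a request such as /?f=/path/to/file.mp3
        params = parse_qs(urlparse(self.path).query)
        files = params.get("f")
        if not files:
            self.send_error(400, "missing query parameter f")
            return

        # Reply with a single JSON string, e.g. "Audiobooks/The Hobbit/The Hobbit"
        body = json.dumps(target_path(files[0])).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Listen on localhost:<port> until interrupted."""
    HTTPServer(("localhost", port), FormatHandler).serve_forever()
```

Call serve() to start listening on localhost:8080, then rename in Plain File Mode with the format above. Keeping the naming rule in a plain function like target_path makes it easy to try out on a few sample paths before any files are renamed.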