Custom Scraper (e.g. Rename audiobooks via Audible, Rename Game ROMs via ChatGPT, etc.)

Post by **rednoah** » 24 Feb 2024, 05:36

Plain File Mode allows you to rename files based on information collected from arbitrary websites (e.g. an HTML webpage, a JSON API, or an XML API) via your own custom code. This is an advanced feature that requires some coding skills to develop and adapt to your needs. The examples below will get you started, though, and may work for your use case with little or no customization.

Examples

e.g. Rename Game ROMs via ChatGPT

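Prompt ChatGPT (or another OpenAI-compatible service) to propose a new file path for the file at hand.
NOTE: You will need your own API key (e.g. OPENAI_API_KEY / DEEPSEEK_API_KEY) and will have to pay for the corresponding API usage (e.g. OpenAI API usage / DeepSeek API usage).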

e.g. use OpenAI

Format:

{
	OpenAI(
		system: 'You are a helpful video game historian.',
		user: """
			Rename the following Game ROM file:
			"$fn"
			
			Use the format:
			<Console Name>/<Release Year> - <Game Title>
			
			Provide:
			<New File Path>
			
			Limit your response to a single valid file path without extension.
			""",
		url: 'https://api.openai.com/v1', model: 'gpt-4o-mini',
		key: '<INSERT YOUR API KEY HERE>'
	)
}

Code:

N64/1997 - GoldenEye

e.g. use DeepSeek

Format:

{
	OpenAI(
		system: 'You are a helpful video game historian.',
		user: """
			Rename the following Game ROM file:
			"$fn"
			
			Use the format:
			<Console Name>/<Release Year> - <Game Title>
			
			Provide:
			<New File Path>
			
			Limit your response to a single valid file path without extension.
			""",
		url: 'https://api.deepseek.com', model: 'deepseek-chat',
		key: '<INSERT YOUR API KEY HERE>'
	)
}

Code:

Nintendo 64/1997 - GoldenEye 007

e.g. use Google Gemini

Format:

{
	OpenAI(
		system: 'You are a helpful video game historian.',
		user: """
			Rename the following Game ROM file:
			"$fn"
			
			Use the format:
			<Console Name>/<Release Year> - <Game Title>
			
			Provide:
			<New File Path>
			
			Limit your response to a single valid file path without extension.
			""",
		url: 'https://generativelanguage.googleapis.com/v1beta/openai', model: 'gemini-2.0-flash',
		key: '<INSERT YOUR API KEY HERE>'
	)
}

Code:

Nintendo 64/1997 - Goldeneye
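The three formats above differ only in url and model. That works because OpenAI, DeepSeek and Google Gemini each accept OpenAI-style Chat Completions requests at <url>/chat/completions. The plain Groovy sketch below only builds such a request and sends nothing. Note that chatRequest is a hypothetical helper made up for illustration; it is an assumption about the general request shape, not a description of how the built-in OpenAI() function is actually implemented.

```groovy
import groovy.json.JsonOutput

// HYPOTHETICAL helper: builds (but does not send) an OpenAI-compatible chat request.
// ASSUMPTION: the built-in OpenAI() function may do this differently internally.
def chatRequest = { Map args ->
	def payload = [
		model   : args.model,
		messages: [
			[role: 'system', content: args.system],
			[role: 'user', content: args.user]
		]
	]
	[
		endpoint: args.url + '/chat/completions',
		headers : ['Authorization': 'Bearer ' + args.key, 'Content-Type': 'application/json'],
		body    : JsonOutput.toJson(payload)
	]
}

def request = chatRequest(
	url: 'https://api.deepseek.com', model: 'deepseek-chat',
	key: '<INSERT YOUR API KEY HERE>',
	system: 'You are a helpful video game historian.',
	user: 'Rename the following Game ROM file: "GoldenEye 007 (U) [!]"'
)

println request.endpoint
println JsonOutput.prettyPrint(request.body)
```

The file name in the user message is made up for the sake of the example. The proposed path would be found in choices[0].message.content of the JSON response, which is also why the same prompt can yield slightly different paths depending on the model.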

e.g. Rename audiobooks via Audible

Send HTTP requests to audible.com/search using the file name as the search query, then scrape information from the first result in the HTML response.

Groovy:

{ drive }/Audible/
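{
	// derive the search query from the file name by stripping leading / trailing track numbers and -PartNN suffixes
	def query = fn.space(' ').removeAll(/^\d+ - | - \d+$| \d+$|-Part\d+$/)
	def url = 'https://www.audible.com/search'.toURL(keywords: query, ipRedirectOverride: true, overrideBaseCountry: true)
	// pick the list that contains the first author label, i.e. the first search result
	def ul = html(url).select('li.authorLabel').first().parent()

	def title = ul.select('h3 a').text()
	def author = ul.select('li.authorLabel a').text()
	def narrator = ul.select('li.narratorLabel a').text()

	def series = ul.select('li.seriesLabel a').text()
	// use the last word of the series label as book number, or null if there is no series label
	def book = any{ ul.select('li.seriesLabel').text().tokenize().last() }{ null }

	// group by series if the book is part of one
	if (series) {
		"$author/$series/$book - $title (narrated by $narrator)"
	} else {
		"$author/$title (narrated by $narrator)"
	}
}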

The example code above assumes that the files at hand are already well-named. The file name, minus leading or trailing track and part numbers, is used as the search query for audible.com/search. If your files are named differently, you may need to modify the code and add logic that extracts an appropriate search query for each file. The code above is written for file names like these:

Code:

The Hobbit.mp3
The Awkward Thoughts of W Kamau Bell-Part01.mp3
Steve Berry - The Emperors Tomb 01.mp3
001 - Prodigal_Son.mp3
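To check what search query each of these file names turns into, the cleanup step can be tried in plain Groovy outside of FileBot. This is only a sketch: toQuery is a hypothetical helper, and it assumes that space(' ') can be approximated by turning underscores and runs of whitespace into single spaces, and removeAll(regex) by replaceAll(regex, '').

```groovy
// HYPOTHETICAL helper approximating fn.space(' ').removeAll(...) with standard String methods.
// ASSUMPTION: space(' ') ~ collapse underscores / whitespace, removeAll(regex) ~ replaceAll(regex, '')
def toQuery = { String name ->
	def spaced = name.replaceAll(/[\s_]+/, ' ')
	spaced.replaceAll(/^\d+ - | - \d+$| \d+$|-Part\d+$/, '').trim()
}

// file names without extension, as fn would provide them
def names = [
	'The Hobbit',
	'The Awkward Thoughts of W Kamau Bell-Part01',
	'Steve Berry - The Emperors Tomb 01',
	'001 - Prodigal_Son'
]

names.each { name -> println "${name} -> ${toQuery(name)}" }

// prints:
// The Hobbit -> The Hobbit
// The Awkward Thoughts of W Kamau Bell-Part01 -> The Awkward Thoughts of W Kamau Bell
// Steve Berry - The Emperors Tomb 01 -> Steve Berry - The Emperors Tomb
// 001 - Prodigal_Son -> Prodigal Son
```

If your own file names follow a different pattern, adding them to the list is a quick way to test a modified regex before running it against audible.com.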

e.g. Custom Format Server

Send HTTP requests to a custom format server, passing the file path as the f query parameter, and use the JSON response as the target file path:

Format:

{ json('http://localhost:8080/'.toURL(f:f)) }
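The server side is entirely up to you. As a minimal sketch, the standalone Groovy script below uses the HTTP server that ships with the JDK to answer GET requests on localhost:8080. It rests on two assumptions: the file path arrives URL-encoded in the f query parameter, and a single JSON string in the response body is enough for json(...) to yield the new path. The proposePath rule is hypothetical and only there to have something to return.

```groovy
import com.sun.net.httpserver.HttpExchange
import com.sun.net.httpserver.HttpHandler
import com.sun.net.httpserver.HttpServer
import groovy.json.JsonOutput

// HYPOTHETICAL naming rule for illustration: drop folder and extension, file under "Sorted/"
def proposePath = { String path ->
	def name = new File(path).name.replaceFirst(/\.[^.]+$/, '')
	'Sorted/' + name
}

def handler = { HttpExchange exchange ->
	// parse the raw query string, e.g. f=%2Froms%2FGoldenEye.z64
	def params = [:]
	(exchange.requestURI.rawQuery ?: '').split('&').each { pair ->
		def parts = pair.split('=', 2)
		if (parts.length == 2) {
			params[parts[0]] = URLDecoder.decode(parts[1], 'UTF-8')
		}
	}

	// ASSUMPTION: reply with a single JSON string that holds the new file path
	def body = JsonOutput.toJson(proposePath(params.f ?: '')).getBytes('UTF-8')
	exchange.responseHeaders.add('Content-Type', 'application/json')
	exchange.sendResponseHeaders(200, body.length)
	exchange.responseBody.withStream { out -> out.write(body) }
} as HttpHandler

// listen on the same host and port that the format above requests
def server = HttpServer.create(new InetSocketAddress('localhost', 8080), 0)
server.createContext('/', handler)
server.start()
println 'Format server listening on http://localhost:8080/'
```

Keeping the naming logic in its own closure means it can be swapped for a database lookup or any other custom rule without touching the HTTP plumbing.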