Invaild XML Error

Posted: 15 Nov 2019, 02:42
by ImpactPlayers
Hi,

Can someone please tell me what this error means and how to fix it? Thanks in advance, Sharon

https://snipboard.io/AnzOgc.jpg

Re: Invaild XML Error

Posted: 15 Nov 2019, 04:51
by rednoah
Looks like FileBot is requesting a resource that is expected to be XML, but unexpectedly receives an HTML error page. The logs will tell you more.
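
To illustrate the point: an XML parser rejects an HTML error page as soon as it meets a tag that HTML allows to stay open, such as <hr>. The following is a minimal Java sketch using the JDK's standard DOM parser. The class name HtmlAsXmlDemo, the helper xmlParseError, and both sample bodies are made up for illustration and are not FileBot code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; not part of FileBot.
class HtmlAsXmlDemo {

    // Returns null if the body is well-formed XML, otherwise the parser's error message.
    static String xmlParseError(String body) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8)));
            return null;
        } catch (SAXParseException e) {
            // The document is not well-formed XML (e.g. an unclosed <hr>).
            return e.getMessage();
        } catch (Exception e) {
            // Parser configuration or I/O problems are not expected for an in-memory string.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<data><item>1</item></data>";               // sample XML (assumed)
        String html = "<html><body><hr><p>moved</p></body></html>"; // sample HTML error page (assumed)
        System.out.println("XML:  " + xmlParseError(xml));
        System.out.println("HTML: " + xmlParseError(html));
    }
}
```

The well-formed body yields null, while the HTML body yields the parser's error message, which is the kind of text that ends up in the FileBot log.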


:idea: Please read How to Request Help.

Re: Invaild XML Error

Posted: 15 Nov 2019, 06:37
by GhostOfSparta
I think TheTVDB's API just changed: it now sits behind CloudFront (the AWS CDN), which enforces a redirect from http to https, and FileBot isn't following that redirect properly on some (possibly older?) versions.

Code: Select all

Rename episodes using [TheTVDB]
Auto-detected query: [young sheldon]
Fetch failed: http://thetvdb.com/api/694FAD89942D3827/mirrors.xml
net.filebot.InvalidResponseException: Invalid XML: SAXParseException: The element type "hr" must be terminated by the matching end-tag "</hr>".
<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<hr><center>CloudFront</center>
</body>
</html>
When I curl that address, I get this:

Code: Select all

filebot# curl -vvv http://thetvdb.com/api/694FAD89942D3827/mirrors.xml

*   Trying 13.224.29.21...
* TCP_NODELAY set
* Connected to thetvdb.com (13.224.29.21) port 80 (#0)
> GET /api/694FAD89942D3827/mirrors.xml HTTP/1.1
> Host: thetvdb.com
> User-Agent: curl/7.52.1
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: CloudFront
< Date: Fri, 15 Nov 2019 06:29:36 GMT
< Content-Type: text/html
< Content-Length: 183
< Connection: keep-alive
< Location: https://thetvdb.com/api/694FAD89942D3827/mirrors.xml
< X-Cache: Redirect from cloudfront
< Via: 1.1 0cf6c59c77f0fff670ae085179adc459.cloudfront.net (CloudFront)
< X-Amz-Cf-Pop: SEA19-C1
< X-Amz-Cf-Id: HZR-R_MzSSCICCYUZM-9gI21KCZOL0--Z02roPV-BHqDQElnHVOsSQ==
<
<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<hr><center>CloudFront</center>
</body>
</html>
* Curl_http_done: called premature == 0
* Connection #0 to host thetvdb.com left intact
And when I add -L to follow the redirect, I see this:

Code: Select all

curl -L -vvv http://thetvdb.com/api/694FAD89942D3827/mirrors.xml

*   Trying 99.86.129.127...
* TCP_NODELAY set
* Connected to thetvdb.com (99.86.129.127) port 80 (#0)
> GET /api/694FAD89942D3827/mirrors.xml HTTP/1.1
> Host: thetvdb.com
> User-Agent: curl/7.52.1
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: CloudFront
< Date: Fri, 15 Nov 2019 06:33:27 GMT
< Content-Type: text/html
< Content-Length: 183
< Connection: keep-alive
< Location: https://thetvdb.com/api/694FAD89942D3827/mirrors.xml
< X-Cache: Redirect from cloudfront
< Via: 1.1 320ce1e804be3f51c59bf2aa0a75b4b6.cloudfront.net (CloudFront)
< X-Amz-Cf-Pop: HIO51-C1
< X-Amz-Cf-Id: qrj3BvTdO6oR-K8csnDDpxWQCzaM86XcPc5OT6GVrkF0fSC4y8Xyqg==
<
* Ignoring the response-body
* Curl_http_done: called premature == 0
* Connection #0 to host thetvdb.com left intact
* Issue another request to this URL: 'https://thetvdb.com/api/694FAD89942D3827/mirrors.xml'
*   Trying 99.86.129.127...
* TCP_NODELAY set
* Connected to thetvdb.com (99.86.129.127) port 443 (#1)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=*.thetvdb.com
*  start date: May 23 00:00:00 2019 GMT
*  expire date: Jun 23 12:00:00 2020 GMT
*  subjectAltName: host "thetvdb.com" matched cert's "thetvdb.com"
*  issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x56395d098e80)
> GET /api/694FAD89942D3827/mirrors.xml HTTP/1.1
> Host: thetvdb.com
> User-Agent: curl/7.52.1
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 403
< content-type: application/json
< content-length: 42
< date: Fri, 15 Nov 2019 06:33:27 GMT
< x-amzn-requestid: 07e753b1-be7d-4f5e-974b-4002b91f0a2a
< x-amzn-errortype: MissingAuthenticationTokenException
< x-amz-apigw-id: DL_8sEc4PHcF1Mg=
< via: 1.1 0005a84c2971ff4f5bbb79e7ebc622a9.cloudfront.net (CloudFront), 1.1 686ace1321107362da87839adf526fc6.cloudfront.net (CloudFront)
< x-amz-cf-pop: HIO50-C1
< x-cache: Error from cloudfront
< x-amz-cf-pop: HIO51-C1
< x-amz-cf-id: NRoZii6VwRScGQ19lgZ9GAOtFaSJd3gd62T25dMWaaU_PR5dB1VMPw==
<
* Curl_http_done: called premature == 0
* Connection #1 to host thetvdb.com left intact
{"message":"Missing Authentication Token"}
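So it seems TheTVDB mostly broke backwards compatibility recently: the redirect to https works, but the old XML endpoint then answers with a 403 "Missing Authentication Token". I'm guessing newer versions of FileBot don't have this problem because support for the new JSON API has been added.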

Re: Invaild XML Error

Posted: 15 Nov 2019, 11:18
by rednoah
1.
There are massive issues across the board if you're using TheTVDB, and I suppose both the XML v1 API and the JSON v2 API are broken to some degree for the time being:
https://forums.thetvdb.com/viewtopic.php?f=17&t=60029


2.
FileBot switched to the JSON v2 API about 2 years ago. But if you're using an older release, then that's likely gonna be using the XML v1 API.

:idea: As for redirects, IDK how that's handled, whether it's handled by FileBot or the JDK, or whether you can perhaps add some Java system property to enable it. If redirect-follow was never necessary in the past, then it was likely never tested, so it's very possible that it just doesn't work.
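
On the JDK side, for what it's worth: java.net.HttpURLConnection follows redirects by default, but it does not follow a redirect that switches protocol, and http to https is exactly what the CloudFront 301 above asks for. A small Java sketch of that check, assuming the client sees the request URL and the Location header; RedirectCheck and changesProtocol are hypothetical names, not FileBot or JDK API.

```java
import java.net.URI;

// Hypothetical helper; not part of FileBot or the JDK.
class RedirectCheck {

    // True if following `location` from `requested` would change the URL scheme,
    // e.g. http -> https. A relative Location is resolved against the request URL.
    static boolean changesProtocol(String requested, String location) {
        URI from = URI.create(requested);
        URI to = from.resolve(location);
        return !from.getScheme().equalsIgnoreCase(to.getScheme());
    }

    public static void main(String[] args) {
        // Request URL and Location header taken from the curl output earlier in the thread.
        System.out.println(changesProtocol(
                "http://thetvdb.com/api/694FAD89942D3827/mirrors.xml",
                "https://thetvdb.com/api/694FAD89942D3827/mirrors.xml"));
    }
}
```

For the request and Location header from the curl output, this reports a protocol change, so the client would have to follow the redirect itself.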

:arrow: filebot -script fn:sysinfo will tell us what version / revision of FileBot you're using.

Re: Invaild XML Error

Posted: 15 Nov 2019, 17:22
by kim
Sounds like you are using an old version of FileBot. Older versions had a related problem that was fixed long ago:
viewtopic.php?f=6&t=5329&hilit=https+thetvdb#p30200
viewtopic.php?f=6&t=5329&p=30634&hilit= ... vdb#p30315

Maybe you can "fix" it like I did back then, but it's better to use TheMovieDB or upgrade to (buy) the new FileBot.

Re: Invaild XML Error

Posted: 15 Nov 2019, 23:00
by ImpactPlayers
Thanks for all the comments. I ran filebot -script fn:sysinfo and got the following results:

Code: Select all

Microsoft Windows [Version 10.0.18363.476]
(c) 2019 Microsoft Corporation. All rights reserved.

C:\WINDOWS\system32>filebot -script fn:sysinfo
FileBot 4.7.9 (r4984)
JNA Native: 5.1.0
MediaInfo: 0.7.93
7-Zip-JBinding: 9.20
Chromaprint: 1.4.2
Extended Attributes: OK
Unicode Filesystem: OK
Script Bundle: 2018-03-16 (r516)
Groovy: 2.4.10
JRE: Java(TM) SE Runtime Environment 1.8.0_231
JVM: 64-bit Java HotSpot(TM) 64-Bit Server VM
CPU/MEM: 4 Core / 3 GB Max Memory / 24 MB Used Memory
OS: Windows 10 (amd64)
Package: MSI

------------------- UPDATE AVAILABLE: FileBot 4.8.5 (r6224) --------------------

Done ヾ(@⌒ー⌒@)ノ

C:\WINDOWS\system32>

Re: Invaild XML Error

Posted: 16 Nov 2019, 06:02
by rednoah
Looks like you'll have to wait for the new-old TheTVDB v1 API to restore backwards-compatibility with existing software.

viewtopic.php?t=11254

Re: Invaild XML Error

Posted: 26 Nov 2019, 16:37
by tarad10
The issue is here: https://thetvdb.com/api/694FAD89942D3827/mirrors.xml. The mirrorpath node is returning the non-secure TVDB URL. Older versions of FileBot build the cache off of that. The easiest workaround mentioned is to use the movie database for television. If that is not granular enough, you can either buy a license for the new version or modify an old instance of the code. Version 4.6.2 is still available via SourceForge. You can either modify the WebRequest class to handle redirects or you can replace "http" with "https" when the TheTVDBClient class gets the mirror and populates the cache.
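
The second option (swapping http for https when the mirror is read) can be sketched in a few lines of Java. This is only an illustration: MirrorUrls and forceHttps are hypothetical names, and where exactly FileBot's TheTVDBClient would call such a helper is an assumption, since the real source isn't shown here.

```java
// Hypothetical helper; not taken from the FileBot source.
class MirrorUrls {

    private static final String HTTP = "http://";

    // Upgrade an http:// mirror URL to https://; anything else is returned unchanged.
    static String forceHttps(String mirror) {
        if (mirror != null && mirror.regionMatches(true, 0, HTTP, 0, HTTP.length())) {
            return "https://" + mirror.substring(HTTP.length());
        }
        return mirror;
    }

    public static void main(String[] args) {
        // The non-secure value that the mirrorpath node returns, per the post above.
        System.out.println(forceHttps("http://thetvdb.com"));
    }
}
```

Applying the helper to the value read from the mirrorpath node before it is cached would keep later requests on https and avoid the 301 altogether.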

Re: Invaild XML Error

Posted: 27 Nov 2019, 01:19
by 2devnull
tarad10 wrote: 26 Nov 2019, 16:37 The issue is here: https://thetvdb.com/api/694FAD89942D3827/mirrors.xml. The mirrorpath node is returning the non-secure TVDB URL. Older versions of FileBot build the cache off of that. The easiest workaround mentioned is to use the movie database for television. If that is not granular enough, you can either buy a license for the new version or modify an old instance of the code. Version 4.6.2 is still available via SourceForge. You can either modify the WebRequest class to handle redirects or you can replace "http" with "https" when the TheTVDBClient class gets the mirror and populates the cache.
Where can one locate the 4.7.9 source to fix it?

Re: Invaild XML Error

Posted: 28 Nov 2019, 16:09
by 2devnull
Never mind, I was able to create a proxy that edits the mirrors.xml file before FileBot sees it.

Re: Invaild XML Error

Posted: 29 Nov 2019, 02:50
by kim
What kind of proxy?
What do you mean by "edit the mirrors.xml file"?

btw: https://forums.thetvdb.com/viewtopic.ph ... 9&p=163719

So it's fixed ?
Here are the current known API-related issues with released fixes:
mirrors.xml doesn't exist — Released 11/17

Re: Invaild XML Error

Posted: 30 Nov 2019, 11:37
by uder
tarad10 wrote: 26 Nov 2019, 16:37 The easiest workaround mentioned is to use the movie database for television.
Anyone know how I can do this in my uTorrent script (automated media centre)?
Can't seem to get this working.

Code: Select all

filebot -script fn:amc --output "P:/Media" --log-file C:/Users/xxx/amc_log.txt --action copy --conflict auto -non-strict --db TheMovieDB::TV --def music=n --def plex=localhost --def clean=y subtitles=en artwork=n reportError=y --def "ut_label=%L" "ut_state=%S" "ut_title=%N" "ut_kind=%K" "ut_file=%F" "ut_dir=%D"

Re: Invaild XML Error

Posted: 30 Nov 2019, 17:43
by rednoah
What does the log say?


:idea: Please read How to Request Help.

Re: Invaild XML Error

Posted: 30 Nov 2019, 22:27
by uder
Still uses thetvdb and fails:

Code: Select all

Failure (°_°)
Run script [fn:amc] at [Sat Nov 30 23:05:04 AEDT 2019]
Parameter: seriesDB = TheMovieDB::TV
...
Rename episodes using [TheTVDB]
...
Fetch failed: http://thetvdb.com/api/694FAD89942D3827/mirrors.xml

Re: Invaild XML Error

Posted: 01 Dec 2019, 02:04
by kim
this is why:
rednoah wrote: 16 Nov 2019, 06:19 Yes, newer revisions of the amc script support a series of --def *DB options:

Code: Select all

--def seriesDB="TheMovieDB::TV"
:!: You may need to upgrade to the latest beta though, for this option to work.

:!: The --db option has no effect on the amc script, since the amc script forces different db values internally depending on what movie/series/anime/music/ etc auto-detection says.
Also, if you use this, it will fail again because TheTVDB is hardcoded:

Code: Select all

--def artwork=y
https://github.com/filebot/scripts/blob ... roovy#L387

Re: Invaild XML Error

Posted: 01 Dec 2019, 02:43
by Chrrs
tarad10 wrote: 26 Nov 2019, 16:37 The issue is here: https://thetvdb.com/api/694FAD89942D3827/mirrors.xml. The mirrorpath node is returning the non-secure TVDB URL. Older versions of FileBot build the cache off of that. The easiest workaround mentioned is to use the movie database for television. If that is not granular enough, you can either buy a license for the new version or modify an old instance of the code. Version 4.6.2 is still available via SourceForge. You can either modify the WebRequest class to handle redirects or you can replace "http" with "https" when the TheTVDBClient class gets the mirror and populates the cache.
I modified the Java bytecode of the getMirror() method to bypass the cache and always return https://thetvdb.com. I'm not a Java programmer and I'm not sure whether this will be viable long term, but my deluge+filebottool Unraid docker is now finally able to process my TV shows.

A proxy is probably the best solution, though. I just had issues getting nginx to work correctly, and going through the decompiled Java source and playing around with bytecode editing sounded more fun.

If anyone wants to reproduce what I did:

You will need a Java bytecode editor such as https://github.com/Col-E/Recaf. Download Recaf and then run it. Go to File -> Load and then point it to your FileBot.jar. From the left side menu go to net/filebot/web and open the TheTVDBClientV1 class.

Click the Methods tab. Find the getMirror method and open it. Click the Edit Instructions button.

Find the instruction that goes: "IF_ACMPNE LABEL C" and double click on it. Change the opcode to "IF_ACMPEQ" instead.

A couple of lines down, find the instruction that goes LDC "http://thetvdb.com" and double click on it. Change its value to https://thetvdb.com (enter it without quotes).

The code should now look like this: https://imgur.com/a/bWBZzEd

Close all the windows you just opened to go back to the main Recaf screen. Go to File -> Export and save this as FileBot.jar.

You can now replace your existing FileBot.jar with your newly edited copy. I also cleared my FileBot's cache by running: filebot -clear-cache.

Re: Invaild XML Error

Posted: 01 Dec 2019, 08:03
by uder
Worked, thank you!

Re: Invaild XML Error

Posted: 01 Dec 2019, 15:25
by Dannzi
Chrrs wrote: 01 Dec 2019, 02:43
I modified the Java bytecodes for ...
Any chance you could upload that FileBot.jar file somewhere? I downloaded Recaf, but after loading the file into it, it just freezes and I can't edit anything :l

Re: Invaild XML Error

Posted: 01 Dec 2019, 16:16
by klarth
Chrrs wrote: 01 Dec 2019, 02:43
Thank you very much. This worked.

Re: Invaild XML Error

Posted: 01 Dec 2019, 19:30
by kalgon
I modified net.filebot.web.WebRequest.fetch(URL, long, Object, Map<>, Consumer<>) to check if the connection response's code is 301. If it is, then the method calls itself with the new URL I got from the Location header. I tried setting connection.setInstanceFollowRedirects(true) but it did not work.

Code: Select all

public static ByteBuffer fetch(URL url, long ifModifiedSince, Object etag, Map<String, String> requestParameters, Consumer<Map<String, List<String>>> responseParameters) throws IOException {
    ...
    if (requestParameters != null) {
        requestParameters.forEach(connection::addRequestProperty);
    }

    // NEW CODE
    // Follow a 301 manually; resolving against the current URL also covers a relative Location
    if (connection instanceof HttpURLConnection) {
        HttpURLConnection httpConnection = (HttpURLConnection) connection;
        String location = httpConnection.getHeaderField("Location");
        if (httpConnection.getResponseCode() == 301 && location != null) {
            return fetch(new URL(url, location), ifModifiedSince, etag, requestParameters, responseParameters);
        }
    }
    // END NEW CODE

    int contentLength = connection.getContentLength();
    ...
}   
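
A slightly more general variant of the same idea, as a self-contained Java sketch: it also treats 302, 303, 307 and 308 as redirects, resolves a relative Location against the current URL, and gives up after a fixed number of hops so that a redirect loop cannot recurse forever. RedirectFollower, nextUri, open and MAX_REDIRECTS are hypothetical names, not FileBot's, and the sketch only covers plain GET requests to http(s) URLs.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical sketch; class and method names are not FileBot's.
class RedirectFollower {

    static final int MAX_REDIRECTS = 5;

    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303 || status == 307 || status == 308;
    }

    // Returns the URI to request next, or null if the response is not a usable redirect.
    static URI nextUri(URI current, int status, String location) {
        if (!isRedirect(status) || location == null || location.isEmpty()) {
            return null;
        }
        return current.resolve(location); // handles absolute and relative Location values
    }

    // Opens a GET connection and follows up to MAX_REDIRECTS redirects manually,
    // including ones that switch from http to https. Assumes an http(s) URI.
    static HttpURLConnection open(URI uri) throws IOException {
        for (int hop = 0; hop <= MAX_REDIRECTS; hop++) {
            HttpURLConnection connection = (HttpURLConnection) uri.toURL().openConnection();
            connection.setInstanceFollowRedirects(false); // redirects are handled below
            URI next = nextUri(uri, connection.getResponseCode(), connection.getHeaderField("Location"));
            if (next == null) {
                return connection; // final response; the caller reads the body
            }
            connection.disconnect();
            uri = next;
        }
        throw new IOException("Too many redirects");
    }
}
```

Keeping the decision in nextUri separate from the network code means it can be checked without a connection, for example against the status and Location header from the curl output earlier in the thread.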

Re: Invaild XML Error

Posted: 01 Dec 2019, 22:30
by fireheart2008
Chrrs wrote: 01 Dec 2019, 02:43
I modified the Java bytecodes for ...
thanks man! it works for the renaming. unfortunately fetching tv artwork doesn't work

Re: Invaild XML Error

Posted: 02 Dec 2019, 01:00
by Chrrs
fireheart2008 wrote: 01 Dec 2019, 22:30
Chrrs wrote: 01 Dec 2019, 02:43
I modified the Java bytecodes for ...
thanks man! it works for the renaming. unfortunately fetching tv artwork doesn't work
Are you able to set up a proxy such as nginx? That might be a better solution than my bytecode change. Someone on reddit was able to get it working and posted their nginx config: https://www.reddit.com/r/filebot/commen ... dium=web2x

But what error are you getting? I tried it on my end using my modified FileBot.jar and it ran with no issues:

Code: Select all

# filebot -script fn:artwork.tvdb "/TV/The Simpsons"   
/TV/The Simpsons => Search by The Simpsons
Auto-Select [The Simpsons] from [The Simpsons, The Ashlee Simpson Show, The People vs O.J. Simpson, The Real O.J. Simpson Trial, Jessica Simpson's The Price of Beauty, The Galton and Simpson Playhouse, The Galton and Simpson Playhouse (1969), Todd Sampson's Life on the Line, Russell Simmons Presents The Ruckus, Russell Simmons Presents Stand-Up at the El Rey, Is O.J. Innocent? The Missing Evidence, American Crime Story]
/TV/The Simpsons => The Simpsons
Generate Series NFO: The Simpsons [71663]
Banner not found: /TV/The Simpsons/poster.jpg / poster:680x1000
Banner not found: /TV/The Simpsons/poster.jpg / poster:null
Banner not found: /TV/The Simpsons/banner.jpg / series:graphical
Banner not found: /TV/The Simpsons/banner.jpg / series:null
Banner already exists: /TV/The Simpsons/fanart.jpg
Fetching /TV/The Simpsons/clearart.png => [hdclearart, en, 6.0, https://assets.fanart.tv/fanart/tv/71663/hdclearart/the-simpsons-516eb1f107609.png]
Fetching /TV/The Simpsons/logo.png => [hdtvlogo, en, 6.0, https://assets.fanart.tv/fanart/tv/71663/hdtvlogo/the-simpsons-505768514da23.png]
Fetching /TV/The Simpsons/landscape.jpg => [tvthumb, en, 4.0, https://assets.fanart.tv/fanart/tv/71663/tvthumb/the-simpsons-58a9d6f1bb4eb.jpg]
Done ヾ(@⌒ー⌒@)ノ

Re: Invaild XML Error

Posted: 02 Dec 2019, 01:17
by Chrrs
Dannzi wrote: 01 Dec 2019, 15:25 any chance for you to upload that filebot.jar file anywhere, downloaded Recaf and after loading the file into it, it just freezes, can't edit anything :l
Not sure of the legality of me redistributing a modified copy of FileBot. Have you tried another bytecode editor such as https://set.ee/jbe/?

Re: Invaild XML Error

Posted: 02 Dec 2019, 08:17
by Dannzi
Chrrs wrote: 02 Dec 2019, 01:17
Dannzi wrote: 01 Dec 2019, 15:25 any chance for you to upload that filebot.jar file anywhere, downloaded Recaf and after loading the file into it, it just freezes, can't edit anything :l
Not sure of the legality of me redistributing a modified copy of FileBot. Have you tried another bytecode editor such as https://set.ee/jbe/?
Yeah, tried that one, managed to open the file, but when navigating to the correct place it froze again :/

Re: Invaild XML Error

Posted: 02 Dec 2019, 19:55
by preech
Chrrs wrote: 01 Dec 2019, 02:43...
Thank you, kind internet friend!

The jar editor also freezes for me under Windows, but I managed to get it working under Linux.
(The Java 8 install was a real pain, though.)