Invaild XML Error

ImpactPlayers
Posts: 2
Joined: 15 Nov 2019, 02:28

Invaild XML Error

Post by ImpactPlayers »

Hi,

Can someone please tell me what this error means and how to fix it? Thanks in advance, Sharon

https://snipboard.io/AnzOgc.jpg
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Invaild XML Error

Post by rednoah »

Looks like FileBot is requesting a resource that is expected to be XML, but unexpectedly receives an HTML error page. The logs will tell you more.


:idea: Please read How to Request Help.
:idea: Please read the FAQ and How to Request Help.
GhostOfSparta
Posts: 3
Joined: 07 Aug 2018, 05:29

Re: Invaild XML Error

Post by GhostOfSparta »

I think TheTVDB's API just changed: it now uses CloudFront (the AWS CDN) to enforce redirects from http -> https, and FileBot isn't following the redirect properly on some (possibly older?) versions.

Code: Select all

Rename episodes using [TheTVDB]
Auto-detected query: [young sheldon]
Fetch failed: http://thetvdb.com/api/694FAD89942D3827/mirrors.xml
net.filebot.InvalidResponseException: Invalid XML: SAXParseException: The element type "hr" must be terminated by the matching end-tag "</hr>".
<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<hr><center>CloudFront</center>
</body>
</html>
When I curl that address, I get this:

Code: Select all

filebot# curl -vvv http://thetvdb.com/api/694FAD89942D3827/mirrors.xml

*   Trying 13.224.29.21...
* TCP_NODELAY set
* Connected to thetvdb.com (13.224.29.21) port 80 (#0)
> GET /api/694FAD89942D3827/mirrors.xml HTTP/1.1
> Host: thetvdb.com
> User-Agent: curl/7.52.1
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: CloudFront
< Date: Fri, 15 Nov 2019 06:29:36 GMT
< Content-Type: text/html
< Content-Length: 183
< Connection: keep-alive
< Location: https://thetvdb.com/api/694FAD89942D3827/mirrors.xml
< X-Cache: Redirect from cloudfront
< Via: 1.1 0cf6c59c77f0fff670ae085179adc459.cloudfront.net (CloudFront)
< X-Amz-Cf-Pop: SEA19-C1
< X-Amz-Cf-Id: HZR-R_MzSSCICCYUZM-9gI21KCZOL0--Z02roPV-BHqDQElnHVOsSQ==
<
<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<hr><center>CloudFront</center>
</body>
</html>
* Curl_http_done: called premature == 0
* Connection #0 to host thetvdb.com left intact
And when I add -L to follow the redirect, I see this:

Code: Select all

curl -L -vvv http://thetvdb.com/api/694FAD89942D3827/mirrors.xml

*   Trying 99.86.129.127...
* TCP_NODELAY set
* Connected to thetvdb.com (99.86.129.127) port 80 (#0)
> GET /api/694FAD89942D3827/mirrors.xml HTTP/1.1
> Host: thetvdb.com
> User-Agent: curl/7.52.1
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: CloudFront
< Date: Fri, 15 Nov 2019 06:33:27 GMT
< Content-Type: text/html
< Content-Length: 183
< Connection: keep-alive
< Location: https://thetvdb.com/api/694FAD89942D3827/mirrors.xml
< X-Cache: Redirect from cloudfront
< Via: 1.1 320ce1e804be3f51c59bf2aa0a75b4b6.cloudfront.net (CloudFront)
< X-Amz-Cf-Pop: HIO51-C1
< X-Amz-Cf-Id: qrj3BvTdO6oR-K8csnDDpxWQCzaM86XcPc5OT6GVrkF0fSC4y8Xyqg==
<
* Ignoring the response-body
* Curl_http_done: called premature == 0
* Connection #0 to host thetvdb.com left intact
* Issue another request to this URL: 'https://thetvdb.com/api/694FAD89942D3827/mirrors.xml'
*   Trying 99.86.129.127...
* TCP_NODELAY set
* Connected to thetvdb.com (99.86.129.127) port 443 (#1)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=*.thetvdb.com
*  start date: May 23 00:00:00 2019 GMT
*  expire date: Jun 23 12:00:00 2020 GMT
*  subjectAltName: host "thetvdb.com" matched cert's "thetvdb.com"
*  issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x56395d098e80)
> GET /api/694FAD89942D3827/mirrors.xml HTTP/1.1
> Host: thetvdb.com
> User-Agent: curl/7.52.1
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 403
< content-type: application/json
< content-length: 42
< date: Fri, 15 Nov 2019 06:33:27 GMT
< x-amzn-requestid: 07e753b1-be7d-4f5e-974b-4002b91f0a2a
< x-amzn-errortype: MissingAuthenticationTokenException
< x-amz-apigw-id: DL_8sEc4PHcF1Mg=
< via: 1.1 0005a84c2971ff4f5bbb79e7ebc622a9.cloudfront.net (CloudFront), 1.1 686ace1321107362da87839adf526fc6.cloudfront.net (CloudFront)
< x-amz-cf-pop: HIO50-C1
< x-cache: Error from cloudfront
< x-amz-cf-pop: HIO51-C1
< x-amz-cf-id: NRoZii6VwRScGQ19lgZ9GAOtFaSJd3gd62T25dMWaaU_PR5dB1VMPw==
<
* Curl_http_done: called premature == 0
* Connection #1 to host thetvdb.com left intact
{"message":"Missing Authentication Token"}
So it seems like thetvdb mostly broke backwards compatibility recently. I'm guessing newer versions of filebot don't have this problem because support for the new JSON API has been added.
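
To illustrate why older versions might trip over this: as far as I know, Java's HttpURLConnection only follows redirects automatically when the protocol stays the same, so an http -> https redirect like the one above is handed back to the caller as a plain 301. Here is a minimal sketch of that check. The class and method names (RedirectCheck, changesProtocol) are made up for illustration and are not FileBot code.

```java
import java.net.URI;

public class RedirectCheck {

    // Hypothetical helper: returns true when following "location" from "requested"
    // would switch protocol (e.g. http -> https).
    public static boolean changesProtocol(URI requested, String location) {
        // The Location header may be relative, so resolve it against the requested URI first
        URI target = requested.resolve(location);
        return !requested.getScheme().equalsIgnoreCase(target.getScheme());
    }

    public static void main(String[] args) {
        URI requested = URI.create("http://thetvdb.com/api/694FAD89942D3827/mirrors.xml");
        String location = "https://thetvdb.com/api/694FAD89942D3827/mirrors.xml";
        System.out.println(changesProtocol(requested, location)); // prints "true"
    }
}
```

If this prints true for a redirect, the client has to follow it by hand or request the https URL directly.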
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Invaild XML Error

Post by rednoah »

1.
There are massive issues across the board if you're using TheTVDB, and I suppose both the XML v1 API and the JSON v2 API are broken to some degree for the time being:
https://forums.thetvdb.com/viewtopic.php?f=17&t=60029


2.
FileBot switched to the JSON v2 API about 2 years ago. But if you're using an older release, then it's likely still using the XML v1 API.

:idea: As for redirects, I don't know how that's handled, whether it's handled by FileBot or the JDK, or whether you can perhaps add some Java system property to enable it. If following redirects was never necessary in the past, then it was likely never tested, so it's very possible that it just doesn't work.

:arrow: filebot -script fn:sysinfo will tell us what version / revision of FileBot you're using.
kim
Power User
Posts: 1251
Joined: 15 May 2014, 16:17

Re: Invaild XML Error

Post by kim »

Sounds like you are using an old version of FileBot that had a related problem, like:
viewtopic.php?f=6&t=5329&hilit=https+thetvdb#p30200
viewtopic.php?f=6&t=5329&p=30634&hilit= ... vdb#p30315
That was fixed long ago.

Maybe you can "fix" it like I did back then, but it's better to use TheMovieDB or upgrade to (buy) the new FileBot.
ImpactPlayers
Posts: 2
Joined: 15 Nov 2019, 02:28

Re: Invaild XML Error

Post by ImpactPlayers »

Thanks for all the comments. I ran filebot -script fn:sysinfo and got the following results:

Code: Select all

Microsoft Windows [Version 10.0.18363.476]
(c) 2019 Microsoft Corporation. All rights reserved.

C:\WINDOWS\system32>filebot -script fn:sysinfo
FileBot 4.7.9 (r4984)
JNA Native: 5.1.0
MediaInfo: 0.7.93
7-Zip-JBinding: 9.20
Chromaprint: 1.4.2
Extended Attributes: OK
Unicode Filesystem: OK
Script Bundle: 2018-03-16 (r516)
Groovy: 2.4.10
JRE: Java(TM) SE Runtime Environment 1.8.0_231
JVM: 64-bit Java HotSpot(TM) 64-Bit Server VM
CPU/MEM: 4 Core / 3 GB Max Memory / 24 MB Used Memory
OS: Windows 10 (amd64)
Package: MSI

------------------- UPDATE AVAILABLE: FileBot 4.8.5 (r6224) --------------------

Done ?(?????)?

C:\WINDOWS\system32>
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Invaild XML Error

Post by rednoah »

Looks like you'll have to wait for the new-old TheTVDB v1 API to restore backwards-compatibility with existing software.

viewtopic.php?t=11254
tarad10
Posts: 1
Joined: 26 Nov 2019, 15:54

Re: Invaild XML Error

Post by tarad10 »

The issue is here: https://thetvdb.com/api/694FAD89942D3827/mirrors.xml. The mirrorpath node is returning the non-secure TVDB URL. Older versions of FileBot build the cache off of that. The easiest workaround mentioned is to use the movie database for television. If that is not granular enough, you can either buy a license for the new version or modify an old instance of the code. Version 4.6.2 is still available via SourceForge. You can either modify the WebRequest class to handle redirects or you can replace "http" with "https" when the TheTVDBClient class gets the mirror and populates the cache.
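
A rough sketch of the second option (replacing "http" with "https"), in case it helps. The MirrorFix class and forceHttps method are hypothetical names for illustration, not part of FileBot; where exactly this would be called inside TheTVDBClient depends on the version you are modifying.

```java
public class MirrorFix {

    private static final String HTTP_PREFIX = "http://";

    // Hypothetical helper: upgrade a plain-http mirror URL to https,
    // leaving anything else (already https, null) untouched.
    public static String forceHttps(String mirror) {
        if (mirror != null && mirror.regionMatches(true, 0, HTTP_PREFIX, 0, HTTP_PREFIX.length())) {
            return "https://" + mirror.substring(HTTP_PREFIX.length());
        }
        return mirror;
    }

    public static void main(String[] args) {
        System.out.println(forceHttps("http://thetvdb.com"));  // prints "https://thetvdb.com"
        System.out.println(forceHttps("https://thetvdb.com")); // prints "https://thetvdb.com"
    }
}
```

Applying something like this to the mirror value before it is cached should mean every later request is built on the https URL, so the redirect never comes up.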
2devnull
Posts: 26
Joined: 09 Feb 2016, 02:07

Re: Invaild XML Error

Post by 2devnull »

tarad10 wrote: 26 Nov 2019, 16:37 The issue is here: https://thetvdb.com/api/694FAD89942D3827/mirrors.xml. The mirrorpath node is returning the non-secure TVDB URL. Older versions of FileBot build the cache off of that. The easiest workaround mentioned is to use the movie database for television. If that is not granular enough, you can either buy a license for the new version or modify an old instance of the code. Version 4.6.2 is still available via SourceForge. You can either modify the WebRequest class to handle redirects or you can replace "http" with "https" when the TheTVDBClient class gets the mirror and populates the cache.
Where can one locate the 4.7.9 source to fix it?
2devnull
Posts: 26
Joined: 09 Feb 2016, 02:07

Re: Invaild XML Error

Post by 2devnull »

Never mind, I was able to create a proxy that edits the mirrors.xml file before FileBot sees it.
kim
Power User
Posts: 1251
Joined: 15 May 2014, 16:17

Re: Invaild XML Error

Post by kim »

What kind of proxy?
What do you mean by "edit the mirrors.xml file"?

btw: https://forums.thetvdb.com/viewtopic.ph ... 9&p=163719

So it's fixed?
Here are the current known API-related issues with released fixes:
mirrors.xml doesn't exist — Released 11/17
uder
Posts: 3
Joined: 30 Nov 2019, 11:35

Re: Invaild XML Error

Post by uder »

tarad10 wrote: 26 Nov 2019, 16:37 The easiest workaround mentioned is to use the movie database for television.
Anyone know how I can do this in my uTorrent script (automated media centre)?
Can't seem to get this working.

Code: Select all

filebot -script fn:amc --output "P:/Media" --log-file C:/Users/xxx/amc_log.txt --action copy --conflict auto -non-strict --db TheMovieDB::TV --def music=n --def plex=localhost --def clean=y subtitles=en artwork=n reportError=y --def "ut_label=%L" "ut_state=%S" "ut_title=%N" "ut_kind=%K" "ut_file=%F" "ut_dir=%D"
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Invaild XML Error

Post by rednoah »

What does the log say?


:idea: Please read How to Request Help.
uder
Posts: 3
Joined: 30 Nov 2019, 11:35

Re: Invaild XML Error

Post by uder »

Still uses thetvdb and fails:

Code: Select all

Failure (°_°)
Run script [fn:amc] at [Sat Nov 30 23:05:04 AEDT 2019]
Parameter: seriesDB = TheMovieDB::TV
...
Rename episodes using [TheTVDB]
...
Fetch failed: http://thetvdb.com/api/694FAD89942D3827/mirrors.xml
kim
Power User
Posts: 1251
Joined: 15 May 2014, 16:17

Re: Invaild XML Error

Post by kim »

this is why:
rednoah wrote: 16 Nov 2019, 06:19 Yes, newer revisions of the amc script support a series of --def *DB options:

Code: Select all

--def seriesDB="TheMovieDB::TV"
:!: You may need to upgrade to the latest beta though, for this option to work.

:!: The --db option has no effect on the amc script, since the amc script forces different db values internally depending on what movie/series/anime/music/ etc auto-detection says.
Also, if you use this, it will fail again because TheTVDB is hardcoded:

Code: Select all

--def artwork=y
https://github.com/filebot/scripts/blob ... roovy#L387
Chrrs
Posts: 4
Joined: 17 Apr 2017, 14:48

Re: Invaild XML Error

Post by Chrrs »

tarad10 wrote: 26 Nov 2019, 16:37 The issue is here: https://thetvdb.com/api/694FAD89942D3827/mirrors.xml. The mirrorpath node is returning the non-secure TVDB URL. Older versions of FileBot build the cache off of that. The easiest workaround mentioned is to use the movie database for television. If that is not granular enough, you can either buy a license for the new version or modify an old instance of the code. Version 4.6.2 is still available via SourceForge. You can either modify the WebRequest class to handle redirects or you can replace "http" with "https" when the TheTVDBClient class gets the mirror and populates the cache.
I modified the Java bytecodes for the getMirror() function to just bypass the cache and always return https://thetvdb.com. I'm not a Java programmer and I'm not sure if this will be viable long term, but my deluge+filebottool Unraid docker is now finally able to process my tv shows.

A proxy solution is probably the best solution though. I just had issues getting nginx to work correctly and going through the decompiled Java source and playing around with bytecode editing sounded more fun.

If anyone wants to reproduce what I did:

You will need a Java bytecode editor such as https://github.com/Col-E/Recaf. Download Recaf and then run it. Go to File -> Load and then point it to your FileBot.jar. From the left-side menu, go to net/filebot/web and open the TheTVDBClientV1 class.

Click the Methods tab. Find the getMirror method and open it. Click the Edit Instructions button.

Find the instruction that goes: "IF_ACMPNE LABEL C" and double click on it. Change the opcode to "IF_ACMPEQ" instead.

Find the instruction that goes LDC "http://thetvdb.com" a couple lines down and double click on it. Change its value to "https://thetvdb.com" without the quotes.

The code should now look like this: https://imgur.com/a/bWBZzEd

Close all the windows you just opened to go back to the main Recaf screen. Go to File -> Export and save this as FileBot.jar.

You can now replace your existing FileBot.jar with your newly edited copy. I also cleared my FileBot's cache by running: filebot -clear-cache.
uder
Posts: 3
Joined: 30 Nov 2019, 11:35

Re: Invaild XML Error

Post by uder »

Worked, thank you!
Dannzi
Posts: 8
Joined: 05 Aug 2015, 16:01

Re: Invaild XML Error

Post by Dannzi »

Chrrs wrote: 01 Dec 2019, 02:43
I modified the Java bytecodes for ...
Any chance you could upload that FileBot.jar file somewhere? I downloaded Recaf, and after loading the file into it, it just freezes; I can't edit anything :l
klarth
Posts: 2
Joined: 30 Nov 2019, 20:15

Re: Invaild XML Error

Post by klarth »

Chrrs wrote: 01 Dec 2019, 02:43
Thank you very much. This worked.
kalgon
Posts: 5
Joined: 26 Jul 2014, 10:58

Re: Invaild XML Error

Post by kalgon »

I modified net.filebot.web.WebRequest.fetch(URL, long, Object, Map<>, Consumer<>) to check whether the connection's response code is 301. If it is, the method calls itself with the new URL taken from the Location header. I tried setting connection.setInstanceFollowRedirects(true), but it did not work.

Code: Select all

public static ByteBuffer fetch(URL url, long ifModifiedSince, Object etag, Map<String, String> requestParameters, Consumer<Map<String, List<String>>> responseParameters) throws IOException {
    ...
    if (requestParameters != null) {
        requestParameters.forEach(connection::addRequestProperty);
    }

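    // NEW CODE: follow a 301 redirect by hand, since setInstanceFollowRedirects(true)
    // did not take care of the http -> https redirect
    if (connection instanceof HttpURLConnection) {
        HttpURLConnection httpConnection = (HttpURLConnection) connection;
        if (httpConnection.getResponseCode() == 301) {
            // retry the same request against the URL from the Location header
            return fetch(new URL(httpConnection.getHeaderField("Location")), ifModifiedSince, etag, requestParameters, responseParameters);
        }
    }
    // END NEW CODE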

    int contentLength = connection.getContentLength();
    ...
}   
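
A slightly more defensive variant of the same idea, as a standalone sketch rather than a drop-in patch: it also treats 302, 303, 307 and 308 as redirects, resolves relative Location headers, and gives up after a few hops so that a redirect loop cannot recurse forever. FollowRedirects, open, isRedirect, resolveLocation and MAX_REDIRECTS are names made up for this example; none of them exist in FileBot.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class FollowRedirects {

    // Assumption: five hops is plenty for a simple http -> https upgrade
    static final int MAX_REDIRECTS = 5;

    // True for the HTTP status codes that point the client at another URL
    public static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303 || status == 307 || status == 308;
    }

    // The Location header may be absolute or relative to the requested URI
    public static URI resolveLocation(URI requested, String location) {
        return requested.resolve(location);
    }

    // Opens an http(s) URI and follows redirects by hand, including http -> https.
    // Only meant for http and https URIs; other schemes would fail the cast below.
    public static HttpURLConnection open(URI uri) throws IOException {
        URI current = uri;
        // one initial request plus at most MAX_REDIRECTS follow-ups
        for (int hop = 0; hop <= MAX_REDIRECTS; hop++) {
            HttpURLConnection connection = (HttpURLConnection) current.toURL().openConnection();
            connection.setInstanceFollowRedirects(false); // we handle redirects ourselves
            if (!isRedirect(connection.getResponseCode())) {
                return connection; // caller reads the body from this connection
            }
            String location = connection.getHeaderField("Location");
            connection.disconnect();
            if (location == null) {
                throw new IOException("Redirect without Location header: " + current);
            }
            current = resolveLocation(current, location);
        }
        throw new IOException("Too many redirects: " + uri);
    }

    public static void main(String[] args) {
        // No network access here; this only shows how the Location header is resolved
        URI requested = URI.create("http://thetvdb.com/api/694FAD89942D3827/mirrors.xml");
        System.out.println(resolveLocation(requested, "https://thetvdb.com/api/694FAD89942D3827/mirrors.xml"));
        // prints "https://thetvdb.com/api/694FAD89942D3827/mirrors.xml"
    }
}
```

Note that the request headers (If-Modified-Since, ETag and so on) would still need to be re-applied on each hop, which the recursive call above gets for free.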
fireheart2008
Posts: 37
Joined: 29 Jul 2014, 05:39

Re: Invaild XML Error

Post by fireheart2008 »

Chrrs wrote: 01 Dec 2019, 02:43
I modified the Java bytecodes for ...
Thanks man! It works for the renaming. Unfortunately, fetching TV artwork doesn't work.
Chrrs
Posts: 4
Joined: 17 Apr 2017, 14:48

Re: Invaild XML Error

Post by Chrrs »

fireheart2008 wrote: 01 Dec 2019, 22:30
Chrrs wrote: 01 Dec 2019, 02:43
I modified the Java bytecodes for ...
thanks man! it works for the renaming. unfortunately fetching tv artwork doesn't work
Are you able to set up a proxy such as nginx? That might be a better solution than my bytecode change. Someone on Reddit was able to get it working and posted their nginx config: https://www.reddit.com/r/filebot/commen ... dium=web2x

But what error are you getting? I tried it on my end using my modified FileBot.jar and it ran with no issues:

Code: Select all

# filebot -script fn:artwork.tvdb "/TV/The Simpsons"   
/TV/The Simpsons => Search by The Simpsons
Auto-Select [The Simpsons] from [The Simpsons, The Ashlee Simpson Show, The People vs O.J. Simpson, The Real O.J. Simpson Trial, Jessica Simpson's The Price of Beauty, The Galton and Simpson Playhouse, The Galton and Simpson Playhouse (1969), Todd Sampson's Life on the Line, Russell Simmons Presents The Ruckus, Russell Simmons Presents Stand-Up at the El Rey, Is O.J. Innocent? The Missing Evidence, American Crime Story]
/TV/The Simpsons => The Simpsons
Generate Series NFO: The Simpsons [71663]
Banner not found: /TV/The Simpsons/poster.jpg / poster:680x1000
Banner not found: /TV/The Simpsons/poster.jpg / poster:null
Banner not found: /TV/The Simpsons/banner.jpg / series:graphical
Banner not found: /TV/The Simpsons/banner.jpg / series:null
Banner already exists: /TV/The Simpsons/fanart.jpg
Fetching /TV/The Simpsons/clearart.png => [hdclearart, en, 6.0, https://assets.fanart.tv/fanart/tv/71663/hdclearart/the-simpsons-516eb1f107609.png]
Fetching /TV/The Simpsons/logo.png => [hdtvlogo, en, 6.0, https://assets.fanart.tv/fanart/tv/71663/hdtvlogo/the-simpsons-505768514da23.png]
Fetching /TV/The Simpsons/landscape.jpg => [tvthumb, en, 4.0, https://assets.fanart.tv/fanart/tv/71663/tvthumb/the-simpsons-58a9d6f1bb4eb.jpg]
Done ヾ(@⌒ー⌒@)ノ
Chrrs
Posts: 4
Joined: 17 Apr 2017, 14:48

Re: Invaild XML Error

Post by Chrrs »

Dannzi wrote: 01 Dec 2019, 15:25 any chance for you to upload that filebot.jar file anywhere, downloaded Recaf and after loading the file into it, it just freezes, can't edit anything :l
Not sure of the legality of me redistributing a modified copy of FileBot. Have you tried another bytecode editor such as https://set.ee/jbe/?
Dannzi
Posts: 8
Joined: 05 Aug 2015, 16:01

Re: Invaild XML Error

Post by Dannzi »

Chrrs wrote: 02 Dec 2019, 01:17
Dannzi wrote: 01 Dec 2019, 15:25 any chance for you to upload that filebot.jar file anywhere, downloaded Recaf and after loading the file into it, it just freezes, can't edit anything :l
Not sure of the legality of me redistributing a modified copy of FileBot. Have you tried another bytecode editor such as https://set.ee/jbe/?
Yeah, tried that one, managed to open the file, but when navigating to the correct place it froze again :/
preech
Posts: 1
Joined: 02 Dec 2019, 19:54

Re: Invaild XML Error

Post by preech »

Chrrs wrote: 01 Dec 2019, 02:43...
Thank you, kind internet friend!

The jar editor also freezes for me under Windows, but I managed to get it working under Linux.
(The Java 8 install was a real pain though.)