Frequent error: java.nio.charset.UnsupportedCharsetException: ISO-8859-8

bpleat
Posts: 5
Joined: 22 Jun 2021, 21:05

Post by bpleat »

I'm running a (DOS shell) script on Win 10 64-bit to pull down subtitles in a few languages for some movies, and I frequently get the following error:
java.nio.charset.UnsupportedCharsetException: ISO-8859-8

The filebot command line is...
"C:\Program Files\Filebot\filebot" -get-subtitles -r "%_P%" -non-strict --lang %%L --output srt --log warning -no-probe -no-xattr

(%_P% is the path we're recursing, %%L is the language.)
This happens only for "heb", not for "eng", "spa", "ita", or the other (European) languages I've tried.

The filebot -script fn:sysinfo output is:
FileBot 4.9.3 (r8340)
JNA Native: 6.1.0
MediaInfo: 20.09
7-Zip-JBinding: 16.02
Tools: fpcalc/1.5.0
Extended Attributes: OK
Unicode Filesystem: OK
Script Bundle: 2021-08-02 (r761)
Groovy: 3.0.7
JRE: OpenJDK Runtime Environment 15.0.2
JVM: 64-bit OpenJDK 64-Bit Server VM
CPU/MEM: 16 Core / 17 GB Max Memory / 32 MB Used Memory
OS: Windows 10 (amd64)
HW: CYGWIN_NT-10.0-WOW P5540 3.0.7(0.338/5/3) 2019-04-30 18:04 i686 Cygwin
STORAGE: NTFS [OS] @ 113 GB | (and some other drives)
DATA: (%UserProfile%)\AppData\Roaming\FileBot
Package: MSI
My "daily limit" (of 1,000 - VIP on OpenSubtitles.org) isn't close to being reached.

Please advise how I can diagnose this further.

(Is Cygwin conflicting? I see it mentioned in the "HW" line)

Thank you.
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Frequent error: java.nio.charset.UnsupportedCharsetException: ISO-8859-8

Post by rednoah »

UnsupportedCharsetException: ISO-8859-8 just means that the subtitle file is likely encoded with ISO-8859-8 and that the Java runtime FileBot is running on does not provide that charset, so FileBot cannot decode the text. ISO-8859-8 is a non-Unicode character encoding for Hebrew, so you won't run into this issue with non-Hebrew (or UTF-8 encoded Hebrew) subtitle files.
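
To see whether a given Java runtime can decode ISO-8859-8 at all, a small standalone check along these lines may help. This is only a sketch: the class name CharsetCheck and the helper describe are made up for illustration and are not part of FileBot, and the result only tells you about the JVM you run it with, which may differ from the runtime bundled with FileBot.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of FileBot: reports whether this JVM knows a charset.
class CharsetCheck {

    // Returns a one-line description of the charset's availability in this JVM.
    static String describe(String name) {
        if (!Charset.isSupported(name)) {
            // Charset.forName(name) would throw UnsupportedCharsetException here.
            return name + ": not supported by this JVM";
        }
        return name + ": supported as " + Charset.forName(name).name();
    }

    public static void main(String[] args) {
        // ISO-8859-8 is the charset from the error message; UTF-8 is always available.
        System.out.println(describe("ISO-8859-8"));
        System.out.println(describe("UTF-8"));
    }
}
```

Saved as CharsetCheck.java, it can be run directly with `java CharsetCheck.java` on Java 11 or later.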


If you remove the --output srt option, then FileBot will save the binary data as is, and will not try to read / decode / transcode the subtitle format or character encoding, and thus bypass the issue.
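
If the files are saved as is and you still want UTF-8 copies, one option is to transcode them afterwards with a Java runtime that does support the source charset. The following is a minimal sketch under stated assumptions: SubtitleTranscoder and toUtf8 are hypothetical names, not FileBot APIs, and the source charset has to be known and passed in by you, since nothing here detects it.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of FileBot: re-encodes a text file as UTF-8.
class SubtitleTranscoder {

    // Decodes raw bytes with the given charset and re-encodes the text as UTF-8.
    static byte[] toUtf8(byte[] raw, String sourceCharset) {
        if (!Charset.isSupported(sourceCharset)) {
            // Same root cause as the error in this thread: the JVM lacks the charset.
            throw new IllegalStateException("Charset not available in this JVM: " + sourceCharset);
        }
        String text = new String(raw, Charset.forName(sourceCharset));
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: SubtitleTranscoder <input> <output> <source-charset>");
            return;
        }
        Path input = Path.of(args[0]);
        Path output = Path.of(args[1]);
        // The source charset (for example ISO-8859-8) is an assumption supplied by the caller.
        Files.write(output, toUtf8(Files.readAllBytes(input), args[2]));
    }
}
```

Passing the wrong source charset will not raise an error but will produce garbled text, so check a converted file before processing a whole folder.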



EDIT:

Have you tried FileBot 4.9.4 yet? The latest release includes additional charsets including EUC-KR and possibly ISO-8859-8 among others.
Please read the FAQ and How to Request Help.

Post by bpleat »

Thank you for replying.
4.9.4 fixed most of the occurrences, and removing the "--output srt" option fixed even more. I'll dig a bit more into any that remain.
Thank you again.

Post by rednoah »

bpleat wrote: 16 Sep 2021, 04:00
"4.9.4 fixed most of the occurrences"

What specific charsets are still missing?

Post by bpleat »

I meant I saw other errors, but these may not be related. This exact issue can be considered resolved for me.
I have to re-run my script with logging, and I'll let you know if I see anything else.
Thanks again!

Post by rednoah »

Ah. No worries. In any case, if you see UnsupportedCharsetException: <charset> again, please copy & paste it here.