
[SOLVED] Getting JVM SIGSEGV errors on 4.8.2 version inside a VM

Posted: 30 Jul 2018, 18:09
by hansooloo
Started getting the following errors when doing a TEST run on a directory with about 360 movies (a mix of BDMV directories and MKV files).

Why am I seeing a crash?

Error

Code: Select all

jd@plex:/mnt/JDownloader$ filebot --log all --action test -non-strict -script fn:amc --def ut_dir="/mnt/Movies" ut_kind=multi  movieFormat="@/usr/local/JDownloader/FormatMovie.txt" clean=y minFileSize=0 --output /mnt/JDownloader/_02.Format_
Run script [fn:amc] at [Mon Jul 30 12:13:11 EDT 2018]
Parameter: ut_dir = /mnt/Movies
Parameter: ut_kind = multi
Parameter: movieFormat = Movies/{f.directory ? n.sortName('$2, $1')+' ('+y+')' : n.sortName('$2, $1')+' ('+y+')'/n.sortName('$2, $1')+' ('+y+')'}
Parameter: clean = y
Parameter: minFileSize = 0
Ignore hidden: /mnt/Movies/.metadata_never_index
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f25830f146b, pid=17995, tid=0x00007f24dfefe700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x71546b]  JNIHandleBlock::allocate_block(Thread*)+0x7b
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /mnt/JDownloader/hs_err_pid17995.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)

Modified my startup to look like this, thinking heap size could be an issue:

Code: Select all

plex@plex:~$ cat /usr/local/bin/filebot
#!/bin/bash
export JAVA_OPTS="-Xmx4G"
/usr/local/FileBot/filebot.sh "$@"
where `/usr/local/FileBot/filebot.sh` has been updated from https://get.filebot.net/filebot/HEAD/CHANGES.tar.xz as of today, 2018-07-30.

Sysinfo

Code: Select all

jd@plex:/mnt/JDownloader$ filebot -script fn:sysinfo
FileBot 4.8.2 (r5768)
JNA Native: 5.2.2
MediaInfo: 18.05
7-Zip-JBinding: 9.20
Chromaprint: java.io.IOException: Cannot run program "fpcalc": error=2, No such file or directory
Extended Attributes: OK
Unicode Filesystem: OK
Script Bundle: 2018-07-15 (r530)
Groovy: 2.5.0
JRE: OpenJDK Runtime Environment 1.8.0_171
JVM: 64-bit OpenJDK 64-Bit Server VM
CPU/MEM: 16 Core / 3 GB Max Memory / 37 MB Used Memory
OS: Linux (amd64)
HW: Linux plex 4.4.0-131-generic #157-Ubuntu SMP Thu Jul 12 15:51:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
DATA: /usr/local/FileBot/4.8.2/data/jd
Package: TAR
License: FileBot License PX4121863 (Valid-Until: 2068-07-24)
Done ヾ(@⌒ー⌒@)ノ
JRE error log: https://gist.github.com/HanSooloo/7efc8 ... 45c2cd407f

EDIT:
Updated title to reflect that this happens inside a virtual machine.

EDIT 2:
Turns out it was the Processor BIOS configuration on my Cisco UCS C240 M4, running ESXi 6.7. Further details in the post at the end.

Re: Getting JVM SIGSEGV errors

Posted: 30 Jul 2018, 18:23
by rednoah
1.
No idea. Looks like your OpenJDK is crashing somewhere internally:

Code: Select all

[libjvm.so+0x71546b]  JNIHandleBlock::allocate_block(Thread*)+0x7b
Since allocate probably has something to do with memory, maybe setting an unusually high memory limit has something to do with it? What happens if you remove your custom -Xmx options?

Code: Select all

# export JAVA_OPTS="-Xmx4G"

2.
What kind of Linux are you running? I recommend using the latest Oracle Java 10. That'll probably fix the issue.

:arrow: https://github.com/rednoah/java-install ... n-on-linux


3.
Note that you're running the amc script with 1.8.0_181:

Code: Select all

JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
But then sysinfo says you're running with 1.8.0_171:

Code: Select all

JRE: OpenJDK Runtime Environment 1.8.0_171
:idea: Probably not an issue here. But as general advice, make sure you're using the same JDK for all your tests, because it'll be confusing if one crashes during production and the other doesn't during testing.
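
A quick way to see which Java a shell session actually picks up is to resolve the `java` binary on the PATH and print its version banner. This is only a sketch: it assumes the launcher script runs whatever `java` is first on the PATH, and `java_identity` is a made-up helper name, not part of FileBot.

```shell
# Print the resolved path and version banner of the `java` found on PATH.
# (java_identity is a hypothetical helper name used for illustration.)
java_identity() {
  bin="$(command -v java)" || { echo "java not found on PATH" >&2; return 1; }
  # Follow symlinks (e.g. /etc/alternatives) to the real installation.
  echo "path: $(readlink -f "$bin")"
  # `java -version` prints to stderr, so merge it into stdout.
  "$bin" -version 2>&1 | head -n 1
}
```

Run `java_identity` as each user that starts FileBot (here `jd` and `plex`) and compare the result with the `JRE:` line printed by `filebot -script fn:sysinfo`.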

Re: Getting JVM SIGSEGV errors

Posted: 30 Jul 2018, 18:39
by hansooloo
I first started off on Ubuntu 16, OpenJDK 8 with no JAVA_OPTS.

Then, OpenJDK 8 with JAVA_OPTS.

Then, Oracle JRE 8 with JAVA_OPTS.

Then, Oracle JRE 8 with no JAVA_OPTS.

All had the same issue.

I will update to JRE 10 to see if it fixes it.

Re: Getting JVM SIGSEGV errors

Posted: 30 Jul 2018, 21:11
by hansooloo
With JRE 10 and without JAVA_OPTS, following failure:

Code: Select all

.... FileBot starts, goes through close to a hundred items ....
....
[TEST] from [/mnt/Movies/Warcraft 720p (2016)/Warcraft (2016).mkv] to [/mnt/JDownloader/_02.Format_/Movies/Warcraft (2016)/Warcraft (2016).mkv]
Processed 2 files
Rename movies using [TheMovieDB]
Auto-detect movie from context: [/mnt/Movies/Watchmen (2009)/Watchmen (2009).m2ts]
vtable stub
java.lang.IncompatibleClassChangeError: vtable stub
	at net.filebot.util.JsonUtilities.asMapArray(JsonUtilities.java:53)
	at net.filebot.util.JsonUtilities.getMapArray(JsonUtilities.java:69)
	at net.filebot.util.JsonUtilities.streamJsonObjects(JsonUtilities.java:73)
	at net.filebot.web.TMDbClient.getMovieInfo(TMDbClient.java:198)
	at net.filebot.web.TMDbClient.getMovieInfo(TMDbClient.java:164)
	at net.filebot.web.TMDbClient.getMovieDescriptor(TMDbClient.java:148)
	at net.filebot.media.MediaDetection.getLocalizedMovie(MediaDetection.java:725)
	at net.filebot.cli.CmdlineOperations.renameMovie(CmdlineOperations.java:450)
	at net.filebot.cli.CmdlineOperations.rename(CmdlineOperations.java:91)
	at net.filebot.cli.ScriptShellBaseClass.rename(ScriptShellBaseClass.java:361)
	at jdk.internal.reflect.GeneratedMethodAccessor58.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at Script1$_run_closure54.doCall(Script1.groovy:452)
	at jdk.internal.reflect.GeneratedMethodAccessor57.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at Script1.run(Script1.groovy:414)
	at net.filebot.cli.ScriptShell.evaluate(ScriptShell.java:64)
	at net.filebot.cli.ScriptShell.runScript(ScriptShell.java:74)
	at net.filebot.cli.ArgumentProcessor.runScript(ArgumentProcessor.java:154)
	at net.filebot.cli.ArgumentProcessor.run(ArgumentProcessor.java:36)
	at net.filebot.Main.main(Main.java:131)

Failure (°_°)
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f56380029e3, pid=20964, tid=21076
#
# JRE version: Java(TM) SE Runtime Environment (10.0.2+13) (build 10.0.2+13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (10.0.2+13, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x94d9e3]  JNIHandleBlock::release_block(JNIHandleBlock*, Thread*)+0x2f3
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P" (or dumping to /usr/local/JDownloader/core.20964)
#
# An error report file with more information is saved as:
# /usr/local/JDownloader/hs_err_pid20964.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)
With JRE 10 and with JAVA_OPTS:

Code: Select all

jd@plex:~$ filebot --log all --action test -non-strict -script fn:amc --def ut_dir="/mnt/Movies" ut_kind=multi  movieFormat="@/usr/local/JDownloader/FormatMovie.txt" clean=y minFileSize=0 --output /mnt/JDownloader/_02.Format_
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.codehaus.groovy.vmplugin.v7.Java7$1 (file:/usr/local/FileBot/4.8.2/jar/groovy.jar) to constructor java.lang.invoke.MethodHandles$Lookup(java.lang.Class,int)
WARNING: Please consider reporting this to the maintainers of org.codehaus.groovy.vmplugin.v7.Java7$1
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Run script [fn:amc] at [Mon Jul 30 17:05:22 EDT 2018]
Parameter: ut_dir = /mnt/Movies
Parameter: ut_kind = multi
Parameter: movieFormat = Movies/{f.directory ? n.sortName('$2, $1')+' ('+y+')' : n.sortName('$2, $1')+' ('+y+')'/n.sortName('$2, $1')+' ('+y+')'}
Parameter: clean = y
Parameter: minFileSize = 0
Ignore hidden: /mnt/Movies/.metadata_never_index
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f9a25aa86b5, pid=21302, tid=21335
#
# JRE version: Java(TM) SE Runtime Environment (10.0.2+13) (build 10.0.2+13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (10.0.2+13, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xcc76b5]  SafepointSynchronize::begin()+0x16b5
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P" (or dumping to /usr/local/JDownloader/core.21302)
#
# An error report file with more information is saved as:
# /usr/local/JDownloader/hs_err_pid21302.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)

Re: Getting JVM SIGSEGV errors

Posted: 30 Jul 2018, 21:19
by hansooloo
With repeated runs using JRE 10 and JAVA_OPTS, the error points are different

Code: Select all

# Problematic frame:
# V  [libjvm.so+0xcc76b5]  SafepointSynchronize::begin()+0x16b5

Code: Select all

# Problematic frame:
# V  [libjvm.so+0xb98206]  MethodData::bci_to_data(int)+0x36
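
To compare crash points across many runs, the frame can be pulled out of each `hs_err_pid*.log` that the JVM leaves behind. A minimal sketch, assuming the log files start with the same `# Problematic frame:` header that is printed to the console; `crash_frames` is a hypothetical helper name.

```shell
# List the "Problematic frame" of every HotSpot crash log in a directory.
# Usage: crash_frames [DIR]   (DIR defaults to the current directory)
crash_frames() {
  dir="${1:-.}"
  for log in "$dir"/hs_err_pid*.log; do
    [ -f "$log" ] || continue
    # The frame itself is on the line after the "# Problematic frame:" marker.
    frame="$(grep -A1 '^# Problematic frame:' "$log" | sed -n '2p')"
    printf '%s\t%s\n' "$(basename "$log")" "$frame"
  done
}
```

For example, `crash_frames /usr/local/JDownloader` prints one line per crash log, which makes it easy to see whether the faulting function is the same each time or changes from run to run.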

Re: Getting JVM SIGSEGV errors

Posted: 30 Jul 2018, 23:39
by rednoah
Strange. Something must be very unique about the system you’re running? What kind of Linux is it? What kind of hardware?

It’s so strange, it’s probably worth running memtest once just to check if the RAM is working correctly.

Google also said that native libraries might indirectly have an effect, so deleting all the so files from FileBot might make a difference.

Re: Getting JVM SIGSEGV errors

Posted: 31 Jul 2018, 00:27
by hansooloo
Ubuntu 16 VM running on ESXi 6.7, running on Cisco UCS C240-M4 hardware.

Re: Getting JVM SIGSEGV errors

Posted: 31 Jul 2018, 05:54
by rednoah
Since I can barely find any references to others running into the same problem, this might not be something we can solve, and it may be very specific to your HW / SW setup.

As it seems to be crashing during GC phase, seemingly due to some sort of memory error, using a different GC is worth a try:
http://www.baeldung.com/jvm-garbage-collectors

Try this:

Code: Select all

export JAVA_OPTS="-XX:+UseSerialGC" && filebot -script fn:amc ...
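
To try more than one collector in a row, the same idea can be wrapped in a small loop. This is just a sketch: `try_collectors` is a hypothetical helper, and the three flags used are the standard HotSpot switches for the serial, parallel and G1 collectors.

```shell
# Run the given command once per collector, passing the flag via JAVA_OPTS,
# and report the exit status of each run (a JVM abort shows up as non-zero).
try_collectors() {
  for gc in UseSerialGC UseParallelGC UseG1GC; do
    JAVA_OPTS="-XX:+$gc" "$@"
    echo "$gc: exit status $?"
  done
}
```

Usage would look like `try_collectors filebot -script fn:amc ...` with the rest of the usual arguments. If every collector ends in a crash, the GC itself is probably not the cause.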

Re: Getting JVM SIGSEGV errors

Posted: 31 Jul 2018, 23:31
by hansooloo
Mystery continues ... getting some hints from the "Memory leak?" post, decided to use (older) version 4.7.9 (viewtopic.php?f=6&t=6061)

Guess what?

4.7.9: no problems

4.8.2: crashes all around

Wonder if something with the "memory leak" fixes is having an adverse effect somewhere else.

Happy to help capture logs, core dumps, etc ... though not good with Java, so wouldn't be able to help with the code.

EDIT:
None of the GC options helped when on 4.8.2.

EDIT 2:
Was going to try the "alternative archive extractor" options, but seems like that option is gone in 4.8.2. How can I explicitly ask FileBot to use my native 7z bin?

Re: Getting JVM SIGSEGV errors

Posted: 01 Aug 2018, 06:00
by rednoah
1.
It crashes in native code unrelated to archive extraction. The "memory leak" thread refers to the ApacheVFS extractor which is only used on armv7l / aarch64 devices where we don't have lib7-Zip-JBinding.so. The latter is used on your machine because you're running normal amd64.


2.
It's interesting that an older version seems to not crash the JVM. Unfortunately, that doesn't help us much, since it could be any random unrelated change that somehow indirectly crashes the JVM. The fact that it only crashes on your device is worrying.

:idea: There should be JVM crash log files. Can you post them on pastebin so I can make Java bug reports?


3.
rednoah wrote: 30 Jul 2018, 23:39 Google also said that native libraries might indirectly have an effect, so deleting all the so files from FileBot might make a difference.
Have you tried this yet? Just delete the lib folder (with all the *.so files) entirely and see if that makes a difference.


4.
AFAIK, options have not changed:

Code: Select all

export FILEBOT_OPTS="-Dnet.filebot.media.parser=ffprobe -Dnet.filebot.Archive.extractor=SevenZipExecutable"
filebot -script fn:sysinfo
:idea: If (3.) makes a difference, then you can use these options to restore the functionality that you'll lose by deleting the native libs.
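
Instead of deleting the folder outright, it can be moved aside so the test is easy to undo. A minimal sketch, assuming the native libraries sit in a `lib` folder directly under the FileBot install directory (the exact location on your system is an assumption); both function names are hypothetical.

```shell
# Move the native library folder aside so it can be restored later.
# Usage: disable_native_libs FILEBOT_DIR
disable_native_libs() {
  fb_dir="$1"
  [ -d "$fb_dir/lib" ] || { echo "no lib folder in $fb_dir" >&2; return 1; }
  [ ! -e "$fb_dir/lib.disabled" ] || { echo "lib.disabled already exists" >&2; return 1; }
  mv "$fb_dir/lib" "$fb_dir/lib.disabled"
}

# Undo disable_native_libs.
# Usage: restore_native_libs FILEBOT_DIR
restore_native_libs() {
  fb_dir="$1"
  [ -d "$fb_dir/lib.disabled" ] || { echo "nothing to restore in $fb_dir" >&2; return 1; }
  mv "$fb_dir/lib.disabled" "$fb_dir/lib"
}
```

Call `disable_native_libs` with your install directory, set the `FILEBOT_OPTS` from above, rerun the test, and call `restore_native_libs` afterwards if it made no difference.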

Re: Getting JVM SIGSEGV errors

Posted: 01 Aug 2018, 14:38
by hansooloo
Thanks for the explanation.

4.8.2 WITH 7z executable method worked fine.

4.8.2 with native libraries resulted in the crash, with crash log posted to https://pastebin.com/pKCdeTY3.

Both of them were run with JRE 8.

Code: Select all

java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

Re: Getting JVM SIGSEGV errors

Posted: 01 Aug 2018, 16:57
by rednoah
That is strange indeed. I wonder if it's one specific native library that causes the crash. The one for 7z hasn't changed for years.

Re: Getting JVM SIGSEGV errors

Posted: 01 Aug 2018, 17:38
by hansooloo
rednoah wrote: 01 Aug 2018, 16:57 That is strange indeed. I wonder if it's one specific native library that causes the crash. The one for 7z hasn't changed for years.
I know that some of my directories have .ISO (pure disk image) and .ZIP (inside a BDMV structure) files. Wonder if that is causing the problem.

Again, happy to help get logs, etc. :-)

Re: Getting JVM SIGSEGV errors

Posted: 01 Aug 2018, 17:57
by rednoah
Maybe. But probably not in a way that makes logical sense. Only a JVM engineer could maybe figure it out and actually find out why. :lol:

JVM crashing is like Windows crashing with a BSOD. It really really shouldn't happen, and it normally doesn't, otherwise lots of things in the world would just stop working. :lol:

Re: Getting JVM SIGSEGV errors

Posted: 06 Aug 2018, 01:06
by hansooloo
As I suspected, 4.8.2 works without an issue on "real hardware", i.e., a Lenovo X1 Carbon laptop with Open JRE 8 and no special settings.

Still accessing the media files over NFS v4, as was the case with the VM.

The only thing that is different between the 2 scenarios is that the VM itself was in a data store backed by the same NFS v4 server as where the media files sit.

Having said all that, I still cannot figure out how 4.8.2 experiences a JVM SIGSEGV because of this difference? I am pretty good with computers/programming languages/etc, and this is one where it looks really weird.

Re: [SOLVED] Getting JVM SIGSEGV errors on 4.8.2 version inside a VM

Posted: 07 Aug 2018, 02:50
by hansooloo
After many hours of Google-fu and confirming that "real hardware" did not exhibit any issues, I thought of investigating the virtualization stack as a potential source of the problem.

What is the 1st layer in a virtualization stack? The Hardware !

Followed Cisco's own recommendations for virtualized platforms, specifically about the Processor Configuration and QPI Configuration BIOS settings as explained in: https://www.cisco.com/c/en/us/solutions ... 37931.html, section "BIOS Settings for Various Workload Types", "Virtualization"

Following are the values that fixed my issues after following the guide above:

Processor Configuration
Intel(R) Pass Through DMA ==> Enabled
Direct Cache Access Support ==> Enabled
Energy Performance ==> Performance
Processor CMCI ==> Disabled

QPI Configuration
QPI Snoop Mode ==> Early Snoop

Re: [SOLVED] Getting JVM SIGSEGV errors on 4.8.2 version inside a VM

Posted: 07 Aug 2018, 06:51
by rednoah
Wow, so what were the original values of these configuration items? Any idea which one might have fixed the issue? Cause I have no idea what any of these do. :lol:

:idea: It's really interesting that they have a column with recommended settings for Java! :lol:

Re: [SOLVED] Getting JVM SIGSEGV errors on 4.8.2 version inside a VM

Posted: 07 Aug 2018, 11:39
by hansooloo
Original values were the factory defaults. Mostly a toggle between enable/disable.

Not an expert by any means, but my bet is on the DMA, Cache Access and QPI Snoop ones.
Those are the ones that remotely sound like they could alter how memory is written/read, which could in turn relate to completely random SIGSEGV faults on any number of libraries (I have even seen the C library itself!!!).