[SOLVED] Getting JVM SIGSEGV errors on 4.8.2 version inside a VM

hansooloo
Posts: 30
Joined: 06 Feb 2016, 14:58

Post by hansooloo »

I started getting the following errors during a TEST run on a directory with about 360 movies (a mix of BDMV directories and MKV files).

Why am I seeing a crash?

Error

Code: Select all

jd@plex:/mnt/JDownloader$ filebot --log all --action test -non-strict -script fn:amc --def ut_dir="/mnt/Movies" ut_kind=multi  movieFormat="@/usr/local/JDownloader/FormatMovie.txt" clean=y minFileSize=0 --output /mnt/JDownloader/_02.Format_
Run script [fn:amc] at [Mon Jul 30 12:13:11 EDT 2018]
Parameter: ut_dir = /mnt/Movies
Parameter: ut_kind = multi
Parameter: movieFormat = Movies/{f.directory ? n.sortName('$2, $1')+' ('+y+')' : n.sortName('$2, $1')+' ('+y+')'/n.sortName('$2, $1')+' ('+y+')'}
Parameter: clean = y
Parameter: minFileSize = 0
Ignore hidden: /mnt/Movies/.metadata_never_index
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f25830f146b, pid=17995, tid=0x00007f24dfefe700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x71546b]  JNIHandleBlock::allocate_block(Thread*)+0x7b
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /mnt/JDownloader/hs_err_pid17995.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)

Modified my startup to look like this, thinking heap size could be an issue:

Code: Select all

plex@plex:~$ cat /usr/local/bin/filebot
#!/bin/bash
export JAVA_OPTS="-Xmx4G"
/usr/local/FileBot/filebot.sh "$@"
where `/usr/local/FileBot/filebot.sh` has been updated from https://get.filebot.net/filebot/HEAD/CHANGES.tar.xz as of today, 2018-07-30.

Sysinfo

Code: Select all

jd@plex:/mnt/JDownloader$ filebot -script fn:sysinfo
FileBot 4.8.2 (r5768)
JNA Native: 5.2.2
MediaInfo: 18.05
7-Zip-JBinding: 9.20
Chromaprint: java.io.IOException: Cannot run program "fpcalc": error=2, No such file or directory
Extended Attributes: OK
Unicode Filesystem: OK
Script Bundle: 2018-07-15 (r530)
Groovy: 2.5.0
JRE: OpenJDK Runtime Environment 1.8.0_171
JVM: 64-bit OpenJDK 64-Bit Server VM
CPU/MEM: 16 Core / 3 GB Max Memory / 37 MB Used Memory
OS: Linux (amd64)
HW: Linux plex 4.4.0-131-generic #157-Ubuntu SMP Thu Jul 12 15:51:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
DATA: /usr/local/FileBot/4.8.2/data/jd
Package: TAR
License: FileBot License PX4121863 (Valid-Until: 2068-07-24)
Done ヾ(@⌒ー⌒@)ノ
JRE error log: https://gist.github.com/HanSooloo/7efc8 ... 45c2cd407f

EDIT:
Updated the title to reflect that this happens inside a virtual machine.

EDIT 2:
Turns out it was the Processor BIOS configuration on my Cisco UCS C240 M4, running ESXi 6.7. Further details in the post at the end.
Last edited by hansooloo on 07 Aug 2018, 02:52, edited 3 times in total.
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Getting JVM SIGSEGV errors

Post by rednoah »

1.
No idea. Looks like your OpenJDK is crashing somewhere internally:

Code: Select all

[libjvm.so+0x71546b]  JNIHandleBlock::allocate_block(Thread*)+0x7b
Since allocate probably has something to do with memory, maybe setting an unusually high memory limit has something to do with it? What happens if you remove your custom -Xmx options?

Code: Select all

# export JAVA_OPTS="-Xmx4G"

2.
What kind of Linux are you running? I recommend using the latest Oracle Java 10. That'll probably fix the issue.

:arrow: https://github.com/rednoah/java-install ... n-on-linux


3.
Note that you're running the amc script with 1.8.0_181:

Code: Select all

JRE version: Java(TM) SE Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
But then sysinfo says you're running with 1.8.0_171:

Code: Select all

JRE: OpenJDK Runtime Environment 1.8.0_171
:idea: Probably not an issue here. But as general advice, make sure you're using the same JDK for all your tests, because it'll be confusing if one crashes in production and the other doesn't during testing.
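One quick way to spot this kind of mismatch is to pull the version string out of both outputs and compare them. The sketch below is only an illustration: it assumes Java 8 style version strings (`1.8.0_NNN`), and the function names `java8_version` and `compare_java8_versions` are made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: print the first Java 8 style version string
# (e.g. 1.8.0_181) found in the text read from stdin.
java8_version() {
  grep -oE '1\.[0-9]+\.[0-9]+_[0-9]+' | head -n 1
}

# Hypothetical helper: compare the versions found in two pieces of text.
# Prints "same: <version>" or "different: <a> vs <b>".
compare_java8_versions() {
  local a b
  a=$(printf '%s\n' "$1" | java8_version)
  b=$(printf '%s\n' "$2" | java8_version)
  if [ "$a" = "$b" ]; then
    echo "same: $a"
  else
    echo "different: $a vs $b"
  fi
}
```

For example, `compare_java8_versions "$(filebot -script fn:sysinfo)" "$(head -n 20 /mnt/JDownloader/hs_err_pid17995.log)"` should flag the two runs above as different, assuming the crash log header carries the same `JRE version:` line as the console output.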
:idea: Please read the FAQ and How to Request Help.

Re: Getting JVM SIGSEGV errors

Post by hansooloo »

I first started off on Ubuntu 16, OpenJDK 8 with no JAVA_OPTS.

Then, OpenJDK 8 with JAVA_OPTS.

Then, Oracle JRE 8 with JAVA_OPTS.

Then, Oracle JRE 8 with no JAVA_OPTS.

All had the same issue.

I will update to JRE 10 to see if it fixes it.

Re: Getting JVM SIGSEGV errors

Post by hansooloo »

With JRE 10 and without JAVA_OPTS, I get the following failure:

Code: Select all

.... FileBot starts, goes through close to a hundred items ....
....
[TEST] from [/mnt/Movies/Warcraft 720p (2016)/Warcraft (2016).mkv] to [/mnt/JDownloader/_02.Format_/Movies/Warcraft (2016)/Warcraft (2016).mkv]
Processed 2 files
Rename movies using [TheMovieDB]
Auto-detect movie from context: [/mnt/Movies/Watchmen (2009)/Watchmen (2009).m2ts]
vtable stub
java.lang.IncompatibleClassChangeError: vtable stub
	at net.filebot.util.JsonUtilities.asMapArray(JsonUtilities.java:53)
	at net.filebot.util.JsonUtilities.getMapArray(JsonUtilities.java:69)
	at net.filebot.util.JsonUtilities.streamJsonObjects(JsonUtilities.java:73)
	at net.filebot.web.TMDbClient.getMovieInfo(TMDbClient.java:198)
	at net.filebot.web.TMDbClient.getMovieInfo(TMDbClient.java:164)
	at net.filebot.web.TMDbClient.getMovieDescriptor(TMDbClient.java:148)
	at net.filebot.media.MediaDetection.getLocalizedMovie(MediaDetection.java:725)
	at net.filebot.cli.CmdlineOperations.renameMovie(CmdlineOperations.java:450)
	at net.filebot.cli.CmdlineOperations.rename(CmdlineOperations.java:91)
	at net.filebot.cli.ScriptShellBaseClass.rename(ScriptShellBaseClass.java:361)
	at jdk.internal.reflect.GeneratedMethodAccessor58.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at Script1$_run_closure54.doCall(Script1.groovy:452)
	at jdk.internal.reflect.GeneratedMethodAccessor57.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at Script1.run(Script1.groovy:414)
	at net.filebot.cli.ScriptShell.evaluate(ScriptShell.java:64)
	at net.filebot.cli.ScriptShell.runScript(ScriptShell.java:74)
	at net.filebot.cli.ArgumentProcessor.runScript(ArgumentProcessor.java:154)
	at net.filebot.cli.ArgumentProcessor.run(ArgumentProcessor.java:36)
	at net.filebot.Main.main(Main.java:131)

Failure (°_°)
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f56380029e3, pid=20964, tid=21076
#
# JRE version: Java(TM) SE Runtime Environment (10.0.2+13) (build 10.0.2+13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (10.0.2+13, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x94d9e3]  JNIHandleBlock::release_block(JNIHandleBlock*, Thread*)+0x2f3
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P" (or dumping to /usr/local/JDownloader/core.20964)
#
# An error report file with more information is saved as:
# /usr/local/JDownloader/hs_err_pid20964.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)
With JRE 10 and with JAVA_OPTS:

Code: Select all

jd@plex:~$ filebot --log all --action test -non-strict -script fn:amc --def ut_dir="/mnt/Movies" ut_kind=multi  movieFormat="@/usr/local/JDownloader/FormatMovie.txt" clean=y minFileSize=0 --output /mnt/JDownloader/_02.Format_
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.codehaus.groovy.vmplugin.v7.Java7$1 (file:/usr/local/FileBot/4.8.2/jar/groovy.jar) to constructor java.lang.invoke.MethodHandles$Lookup(java.lang.Class,int)
WARNING: Please consider reporting this to the maintainers of org.codehaus.groovy.vmplugin.v7.Java7$1
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Run script [fn:amc] at [Mon Jul 30 17:05:22 EDT 2018]
Parameter: ut_dir = /mnt/Movies
Parameter: ut_kind = multi
Parameter: movieFormat = Movies/{f.directory ? n.sortName('$2, $1')+' ('+y+')' : n.sortName('$2, $1')+' ('+y+')'/n.sortName('$2, $1')+' ('+y+')'}
Parameter: clean = y
Parameter: minFileSize = 0
Ignore hidden: /mnt/Movies/.metadata_never_index
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f9a25aa86b5, pid=21302, tid=21335
#
# JRE version: Java(TM) SE Runtime Environment (10.0.2+13) (build 10.0.2+13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (10.0.2+13, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xcc76b5]  SafepointSynchronize::begin()+0x16b5
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P" (or dumping to /usr/local/JDownloader/core.21302)
#
# An error report file with more information is saved as:
# /usr/local/JDownloader/hs_err_pid21302.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)

Re: Getting JVM SIGSEGV errors

Post by hansooloo »

With repeated runs using JRE 10 and JAVA_OPTS, the crash location differs from run to run:

Code: Select all

# Problematic frame:
# V  [libjvm.so+0xcc76b5]  SafepointSynchronize::begin()+0x16b5

Code: Select all

# Problematic frame:
# V  [libjvm.so+0xb98206]  MethodData::bci_to_data(int)+0x36
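To see at a glance how much the crash location moves around between runs, the frame lines can be tallied across the crash logs. This is a minimal sketch, assuming the `hs_err_pid*.log` files use the same `# Problematic frame:` layout as the console output above; `frame_summary` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: for each crash log given as an argument, print the
# line that follows "# Problematic frame:", then count identical frames
# and list the most frequent one first.
frame_summary() {
  awk '/^# Problematic frame:/ {
         if ((getline line) > 0) {   # read the line after the marker
           sub(/^# /, "", line)      # drop the leading "# "
           print line
         }
       }' "$@" | sort | uniq -c | sort -rn
}
```

Running something like `frame_summary /usr/local/JDownloader/hs_err_pid*.log` prints one line per distinct frame with a count in front. Many different frames with low counts would suggest the crashes are not tied to one code path.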

Re: Getting JVM SIGSEGV errors

Post by rednoah »

Strange. Something must be unusual about the system you're running. What kind of Linux is it? What kind of hardware?

It's so strange that it's probably worth running memtest once, just to check that the RAM is working correctly.

Google also said that native libraries might indirectly have an effect, so deleting all the so files from FileBot might make a difference.

Re: Getting JVM SIGSEGV errors

Post by hansooloo »

Ubuntu 16 VM running on ESXi 6.7, running on Cisco UCS C240-M4 hardware.

Re: Getting JVM SIGSEGV errors

Post by rednoah »

Since I can barely find any references to others running into the same problem, this might not be something we can solve, and it may be very specific to your HW / SW setup.

As it seems to be crashing during GC phase, seemingly due to some sort of memory error, using a different GC is worth a try:
http://www.baeldung.com/jvm-garbage-collectors

Try this:

Code: Select all

export JAVA_OPTS="-XX:+UseSerialGC" && filebot -script fn:amc ...
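To try several collectors in one go, the same idea can be wrapped in a loop. This is only a sketch: `try_each_gc` is a hypothetical helper, the three flags are standard HotSpot options, and it assumes the `filebot` launcher picks up `JAVA_OPTS` the same way as in the command above.

```shell
#!/bin/bash
# Hypothetical helper: run the given command once per garbage collector,
# passing the flag through JAVA_OPTS, and print each exit status.
try_each_gc() {
  local gc status
  for gc in UseSerialGC UseParallelGC UseG1GC; do
    status=0
    # JAVA_OPTS is set for this one invocation only.
    JAVA_OPTS="-XX:+$gc" "$@" || status=$?
    echo "$gc: exit status $status"
  done
}
```

Usage would be `try_each_gc filebot -script fn:amc ...` with the usual arguments. A run that ends in "Aborted (core dumped)" normally shows up as exit status 134 (128 plus the signal number of SIGABRT), so a collector that avoids the crash should stand out.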

Re: Getting JVM SIGSEGV errors

Post by hansooloo »

Mystery continues ... getting some hints from the "Memory leak?" post, decided to use (older) version 4.7.9 (viewtopic.php?f=6&t=6061)

Guess what?

4.7.9: no problems

4.8.2: crashes all around

I wonder if something in the "memory leak" fixes is having an adverse effect somewhere else.

Happy to help capture logs, core dumps, etc ... though not good with Java, so wouldn't be able to help with the code.

EDIT:
None of the GC options helped when on 4.8.2.

EDIT 2:
Was going to try the "alternative archive extractor" options, but seems like that option is gone in 4.8.2. How can I explicitly ask FileBot to use my native 7z bin?

Re: Getting JVM SIGSEGV errors

Post by rednoah »

1.
It crashes in native code unrelated to archive extraction. The "memory leak" thread refers to the ApacheVFS extractor which is only used on armv7l / aarch64 devices where we don't have lib7-Zip-JBinding.so. The latter is used on your machine because you're running normal amd64.


2.
It's interesting that an older version doesn't seem to crash the JVM. Unfortunately, that doesn't help us much, since it could be any random unrelated change that somehow indirectly crashes the JVM. The fact that it only crashes on your device is worrying.

:idea: There should be JVM crash log files. Can you post them on pastebin so I can make Java bug reports?


3.
rednoah wrote: 30 Jul 2018, 23:39 Google also said that native libraries might indirectly have an effect, so deleting all the so files from FileBot might make a difference.
Have you tried this yet? Just delete the lib folder (with all the *.so files) entirely and see if that makes a difference.


4.
AFAIK, options have not changed:

Code: Select all

export FILEBOT_OPTS="-Dnet.filebot.media.parser=ffprobe -Dnet.filebot.Archive.extractor=SevenZipExecutable"
filebot -script fn:sysinfo
:idea: If (3.) makes a difference, then you can use these options to restore the functionality that you'll lose by deleting the native libs.
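Rather than deleting the native libraries outright, one reversible way to run this experiment is to rename the folder and put it back afterwards. This is a minimal sketch: the helper names and the `.disabled` suffix are made up for illustration, and the location of the lib folder inside the FileBot install is something to check on the system in question.

```shell
#!/bin/bash
# Hypothetical helper: move a directory out of the way by renaming it
# to "<dir>.disabled". Refuses to overwrite an earlier backup.
disable_dir() {
  local dir=${1%/}   # tolerate a trailing slash
  if [ -e "$dir.disabled" ]; then
    echo "already disabled: $dir" >&2
    return 1
  fi
  mv -- "$dir" "$dir.disabled"
}

# Hypothetical helper: undo disable_dir.
restore_dir() {
  local dir=${1%/}
  mv -- "$dir.disabled" "$dir"
}
```

The idea would be to call `disable_dir` on the lib folder, repeat the test run with the `FILEBOT_OPTS` shown above, and then call `restore_dir` on the same path once the experiment is done.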

Re: Getting JVM SIGSEGV errors

Post by hansooloo »

Thanks for the explanation.

4.8.2 WITH 7z executable method worked fine.

4.8.2 with native libraries resulted in the crash, with crash log posted to https://pastebin.com/pKCdeTY3.

Both of them were run with JRE 8.

Code: Select all

java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

Re: Getting JVM SIGSEGV errors

Post by rednoah »

That is strange indeed. I wonder if it's one specific native library that causes the crash. The one for 7z hasn't changed for years.

Re: Getting JVM SIGSEGV errors

Post by hansooloo »

rednoah wrote: 01 Aug 2018, 16:57 That is strange indeed. I wonder if it's one specific native library that causes the crash. The one for 7z hasn't changed for years.
I know that some of my directories have .ISO (pure disk image) and .ZIP (inside a BDMV structure) files. Wonder if that is causing the problem.

Again, happy to help get logs, etc. :-)

Re: Getting JVM SIGSEGV errors

Post by rednoah »

Maybe. But probably not in a way that makes logical sense. Only a JVM engineer could maybe figure out why. :lol:

JVM crashing is like Windows crashing with a BSOD. It really really shouldn't happen, and it normally doesn't, otherwise lots of things in the world would just stop working. :lol:

Re: Getting JVM SIGSEGV errors

Post by hansooloo »

As I suspected, 4.8.2 works without an issue on "real hardware", i.e., a Lenovo X1 Carbon laptop with OpenJDK 8 and no special settings.

Still accessing the media files over NFS v4, as was the case with the VM.

The only difference between the two scenarios is that the VM itself was in a data store backed by the same NFS v4 server where the media files sit.

Having said all that, I still cannot figure out how this difference makes 4.8.2 hit a JVM SIGSEGV. I am pretty good with computers, programming languages, etc., and this one looks really weird.

Re: [SOLVED] Getting JVM SIGSEGV errors on 4.8.2 version inside a VM

Post by hansooloo »

After many hours of Google-fu, and after confirming that "real hardware" did not exhibit any issues, I thought of investigating the virtualization stack as a potential source of the problem.

What is the first layer in a virtualization stack? The hardware!

I followed Cisco's own recommendations for virtualized platforms, specifically the Processor Configuration and QPI Configuration BIOS settings explained in https://www.cisco.com/c/en/us/solutions ... 37931.html, section "BIOS Settings for Various Workload Types", "Virtualization".

Following are the values that fixed my issues after following the guide above:

Processor Configuration
Intel(R) Pass Through DMA ==> Enabled
Direct Cache Access Support ==> Enabled
Energy Performance ==> Performance
Processor CMCI ==> Disabled

QPI Configuration
QPI Snoop Mode ==> Early Snoop

Re: [SOLVED] Getting JVM SIGSEGV errors on 4.8.2 version inside a VM

Post by rednoah »

Wow, so what were the original values of these configuration items? Any idea which one might have fixed the issue? Cause I have no idea what any of these do. :lol:

:idea: It's really interesting that they have a column with recommended settings for Java! :lol:

Re: [SOLVED] Getting JVM SIGSEGV errors on 4.8.2 version inside a VM

Post by hansooloo »

Original values were the factory defaults. Mostly a toggle between enable/disable.

Not an expert by any means, but my bet is on the DMA, Cache Access and QPI Snoop ones.
Those are the ones that remotely sound like they could alter how memory is written/read, which could in turn relate to completely random SIGSEGV faults on any number of libraries (I have even seen the C library itself!!!).