issue with special caracters
Hello, I have found a problem with the Mac OS edition (I don't know whether it occurs on other platforms).
The problem is reproducible if you have a Mac and a NAS, and mount the NAS as an NFS share.
Files whose names contain accented letters such as é à ç ô aren't processed by the mediainfo commands, so parameters like vf and sdhd don't match.
Re: issue with special caracters
That seems like a tricky one. Possibly a filename encoding mismatch between Java Unicode strings and whatever native NS classes are used internally by mediainfo when talking to OSX. Possibly something that has to be fixed in the mediainfo code.
In any case, I don't have a Mac, so I can't do much about native issues right now.
Re: issue with special caracters
Perhaps, or perhaps not, because the mediainfo GUI processes the files without a glitch. Is it using UTF-8 or another character encoding?
If you use UTF-8, it would be useful to enforce NFC form in the parameters. You can read http://twiki.org/cgi-bin/view/Codev/UnicodeMac if you need a rest and can't fall asleep.
I can run tests for you if you want.
Give me the builds and I'll try to process a bunch of files.
Re: issue with special caracters
Yep, I'd guess it's an NFC issue, because the UTF-8 strings differ bytewise depending on how the accents are encoded.
Please grab the latest jar from HEAD and give it a try.
I'm trying this:
Code: Select all
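import java.io.File;
import java.text.Normalizer;
import java.text.Normalizer.Form;
import com.sun.jna.Platform;
import com.sun.jna.WString;

public synchronized boolean open(File file) {
    String path = file.getAbsolutePath();
    if (Platform.isMac()) {
        // normalize the path before handing it to the native library
        path = Normalizer.normalize(path, Form.NFC);
        System.out.println("Normalizer.normalize(path, Form.NFC) => " + path);
    }
    return file.isFile() && MediaInfoLibrary.INSTANCE.Open(handle, new WString(path)) > 0;
}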
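For readers following along, here is a minimal sketch of why the two spellings of the same name do not match bytewise. The class and method names are made up for illustration and are not part of FileBot; it only uses java.text.Normalizer from the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;
import java.text.Normalizer.Form;

public class NormalizationForms {

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True when both strings are the same text once brought to the same normalization form.
    static boolean sameText(String a, String b) {
        return Normalizer.normalize(a, Form.NFC).equals(Normalizer.normalize(b, Form.NFC));
    }

    public static void main(String[] args) {
        String composed = "R\u00e9flexion";     // é as a single code point (NFC)
        String decomposed = "Re\u0301flexion";  // e followed by a combining acute accent (NFD)

        System.out.println(composed.equals(decomposed));     // the char sequences differ
        System.out.println(utf8Length(composed));            // UTF-8 length of the composed form
        System.out.println(utf8Length(decomposed));          // UTF-8 length of the decomposed form
        System.out.println(sameText(composed, decomposed));  // equal after normalizing both
    }
}
```

Both strings render identically, which is why the mismatch is easy to miss when looking at file listings.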
Re: issue with special caracters
I got the output below from the CLI.
So there is no result, because there is no file named with a question mark in place of the accent.
Code: Select all
filebot -mediainfo "/Volumes/Qmultimedia/video/films/#/5 Ans de Réflexion (2012)/5 Ans de Réflexion (2012)/5 Ans de Réflexion (2012) .tt1195478(6.2).mkv"
Normalizer.normalize(path, Form.NFC) => /Volumes/Qmultimedia/video/films/#/5 Ans de R?flexion (2012)/5 Ans de R?flexion (2012)/5 Ans de R?flexion (2012) .tt1195478(6.2).mkv
Normalizer.normalize(path, Form.NFC) => /Volumes/Qmultimedia/video/films/#/5 Ans de R?flexion (2012)/5 Ans de R?flexion (2012)/5 Ans de R?flexion (2012) .tt1195478(6.2).mkv
Normalizer.normalize(path, Form.NFC) => /Volumes/Qmultimedia/video/films/#/5 Ans de R?flexion (2012)/5 Ans de R?flexion (2012)/5 Ans de R?flexion (2012) .tt1195478(6.2).mkv
Normalizer.normalize(path, Form.NFC) => /Volumes/Qmultimedia/video/films/#/5 Ans de R?flexion (2012)/5 Ans de R?flexion (2012)/5 Ans de R?flexion (2012) .tt1195478(6.2).mkv
5 Ans de R?flexion (2012) .tt1195478(6.2) [ ]

Re: issue with special caracters
The ? means the console can't display that unicode character. It's not actually the ? character.
Added two more test jars for NFKC and NFD. Maybe one of those works.
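A small sketch of where such a ? can come from, assuming the console uses a charset that cannot represent é. US-ASCII stands in here for whatever charset the console really used, and the class name is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ConsoleReplacement {

    // Simulates printing text to a console that uses the given charset:
    // characters the charset cannot represent come back as '?'.
    static String asShownBy(String text, Charset consoleCharset) {
        byte[] bytes = text.getBytes(consoleCharset);
        return new String(bytes, consoleCharset);
    }

    public static void main(String[] args) {
        String name = "5 Ans de R\u00e9flexion (2012)";
        // The in-memory string still contains é; only the printed form is degraded.
        System.out.println(asShownBy(name, StandardCharsets.US_ASCII));
        System.out.println(asShownBy(name, StandardCharsets.UTF_8).equals(name));
    }
}
```

The point is that the substitution happens when the text is encoded for display, so it says nothing about the path that is actually passed to the native library.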
Re: issue with special caracters
Java's default character set is MacRoman. Is that any help?
Re: issue with special caracters
Nevermind console charsets. Try the other test jars with different Unicode NFs. Hopefully one of the four will just work.
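Instead of one jar per form, the same experiment could be run in a single pass. The sketch below is only a diagnostic under stated assumptions: firstMatching is a hypothetical helper, and the check passed in main uses Java's own file API, so it shows which form Java accepts and says nothing about what libmediainfo does with the string.

```java
import java.io.File;
import java.text.Normalizer;
import java.util.function.Predicate;

public class NormalizationFallback {

    // Returns the first normalized variant of the path accepted by the check, or null if none is.
    static String firstMatching(String path, Predicate<String> exists) {
        for (Normalizer.Form form : Normalizer.Form.values()) {
            String candidate = Normalizer.normalize(path, form);
            if (exists.test(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : ".";
        // Here the check is a plain file lookup through java.io.File.
        String match = firstMatching(path, p -> new File(p).isFile());
        System.out.println(match == null ? "no form matched" : "matched: " + match);
    }
}
```

Passing the existence check as a parameter keeps the helper easy to try against a fake list of names before pointing it at the NFS share.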
Re: issue with special caracters
Updates? Are any of the test jars working?
Re: issue with special caracters
- Via the GUI: it doesn't even launch when embedded in the app package (for any test version).
- Launching the JAR directly: the parent folder (/Volumes/Qmultimedia/video/films/#) is displayed, but the folder containing the é (5 Ans de Réflexion (2012)) does not appear in the 'load' box. When I try to scan the # folder, a [java.lang.NullPointerException] is displayed whenever I try to load (for any test version).
- Via the CLI:
version test1:
Code: Select all
Normalizer.normalize(path, Form.NFC) => /Volumes/Qmultimedia/video/films/#/5 Ans de R?flexion (2012)/5 Ans de R?flexion (2012)/5 Ans de R?flexion (2012) .tt1195478(6.2).mkv
four times, and
Code: Select all
5 Ans de R?flexion (2012) .tt1195478(6.2) [ ]
so test 1 fails.
version test2:
Code: Select all
Normalizer.normalize(path, Form.NFKC) => /Volumes/Qmultimedia/video/films/#/5 Ans de Réflexion (2012)/5 Ans de Réflexion (2012)/5 Ans de Réflexion (2012) .tt1195478(6.2).mkv
four times, and
Code: Select all
5 Ans de R?flexion (2012) .tt1195478(6.2) [ ]
so test 2 fails too. The path displays correctly in the normalizer output, but the mediainfo processing fails the same way as before.
version test3:
Code: Select all
Normalizer.normalize(path, Form.NFD) => /Volumes/Qmultimedia/video/films/#/5 Ans de Réflexion (2012)/5 Ans de Réflexion (2012)/5 Ans de Réflexion (2012) .tt1195478(6.2).mkv
four times, and
Code: Select all
5 Ans de R?flexion (2012) .tt1195478(6.2) [ ]
so test 3 fails again in the same way.
and version test4 as well:
Code: Select all
Normalizer.normalize(path, Form.NFKD) => /Volumes/Qmultimedia/video/films/#/5 Ans de Réflexion (2012)/5 Ans de Réflexion (2012)/5 Ans de Réflexion (2012) .tt1195478(6.2).mkv
four times, and
Code: Select all
5 Ans de R?flexion (2012) .tt1195478(6.2) [ ]
It seems some people have found solutions, but I'm far from understanding the whole thing. They talk about javac settings, and also about the normalizer:
http://shlrm.org/blog/2012/10/04/osx-java-utf-8-oh-my/
http://stackoverflow.com/questions/3610 ... ion-issues
http://hints.macworld.com/article.php?s ... 8053951714
http://lists.apple.com/archives/java-de ... 00058.html
Hope it helps.


Re: issue with special caracters
By the way, are you using Apple's JDK 6 or Oracle's JDK 7? If you have only tried Apple JDK 6, please try again with Oracle JDK 7.
This is a hard one...
Re: issue with special caracters
Apple's JDK is unable to run FileBot (of course I tried), so I freshly uninstalled Oracle JRE 7 and installed JDK 7 before testing.
Re: issue with special caracters
Perhaps the problem comes not from FileBot (in NFD/NFKC/NFKD mode) but from the mediainfo library?
Re: issue with special caracters
Tried with the dylib found on SourceForge, revision 0.7.63.
Are we heading the right way?
Code: Select all
Normalizer.normalize(path, Form.NFKD) => /Volumes/Qmultimedia/video/films/#/5 Ans de Réflexion (2012)/5 Ans de Réflexion (2012)/5 Ans de Réflexion (2012) .tt1195478(6.2).mkv
5 Ans de Réflexion (2012) .tt1195478(6.2) [ ]
Re: issue with special caracters
Still not working? In the latest release I normalize with NFD. That should be the right one, but I guess it still didn't work. Other than that I have no clue. I guess updating to the latest libmediainfo doesn't help either? There's not much I can do on my side at this point.
Re: issue with special caracters
I guess it is the mediainfo CLI that causes the last part of the bug. I'll try to file a report with them.
Re: issue with special caracters
I ran some tests and got complete, correct output from mediainfo when submitting the file directly on the command line.
I just passed the path between double quotes.
Tested with the 3.61 CLI: same as 3.60, and the same with the GUI.
Is there a way to get verbose output? It's obviously a parsing issue or something like that, no?
Re: issue with special caracters
Nope, it's a native interface issue. FileBot is not calling the mediainfo CLI tool; it hooks directly into the libmediainfo C interface. I guess somewhere the conversion between the Java String and the C wide-character string (wchar_t*) gets messed up, or it's just the Unicode normalization form. Suffice it to say it'd be very hard for me to debug libmediainfo even if I had a Mac to play with.
Re: issue with special caracters
Did you try this?
Code: Select all
options = "#{options} -Dfile.encoding=UTF-8" if java.lang.System.getProperty('file.encoding') == 'MacRoman'
Re: issue with special caracters
Tried it directly on the command line; no luck.
Re: issue with special caracters
A wise man said, "When you don't know where you are going, look back at where you came from"...
I tried -script fn:sysinfo again, and here are the results...
Code: Select all
Macgregor:~ greg$ java -Dfile.encoding=UTF8 -jar /Applications/FileBot.app/Contents/Resources/Java/FileBot.jar -script fn:sysinfo
FileBot 3.61 (r1646)
JNA Native: 3.5.0
MediaInfo: java.lang.UnsatisfiedLinkError: Unable to load library 'mediainfo': dlopen(libmediainfo.dylib, 9): image not found
7-Zip-JBinding: net.sf.sevenzipjbinding.SevenZipNativeInitializationException: Failed to load 7z-JBinding: no 7-Zip-JBinding in java.library.path
Extended Attributes: DISABLED
Java(TM) SE Runtime Environment 1.7.0_25
64-bit Java HotSpot(TM) 64-Bit Server VM
Mac OS X (x86_64)
Done ヾ(@⌒ー⌒@)ノ
Code: Select all
Macgregor:~ greg$ FileBot -script fn:sysinfo
FileBot 3.61 (r1646)
JNA Native: 3.5.0
MediaInfo: MediaInfoLib - v0.7.60
7-Zip-JBinding: OK
Extended Attributes: java.lang.NullPointerException
Java(TM) SE Runtime Environment 1.7.0_25 (headless)
64-bit Java HotSpot(TM) 64-Bit Server VM
Mac OS X (x86_64)
Done ヾ(@⌒ー⌒@)ノ
Macgregor:~ greg$
Re: issue with special caracters
I opened a new terminal instance and got results... I will reboot the whole machine!
Code: Select all
Macgregor:~ greg$ java -jar /Applications/FileBot.app/Contents/Resources/Java/FileBot.jar -script fn:sysenv
# Java System Properties #
sun.boot.library.path: /Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/lib
ehcache.disk.store.dir: /Users/greg/.filebot/cache/0
gopherProxySet: false
java.version: 1.7.0_25
java.vm.name: Java HotSpot(TM) 64-Bit Server VM
java.awt.graphicsenv: sun.awt.CGraphicsEnvironment
java.specification.vendor: Oracle Corporation
os.version: 10.8.4
ftp.nonProxyHosts: local|*.local|169.254/16|*.169.254/16
sun.os.patch.level: unknown
os.name: Mac OS X
java.specification.name: Java Platform API Specification
user.name: greg
sun.java.launcher: SUN_STANDARD
socksNonProxyHosts: local|*.local|169.254/16|*.169.254/16
user.dir: /Users/greg
java.ext.dirs: /Users/greg/Library/Java/Extensions:/Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/lib/ext:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java
sun.cpu.endian: little
user.home: /Users/greg
java.vm.specification.version: 1.7
grape.root: /Users/greg/.filebot/grape
java.endorsed.dirs: /Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/lib/endorsed
file.separator: /
sun.arch.data.model: 64
sun.cpu.isalist:
file.encoding: UTF-8
java.home: /Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre
java.vendor.url: http://java.oracle.com/
sun.management.compiler: HotSpot 64-Bit Tiered Compilers
java.class.path: /Applications/FileBot.app/Contents/Resources/Java/FileBot.jar
user.language: fr
java.runtime.name: Java(TM) SE Runtime Environment
java.vm.specification.vendor: Oracle Corporation
java.class.version: 51.0
http.agent: FileBot 3.61
file.encoding.pkg: sun.io
java.vm.info: mixed mode
swing.crossplatformlaf: javax.swing.plaf.nimbus.NimbusLookAndFeel
java.vendor: Oracle Corporation
sun.jnu.encoding: UTF-8
awt.toolkit: sun.lwawt.macosx.LWCToolkit
sun.font.fontmanager: sun.font.CFontManager
http.nonProxyHosts: local|*.local|169.254/16|*.169.254/16
user.country: FR
os.arch: x86_64
sun.boot.class.path: /Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/lib/resources.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/lib/rt.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/lib/sunrsasign.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/lib/jsse.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/lib/jce.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/lib/charsets.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/lib/jfr.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/lib/JObjC.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/jre/classes
sun.io.unicode.encoding: UnicodeBig
line.separator:
java.vm.version: 23.25-b01
java.io.tmpdir: /var/folders/7q/qx89427162sdgjb88dz944q40000gn/T/
sun.java.command: /Applications/FileBot.app/Contents/Resources/Java/FileBot.jar -script fn:sysenv
java.awt.printerjob: sun.lwawt.macosx.CPrinterJob
java.vendor.url.bug: http://bugreport.sun.com/bugreport/
java.vm.specification.name: Java Virtual Machine Specification
java.library.path: /Users/greg/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
java.runtime.version: 1.7.0_25-b15
java.specification.version: 1.7
path.separator: :
user.timezone:
java.vm.vendor: Oracle Corporation
# Environment Variables #
_: /usr/bin/java
HOME: /Users/greg
SHELL: /bin/bash
__CF_USER_TEXT_ENCODING: 0x1F5:0:91
JAVA_ARCH: x86_64
Apple_PubSub_Socket_Render: /tmp/launch-lQLTHL/Render
SHLVL: 1
SECURITYSESSIONID: 186a4
LANG: fr_FR.UTF-8
LOGNAME: greg
SSH_AUTH_SOCK: /tmp/launch-S14RE4/Listeners
com.apple.java.jvmTask: CommandLine
PWD: /Users/greg
TERM: xterm-256color
TERM_SESSION_ID: D3D92F31-B5AB-4E09-9109-FACCC652E772
COMMAND_MODE: unix2003
TERM_PROGRAM: Apple_Terminal
PATH: /Library/Frameworks/Python.framework/Versions/2.7/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/local/git/bin:/Developer/usr/bin
TERM_PROGRAM_VERSION: 309
TMPDIR: /var/folders/7q/qx89427162sdgjb88dz944q40000gn/T/
USER: greg
Apple_Ubiquity_Message: /tmp/launch-U4Cblw/Apple_Ubiquity_Message
Done ヾ(@⌒ー⌒@)ノ
Re: issue with special caracters
Can't think of anything here. Looks good to me. I'm not sure if much can be done with Java system properties.
Internal things seem to be all set to UTF-8...
Code: Select all
sun.jnu.encoding: UTF-8
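To check these values without reading the whole sysenv dump, a few lines of Java are enough. This is a sketch only: the class name is made up, and sun.jnu.encoding is an implementation-specific property, so it may be absent on other JVMs.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingReport {

    // Collects the encoding-related settings discussed in this thread.
    static Map<String, String> report() {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("file.encoding", System.getProperty("file.encoding"));
        values.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        values.put("defaultCharset", Charset.defaultCharset().name());
        values.put("LANG", System.getenv("LANG"));
        return values;
    }

    public static void main(String[] args) {
        // A value of null means the property or variable is not set.
        report().forEach((key, value) -> System.out.println(key + ": " + value));
    }
}
```

Running it once from the terminal and once from the app bundle would show whether the two launch paths see different encodings.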
Re: issue with special caracters
I'll be glad to dig further myself if you can give me some clues to follow.
To start with: how can I follow, step by step, the operations and results of the instructions submitted to FileBot?
IMO the Unicode normalization doesn't carry accented characters through UTF-16LE or UTF-8, so an 'ê' is translated to 'e+^', only the 'e' is transmitted, and mediainfo doesn't find the misnamed file...
Re: issue with special caracters
Unicode normalization forms have nothing to do with UTF-16LE/BE or UTF-8; the text can be encoded however you want. The problem is that when you type ê you'd type ^+e (decomposed), and that results in the character ê (composed). It looks the same, but it's two different byte sequences.
The first thing I'd debug is to add logging to libmediainfo and dump the filenames, both as they are natively and as FileBot passes them in via JNA. Somehow there must be a difference.
1. FileBot might be passing in the wrong value. But I'm passing in NFD now, so I think that's OK.
2. JNA messes up the filename when converting the Java String to a native wide-char string.
3. libmediainfo somehow messes up the correct filename it gets from JNA.
I think that without the help of zenitram, the author of mediainfo, I can't figure this out or fix it.
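On the Java side, the filename dump could look roughly like the sketch below. It is an assumption-laden helper, not FileBot code: the class and method names are invented, it needs Java 8 or later for String.codePoints(), and it only shows what Java sees, not what reaches libmediainfo.

```java
import java.io.File;
import java.text.Normalizer;
import java.text.Normalizer.Form;

public class FilenameDump {

    // Renders each code point as U+XXXX so composed and decomposed names can be told apart.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    // Names which of NFC/NFD the string already satisfies (plain ASCII satisfies both).
    static String forms(String s) {
        StringBuilder sb = new StringBuilder();
        if (Normalizer.isNormalized(s, Form.NFC)) sb.append("NFC ");
        if (Normalizer.isNormalized(s, Form.NFD)) sb.append("NFD ");
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        String[] names = dir.list();
        if (names == null) {
            System.err.println("Not a readable directory: " + dir);
            return;
        }
        // One line per entry: the name, the forms it satisfies, and its code points.
        for (String name : names) {
            System.out.println(name + " [" + forms(name) + "] " + codePoints(name));
        }
    }
}
```

Pointing it at the folder on the NFS share and comparing the output with the same dump on the native side would show at which step the name changes.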