
Encoding issue prevents renaming

Posted: 15 Sep 2014, 23:00
by pngl
I'm triggering amc.groovy through a bash script ("script.sh") at the end of Transmission downloads.

Unfortunately, I get the following error (I've edited the actual paths):

Code:

[HARDLINK] Rename [/download/path] to [/destination/accent_é_issue/path]
java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /destination/accent_é_issue/path
Transmission sets two environment variables to communicate with the scripts it runs: TR_TORRENT_DIR and TR_TORRENT_NAME. If I set those variables manually before executing script.sh by hand, the renaming works.

If I add the parameter -Dfile.encoding=UTF-8 to filebot.sh, I get the following error:

Code:

[HARDLINK] Rename [/download/path] to [/destination/accent_é_issue/path]
[HARDLINK] Failed to rename [/download/path]
java.nio.file.NoSuchFileException: /destination/accent_é_issue/path -> /download/path
Again, invoking script.sh by hand with TR_TORRENT_DIR and TR_TORRENT_NAME manually set works.

I've noticed two things when -Dfile.encoding=UTF-8 is set:
1) amc.groovy creates a destination folder that looks like this: /destination/accent_?_issue/path. Note how the "é" became a "?". It then triggers the above error.
2) If the proper /destination/accent_é_issue/path folder exists before invoking script.sh, the script works even when called from Transmission.

So my guess is that filebot creates the destination folder with the wrong encoding settings, replacing accented characters and the like with "?". It then tries to hardlink the file into the correctly named destination folder, which fails... unless I have created that folder manually beforehand.
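
One way to test that guess might be to log the locale variables the script actually sees when Transmission runs it, and compare them with a run by hand. This is only a minimal sketch: log_locale is a hypothetical helper name, not something FileBot or Transmission provides.

```bash
# Hypothetical helper: print the locale-related variables this process inherited.
# Unset and empty variables are reported the same way.
log_locale() {
  echo "script.sh: LANG=${LANG:-(unset or empty)}"
  echo "script.sh: LC_ALL=${LC_ALL:-(unset or empty)}"
  echo "script.sh: LC_CTYPE=${LC_CTYPE:-(unset or empty)}"
}

# Call it once near the top of the script so the values end up in the log.
log_locale
```

If the values differ between a manual run and a Transmission run, the environment is the more likely culprit than amc.groovy itself.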

For reference (and in case this is what's causing the error), here is script.sh. The extra code is for dealing with logging and passing files as arguments during testing.

Code:

# Log every execution (drawback: tee overwrites the log file on each run)
exec > >(tee /path/to/script.log);
exec 2>&1;

DEFAULT_SCAN_DIR="/download/path";

if [[ -z $TR_TORRENT_DIR || -z $TR_TORRENT_NAME ]]; then
  if [[ -z $1 ]]; then
    FILE="$DEFAULT_SCAN_DIR";
    SUBS="";
    echo "script.sh: Scanning $DEFAULT_SCAN_DIR (default)";
    echo "script.sh: Not downloading any subtitles, or you will get banned if the folder is full.";
  else
    FILE="$1";
    SUBS="fr";
    echo "script.sh: Scanning $FILE (given as argument)";
  fi
else
  FILE="$TR_TORRENT_DIR/$TR_TORRENT_NAME";
  SUBS="fr";
  echo "script.sh: Scanning $FILE (given by Transmission as env. variables)";
fi

filebot -script /path/to/amc.groovy \
  --output "/destination/base/path" \
  --log-file amc.log \
  --action hardlink \
  --conflict skip \
  -non-strict \
  --def clean=y \
  --def music=n \
  subtitles="$SUBS" \
  --def plex="<plex IP>" \
  artwork=n \
  "ut_kind=single" \
  "$FILE"

Re: Encoding issue prevents renaming

Posted: 16 Sep 2014, 00:07
by rednoah
FAQ wrote: Q: I'm running FileBot on a Linux machine and non-ASCII characters get all messed up. Why do unicode characters not work?
A: On some machines the locale is not set up. You'll need to tell Java what charset filenames are encoded with by setting the environment variable LANG. Also if you get an InvalidPathException about unmappable characters then it could very well be because LANG is not set up correctly.

Code:

export LANG=en_US.utf8
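
To check whether the environment a script runs in already names a UTF-8 locale, something along these lines may help. It is a rough sketch, not part of FileBot: is_utf8_locale is a hypothetical helper, and it only inspects the variable values, using the usual precedence of LC_ALL over LC_CTYPE over LANG. It does not verify that the locale is actually installed.

```bash
# Hypothetical helper: succeed if the effective character-type locale
# names a UTF-8 charset (e.g. en_US.UTF-8 or en_US.utf8).
is_utf8_locale() {
  # LC_ALL overrides LC_CTYPE, which overrides LANG
  local effective="${LC_ALL:-${LC_CTYPE:-${LANG:-}}}"
  case "$effective" in
    *.[Uu][Tt][Ff]-8*|*.[Uu][Tt][Ff]8*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_utf8_locale; then
  echo "locale looks like UTF-8"
else
  echo "locale is not UTF-8; try: export LANG=en_US.utf8"
fi
```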

Re: Encoding issue prevents renaming

Posted: 16 Sep 2014, 10:09
by pngl
Forgot to mention I already checked that. LANG is en_US.UTF-8 (checked by adding echo $LANG at the beginning of filebot.sh).

Re: Encoding issue prevents renaming

Posted: 16 Sep 2014, 14:34
by rednoah
You could use the ascii() function in your format expression to normalize the non-ASCII characters.

So you're on Embedded Linux with Embedded Java 8? If you figure out how to set everything up to handle unicode paths properly let me know. ;)

Re: Encoding issue prevents renaming

Posted: 18 Sep 2014, 18:29
by pngl
My bad, I had checked the encoding in filebot.sh but not in script.sh. Adding export LANG="en_US.UTF-8" at the beginning of script.sh fixed the issue, and -Dfile.encoding=UTF-8 is no longer necessary.

The problem was caused by Transmission manually resetting the environment before calling the torrent-done script ([email protected], for reference).
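
For anyone who wants to reproduce this by hand, the effect can be approximated with env -i, which starts a command with an empty environment. This is just a sketch, and it assumes that what Transmission does is close enough to env -i for the comparison to be useful.

```bash
# Simulate a caller that passes no environment at all, then compare with the fix.
# Assumption: Transmission's reset behaves roughly like `env -i` does here.
without_fix="$(env -i /bin/sh -c 'echo "LANG=${LANG:-}"')"
with_fix="$(env -i /bin/sh -c 'export LANG="en_US.UTF-8"; echo "LANG=${LANG:-}"')"

echo "without export: $without_fix"
echo "with export:    $with_fix"
```

The real script can be exercised the same way with env -i PATH="$PATH" TR_TORRENT_DIR="/download/path" TR_TORRENT_NAME="<name>" /path/to/script.sh (PATH has to be passed through, otherwise filebot will not be found).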