Encoding issue prevents renaming

pngl
Posts: 4
Joined: 04 Sep 2014, 13:50

Post by pngl »

I'm triggering amc.groovy through a bash script ("script.sh") at the end of Transmission downloads.

Unfortunately, I get the following error (I've edited the actual paths):

Code:

[HARDLINK] Rename [/download/path] to [/destination/accent_é_issue/path]
java.nio.file.InvalidPathException: Malformed input or input contains unmappable chacraters: /destination/accent_é_issue/path

Transmission sets two environment variables to communicate with the scripts it runs: TR_TORRENT_DIR and TR_TORRENT_NAME. If I set those variables manually before executing script.sh by hand, the renaming works.

If I add the parameter -Dfile.encoding=UTF-8 to filebot.sh, I get the following error instead:

Code:

[HARDLINK] Rename [/download/path] to [/destination/accent_é_issue/path]
[HARDLINK] Failed to rename [/download/path]
java.nio.file.NoSuchFileException: /destination/accent_é_issue/path -> /download/path

Again, invoking script.sh by hand with TR_TORRENT_DIR and TR_TORRENT_NAME manually set works.

I've noticed two things when -Dfile.encoding=UTF-8 is set:
1) amc.groovy creates a destination folder that looks like this: /destination/accent_?_issue/path. Note how the "é" became a "?". It then triggers the above error.
2) If the proper /destination/accent_é_issue/path folder exists before invoking script.sh, the script works even when called from Transmission.

So my guess is that FileBot creates the destination folder with the wrong encoding settings, replacing accented characters with "?". It then tries to link the file into the correctly named destination folder, which fails unless I have created that folder manually beforehand.
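Since the script works by hand but not from Transmission, one thing worth comparing is the environment each caller provides. The sketch below is only an illustration: it assumes the daemon starts the script with a reduced environment, and it uses env -i to imitate that. The torrent name in the commented-out command is a placeholder.

```shell
# Compare the locale a child process sees in the current shell
# with what it sees in an emptied environment (env -i clears all variables).
normal_lang=$(/bin/sh -c 'printf "%s" "${LANG:-unset}"')
clean_lang=$(env -i /bin/sh -c 'printf "%s" "${LANG:-unset}"')

echo "current shell:       LANG=$normal_lang"
echo "emptied environment: LANG=$clean_lang"

# To run the real script the same way (placeholder paths and torrent name):
# env -i PATH="$PATH" TR_TORRENT_DIR="/download/path" \
#   TR_TORRENT_NAME="some torrent" /path/to/script.sh
```

If the script fails under env -i in the same way it fails from Transmission, the missing variables are a likely suspect.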

For reference (and in case this is what's causing the error), here is script.sh. The extra code is for dealing with logging and passing files as arguments during testing.

Code:

# Log every execution (bad: tee overwrites the log file on each run)
exec > >(tee /path/to/script.log);
exec 2>&1;

DEFAULT_SCAN_DIR="/download/path";

if [[ -z $TR_TORRENT_DIR || -z $TR_TORRENT_NAME ]]; then
  if [[ -z $1 ]]; then
    FILE="$DEFAULT_SCAN_DIR";
    SUBS="";
    echo "script.sh: Scanning $DEFAULT_SCAN_DIR (default)";
    echo "script.sh: Not downloading any subtitles, or you will get banned if the folder is full.";
  else
    FILE="$1";
    SUBS="fr";
    echo "script.sh: Scanning $FILE (given as argument)";
  fi
else
  FILE="$TR_TORRENT_DIR/$TR_TORRENT_NAME";
  SUBS="fr";
  echo "script.sh: Scanning $FILE (given by Transmission as env. variables)";
fi

filebot -script /path/to/amc.groovy \
  --output "/destination/base/path" \
  --log-file amc.log \
  --action hardlink \
  --conflict skip \
  -non-strict \
  --def clean=y \
  --def music=n \
  subtitles="$SUBS" \
  --def plex="<plex IP>" \
  artwork=n \
  "ut_kind=single" \
  "$FILE"
rednoah
The Source
Posts: 22998
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Encoding issue prevents renaming

Post by rednoah »

FAQ wrote:
Q: I'm running FileBot on a Linux machine and non-ASCII characters get all messed up. Why do Unicode characters not work?
A: On some machines the locale is not set up. You'll need to tell Java what charset filenames are encoded with by setting the environment variable LANG. Also, if you get an InvalidPathException about unmappable characters, it could very well be because LANG is not set up correctly.

Code:

export LANG=en_US.utf8
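
Setting LANG only helps if that locale is actually installed. As a rough check (a sketch, assuming a glibc-style system where the locale utility is present), the following prints the character encoding the C library derives from the setting and lists any installed UTF-8 locales. Note that LC_ALL and LC_CTYPE, when set, take precedence over LANG.

```shell
# Set the locale as the FAQ suggests, then check what the system makes of it.
export LANG=en_US.utf8

if command -v locale >/dev/null 2>&1; then
  # "locale charmap" prints the encoding implied by the current settings.
  # On glibc, ANSI_X3.4-1968 (plain ASCII) typically means the locale was not found.
  charmap=$(locale charmap 2>/dev/null)
  echo "LANG=$LANG -> charmap=${charmap:-unknown}"

  # List the installed UTF-8 locales, if any.
  locale -a 2>/dev/null | grep -i 'utf' || echo "no UTF-8 locale is installed"
else
  echo "the locale utility is not available on this system"
fi
```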

Please read the FAQ and How to Request Help.
pngl
Posts: 4
Joined: 04 Sep 2014, 13:50

Re: Encoding issue prevents renaming

Post by pngl »

Forgot to mention I already checked that. LANG is en_US.UTF-8 (checked by adding echo $LANG at the beginning of filebot.sh).
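
Because several scripts are chained here (Transmission, script.sh, filebot.sh), the value can differ from one layer to the next. A small helper like the one below could be called at the top of each script to log what that layer actually sees. The name log_locale is made up for this sketch.

```shell
# Print the locale-related variables as the current process sees them.
# log_locale is a hypothetical helper; pass a label naming the calling script.
log_locale() {
  printf '%s: LANG=%s LC_ALL=%s LC_CTYPE=%s\n' \
    "$1" "${LANG:-<unset>}" "${LC_ALL:-<unset>}" "${LC_CTYPE:-<unset>}"
}

log_locale "script.sh"
```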
rednoah
The Source
Posts: 22998
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Encoding issue prevents renaming

Post by rednoah »

You could use the ascii() function in your format expression to convert the non-ASCII characters to ASCII.

So you're on Embedded Linux with Embedded Java 8? If you figure out how to set everything up to handle Unicode paths properly, let me know. ;)
pngl
Posts: 4
Joined: 04 Sep 2014, 13:50

Re: Encoding issue prevents renaming

Post by pngl »

My bad, I had checked the encoding in filebot.sh but not in script.sh. Adding export LANG="en_US.UTF-8" at the beginning of script.sh fixed the issue, and -Dfile.encoding=UTF-8 is no longer necessary.

The problem was caused by Transmission explicitly resetting the environment before calling the torrent-done script ([email protected], for reference).
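
For anyone landing here later, the fix amounts to something like the following at the top of script.sh. This is a minimal sketch of the change described above, not the full script; the echo line is only there so the value shows up in the log.

```shell
#!/bin/bash
# Transmission may start this script without the usual locale variables,
# so set the locale explicitly before filebot (and Java) are started.
export LANG="en_US.UTF-8"

echo "script.sh: LANG=$LANG"
```

The locale has to exist on the machine for this to have any effect, and LC_ALL, if set, still overrides LANG.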