I read through this excellent summary of formatting rules pinned here:
There are rules in that thread which replace characters that are invalid in Windows filenames with more regular characters (e.g. ":" to "-"). However, unless I missed it, none of them take the approach I prefer: replacing them with similar-looking Unicode "equivalents". Apologies if I did miss such a rule. Please point it out if you've already found one.
This approach works well when your files are on a Linux filesystem (where almost anything goes except "/"), but you're sharing them via Samba to Windows machines.
Here are a couple of references on options, including a program that will help if the files already exist:
In the latter case, here are the recommended replacements. This is from the Rust code, but I think it is pretty clear to readers here what's going on. Two of the characters are escaped with a backslash, as noted:
const MS_RESERVED_STRINGS: [(&str, &str); 9] = [
    ("<", "﹤"),
    (">", "﹥"),
    (":", "ː"),
    ("\"", "“"), // " escaped
    ("/", "⁄"),
    ("\\", "∖"), // \ escaped
    ("|", "⼁"),
    ("?", "﹖"),
    ("*", "﹡"),
];
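To make the mapping concrete, here is a minimal sketch of how the table above could be applied to a single filename. The function name `replace_reserved` is my own and hypothetical; it is not taken from the referenced program or from FileBot, and it only illustrates the idea rather than showing how either of them does it.

```rust
// Replacement table as quoted above (Windows-reserved character -> Unicode look-alike).
const MS_RESERVED_STRINGS: [(&str, &str); 9] = [
    ("<", "﹤"),
    (">", "﹥"),
    (":", "ː"),
    ("\"", "“"), // " escaped
    ("/", "⁄"),
    ("\\", "∖"), // \ escaped
    ("|", "⼁"),
    ("?", "﹖"),
    ("*", "﹡"),
];

// Hypothetical helper: applies every pair in the table to `name`, one after another.
// Sequential replacement is safe here because none of the replacement
// characters is itself a reserved character.
fn replace_reserved(name: &str) -> String {
    MS_RESERVED_STRINGS
        .iter()
        .fold(name.to_string(), |acc, &(from, to)| acc.replace(from, to))
}
```

With this sketch, `replace_reserved("What If...? 1:2")` gives `What If...﹖ 1ː2`, and a name without reserved characters comes back unchanged. Note that it should only be applied to a single filename component, not to a full path, since it also replaces "/".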
It's fairly straightforward to build a format expression from this, but I have a couple of questions:
1. Does anyone have better replacement suggestions from the Unicode character set? "Better" might mean clearer or prettier on Windows, though of course this is a bit subjective. For example, I prefer replacing "?" with "？" rather than the "﹖" suggested above. I'm still experimenting...
2. A question for rednoah...if it doesn't already exist, would it be worthwhile to have a built-in function for this? It's a fairly common NFS->Samba mapping issue, and there are different systems around that solve this with a single formatting call (which you can usually configure).
Maybe there is something in Groovy for this already, but I couldn't find any suitable function among those listed here: https://www.filebot.net/naming.html
This post (including the subject) edited for clarity when I woke up this morning