I read through this excellent summary of formatting rules pinned here:
viewtopic.php?f=5&t=2
There are rules in that thread which replace characters that are invalid in Windows filenames with more regular characters (e.g. ":" to "-"). However, unless I missed it, there are none that take the approach I prefer: replacing them with similar-looking Unicode "equivalents". Apologies if I did miss such a rule - please point it out if you've already found one.
This approach works well when your files are on a Linux filesystem (where almost anything goes except "/"), but you're sharing them via Samba to Windows machines.
Here are a couple of references on options, including a program that will help if the files already exist:
https://stackoverflow.com/a/61448658
https://github.com/DDR0/fuseblk-filename-fixer
In the latter case, here are the recommended replacements. This is from the Rust code, but I think it is pretty clear to readers here what's going on. Two of the characters are escaped with a backslash, as noted:
const MS_RESERVED_STRINGS: [(&str, &str); 9] = [
    ("<", "﹤"),
    (">", "﹥"),
    (":", "ː"),
    ("\"", "“"), // " escaped
    ("/", "⁄"),
    ("\\", "∖"), // \ escaped
    ("|", "⼁"),
    ("?", "﹖"),
    ("*", "﹡"),
];
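For anyone who wants to try the table on its own, here is a minimal sketch of how it could be applied to a single file name in Rust. The helper name `sanitize_filename` is my own invention for illustration and is not taken from the linked project, which may well do this differently. It assumes the input is one path component, not a full path, since "/" gets replaced too.

```rust
/// Replacement table, copied from the list above:
/// (character invalid on Windows, similar-looking Unicode stand-in).
const MS_RESERVED_STRINGS: [(&str, &str); 9] = [
    ("<", "﹤"),
    (">", "﹥"),
    (":", "ː"),
    ("\"", "“"), // " escaped
    ("/", "⁄"),
    ("\\", "∖"), // \ escaped
    ("|", "⼁"),
    ("?", "﹖"),
    ("*", "﹡"),
];

/// Hypothetical helper (not from the linked project): returns a copy of
/// `name` with every reserved character swapped for its stand-in.
/// Assumes `name` is a single file name, not a path with separators.
fn sanitize_filename(name: &str) -> String {
    let mut out = name.to_string();
    for &(from, to) in MS_RESERVED_STRINGS.iter() {
        // None of the stand-ins appear in the "from" column,
        // so the order of the replacements does not matter.
        out = out.replace(from, to);
    }
    out
}
```

With this sketch, `sanitize_filename("What: Now?")` gives `Whatː Now﹖`, and a name with no reserved characters comes back unchanged.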
It's fairly straightforward to build a format expression from this, but I have a couple of questions:
1. Does anyone have any better replacement suggestions from the Unicode character set? "Better" might include being clearer or prettier on Windows, though of course this is a bit subjective. For example, I prefer replacing "?" with "?" rather than the "﹖" suggested above. I'm still experimenting...
2. A question for rednoah: if it doesn't already exist, would it be worthwhile to have a built-in function for this? It's a fairly common NFS->Samba mapping issue, and other systems solve it with a single formatting call (which you can usually configure).
Maybe there is something in Groovy for this already, but I couldn't find a suitable function among those listed here: https://www.filebot.net/naming.html
This post (including the subject) was edited for clarity when I woke up this morning.
