Using Filebot to add custom xattr

ctark
Power User
Posts: 2
Joined: 12 Mar 2017, 17:08

Using Filebot to add custom xattr

Post by ctark »

Hey!

First off, thank you for this great utility, I am loving the AMC script, and have used it as a base for a custom version that suits my needs.

What I am trying to do: Save the label and tracker that is passed from the client to the AMC script as a new xattr for use later.

A bit of background: I recently lost all my hardlinks when I migrated from separate volumes to one large, spanned volume. My library contains a mixture of TV, Movies, eBooks, Music, Comics, etc. Each is sorted into its own folder based on the tracker it came from, or the label it was assigned.
When I went to run my manual filebot run script, it obviously doesn't have these being passed to it, so it had to rely on existing xattr and online lookups for the content.
This worked decently for the movies and TV shows, but obviously failed / mismatched for the other categories. So I thought, if I could save these as xattr, and then modify the script to look these up, I won't have to worry about this in the future.

Now I will be the first to say I'm not fluent in groovy, and some of your backend code goes way over my head, but for the most part I have been figuring out how to do things on my own, and it's been working (maybe not as elegant as if you had written it, but it's functional).

After a good 5 hours of playing around, I think I'm close, but I'm just not quite sure what I'm doing... Here is the code I'm using:

Code: Select all

// group episodes/movies and rename according to Plex standards
def groups = input.groupBy{ f ->
	// print xattr metadata
	if (f.metadata) {
		log.finest "xattr: [$f.name] => [$f.metadata]"
	} 
	tryLogCatch {
		if (fromUT) {		
			Resource<MetaAttributes> xattr = Resource.lazy(xattr(f));  // Attempt 1
			// def xattr = new MetaAttributeView(f);                                // Attempt 2
			
			String UT_LABEL_KEY = "net.filebot.ut.label";
			String UT_TRACKER_KEY = "net.filebot.ut.tracker";
			
			if (tracker != null && tracker.length() > 0) {
				xattr.get().setExtraMeta(UT_TRACKER_KEY, tracker);
			}
			if (label != null && label.length() > 0) {
				xattr.get().setExtraMeta(UT_LABEL_KEY, label);
			}
		}
	}
Using Attempt 1:
I have gotten it narrowed down to:

Code: Select all

No signature of method: net.filebot.media.XattrMetaInfo.call() is applicable for argument types: (java.io.File) values: [X:\Folder\File-Name.ext]
By the looks of that error, it's looking for a file object and I'm passing it a string path, could this be the issue?

Using Attempt 2:
I got the following error:

Code: Select all

Could not find which method get() to invoke from this list:
  public abstract java.lang.Object java.util.Map#get(java.lang.Object)
  public java.lang.Object java.util.Map#get(java.lang.Object, java.lang.Object)
Any help you can provide would be greatly appreciated!

Once again, thanks for the amazing program!
-Ctark
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Using Filebot to add custom xattr

Post by rednoah »

While modifying the amc is certainly possible, wouldn't it be easier to use --def exec for that? I'd recommend writing a PowerShell script for writing xattr (i.e. Alternative Data Streams) the way you like.

e.g.

Code: Select all

Set-Content -Path /path/to/file -Stream label -Value XYZ
i.e.

Code: Select all

--def exec="Set-Content -Path \"{f}\" -Stream label -Value %LABEL%"
@see http://www.powertheshell.com/ntfsstreams/



After that, you can read it with FileBot, e.g. to group files by label:

Code: Select all

filebot -script "g:println args.groupBy{ it.xattr.label }" -r /files
More Code Samples:

Code: Select all

def f = "/path/to/file.mp4" as File

// set xattr
f.xattr['net.filebot.ut.label'] = 'Label 1'

// read xattr
println f.xattr['net.filebot.ut.label']
:idea: Please read the FAQ and How to Request Help.
ctark
Power User
Posts: 2
Joined: 12 Mar 2017, 17:08

Re: Using Filebot to add custom xattr

Post by ctark »

Wow, it was this easy? I feel silly for wasting so much time!

Code: Select all

f.xattr['net.filebot.ut.label'] = 'Label 1'
Was all I needed in the end.

Thanks for the link to the ntfs streams, I didn't know you could do that in powershell, cool feature.

Now I just need to work the new xattr into the manually run file sorting script, and we will be set!

Thanks for the quick response!
Do you mind if, when I finish everything, I put it on the forums for others to enjoy? (I ask because it's still about 70% your AMC script)
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Using Filebot to add custom xattr

Post by rednoah »

You can, but the amc script does change over time, so maybe a patch or tutorial on how to make certain changes might be better.

If it's really just about setting xattr, and not using them in the amc script, the --def exec would be best, and that's always great for sharing cause people can just add to their own command line calls.
ayk
Posts: 5
Joined: 31 May 2018, 22:48

Re: Using Filebot to add custom xattr

Post by ayk »

Hi,

There appears to be a small but very crippling bug relating to this "xattr" feature...
The last character of an extended attribute's value is simply absent (truncated), when used in any format.

Here are the steps to reproduce:

Code: Select all

$ touch sample.txt
$ xattr -w net.filebot.ut.greeting "Hello"  sample.txt
$ xattr -p net.filebot.ut.greeting  sample.txt      
# prints 'Hello'

$ filebot -mediainfo  sample.txt --db xattr -non-strict  --format "{f.xattr['net.filebot.ut.greeting']}"
# prints 'Hell'

In my case I am using a different namespace prefix (not "net.filebot.ut."), but that shouldn't matter and indeed it doesn't.

And here is the version info :

Code: Select all

$ filebot -version
FileBot 4.7.9 (r4984) / Java(TM) SE Runtime Environment 1.8.0_66 / Mac OS X 10.13.4 (x86_64)
Maybe I am doing something stupid, but it would not seem so.

Here's some background :

I came across FileBot while specifically looking for an xattr-capable renaming tool to be used in the context of going paperless.

In this case, the files involved consist mainly of PDF documents, some downloaded while others come from a "scan+OCR" workflow.

Granted, this is not the typical use case for FileBot, but looking at the docs, I figured it was more than powerful enough for the task at hand, and then some.

Besides, I thought I could also use it later for organizing my media files, when I finally get around to that..

So I went ahead and purchased it on the Mac App Store and also installed the command line program via Homebrew.

BINGO!

Or at least that's how I felt before hitting this bug, which basically makes it useless for my usage scenario.. Bummer...

BTW, there may be an opportunity for FileBot in this scenario, since it's already got a lot of the features needed in this context.

Anyhow, thank you for having created this otherwise great tool and please keep it going.
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Using Filebot to add custom xattr

Post by rednoah »

Not sure. Probably an encoding issue more than anything. FileBot probably assumes that xattr data is UTF-8 encoded text without BOM and with \0 termination.

What text encoding settings is the Mac xattr tool using when you convert text to bytes to write it as xattr data? (I honestly don't know, but maybe there's options for setting it)

man xattr wrote: The metadata is often a null-terminated UTF-8 string, but can also be arbitrary binary data.
man xattr says this, but your problem description sounds like the xattr tool is not null-terminating strings.
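To make the suspected failure mode concrete, here is a minimal sketch of a reader that always throws away the final byte because it assumes a \0 terminator. It only simulates the idea on plain files; `drop_last_byte` is a made-up helper, not FileBot or JNA code.

```shell
# Hypothetical helper, not FileBot code: mimic a reader that assumes every
# stored value ends in a NUL byte and therefore always discards the last byte.
drop_last_byte() {
  # $1: file holding the raw bytes of an attribute value
  size=$(wc -c < "$1")
  head -c "$((size - 1))" "$1"
}

tmp=$(mktemp)

printf 'Hello\0' > "$tmp"     # value written with a terminator
drop_last_byte "$tmp"; echo   # prints: Hello

printf 'Hello' > "$tmp"       # value written without a terminator
drop_last_byte "$tmp"; echo   # prints: Hell

rm -f "$tmp"
```

If the real reader behaves like this, a value written without a terminator loses its last character, which matches the 'Hell' result reported above.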


You can try to read / write xattr with FileBot and then figure out what's different to the built-in macOS tools:

Code: Select all

#!/usr/bin/env filebot -script

args[0].xattr.'net.filebot.xyz' = "Hello World"
Then save it as fb-setxattr.groovy, make it executable with chmod +x, and run ./fb-setxattr.groovy /path/to/file.

Afterwards, you can compare the exact binary value for each xattr with this command:

Code: Select all

xattr -l /path/to/file



EDIT:

Indeed. xattr -w does not null-terminate string values:

Code: Select all

$ xattr -w net.filebot.test "Hello World" sample.txt

Code: Select all

$ xattr -lx sample.txt
net.filebot.test:
00000000  48 65 6C 6C 6F 20 57 6F 72 6C 64                 |Hello World|
0000000b
Unfortunately, this is something that can't be fixed easily within FileBot, since an underlying library handles that. A possible workaround would be to use FileBot to read / write xattr, and not use the built-in xattr tool, or to use a different xattr tool that writes 0-terminated String values.


e.g.

Code: Select all

$ filebot *.txt -script "g:args.each{ it.xattr[key] = value }" --def key="net.filebot.test" --def value="Hello World"

Code: Select all

xattr -lx *.txt
net.filebot.test:
00000000  48 65 6C 6C 6F 20 57 6F 72 6C 64 00              |Hello World.|
0000000c
If you use xattr -wx then you can write xattr in HEX with 0-termination, and thus write a value that can be read correctly by FileBot:

Code: Select all

xattr -wx net.filebot.test "48 65 6C 6C 6F 20 57 6F 72 6C 64 00" sample.txt
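Typing the hex by hand gets old quickly, so here is a small sketch that builds the 0-terminated hex string from plain text. `to_hex_nul` is a hypothetical helper name, and it assumes the usual `od`, `tr` and `xargs` tools are available.

```shell
# Hypothetical helper: print the bytes of $1 followed by a 00 terminator
# as space-separated uppercase hex pairs.
to_hex_nul() {
  # od dumps the bytes as hex, tr uppercases the digits,
  # and xargs collapses the whitespace into a single line.
  printf '%s\0' "$1" | od -An -v -tx1 | tr 'a-f' 'A-F' | xargs echo
}

to_hex_nul 'Hello World'   # prints: 48 65 6C 6C 6F 20 57 6F 72 6C 64 00
```

On macOS the result could then be passed straight to the command above, e.g. xattr -wx net.filebot.test "$(to_hex_nul 'Hello World')" sample.txt.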
ayk
Posts: 5
Joined: 31 May 2018, 22:48

Re: Using Filebot to add custom xattr

Post by ayk »

Thanks a lot for this detailed response, rednoah.

I guess I will have to see what convention is actually going to be expected by other software that will make up my toolchain.

If null-terminated strings do not cause trouble in those tools, then we now have two alternate methods for setting them (via filebot or "xattr -wx"), thanks to your investigation.


-------
As a side note, here is my opinion (probably not worth more than 2 cents) with regard to the correct software behaviour on this topic.

I do not know which underlying library filebot is using for this purpose, but IMHO they've got this thing wrong.

On the other hand, they may not be alone, as can be seen in the man pages for xattr (but apparently not its codebase).

The reasoning is simple :

- Basically, extended attributes are essentially blob-like slots which may contain arbitrary binary data (including nulls and what-not), in the general case.

- Because of this, their underlying storage (within the file-system objects) would normally rely on storing "size & data", without involving null-termination.

(otherwise, the general binary xattr storage and its APIs would have to resort to some escaping or encoding techniques, like base64, which does NOT seem to be the case).

Granted, since the 70's, null-terminated strings are indeed a very popular convention for strings, but mainly applies when storing them in memory (RAM). Elsewhere (at rest on disk, or in transit thru the network), there are usually other serialization conventions.

In any case, some folks might have been tempted to cater to the C-string convention in this case as well, resulting in more confusion than anything else, IMHO.
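A quick sketch of the difference between the two conventions, using a plain file as a stand-in for the attribute value. Both function names are made up for illustration.

```shell
# Length under "size & data" storage: every byte counts, NULs included.
stored_len() {
  wc -c < "$1" | tr -d ' '
}

# Length as seen by a C-string reader: counting stops at the first NUL byte.
cstring_len() {
  n=0
  for byte in $(od -An -v -tx1 < "$1"); do
    [ "$byte" = "00" ] && break
    n=$((n + 1))
  done
  echo "$n"
}

tmp=$(mktemp)
printf 'ab\0cd' > "$tmp"   # five bytes with a NUL in the middle
stored_len "$tmp"          # prints: 5
cstring_len "$tmp"         # prints: 2
rm -f "$tmp"
```

With explicit sizes the embedded NUL is just another byte, while the C-string view silently loses everything after it.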

-----

Anyhow, thanks a lot!
Ayk
ayk
Posts: 5
Joined: 31 May 2018, 22:48

Re: Using Filebot to add custom xattr

Post by ayk »

I have done a bit more digging on this topic, including skimming through some of the FileBot source code kindly made available on github.

Sure enough, filebot simply calls a platform specific JNA (com.sun.jna.platform.mac.XAttrUtil) for accessing xattrs without doing anything funky on its own.
Here's the URL for a Javadoc of this lib : https://java-native-access.github.io/jn ... rUtil.html,

Although I could not find the source code on that site, I am now pretty much convinced that the blame goes to this JNA library (com.sun.jna.platform.mac.XAttrUtil) and not to the native implementation underneath.

So, as @rednoah previously hinted, there is really not much that can be done on the FileBot end, other than perhaps ditching "*.mac.XAttrUtil" altogether and using the lower level "*.mac.XAttr" calls that appear to directly map to the underlying macOS system calls.

----

Meanwhile, here's something that could help clarify the question on null-termination, which can be found on the Linux man page for setxattr(2) system call : ( http://man7.org/linux/man-pages/man2/setxattr.2.html )
setxattr() sets the value of the extended attribute identified by
name and associated with the given path in the filesystem. The size
argument specifies the size (in bytes) of value; a zero-length value
is permitted.
...
An extended attribute name is a null-terminated string. The name
includes a namespace prefix; there may be several, disjoint
namespaces associated with an individual inode. The value of an
extended attribute is a chunk of arbitrary textual or binary data of
specified length.
OK, that's Linux... But it should not differ for macOS, either.
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Using Filebot to add custom xattr

Post by rednoah »

At the very least, xattr -p and xattr -w should work in combination with FileBot.
man xattr wrote: In the second form, using the -p option, the value associated with the given attribute name is displayed. Attribute values are usually displayed as strings. However, if nils are detected in the data, the value is displayed in a hexadecimal representation.
Because of 0-termination, xattr -p doesn't display String values written by FileBot nicely. That would be very desirable though. I'm fairly sure macOS changed how things work under the hood (maybe with APFS?) sometime in the last few years, because this is the kind of thing I would have noticed back in 2013 when xattr support was implemented.
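Until that is sorted out, a hex value can be turned back into readable text with a few lines of shell. This is only a sketch: `hex_to_text` is a hypothetical helper, and it assumes the value has already been reduced to plain space-separated hex pairs (the same form the xattr -wx argument uses), not the full dump with offsets and an ASCII column.

```shell
# Hypothetical helper: decode space-separated hex pairs into text,
# skipping any 00 bytes so a NUL terminator does not end up in the output.
hex_to_text() {
  for byte in $1; do
    [ "$byte" = "00" ] && continue
    # Convert the hex pair to a three-digit octal escape and let printf emit the byte.
    printf "\\$(printf '%03o' "0x$byte")"
  done
  echo
}

hex_to_text '48 65 6C 6C 6F 20 57 6F 72 6C 64 00'   # prints: Hello World
```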

I'm the original contributor of the com.sun.jna.platform.mac.XAttr* code, so it'd be quite easy for me to make a few changes, but backwards compatibility has to be extremely well-tested before contributing back. Not sure who else uses this code by now.


EDIT:

I have a fix. But I won't be able to enable it for now, since older versions won't be forwards-compatible with the changes. Fortunately, backwards compatibility is not an issue.
ayk
Posts: 5
Joined: 31 May 2018, 22:48

Re: Using Filebot to add custom xattr

Post by ayk »

This is great news!

Not enabling the fix by default is quite understandable, at least at this stage.
But it will be possible to enable it via some option/switch, right?

(BTW, was the culprit indeed within the JNA code?)

----

With regard to your comments about the HEX output coming out of xattr -p :

Yes, that's exactly the problem. The HEX i/o just makes it a pain for any practical command line use.
It would sure be possible to wrap xattr somehow, but then we've got yet another piece of glue software to maintain and share ...

Meanwhile, looking for xattr alternatives, I came across pxattr, which is a cross-platform C library and a CLI program that goes with it.
(While it dates back a while, it sure still runs.)

Guess what? pxattr also behaves quite similarly to xattr when setting attribute values, i.e. it will not append a null to the value when storing it. BTW, in this regard, I think both xattr and pxattr are behaving correctly.

When it comes to output, pxattr seems to take a different approach: Instead of displaying HEX representation, it makes use of backslash escapes. Again, both can be considered "coherent and logically correct" behaviour with different aesthetic results.

In terms of practicality, it really depends on the characteristics of the dataset at hand:
  • pxattr's approach shines when you have got only or mostly textual values with occasional nulls
  • xattr's HEX output looks much cleaner when there are lots of truly binary values.
Neither of them gives the option to choose another output style, which would have clearly been preferable...

BTW, do you know of any other xattr alternatives that work on macOS?

Anyhow, thanks a lot again. I am looking forward to trying out your fix.

Cheers,
Ayhan
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Using Filebot to add custom xattr

Post by rednoah »

1.
Unfortunately, adding a switch will not be possible. macOS code signing makes it impossible to modify application files / configuration files, and I don't plan to make a GUI setting that probably would be used effectively by only one single person.

Since the UI and CLI version need to be updated simultaneously, there definitely won't be a fix until I can update the CLI version, which I can't right now because it's based on the free version (and can't be based on the paid version because of how brew cask works).


2.
The JNA code does indeed explicitly add \0 and assume \0 termination in all values.


3.
For the time being, I recommend using FileBot to read and write FileBot related xattr (you can set your own via custom scripts).
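For anyone wrapping their own tools in the meantime, here is a sketch of a more tolerant read that drops a trailing \0 only when one is actually there. It works on a plain file holding the raw value; `read_tolerant` is a made-up name, and this is not the actual change made in the JNA code.

```shell
# Hypothetical helper: print a raw value, removing one trailing NUL byte
# if present and leaving unterminated values untouched.
read_tolerant() {
  # $1: file holding the raw bytes of an attribute value
  last=$(tail -c 1 "$1" | od -An -tx1 | tr -d ' \n')
  if [ "$last" = "00" ]; then
    size=$(wc -c < "$1")
    head -c "$((size - 1))" "$1"
  else
    cat "$1"
  fi
}

tmp=$(mktemp)
printf 'Hello\0' > "$tmp"; read_tolerant "$tmp"; echo   # prints: Hello
printf 'Hello'   > "$tmp"; read_tolerant "$tmp"; echo   # prints: Hello
rm -f "$tmp"
```

Values written with and without a terminator then read back the same, at the cost of not being able to store text that legitimately ends in a NUL byte.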
ayk
Posts: 5
Joined: 31 May 2018, 22:48

Re: Using Filebot to add custom xattr

Post by ayk »

Oh well...

In my case, it's the CLI that would get used the most, anyway. This is because of the semi-automated paperless workflow I had mentioned earlier.

But I would be quite reluctant at this time to go with the workaround you mention (read & write with FileBot) and end up with \0-terminated attribute values all over the place, in a way that would make them incompatible with the rest of the toolchain.

So I guess I would have to wait for the CLI update, ...

Could you please post an announcement on this forum thread when that happens?
rednoah
The Source
Posts: 22923
Joined: 16 Nov 2011, 08:59
Location: Taipei

Re: Using Filebot to add custom xattr

Post by rednoah »

ayk wrote: 04 Jun 2018, 04:16 Oh well...

In my case, it's the CLI that would get used the most, anyway. This is because of the semi-automated paperless workflow I had mentioned earlier.

But I would be quite reluctant at this time to go with the workaround you mention (read & write with FileBot) and end up with \0-terminated attribute values all over the place, in a way that would make them incompatible with the rest of the toolchain.

So I guess I would have to wait for the CLI update, ...

Could you please post an announcement on this forum thread when that happens?
FileBot 4.8.2 is available as alpha / beta and should include the proposed xattr fixes:
https://get.filebot.net/filebot/FileBot_4.8.2/