For the last six months I’ve been archiving all my paperwork (OCR’ing and then trashing it) in a personal documents repository.
There are some document managers out there, but every single one felt like overkill to me, so I just stick to a pretty simple directory structure, which is enough for me.
Although I need those documents across devices, I didn’t want to use a cloud service to sync them. git does a pretty good job here.
Mac OS X Mavericks comes with a great new feature: tagging. We’ve all used tags somewhere on the internet, and I really like this kind of taxonomy. It’s way better than a fixed folder structure.
So I can now tag all my documents without the need for an external program.
But what about sync? Those tags are stored in the extended file attributes of the Mac OS X filesystem (along with other stuff, for example whether the file has been downloaded from the web or from an email). git does not include those extended attributes in a repository, so they will be lost.
xattr to the rescue. xattr can dump all extended attributes for all files in a directory and can also write them back.
I use the following pre-commit hook to dump all extended attributes of my archive to a file named .metadata:
#!/bin/sh
xattr -lrx . > .metadata
git add -f .metadata
This can be a problem if only tags were modified, as git will find nothing to commit. This can be handled with an empty commit:
git commit --allow-empty -m "New Tags"
To restore them I use the following post-merge hook, which is also executed after a pull (I’m pretty much doing only pulls on this repository anyway).
#!/usr/bin/env ruby
# Be careful, this can be something you don't want:
# strip all existing extended attributes
system("xattr", "-cr", ".")

pattern_header = /([^\0]+): (.+):/
# \h instead of \d: the offset column of the dump is hexadecimal
pattern_data = /\h{8} (.+) +\|.+\|/

# write the collected hex data back to the file it belongs to
def write_attribute(file, attribute, data)
  return if file.nil? || data.empty?
  # passing separate arguments avoids shell quoting problems in file names
  system("xattr", "-wx", attribute, data.delete(' '), file)
end

data, current_file, current_attribute = '', nil, nil
File.readlines('.metadata').each do |line|
  if current_file && (m = pattern_data.match(line))
    # collect hex data
    data += m[1].strip
  elsif (m = pattern_header.match(line))
    # a new attribute starts: write what we have collected so far
    write_attribute(current_file, current_attribute, data)
    data, current_file, current_attribute = '', m[1], m[2]
  end
end
# the last attribute in the dump has no following header, so write it here
write_attribute(current_file, current_attribute, data)
This hook is pretty simple, and one can surely think of better ways of storing (and/or parsing) the data and of adding some error handling, but it works quite well for my purpose.
The hooks also sync every extended attribute. If you’re only interested in tags, then only sync the “com.apple.metadata:_kMDItemUserTags” attribute.
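A tags-only restore could look roughly like the sketch below. It is not the hook I actually run. It assumes the dump has the layout that `xattr -lrx` produces (a "file: attribute:" header line followed by hex rows), and the names `parse_xattr_dump` and `tags_only` are made up for this example. Parsing is kept separate from writing so that entries can be filtered, and empty ones skipped, before anything touches the filesystem.

```ruby
#!/usr/bin/env ruby
# Sketch only: helper names are hypothetical, and the dump layout is
# assumed to be the one written by `xattr -lrx . > .metadata`.

TAGS_ATTRIBUTE = 'com.apple.metadata:_kMDItemUserTags'

# "<file>: <attribute>:" header line
HEADER = /\A(.+): (\S+):\s*\z/
# "<hex offset>  <hex bytes>  |<printable>|" data row
HEX_ROW = /\A\h{8}\s+(.*?)\s+\|.*\|\s*\z/

# Turn the text of a dump into a list of
# { file:, attribute:, hex: } hashes, in dump order.
def parse_xattr_dump(text)
  entries = []
  current = nil
  text.each_line do |line|
    if current && (m = HEX_ROW.match(line))
      # append this row's bytes without the separating spaces
      current[:hex] << m[1].delete(' ')
    elsif (m = HEADER.match(line))
      entries << current if current
      current = { file: m[1], attribute: m[2], hex: String.new }
    end
  end
  entries << current if current
  # drop attributes for which no hex data was found
  entries.reject { |entry| entry[:hex].empty? }
end

# Keep only the Finder tags attribute.
def tags_only(entries)
  entries.select { |entry| entry[:attribute] == TAGS_ATTRIBUTE }
end

# Restore step: only runs when a dump is present in the current directory.
if File.exist?('.metadata')
  tags_only(parse_xattr_dump(File.read('.metadata'))).each do |entry|
    system('xattr', '-wx', entry[:attribute], entry[:hex], entry[:file])
  end
end
```

Used as a post-merge hook, this would leave all other attributes (such as the quarantine flag) alone, so the initial `xattr -cr .` is intentionally left out here.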