I do a lot of photography, which includes a lot of cosplay conventions that end up being multi-day, all-day continuous shoots. That results in a large amount of data. To store it, I move the final archives off onto large external drives.
Suffice it to say, I've been doing this since before there were affordable cloud storage services and when internet connections weren't quite as powerful (often requiring a modem). The point being, I've got a lot of old drives.
For much of my workflow, I use Apple machines. And while this has the advantage of having the next, newest, bright and shiny, it also has the disadvantage of Apple removing ports from its systems as they try to deprecate older technologies.
The combination of these two things, lots of old drives and not having ports to connect them to modern machines, has left me in the position of trying to transfer the data from an external Maxtor drive using Firewire 400 to a modern Synology DS1817+ via dual-bonded 10 Gb Ethernet.
Having had the foresight to retain each older Macintosh computer that does have the missing ports, I was able to connect the drive via a series of dongles. The drive goes via Firewire 400 to Firewire 800 to Thunderbolt into an old MacBook Pro, which then has a second dongle from Thunderbolt to Ethernet to the Synology.
It would be great if Apple, when deprecating ports, would also sell an uber desktop dock device with all the deprecated ports on it, connecting with whatever latest and greatest high-speed technology they are offering at the time. I'm always looking for recommendations, and while the Dell USB-C Mobile Adapter (DA300) and CalDigit Thunderbolt Station 3 Plus do great for monitors, audio, media cards, and USB, both are lacking on the historic ports, which would be useful for connecting those fantastic older iSight Apple cameras.
I happened to notice as I was transferring data from the drive to the Synology that I was getting errors. Not hardware errors. Not file system corruption errors. But rather, file copy errors. Mind you, both the source disk and the Synology pass all health checks.
So I decided to copy what I could, logging the results, and then deal with the aftermath. Here's how I did that:
$ cp -rv /Volumes/with/Firewire400 /Volumes/of/mounted/Synology 2>&1 | tee copy-files.log
$ grep "^cp" copy-files.log # Look for error messages from the copy command
I got two kinds of errors.
One, "Permission denied" errors. If I had used sudo it would have worked, but this was primarily .Spotlight-V100 directories that I didn't care about.
Two, "File name too long". Now these I did care about. There were only about 150, which given the size of the drive wasn't that bad, but then again these were archived images I didn't want to walk away from.
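To see at a glance how many of each kind turned up, the log can be tallied by error reason. This is a minimal sketch, assuming cp reports errors as "cp: <path>: <reason>" (which is what my log looked like); the function name summarize_cp_errors is made up for illustration.

```sh
# Tally cp errors in a log captured with `tee`, grouped by reason.
# Assumes error lines look like "cp: <path>: <reason>".
# Usage: summarize_cp_errors copy-files.log
summarize_cp_errors() {
    log="$1"
    # Keep only cp's own messages, strip everything up to the last ": ",
    # then count identical reasons, most frequent first.
    grep '^cp: ' "$log" | sed 's/.*: //' | sort | uniq -c | sort -rn
}
```

Each output line is a count followed by the reason, so the "File name too long" total is easy to compare against the "Permission denied" noise.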
Admittedly, I use very descriptive names, both for directories and for filenames. But I was fairly certain I hadn't come close.
While additional fact-checking is needed here, and specifics vary from file system to file system, a casual rule of thumb is that modern Unix-based systems have a maximum filename length of 255 characters and allow a maximum path of 4096 characters. I was nowhere close.
And while more fact-checking is needed here, Synology's encrypted volumes are a little wonky. They use up part of the filename length for encryption material (eCryptfs adds a prefix to the encrypted filename). Empirical evidence suggests file names themselves can then be only ~140 bytes. Matters may get worse if Unicode gets involved where a character isn't equal to just one byte. (source)
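Since the limit is in bytes rather than characters, it helps to measure names that way. The following is a rough sketch for listing names over a byte budget; the 140 default is just the empirical figure mentioned above, not a documented constant, and the function names are my own. It will misbehave on paths that contain newlines.

```sh
# Print the byte length (not the character count) of a single name.
name_bytes() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# List entries under DIR whose final path component exceeds LIMIT bytes.
# LIMIT defaults to 140, the rough figure quoted above (an assumption).
# Usage: find_long_names /path/to/tree [limit]
find_long_names() {
    dir="$1"
    limit="${2:-140}"
    find "$dir" -print | while IFS= read -r path; do
        base=${path##*/}   # strip everything up to the last slash
        if [ "$(name_bytes "$base")" -gt "$limit" ]; then
            printf '%s\n' "$path"
        fi
    done
}
```

Running this against the source drive before copying would show which names are at risk on an encrypted volume.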
Admittedly, this can turn into a real problem for folks trying to do a full system backup, as some of the application and operating system paths do get pretty deep.
In checking my log for errors, I discovered something strange: there were filenames and paths that copied fine yet were longer, both individually and combined, than the ones that were failing. That meant it was something about the filenames or paths themselves.
So, I copied the directory from the external hard drive to the Mac. No problem.
When I tried copying the folder, now on the Mac, to the Samba mounted Synology, I got this slightly more descriptive error: "You can't copy some of these items because their names are too long or contain certain characters that are invalid on the destination volume." Interesting.
A quick survey of the files showed that they all looked normal. I did a quick check to see if there was Unicode (nope), control characters (nope), bad pathing symbols (nope), etc. Everything was normal.
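That survey can be approximated with a one-function check. This sketch flags any name containing a byte outside printable ASCII, which lumps control characters and multi-byte Unicode together; has_odd_bytes is a hypothetical name, and it does not look for path symbols such as slashes or colons.

```sh
# Succeed (exit 0) if NAME contains any byte outside printable ASCII
# (space through tilde). Control characters and UTF-8 sequences both count.
has_odd_bytes() {
    # Delete every printable ASCII byte; anything left over is "odd".
    leftover=$(printf '%s' "$1" | LC_ALL=C tr -d ' -~')
    [ -n "$leftover" ]
}
```

For example, `has_odd_bytes "$name" && echo "$name"` inside a loop over a directory listing prints only the suspicious names.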
And then I noticed the path:
Back in the old MS-DOS days, the console device was named CON:. This was a "special" filename that MS-DOS reserved for itself.
My directory was named Con because it was from a convention, which was the locale of the shoots that happened at the main event.
To test my theory, I opened Finder and went to the mounted Synology drive, created a new folder, and tried to rename it to Con. It instantly failed: 'The name "Con" can't be used. Try using a name with fewer characters, or with no special punctuation marks.' Ah ha!
I renamed the folder from Con to Convention, and the copy to the Synology worked just fine. Problem solved, but not the mystery.
Concerned this was a bug worth reporting to Synology, I tried creating a Con directory with my other computer connected to the same Synology drive and folder, and ...it worked. Ok, so what's different?
My first thought was operating systems (e.g. El Capitan vs Mojave), but it's the mounted file system that matters.
Very recently, I came to the party late in learning that Apple had deprecated their AFP protocol in favor of SMB 3, when I was investigating stalled file copies to Synology.
On the system I was doing the file transfer, I was using SMB3 for speed, reliability, and all the good things that come with it.
On the other system where I was observing the transfer, I was using AFP, primarily from a default I hadn't gotten around to changing.
So, I unmounted the drives on the AFP machine, remounted them as SMB, and then tried to create a Con directory. Surprise, the error about not being able to use that filename appeared here as well.
Historic "special" file names from MS-DOS, which are valid on a Mac and on Synology, can't pass through the SMB protocol (even the modern SMB3) without some sort of rejection or mutation, but can through AFP (though it's deprecated).
This is dangerous. Other names do strange and terrible things. Note, these don't even have the colon after them.
- CON is rejected.
- COM1 to COM4 each mutate into a different filename.
- NUL mutates into a different filename.
- AUX mutates into a different filename.
- LPT1 to LPT3 each mutate into a different filename.
- PRN mutates into a different filename.
The mutated names follow patterns like CDH4BA~N. I didn't put the real mutations in the post for fear that they actually might leak my encryption key somehow.
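To find these before a copy rather than after, a tree can be scanned for the names in the list above. This is a sketch under assumptions: the match is case-insensitive because my Con folder was rejected, and only the names I tested are included. Windows documents a longer set (COM and LPT up to 9, for instance), so widen the pattern if that matters. The function names are hypothetical.

```sh
# Succeed if NAME, ignoring case, is one of the device names listed above.
is_dos_reserved() {
    upper=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    case "$upper" in
        CON|PRN|AUX|NUL|COM[1-4]|LPT[1-3]) return 0 ;;
        *) return 1 ;;
    esac
}

# Print every path under DIR whose final component is a reserved name.
# Usage: find_dos_reserved /path/to/tree
find_dos_reserved() {
    find "$1" -print | while IFS= read -r path; do
        if is_dos_reserved "${path##*/}"; then
            printf '%s\n' "$path"
        fi
    done
}
```

Anything this prints is worth renaming on the source side, the same way Con became Convention.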
While Win10 still prohibits those as file names, this is a Linux and Mac solution. And the 21st century. If AFP is going away, then how would one use those names? Or more importantly, why should they matter now? Can't the operating system, and not the protocol, do the complaining?
So what happens if you have one system mounting a shared drive with AFP and another mounting the same shared drive with SMB, and the person with the AFP drive makes a directory with one of the above names? Glad you asked. On the SMB system you see mutated names. This means the directory structure doesn't appear consistent between networking applications, like rsync.
What happens if the SMB system tries to create a directory with one of the names above? Well, to them it mutates, but to the AFP person, it surprisingly looks like the right name.
This is when things get weird. Let's say the SMB machine creates a folder called AUX, but they now see a folder AZY9U2~9. The AFP machine sees AUX.
If the AFP machine also creates a folder called AZY9U2~9, they'll see that and the original AUX. But the SMB machine now sees AZY9U2~9 twice in the same directory listing, which is bound to cause problems, for software if not for the end user.
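Both symptoms can be spotted from the SMB side with a couple of small helpers. This is only a sketch: the pattern in looks_mangled is inferred from the two example names in this post (six uppercase letters or digits, a tilde, one more), not from Samba's documentation, so it may miss or over-match real mangled names. Both function names are invented.

```sh
# Succeed if NAME has the shape of the mangled names seen in this post:
# six uppercase letters or digits, a tilde, then one more. This shape is
# an assumption drawn from two examples only.
looks_mangled() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[A-Z0-9]{6}~[A-Z0-9]$'
}

# Read one name per line on stdin and print each name that occurs more
# than once, e.g. `ls -1 | duplicate_names` inside a mounted share.
duplicate_names() {
    LC_ALL=C sort | uniq -d
}
```

A non-empty result from duplicate_names in a single directory is a strong hint that two differently named folders are colliding after mangling.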
Update - Name Mangling
This adventure has led into a deeper dive of Samba than I've ever wanted. Turns out this "feature" is called Name Mangling.
The smb.conf file has this enabled by default, which I begrudgingly admit makes sense for backwards compatibility with older systems. According to this ServerFault entry, one can disable it with the setting:
[data]
mangled names = no
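Before changing anything, it is useful to see whether a share already sets this explicitly. The sketch below reads one key from one section of an smb.conf-style file; it assumes simple "key = value" lines and ignores includes, the [global] fallback, and Samba's built-in defaults, so an empty result only means "not set in that section". Samba's own testparm tool is the authoritative way to see effective settings. The function name smb_conf_value is hypothetical.

```sh
# Print the value of KEY inside [SECTION] of an smb.conf-style file.
# Prints nothing if KEY is not set explicitly in that section.
# Usage: smb_conf_value /path/to/smb.conf data "mangled names"
smb_conf_value() {
    awk -v section="$2" -v key="$3" '
        /^[ \t]*[#;]/ { next }                    # skip comment lines
        /^[ \t]*\[.*\][ \t]*$/ {                  # section header
            name = $0
            sub(/^[ \t]*\[/, "", name)
            sub(/\][ \t]*$/, "", name)
            in_section = (tolower(name) == tolower(section))
            next
        }
        in_section {
            eq = index($0, "=")
            if (eq == 0) next
            k = substr($0, 1, eq - 1)
            v = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)        # trim key
            gsub(/^[ \t]+|[ \t]+$/, "", v)        # trim value
            if (tolower(k) == tolower(key)) print v
        }
    ' "$1"
}
```

Section and parameter names are compared case-insensitively here, matching how Samba treats them.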
This feels like one of those cases where you get troubled by peripheral symptoms, but the error messages are inadequate and without knowing what's going on or why, it's difficult to even know what to Google for.
Turns out someone else had this problem over on Ask Different.
Admittedly, I got on this path because a copy simply failed without explanation. If I had encountered the mangled names first, that would have been a much faster path to discovering that the issue was in Samba, and specifically its name mangling.
It seems there are now three options, none good:
- Use AFP, although AFP is going away and has been having reliability issues for long lasting file transfer sessions.
- Use SMB for speed and reliability, and accept that it's going to do strange things to my filenames transparently and possibly abort my backups.
- Turn off name mangling, but the author of the ServerFault entry says this will create problems when you connect via SMB to the server.
I'm guessing that all devices, everywhere, have to elect not to use name mangling, and it's hard to trust Apple won't put things back after each OS update or install.