[PATCH] Add additional error handling to safe

Discussion:

[PATCH] Add additional error handling to safe_rename().

Kevin J. McCarthy

2018-08-27 01:48:46 UTC

Here's a first try at a patch to check for the historically documented
"link() returning -1 but succeeding" case for NFS.

--
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C 5308 ADEF 7684 8031 6BDA

Derek Martin

2018-08-28 20:27:51 UTC

Permalink

Post by Kevin J. McCarthy
Here's a first try at a patch to check for the historically documented
"link() returning -1 but succeeding" case for NFS.

This seems fine. Though, it seems it leaves in the stat comparison if
link() succeeds, which is superfluous (could cause needless
performance hit on slow networked filesystems), and violates KISS and
"Don't write code you don't need" principles. I personally would have
avoided goto by eliminating that superfluous check, and putting the
rename() bits in an else clause to the if (lstat() && lstat() &&
compare_stat()) conditional.

It obviously also does not attempt to address the follow symlink
behavior I mentioned, which is also fine. It may well be that
safe_rename() is never used in contexts where that's likely to matter,
so it's possible that it would be reasonable to ignore that issue.
--
Derek D. Martin http://www.pizzashack.org/ GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address. Replying to it will result in
undeliverable mail due to spam prevention. Sorry for the inconvenience.

Post by Kevin J. McCarthy
From f0faf6cf205e0b96c2fb93ea12393458843aa485 Mon Sep 17 00:00:00 2001
Date: Sun, 26 Aug 2018 18:43:20 -0700
Subject: [PATCH] Add additional error handling to safe_rename().
It is apparently possible for link() to return an error but the link
to still be created. Add a double check for that case. If the files
match, unlink the src and return success.
---
lib.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
diff --git a/lib.c b/lib.c
index 9a5d5325..0f65a641 100644
--- a/lib.c
+++ b/lib.c
@@ -446,12 +446,31 @@ int safe_symlink(const char *oldpath, const char *newpath)
int safe_rename (const char *src, const char *target)
{
struct stat ssb, tsb;
+ int link_errno;
if (!src || !target)
return -1;
if (link (src, target) != 0)
{
+ link_errno = errno;
+
+ /*
+ * It is historically documented that link can return -1 if NFS
+ * dies after creating the link. In that case, we are supposed
+ * to use stat to check if the link was created.
+ */
+ if ((lstat (src, &ssb) == 0) &&
+ (lstat (target, &tsb) == 0) &&
+ (compare_stat (&ssb, &tsb) == 0))
+ {
+ dprint (1, (debugfile,
+ "safe_rename: link (%s, %s) reported failure: %s (%d) but actually succeded\n",
+ src, target, strerror (errno), errno));
+ goto success;
+ }
+
+ errno = link_errno;
/*
* Coda does not allow cross-directory links, but tells
@@ -533,11 +552,11 @@ int safe_rename (const char *src, const char *target)
}
#endif
/*
* Unlink the original link. Should we really ignore the return
* value here? XXX
*/
-
if (unlink (src) == -1)
{
dprint (1, (debugfile, "safe_rename: unlink (%s) failed: %s (%d)\n",
--
2.18.0

Kevin J. McCarthy

2018-08-29 01:08:46 UTC

Permalink

Post by Derek Martin

Post by Kevin J. McCarthy
Here's a first try at a patch to check for the historically documented
"link() returning -1 but succeeding" case for NFS.

This seems fine.

Thank you for taking a look, Derek.

Post by Derek Martin
Though, it seems it leaves in the stat comparison if
link() succeeds, which is superfluous (could cause needless
performance hit on slow networked filesystems), and violates KISS and
"Don't write code you don't need" principles

My strategy with these two patches was to change the code as little as
possible. I agree the stats if link() succeeds are now 95% useless, but
my thinking was they were there before and provided some kind of check
that the/a target was actually there after the link. I could be
persuaded to pull them inside the #if 0, but I figure they are doing as
much harm as they have been for the past 20 years... :-/

Post by Derek Martin
It obviously also does not attempt to address the follow symlink
behavior I mentioned, which is also fine. It may well be that
safe_rename() is never used in contexts where that's likely to matter,
so it's possible that it would be reasonable to ignore that issue.

Yes, my thinking was like that. Technically I think you have a valid
point, but the code apparently wasn't used in any context where the
source was a symlink. It would have failed the compare_stat() in that
case and gone into an infinite loop too, if it were.

So, I again decided to minimize changes, keeping the use of lstat()
consistent with previous behavior.

--
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C 5308 ADEF 7684 8031 6BDA

Derek Martin

2018-08-29 17:16:14 UTC

Permalink

Post by Kevin J. McCarthy

I agree with your assessment, except for one detail: The check is
next to useless because we already know the link succeeded, and after
the fact the check has a race condition: Even if the check succeeds,
you can't know that the file will still exist when Mutt next tries to
do something to it. So it really doesn't do anything useful.

In retrospect it seems clear to me that whoever wrote that code just
got it wrong. The harm is minimal, except that as I pointed out, it
can cause a performance hit. On a networked file system, stat() is a
relatively expensive operation (in the sense that the data transfer is
predominated by overhead--round trip time and network headers). If
the network is slow, then the more of these your application does, the
worse it will perform. So for example, this could have a significant
performance impact on maildir over NFS, if, say, you tag a bunch of
messages in order to mark them as not new.

On a fast, unloaded network, the user is unlikely to notice at all.
If the network is just slow, the user may or may not notice, depending
on how many messages... But if the network is congested, you're only
adding to the congestion by doing unnecessary operations. It would be
interesting to do a large test, comparing the time with and without
those stats, ideally on a slow network...

So, I think the post-success stat() calls should should come out, and
I wouldn't bother using #if to remove it. I think it's clearly just a
bad idea to do that check there, and we should remove any trace of it
so that no one is tempted to put it back in at some point in the
future. But if you feel there's a reason to keep it I won't continue
to argue the point.

Post by Kevin J. McCarthy

Sure. It's easy enough to address if a user hits it--probably worth
documenting (in the code) though, if nothing else.

--
Derek D. Martin http://www.pizzashack.org/ GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address. Replying to it will result in
undeliverable mail due to spam prevention. Sorry for the inconvenience.

Kevin J. McCarthy

2018-08-29 20:26:22 UTC

Permalink

Post by Derek Martin
So, I think the post-success stat() calls should should come out, and
I wouldn't bother using #if to remove it. I think it's clearly just a
bad idea to do that check there, and we should remove any trace of it
so that no one is tempted to put it back in at some point in the
future. But if you feel there's a reason to keep it I won't continue
to argue the point.

I've pulled it out, but for now I'll leave it in the #if0 with the
compare_stat() call. I'll add a todo to completely yank it in the next
development cycle though.

Post by Derek Martin
Sure. It's easy enough to address if a user hits it--probably worth
documenting (in the code) though, if nothing else.

I've added a comment about that as a breadcrumb if needed in the future.

Thanks again.

--
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C 5308 ADEF 7684 8031 6BDA