GNU bug report logs -
#62404
--reflink=auto not falling back appropriately to standard copy in all cases
Previous Next
Reported by: Pádraig Brady <P <at> draigBrady.com>
Date: Thu, 23 Mar 2023 13:04:01 UTC
Severity: normal
Done: Pádraig Brady <P <at> draigBrady.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 62404 in the body.
You can then email your comments to 62404 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-coreutils <at> gnu.org
:
bug#62404
; Package
coreutils
.
(Thu, 23 Mar 2023 13:04:01 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Pádraig Brady <P <at> draigBrady.com>
:
New bug report received and forwarded. Copy sent to
bug-coreutils <at> gnu.org
.
(Thu, 23 Mar 2023 13:04:01 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Details at
https://github.com/termux/termux-packages/issues/15706#issuecomment-1481144831
But in summary, the new --reflink errno checking in coreutils 9.2
is not working appropriately on android 4.9 kernels at least.
cheers,
Pádraig
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#62404
; Package
coreutils
.
(Thu, 23 Mar 2023 13:40:01 GMT)
Full text and
rfc822 format available.
Message #8 received at 62404 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On 23/03/2023 13:03, Pádraig Brady wrote:
> Details at
> https://github.com/termux/termux-packages/issues/15706#issuecomment-1481144831
>
> But in summary, the new --reflink errno checking in coreutils 9.2
> is not working appropriately on android 4.9 kernels at least.
Proposed patch attached.
It may be better to key on kernel version rather than __ANDROID__
(along the lines of https://github.com/coreutils/coreutils/commit/ba5e6885d),
but that needs more investigation.
cheers,
Pádraig
[copy-clone-android.patch (text/x-patch, attachment)]
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#62404
; Package
coreutils
.
(Thu, 23 Mar 2023 14:56:01 GMT)
Full text and
rfc822 format available.
Message #11 received at 62404 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On 23/03/2023 13:39, Pádraig Brady wrote:
> On 23/03/2023 13:03, Pádraig Brady wrote:
>> Details at
>> https://github.com/termux/termux-packages/issues/15706#issuecomment-1481144831
>>
>> But in summary, the new --reflink errno checking in coreutils 9.2
>> is not working appropriately on android 4.9 kernels at least.
>
> Proposed patch attached.
> It may be better to key on kernel version rather than __ANDROID__
> (along the lines of https://github.com/coreutils/coreutils/commit/ba5e6885d),
> but that needs more investigation.
Actually it seems the errno checking may be too brittle in general,
https://www.reddit.com/r/Proxmox/comments/11zieha/cpmvinstall_are_broken_in_unprivileged_containers/
Perhaps we should use the above __ANDROID__ logic in all cases,
so that we fallback unless there is a specific reason not to.
The attached does that.
cheers,
Pádraig
[copy-clone-fallback.patch (text/x-patch, attachment)]
Changed bug title to '--reflink=auto not falling back appropriately to standard copy in all cases' from '--reflink=auto not falling back appropriately on older kernels'
Request was from
Pádraig Brady <P <at> draigBrady.com>
to
control <at> debbugs.gnu.org
.
(Thu, 23 Mar 2023 17:31:01 GMT)
Full text and
rfc822 format available.
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#62404
; Package
coreutils
.
(Thu, 23 Mar 2023 23:06:02 GMT)
Full text and
rfc822 format available.
Message #16 received at 62404 <at> debbugs.gnu.org (full text, mbox):
Thanks for looking into this.
On 3/23/23 07:55, Pádraig Brady wrote:
> Perhaps we should use the above __ANDROID__ logic in all cases,
> so that we fallback unless there is a specific reason not to.
Yes, that sounds better.
> +/* Whether the errno indicates operation is a transient failure.
> + I.e., a failure that would indicate the operation _is_ supported,
> + but has failed in a terminal way. */
I got confused by the terminology here, as "transient failure" has the
opposite connotation from terminal failure. I suggest that we remove the
function (since it's used only once and its name is confusing) and
replace its calling code as suggested below.
> - if (errno == EPERM && *total_n_read == 0)
> + if (is_CLONENOTSUP (errno, *total_n_read))
The *total_n_read should be *total_n_read != 0 partly for style and
partly to be portable to ancient compilers that don't properly support
bool. More important, I suggest leaving is_CLONENOTSUP's API as-is, and
changing the 'if' to:
if (*total_n_read == 0 && is_CLONENOTSUP (errno))
as I don't see why we should treat EPERM differently from ENOSYS etc.
> /* Whether the errno from FICLONE, or copy_file_range
> indicates operation is not supported for this file or file system. */
>
> static bool
> -is_CLONENOTSUP (int err)
> +is_CLONENOTSUP (int err, bool partial)
Since the code no longer calls is_CLONENOTSUP when FICLONE fails, the
comment should be changed to not mention FICLONE.
> + /* If the clone operation is not creating the destination (i.e., FICLONE)
> + then we could possibly be more restrictive with errors deemed non terminal.
> + However doing that like in the following
> + would be more coupled to disparate errno handling on various systems.
> + if (0 <= dest_desc)
> + transient_failure = ! is_CLONENOTSUP (errno, false); */
> + bool transient_failure = is_transient_failure (errno);
I had trouble with those two "transient_failure"s lining up, plus the
terminology thing mentioned above. I suggest changing this to:
/* When FICLONE fails, report failure only with errno values known
to mean trouble when FICLONE is supported and called properly.
Do do not report failure merely because !is_CLONENOTSUP (errno),
as too many systems yield oddball errno values here. */
bool report_failure = (errno == EIO || errno == ENOMEM
|| errno == ENOSPC || errno == EDQUOT);
and then replace all occurrences of "transient_failure" with
"report_failure".
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#62404
; Package
coreutils
.
(Thu, 23 Mar 2023 23:15:02 GMT)
Full text and
rfc822 format available.
Message #19 received at 62404 <at> debbugs.gnu.org (full text, mbox):
On 23/03/2023 22:57, Paul Eggert wrote:
> Thanks for looking into this.
>
> On 3/23/23 07:55, Pádraig Brady wrote:
>
>> Perhaps we should use the above __ANDROID__ logic in all cases,
>> so that we fallback unless there is a specific reason not to.
>
> Yes, that sounds better.
>
>> +/* Whether the errno indicates operation is a transient failure.
>> + I.e., a failure that would indicate the operation _is_ supported,
>> + but has failed in a terminal way. */
>
> I got confused by the terminology here, as "transient failure" has the
> opposite connotation from terminal failure. I suggest that we remove the
> function (since it's used only once and its name is confusing) and
> replace its calling code as suggested below.
>
>
>> - if (errno == EPERM && *total_n_read == 0)
>> + if (is_CLONENOTSUP (errno, *total_n_read))
>
> The *total_n_read should be *total_n_read != 0 partly for style and
> partly to be portable to ancient compilers that don't properly support
> bool. More important, I suggest leaving is_CLONENOTSUP's API as-is, and
> changing the 'if' to:
>
> if (*total_n_read == 0 && is_CLONENOTSUP (errno))
>
> as I don't see why we should treat EPERM differently from ENOSYS etc.
>
>
>> /* Whether the errno from FICLONE, or copy_file_range
>> indicates operation is not supported for this file or file system. */
>>
>> static bool
>> -is_CLONENOTSUP (int err)
>> +is_CLONENOTSUP (int err, bool partial)
>
> Since the code no longer calls is_CLONENOTSUP when FICLONE fails, the
> comment should be changed to not mention FICLONE.
>
>
>> + /* If the clone operation is not creating the destination (i.e., FICLONE)
>> + then we could possibly be more restrictive with errors deemed non terminal.
>> + However doing that like in the following
>> + would be more coupled to disparate errno handling on various systems.
>> + if (0 <= dest_desc)
>> + transient_failure = ! is_CLONENOTSUP (errno, false); */
>> + bool transient_failure = is_transient_failure (errno);
>
> I had trouble with those two "transient_failure"s lining up, plus the
> terminology thing mentioned above. I suggest changing this to:
>
> /* When FICLONE fails, report failure only with errno values known
> to mean trouble when FICLONE is supported and called properly.
> Do do not report failure merely because !is_CLONENOTSUP (errno),
> as too many systems yield oddball errno values here. */
> bool report_failure = (errno == EIO || errno == ENOMEM
> || errno == ENOSPC || errno == EDQUOT);
>
> and then replace all occurrences of "transient_failure" with
> "report_failure".
Thanks for the review.
I'll adjust as suggested.
The is_CLONENOTSUP() change _was_ a little contrived
since it wasn't actually needed in the new implementation.
It would only be needed if we ever wanted to change back
to the more aggressive errno checks after FICLONE,
in which case the is_CLONENOTSUP() would then be correct.
I'll leave it as is and comment for now.
thanks,
Pádraig
Reply sent
to
Pádraig Brady <P <at> draigBrady.com>
:
You have taken responsibility.
(Fri, 24 Mar 2023 13:22:02 GMT)
Full text and
rfc822 format available.
Notification sent
to
Pádraig Brady <P <at> draigBrady.com>
:
bug acknowledged by developer.
(Fri, 24 Mar 2023 13:22:02 GMT)
Full text and
rfc822 format available.
Message #24 received at 62404-done <at> debbugs.gnu.org (full text, mbox):
The fix for this is now pushed at:
https://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=093a8b4bf
marking this as done.
thanks,
Pádraig
Information forwarded
to
bug-coreutils <at> gnu.org
:
bug#62404
; Package
coreutils
.
(Sat, 25 Mar 2023 20:35:01 GMT)
Full text and
rfc822 format available.
Message #27 received at 62404 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Thanks for installing that. I found the comments still a bit confusing
so I pushed the attached; hope it's OK.
It is annoying that in the common case where A is a regular file and B
does not exist but will be created on the same file system, the syscalls
start out:
openat(AT_FDCWD, "B", O_RDONLY|O_PATH|O_DIRECTORY) = -1 ENOENT
newfstatat(AT_FDCWD, "A", ...}, 0) = 0
openat(AT_FDCWD, "A", O_RDONLY) = 3
newfstatat(3, "", ..., AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "B", O_WRONLY|O_CREAT|O_EXCL, 0775) = 4
ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EOPNOTSUPP
newfstatat(4, "", ..., AT_EMPTY_PATH) = 0
They should be just:
openat(AT_FDCWD, "A", O_RDONLY) = 3
newfstatat(3, "", ..., AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "B", O_WRONLY|O_CREAT|O_EXCL, 0775) = 4
ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EOPNOTSUPP
newfstatat(4, "", ..., AT_EMPTY_PATH) = 0
as there should be no need to stat the source twice, or to try to open
the destination twice. But this is a performance improvement for another
day.
[0001-cp-clarify-commentary.patch (text/x-patch, attachment)]
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Sun, 23 Apr 2023 11:24:05 GMT)
Full text and
rfc822 format available.
This bug report was last modified 1 year and 338 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.