GNU bug report logs - #21665
Use of mmap for large files

Package: diffutils;

Reported by: Maurice van der Pot <griffon26 <at> kfk4ever.com>

Date: Sun, 11 Oct 2015 17:09:01 UTC

Severity: normal

To reply to this bug, email your comments to 21665 AT debbugs.gnu.org.


Report forwarded to bug-diffutils <at> gnu.org:
bug#21665; Package diffutils. (Sun, 11 Oct 2015 17:09:01 GMT)

Acknowledgement sent to Maurice van der Pot <griffon26 <at> kfk4ever.com>:
New bug report received and forwarded. Copy sent to bug-diffutils <at> gnu.org. (Sun, 11 Oct 2015 17:09:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Maurice van der Pot <griffon26 <at> kfk4ever.com>
To: bug-diffutils <at> gnu.org
Subject: Use of mmap for large files
Date: Sun, 11 Oct 2015 15:42:03 +0200
I am working on an application suitable for visually merging large files.
This application delegates determination of differences to GNU diff.

Unfortunately I have found that diff reads the entire input files into
memory, leading to "/usr/bin/diff: memory exhausted" messages on the
types of files I'd like to support.

Would you be open to patches that enable diffing large files by using
mmap?

Kind regards,
Maurice.

-- 
Maurice van der Pot

Kdiff3 developer   griffon26 <at> kfk4ever.com   http://kdiff3.sourceforge.net
Tdiff3 developer                            https://github.com/Griffon26/tdiff3

Information forwarded to bug-diffutils <at> gnu.org:
bug#21665; Package diffutils. (Sun, 11 Oct 2015 21:16:02 GMT)

Message #8 received at submit <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: bug-diffutils <at> gnu.org
Subject: Re: [bug-diffutils] bug#21665: Use of mmap for large files
Date: Sun, 11 Oct 2015 14:14:53 -0700
Maurice van der Pot wrote:
> Would you be open to patches that enable diffing large files by using
> mmap?

I doubt whether that would help that much, as it still needs to construct 
information about each line, and that information consumes memory too.  Doing 
this in secondary storage would be a bear.  In practice when I've run into this 
problem, I've either gotten a bigger machine or made my input lines shorter. 
Preferably the former.
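
[Illustration, not part of the original message.] The point about per-line
information can be made concrete with a minimal sketch. It assumes a
simplified index of one start offset and one length per line; this is not
the data structure GNU diff actually uses, and the names index_lines and
index_bytes are hypothetical. The file contents are mapped with mmap, yet
the index still lives in ordinary process memory and grows with the number
of lines.

```python
import mmap
import os
from array import array


def index_lines(path):
    """Map the file read-only and record (start offset, length) per line.

    Assumption for illustration: a real diff keeps more per-line state
    (for example hashes or equivalence classes), so this is a lower bound.
    """
    starts = array("q")   # signed 64-bit start offset of each line
    lengths = array("q")  # signed 64-bit length of each line, newline included
    if os.path.getsize(path) == 0:
        return starts, lengths  # mmap cannot map an empty file
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        pos, size = 0, len(m)
        while pos < size:
            nl = m.find(b"\n", pos)
            # A final line without a trailing newline ends at end of file.
            end = size if nl == -1 else nl + 1
            starts.append(pos)
            lengths.append(end - pos)
            pos = end
    return starts, lengths


def index_bytes(starts, lengths):
    """Bytes held by the per-line index, excluding the mapped file itself."""
    return starts.itemsize * len(starts) + lengths.itemsize * len(lengths)
```

With these assumptions the index costs 16 bytes per line regardless of line
length, so for a file made of very short lines the index alone can be larger
than the file. Longer lines shrink that ratio; shorter ones make it worse.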




Information forwarded to bug-diffutils <at> gnu.org:
bug#21665; Package diffutils. (Mon, 02 May 2016 01:28:01 GMT)

Message #11 received at 21665 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Maurice van der Pot <griffon26 <at> kfk4ever.com>
Cc: 21665 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#21665: Use of mmap for large files
Date: Sun, 1 May 2016 18:27:03 -0700
tags 21665 notabug
close 21665
done

On Sun, Oct 11, 2015 at 6:42 AM, Maurice van der Pot
<griffon26 <at> kfk4ever.com> wrote:
> I am working on an application suitable for visually merging large files.
> This application delegates determination of differences to GNU diff.
>
> Unfortunately I have found that diff reads the entire input files into
> memory, leading to "/usr/bin/diff: memory exhausted" messages on the
> types of files I'd like to support.
>
> Would you be open to patches that enable diffing large files by using
> mmap?

As Paul responded in http://bugs.gnu.org/21665#8, using mmap seems
unlikely to help much, but if you write the patch and demonstrate that
it does make a difference, we'll be very interested, and I will
happily reopen the issue.

For now, I'm marking this as notabug and closing it.




This bug report was last modified 7 years and 362 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.