Re: data loss when doing ls-remote and piped to command
From: Rolf Eike Beer
Date: Fri Sep 17 2021 - 02:59:16 EST
On Thursday, 16 September 2021, 22:42:22 CEST, Junio C Hamano wrote:
> Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
> > On Thu, Sep 16, 2021 at 5:17 AM Rolf Eike Beer <eb@xxxxxxxxx> wrote:
> >> On Thursday, 16 September 2021, 12:12:48 CEST, Tobias Ulmer wrote:
> >> > > The redirection seems to be an important part of it. I now did:
> >> > >
> >> > > git ... 2>&1 | sha256sum
> >> >
> >> > I've tried to reproduce this since yesterday, but couldn't until now:
> >> >
> >> > 2>&1 made all the difference, took less than a minute.
> >
> > So if that redirection is what matters, and what causes problems, I
> > can almost guarantee that the reason is very simple:
> > ...
> > Anyway. That was a long email just to tell people it's almost
> > certainly user error, not the kernel.
>
> Yes, 2>&1 will mix messages from the standard error stream at random
> places in the output, which explains the checksum quite well.
If there were any errors. The point is: if I run the command a hundred times
with ">/dev/null", so that only stderr reaches the terminal, there is never
any output on stderr at all. If I redirect stderr into a file, it is empty
after all of this (yes, I did append, not overwrite).
Granted, the particular construct in this case is somewhat nonsensical. I
only hit it because some tool here used a very similar construct and
suddenly started failing. "less" isn't the original reproducer, it was just
something I started testing with so that I could easily inspect the output
visually.
What you need is a _fast_ git server. kernel.org or github.com seem to be too
slow for this unless you sit somewhere in their datacenter. Use something in
your local network, in my case a Xeon E5 with lots of RAM, connected via
1 Gbit/s Ethernet.
And the reader must be "somewhat" slow. Using sha256sum works reliably for me.
Using "wc -l" does not; md5sum and sha1sum also seem to be too fast.
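For anyone who wants to script the comparison, here is a minimal sketch, not
the setup I actually used. It runs a command repeatedly with stderr merged
into stdout (the equivalent of "2>&1"), hashes the combined stream with
SHA-256 and counts how many distinct digests show up. The names digest_of_run
and count_digests are made up for illustration, and a Python reader will not
have the same timing as sha256sum, so it may well fail to trigger the problem.

```python
import hashlib
import subprocess
from collections import Counter


def digest_of_run(cmd):
    """Run cmd with stderr merged into stdout and hash the combined stream."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    h = hashlib.sha256()
    # Read in modest chunks so the pipe can fill up while we are hashing.
    for chunk in iter(lambda: proc.stdout.read(4096), b""):
        h.update(chunk)
    proc.stdout.close()
    proc.wait()
    return h.hexdigest()


def count_digests(cmd, runs=100):
    """Return a Counter mapping each digest seen to the number of runs producing it."""
    return Counter(digest_of_run(cmd) for _ in range(runs))
```

Pass the command as an argument list, e.g. count_digests(["git", ...]) with
the real arguments filled in. More than one key in the result means the
output differed between runs.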
When I run the whole thing under strace I can't see the effect, which isn't
really surprising. But the traces do differ between the two cases. With the
redirection "2>&1":
ioctl(2, TCGETS, 0x7ffd6f119b10) = -1 ENOTTY (Inappropriate ioctl for device)
and without:
ioctl(2, TCGETS, {B38400 opost isig icanon echo ...}) = 0
AFAICT this is the only place where fd 2 is used at all during the whole time.
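That pair of results is what an isatty(2) check looks like under strace on
Linux with glibc: isatty() goes through tcgetattr(), which issues the TCGETS
ioctl and gets ENOTTY when the descriptor is not a terminal. Whether that
check is what changes the behaviour here is only my assumption. A small
sketch showing the same difference from a child process, with the helper name
child_sees_tty_on_stderr made up for illustration:

```python
import subprocess
import sys

# The child reports whether its own fd 2 is a terminal.
CHILD = "import os; print(os.isatty(2))"


def child_sees_tty_on_stderr(merge_stderr):
    """Start a child Python; merge_stderr=True mimics the shell's 2>&1 into a pipe."""
    stderr = subprocess.STDOUT if merge_stderr else None  # None: inherit our stderr
    out = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=stderr,
        check=True,
    ).stdout
    return out.strip() == b"True"
```

With merge_stderr=True the child's fd 2 is the pipe, so the check fails as in
the ENOTTY trace. With merge_stderr=False the result depends on where the
parent's stderr points.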
Regards,
Eike
--
Rolf Eike Beer, emlix GmbH, http://www.emlix.com
Fon +49 551 30664-0, Fax +49 551 30664-11
Gothaer Platz 3, 37083 Göttingen, Germany
Registered office: Göttingen, District Court (Amtsgericht) Göttingen HR B 3160
Managing directors: Heike Jordan, Dr. Uwe Kracke – VAT ID: DE 205 198 055
emlix - smart embedded open source