Steve Litt via plug on 2 Apr 2024 19:24:17 -0700



Re: [PLUG] XZ scanner


Rich Freeman via plug said on Tue, 2 Apr 2024 18:43:44 -0400

>On Tue, Apr 2, 2024 at 4:47 PM Steve Litt via plug
><plug@lists.phillylinux.org> wrote:
>>
>> "Reinventing the wheel" is a cute and persuasive phrase for
>> trivializing developers who code their own rather than gleaning other
>> peoples' code (OPC) far and wide, but for the past several years the
>> OPC caused complexification with its attendant voluminous attack
>> surface has been on full display.  
>
>This is why everybody and their uncle was writing their own bubble
>sorts until standard libraries started including a way to sort
>collections.  You're just trading one set of problems for another.

1) Sorting isn't a good example because a quicksort doesn't, or
   shouldn't, contain anybody else's code.

2) Nobody I knew wrote bubble sorts for production code. Insert sorts,
   maybe, merge sorts when it couldn't be done in RAM, sure. And when I
   started in this business in 1984, Knuth's "Data Structures and
   Algorithms" was already available at a reasonable price, and it
   explained why quicksorts are so fast and gave the (Pascal) code to
   implement it.

3) As far as sorting collections goes, I've never had to do so myself [4],
   but it seems to me this would be as simple as going through the
   collection, building an index file of the data to be sorted on, and
   either accessing the collection through the index file or physically
   sorting the collection using the index file. I'm sure there are better
   ways, but this way is probably good enough, at least as a first
   stab at the problem. But as I said...

[4] A quicksort is just the kind of Other People's Code (OPC) you
    *would* use, always assuming it doesn't incorporate all sorts of
    other features, code and libraries.
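
A minimal sketch of the index approach from point 3, in Python. The
records and field names are made up for illustration, and an in-memory
list of positions stands in for the index file. It leans on the built-in
sorted(), which is the kind of self-contained OPC footnote [4] is
talking about.

```python
# Hypothetical records; the field names are made up for illustration.
records = [
    {"name": "xz", "size": 1200},
    {"name": "bash", "size": 900},
    {"name": "sshd", "size": 3100},
]

def build_index(collection, key):
    """Return positions into collection, ordered by key(record)."""
    return sorted(range(len(collection)), key=lambda i: key(collection[i]))

def sorted_view(collection, index):
    """Read the collection through the index without moving any records."""
    return (collection[i] for i in index)

def physically_sorted(collection, index):
    """Build a new, physically reordered copy using the index."""
    return [collection[i] for i in index]

index = build_index(records, key=lambda r: r["size"])
names_by_size = [r["name"] for r in sorted_view(records, index)]
```

The original list is left untouched; only the index knows the sorted
order until physically_sorted() is called.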


>
>> >Even on something like the kernel or
>> >a browser I bet you could slowly work your contributors in such that
>> >they become the majority of eyeballs in a single subsystem and
>> >become trusted to get code far enough along the QA process that it
>> >doesn't get as much close attention.  
>>
>> Yes. This is what happens when software gets big, ugly, entangled,
>> and poorly designed.  
>
>Uh, how would you fix Linux or any of the modern browsers so that they
>aren't "poorly designed?"

How would your $1M/year gang of professionals fix these things? The
Linux kernel is what it is: We just need to trust the crew for that,
and use distros that don't link unneeded libraries to it.

As far as browsers go, it's too bad that HTML wasn't made an
XML dialect right from the start, because doing so would have cleared
up a whole lot of these problems. By the time XHTML came along, there
was too much existing HTML that nobody wanted to rewrite.

But you asked me a question. The only way to fix it would be to build,
from scratch, hopefully in C or some other fast, low resource, simple
language, a browser conforming to the HTML5 specs. Web authors too lazy
to rewrite their HTML to HTML5 specs, sucks to be them. It would cost a
lot more than $1M/year if you want to get it done before HTML5 is
deprecated, and no way volunteers would do this.

>
>Complex software isn't inherently bad.  

Yes it is.

>It is just beyond the total
>comprehension of a single developer.

Which is an extremely bad thing. Nobody can even draw a box-and-line
block diagram for it because it's not understood. It's murder to debug,
with plenty of nooks and crannies to hide bugs, both accidental and
deliberate.

>
>It really doesn't matter if you split it up into 100 simpler parts,
>you still have the same problem that those parts need to trust each
>other to work.  

Dividing it into simple parts is exactly what's called for, if and only
if those parts are connected via a thin, simple, visible interface so
each part can be tested alone, and any part can be replaced by a dummy
that simply articulates its inputs and asks for the desired outputs.
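
A rough sketch of what such a thin interface and dummy could look like,
in Python. Everything here (load_config, DummyDecompressor, the
key=value format) is hypothetical and only meant to illustrate the
idea: the part under test sees a bytes-in, bytes-out callable, so the
real decompressor can be swapped for a stand-in that records its inputs
and hands back whatever output the tester asked for.

```python
from typing import Callable

# The thin, visible interface: bytes in, bytes out.
Decompressor = Callable[[bytes], bytes]

def load_config(blob: bytes, decompress: Decompressor) -> dict:
    """Part under test: parse 'key=value' lines from a compressed blob."""
    text = decompress(blob).decode("utf-8")
    pairs = (line.split("=", 1) for line in text.splitlines() if "=" in line)
    return {k.strip(): v.strip() for k, v in pairs}

class DummyDecompressor:
    """Stand-in that records each input and returns a canned output."""

    def __init__(self, canned: bytes):
        self.canned = canned
        self.seen = []

    def __call__(self, blob: bytes) -> bytes:
        self.seen.append(blob)
        return self.canned

dummy = DummyDecompressor(b"port=22\nuser=steve\n")
config = load_config(b"opaque-bytes", dummy)
```

load_config() can be tested without any compression library installed,
and dummy.seen shows exactly what crossed the interface.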


>After all, this issue occurred in a library that is
>fairly simple already, and if you just re-implemented it dozens of
>times that is just dozens of more places where somebody could have
>implanted the same bug and nobody would have noticed, since it would
>have been just as obscure as a fragment of a larger program.

You're right. It's as simple as it can be, given the problem domain.
It's a shame that commercial Linuces like Red Hat didn't give it a
once-over before using it, and again each time it was changed.


>
>> So let's not make it easy for them. Before incorporating a library,
>> everyone should ask:
>>
>> * Are the library's features worth the complexification and magnified
>>   attack surface?
>> * How easy would it be to achieve the desired outcome, perhaps in a
>>   different form, with a reasonable number of lines of first person
>>   code?  
>
>Uh, just how easy do you think it is to implement your own lzma
>decompressor, and what is the likely result if you get something
>subtly wrong?

I'm not good enough to do that. As I mentioned, this is not a good case
for not using other people's code.

What might be interesting, and something I could do if I wanted, is to
create a program that scans source code for things like memcpy, strcpy,
and other stuff that's just asking for trouble, and reports on it. In
this case, for instance, the safe_fprintf function in libarchive had
earlier been replaced by a less secure variant. If that variant could
have been scanned for, this would have been nipped in the bud. See
https://arstechnica.com/security/2024/04/what-we-know-about-the-xz-utils-backdoor-that-almost-infected-the-world/
for details. And at the very least, devs should fix all warnings before
releasing software.
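
A first stab at such a scanner might look like the Python sketch below.
The watch list is illustrative, not authoritative, and this is a plain
text match rather than a real C parser, so it will also flag names that
appear inside comments or string literals, and it only catches what is
on the list.

```python
import re

# Illustrative watch list; an assumption, not an authoritative set.
RISKY_CALLS = ("memcpy", "strcpy", "strcat", "sprintf", "gets")

# Match a watched name used as a call: a whole word followed by "(".
PATTERN = re.compile(r"\b(" + "|".join(RISKY_CALLS) + r")\s*\(")

def scan_source(text):
    """Return (line_number, function_name) for each watched call found."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits

# Made-up C fragment used only to exercise the scanner.
sample = """\
#include <string.h>
void copy(char *dst, const char *src) {
    strcpy(dst, src);
    my_memcpy(dst, src, 4);
}
"""
report = scan_source(sample)
```

The word-boundary check keeps look-alikes such as my_memcpy or fgets
from being reported as memcpy or gets.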

Speaking of
https://arstechnica.com/security/2024/04/what-we-know-about-the-xz-utils-backdoor-that-almost-infected-the-world/
, according to this web page the way the xz exploit got to ssh was that
"Debian and many other Linux distributions add a patch to link sshd to
systemd, a program that loads a variety of services during the system
bootup. Systemd, in turn, links to liblzma, and this allows xz Utils to
exert control over sshd." Complexification increases attack surface.

SteveT

Steve Litt 

Autumn 2023 featured book: Rapid Learning for the 21st Century
http://www.troubleshooters.com/rl21
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug