

Live Toot of reading P2739r0 since I'm finally sitting down again. Also I have hot chocolate and it's cold, so this is nice reading material. Albeit... wait, is this document only 2 pages? I put off reading it this long for that??
in reply to Björkus "No time_t to Die" Dorkus

"A call to action:
Think seriously about “safety”;
then do something sensible about it"

The paper is 2 pages long. What could they possibly figure out how to do in 2 pages? This is about to be the most underwhelming thing I've ever read or it's going to blow my fucking mind.

Let's find out which, yeah?

in reply to Björkus "No time_t to Die" Dorkus

Anyways, the reason this paper was written is that The Gubberment™ noticed that C and C++ mysteriously keep having lots of vulnerabilities and issues, particularly ones leading to critical vulnerabilities that blow everything up. The NSA itself chimed in. GRANTED, that's probably because C and C++ are used quite a bit in lots of places, but amongst most software they've got some earth-shattering CVEs under their belt. (Not that CVEs matter: we all remember "Trojan Source", right, that resulted in VSCode shipping bullshit that highlights every "potentially harmful character" in your editor even if that's exactly what you fucking intended to type, with no smart analysis for anything?)

This paper notices that, and quotes the juiciest tidbit present:

the overarching software community across the private sector, academia, and the U.S. Government have begun initiatives to drive the culture of software development towards utilizing memory safe languages. [3] [4] [5]...

NSA advises organizations to consider making a strategic shift from programming languages that provide little or no inherent memory protection, such as C/C++, to a memory safe language when possible. Some examples of memory safe languages are C#, Go, Java, Ruby™, and Swift®.
in reply to Björkus "No time_t to Die" Dorkus

The paper immediately goes on to state:

That specifically and explicitly excludes C and C++ as unsafe. As is far too common, it lumps C and C++ into the single category C/C++, ignoring 30+ years of progress. Unfortunately, much C++ use is also stuck in the distant past, ignoring improvements, including ways of dramatically improving safety.


Annnd this is where the paper semi-contradicts itself. First, C++ has engaged in "30+ years of improvements", which means it has exceeded the boundaries of what is bad for C by a huge 30-year margin. And believe me, with all the ways I call C a crummy programming language (sometimes literally), I'm no stranger to how low the bar is.

But in that same paragraph, Professor Stroustrup then states that much C++ use is "also stuck in the distant past". Which... validates lumping the two together. "People aren't using new, modern C++" and "lots of C and C++ are basically identical in style and functionality" are one and the same, and so Stroustrup immediately gives up the game here by just straight up admitting that people aren't adopting Modern C++ fast enough for even C++'s own tastes, completely ignoring C. This makes the NSA's assessment of the ecosystem Spot On. We can argue about why we got here, but ultimately the conclusion the NSA is drawing is correct.

If the C++-in-use hasn't advanced much, it doesn't really matter if there is a theoretical C++ that's better. It's not being written, so in practice it doesn't matter. ¯\_(ツ)_/¯.

in reply to Björkus "No time_t to Die" Dorkus

There is, of course, more:

Now, I can’t say that I am surprised. After all, I have worked for decades to make it possible to write better, safer, and more efficient C++.


I mean, you made the language, so like yeah. If you didn't believe in it there'd literally be no reason for anyone else to, so that's a bit of a given.

In particular, the work on the C++ Core Guidelines specifically aims at delivering statically guaranteed type-safe and resource-safe C++ for people who need that without disrupting code bases that can manage without such strong guarantees or introducing additional tool chains. For example, the Microsoft Visual Studio analyzer and its memory-safety profile deliver much of the CG support today and any good static analyzer (e.g., Clang tidy, that has some CG support) could be made to completely deliver those guarantees at a fraction of the cost of a change to a variety of novel “safe” languages.


So first off: citation needed. One of the most advanced C++ shops on the planet -- Google -- already wrote about how introducing a crapton of new Rust code into one of their most critical systems -- Android mobile devices -- has had a net return on vulnerabilities of "0 per the ${lots of code} we've written". See:

[0] security.googleblog.com/2021/0…
[1] security.googleblog.com/2022/1…

This is the Multibillion, Multinational corporation that sunk an enormous amount of resources into C++, basically heralding some of the most advanced bug detection techniques at compile-time and run-time, and brave enough to actually ship UBSan/ASan in production on one of their most popular platforms (Android). They have the most advanced threat mitigation and threat detection teams on the planet (aside from the ones funded straight up by governments like Israel and the United States).

It's going to take a lot more than Stroustrup's word that the Core Guidelines -- which many vendors have stated are already too noisy, and many parts of which are unenforceable because they're too broad and written more like advice to a programmer/guide than a hard-and-fast checkable rule they can mechanically compute for -- will work. I've also been hearing how the Core Guidelines and these static checkers will eliminate all of C++'s (and, to other extents, C's) problems for a decade now. It's the promise I've been made over and over again: that if we just make this change, we'll have the Safe Enough Language We've Always Dreamed Of.

I personally am beginning to suspect there's a little snake in this oil, but that's a personal feeling not substantiated by much.

in reply to Björkus "No time_t to Die" Dorkus

The paper's next move is to link 3 proposals from 2022, 2021, and 2015 about how to use C++ for type and resource safety, and then the C++ Core Guidelines themselves.

Speed-summary.

2022


The 2022 paper lays out a bunch of alternatives and then goes all-in on the Core Guidelines as THE solution. "Just develop the checkers more, slap them on top of existing code, and keep using the compatibility story to push what C++ is into more places". It's built around this idea.

It mentions other ways of maintaining type safety, but it boils down to "use the RAII we have and combine it with lots of core guidelines to get what we want". I mean, RAII is wonderful, and that's fair, but you know. Again.

The existence proof of Google's work makes it a little harder to believe this'll do the trick completely.

2021


This one very explicitly tries to lay out some rules for protecting programmers. But a lot of the advice is just straight up unactionable for the way C++ is written today, meaning it's likely geared towards people writing new code today rather than refactoring lots of existing code. For example, Rule #2 from this paper is

Every object is accessed according to the type with which it was defined.

The language guarantees this except for preventable pointer misuses (see §4, §5), explicit casts, and type punning using unions. The CG have specific rules to enforce these language rules.

Static analysis can prevent unsafe casting and unsafe uses of unions. Type-safe alternatives to unions, such as std::variant, are available. Casting is essential only for converting untyped data (bytes) into typed objects.


This is a pretty sensible rule until you start doing things in more complicated code bases or interfacing with C, where the name of the game is committing crimes with pointers -- void* or otherwise -- to get things done. Type-punning is common, and std::variant only showed up in C++17 (much like std::string_view, which also showed up 10 years late by being introduced in C++17, alongside a hamstrung std::optional type -- see here for my usual rant on that).
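To make that rule concrete, here's a tiny sketch of my own (not from any of the cited papers) showing the union punning it forbids next to the std::variant alternative it points at:

```cpp
#include <cstdio>
#include <variant>

// Old-school tagged-union territory: nothing stops you from reading the
// wrong member, which is exactly the access-through-the-wrong-type the
// rule forbids. It compiles without a peep.
union Raw {
    int i;
    float f;
};

// The type-safe alternative the paper gestures at: std::variant (C++17)
// remembers which member is active and checks it for you.
using Checked = std::variant<int, float>;

int main() {
    Raw r;
    r.i = 42;
    std::printf("%f\n", r.f); // reads an int as a float: UB, no diagnostic

    Checked c = 42;
    if (auto *p = std::get_if<float>(&c)) {
        std::printf("%f\n", *p); // never runs; the variant knows it holds an int
    } else {
        std::printf("holds an int: %d\n", std::get<int>(c));
    }
}
```

Which is lovely! If your codebase is allowed to be on C++17.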

It also flies in the face of the whole "we take advantage of code that exists", because doing that means you need to interface with C++03, C++11, and C89 to C17 code. All of that is raw pointers and crimes, especially since until C++11 there were no move semantics, so if the whole point -- as these papers and the overarching paper point out -- is to keep compat alive and use old code, then you're going to have to keep working with a grafted-up monster.

2015


This is actually the best paper because it uses lots of examples to explain some serious problems with C and C++ code. It walks people through introducing RAII types for safety to prevent use-after-free, double-free, and other large classes of problems. It still shills for the Core Guidelines eventually, but at least it's trying to help. But, again, it offers this in the context of migrating old code to new styles and idioms, and at the same time the tension of "you can keep using your old code" keeps rising up. So either we're all using old code or we're rewriting it to use the new stuff, which means that for all the compatibility we have we're still rewriting large chunks of code to defend against bugs instead of just writing the code we want to write to make progress.
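For flavor, this is roughly the shape of the transformation the 2015 paper walks through -- my own reconstruction of the pattern, not code lifted from the paper:

```cpp
#include <memory>
#include <utility>

struct Widget { int value = 0; };

// Manual ownership: nothing stops a second delete or a use after delete.
void old_style() {
    Widget *w = new Widget{};
    delete w;
    // delete w;      // double-free: compiles fine, corrupts the heap at runtime
    // w->value = 1;  // use-after-free: also compiles without complaint
}

// RAII ownership: the unique_ptr frees the Widget exactly once, and handing
// it off leaves an empty pointer behind instead of a dangling one.
void raii_style() {
    auto w = std::make_unique<Widget>();
    w->value = 1;
    std::unique_ptr<Widget> owner = std::move(w);
    // `w` is now null; dereferencing it is still a bug, but the single
    // delete is handled by the destructor, so a double-free can't be written.
}

int main() {
    old_style();
    raii_style();
}
```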

This is a consistent theme for this paper and many C++ papers; "compatibility is important," they keep saying, while deeply implying that old code needs to be rewritten with new tools and that transition paths must happen. At some point, I need to stop fucking with Old and Busted But Occasionally Reliable and start working on the New Hotness, so we end up with a Lovecraftian monstrosity as a matter of course in most long-lived C++ codebases.

Damned if you do, damned if you don't I guess?

in reply to Björkus "No time_t to Die" Dorkus

(Sorry for keeping you in the tag on the last one, Mara, I am a certified fuckin' dumbass. ANYWHO.)

Jumping out of those links, we get back to the paper cited at the top of this thread. After the citations, it says:

There is not just one definition of “safety”, and we can achieve a variety of kinds of safety through a combination of programming styles, support libraries, and enforcement through static analysis.


Well, that's what we've been doing, but I have to admit: a lot of testing and fuzzing and so on just straight up doesn't happen. OpenSSL shipped a punycode decoder, and you know they didn't test it because a single example straight out of the punycode RFC bricked the code; C and C++ developers routinely and messily fail to deliver on either using static analysis or testing their code. The paper says that we can achieve that just by trying to get people to Test More and Test Harder, and I'm inclined to believe Professor Stroustrup here.
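(For the record, the mechanics of fuzzing something like that decoder are tiny. Here's a hedged sketch of a libFuzzer harness against a made-up decode_label function -- not OpenSSL's actual API, just a stand-in -- because the hard part has never been the code:)

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Toy stand-in for the decoder under test; a real harness would link
// against the actual library instead of this placeholder.
bool decode_label(const std::string &input, std::string &output) {
    output = input; // placeholder logic so this compiles standalone
    return true;
}

// libFuzzer calls this with mutated inputs; build with
//   clang++ -g -fsanitize=fuzzer,address harness.cpp
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size == 0) return 0;
    std::string input(reinterpret_cast<const char *>(data), size);
    std::string output;
    decode_label(input, output); // ASan flags any overflow/UAF the decoder hits
    return 0;
}
```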

But then, you have to get them to test and fuzz and check, and we haven't been able to get them to do that for the greater part of the past 10 years, so like.

Can we start talking about that point?

Either the code complexity is too high, or the time it takes to write and deploy is too high (shoutout to Charity Majors / mipsytipsy for constantly writing about how shitty orgs have to be -- and in general, how shitty the entire software development ecosystem is -- for everyone to fear Deploying On A Friday because their build, test, and run cycles are so fuckin' slow they never notice a problem until the weekend is well underway or on Monday).

So, if you're going to spend ONE MILLION DOLLARSYEEEARS building your C++ code, you might as well have checks built into this. This is what the sister paper to the above paper, P2759r0, admits:

We now support the idea that the changes for safety need to be not just in tooling, but visible in the language/compiler, and library.


This concession is solely because of languages like Rust, which proved that you could stick a borrow checker in the compiler and get people to write "Systems Level", "Close to the Metal" (or w/e the fuck people call it nowadays) code without a runtime, a garbage collector, or the loss of feet and legs from foot-shotguns and thigh cannons.

By building checks into the compiler, you spend one exceedingly painful amount of time rather than dispersing it over 6+ different tools that have to re-parse the code and build (sometimes almost identical) structures to provide diagnostics and check things. But C++ as a language also makes that exceedingly difficult to do (thanks, runtime-based, non-destructive Move and other purely C++ constructs that have nothing to do with C).
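A toy illustration of that non-destructive-move problem (mine, not the paper's): the moved-from object is still a perfectly valid C++ object, so the compiler has nothing it's allowed to reject, and a tool can only guess.

```cpp
#include <iostream>
#include <string>
#include <utility>
#include <vector>

int main() {
    std::string name = "p2739r0";
    std::vector<std::string> papers;

    papers.push_back(std::move(name)); // ownership "moves", but...

    // ...`name` is still alive, just in a valid-but-unspecified (usually empty)
    // state. This compiles cleanly; a borrow checker would reject it outright,
    // while clang-tidy's bugprone-use-after-move has to re-parse everything and
    // flag it heuristically.
    std::cout << "moved-from size: " << name.size() << "\n";
}
```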

So, another place where Professor Stroustrup is wishing for something fundamentally unattainable, not necessarily because it's impossible but because we've been trying for a decade or two now and it's not quite panning out for a lot of us.

in reply to Björkus "No time_t to Die" Dorkus

This is a tremendous thread; thank you for writing it.

Your punycode/openssl example reminded me of the little things, like how freakin' hard it is to write tests in C/C++ for little "purely algorithmic" stuff like that. Pick a testing framework. Make it so the code under test doesn't need the rest of your code, because otherwise it won't link. Add the test runner to your CI. That's pretty far from "#[test]" and "cargo test".


in reply to Federico Mena Quintero

And while reading the thread I was wondering about the cost of porting to Rust, vs. the cost of updating to modern C++ practices. Happy to read the last posts where you addressed this.

(I think we should make a lot more noise on porting things gradually. It is possible, it works, it WORKS FREAKING' GREAT. I think too many people want full rewrites.)

in reply to Federico Mena Quintero

@federicomena That indeed reminds me of the time I ported a C++ class to Rust just so I could test it with Cargo.