daniel:// stenberg://

1 year ago • •

The #curl graph we always get to debate over. Number of *C mistakes* vs *non-C mistakes* among the existing 145 reported vulnerabilities. Updated with the latest 4 reports, and the LOC graph added as a comparison.

#curl

Lambda

in reply to daniel:// stenberg:// • 1 year ago • •

how do you distinguish "C mistakes" from "non-C mistakes"? Just memory safety, or also considering better abstractions that would've been used in anything other than a line-by-line translation?

daniel:// stenberg://

in reply to Lambda • 1 year ago • •

I qualify "C mistakes" to be one of: OVERFLOW, OVERREAD, DOUBLE_FREE, USE_AFTER_FREE, NULL_MISTAKE or UNINIT.

They have all been manually assessed by me, so there's of course a risk of mistakes in there.

My thinking has been to identify problems that *likely* would not have happened if we had not used C.

This entry was edited (1 year ago)
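
The classification described above can be sketched in a few lines of Python. This is only an illustration, not the script behind the graph: the six category names are the ones quoted in the post, while the function name `c_mistake_share`, the `LOGIC` label, and the sample list are hypothetical and made up for the example.

```python
# Category labels that the post above counts as "C mistakes".
C_MISTAKES = {
    "OVERFLOW",
    "OVERREAD",
    "DOUBLE_FREE",
    "USE_AFTER_FREE",
    "NULL_MISTAKE",
    "UNINIT",
}


def c_mistake_share(labels):
    """Return (c_count, total, share) for an iterable of category labels."""
    labels = list(labels)
    if not labels:
        # Avoid dividing by zero when there are no reports at all.
        return 0, 0, 0.0
    c_count = sum(1 for label in labels if label in C_MISTAKES)
    return c_count, len(labels), c_count / len(labels)


# Made-up sample, NOT real curl data. "LOGIC" is a hypothetical non-C label.
sample = ["OVERFLOW", "LOGIC", "USE_AFTER_FREE", "LOGIC", "LOGIC"]
print(c_mistake_share(sample))
```

Any label outside the six listed names counts as a non-C mistake here, which mirrors the two-way split in the graph but says nothing about how each report was actually assessed.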

Troed Sångberg

in reply to daniel:// stenberg:// • 1 year ago • •

Why did I not know of this graph before? It's so going into basic cybersecurity-for-developers material.

Rob Napier

in reply to daniel:// stenberg:// • 1 year ago • •

The sharp uptick in 2014 is interesting. Was there a particular effort that year that exposed so many (valgrind?), or just the randomness of data? I didn't see you discuss the history in your blog post.

(These reports are a real asset. Thank you for so much transparency and helpful data.)

daniel:// stenberg://

in reply to Rob Napier • 1 year ago • •

@cocoaphony the graph shows when the flaws were introduced, not when they were found. I really don't know why so many were introduced in that particular period. We have adjusted and improved internals since then, which may have helped.

daniel:// stenberg://

Unknown parent • 1 year ago • •

@ironiridis I have the numbers separated. This graph shows product code only, no test code.

daniel:// stenberg://

Unknown parent • 1 year ago • •

@ironiridis the number of test cases over time follows the LOC pretty closely. I have a graph for that as well 😄

brk, a.k.a. @evanrichter

in reply to daniel:// stenberg:// • 1 year ago • •

should be "Non-C mistakes Given a C codebase"

daniel:// stenberg://

in reply to brk, a.k.a. @evanrichter • 1 year ago • •

@brk it would be tricky to compare C vs non-C mistakes in something that is *not* C code...

Alexander Shendi

in reply to daniel:// stenberg:// • 1 year ago • •

It would be interesting to know what distinguishes 'C' from 'non C' mistakes.

My summary of the graph:
* 'C mistakes' account for roughly 2/3 of all mistakes.
* Both 'C' and 'non C' mistakes are strongly correlated with LOC.

daniel:// stenberg://

in reply to Alexander Shendi • 1 year ago • •

@alexshendi the C mistakes are at ~41% of the total

Alexander Shendi

in reply to daniel:// stenberg:// • 1 year ago • •

True, my mistake. Now, what counts as 'C' mistake? Links or references to papers ok.

TIA.

daniel:// stenberg://

in reply to Alexander Shendi • 1 year ago • •

@alexshendi Two years ago we were still at ~50% C mistakes and then I blogged this: daniel.haxx.se/blog/2021/03/09…

half of curl’s vulnerabilities are C mistakes | daniel.haxx.se

Alexander Shendi

in reply to daniel:// stenberg:// • 1 year ago • •

Thank you, very interesting.
The chart at:
daniel.haxx.se/blog/wp-content…

was what I wanted to know. I hope I haven't offended you, I just was looking for an opportunity to learn something.

Arne Babenhauserheide

in reply to daniel:// stenberg:// • 1 year ago • •

In this graph it looks much more like the C vulnerabilities are stagnating than it sounds like in the article 2 years ago.

Thank you for sharing!

daniel:// stenberg://

in reply to Arne Babenhauserheide • 1 year ago • •

@ArneBab clearly the C mistake share has decreased significantly over the last two years. I suppose we will learn if this was just a fluke or something real as we go forward!

Arne Babenhauserheide

in reply to daniel:// stenberg:// • 1 year ago • •

Even the mistakes per line of code; I'm very much looking forward to more blog posts in the coming years (data collection takes time …).
