Fuzzing (fuzz testing) 101: Lessons from cyber security expert Dr. David Brumley
Summary
TL;DR: Dr. David Brumley, a professor at Carnegie Mellon University and CEO of ForAllSecure, explains fuzzing, a technique used to improve software security and development. Fuzzing involves feeding random inputs to a program to uncover bugs and vulnerabilities, much like monkeys randomly typing on keyboards. The technique has evolved from first-generation black box fuzzers to the current third generation, instrumentation-guided fuzzing, which learns from each execution to decide what to try next. Brumley highlights the benefits of fuzzing for both security and reliability in software development.
Takeaways
- 📚 Fuzzing is a decades-old process that is not widely known outside of cybersecurity circles but is crucial for improving security processes and software development.
- 🏆 Dr. David Brumley, a professor at Carnegie Mellon University and CEO of ForAllSecure, is a pioneer in fuzzing technology, having built the winning entry for the DARPA Cyber Grand Challenge.
- 🔍 Fuzzing involves providing random input to applications to discover vulnerabilities, much like monkeys typing on a keyboard, but with the aim of uncovering security issues rather than creating literary works.
- 🤖 The analogy of a program being a maze with a robot navigating it helps explain how fuzzing works, with inputs as directions that the robot follows through the program's logic.
- 🐛 Fuzzing automates the process of input generation and execution to find bugs, which is more efficient than manual unit testing by developers.
- 🚀 Fuzzing has evolved to its third generation, moving from random input generation to more sophisticated techniques that learn from execution paths to find new vulnerabilities.
- 🔬 Static analysis and fuzzing are contrasted, with static analysis examining code for patterns without execution, while fuzzing actively runs the program with generated inputs to find issues.
- 🛡️ Google's use of fuzzing has led to the discovery of 25,000 bugs in Google Chrome and open-source libraries over three years, demonstrating the power of automated testing for security.
- 🛠️ Beyond security, fuzzing benefits developers by providing test cases that can improve the reliability of software and speed up the development lifecycle.
- 🔑 There are different types of fuzzers, including black box, grammar-based, and instrumentation-guided fuzzers, each with its approach to generating inputs and finding bugs.
- 🌐 Companies can start using fuzzing by adopting these techniques, which are favored by major development shops like Google and Microsoft for their effectiveness in finding vulnerabilities.
Q & A
What is fuzzing and how long has it been around?
-Fuzzing is a technique for discovering security issues in software by feeding a program random inputs to see whether they cause a crash or expose a vulnerability. The term was coined about 25 years ago by Professor Bart Miller.
What did Professor Bart Miller and his graduate students discover when they gave random inputs to Unix, Microsoft, and Apple applications?
-They discovered that about a third of these applications would crash when given random inputs, revealing serious security issues.
Can you explain the analogy used by Dr. David Brumley to describe how fuzzing works?
-Dr. Brumley used the analogy of a program being like a maze and the input being directions for a robot navigating through it. Fuzzing automates the process of giving the robot different paths to explore to find bugs or crashes in the program.
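The maze analogy can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the video: the grid layout, the `RobotCrash` exception, and the `run_robot` and `black_box_fuzz` names are all hypothetical. It shows a single unit test checking one path, then a random, first-generation style fuzzer trying many paths until one reaches the bug cell.

```python
import random

# Hypothetical maze: S = start, # = wall, X = bug (crash), E = exit.
MAZE = [
    "S..#",
    "#..#",
    "#X..",
    "###E",
]
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


class RobotCrash(Exception):
    """Raised when the robot steps on the bug cell."""


def run_robot(directions):
    """The 'program': follow the directions through the maze."""
    row, col = 0, 0
    for step in directions:
        d_row, d_col = MOVES.get(step, (0, 0))  # unknown characters are ignored
        new_row, new_col = row + d_row, col + d_col
        in_bounds = 0 <= new_row < len(MAZE) and 0 <= new_col < len(MAZE[0])
        if not in_bounds or MAZE[new_row][new_col] == "#":
            continue  # blocked: the robot stays where it is
        row, col = new_row, new_col
        if MAZE[row][col] == "X":
            raise RobotCrash(f"crashed at {(row, col)}")
        if MAZE[row][col] == "E":
            return "exit"
    return "stopped"


def black_box_fuzz(target, trials=10_000, max_len=8, seed=0):
    """Throw random direction strings at the target; return the first crasher."""
    rng = random.Random(seed)
    for _ in range(trials):
        length = rng.randint(1, max_len)
        candidate = "".join(rng.choice("UDLR") for _ in range(length))
        try:
            target(candidate)
        except RobotCrash:
            return candidate
    return None


# A unit test checks exactly one path through the maze.
assert run_robot("RDRDRD") == "exit"

# The fuzzer explores many paths and reports one that crashes.
print("crashing input:", black_box_fuzz(run_robot))
```

The unit test passes because its one path never touches the `X` cell; the fuzzer reports a different input that does.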
What is the difference between fuzzing and static analysis in terms of software testing?
-Static analysis involves examining the program's code without running it, looking for patterns that might indicate problems. Fuzzing, on the other hand, involves actually running the program with various inputs to see if it behaves unexpectedly or crashes.
How does fuzzing benefit the software development process beyond security?
-Fuzzing can help improve the reliability of software by executing various paths and uncovering potential bugs. It can also speed up the software development life cycle by generating test cases automatically, reducing the need for manual testing.
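One way to picture fuzz-generated inputs doubling as regression tests is the sketch below. Everything in it is an assumption made for illustration: the toy `first_word_buggy` and `first_word_fixed` functions, the tiny input alphabet, and the `collect_crashes` and `replay` helpers are hypothetical and do not come from the video or from any particular fuzzing tool.

```python
import random


def first_word_buggy(text):
    """Toy target: raises IndexError on empty or whitespace-only input."""
    return text.split()[0]


def first_word_fixed(text):
    """The same function after the bug has been fixed."""
    words = text.split()
    return words[0] if words else ""


def collect_crashes(target, trials=2000, seed=0):
    """Fuzz the target with short random strings and keep every crashing input."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(trials):
        length = rng.randint(0, 4)
        candidate = "".join(rng.choice("ab \t") for _ in range(length))
        try:
            target(candidate)
        except Exception:
            if candidate not in crashes:
                crashes.append(candidate)
    return crashes


def replay(target, saved_inputs):
    """Regression check: return the saved inputs that still crash the target."""
    still_failing = []
    for candidate in saved_inputs:
        try:
            target(candidate)
        except Exception:
            still_failing.append(candidate)
    return still_failing


saved = collect_crashes(first_word_buggy)
print("inputs that crashed the buggy version:", len(saved))
print("still failing after the fix:", replay(first_word_fixed, saved))
```

The saved inputs act as test cases nobody had to write by hand: replaying them after every change shows whether an old crash has come back.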
What are the three generations of fuzzing techniques mentioned in the script?
-The first generation is black box fuzzing, which generates random inputs. The second generation is grammar-based fuzzing, which uses templates to generate structured inputs. The third generation is instrumentation-guided fuzzing, which learns from the program's execution to generate new inputs.
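A grammar-based (second-generation) generator can be sketched as follows. The template is an assumption loosely modeled on the example in the transcript (always go down, then down or right, then left or right, then down or up), and the names `TEMPLATE`, `generate_from_template`, and `all_template_inputs` are hypothetical.

```python
import itertools
import random

# Hypothetical hand-written template: each position lists the allowed directions.
TEMPLATE = [
    ("D",),       # always go down first
    ("D", "R"),   # then down or right
    ("L", "R"),   # then left or right
    ("D", "U"),   # then down again or up again
]


def generate_from_template(template, rng):
    """Pick one allowed direction per position to build a structured input."""
    return "".join(rng.choice(options) for options in template)


def all_template_inputs(template):
    """Enumerate every input the template can ever produce."""
    return {"".join(combo) for combo in itertools.product(*template)}


rng = random.Random(0)
print([generate_from_template(TEMPLATE, rng) for _ in range(5)])
print("distinct inputs the template allows:", len(all_template_inputs(TEMPLATE)))
```

Every generated input begins with `D`, so a bug that is only reachable by starting with `R` would never be exercised. That is the trade-off of a template: more directed exploration, but only within what its author thought to allow.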
How does instrumentation-guided fuzzing differ from the earlier generations of fuzzing?
-Instrumentation-guided fuzzing generates inputs and observes the program's execution path, learning from it to inform the generation of the next input. This approach combines the benefits of structured exploration with the ability to discover new paths, unlike earlier methods.
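The feedback loop behind instrumentation-guided fuzzing can be sketched roughly as below. This is a simplified illustration, not how any specific fuzzer is implemented: the maze, the `run_instrumented` and `coverage_guided_fuzz` names, and the append-one-direction mutation are all assumptions. Here the toy target reports which cells it visited; production fuzzers typically collect comparable coverage feedback through instrumentation added to the program under test.

```python
import random

# Hypothetical maze: the bug (X) sits at the end of a 12-step corridor,
# which purely random inputs are very unlikely to walk by chance.
MAZE = [
    "S.....",
    "#####.",
    "X.....",
]
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def run_instrumented(directions):
    """Walk the maze and report coverage: returns (visited_cells, crashed)."""
    row, col = 0, 0
    visited = {(row, col)}
    for step in directions:
        d_row, d_col = MOVES[step]
        new_row, new_col = row + d_row, col + d_col
        in_bounds = 0 <= new_row < len(MAZE) and 0 <= new_col < len(MAZE[0])
        if not in_bounds or MAZE[new_row][new_col] == "#":
            continue  # blocked: the robot stays where it is
        row, col = new_row, new_col
        visited.add((row, col))
        if MAZE[row][col] == "X":
            return visited, True
    return visited, False


def coverage_guided_fuzz(max_iterations=20_000, seed=0):
    """Keep inputs that reach new cells and mutate them to go further."""
    rng = random.Random(seed)
    corpus = [""]      # inputs worth mutating again
    seen = {(0, 0)}    # every cell reached so far
    for iteration in range(1, max_iterations + 1):
        # Simplified mutation: extend a saved input by one random direction.
        candidate = rng.choice(corpus) + rng.choice("UDLR")
        visited, crashed = run_instrumented(candidate)
        if crashed:
            return candidate, iteration
        if not visited <= seen:  # reached somewhere new: remember this input
            seen |= visited
            corpus.append(candidate)
    return None, max_iterations


crash_input, iterations = coverage_guided_fuzz()
print("crashing input:", crash_input, "after", iterations, "iterations")
```

Because only inputs that reach a new cell are kept and extended, the search advances along the corridor one step at a time instead of having to guess all twelve directions at once.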
What is the significance of the DARPA Cyber Grand Challenge mentioned in the script?
-The DARPA Cyber Grand Challenge is a competition that Dr. David Brumley's team won using their fuzzing technology. This achievement highlights the effectiveness and advancement of fuzzing techniques in the field of cybersecurity.
How has Google utilized fuzzing in their projects?
-Google has a project where they use fuzzing to automatically find bugs in Google Chrome and many open-source libraries they use. Over the last three years, they have found 25,000 bugs with zero false positives using this method.
What advice does Dr. David Brumley give for companies looking to start using fuzzing?
-Dr. Brumley suggests that companies should consider using the third generation of fuzzing techniques, specifically instrumentation-guided fuzzing, as it offers a balance between structured exploration and the ability to uncover new vulnerabilities.
Outlines
🔍 Introduction to Fuzzing
This paragraph introduces the concept of fuzzing, a method used to discover security vulnerabilities in software through the input of random data. Dr. David Brumley, a professor at Carnegie Mellon University and CEO of ForAllSecure, is highlighted for building the fuzzing technology that won the DARPA Cyber Grand Challenge. The speaker invites Dr. Brumley to explain fuzzing, its origins, and its applications in improving security processes and software development cycles. The analogy of a maze with a robot navigating through it based on input directions is used to illustrate how fuzzing works, emphasizing the automated generation of inputs to uncover bugs that traditional unit testing might miss.
🛠️ Benefits and Evolution of Fuzzing Techniques
The second paragraph delves into the benefits of fuzzing for developers beyond security, such as improving the reliability of software through extensive testing without the need for developers to manually write test cases. It outlines the evolution of fuzzing techniques from the first generation of black box fuzzers that randomly generated inputs, to the second generation of grammar-based fuzzers that used templates for input generation, and finally to the third generation of instrumentation-guided fuzzing. This last generation learns from the execution paths taken, optimizing the discovery of new vulnerabilities. The speaker also mentions the adoption of these techniques by major companies like Google and Microsoft and invites viewers to explore more about fuzzing on TechRepublic.
Keywords
💡Fuzzing
💡Cyber Security
💡DARPA Cyber Grand Challenge
💡Dr. David Brumley
💡Software Development Cycle
💡Static Analysis
💡Unit Test
💡Black Box Fuzzers
💡Grammar-Based Fuzzers
💡Instrumentation Guided Fuzzing
💡AI Fuzzing
Highlights
Fuzzing is a decades-old process that is not well known outside of cybersecurity circles.
Dr. David Brumley, a professor at Carnegie Mellon University and CEO of ForAllSecure, built fuzzing technology that won the DARPA Cyber Grand Challenge.
Fuzzing involves giving random input to applications to discover security issues.
An analogy of a program as a maze with inputs as directions for a robot to navigate through it.
Unit tests check one path through the program, while fuzzing explores multiple paths to find bugs.
Fuzzing automates the process of input generation and program execution to identify crashes.
Fuzzing has evolved to the third generation of techniques, moving beyond random input generation.
Instrumentation guided fuzzing learns from execution paths to generate new inputs.
Fuzzing can be completely automated, running thousands of iterations in a second.
Static analysis looks for patterns in code without running the program, unlike fuzzing.
Google has found 25,000 bugs automatically with fuzzing in Google Chrome and open source libraries over three years.
Fuzzing improves software reliability and can speed up the software development life cycle.
Developers can use fuzzing to generate test cases and perform regression tests.
Black box fuzzers are the first generation, generating random inputs and checking for crashes.
Grammar-based fuzzers use templates to generate structured inputs for more directed exploration.
Instrumentation guided fuzzing combines the benefits of structured and random input generation.
Modern development shops like Google and Microsoft use instrumentation guided fuzzing for its effectiveness.
Transcripts
the process of fuzzing
is decades old but isn't well known
outside of
cyber security circles that needs to
change
luckily i'm here with someone that can
help us do that
dr david brumley david is a professor
at carnegie mellon university and ceo of
forallsecure and he's also someone that
built the fuzzing technology
that won the darpa cyber grand challenge
and he's going to explain what fuzzing
is
and explain how companies can use it to
help improve both their security
processes
and software development cycles so david
thanks for joining me and let's jump
right to it
what is fuzzing well
as you said fuzzing was uh named about
25 years ago the story is professor bart
miller
and his graduate students were looking
at the reliability of
unix microsoft and apple applications
and they noticed something kind of funny
when they gave these applications a
random input
they could cause about a third of them
to crash pretty big number right
it was really like the proverbial like
monkeys typing on a keyboard right
but instead of creating shakespeare they
found serious security issues
that's worse right it's much worse so
let me explain how fuzzing works and i'm
going to use
an analogy here so think of a program
like
a maze right and so we know when a
programmer is developing code
they have different computations
depending upon what the user gives them
so here the program is a maze and then
we have let's just pretend a little
robot up here and an input to the
program is going to be directions for
our robot
through the maze so for example we can
give the robot the directions
i'm going to write it up here down
left down right and he's going to take
two rights just meaning he's going to go
to the right twice
and then he's going to go down a bunch
of times
so you can think about giving our little
robot this input
and robot is going to take that as
directions and he's going to
take this path through the program he's
going to go down left down
first right second right then a bunch of
downs
and when you look at this we had a
little bug here they can verify that
this is actually
okay there's no actual bug here and this
is what's happening when a developer
writes a unit test
so what they're doing is they're coming
up with an input and they're making sure
that it gets the right output
now the problem is if you think about
this maze we've only checked one path
through this maze and there's other
potential lurking
bugs out there so what fuzzing does is
it really automates this idea of
coming up with an input and running the
program and seeing
if we find a bug so for example
if we think about just switching these
directions a little bit we have down
left down but instead of
taking two rights we only take one right
and then go down
and some more directions
the robot may take this particular path
through the program down right instead
of going two
it's only going to go down one say it
comes over here and we find that the
program
crashes now what bart originally found
of course was
providing random input so it wasn't just
structured like this random inputs could
actually cause
applications to crash pretty often now
we're on our third generation of fuzzing
techniques it's no longer
monkeys typing on a keyboard there's a
lot more tech behind it
where the idea though is still the same
we're going to automatically generate an
input we're going to see if the program
crashes or not
and here's the cool thing it can be
completely automated
by making a computer do this as opposed
to the developer writing the unit test
you can go through thousands of these
iterations in a single second
let me contrast this with static
analysis because i know a lot of people
think about static analysis and fuzzing
wonder what the difference is between
them so when you think about static
analysis
what static analysis is doing is it's
looking at the program it never actually
runs it
and it's saying well there may be a
problem here maybe a problem here
maybe it knows already this is okay
maybe there's a problem
it thinks here and so on and so forth
but it's never actually
proved there's a problem so it's looking
for patterns
in the code looking just for patterns
and so if you actually look at this maze
right
you can say well static analysis flagged
this but there's no way our little robot
can get over there it's blocked
and when you think about static analysis
it can potentially find more bugs
but you have to staff someone manually
reviewing it what fuzzing is doing is
incrementally exploring the program to
come up with these
to find lots and lots of problems for
example
google has a project where they're
tracking google chrome and many of the
open source libraries google
uses and they found 25,000 bugs
completely automatically
with zero false positives over the last
three years
i also want to throw security aside and
say how can this benefit the developer
because security is not always a cost it
can actually benefit
we all know that the better we test a
program the more reliable it's going to
be in the field
and we also know developers don't
particularly like writing test cases
and so by using fuzzing to come up with
different inputs that execute
all these paths they're really just test
cases and you can do that to do
regression tests over time
so one of the benefits beyond security
of fuzzing
is you can use it to speed up your
software development life cycle to
produce
more trustworthy and better quality code
and so how can companies get started
using
fuzzing as a technique and what are some
of the actual
fuzzers that are out there let's talk
about that so
i started off by saying this was
invented or coined 25 years ago by
professor bart miller and we're really
on our third generation
so the original set of fuzzers were what
we call black box fuzzers
and they would generate an input maybe
at random or
with some algorithm and they just run
the program and see if it crashed just
over and over and over and over and over
again
now the problem with that is if you're
just generating a random input it may
not take
the robot anywhere for example you don't
want to generate an input that has the
robot going down
and back up and back down and so on and
so forth
so that was the first generation these
techniques actually still work today
randomly generating but not as well
the second generation are what we call
protocol or grammar based
uh grammar-based fuzzers and what they
do is
you have someone manually generate a
template for how to create those inputs
so in our example here someone may write
a template that says always you know go
down
and then go either down or right go
either
left or right next go after that maybe
down again or up again
and so on and so forth and if you think
about what this is doing
is it's constraining the set of things
you're going to explore
so for example if you write this
protocol or grammar out
it may end up inadvertently only
checking part of the program because
you haven't actually said it's possible
to go over this far
so that's the second generation great
products out there today
the third generation is what we call
instrumentation guided fuzzing
and what instrumentation guided fuzzing
does is it generates an input
and it watches as the robot's executing
the path
and it learns from that to come up with
the next input
and so sometimes this is branded as ai
fuzzing i don't think of it as ai but it
is learning the more it executes it's
learning about which paths it's
already looked at and what are the new
places out there so it's a little bit of
the best of both worlds right you have a
constrained process
but you're not missing half of the
potential vulnerabilities
i think so and i think if you go look at
modern development shops the people like
google and microsoft who put
tons of money into this they've settled
on instrumentation guided fuzzing for a
reason
well david thank you for explaining uh
what fuzzing is
and if you'd like to learn more about
fuzzing and fuzzers
check out techrepublic now if you've got
a topic you'd like to see david and i
cover
in a future video or a question leave a
comment
and if you like this video be sure to
click the like button or
subscribe