Fuzzing (fuzz testing) 101: Lessons from cyber security expert Dr. David Brumley

TechRepublic
19 Oct 2020 08:18

Summary

TLDR: Dr. David Brumley, a professor at Carnegie Mellon University and CEO of ForAllSecure, explains fuzzing, a technique used to improve software security and development. Fuzzing feeds automatically generated inputs to a program to uncover bugs and vulnerabilities, much like monkeys randomly typing on keyboards. The technique has evolved from black box fuzzers, through grammar-based fuzzers, to the current third generation, instrumentation-guided fuzzing, which learns from each execution which paths it has already explored. Brumley highlights the benefits of fuzzing for both security and reliability in software development.

Takeaways

  • 📚 Fuzzing is a decades-old process that is not widely known outside of cybersecurity circles but is crucial for improving security processes and software development.
  • 🏆 Dr. David Brumley, a professor at Carnegie Mellon University and CEO of ForAllSecure, is a pioneer in fuzzing technology, having built the fuzzing technology that won the DARPA Cyber Grand Challenge.
  • 🔍 Fuzzing involves providing random input to applications to discover vulnerabilities, much like monkeys typing on a keyboard, but with the aim of uncovering security issues rather than creating literary works.
  • 🤖 The analogy of a program being a maze with a robot navigating it helps explain how fuzzing works, with inputs as directions that the robot follows through the program's logic.
  • 🐛 Fuzzing automates the process of input generation and execution to find bugs, which is more efficient than manual unit testing by developers.
  • 🚀 Fuzzing has evolved to its third generation, moving from random input generation to more sophisticated techniques that learn from execution paths to find new vulnerabilities.
  • 🔬 Static analysis and fuzzing are contrasted, with static analysis examining code for patterns without execution, while fuzzing actively runs the program with generated inputs to find issues.
  • 🛡️ Google's use of fuzzing has led to the discovery of 25,000 bugs in Google Chrome and open-source libraries over three years, demonstrating the power of automated testing for security.
  • 🛠️ Beyond security, fuzzing benefits developers by providing test cases that can improve the reliability of software and speed up the development lifecycle.
  • 🔑 There are different types of fuzzers, including black box, grammar-based, and instrumentation-guided fuzzers, each with its approach to generating inputs and finding bugs.
  • 🌐 Companies can start using fuzzing by adopting these techniques, which are favored by major development shops like Google and Microsoft for their effectiveness in finding vulnerabilities.

Q & A

  • What is fuzzing and how long has it been around?

    -Fuzzing is a technique for discovering security issues in software by feeding a program random inputs to see whether they cause a crash or expose a vulnerability. The term was coined about 25 years ago by Professor Bart Miller.

  • What did Professor Bart Miller and his graduate students discover when they gave random inputs to Unix, Microsoft, and Apple applications?

    -They discovered that about a third of these applications would crash when given random inputs, revealing serious security issues.

  • Can you explain the analogy used by Dr. David Brumley to describe how fuzzing works?

    -Dr. Brumley used the analogy of a program being like a maze and the input being directions for a robot navigating through it. Fuzzing automates the process of giving the robot different paths to explore to find bugs or crashes in the program.

  • What is the difference between fuzzing and static analysis in terms of software testing?

    -Static analysis involves examining the program's code without running it, looking for patterns that might indicate problems. Fuzzing, on the other hand, involves actually running the program with various inputs to see if it behaves unexpectedly or crashes.

  • How does fuzzing benefit the software development process beyond security?

    -Fuzzing can help improve the reliability of software by executing various paths and uncovering potential bugs. It can also speed up the software development life cycle by generating test cases automatically, reducing the need for manual testing.

  • What are the three generations of fuzzing techniques mentioned in the script?

    -The first generation is black box fuzzing, which generates random inputs. The second generation is grammar-based fuzzing, which uses templates to generate structured inputs. The third generation is instrumentation-guided fuzzing, which learns from the program's execution to generate new inputs.

  • How does instrumentation-guided fuzzing differ from the earlier generations of fuzzing?

    -Instrumentation-guided fuzzing generates inputs and observes the program's execution path, learning from it to inform the generation of the next input. This approach combines the benefits of structured exploration with the ability to discover new paths, unlike earlier methods.

  • What is the significance of the DARPA Cyber Grand Challenge mentioned in the script?

    -The DARPA Cyber Grand Challenge is a competition that Dr. David Brumley's team won using their fuzzing technology. This achievement highlights the effectiveness and advancement of fuzzing techniques in the field of cybersecurity.

  • How has Google utilized fuzzing in their projects?

    -Google has a project where they use fuzzing to automatically find bugs in Google Chrome and many open-source libraries they use. Over the last three years, they have found 25,000 bugs with zero false positives using this method.

  • What advice does Dr. David Brumley give for companies looking to start using fuzzing?

    -Dr. Brumley points to the third generation of fuzzing techniques, instrumentation-guided fuzzing, which balances a constrained search with the ability to uncover new vulnerabilities. He notes that major development shops such as Google and Microsoft have settled on it.

Outlines

00:00

🔍 Introduction to Fuzzing

This section introduces fuzzing, a method for discovering security vulnerabilities in software by feeding it random data. Dr. David Brumley, a professor at Carnegie Mellon University and CEO of ForAllSecure, is introduced as the builder of the fuzzing technology that won the DARPA Cyber Grand Challenge. The host invites Dr. Brumley to explain fuzzing, its origins, and how it can improve security processes and software development cycles. He illustrates the idea with a maze: the program is the maze, and an input is a set of directions for a robot walking through it. A unit test checks one path, while fuzzing automates input generation to uncover bugs on the paths that unit testing misses. The section closes by contrasting fuzzing with static analysis.

05:00

🛠️ Benefits and Evolution of Fuzzing Techniques

The second paragraph delves into the benefits of fuzzing for developers beyond security, such as improving the reliability of software through extensive testing without the need for developers to manually write test cases. It outlines the evolution of fuzzing techniques from the first generation of black box fuzzers that randomly generated inputs, to the second generation of grammar-based fuzzers that used templates for input generation, and finally to the third generation of instrumentation-guided fuzzing. This last generation learns from the execution paths taken, optimizing the discovery of new vulnerabilities. The speaker also mentions the adoption of these techniques by major companies like Google and Microsoft and invites viewers to explore more about fuzzing on TechRepublic.

Keywords

💡Fuzzing

Fuzzing is a software testing technique that involves providing random, invalid, or unexpected data as input to a program to observe how it handles the input. It is used to discover coding errors, security loopholes, or unexpected behavior within the program. In the video, fuzzing is presented as a crucial tool for improving security processes and software development cycles, with the analogy of a robot navigating a maze to illustrate how it explores different paths within a program to uncover bugs.

💡Cyber Security

Cyber security refers to the practice of protecting systems, networks, and programs from digital attacks. It is a broad field that includes various strategies and technologies to safeguard sensitive information from theft, damage, or unauthorized access. In the context of the video, fuzzing is highlighted as an important technique within cyber security to enhance the security of software applications by identifying vulnerabilities.

💡DARPA Cyber Grand Challenge

The DARPA Cyber Grand Challenge was an event organized by the Defense Advanced Research Projects Agency (DARPA) to promote the development of automated cyber defense systems. The competition involved creating systems that could independently detect and patch security vulnerabilities. Dr. David Brumley, mentioned in the video, built fuzzing technology that won this challenge, showcasing the effectiveness of fuzzing in the field of cyber security.

💡Dr. David Brumley

Dr. David Brumley is a professor at Carnegie Mellon University and the CEO of ForAllSecure. He is a key figure in the field of cyber security, particularly in the development of fuzzing technology. In the video, he is introduced as an expert who can explain the concept of fuzzing and its applications in improving security and software development.

💡Software Development Cycle

The software development cycle refers to the process of creating, testing, and maintaining software applications. It typically includes stages such as planning, design, coding, testing, and deployment. The video discusses how fuzzing can be integrated into this cycle to improve the reliability and quality of software by identifying bugs and vulnerabilities early on.
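
One way to picture fuzzing inside the development cycle is to keep the inputs a fuzzer has found and replay them as regression tests. The sketch below is only an illustration: `parse_age`, the corpus contents, and `replay` are hypothetical names that do not come from the video.

```python
def parse_age(text: str) -> int:
    """Hypothetical function under test, written to reject bad input cleanly."""
    text = text.strip()
    if not text.isdigit():
        raise ValueError("age must be a non-negative integer")
    return int(text)

# Inputs a fuzzer might have saved on earlier runs (hypothetical corpus).
REGRESSION_CORPUS = ["", "   ", "-1", "12abc", "\x00", "42"]

def replay(corpus):
    """Run every saved input again and report anything other than a clean rejection."""
    unexpected = []
    for text in corpus:
        try:
            parse_age(text)
        except ValueError:
            pass  # rejecting bad input with ValueError is the expected behavior
        except Exception as exc:
            unexpected.append((text, exc))
    return unexpected

failures = replay(REGRESSION_CORPUS)
```

An empty `failures` list means none of the previously found inputs misbehaves in the current build, which is the regression-test use the video describes.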

💡Static Analysis

Static analysis is a method of examining software code without executing the program. It involves looking for patterns and potential issues within the code to identify bugs or security flaws. In the video, static analysis is contrasted with fuzzing, where fuzzing involves running the program with various inputs to find problems that static analysis might only suspect but not confirm.
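
A minimal sketch of the pattern-matching idea, assuming a toy checker built on Python's standard `ast` module. The `SOURCE` snippet and the `flag_eval_calls` helper are invented for illustration. The checker flags a call it can never prove reachable, much like the flagged spot in the maze that the robot cannot get to.

```python
import ast

# Hypothetical code under review. The eval() call sits behind `if False`,
# so it can never run, but a pattern matcher does not know that.
SOURCE = '''
def handler(user_input):
    if False:
        return eval(user_input)
    return len(user_input)
'''

def flag_eval_calls(source):
    """Return the line numbers of every call to eval(), without running the code."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "eval"):
            findings.append(node.lineno)
    return findings

findings = flag_eval_calls(SOURCE)
```

The finding still needs a person to review it, whereas a crash found by fuzzing comes with the input that triggers it.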

💡Unit Test

A unit test is a piece of code that tests a small, isolated part of a larger software application to determine if it behaves as expected. In the video, the concept of unit testing is used to compare with fuzzing, where fuzzing automates the process of generating inputs and checking outputs, potentially identifying more bugs than manual unit testing.
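
The maze analogy can be sketched in a few lines. Everything here is a toy assumption: the `MAZE` layout, the `walk` function, and the direction letters are made up to mirror the robot example, not taken from any real program.

```python
# Toy maze standing in for a program: "#" is a wall, "B" is a hidden bug.
MAZE = [
    "S.#",
    ".#.",
    "..B",
]
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}

def walk(directions):
    """Follow the directions from the top-left cell; crash on the bug cell."""
    row, col = 0, 0
    for step in directions:
        d_row, d_col = MOVES[step]
        new_row, new_col = row + d_row, col + d_col
        inside = 0 <= new_row < len(MAZE) and 0 <= new_col < len(MAZE[0])
        if inside and MAZE[new_row][new_col] != "#":
            row, col = new_row, new_col  # blocked moves leave the robot in place
        if MAZE[row][col] == "B":
            raise RuntimeError("crash: robot reached the bug")
    return row, col

# A unit test checks one hand-picked path, and it passes.
assert walk("DD") == (2, 0)

# A different input reaches the bug that the unit test never exercised.
try:
    walk("DDRR")
    crashed = False
except RuntimeError:
    crashed = True
```

The unit test is correct as far as it goes, but it covers a single path through the maze.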

💡Black Box Fuzzers

Black box fuzzers are the first generation of fuzzing tools that operate without any knowledge of the program's internal workings. They generate random inputs and observe the program's reactions to determine if it behaves correctly or crashes. The video mentions this as the initial approach to fuzzing, which has evolved into more sophisticated techniques.
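
A rough sketch of a black box fuzzer against a toy maze program. The maze, the `walk` target, and the `blackbox_fuzz` helper are assumptions made for illustration. The fuzzer knows nothing about the maze and only observes whether a run crashes.

```python
import random

MAZE = ["S.#", ".#.", "..B"]  # toy program: "#" is a wall, "B" is a hidden bug
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}

def walk(directions):
    """Toy target: follow the directions and crash on the bug cell."""
    row, col = 0, 0
    for step in directions:
        d_row, d_col = MOVES[step]
        new_row, new_col = row + d_row, col + d_col
        inside = 0 <= new_row < len(MAZE) and 0 <= new_col < len(MAZE[0])
        if inside and MAZE[new_row][new_col] != "#":
            row, col = new_row, new_col
        if MAZE[row][col] == "B":
            raise RuntimeError("crash")
    return row, col

def blackbox_fuzz(iterations=10_000, seed=0):
    """Generate random direction strings and collect the ones that crash."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        length = rng.randint(1, 8)
        directions = "".join(rng.choice("UDLR") for _ in range(length))
        try:
            walk(directions)
        except RuntimeError:
            crashes.append(directions)
    return crashes

crashes = blackbox_fuzz()
```

Many of the generated inputs are wasted on moves that go nowhere, such as stepping down and straight back up, which is the weakness the video attributes to this generation.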

💡Grammar-Based Fuzzers

Grammar-based fuzzers represent the second generation of fuzzing tools. They use a predefined template or grammar that dictates how inputs should be structured. This method is more focused than black box fuzzing, but it may still miss certain paths within the program if the grammar does not cover all possible input scenarios, as explained in the video.
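
As a hedged illustration, a template can be modeled as a list of allowed moves per step. The `TEMPLATE` below loosely follows the example in the talk (always down, then down or right, then left or right, then down or up). The data layout and the `generate` helper are assumptions.

```python
import itertools

# Hypothetical template: one tuple of allowed moves for each step.
TEMPLATE = [("D",), ("D", "R"), ("L", "R"), ("D", "U")]

def generate(template):
    """Enumerate every input the template allows."""
    return ["".join(steps) for steps in itertools.product(*template)]

inputs = generate(TEMPLATE)

# An input such as "DDRR" is outside the template, so any bug that only
# that path reaches would never be exercised.
missed = "DDRR" not in inputs
```

The template keeps the fuzzer on plausible inputs, but anything its author did not anticipate is never tried.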

💡Instrumentation Guided Fuzzing

Instrumentation guided fuzzing is the third generation of fuzzing techniques. It involves monitoring the execution of the program with generated inputs and using that information to inform the creation of subsequent inputs. This method is described in the video as a more intelligent approach to fuzzing, combining the benefits of structured input with the ability to explore a wider range of the program's functionality.
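
The feedback loop can be sketched as follows. This is a simplified illustration, not how any particular fuzzer is implemented: the `target` with its hand-written branch markers, the single-byte `mutate` step, and the `guided_fuzz` loop are all assumptions.

```python
import random

def target(data: bytes, trace: set) -> None:
    """Hypothetical instrumented program: records which branches ran."""
    if len(data) > 0 and data[0] == ord("F"):
        trace.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            trace.add(2)
            if len(data) > 2 and data[2] == ord("Z"):
                trace.add(3)
                raise RuntimeError("crash")

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Replace one randomly chosen byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def guided_fuzz(seed_input=b"aaa", max_iters=200_000, seed=0):
    """Keep inputs that reach new branches and mutate those further."""
    rng = random.Random(seed)
    corpus = [seed_input]
    seen = set()
    for iteration in range(1, max_iters + 1):
        candidate = mutate(rng.choice(corpus), rng)
        trace = set()
        try:
            target(candidate, trace)
        except RuntimeError:
            return candidate, iteration
        if not trace <= seen:  # this input reached a branch not seen before
            seen |= trace
            corpus.append(candidate)
    return None, max_iters

crash_input, iterations_used = guided_fuzz()
```

Because inputs that reach a new branch are kept and mutated again, the search climbs through the nested checks one byte at a time instead of having to guess all three bytes at once.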

💡AI Fuzzing

AI fuzzing is a label sometimes used to brand instrumentation guided fuzzing. In the video, Dr. Brumley says he does not think of the technique as AI, although it does learn as it runs: the more inputs it executes, the more it knows about which paths it has already explored and which parts of the program remain unexplored.

Highlights

Fuzzing is a decades-old process that is not well known outside of cybersecurity circles.

Dr. David Brumley, a professor at Carnegie Mellon University and CEO of ForAllSecure, built fuzzing technology that won the DARPA Cyber Grand Challenge.

Fuzzing involves giving random input to applications to discover security issues.

An analogy of a program as a maze with inputs as directions for a robot to navigate through it.

Unit tests check one path through the program, while fuzzing explores multiple paths to find bugs.

Fuzzing automates the process of input generation and program execution to identify crashes.

Fuzzing has evolved to the third generation of techniques, moving beyond random input generation.

Instrumentation guided fuzzing learns from execution paths to generate new inputs.

Fuzzing can be completely automated, running thousands of iterations in a second.

Static analysis looks for patterns in code without running the program, unlike fuzzing.

Google has found 25,000 bugs automatically with fuzzing in Google Chrome and open source libraries over three years.

Fuzzing improves software reliability and can speed up the software development life cycle.

Developers can use fuzzing to generate test cases and perform regression tests.

Black box fuzzers are the first generation, generating random inputs and checking for crashes.

Grammar-based fuzzers use templates to generate structured inputs for more directed exploration.

Instrumentation guided fuzzing combines the benefits of structured and random input generation.

Modern development shops like Google and Microsoft use instrumentation guided fuzzing for its effectiveness.

Transcripts

play00:05

the process of fuzzing

play00:06

is decades old but isn't well known

play00:08

outside of

play00:09

cyber security circles that needs to

play00:11

change

play00:12

luckily i'm here with someone that can

play00:14

help us do that

play00:16

dr david brumley david is a professor

play00:20

at carnegie mellon university and ceo of

play00:23

for

play00:23

all secure and he's also someone that

play00:26

built the fuzzing technology

play00:28

that won the darpa cyber grand challenge

play00:31

and he's going to explain what fuzzing

play00:33

is

play00:34

and explain how companies can use it to

play00:36

help improve both their security

play00:38

processes

play00:39

and software development cycles so david

play00:42

thanks for joining me and let's jump

play00:44

right to it

play00:45

what is fuzzing well

play00:48

as you said fuzzing was uh named about

play00:50

25 years ago the story is professor bart

play00:53

miller

play00:53

and his graduate students were looking

play00:55

at the reliability of

play00:56

unix microsoft and apple applications

play01:00

and they noticed something kind of funny

play01:02

when they gave these applications a

play01:03

random input

play01:04

they could cause about a third of them

play01:06

to crash pretty big number right

play01:09

it was really like the proverbial like

play01:10

monkeys typing on a keyboard right

play01:13

but instead of creating shakespeare they

play01:15

found serious security issues

play01:18

that's worse right it's much worse so

play01:21

let me explain how fuzzing works and i'm

play01:22

going to use

play01:23

an analogy here so think of a program

play01:26

like

play01:27

a maze right and so we know when a

play01:28

programmer is developing code

play01:30

they have different computations

play01:32

depending upon what the user gives them

play01:34

so here the program is a maze and then

play01:38

we have let's just pretend a little

play01:42

robot up here and an input to the

play01:45

program is going to be directions for

play01:47

our robot

play01:47

through the maze so for example we can

play01:50

give the robot the directions

play01:51

i'm going to write it up here down

play01:54

left down right and he's going to take

play01:58

two rights just meaning he's going to go

play01:59

to the right twice

play02:01

and then he's going to go down a bunch

play02:03

of times

play02:05

so you can think about giving our little

play02:07

robot this input

play02:09

and robot is going to take that as

play02:11

directions and he's going to

play02:13

take this path through the program he's

play02:14

going to go down left down

play02:17

first right second right then a bunch of

play02:19

downs

play02:20

and when you look at this we had a

play02:22

little bug here they can verify that

play02:23

this is actually

play02:24

okay there's no actual bug here and this

play02:26

is what's happening when a developer

play02:28

writes a unit test

play02:30

so what they're doing is they're coming

play02:31

up with an input and they're making sure

play02:33

that it gets the right output

play02:34

now the problem is if you think about

play02:36

this maze we've only checked one path

play02:38

through this maze and there's other

play02:39

potential lurking

play02:41

bugs out there so what fuzzing does is

play02:43

it really automates this idea of

play02:45

coming up with an input and running the

play02:48

program and seeing

play02:49

if we find a bug so for example

play02:54

if we think about just switching these

play02:55

directions a little bit we have down

play02:57

left down but instead of

play03:01

taking two rights we only take one right

play03:03

and then go down

play03:04

and some more directions

play03:08

the robot may take this particular path

play03:10

through the program down right instead

play03:11

of going 2

play03:12

it's only going to go down 1. say it

play03:15

comes over here and we find that the

play03:16

program

play03:17

crashes now what bart originally found

play03:20

of course was

play03:21

providing random input so it wasn't just

play03:23

structured like this random inputs could

play03:25

actually cause

play03:25

applications to crash pretty often now

play03:28

we're on our third generation of fuzzing

play03:30

techniques it's no longer

play03:31

monkeys typing on a keyboard there's a

play03:34

lot more tech behind it

play03:36

where the idea though is still the same

play03:38

we're going to automatically generate an

play03:40

input we're going to see if the program

play03:41

crashes or not

play03:42

and here's the cool thing it can be

play03:44

completely automated

play03:46

by making a computer do this as opposed

play03:48

to the developer writing the unit test

play03:49

you can go through thousands of these

play03:51

iterations in a single second

play03:55

let me contrast this with static

play03:56

analysis because i know a lot of people

play03:58

think about static analysis and fuzzing

play04:00

wonder what the difference is between

play04:01

them so when you think about static

play04:03

analysis

play04:04

what static analysis is doing is it's

play04:05

looking at the program it never actually

play04:07

runs it

play04:07

and it's saying well there may be a

play04:09

problem here maybe a problem here

play04:11

maybe it knows already this is okay

play04:13

maybe there's a problem

play04:15

it thinks here and so on and so forth

play04:17

but it's never actually

play04:18

proved there's a problem so it's looking

play04:20

for patterns

play04:21

in the code looking just for patterns

play04:23

and so if you actually look at this maze

play04:25

right

play04:25

you can say well static analysis flagged

play04:27

this but there's no way our little robot

play04:29

can get over there it's blocked

play04:31

and when you think about static analysis

play04:33

it can potentially find more bugs

play04:35

but you have to staff someone manually

play04:37

reviewing it what fuzzing is doing is

play04:39

incrementally exploring the program to

play04:41

come up with these

play04:42

to find lots and lots of problems for

play04:45

example

play04:46

google has a project where they're

play04:47

tracking google chrome and many of the

play04:49

open source libraries google finds

play04:51

uh uses and they found 25,000 bugs

play04:54

completely automatically

play04:56

with zero false positives over the last

play04:58

three years

play05:00

i also want to throw security aside and

play05:02

say how can this benefit the developer

play05:04

because security is not always a cost it

play05:06

can actually benefit

play05:08

we all know that the better we test a

play05:10

program the more reliable it's going to

play05:11

be in the field

play05:13

and we also know developers don't

play05:14

particularly like writing test cases

play05:17

and so by using fuzzing to come up with

play05:19

different inputs that execute

play05:20

all these paths they're really just test

play05:21

cases and you can do that to do

play05:23

regression tests over time

play05:25

so one of the benefits beyond security

play05:27

of fuzzing

play05:28

is you can use it to speed up your

play05:29

software development life cycle to

play05:31

produce

play05:32

more trustworthy and better quality code

play05:34

and so how can companies get started

play05:36

using

play05:37

fuzzing as a technique and what are some

play05:40

of the actual

play05:41

fuzzers that are out there let's talk

play05:43

about that so

play05:44

i started off by saying this was

play05:46

invented or coined 25 years ago by

play05:48

professor bart miller and we're really

play05:50

on our third generation

play05:51

so the original set of fuzzers were what

play05:53

we call black box fuzzers

play05:56

and they would generate an input maybe

play05:58

at random or

play05:59

with some algorithm and they just run

play06:00

the program and see if it crashed just

play06:02

over and over and over and over and over

play06:04

again

play06:05

now the problem with that is if you're

play06:06

just generating a random input it may

play06:08

not take

play06:09

the robot anywhere for example you don't

play06:10

want to generate an input that has the

play06:11

robot going down

play06:12

and back up and back down and so on and

play06:14

so forth

play06:16

so that was the first generation these

play06:18

techniques actually still work today

play06:20

randomly generating but not as well

play06:22

the second generation are what we call

play06:24

protocol or grammar based

play06:26

uh grammar-based fuzzers and what they

play06:28

do is

play06:29

you have someone manually generate a

play06:31

template for how to create those inputs

play06:33

so in our example here someone may write

play06:35

a template that says always you know go

play06:37

down

play06:38

and then go either down or right go

play06:41

either

play06:42

left or right next go after that maybe

play06:45

down again or up again

play06:46

and so on and so forth and if you think

play06:48

about what this is doing

play06:49

is it's constraining the set of things

play06:52

you're going to explore

play06:54

so for example if you write this

play06:55

protocol or grammar out

play06:57

it may end up inadvertently only

play06:59

checking part of the program because

play07:01

you haven't actually said it's possible

play07:03

to go over this far

play07:05

so that's the second generation great

play07:07

products out there today

play07:08

the third generation is what we call

play07:10

instrumentation guided fuzzing

play07:12

and what instrumentation guided fuzzing

play07:14

does is it generates an input

play07:16

and it watches as the robot's executing

play07:19

the path

play07:19

and it learns from that to come up with

play07:21

the next input

play07:23

and so sometimes this is branded as ai

play07:25

fuzzing i don't think of it as ai but it

play07:27

is learning the more it executes it's

play07:29

learning about which paths it's

play07:30

already looked at and what are the new

play07:32

places out there so it's a little bit of

play07:34

the best of both worlds right you have a

play07:36

constrained process

play07:37

but you're not missing half of the

play07:40

potential vulnerabilities

play07:42

i think so and i think if you go look at

play07:44

modern development shops the people like

play07:46

google and microsoft who put

play07:47

tons of money into this they've settled

play07:49

on instrumentation guided fuzzing for a

play07:51

reason

play07:53

well david thank you for explaining uh

play07:55

what fuzzing is

play07:57

and if you'd like to learn more about

play07:59

fuzzing and fuzzers

play08:01

check out techrepublic now if you've got

play08:03

a topic you'd like to see david and i

play08:04

cover

play08:05

in a future video or a question leave a

play08:08

comment

play08:08

and if you like this video be sure to

play08:10

click the like button or

play08:16

subscribe
