Part 8/8: ML Based Web App Firewall : Testing the IPS in Real Time

Debasish Mandal
20 Oct 2020 · 08:15

Summary

TLDR: In this video, Debasish tests a machine learning model for a web application Intrusion Prevention System (IPS), which was deployed in the previous part using the PyCaret library. The model is integrated with a proxy server that intercepts HTTP requests in real time and analyzes them to determine whether they are malicious. In a live test using Firefox, the model detects some SQL injection payloads and misses others. Debasish acknowledges that feature extraction needs further refinement and promises ongoing improvements to the model's precision and accuracy.

Takeaways

  • 😀 The video is a tutorial by Debasish that follows the previous part, in which the machine learning model was deployed using the PyCaret library.
  • 🛡️ The model being discussed is an Intrusion Prevention System (IPS) designed to detect malicious HTTP requests in real-time.
  • 💡 The process involves creating a proxy server that integrates with the machine learning model to intercept and analyze HTTP requests.
  • 🔍 The model extracts features from the HTTP requests to determine if they are 'good' or 'bad' in nature.
  • 📈 The video demonstrates using a Jupyter notebook to set up the environment and apply a K-means clustering model with two clusters.
  • 📚 It references a previous dataset saved in 'data.csv' for training the model.
  • 🌐 The testing is done using a Firefox web browser configured to send all requests through the proxy server.
  • 🔬 The model is tested against a dummy website, 'demo.testfire.net', which is a known vulnerable web application.
  • 🚀 The video shows real-time feature extraction and model execution on HTTP requests sent by the browser.
  • 🛑 The model successfully identifies some SQL injection payloads as malicious, printing 'intrusion detected'.
  • 🔄 The presenter acknowledges the need for further work on feature extraction to improve the IPS's overall quality and accuracy.

Q & A

  • What is the main topic of the video?

    -The main topic of the video is deploying and testing a machine learning model for a web application intrusion prevention system (IPS) in real-time using a proxy server.

  • What library was mentioned for deploying the model in the last video?

    -The library mentioned for deploying the model in the last video is PyCaret.

  • What is the purpose of the proxy server in this context?

    -The purpose of the proxy server in this context is to intercept HTTP requests and integrate with the machine learning model to determine whether the requests are good or bad in nature.

  • What tool is the presenter using to demonstrate the real-time feature extraction from HTTP requests?

    -The presenter is using a Jupyter notebook to demonstrate the real-time feature extraction from HTTP requests.

  • What is the method used for training the model in the script?

    -The method used for training the model is K-means clustering, with the number of clusters set to 2.

  • What is the data source for training the model mentioned in the script?

    -The data source for training the model is a dataset saved in 'data.csv'.

  • How is the Firefox web browser configured in the demonstration?

    -The Firefox web browser is configured to send all requests through the proxy server created in the Jupyter notebook.

  • What website is used for testing the IPS in the video?

    -The website used for testing the IPS is 'demo.testfire.net', a known vulnerable web application.

  • What type of payloads are used to test the IPS for detecting bad requests?

    -SQL injection payloads taken from the internet are used to test the IPS for detecting bad requests.

  • What is the presenter's plan for improving the IPS after the demonstration?

    -The presenter plans to continue working on the feature extraction from the training data and tuning the clustering model to make it more precise and accurate.

  • How does the presenter conclude the video?

    -The presenter concludes the video by asking viewers to stay subscribed for updates on the IPS development and improvement.

Outlines

00:00

🛡️ Deploying and Testing an IPS with the PyCaret Library

In this segment, Debasish recaps the deployment of the machine learning model with the PyCaret library, demonstrated in the previous video, and then turns to testing the Intrusion Prevention System (IPS) in real time. A proxy server is integrated with the machine learning model and intercepts HTTP requests to determine whether they are benign or malicious. A Jupyter notebook holds the code that extracts features from HTTP requests in real time and applies a K-means clustering model with two clusters to classify the requests. The setup includes configuring Firefox to send all requests through the proxy server for testing against demo.testfire.net, a known vulnerable dummy website.

05:02

🔍 Real-time Feature Extraction and IPS Testing

This segment covers real-time feature extraction from the HTTP requests sent by the browser. Initially, only benign requests are sent, and the extracted data points are printed. The presenter then tests the IPS by sending SQL injection payloads and observing the system's response. The IPS identifies two of the five payloads tried as intrusions, printing 'intrusion detected', and misses the other three. The presenter acknowledges the need to improve feature extraction and tune the clustering model for better precision and accuracy. The video concludes with a call for viewers to stay subscribed for updates on the ongoing development of the IPS.
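The detection step described above can be sketched as a nearest-centroid check. This is only an illustration under stated assumptions: the centroid values, the two-feature layout, and the names `CENTROIDS`, `BAD_CLUSTER`, `nearest_cluster`, and `check_request` are hypothetical and do not come from the video's notebook.

```python
import math

# Hypothetical centroids in a two-feature space:
# (special-character count, SQL-keyword count).
# Real centroids would come from the trained clustering model.
CENTROIDS = {
    0: (0.5, 0.0),  # assumed to be the "good" cluster
    1: (6.0, 2.5),  # assumed to be the "bad" cluster
}

# Assumption: k-means cluster ids carry no meaning by themselves, so the
# "bad" cluster has to be identified separately, e.g. by checking where
# known attack samples land.
BAD_CLUSTER = 1


def nearest_cluster(features, centroids=CENTROIDS):
    """Return the id of the centroid closest to the feature vector."""
    return min(centroids, key=lambda cid: math.dist(features, centroids[cid]))


def check_request(features):
    """Print a warning and return True when the request falls in the bad cluster."""
    if nearest_cluster(features) == BAD_CLUSTER:
        print("intrusion detected")
        return True
    return False
```

A payload that produces only a few unusual characters yields a feature vector close to the "good" centroid, which is one plausible reason some payloads are missed in the demonstration.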

Keywords

💡Deploy

To 'deploy' in the context of software or models refers to the process of making them available for use in a live environment. In the video, the presenter recalls deploying the machine learning model using the PyCaret library, which took just two to three lines of code. This is a key step in transitioning a model from development to practical application.

💡PyCaret

PyCaret is the library mentioned in the video for deploying the machine learning model. The presenter describes it as easy to use, with deployment taking two to three lines of code. The video does not provide further details about PyCaret, but its mention suggests its importance in model deployment.

💡Proxy Server

A 'proxy server' is an intermediary server that sits between a client and a destination server, handling requests and forwarding responses. In the video, the presenter creates a proxy server to integrate with a machine learning model, which intercepts HTTP requests to check their nature—whether they are 'good' or 'bad'. This is central to the video's demonstration of a real-time intrusion detection system.
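A minimal sketch of such an inspecting proxy, using only the Python standard library, might look like the following. The video's actual proxy class is not shown in the summary, so everything here is an assumption: the names `InspectingProxy`, `looks_malicious`, and `run_proxy` are hypothetical, the check is a placeholder for the model call, and the choice to block with a 403 (rather than only print a message) is illustrative. It handles plain-HTTP GET requests only.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
import urllib.error
import urllib.request


def looks_malicious(url):
    """Placeholder check; a real IPS would call the trained model here."""
    return "'" in url or "%27" in url


class InspectingProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # For a forward proxy, self.path holds the absolute URL
        # that the browser asked for.
        if looks_malicious(self.path):
            print("intrusion detected")
            self.send_error(403, "Blocked by IPS")
            return
        try:
            with urllib.request.urlopen(self.path, timeout=10) as upstream:
                body = upstream.read()
                status = upstream.status
                content_type = upstream.headers.get("Content-Type", "text/html")
        except urllib.error.URLError as exc:
            self.send_error(502, f"Upstream error: {exc.reason}")
            return
        # Relay the upstream response back to the browser.
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run_proxy(host="127.0.0.1", port=8080):
    """Start the proxy; this call blocks until interrupted."""
    ThreadingHTTPServer((host, port), InspectingProxy).serve_forever()
```

Calling `run_proxy()` would listen on 127.0.0.1:8080, the address and port used in the video. HTTPS traffic would need CONNECT handling, which this sketch leaves out.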

💡HTTP Request

An 'HTTP request' is a message sent from a client to a server to request access to a resource. In the context of the video, the presenter discusses extracting features from HTTP requests in real-time to feed into a machine learning model for analysis. The script mentions that the proxy server checks the nature of these requests, which is integral to the functioning of the intrusion prevention system (IPS).

💡Feature Extraction

Feature extraction is the process of pulling out the characteristics or features from raw data that are relevant for a particular task, such as machine learning. In the video, the presenter mentions a function responsible for extracting features from HTTP requests in real-time, which is a crucial step for the machine learning model to analyze and classify the requests.
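As a rough illustration of what such a function could compute, the sketch below turns a request URL into a handful of numeric features. The video does not list the features it uses, so the feature set, the keyword list, and the name `extract_features` are all assumptions chosen for illustration.

```python
import re
from urllib.parse import unquote_plus, urlsplit

# Illustrative keyword list; not taken from the video.
SQL_KEYWORDS = ("select", "union", "insert", "drop", "or", "and")


def extract_features(url):
    """Return simple numeric features computed from the decoded query string."""
    # Decode percent-encoding first so that %27 is counted as a quote.
    query = unquote_plus(urlsplit(url).query)
    words = re.findall(r"[a-z]+", query.lower())
    return {
        "length": len(query),
        "single_quotes": query.count("'"),
        "dashes": query.count("-"),
        "special_chars": sum(
            not c.isalnum() and not c.isspace() for c in query
        ),
        "sql_keywords": sum(w in SQL_KEYWORDS for w in words),
    }
```

Each dictionary can then be flattened into a vector in a fixed key order before it is handed to the clustering model.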

💡Machine Learning Model

A 'machine learning model' is a system that uses algorithms to learn from data and make predictions or decisions without being explicitly programmed. The video script describes using a model to analyze HTTP requests and determine if they are potentially harmful. The model's accuracy and efficiency are central to the effectiveness of the IPS being demonstrated.

💡K-Means Clustering

K-Means clustering is a type of unsupervised machine learning algorithm used for grouping similar data points into clusters. The video script mentions applying a K-Means clustering model to classify the nature of HTTP requests. The number of clusters is set to 2, presumably to differentiate between 'good' and 'bad' requests.
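To make the algorithm concrete, here is a from-scratch sketch of k-means (Lloyd's algorithm) with two clusters. It is not the video's code, which relies on a library model; the function name `kmeans`, the deterministic initialisation from the first `k` points, and the toy data are assumptions made for illustration.

```python
import math


def kmeans(points, k=2, iterations=20):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    # Deterministic initialisation: use the first k points as centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


# Toy feature vectors: three "quiet" requests and three "noisy" ones.
points = [(0, 0), (1, 0), (0, 1), (6, 3), (7, 2), (8, 3)]
centroids, labels = kmeans(points, k=2)
```

On this toy data the first three points end up in one cluster and the last three in the other. Which cluster corresponds to 'bad' requests still has to be decided separately, because k-means assigns cluster ids arbitrarily.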

💡Intrusion Prevention System (IPS)

An 'Intrusion Prevention System' is a cybersecurity mechanism aimed at detecting and preventing malicious activities. In the video, the presenter demonstrates an IPS that uses a machine learning model to analyze HTTP requests in real-time and identify potential threats, such as SQL injection payloads.

💡SQL Injection

SQL injection is a type of cyber attack where an attacker inserts malicious SQL code into a web application's input fields to manipulate the database. The video script includes examples where the presenter tests the IPS by sending SQL injection payloads to see if the system can detect them as 'bad' requests.
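When such a payload is typed into a search box, the browser percent-encodes it before it reaches the proxy, so an IPS generally needs to decode the query string before inspecting it. The short sketch below shows the round trip with the standard library; the payload is a classic textbook example, not one of the payloads used in the video.

```python
from urllib.parse import quote_plus, unquote_plus

payload = "' OR '1'='1"          # textbook SQL injection string
encoded = quote_plus(payload)    # how it travels inside a query string
decoded = unquote_plus(encoded)  # what the IPS should actually inspect

print(encoded)  # %27+OR+%271%27%3D%271
print(decoded)  # ' OR '1'='1
```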

💡Firefox Web Browser

The 'Firefox web browser' is a popular internet browser used for accessing the web. In the video, the presenter configures Firefox to send all requests through the proxy server created in the notebook. This setup is used to test the IPS by observing how the machine learning model processes and evaluates the requests.
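The same routing can be reproduced from a script, which is handy for replaying test requests without a browser. The sketch below assumes the proxy is listening on 127.0.0.1:8080 as in the video; the helper name `fetch_via_proxy` is hypothetical.

```python
import urllib.request

# Address and port of the local proxy, as configured in Firefox in the video.
PROXY = "http://127.0.0.1:8080"

# Route plain-HTTP traffic through the proxy.
proxy_handler = urllib.request.ProxyHandler({"http": PROXY})
opener = urllib.request.build_opener(proxy_handler)


def fetch_via_proxy(url):
    """Fetch a URL through the proxy and return (status, body)."""
    with opener.open(url, timeout=10) as response:
        return response.status, response.read()
```

Calling `fetch_via_proxy("http://demo.testfire.net/")` only works while the proxy is running, since every request is sent to port 8080 first.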

💡Real-Time

The term 'real-time' refers to the processing or analysis of data as it is being collected or received, without significant delay. The video script emphasizes the real-time capabilities of the IPS, showcasing how the machine learning model can instantly analyze incoming HTTP requests and detect intrusions.

Highlights

Recap of deploying the machine learning model using the PyCaret library, which took just two to three lines of code.

Explanation of testing the IPS in real-time with a proxy server integrated with a machine learning model.

Description of the proxy server's role in intercepting HTTP requests to check their nature.

Use of a Jupyter notebook for a step-by-step explanation of the process.

Feature extraction from HTTP requests in real-time for analysis by the IPS.

Training the model using a dataset from a previous video, setting up the environment for the model.

Application of a K-means clustering model with a specified number of clusters.

Creation of a simple proxy server in Python for testing the IPS.

Testing the IPS against a dummy website known for vulnerabilities.

Configuration of Firefox to send requests through the proxy server for testing.

Real-time demonstration of the IPS detecting bad HTTP requests using a machine learning model.

Use of SQL injection payloads to test the IPS's ability to detect malicious requests.

Demonstration of the IPS successfully identifying a bad request with an intrusion detected message.

Testing additional SQL injection payloads to evaluate the IPS's detection capabilities.

Acknowledgment of the need for further work on feature extraction to improve the IPS's quality.

Commitment to continue working on and improving the IPS, with updates to be shared.

Encouragement for viewers to stay subscribed for more content on the channel.

Transcripts

00:00

[Music]

00:05

Hello everyone, my name is Debasish, and I welcome you all to this video. In the last video we saw how we can deploy our model using the PyCaret library, and we saw how easy it is to do that. It literally took two to three lines to deploy the model we created using PyCaret.

00:22

Now it is time to test our IPS in real time, and that is exactly what we are going to do here. You must be pretty familiar with this diagram. We'll be creating a proxy server, and the proxy server will be integrated with the machine learning model that we created in the last few videos. Whenever we send a request to our server, our proxy is going to intercept the HTTP request and check whether it is good or bad in nature. So let's do it.

01:01

As usual, I'm going to use a Jupyter notebook. I have already created this notebook, so I'm going to go ahead and explain it to you step by step. Since we have to extract the features from the HTTP request in real time, this code is responsible for doing that. We are going to extract some of the features from the HTTP requests that our IPS receives in real time, and this function is responsible for that.

01:31

After that, this is where we are actually training our model. We are reading the old dataset that we saw in our previous video, which is saved in all data.csv. We read that data, set up the environment for the model, and apply a k-means clustering model. This is the k-means object, and the number of clusters we want to create is, as we have seen before, 2. So let's execute it.

02:16

Now it is created. Next, we have to create a simple proxy server using Python, which we have already written, and this is the class responsible for creating the proxy server. Let me quickly explain how we are going to test it.

02:33

This is the Firefox web browser, and we are going to test our IPS against a dummy website, demo.testfire.net. As you can see, it is a known vulnerable web application, and we can search for several things here. Now we have to configure Firefox to send all requests through the proxy server that we are going to create in this notebook. So let's go ahead and do that.

03:17

Manual proxy: 127.0.0.1 and 8080. Okay. So now let's open our Jupyter notebook and execute it.

03:42

It is actually now... it is not showing this interrupt.

03:59

Yeah, as you can see, it is listening on the local interface, port 8080. So now whatever request we send has to go through this proxy server, and the machine learning model that we have developed is going to be applied to all the requests that we send. Let's split the screen into two parts so that we can see that in real time.

04:28

So as you can see, our IPS, or the proxy, is listening on port 8080, and we have configured Firefox to send all HTTP requests through this port 8080. Whenever Firefox sends a request through 8080, our machine learning model is going to execute on the HTTP request, and we are going to find out whether the request is bad in nature or not. So that's the plan.

05:02

It's going to show us some of the inputs here. If we just press [inaudible], it is extracting, in real time, all the features from the HTTP request that the browser is sending, and it is printing them here.

05:20

Right now we are sending only good requests, so that is the data point that we are getting for this request. Now let's send some bad requests to the server. We are going to use some SQL injection payloads taken from the internet, and we'll see if it is able to detect any of them.

05:49

So let's use this one. If we just paste it here... it has missed it.

06:08

So let's use this one. As you can see, it has printed that an intrusion is detected. Our machine learning model is successfully able to identify that the request we are sending is actually bad in nature. Let's try out some other SQL injection payloads and see if it is able to catch them.

06:38

It was actually missed. Let's see if it is able to catch this one or not.

06:49

It's also missed.

06:54

So let's take this one. It is able to catch that the request our browser is sending is bad in nature, and it has printed 'intrusion detected'.

07:08

As I said in my previous video, I need to work more on the feature extraction from the training data. I have not spent much time on that, so obviously the overall quality of this IPS is going to improve. Before I do that, I just wanted to show you how our web application intrusion prevention system works in real time. I'll continue to work on that and keep you posted on how I am trying to improve the IPS that I have developed, and how I am tuning the clustering model to make it more precise and accurate.

07:54

I hope you have enjoyed this video. If you enjoy the kind of content I upload on this channel, I'd request you to stay subscribed. That's all I wanted to discuss in today's video. I'll see you in the next video. Bye.

08:13

[Music]

Related Tags
Machine Learning, Web Security, Intrusion Detection, Model Deployment, Feature Extraction, K-Means Clustering, Data Analysis, Real-Time Testing, Cybersecurity, Python Programming