Beginner Machine Learning tutorial (TryHackMe Advent of Cyber Day 14)

UnixGuy | Cyber Security
14 Dec 2023 | 16:23

Summary

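TLDR This video script explains the basics of AI and machine learning and how they apply to cybersecurity. It draws viewers in through the story of Day 14 of the challenge, in which the CTO sabotages the toy-making process so that defective toys are produced, and machine learning is used to tackle the problem. The script touches on three types of machine learning algorithm: genetic algorithms, particle swarm, and neural networks. It also introduces two training methods, supervised learning and unsupervised learning. Finally, it covers the importance of data sets and the process of training and validating a neural network, and describes how AI and machine learning are used in cybersecurity.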

Takeaways

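  • 🤖 AI and machine learning are among the hottest topics in the market right now.
  • 📚 This tutorial is part of a completely free, fun, and educational challenge from TryHackMe that beginners can join.
  • 🧐 It is important to understand the difference between AI and machine learning: machine learning is the process of teaching a machine to mimic human intelligence.
  • 🔍 Solving a problem effectively requires a suitable machine learning algorithm. Three main types are introduced: genetic algorithms, particle swarm, and neural networks.
  • 🧠 Neural networks mimic the human brain, in which many neurons receive inputs and produce outputs.
  • 📈 Two ways of training a neural network are explained: supervised learning and unsupervised learning.
  • 🔢 The inputs are normalized and multiplied by weights, which lets the neural network judge which inputs matter more.
  • 📉 An activation function keeps outputs within a fixed range, so they are comparable values rather than arbitrary numbers.
  • ➡️ The feedforward loop and backpropagation are the processes used to train a neural network so that it makes correct decisions.
  • 📝 A data set is the information given to the neural network; giving it just enough information avoids overtraining.
  • 🔑 In cybersecurity, AI and machine learning help with detecting unusual behavior and with user behavior analytics, and can provide fast, accurate predictions.
  • 🏆 Finally, solving the challenge with sufficiently accurate predictions earns the flag.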
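
The accuracy check behind the flag can be sketched in a few lines. This is only an illustration: the function name `accuracy`, the sample predictions, and the labels are made up, and the 90% threshold is the one mentioned in the video rather than anything read from the challenge's grader.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot score an empty set")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical data: 1 = defective toy, 0 = good toy.
predicted = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
actual = [1, 0, 0, 1, 0, 1, 0, 1, 1, 0]

score = accuracy(predicted, actual)
# The video says the flag is awarded when accuracy is above 90%.
print(f"accuracy = {score:.0%}", "-> flag" if score > 0.90 else "-> try again")
```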

Q & A

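  • What is the difference between AI and machine learning?

    -AI is the umbrella term for artificial intelligence, and machine learning is one process within it. Machine learning is the process of getting a machine to make human-like decisions, using data sets to teach it to decide accurately.

  • What is a genetic algorithm?

    -A genetic algorithm mimics the process of natural selection and evolution. It is based on survival of the fittest: a simple idea, but one that can become complicated to implement in real situations.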
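
A minimal sketch of the idea, assuming a made-up objective (maximize the number of 1-bits in a bit string). The function names, population size, and mutation rate are illustrative choices, not something shown in the video.

```python
import random


def fitness(genome):
    # Toy objective (an assumption for this sketch): count the 1-bits.
    return sum(genome)


def evolve(pop_size=20, genome_len=16, generations=40, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)  # seeded so the run is repeatable
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: only the fitter half survives.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            # Crossover: splice two surviving parents at a random cut point.
            mum, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, genome_len)
            child = mum[:cut] + dad[cut:]
            # Mutation: occasionally flip a bit.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)


best = evolve()
print(best, fitness(best))
```

Because the survivors are carried over unchanged, the best fitness in the population never decreases from one generation to the next.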

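  • What is a neural network?

    -A neural network is a machine learning technique that mimics how the nerve cells in the human brain work. It imitates cells and neurons that receive many inputs and produce outputs.

  • What is supervised learning?

    -Supervised learning is a training method in which we provide the neural network with the information we want it to learn. A data set is used to teach the machine to make correct decisions.

  • What is a hidden layer?

    -The hidden layer is the middle layer of a neural network, where the mathematical computation and the interpretation of the data take place. More complex calculations require more layers.

  • What is an activation function?

    -An activation function keeps the output of the neural network within a fixed range, which makes it possible to compare different toys with one another.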
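
As a small illustration, the two ranges mentioned in the video (0 to 1, and -1 to 1) correspond to two common activation functions, the sigmoid and tanh. The raw scores below are made-up numbers; the video does not say which function the challenge script uses.

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical raw outputs of a node before activation.
raw_scores = [-4.0, -0.5, 0.0, 0.5, 4.0]

for z in raw_scores:
    # math.tanh squashes into (-1, 1) instead.
    print(f"{z:+.1f} -> sigmoid {sigmoid(z):.3f}, tanh {math.tanh(z):+.3f}")
```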

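  • What is the feedforward loop?

    -The feedforward loop is one of the steps in training a neural network: the inputs are normalized, fed into the input layer, and passed through the network to obtain an answer.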
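
A forward pass can be sketched as below. The layer sizes, weights, biases, and the sample toy are all made-up values for illustration. Note that a weight here is a coefficient attached to a connection in the network, which is a separate thing from a toy's physical weight.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per node, then activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def feed_forward(inputs, hidden_w, hidden_b, output_w, output_b):
    hidden = layer(inputs, hidden_w, hidden_b)      # input layer -> hidden layer
    return layer(hidden, output_w, output_b)[0]     # hidden layer -> single output


# Hypothetical, already-normalized measurements: height, width, length.
toy = [0.2, -1.1, 0.7]

# Hypothetical weights: 3 inputs -> 2 hidden nodes -> 1 output node.
hidden_w = [[0.5, -0.3, 0.8], [-0.6, 0.9, 0.1]]
hidden_b = [0.0, 0.1]
output_w = [[1.2, -0.7]]
output_b = [-0.2]

score = feed_forward(toy, hidden_w, hidden_b, output_w, output_b)
print(f"defect score: {score:.3f}")
```

With untrained weights like these the score is meaningless; training is what adjusts the weights so that the score tracks the label.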

  • What is backpropagation?

    -Backpropagation is the process of feeding back to the neural network whether its answer was correct. This is how the network learns to make better decisions.
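
The feedback step can be sketched for the simplest possible case: a single sigmoid neuron trained by gradient descent. This is a deliberate simplification (full backpropagation applies the same idea layer by layer through the chain rule), and the samples, labels, learning rate, and function names are assumptions made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, weights, bias):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, labels, epochs=2000, lr=0.5):
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # Feed forward: compute the network's current answer.
            pred = predict(x, weights, bias)
            # Feedback: how far off was the answer? For a sigmoid output with
            # cross-entropy loss this difference is the gradient signal.
            error = pred - y
            # Nudge each weight against its share of the error.
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Hypothetical labeled data: two scaled measurements per toy, 1 = defective.
samples = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
labels = [0, 0, 1, 1]

weights, bias = train(samples, labels)
for x, y in zip(samples, labels):
    print(x, y, round(predict(x, weights, bias), 3))
```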

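  • What is a data set?

    -A data set is the information fed to the neural network and used to train the machine learning model. Without a suitable amount of data, the neural network cannot make good decisions about new problems.

  • What is overtraining?

    -Overtraining is what happens when the neural network is given too much information: it memorizes the answers instead of understanding the logic. Validation is used to judge whether the network has learned enough.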
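
One common way to act on the validation signal is to stop training once the validation score stops improving. The sketch below assumes a list of per-round validation scores is already available; the scores, the `patience` parameter, and the function name are illustrative and not taken from the challenge script.

```python
def pick_stopping_round(validation_scores, patience=2):
    """Return (best_round, best_score), stopping once the validation score
    has failed to improve for `patience` rounds in a row."""
    best_round, best_score, stale = 0, float("-inf"), 0
    for round_no, score in enumerate(validation_scores):
        if score > best_score:
            best_round, best_score, stale = round_no, score, 0
        else:
            stale += 1
            if stale >= patience:
                break  # performance is declining: likely overtraining
    return best_round, best_score


# Made-up validation accuracy after each training round.
scores = [0.71, 0.80, 0.86, 0.90, 0.89, 0.87, 0.84]
print(pick_stopping_round(scores))
```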

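  • How do AI and machine learning help in cybersecurity?

    -AI and machine learning help in many areas of cybersecurity, including detecting unusual behavior, detecting malware, and user behavior analytics. Because machine learning can learn what normal network traffic and user behavior look like and then flag anomalies, it is very useful for detecting and responding to cyber attacks.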
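
The "learn what normal looks like, then flag what deviates" idea can be illustrated with a simple statistical stand-in. This is not a neural network and not how any particular security product works; the login history, the threshold of 3 standard deviations, and the function name are assumptions for the sketch.

```python
from statistics import mean, stdev


def is_unusual(history, new_value, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations
    away from the mean of the observed history."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return new_value != mu
    return abs(new_value - mu) / sigma > threshold


# Hypothetical login times for one user, as hours of the day.
login_hours = [9.0, 9.1, 8.9, 9.2, 9.0, 8.8, 9.1, 9.0]

print(is_unusual(login_hours, 9.1))  # a login at the usual time
print(is_unusual(login_hours, 3.0))  # a login at 3:00 a.m.
```

A real user behavior analytics system would combine many more signals, such as location and device, but the principle of comparing new activity against a learned baseline is the same.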

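  • What is the purpose of the Python script used in this tutorial?

    -The Python script used in this tutorial builds a neural network from the training and testing data sets, runs training and validation, and finally makes predictions on the testing data set and outputs the results.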

Outlines

00:00

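🤖 The basics of AI and machine learning and their application to cybersecurity

This section explains the basics of AI and machine learning and how they are used in cybersecurity. The tutorial is part of Advent of Cyber by TryHackMe, a free competition for beginners that makes learning fun. In the story, the CTO has sabotaged the toy pipeline and control elves have been placed in it to combat the problem; the aim is to solve it with the power of AI and machine learning. The section also covers the difference between AI and machine learning and three types of machine learning algorithm (genetic algorithms, particle swarm, and neural networks).

05:00

🧠 Training neural networks and the importance of data sets

The second section goes into detail on how neural networks are trained and how data sets are handled. A neural network is made up of three layers (input layer, hidden layer, and output layer), and the computation performed in each layer and its meaning are explained. It also covers the feedforward loop and backpropagation as training steps, and how validation and testing data are used when working with data sets.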

10:02

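🔍 Applying machine learning with a Python script to solve the challenge

The third section shows how to build a machine learning model with a concrete Python script and solve the problem using the training and testing data sets. It walks through the code for normalizing the data, the 80/20 split, and training and validation. It also shows how the predictions are saved to a file and uploaded to a URL to verify their accuracy.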
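
The 80/20 split and the normalization step can be sketched with the standard library alone. The challenge script uses library helpers for this; the functions below (`split_80_20`, `fit_scaler`, `scale`) and the sample rows are hypothetical stand-ins that only show the underlying idea. Normalization here means rescaling each column to zero mean and unit spread.

```python
import random
from statistics import mean, pstdev


def split_80_20(rows, seed=0):
    """Shuffle a copy of the rows, then cut it into 80% training, 20% validation."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * 0.8)
    return rows[:cut], rows[cut:]


def fit_scaler(rows):
    """Learn each column's mean and spread from the training rows only."""
    columns = list(zip(*rows))
    return [(mean(c), pstdev(c) or 1.0) for c in columns]  # guard against zero spread


def scale(rows, scaler):
    return [[(v - mu) / sd for v, (mu, sd) in zip(row, scaler)] for row in rows]


# Made-up measurements: [height, width] for ten toys.
rows = [[float(h), float(w)] for h, w in zip(range(10, 20), range(30, 50, 2))]

train_rows, val_rows = split_80_20(rows)
scaler = fit_scaler(train_rows)
train_scaled = scale(train_rows, scaler)
val_scaled = scale(val_rows, scaler)  # reuse the training statistics
print(len(train_rows), len(val_rows))
```

Fitting the scaler on the training rows only, and reusing it for the validation rows, keeps information about the held-out data from leaking into training.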

15:03

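🛡️ Applications of AI and machine learning in cybersecurity

The final section discusses how AI and machine learning help in cybersecurity. It explains how the fast detection that AI provides can be put to use, for example in detecting unusual behavior and in user behavior analytics. It also walks through the answers to the challenge's closing questions on AI and machine learning and shows how to submit the final flag.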

Keywords

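💡Artificial intelligence (AI)

Artificial intelligence is technology designed to give machines the ability to think and make judgments the way humans do. The video explains how AI is applied to cybersecurity and says it helps identify the problematic toys.

💡Machine learning

Machine learning is a subset of artificial intelligence: the process by which a machine learns from data and makes decisions. The video explains how machine learning helps with quality inspection of the toys and with detecting unusual behavior in cybersecurity.

💡Genetic algorithm

A genetic algorithm mimics the process of natural selection and evolution and is used to search for an optimal solution. The video mentions it as one of the machine learning algorithms and presents it as one example of a problem-solving technique.

💡Particle swarm

Particle swarm is an algorithm that mimics the way birds flock together toward a specific point. The video describes it as one of the machine learning algorithms and says it is useful for solving complex problems.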
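
A minimal one-dimensional sketch of the flocking idea: each particle is pulled toward the best position it has found itself and toward the best position found by the whole swarm. The objective function, the bounds, and the coefficient values (commonly used illustrative settings) are assumptions; none of this comes from the video.

```python
import random


def particle_swarm(objective, low, high, particles=15, steps=60, seed=0):
    rng = random.Random(seed)
    pos = [rng.uniform(low, high) for _ in range(particles)]
    vel = [0.0] * particles
    best_pos = pos[:]                      # each particle's best-known position
    swarm_best = min(pos, key=objective)   # best position seen by any particle
    inertia, pull_self, pull_swarm = 0.7, 1.5, 1.5
    for _ in range(steps):
        for i in range(particles):
            vel[i] = (inertia * vel[i]
                      + pull_self * rng.random() * (best_pos[i] - pos[i])
                      + pull_swarm * rng.random() * (swarm_best - pos[i]))
            # Move, but stay inside the search bounds.
            pos[i] = min(max(pos[i] + vel[i], low), high)
            if objective(pos[i]) < objective(best_pos[i]):
                best_pos[i] = pos[i]
            if objective(pos[i]) < objective(swarm_best):
                swarm_best = pos[i]
    return swarm_best


# Toy objective: the swarm should gather near x = 3, where the value is lowest.
best_x = particle_swarm(lambda x: (x - 3.0) ** 2, low=-10.0, high=10.0)
print(round(best_x, 3))
```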

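💡Neural network

A neural network is an algorithm that mimics how the nerve cells in the human brain work. In the video, a neural network is used in the toy inspection process to help judge whether a toy is good or defective.

💡Supervised learning

Supervised learning is a machine learning method in which the machine learns from a labeled data set. The video explains that this method is used to train the neural network to tell good toys from defective ones.

💡Unsupervised learning

Unsupervised learning is the process in which the machine analyzes the data it is given on its own and finds patterns. The video describes it, using the toy data as an example, as a way for the neural network to learn by itself.

💡Hidden layer

The hidden layer is the middle layer of a neural network, which processes the input data into meaningful information. The video says the hidden layer is where the computation happens and that it plays an important role in judging whether a toy is good.

💡Feedforward loop

The feedforward loop is a step in training a neural network in which the input data is passed through the network in order to obtain a prediction. The video describes this loop as an essential part of how the neural network learns.

💡Backpropagation

Backpropagation is a step in training a neural network that feeds back whether the network's prediction was accurate. The video says this process is needed for the neural network to learn to make more accurate predictions.

💡Data set

A data set is the collection of data fed into a machine learning model and is the basis on which the model learns and predicts. The video explains that the data set gives the neural network information and is used to train it to judge whether a toy is good.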

Highlights

AI is one of the hottest topics in the market at the moment.

This tutorial is part of Advent of Cyber by TryHackMe, a free competition for beginners.

The basics of AI and machine learning will be covered, along with their application in cybersecurity.

The difference between AI and machine learning will be explained.

Machine learning is a process to teach a machine to mimic human intelligence.

Three types of machine learning algorithms will be explored: genetic algorithm, particle swarm, and neural network.

Neural networks are the most popular and mimic how neurons work in the human brain.

Two methods of training neural networks will be discussed: supervised learning and unsupervised learning.

Supervised learning provides the neural network with labeled data to learn from.

Unsupervised learning allows the neural network to find interesting patterns in unlabeled data.

The neural network consists of an input layer, hidden layer, and output layer.

The hidden layer is where the mathematical computations and decision-making occur.

The feedforward loop and backpropagation are key processes in training a neural network.

The importance of proper data preparation and avoiding overtraining will be discussed.

The example will use a supervised learning method to classify toys as defective or not.

The process of building and training a neural network model will be demonstrated using Python code.

AI and machine learning have practical applications in cybersecurity for detecting unusual behavior and user activity.

The tutorial provides a step-by-step guide to using AI and machine learning in solving a cybersecurity challenge.

The presenter shares his experience and insights on the evolution and effectiveness of AI in cybersecurity over the years.

Transcripts

play00:00

AI is one of the hottest topics in the

play00:02

Market at the moment and if you ever

play00:04

wanted to learn about AI then this free

play00:06

tutorial is for you this is part of the

play00:08

Advent of Cyber by TryHackMe which is a

play00:11

completely free competition for

play00:12

beginners to participate in fun and

play00:14

educational challenges I will go through

play00:16

the basics of AI and machine learning

play00:19

but I will also explain to you how AI

play00:21

can be used in cyber security and if

play00:23

you're new to the channel I'm UnixGuy

play00:25

and I help people land their

play00:26

first cyber security job even if they

play00:28

don't have any degree or technical IT

play00:30

background I have many videos where I

play00:32

show you exactly how to land your first

play00:34

cyber security job like this video for

play00:36

example but wait a minute it's almost

play00:38

Christmas time let's get into it so in

play00:41

day 14 of the challenge the CTO has made

play00:44

our toy pipeline go wrong that's not

play00:46

unusual for a CTO by the way by infecting

play00:48

elves at key positions in the toy making

play00:50

process he has poisoned the pipeline and

play00:53

caused the elves to make defective toys

play00:55

McSkidy has started to combat the problem

play00:57

by placing control elves in the pipeline

play00:59

these elves take measurements of the

play01:01

toys to try and narrow down the exact

play01:03

location of problematic elves in the

play01:05

pipeline by comparing the measurements

play01:07

of defective and perfect toys however

play01:09

this is an incredibly tedious and

play01:10

lengthy process so he's looking to use

play01:12

machine learning so in this story the

play01:14

CTO has messed up the process of making

play01:16

toys and what we want to do is use the

play01:19

power of AI and machine learning to try

play01:21

and detect which toys are in good

play01:23

condition and which toys are defective

play01:25

but first things first we need to learn

play01:27

the difference between AI and machine

play01:28

learning you see so many people use the

play01:30

word AI by mistake because what they

play01:32

really are referring to is what we call

play01:34

machine learning machine learning is

play01:36

essentially a process where we try and

play01:38

teach a machine to mimic human

play01:40

intelligence for example I want my

play01:42

machine to be able to make decisions

play01:44

accurately but the problem we face is

play01:46

that sometimes we end up using if

play01:48

statements for example if we're

play01:49

producing toys and we want all the toys

play01:52

to be red but one of the toys is blue

play01:54

then our machine can detect if the toy

play01:56

is blue because that's an easy process

play01:57

we can just have an if statement that says

play01:59

if the toy is not red then the

play02:01

toy is defective but unfortunately in

play02:03

the real world the problems that we face

play02:05

are a little bit more complicated for

play02:07

example in our toys problem here it's

play02:09

not just the color but sometimes it's

play02:11

the height sometimes it's the weight or

play02:13

the dimensions so to effectively solve

play02:15

this problem we need to use proper

play02:17

machine learning the process of

play02:18

analyzing the input and trying to find

play02:20

the problem this process and this flow

play02:23

of decision we call it an algorithm now

play02:25

there are so many different algorithms

play02:27

for machine learning which go beyond

play02:29

this video but here we will explore

play02:31

three types of machine learning

play02:32

algorithms so the first one is called

play02:34

genetic algorithm this structure

play02:36

essentially mimics the process of

play02:37

natural selection and evolution it's

play02:39

essentially an algorithm based on the

play02:41

theory of survival of the fittest a fairly

play02:43

simplistic idea but it can be

play02:45

complicated when we try to implement it

play02:47

in real life the second type is called

play02:49

particle swarm so this one aims to mimic

play02:51

the process of how birds flock in a

play02:53

group together at a specific point and

play02:55

the third one is called the neural

play02:56

network the neural network is by far the

play02:59

most popular one in fact this is the one

play03:01

that I learned a long time ago in

play03:02

university and if I'm honest with you I

play03:04

forgot most of the stuff that I learned

play03:06

because I didn't practice or apply it in

play03:08

the real world but today with the AI

play03:10

boom I find myself needing to go back to

play03:12

those Concepts and learn them again

play03:14

which is a lot of fun for me so

play03:16

basically neural networks mimic the

play03:17

process of how neurons work in our brain

play03:19

neural networks try to replicate our

play03:22

human brain so we have so many cells and

play03:24

neurons in our brain and they receive

play03:26

various inputs and they produce outputs

play03:28

so this is the basic idea behind neural

play03:30

networks and this is the one that we

play03:32

will use in our example today and I

play03:34

promise you as we go through the example

play03:36

it will become a lot easier to learn and

play03:38

understand so the neural network much

play03:40

like the human brain needs to learn

play03:42

before it can make decisions and there

play03:43

are many ways to teach neural networks

play03:45

how to make decisions in fact so many

play03:47

PhD programs are based on a specific way

play03:50

of teaching neural networks and trying

play03:51

to find the efficiency of that way in

play03:53

this example here we will be learning

play03:55

two methods of training neural networks

play03:58

the first one is called supervised

play03:59

learning and the second one is

play04:00

called unsupervised learning the

play04:02

supervised learning is where we provide

play04:04

the neural network with the information

play04:06

that we want it to learn so ideally we

play04:08

need to create what we refer to as data

play04:10

sets these data sets are essentially

play04:12

information that we feed to the neural

play04:14

network in order for the neural network

play04:16

to be able to make decisions based on

play04:17

this data for example if I want to teach

play04:19

the neural network about toys my data set

play04:22

should include information about what a

play04:24

good toy looks like so that when the

play04:25

neural network sees a toy that doesn't

play04:27

look like the good toy then it can tell

play04:29

you whether it's defective or not the

play04:31

second type is called unsupervised

play04:32

learning and this one is a little bit

play04:34

more complex we basically let the neural

play04:36

network try and find interesting things

play04:38

so in our toys example instead of

play04:40

specifically telling the neural network

play04:42

that this is a good toy what we do

play04:44

instead is we feed it a large amount of

play04:46

information about toys in general and

play04:48

then we let the neural networks try to

play04:50

make correlations and decisions based on

play04:52

that data as I said this is a complex

play04:54

topic and in the challenge there are

play04:56

links that you can follow where you can

play04:58

read more about this later on in this

play05:00

example that we will go through we will

play05:01

use the supervised learning method so as

play05:04

you can see in this example our machine

play05:06

learning model here consists of three

play05:08

layers we have the input layer and then

play05:10

we have a hidden layer and we have then

play05:12

the output layer so at the moment we

play05:13

have a neural network that doesn't have

play05:15

any information so the first step is to

play05:17

give it some information that

play05:19

information is fed to the neural network

play05:21

at the input layer so in this example

play05:23

you can see that the input layer has

play05:25

the height width length color scheme

play05:27

make ID and the check ID and at the end

play05:29

of that model we have something called

play05:31

the output layer this will simply tell

play05:33

us whether the toy is defective or not

play05:35

but then in the middle we have the

play05:37

hidden layer this is where the magic

play05:39

happens this is where the math where the

play05:41

computation happens this is where the

play05:43

neural network tries to make sense of

play05:45

the data and the hidden layer itself can

play05:47

be composed of so many layers in fact

play05:49

the more complicated the calculations

play05:51

are the more layers we need now this

play05:53

example doesn't go deep into what a node

play05:55

is but if I want to simplify it for you

play05:57

a node can be a server it can be a

play06:00

laptop it even can be an application

play06:02

instance or a thread think of a node as

play06:04

like something that does computations so

play06:06

the more computational nodes we have the

play06:08

more servers we have the more computing

play06:10

power we have the more mathematical

play06:12

computation we can perform so if we have

play06:14

a complex neural network that's doing

play06:17

true artificial intelligence we really

play06:19

need a lot of computational power and a

play06:21

lot of mathematical calculations as well

play06:24

now in this example we will zoom in a

play06:26

little bit on the hidden layer to see

play06:27

how the calculations are performed but

play06:29

don't worry it's not complex math it's

play06:31

very very basic simplified for you just

play06:33

to know enough to understand how neural

play06:36

networks actually work we don't just

play06:37

take the inputs as they are in fact we

play06:39

multiply the inputs by the weight now

play06:41

this is important logic we need to

play06:43

understand the relationship between the

play06:45

height and the weight of the toy this

play06:46

will help the neural network understand

play06:48

which inputs contribute more to the

play06:50

output than others so for example one

play06:52

toy could be a lot taller than other

play06:54

toys but once we factor in the weight

play06:56

then we can see how much that height is

play06:58

actually contributing similarly we

play06:59

don't just take the output value as it

play07:01

is but we take that output value and we

play07:04

put it into what we refer to as an

play07:05

activation function the purpose of the

play07:07

activation function is that we don't

play07:09

want the outputs to be just random

play07:11

numbers we want these outputs to be

play07:13

within one range so we can compare the

play07:15

toys to one another so in this example

play07:17

we want the output to be a decimal

play07:19

number between 0 and 1 or it could be

play07:21

between -1 and 1 now that we have

play07:23

some idea of how the neural network

play07:25

operates the next step would be to train

play07:27

the neural network this means we need

play07:29

information to feed into the neural

play07:31

network so it can start and make

play07:33

decisions for us so the first method

play07:35

here is called the feed forward Loop

play07:37

this is the simplest form of training so

play07:39

essentially the way it works the first

play07:41

step is we normalize the input normalize

play07:43

the input is what we talked about in the

play07:45

previous step which is we multiply it

play07:47

by weight to help the neural network

play07:49

decide which inputs are more important

play07:51

and then we feed the inputs to our nodes

play07:53

in the input layer as you can see in the

play07:55

diagram and then we simply get the

play07:57

answer so the more data we give to the

play07:58

neural network the better the decision

play08:00

will be because it will know what a

play08:02

defective toy looks like but that's only

play08:04

half of the equation we have been

play08:05

feeding input to the neural network now we

play08:08

need to do what we refer to as back

play08:10

propagation this is where we simply tell

play08:12

the neural network whether the answer

play08:14

was correct or not and that's how the

play08:15

neural network can learn so we will give

play08:17

the neural network a certain input and

play08:19

then the neural network will come and

play08:20

tell you well I think this toy is good

play08:22

but then you will look at it and say yes

play08:23

it was and then you will look at the

play08:25

answer and then you will tell the neural

play08:26

network whether the answer was correct

play08:28

or not and that's how the neural network

play08:30

can be trained to make better decisions

play08:33

now the last topic that we need to talk

play08:34

about before we go into solving the

play08:36

challenge is the information that we

play08:38

feed to the neural network which we

play08:40

refer to as data sets now the type of

play08:43

data that we give to the neural network

play08:44

is a huge topic on its own and it's not

play08:47

straightforward so as TryHackMe tries to

play08:49

explain it if you were at school and

play08:51

your teacher has explained to you that

play08:53

1 + 1 = 2 and 2 + 2 = 4 but then in the

play08:57

exam you get the question of 3 + 3 now

play08:59

we know that the answer is six but you

play09:01

can only know that if you understood the

play09:03

basic principle of addition if you just

play09:06

memorized 1 + 1 = 2 and 2 + 2 = 4 then

play09:09

you won't be able to calculate 3 + 3 you

play09:12

have to understand the logic behind the

play09:14

calculation the same thing applies to

play09:16

machine learning we can simply give our

play09:18

neural network so much information but

play09:20

then the neural network ends up

play09:21

memorizing all of this information the

play09:23

problem is when the neural network faces

play09:25

a new problem that it hasn't seen before

play09:27

then it can't make a decision so the way

play09:29

we train our neural network is we need

play09:31

to train it in such a way that it

play09:33

understands the underlying logic it's

play09:35

not just cramming and memorizing things

play09:37

so the way we fix this problem is that

play09:39

we need to know how much information is

play09:42

enough otherwise we will end up into

play09:44

what we refer to as overtraining this is

play09:46

when we give the neural network way too

play09:48

much information that it ends up

play09:50

memorizing the answers as opposed to

play09:52

understanding the logic so to achieve

play09:53

that we do something we refer to as

play09:56

validation we essentially validate

play09:58

whether the neural network has learned

play09:59

enough or not so to perform validation

play10:02

we have to split the data set into three

play10:04

data sets the first set is called the

play10:05

training data this is the information

play10:07

that we feed to the neural network but

play10:09

then we also need validation data which

play10:11

is what we will use to validate whether

play10:14

the neural network understood our

play10:15

training or not so after each training

play10:17

round we need to send the validation

play10:19

data through our Network to determine

play10:21

the performance if the performance

play10:23

starts to decline then we know that

play10:24

we're overtraining our neural network

play10:27

and finally we have testing data this

play10:29

data set is simply used to calculate the

play10:31

final performance of the network the

play10:33

network shouldn't see this data until

play10:34

we're completely done with the training

play10:36

process so with all of this background

play10:38

information now we can proceed to solve

play10:40

our challenge and I promise you the hard

play10:42

part is done this challenge is

play10:44

straightforward and I'll show you

play10:45

exactly how to solve it so we

play10:47

essentially have three files one is a

play10:49

python script called detector and then

play10:51

we have our training data set and we

play10:54

have our testing data set we need

play10:56

training data set we need validation

play10:58

data set and we need testing data set in

play11:00

this particular example the training and

play11:02

the validation data sets are both in the

play11:04

same file so we only have two files for

play11:06

Simplicity now what we need to do is we

play11:08

need to add some lines of code to our

play11:11

python script but don't be afraid you

play11:12

don't actually need to know Python

play11:14

Programming or anything because these

play11:16

lines of code are given to you all you

play11:18

need to do is you need to start the

play11:19

machine in the top right corner and this

play11:21

will split our screen in half and then

play11:23

we can copy the lines of code into the

play11:25

detector script so as you can see if we

play11:27

walk through the code the first thing

play11:29

we do is we import python libraries that

play11:31

are needed for our neural network here

play11:33

we're using two libraries one is called

play11:35

pandas and the other one is called

play11:37

scikit-learn for building our neural network

play11:39

and the next thing is we need to load

play11:41

the data set in our case we have two

play11:43

Excel files and then once we get the

play11:45

input data remember we normalize it so

play11:47

in our previous example we had to

play11:49

multiply it by the weight but also in

play11:51

this example we need to make all the

play11:52

inputs numerical so even the color we

play11:54

need to just convert it to numbers so

play11:56

that we get a simple decimal output

play11:58

answer and finally we need to load the

play12:00

data everything will be stored in the

play12:02

variable we call X and test_X stores

play12:05

the testing data for us the next step is

play12:07

where you will need to copy some lines

play12:08

of code so for our data set remember we

play12:11

have one file for both the training and

play12:13

the validation so we will split the data

play12:15

in this example we will use an 80/20

play12:17

split so what I will do I will simply

play12:20

copy this line into the script sh I use

play12:22

the Vim editor because as I was

play12:24

recording it was a little bit slow

play12:26

remember you can just use the graphical

play12:27

interface you can simply double

play12:29

click on the script and you can copy and

play12:31

paste the line you don't actually need

play12:32

to use the vi editor if that's easier

play12:34

for you and the next step we need to

play12:36

copy these lines to normalize the data

play12:38

simply copy paste this line into the

play12:40

script and then we need to start the

play12:42

training process again all I will do is

play12:44

just copy this line into the script then

play12:47

we've got our classifier code and our

play12:49

validation prediction code as well and

play12:51

finally we need to insert a testing

play12:53

prediction code as well once you're done

play12:55

remember to save the file if you've

play12:57

double clicked on the file and you edited

play12:59

that just make sure to click save and

play13:01

then we will simply run this python

play13:02

script so we simply type the command

play13:04

python3 space and then we type

play13:06

detector.py you can copy this line of

play13:09

code if it's easy for you and then we

play13:10

can watch the magic happen a few moments

play13:13

later now when you finish the prediction

play13:15

will be saved to a file then we simply

play13:17

need to upload the prediction to this

play13:19

URL so open this URL within your testing

play13:22

machine and then we upload the file so

play13:24

let's see if our predictions were

play13:26

accurate if our accuracy is above 90%

play13:28

then we will be awarded with the

play13:30

flag and here we go winning we got the

play13:33

flag I'm not going to lie it feels good

play13:35

now when you run it you may not always

play13:37

get 90% accuracy and if that happens

play13:39

simply run it again and the explanation

play13:41

for that is within neural networks there

play13:43

is some Randomness built in remember

play13:45

they try to mimic a real human brain a

play13:48

real neural network so this is an

play13:50

issue to look out for now when it comes

play13:52

to cyber security AI and machine

play13:54

learning are not a new topic in fact

play13:56

even 5 years ago I was bombarded by

play13:58

vendors telling me how they have this

play14:00

sophisticated AI and machine learning

play14:02

built into their products but when I go

play14:04

and test them I get extremely

play14:06

questionable results it was simply bad

play14:09

however today they have gotten a lot

play14:11

better now some of the security tools

play14:13

that can make use of AI are tools we use

play14:15

to detect unusual behavior and this is

play14:18

important because neural networks can be

play14:20

really good at detecting things a lot

play14:22

faster than us humans for example if you

play14:24

have malware on your network then the

play14:26

traffic will look different than the

play14:28

normal traffic so your neural network

play14:30

can actually learn what a normal

play14:32

day-to-day operation looks like and

play14:34

whenever there is something unusual then

play14:36

that AI or machine learning can generate

play14:38

an alert for you this is extremely

play14:40

useful for cyber security professionals

play14:42

who work in detecting and responding to

play14:44

cyber attacks another thing which is

play14:45

also popular in the industry is what we

play14:47

refer to as user Behavior analytics for

play14:50

example if every day I log into my email

play14:53

at 9:00 a.m. from Australia but all of a

play14:55

sudden out of the blue my username logs

play14:57

into my email at 3:00 a.m. from

play15:00

France this is an unusual behavior

play15:02

because machine learning has learned

play15:04

and observed my behavior over a very

play15:06

long period of time and now it can

play15:08

detect if I'm doing something that's

play15:09

unusual now notice these are simplistic

play15:12

examples but as you've learned in this

play15:14

course if you provide really good data

play15:16

sets into your machine learning it can

play15:17

provide you with fairly accurate

play15:19

predictions now the final part of the

play15:21

challenge is to answer the questions so

play15:23

the first question is what is the other

play15:24

term given for artificial intelligence

play15:26

this is an easy one machine learning the

play15:28

second question is what machine

play15:30

learning structure aims to mimic the process of

play15:32

natural selection it's called genetic

play15:34

algorithm we talked about this at the

play15:36

beginning of the lesson and then what's

play15:38

the name of the learning style that

play15:39

makes use of labeled data to train a

play15:41

machine learning structure if we scroll

play15:43

up this is called supervised learning

play15:45

and what's the name of the layer between

play15:47

the input and output layers of a neural

play15:49

network this is the hidden layer this is

play15:51

where the magic happens and finally what

play15:53

is the name of the process used to

play15:55

provide feedback to the neural network on

play15:57

how close its prediction was it's called

play15:59

back propagation this is where we tell

play16:00

the network whether the prediction was

play16:02

correct or not and finally we need to

play16:04

submit the flag that we got so all we

play16:06

need to do is copy paste the flag here

play16:08

and submit now as I said AI and machine

play16:10

learning are an extremely Hot Topic in

play16:12

cyber security and if you're trying to

play16:14

land your first cyber security job then

play16:16

I have a step-by-step roadmap to take

play16:17

you from absolute beginner all the way

play16:19

to becoming a cyber security

play16:21

professional in this video and I'll see

play16:22

you there
