Complete Beginner's Tutorial to Google Colab
Summary
TLDR: In this video, Greg introduces the Google Colab environment, a cloud-based platform for running Python code in Jupyter notebooks. He demonstrates basic functionalities, such as writing and executing code, interfacing with the operating system, and utilizing Google Colab's GPU and TPU resources. Greg also explains how to manage files, handle variables, and install additional Python libraries. The video is geared towards beginners, emphasizing educational purposes and the ease of experimenting with machine learning models without needing high-end hardware.
Takeaways
- 😀 The video is a tutorial on using Google Colab, not specifically a Python tutorial, but it involves writing some Python code.
- 📘 Google Colab is an interactive Python environment, allowing users to write and execute Python code in 'cells'.
- 🖥️ Users can interface with the operating system within Google Colab using an exclamation mark before commands, similar to using a terminal in Linux.
- 🔍 Colab provides access to sample data folders with pre-existing files useful for data science, which is a common use case for Jupyter notebooks.
- 🌐 Google Colab is cloud-based, meaning the computation is done on Google servers, and it's free to use for basic functionality.
- 💻 The video mentions that users can connect to different runtime types, including GPU or TPU, which can be beneficial for machine learning tasks.
- 🚀 Colab is particularly useful for those without access to powerful hardware, as it allows for machine learning model training without the need for expensive systems.
- 🔄 The video demonstrates how to manage variables and lists within the Colab environment, highlighting the stateful nature of the environment compared to traditional script execution.
- 📑 The script discusses the use of text cells in Colab for adding notes or documentation, following Markdown syntax for formatting.
- 📝 The tutorial covers creating a table of contents using headings and subheadings to organize the notebook, which is helpful for navigating and understanding the code structure.
- 🔌 The video explains how to install new Python libraries using pip within Colab and the potential need to restart the runtime for changes to take effect.
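The takeaway about Colab's stateful environment can be illustrated with a small sketch. It simulates re-running the same cell against one shared namespace, which is roughly what a notebook kernel does. The names `notebook_state` and `run_cell` are hypothetical helpers invented for this illustration and are not part of Colab.

```python
# Simulates two runs of the same notebook cell against one shared namespace.
# `notebook_state` and `run_cell` are hypothetical names used for illustration.
notebook_state = {}

def run_cell(source: str) -> None:
    """Execute cell source in the shared namespace, as a notebook kernel would."""
    exec(source, notebook_state)

run_cell("lst = [4, 5, 6]")  # first cell: declare the list once

append_cell = "for i in range(4):\n    lst.append(i)"
run_cell(append_cell)  # first run of the loop cell
run_cell(append_cell)  # running it again appends four more items

print(notebook_state["lst"])
```

Because the list is never redeclared between runs, each run of the loop cell keeps growing it, which would not happen in a script executed once from top to bottom.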
Q & A
What is the main focus of the video script?
-The main focus of the video script is to provide an introduction to the Google Colab environment, explaining its features, how it works, and its use for writing Python code, particularly in the context of data science and machine learning.
What is the difference between Google Colab and Jupyter Notebooks?
-Google Colab is similar to Jupyter Notebooks in that they both allow for writing and executing Python code in an interactive environment. The main difference is that Google Colab is cloud-based, allowing users to connect to a runtime that can include GPU or TPU resources, which may not be available on their own computers.
How can you interact with the operating system in Google Colab?
-In Google Colab, you can interact with the operating system by using an exclamation mark (!) before commands, such as 'ls' to list all files in the current directory, similar to using commands in a Linux shell.
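As a rough sketch of what those shell commands do, here are plain-Python equivalents that also run outside a notebook. The comments show the `!` form mentioned in the video. The temporary directory is an assumption made so the example does not touch real files.

```python
import os
import tempfile

# Work inside a throwaway directory (an assumption for this sketch).
workdir = tempfile.mkdtemp()

print(os.getcwd())  # Colab cell: !pwd

# Colab cell: !mkdir new_folder
os.makedirs(os.path.join(workdir, "new_folder"), exist_ok=True)

# Colab cell: !ls
entries = sorted(os.listdir(workdir))
print(entries)
```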
What is the purpose of the sample data folder in Google Colab?
-The sample data folder in Google Colab contains pre-existing files that are often used for data science. These files are included to facilitate users in learning and testing data science and machine learning models without needing to find or upload their own datasets.
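A minimal sketch for peeking at that folder from Python. It assumes the folder lives at `/content/sample_data`, which follows from the working directory shown in the video. `list_sample_files` is a hypothetical helper and returns an empty list when run outside Colab.

```python
from pathlib import Path

# Assumption: Colab exposes its bundled datasets at /content/sample_data.
SAMPLE_DIR = Path("/content/sample_data")

def list_sample_files(folder: Path = SAMPLE_DIR) -> list[str]:
    """Return sorted file names in the folder, or [] if it does not exist."""
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir() if p.is_file())

print(list_sample_files())
```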
What is the advantage of using Google Colab for data science and machine learning projects?
-Google Colab provides an accessible and powerful platform for data science and machine learning projects by offering free access to computational resources such as GPUs and TPUs, which can significantly speed up the training of models and handling of large datasets.
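One way to check whether the current runtime actually has a GPU attached is sketched below. It relies on the NVIDIA `nvidia-smi` command-line tool being present on GPU runtimes, which is an assumption. It is only a heuristic, it does not detect TPUs, and `gpu_visible` is a hypothetical helper name.

```python
import shutil
import subprocess

def gpu_visible() -> bool:
    """Heuristic: True if the `nvidia-smi` tool exists and lists a GPU."""
    tool = shutil.which("nvidia-smi")
    if tool is None:
        return False
    try:
        # `nvidia-smi -L` prints one line per detected GPU.
        result = subprocess.run(
            [tool, "-L"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout

print("GPU runtime detected:", gpu_visible())
```

On a runtime with no hardware accelerator selected, this should simply print `False`.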
Outlines
😀 Introduction to Google Colab
Greg introduces the video with an overview of Google Colab, explaining it as an interactive Python environment. He clarifies that while the tutorial will involve minimal Python coding, the focus is on demonstrating the functionalities of Colab. Greg mentions the use of Jupyter notebooks for writing and interacting with Python code, and how Colab allows for GPU or TPU usage, which can be beneficial for data science and machine learning tasks. He also touches on the free nature of Colab, with the option for users to upgrade for more resources.
📓 Exploring Colab's Features and Python Interaction
The second paragraph delves into the features of Colab, such as the ability to write and execute Python code in cells, interface with the operating system using shell commands, and the inclusion of sample data for data science projects. Greg demonstrates how to declare variables, execute shell commands, and interact with the file system. He also explains the use of Col
Keywords
💡Google Colab
💡Jupyter Notebooks
💡Interactive Python (IPython)
💡Runtime
💡GPU (Graphics Processing Unit)
💡TPU (Tensor Processing Unit)
💡RAM (Random Access Memory)
💡Pip
💡Markdown
💡Table of Contents
💡Machine Learning
Highlights
Introduction to Google Colab and its use as an interactive Python environment.
Explanation of Jupyter Notebooks for writing and executing Python code interactively.
Demonstration of how to write and output Python variables within Colab.
Showcasing the ability to interface with the operating system using Google Colab.
Highlighting the inclusion of sample data for data science in Colab's environment.
Differences between local Jupyter and cloud-based Colab, including runtime options like GPU and TPU.
The free use of Colab for machine learning models and coding without the need for high-end hardware.
Mention of Colab Pro for enhanced features like more RAM and better GPU/TPU access.
The importance of using Colab responsibly to avoid resource limitations.
How to write and execute Python code cells and interface with the OS using the exclamation mark.
Using the file system within Colab, including creating new folders and viewing hidden files.
Introduction to the variable inspector in Colab for examining stored variables.
Explanation of how to use text cells in Colab to create a table of contents with markdown.
The persistent state of variables in Colab, unlike traditional Python scripts.
How to upload and download files in Colab with caution due to the temporary nature of the environment.
The option to mount Google Drive for more permanent file storage within Colab.
How to manage RAM usage in Colab and what to do if the environment crashes due to insufficient memory.
The process of installing new Python libraries in Colab using pip and potential need for runtime restarts.
Final thoughts on using Colab for learning and testing rather than heavy-duty professional use.
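The Google Drive option in the highlights above can be sketched as follows. `drive.mount` from the `google.colab` package is the usual entry point, but that package is only importable inside a Colab runtime, so this sketch falls back to returning `False` elsewhere. The mount point `/content/drive` and the helper name `mount_drive` are assumptions for illustration.

```python
def mount_drive(mount_point: str = "/content/drive") -> bool:
    """Mount Google Drive when running inside Colab; return False elsewhere."""
    try:
        from google.colab import drive  # only importable inside a Colab runtime
    except ImportError:
        return False
    drive.mount(mount_point)  # asks for authorization in the browser
    return True

mounted = mount_drive()
print("Drive mounted:", mounted)
```

Files saved under the mounted folder live in Drive rather than in the temporary runtime, so they should survive a disconnect.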
Transcripts
hey everyone my name is greg and today
we're going to learn about the google
colab environment so this is not a
specifically python tutorial in google
colab although we will write a little
bit of python code very little this is
showing how the environment works and
just what it's about so basically if
you're not familiar with jupyter or
jupyter notebooks basically what they
are is this way to write python code and
what we call i so interactive python so
py and then nb notebooks so we're
writing code in interactive python
notebooks where what that means is
basically ignoring text for now because
that's really just text we'll look into
that a little bit you write code and
usually python code into these cells or
for example i could get a variable maybe
x is equal to five so i made an integer
five and made that equal to x and we can
actually output that without even
writing print we can go ahead and just
leave that as the last line as long as
that's the last line even over here it's
still going to output it because these
are just empty lines it outputs that
variable and puts it right here and what
we call say the output terminal and
actually you can sort of interface with
the operating system a little bit in
google colab as well for example with
exclamation you can do something like ls
which is like list all of the files
that's how you do that in linux well it
does that same thing it shows sample
data that's really all that's here we
have this sample data folder which has a
bunch of pre-existing files that are
often used for data science so most of
the time ipython notebooks are for data
science and machine learning not always
but most the time and so they do have
these files included in there already so
it's pretty similar to jupyter but the
main difference is that you go and
connect to a runtime and you can change
the runtime type as well and get a gpu
or tpu even if your own computer doesn't
have one so this is a cloud-based
thing where you have to go into
the browser go to google colab and
although it may look like it's running
on your computer most of this stuff is
not actually running on your computer
you are just interfacing with it via
this notebook you know writing the code
here what it actually does is puts this
all on google servers and completely for
free unless you get colab pro which i
think gives you more ram sometimes you
run out of ram machine learning models
do tend to eat up ram a lot so you may
actually run out of ram and you can get
better access to gpus or tpus with colab
pro but unless you do any of that you
are just using colab servers for free so
that you can train machine learning
models or just really write code in a
nice way without worrying about a fancy
computer you don't have to go out and
buy some big system so that you can
train machine learning models and you
know if you are like a company or
someone that's making a startup and
you're relying on training important
machine learning models colab may not
be enough for you you may have to go for
some sort of an upgrade for one of the
cloud-based systems or get your own
system like a good nvidia or amd and
actually the apple stuff is getting
better for gpus as well
but this is a cloud-based resource
mainly for education purposes so
probably you're not going to want this
if you are a heavy duty user it's more
so great for just kind of testing things
and mostly learning so we've seen a
couple of the pieces you can write code
and then you can kind of interface with
the operating system we could do that a
little bit more we could do exclamation
which really just means interface with
the os versus the actual python code
if you don't do exclamation you're just
writing python code if you're doing an
exclamation you could do something again
like pwd that means print the working
directory they made our working
directory slash content and if you
really wanted you know you could
actually tool around here and view
hidden files this tab here this is your
files and you could go in to look around
here but almost all of the time
you won't need to do any of this you may
want to make a new folder and you can do
that if you want you could do
exclamation mkdir we'll just call it new
underscore folder we'll do that and you
should if you refresh you will actually
see a new folder and you can do stuff
like that if you want to something
interesting in google colab and i
believe is only available in jupyter
notebook
if you have some extensions is this kind
of variable thing and you're seeing
stuff that actually shouldn't really
exist because i did this earlier these
are all of the variables that it
currently has stored and so you can
actually look them up so if we did
something like f is equal to let's make
it the list of one two and then three we
should actually see f pop up over here
and you can filter by if i type f that
is going to come up just a normal
alphabetical based search and it says
it's a shape of three items so the shape
thing is basically actually if you hover
it apparently it does show you what it
is
this shape only really makes sense for
uh for numpy arrays and the reason that
that's actually getting its own column
is because again this is often for
training machine learning models having
stuff in numpy arrays and tensors is
extremely common and so this is a two by
two array uh you can't see it because
this was earlier but if i did show you i
made a earlier and it's the numpy array
with the first row is one two and
the second row is three four and so its
shape is two by two it has two rows and
two columns now just a quick example of
text if you wanted to you could easily
write any text you just do the add text
thing there or maybe add text here by
the way you can just kind of delete
cells as you want to here delete and you
can also move them around this is moving
this one into different locations i'll
just place it back at the bottom you
could put any text anywhere you wanted
and if you just type some text say hi
well all that is is going to have hi it
really follows the markdown rules and
sort of slightly different and weird
scenarios but what it is basically is if
you do another text and then we write
say a hashtag a hashtag and then hi what
that single hashtag was and by the way
i'm double clicking to go into the
editor there and i'm just single
clicking to to look at how it looks
normally if i double click there you'll
see the hashtag and then i think you do
need the space in between
no actually you don't so just hashtag
and then text normally people put a
space there though you should be able to
see in the table of contents we have the
section called hi
and if you were to do say another piece
of text anywhere else you could do two
hashtags and then say hi as well
and basically what that does is it
indents it because that's a
subsection we have the main section hi
then we have a subsection
of this hi as well over here and i
believe the furthest level of indenting
you can go is three so we could do a or
maybe four actually but i don't think
you can go infinitely at least you
definitely don't want to go past three
or four if we do another hashtag we'll
do three hashtags and then say hello
that is going to nest it under that
section and if you did say you know just
two of them we could do uh hi we'll just
do the number seven you can see that
places it at its appropriate level
indentation that has two hashtags and so
that goes at the same level of the other
two hashtag things so you can build up a
table of contents that way if you wanted
and you could make it simple and just do
only single sections so you could say
you know this is hi 2 this is
another section and it seems to be
placing it under that no there we go it
takes a second for it to figure it out
sometimes but what the nice thing is is
you can click on these and it's going to
bring you to those
appropriate sections in the code so
wherever you want to kind of set up your
table of contents where you have
sometimes people when they're training
models they'll do something like um you
know getting the data or actually
imports and then getting data
pre-processing model training final
evaluation
maybe maybe an exploratory data analysis
you could make all of those a section
and then you could have sub sections
within those if you wanted to and that
makes it look good both for the
organization of the code like from the
table of contents level and just looking
at the code it can really help people
especially if you are sharing these with
people uh can really help people
understand and yourself what is going on
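
As a rough illustration of how heading levels map to table-of-contents indentation, here is a small sketch that builds an indented outline from markdown text cells. `build_toc` is a hypothetical helper and not how Colab implements its sidebar. The sample headings mirror the ones typed in the video, with the exact cell text assumed.

```python
import re

def build_toc(text_cells: list[str]) -> list[str]:
    """Turn markdown heading lines into indented table-of-contents entries."""
    toc = []
    for cell in text_cells:
        for line in cell.splitlines():
            # One to six leading '#' characters mark a heading; the space is optional here.
            match = re.match(r"^(#{1,6})\s*(.+)$", line)
            if match:
                level = len(match.group(1))
                toc.append("  " * (level - 1) + match.group(2).strip())
    return toc

cells = ["# hi", "## hi", "### hello", "## hi 7", "# hi 2"]
for entry in build_toc(cells):
    print(entry)
```

Each extra `#` nests the entry one level deeper, and a heading with fewer `#` characters returns to the matching outer level.
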
again if you did want to get a gpu you
can always go into runtime and then
change runtime type now by default you
won't have a hardware accelerator which
is just a graphics processing unit and
then there's a tensor processing unit
gpu is for both gaming display uh and
for number crunching still on here they
really mean it just as a number cruncher
or for machine learning and for tpu in
this context you can really think of
that as just an advanced gpu
and so they're going to have least
access to tpus a little bit less access
to gpus and you should very rarely have
trouble uh finding a server and
appropriate resources when you're just
on none here
and why am i saying this thing about
resources well basically how this works
uh is that google treats you you know
how you treat it and so if you go to
really abuse the gpu like if you're just
when you're logging into your account
and you're constantly using a gpu uh
they are not going to like you for that
tpu you know they're not going to like
you for that either just use it when you
have to use it for a couple of minutes
when you're training a model and then
you shouldn't run into any difficulties
but yeah just just be nice to them
because they're being nice to us for
doing this for free now a little bit
into the python code just to make sure
we understand jupiter itself now this is
a little bit different than just running
a python script top to bottom because as
you can see we still have all these
variables that exist right now if we go
into variables you'll see all of these
things normally like in a python script
it'll just run top to bottom you know
occasionally like it'll jump around for
functions you'll jump up and down for
loops or if statements and stuff but in
general you run it from top to bottom
and then you're not going to have like
any variables like in memory or whatever
that stuff is going to be gone because
you just ran a script and it should be
done but here it's not really done
because we have access to all these
things you have access to f here and you
have access to x
and if you wanted to you could run a
loop for i
in range 4 and we will just do print i
so you could do that and then you could
change a variable maybe make a new
variable we'll say
lst is equal to just a list of 4 5 6
and then each time we will go and add lst
dot append with i so we can do that and
firstly well what's that gonna do it
prints the same thing but lst down here
you can see now
four five six zero one two three we
modified that variable and what you
could do is just remove this line and
then you could do it again and what that
does is it appends to that list it kept
the same one because you didn't
redeclare it it kept that same one and
then it appended four more things and
you could do that again and again if you
wanted to and that's going to keep
growing of course you wouldn't really
want to do something exactly like that
but the point is to display the fact
that we have these variables in memory
you're not just running a script top to
bottom everything's deleting you are
modifying these variables and this is
very useful for data science and machine
learning because often you're like
manipulating a pandas array and you
might want to do some things here or
there and keep it in memory and look at
it later
and you might want to train a model and
then you know do something for 10
minutes just like you know figure some
things out and then you know maybe use
that model or tweak some things now you
can upload and download files into this
environment but just be a little bit
wary that
it's an online resource that's you're
not really dedicating and you don't have
full access to so just because you
upload a file here for example
there is no guarantee that if you were
to look an hour later or even five or
ten minutes later that that file is
still going to be there because uh your
resources could get washed up at any
point however downloading can be
extremely powerful because you know once
you do something on here download it
that file will be on your computer you
know until something you know wrong or
you delete it on purpose happens if you
wanted to to kind of combat that issue
you could mount your drive here permit
this notebook access to your google
drive files so what it's saying here
connecting to google drive will permit
code executed in this notebook to modify
files in your google drive until access
is otherwise revoked so you could have
that access to your google drive
therefore you're not really going to
lose files actually it puts it in right
there
that way you're not really going to lose
files because you can just
you can move them to and from your
google drive and then you don't really
have to worry about that to be honest i
scarcely use this i mostly upload things
and then you know what they're usually
on my computer saved no matter what and
if i need to re-upload it again i will
re-upload it again
and if i need to download something i
will quickly download that and then not
worry that it's going to disappear
because i've already saved a file of
that
so usually i don't actually do this and
i don't remember the complexities on how
that involves but it's not super bad
there are some helpful tutorials out
there if you want to learn about
mounting your google drive just so you
know some of the complexities of this
occasionally you will run out of ram and
if so it's probably going to crash and
you'll lose most of your results if that
happens you'll probably just have to
connect to a runtime again and you're
probably good to go
sometimes things are going to just look
like they're going to go forever and
then you go to cancel them so for
example i could do something like for i
in range of a very very big amount and i
will show you what happens we'll try to
print i but it's not going to be
actually i'm not going to print anything
i'm just going to do pass this will
iterate eventually this will
probably figure out that it's not going
to work
but for us what we want to do is try and
stop it that did stop it and so you're
lucky that it got keyboard interrupt and
you're good to go
and this does not kill that most the
time this does not kill your environment
by the way we should still have our
previous variables in x right here but
not always there will be times for sure
when even if you click stop here
you know it's doing something
complicated and it's not going to want
to stop so what if that does happen you
can try to do runtime interrupt
execution try to keep interrupting it
and if not you are going to have to
restart runtime which is going to fix
your problems
and maybe disconnect and delete runtime
but most the time i usually don't click
this one i usually start i usually do
restart runtime if there's any big
issues and then you'll just run the
cells by hand or if you wanted you could
do a run all as well but be careful
about the run all because the very last
thing that ran into a problem would
probably run into a problem again yeah i
mean it depends what your variables are
and how you have kind of the order in
which you did things but be wary of the
run all because you might just run into
the same problem again and again okay
sorry i don't have the camera turned on
because i'm doing this the next day i
can't believe i forgot that if you need
a new environment that does not exist
already so for example import scikit
learn we do that with import sklearn
that works okay it doesn't have any
problem with that because scikit-learn
is installed into the library and
actually we can see what is installed
with pip and then list that's going to
show you all the different environments
all of this stuff exists in there
already so it has tensorflow sql scikit
learn pandas scipy a lot
of different libraries mostly for data
science and machine learning related
stuff if there is one that is not
installed already and there is tons of
them you can do
pip install and then whatever that
library is so one example of something
that's not in there by default is auto
sklearn so i will try to do import
autosklearn like that that is how you do it
if you looked up the documentation they
would tell you how to import it it's
like that except that doesn't work
there's no module named that and so we
need to do the pip install and so you
would look up how to do that you'd look
up how to do pip and then look at the
documentation online for that specific
library they would tell you in this
specific case pip install auto dash
sklearn so i'm going to run that it's
going to go and download that and
sometimes depending on what happens if
it's using libraries that are already
installed and messing around with the
versions of those as we'll see we saw
scipy and scikit-learn in here probably
dask is there as well if it does that it
may tell you at the end what you'll see
here after all of this stuff sometimes
is it will tell you to restart the
runtime and that is only the case if it
is involving libraries that have already
been installed so as you can see here it
sort of installed scipy uninstalled
other versions it does some weird stuff
and so it will say you must restart the
runtime in order to use the newly
installed versions and so if i were to
do import autosklearn it still doesn't
like that it actually gives a different
error message about versions what you
can do to fix that in this case is
runtime and then we will do restart runtime
then click yes and after that's
restarted you should be able to not do
the pip install again you should just be
able to import it and you can see this
time it worked properly so not always
will you have to restart the runtime but
sometimes you will sometimes you can
just pip install library and then you
can import it right away other times you
have to restart the runtime but that's
how you get new python libraries and if
you did need to get a different version
specifically you have a different
library say that you specifically wanted
to pip install numpy equals equals 1.18
and you could try and do that and if
there is a specific version like that
it'll get the point zero by default
numpy equals equals 1.18 or whatever version of
whatever specific library you wanted it
may have trouble with that depending on
the other versions of other things and
again you may have to restart the
runtime as well to use those new
installed versions but if you wanted a
specific version you would do it like
that and if you wanted a new library you
could also get the newest version of
something with pip install dash dash
upgrade and then numpy what that will do
is try its best to get the most
up-to-date numpy or whatever library
version and again it may have difficulty
with other things again it might not
like other libraries because of the way
that pip
in python works but you can try and do
it like that i hope this was helpful and
drop a like if it was maybe consider
subscribing to the channel if you're not
subscribed already and i'll see you next
time guys
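
The install workflow described at the end of the transcript can be sketched in plain Python. The sketch only checks whether a module can be found and builds the pip command lists without running them. In a Colab cell the same commands would be typed with a leading exclamation mark, for example `!pip install auto-sklearn`. The helper names `is_importable` and `pip_install_command` are hypothetical, and the import name `autosklearn` is assumed from the library's documentation.

```python
import importlib.util
import sys

def is_importable(module_name: str) -> bool:
    """True if Python can locate the module without actually importing it."""
    return importlib.util.find_spec(module_name) is not None

def pip_install_command(requirement: str, upgrade: bool = False) -> list[str]:
    """Build the pip command for the current interpreter (not executed here)."""
    command = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        command.append("--upgrade")
    command.append(requirement)
    return command

print(is_importable("autosklearn"))                # missing unless already installed
print(pip_install_command("auto-sklearn"))         # a new library
print(pip_install_command("numpy==1.18"))          # a specific version
print(pip_install_command("numpy", upgrade=True))  # the newest version
```

If pip replaces a library that is already loaded, the runtime may need a restart before the new version can be imported, as the transcript notes.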