How to Set up VS Code for Data Science & AI
Summary
TLDR: In this video, Dave, a data scientist, shares his experience with VS Code for data science projects, highlighting its advantages over Jupyter Notebooks in terms of workflow and productivity. He introduces VS Code, explains essential settings and extensions like Python, Pylance, and Jupyter, and demonstrates how to run Python code and Jupyter notebooks within VS Code. Dave emphasizes the efficiency of using VS Code for the entire data science project lifecycle, showcasing its interactivity and readiness for production.
Takeaways
- Introduction: Dave, a data scientist, introduces a tutorial on setting up VS Code for data science projects, which has improved his workflow and productivity.
- Tool Overview: VS Code is a free, integrated development environment (IDE) by Microsoft, compatible with Windows, Linux, and macOS, offering features like debugging, syntax highlighting, and intelligent code completion.
- Productivity Boost: Dave emphasizes VS Code's impact on his efficiency as a data scientist, allowing for the entire project lifecycle management without the need to convert Jupyter notebooks to Python files.
- Extensions: The script highlights the importance of installing specific extensions for Python, such as the Python extension pack, Pylance, and Jupyter, to enhance VS Code's functionality for data science tasks.
- Customization: Dave discusses customizing VS Code with themes and settings, including workspace-specific settings that can override user-level settings for project-specific configurations.
- Interactive Coding: A key feature demonstrated is the ability to run Python code interactively within VS Code using the 'Jupyter: Send Selection to Interactive Window' setting, combining the benefits of Jupyter notebooks with VS Code's features.
- Code Management: Dave explains how VS Code facilitates easier code management for data science projects, making the transition from development to production smoother without the need for code duplication.
- Workspace Setup: The script includes a step-by-step guide on setting up a workspace in VS Code, which helps in organizing and managing data science projects more effectively.
- Workflow Efficiency: Dave demonstrates the efficiency of using VS Code for data science by showing how to interactively run and test code segments, making the debugging and development process faster.
- Learning Resource: The video serves as an educational resource for those looking to enhance their data science workflow with VS Code, offering insights into extensions, settings, and best practices.
- Transition from Jupyter: The script highlights the benefits of transitioning from JupyterLab/Notebooks to VS Code for data scientists seeking a more efficient and feature-rich coding environment.
Q & A
What is the main topic of Dave's video?
-The main topic of Dave's video is about setting up Visual Studio Code (VS Code) for data science projects and how it has improved his workflow and productivity.
Why did Dave switch from Jupyter Notebooks to VS Code?
-Dave switched from Jupyter Notebooks to VS Code because it has significantly improved his workflow and productivity, offering more features and better support for the entire data science project life cycle.
What does Dave consider as the biggest advantage of using VS Code over Jupyter Notebooks?
-The biggest advantage of using VS Code over Jupyter Notebooks is that it allows managing the entire data science project life cycle more efficiently, eliminating the need to transform notebooks into Python files for production.
What is VS Code according to Dave's introduction?
-VS Code is a free, integrated development environment (IDE) made by Microsoft, available for Windows, Linux, and macOS, with features like debugging, syntax highlighting, intelligent code completion, and embedded Git support.
What are some of the must-have extensions Dave recommends for Python development in VS Code?
-Dave recommends the Python extension pack, which includes Python (essential for running Python code), IntelliCode (an AI assistant for Python development), an environment manager, Django support, a Python indent fixer, and autoDocstring.
What is the purpose of the 'Jupyter' extension in VS Code?
-The 'Jupyter' extension allows users to work with Jupyter Notebooks within VS Code, providing the benefits of both Jupyter and VS Code environments.
How does Dave demonstrate the interactivity of running Python code in VS Code?
-Dave demonstrates the interactivity by showing how to run selected lines of code in a Python file using the 'Jupyter: Send Selection to Interactive Window' feature, which allows for an interactive session similar to Jupyter Notebooks.
What is the significance of the 'Pylance' extension mentioned in the script?
-The 'Pylance' extension is a feature-rich language server for Python in VS Code that enhances the autocompletion suggestions and IntelliSense capabilities, making Python development more efficient.
How does Dave use workspace settings in VS Code to customize his projects?
-Dave uses workspace settings to override user settings specific to a project, allowing for project-specific configurations that are saved within the VS Code workspace file.
What is the 'Jupyter Send Selection to Interactive Window' setting in VS Code and how does it help in data science projects?
-The 'Jupyter Send Selection to Interactive Window' setting allows users to send selected code from a Python file to a Jupyter interactive window instead of the Python terminal, enabling an interactive way of running and testing code blocks during data science projects.
How does Dave suggest using the interactive window feature in VS Code for data transformation?
-Dave suggests using the interactive window to test and validate parts of the data transformation function by selecting and running specific lines of code to check the output, ensuring the correctness of the transformation before applying it to the entire dataset.
Outlines
Introduction to VS Code for Data Science
Dave, a data scientist, introduces the video's focus on setting up VS Code for data science projects. He explains the benefits of using VS Code over Jupyter Notebooks and JupyterLab, highlighting improved workflow and productivity. The video will cover an introduction to VS Code, personal themes and settings, essential extensions, running Python code, and using Jupyter Notebooks within VS Code. VS Code is praised for its features like multi-language support, debugging, syntax highlighting, intelligent code completion, snippets, code refactoring, and embedded Git.
Setting Up VS Code with Extensions and Themes
The paragraph details the process of setting up VS Code for data science, starting with installing the Python extension pack which includes essential tools for Python development. Dave discusses the importance of IntelliCode for AI-assisted code completion, improving speed and accuracy. He also mentions other extensions like Pylance for enhanced IntelliSense, and the Jupyter extension for integrating Jupyter Notebooks within VS Code. Additionally, he covers changing themes and icon themes to personalize the VS Code environment.
Organizing Data Science Projects in VS Code
Dave demonstrates how to organize a data science project in VS Code by creating a workspace, which allows for project-specific settings. He explains the difference between user-level and workspace-level settings and their implications for project organization. The paragraph emphasizes the convenience of workspace settings for maintaining a consistent project environment across different sessions.
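The user-versus-workspace distinction can be sketched in a few lines of Python. This is a minimal illustration, not VS Code's actual implementation: the file name `demo.code-workspace`, the setting values, and both helper names are assumptions made for the example. It builds the JSON content of a workspace file for one folder and shows a workspace setting overriding a user-level one.

```python
import json
import tempfile
from pathlib import Path


def build_workspace(folder: str, settings: dict) -> dict:
    """Return the content of a VS Code workspace file for one folder."""
    return {"folders": [{"path": folder}], "settings": settings}


def effective_settings(user: dict, workspace: dict) -> dict:
    """Merge settings so that workspace values override user values."""
    return {**user, **workspace}


# Hypothetical values, chosen only for illustration.
user_settings = {"editor.fontSize": 12, "workbench.colorTheme": "Default Dark+"}
workspace = build_workspace(".", {"editor.fontSize": 16})

# Write the workspace file to a temporary directory.
workspace_path = Path(tempfile.mkdtemp()) / "demo.code-workspace"
workspace_path.write_text(json.dumps(workspace, indent=4))

merged = effective_settings(user_settings, workspace["settings"])
print(workspace_path.name)
print(merged)
```

Only the font size is overridden here; the theme still comes from the user level, which matches the behavior described in the video.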
Combining Jupyter Notebooks and Python Files in VS Code
The paragraph explains how to work with Jupyter Notebooks and Python files in VS Code. Dave shows how to open and run a Jupyter Notebook within VS Code using the Jupyter extension. He also discusses the advantage of running Python files interactively using the 'Send Selection to Interactive Window' setting, which allows for testing and validating code sections without rerunning the entire script, thus mimicking the interactivity of Jupyter Notebooks.
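For readers who prefer editing settings as JSON rather than ticking the checkbox, the sketch below enables the setting in a settings dictionary. The setting ID used here, `jupyter.interactiveWindow.textEditor.executeSelection`, is an assumption based on recent versions of the Jupyter extension; older versions are believed to have exposed it as `jupyter.sendSelectionToInteractiveWindow`, so check the ID shown in your own Settings UI. The function name is hypothetical.

```python
import json

# Assumed setting ID; verify it in the Settings UI of your Jupyter extension version.
SEND_SELECTION_KEY = "jupyter.interactiveWindow.textEditor.executeSelection"


def enable_send_selection(settings: dict) -> dict:
    """Return a copy of the settings with Shift+Enter routed to the interactive window."""
    updated = dict(settings)
    updated[SEND_SELECTION_KEY] = True
    return updated


existing = {"editor.fontSize": 12}
updated = enable_send_selection(existing)
print(json.dumps(updated, indent=4))
```

The function copies the dictionary instead of mutating it, so other settings are left untouched and the original can still be compared against the result.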
Enhancing Productivity with Interactive Python Script Execution
Dave illustrates the productivity benefits of using VS Code for data science by interactively running sections of a Python script. He demonstrates how to validate a data transformation function step by step, emphasizing the speed and efficiency gained from this method. The paragraph highlights the seamless integration of interactive execution with the robust features of VS Code, resulting in a workflow that is both productive and enjoyable.
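The step-by-step validation described above might look like the following sketch. The column name `item_price`, the sample values, and the function name `transform_data` are assumptions for illustration; the video's actual script is not reproduced in this summary. Each intermediate expression is the kind of fragment one could select and send to the interactive window before running the whole function.

```python
import pandas as pd

# Small stand-in for the real data set (values are made up).
data = pd.DataFrame({"item_price": ["$2.39", "$3.39", "$16.98"]})


def transform_data(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with item_price converted from '$x.yz' strings to floats."""
    out = df.copy()
    out["item_price"] = (
        out["item_price"].str.replace("$", "", regex=False).astype(float)
    )
    return out


# Fragments to check one at a time in the interactive window:
step1 = data["item_price"]                       # prices still stored as text
step2 = step1.str.replace("$", "", regex=False)  # dollar sign removed, still text
step3 = step2.astype(float)                      # numeric column

data_transformed = transform_data(data)
print(data_transformed.dtypes)
```

Passing `regex=False` makes the dollar sign a literal character; as a regular expression it would mean "end of string" and might not remove anything, depending on the pandas version.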
Conclusion and Call to Action
In conclusion, Dave reflects on the significant improvement in his productivity and coding experience after transitioning from JupyterLab and Notebooks to VS Code. He encourages viewers to subscribe to his channel for more data science-related content and thanks them for watching. The paragraph serves as a summary of the benefits of using VS Code for data science and an invitation for viewers to follow for further insights.
Keywords
Data Science
VS Code
Extensions
Integrated Development Environment (IDE)
Jupyter Notebooks
Python
Debugging
Syntax Highlighting
Intelligent Code Completion
Workspace
Code Refactoring
Highlights
Introduction to setting up VS Code for data science, emphasizing its impact on workflow and productivity.
Brief overview of VS Code as a free IDE by Microsoft, available on multiple platforms.
VS Code's features such as multi-language support, debugging, syntax highlighting, and intelligent code completion.
The extensibility and customizability of VS Code, with a vast marketplace for extensions.
Productivity benefits of VS Code over Jupyter Notebooks, including project lifecycle management.
Demonstration of creating a workspace in VS Code for project-specific settings.
Importance of the Python extension pack for running Python code within VS Code.
Explanation of IntelliCode as an AI assistant for Python development in VS Code.
The convenience of extensions like Pylance for enhanced autocompletion and IntelliSense.
Jupyter extension that allows running Jupyter Notebooks within VS Code.
Customization options for themes and icon themes in VS Code to suit personal preferences.
Workspace and user settings distinction in VS Code for project-specific and global configurations.
The 'Jupyter: Send Selection to Interactive Window' setting for interactive Python code execution.
Combining the interactivity of Jupyter Notebooks with the functionality of VS Code for efficient coding.
Example of using VS Code for data transformation and validation within a Python script.
The productivity gains from using VS Code's interactive window for step-by-step data science projects.
Final thoughts on the advantages of VS Code for data science, including code readiness for production.
Transcripts
welcome everyone today i have another
exciting new video for you guys uh first
of all my name is dave i'm a data
scientist and in this video i will show
you how to set up vs code for data
science this is something that has
completely changed the game for me
coming from working with jupyter
notebooks and jupyter lab which are
awesome tools by the way but vs code has
totally changed the game for me in terms
of my workflow and my productivity so in
this video i'll show you how i've set up
my vs code and how i use it for my data
science projects right so here's what
we'll cover in this video
i'll first give a brief introduction
about vs code then i'll go over some
themes and settings that i personally
use then some must have extensions for
if you want to write python code then
i'll show you how to run python code
using vs code and lastly i will show
you how you can also run jupyter
notebooks within vs code all right
so what is vs code so for those of you
that don't know it's a free integrated
development environment or ide for short
made by microsoft
and it's available for windows linux
macos so doesn't matter what system
you're using you can install vs code
for free and get started now why do i
like vs code first of all because it has
so many features it supports many
languages it has a really good features
for debugging syntax highlighting
intelligent code completion this is a
big one
snippets code refactoring and uh
embedded git within it so that's really
convenient for version control what also
makes it really awesome is the
extensibility and the customizability so
there are many
many settings and there's a huge
marketplace with extensions to add new
languages themes debuggers etc and you
really have to know a few settings and
extensions to make it work for writing
python code and doing data science
projects and then productivity since
i've made the switch to vs code
coming from jupyter lab jupyter
notebooks it has just made me so much
more efficient as a data scientist and
then this is probably the biggest
advantage of using vs code versus just
uh jupyter notebooks or jupyter lab it's
that you can manage the entire data
science project life cycle and this also
has to do with with just productivity as
i just mentioned when you first write
your code in a jupyter notebook and then
once it's ready for production you have
to transform that notebook into a python
file and that just takes a lot of time
basically have to write your code twice
and vs code makes this so much easier
and i will show you how to do that so
first of all if you don't have vs
code you can just google it go to the
website and you can download it install
it for free alright let's now hop into
vs code and i'll show you how i've set
up mine to work on data science projects
so
i've opened up a blank vs code file
we'll start off by going to top right
corner file and opening a folder so i've
made a demo folder for this video which
i will open it has a typical data
science project
layout so it has some data
we have some code and some python code
and we have some notebooks let me close
this out for now and the next thing that
i will do once i've imported the folder
is i will save it as a workspace so i go
to file save workspace as
and
i'll just save it here as demo which is
fine what i've just done is i've saved
this file with
the imported folder as a vs code
workspace file and within this workspace
you can save settings that are
particular to this project so that could
be really convenient and another nice
way of working like this is that now
once i want to work start working on a
project again i can just open up the
workspace file and it will open up vs
code within this workspace with all my
folders attached so now we've set up the
project so now how do you get started as
i mentioned when you just downloaded vs
code it's kind of blank it's kind of
empty so we need to add some extensions
and we have to tweak some settings in
order to make it work for us so let's
hop into the extensions and you can do
this by clicking on this icon in the
left bar over here
and this will first show you a list of
all the extensions that you currently
have installed as you can see these
these are all my extensions that i'm i'm
running and there also is a search bar
where you can search for extensions so
first and foremost
what we'll start with is the python
extension pack
this is basically a must-have pack that
installs uh
six different packages that i think are
just very convenient some of them are
necessary to run python code within vs
code so on the overview over here you
basically have a description of
what's in the in the extension and
i already have this extension pack
installed so
for me it says disable uninstall but for
you when you don't have it installed
there will be an install button over
here so you can just do that install it
it's free it will take a couple of
seconds and it will install the
following extensions
to vs code so i'll quickly go over
what these are so first of all python
is essential to run python code so this
is a extension that is produced by
microsoft so this basically enables you
to run python code and here you can see
you can manage your
environments which version of python you
are using so this is a must-have so
what's uh what else is in it then we
have intellicode which
is very awesome this is and i just
explained how this is
not so good within jupyter lab and
jupyter notebooks but this is just
basically makes your life as a data
scientist easier it's an ai assistant
python development tool that
autocompletes suggest etc so you can see
an example here of what it does whenever
you start typing intellicode will prompt
some suggestions
and using the arrow keys you can toggle
through them and then just hit enter and
it will auto complete the code so this
will do two things
it will make you write code faster and
it will also make sure that you make
less errors because you don't have to
type all the
methods and the attributes
by hand and you can just auto complete
them so
that in turn will also
enable you to just work faster all right
so what else is in it there's an
environment manager
django comes uh with it if if you work
with django that's really uh convenient
uh this is the extension to fix python
indents uh automatically also very
convenient and then you have autodocstring
which is also a cool tool this uh
basically what this does if you type
three quotes it will insert a docstring
template which is very important to just
comment your functions and your codes
for yourself but also when you're
working with other people alright so
that's the python extension pack
let me take a look at my notes yes so
the next one that we're looking into is
pylance so let me just look it up
pylance same thing here i've already
have it installed
you can install it over here basically
what pylance is it's a feature rich
language server for python in vs code
and it basically it works together with
intellisense so as it says here pylance
has the ability to supercharge your
python intellisense basically this makes
the autocompletion suggestions etc even
better
so another really nice extension to have
and then lastly jupyter and
this is a wrong tab
this is
a really cool extension that basically
gives you the best of both worlds so i
will show you later how this works but
basically this allows us to use all the
uh the benefits from jupyter and jupyter
notebooks within vs code so jupyter
over here install it as well those are
like the the essentials and there's a
lot more
codesnap is also a pretty
funny tool that you can use to
make pretty
code export so you can highlight a
section of code and then export it to to
a jpeg for example so i sometimes use
this for presentations pretty cool
path intellisense you can look into
basically makes working with paths
easier so when you type double quotes
you can look through your paths and
easily look up data data files for
example but yeah that's just about it
about the extensions oh no wait one more
thing you can also look for themes and
icon themes
within the
marketplace so if you want to tweak some
settings and adjust the look and feel of
vs code you can do that here so for
example i know that atom has a has a
theme for example so
yeah you can look that up check if it
fits fits your vibe
how you do that once you've installed
the theme you can go
to the settings in the
lower left corner here and by clicking
on the gear icon and then you can hit
color theme and this will show you all
the themes so i can basically just
scroll through those and you can see how
it how it will change and when you
install a theme from the marketplace it
will show up here so let me just i use
the dark plus default dark this is the
one i like so this is how you switch up
your themes and another cool thing
is you can change your icon
theme if i click here file icon theme i
think the
you can disable them and i think this is
like the default or maybe this one these
are from vs code itself so you have
minimal and seti and you can see on
the left on the left over here what what
it does so
um i use the material icon theme and why
i do that is uh what i really like about
this theme is that it creates different
folders for
different
depending on the name of the folder so
here you can see data has a little
database icon docs has a doc icon in a
different color and the models has a
different
different color and a different icon so
and the source is in green with the code
icon next to it so basically this allows
me to really easily look up folders that
i want to check so not just by name but
also by color and it's i think it just
adds a little nice touch and now there's
one uh more thing i want to show you
about the settings um so if i go to
settings over here now first of all
there are a ton of settings that you can
tweak from font sizes to
whatever you can
look through all of these i have almost
everything just on default i've also
never really looked in
too much of it because i like
the way it comes out of the box but
there's one important thing
to note here and that is that you have a
user level and a workspace level and
basically how this works is
on the user level everything that you
tweak will be saved within vs code and
then once you open up a new file and you
open up a new project by default it will
look at the user level uh you can
basically compare this to just your your
settings in any uh program you just
change them and when you open up a new
instance of the program your settings
are still there but then a really
convenient thing is that you also have
workspace settings and workspace
settings override your user settings
and they are tied to the workspace so as
i showed in the beginning of this video
i imported this folder and then i saved
this this project to this workspace now
this workspace contains settings and for
example if i want to increase the font
size over here so let me just check i
can if i click here and here you won't
see any difference but here i will make
adjustments on a user level and here i
will make adjustment to just the
workspace level that is a distinction
that you have to be aware of all right
so and there's one setting
that i want to show you and
this is really cool and we'll check that
out in a bit and that is the jupyter
send selection to interactive window and
i will do that on the user level and i
will look for jupyter send
so jupyter send selection to interactive
window we want to check this button over
here and what this does as it describes
here when pressing shift enter send
selected code in a python file to the
jupyter interactive window as opposed to
python terminal and i will show you
later in this video what this will do
but this is awesome all right now that
we've set up vs code with the
settings and the extensions i'll now
show you why it's so awesome and give
you a few examples
so let me first start by
opening up a notebook and by the way
this is just a notebook some notebooks
that i downloaded from this github
repository over here it's just some
basic pandas exercises
and what i can do is let me just open it
up and and here you can see that we can
just open up a jupyter notebook and also
just run it just like we're used to so
this is really awesome and this works
because we have installed the jupyter
extension and now we can basically just
work like any other notebook in the top
right corner over here we can select our
environment so i'm using anaconda and my
base environment and this is really
awesome so we can just um
work on jupyter notebooks here and let
me just open this up in the finder it's
just a notebook file and what we can do
if i just open this up and open like a
regular jupyter session you can see this
this is basically just the same as you
would normally work but now we're doing
it in vs code so that is how you
work with jupyter notebooks within vs
code and now i'll show you how to work
with python files and something that has
completely changed the game for me in
terms of my productivity so let me close
this and i'll open up a quick python
file here very basic file we import
pandas we load a data frame and we print
the data so the main difference between
working with jupyter notebooks and
working with python files is that in
jupyter notebooks you run code cell by
cell which is very convenient you can
use breakpoints and you can basically
for for data science projects you can
very easily manage a project step by
step so you first load the data you
check it you do some explorations you
tweak some things and this is really
convenient because in a data science
project you constantly go back and forth
back and forth so you load the data you
do some tweaks visualize something
create a function back and forth and
iterate until you have your desired
output so that's what makes a jupyter
notebook very convenient for that and if
you compare this to running just python
file for example that we have over here
we can do run python file is it will run
everything
in one go so it will import the library
it will load the data and then it will
print the head of the data in this case
but the thing is say for example you're
working with quite a big data set you do
some transformations that maybe take a
couple of seconds each
and then you've noticed that all the way
at the end your
visualization doesn't really work you
mess it up
now if you were working in a python file
you would have to adjust your code and
then rerun that entire python file and
do all those transformations again so
load the data etc and this can take a
couple of seconds sometimes even minutes
every time that you run the file and a
jupyter notebook counters this by
running it in code blocks but now by
using vs code we can get the best of
both worlds and i'll show you how to do
this so remember the setting
that we tweaked within the settings is
jupyter
send
so it says here when pressing shift
enter
send the selected code in python file to
a jupyter interactive window as opposed
to the python terminal what this
basically means is that we can now
within the python file as you can see
this is not a notebook but this is a
python file we can select a line of code
and then hit shift and enter
and vs code will fire up a interactive
jupyter session so this is basically the
same thing that's running in the
background whenever you're running a
jupyter notebook and it will store the
lines of code that you've run within
memory so
you can see for example here the
variables no there's nothing over here
but then once we load the data
so i can either select the line
or i can just have my cursor on one of
the lines and hit shift enter
what you can see it will run this line
of code and then it will store it into
the variables so what this basically
means is that we can write our python
code but run it in an interactive
way just like we would do in jupyter in
a jupyter notebook all right and i'll
show you why this is so awesome and how
you can use this in like a regular data
science project so i have a little
script here that basically reads some
data then there is a function
to transform some of the data and then
to store it so this is like the the
typical first step within a data science
project i will set my cursor on the
first line and hit shift enter to fire
up an interactive window that's
completed so then i go to the second
line and import the data and what i can
then do i can just select the data and
then run it so this is so convenient for
working on a python script is i can just
use
my arrows and then select
certain variables or select whole lines
to run the code so if i have my cursor
just on this line or i select the whole
line i can run it and then what i can do
i can just move my cursor forward then
select just data and then run it and it
will show me the output and this just
makes it so fast to to work through
through a code and you also don't have
to
um for example in a notebook insert a
line below then type data there and then
check the output you can just run this
very interactively and it's yeah it's
just very a very natural way of writing
code
also you have all the
tools
in the toolbar here that you
also have in the jupyter notebook so for
example i can clear everything up let me
show you another cool thing so i have a
data transformation function here that
basically takes the data and let me just
show what data is
and it takes the item price and it
basically turns it into a float so as
you can see it has a it has a dollar
sign here and that makes it a an object
within pandas so we can't do any uh
calculations with it so this would be
like a typical data transformation and
i've
defined this this function and then what
i can do here is i can
call this function put the data variable in
it and then we have our data
transformed but what you would typically
do is you write this function
line by line and in between you test a
few things so what i can now do is
instead of just
running this whole function i can just
go to
this particular section highlight it
and then check whether the output is
correct so i can start off with the item
price
and as you can see this is just the item
price from
the data data frame and this is an
object and then i can use the string
dot replace method so i'll select
this over here and i can check
all right this works the dollar sign is
now
gone but it's still an object and then i
can say okay but now we want this as a
float so we run
we add astype float to the end run it
and boom now you can see dollar sign is
removed and pandas recognized it is
recognizes it as a float alright so now
i've basically validated
my function i'll select it i'll run it
the function is now defined so i can see
this is a function what i can then do is
i run the line over here and then now my
data transformed is a new data frame
which has the item price as a float so
yeah this has completely changed the
game for me in terms of my my
productivity and
once you get used to this way of working
it just becomes so fast you can just
hop through your code using the arrow
keys and then make selections by
using command or control and shift and
highlighting certain parts or by using
alt or option for example then basically
you have the best of both worlds so you
have the interactivity of going back and
forth
of jupyter notebooks but you also have
the added functionality of vs code in
the extensions to write in a python file
uh with all the added features that we
discussed like the auto completion uh
and the suggestions to basically speed
up your coding make less errors etc and
you combine those and it will just
really improve your workflow it has at
least done uh for me as i already
mentioned the best thing is that the
code that you're writing is also ready
for production because it's in in a
python file and not in a jupyter
notebook so you don't have to do that
transformation so that's what i wanted
to show you in today's video this is how
i use and set up vs code for my data
science project as i said it has
completely changed the game for me
coming from jupyter lab jupyter notebook
switching to the workflow that i just
explained over here i'm just way more
productive and can write code
way faster it's also way more fun i
think so
yeah um i hope this video helped you out
if it did i would really appreciate it
if you like this video subscribe to the
channel
i'll be sharing more videos related to
data science basically whenever
i encounter something in my work
that i think can help other people i
will try to create a video about it
so yeah if that's something you're
interested in definitely subscribe
thanks for watching see you next time