PyAutoGUI - Locate anything on your screen | Simple Pyautogui project

Coding 101 with Steve
3 Mar 2022 | 12:41

Summary

TLDR: In this tutorial, the creator demonstrates what he calls the most important function of the PyAutoGUI package, which automates mouse and keyboard tasks on a computer. Using functions such as 'locateOnScreen', 'center', 'moveTo', and 'locateCenterOnScreen', a script can find an element on the screen from a saved screenshot, move the mouse to it, click it, and press hotkeys. As an example, the video automates subscribing to a YouTube channel. The creator also shows how lowering the 'confidence' value lets image matching tolerate small pixel mismatches. Viewers are encouraged to follow along and try automating the task themselves.

Takeaways

  • 💻 The video introduces the PyAutoGUI package for Python, emphasizing its usefulness in automation projects.
  • 📦 The 'locateOnScreen' function finds the location of an image on the screen, returning its coordinates and size.
  • 📸 To use 'locateOnScreen', capture a screenshot of the desired area (e.g., with the Snipping Tool) and pass the image to the function.
  • 🖱️ The 'center' function can return the center coordinates of the located image, useful for precise mouse movements.
  • 🎯 The 'moveTo' function moves the mouse to the specified coordinates on the screen.
  • 🔗 The 'locateCenterOnScreen' function combines 'locateOnScreen' and 'center' for more efficient workflows.
  • 🔑 To open a new browser tab, the 'hotkey' function is used, simulating keyboard shortcuts (e.g., Ctrl+T).
  • 📝 The 'write' function types text (like URLs or search queries) into the targeted location.
  • 🤖 The 'confidence' attribute (a keyword argument of the locate functions) sets how closely the screen must match the image, so lowering it helps with pixel mismatches.
  • 📽️ The script automates subscribing to a YouTube channel by combining several PyAutoGUI functions, including image recognition and keyboard/mouse interactions.

Q & A

  • What is the most important function in the pyautogui package discussed in the video?

    - The most important function in the pyautogui package discussed in the video is 'locateOnScreen', which finds the location (coordinates) of any part of the screen, given an image of that part as a parameter.

  • How does the 'locateOnScreen' function work?

    - The 'locateOnScreen' function takes an image file as a parameter and returns the full location of that image on the screen: its left and top coordinates (x and y), followed by its width and height.
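
A minimal sketch of that call, not the presenter's exact code. The helper name `locate_box`, the file name `edit_menu.png`, and the optional `gui` parameter (which lets a stand-in object replace the real package) are illustrative assumptions. Only `locateOnScreen` and `ImageNotFoundException` come from PyAutoGUI itself.

```python
def locate_box(image_path, gui=None):
    """Return (left, top, width, height) for image_path, or None if not found."""
    if gui is None:
        import pyautogui as gui  # imported lazily; needs a desktop session

    # Newer PyAutoGUI releases raise ImageNotFoundException on a miss,
    # older ones return None, so both cases are handled here.
    # An empty tuple in `except` catches nothing (used for stand-in objects).
    not_found = getattr(gui, "ImageNotFoundException", ())
    try:
        box = gui.locateOnScreen(image_path)
    except not_found:
        return None
    if box is None:
        return None

    # The result unpacks in the order the video reads it out:
    # distance from the left, distance from the top, then width and height.
    left, top, width, height = box
    return (left, top, width, height)
```

Calling `locate_box("edit_menu.png")` on a desktop where that screenshot is visible would give a result like the one printed in the video, where left is 116 and top is 6.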

  • What is the purpose of the 'center' function in pyautogui?

    - The 'center' function takes the location returned by 'locateOnScreen' and returns the coordinates of its center point, which is the point the mouse is then moved to.
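
The arithmetic behind 'center' can be sketched in plain Python. This is an illustration of the idea rather than the library's source; the helper name `box_center` and the 40 by 20 pixel size are assumptions, since the video only reads out the left (116) and top (6) values.

```python
def box_center(box):
    """Center of a (left, top, width, height) box, as an (x, y) tuple."""
    left, top, width, height = box
    # Half the width to the right of the left edge,
    # half the height below the top edge.
    return (left + width // 2, top + height // 2)


# Left and top are the values from the video; width and height are assumed.
edit_menu_center = box_center((116, 6, 40, 20))
```

In a real script the equivalent call would be `pyautogui.center(box)` on the value returned by `locateOnScreen`.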

  • How can you move the mouse to a specific location using pyautogui?

    - To move the mouse to a specific location, use the 'moveTo' function in pyautogui, passing the x and y coordinates as arguments.
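
As a small sketch, the move can be wrapped in a helper. The name `glide_to`, the half-second default, and the injectable `gui` parameter are assumptions for illustration; `moveTo` and its `duration` argument belong to PyAutoGUI, and the video itself calls `moveTo` without a duration.

```python
def glide_to(x, y, seconds=0.5, gui=None):
    """Move the mouse pointer to (x, y), taking `seconds` to get there."""
    if gui is None:
        import pyautogui as gui  # imported lazily; needs a desktop session

    # duration=0 would jump instantly; a short duration makes the
    # movement visible, which helps when checking a script by eye.
    gui.moveTo(x, y, duration=seconds)
    return (x, y)
```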

  • What is the 'locateCenterOnScreen' function and how does it simplify the automation process?

    - The 'locateCenterOnScreen' function combines 'locateOnScreen' and 'center'. It takes the image file name and returns the center of that image on the screen in one call, so locating an element no longer takes two separate operations.
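
A sketch of the one-call version followed by the mouse move, under the same assumptions as before: the helper name `move_to_image` and the `gui` parameter are hypothetical, while `locateCenterOnScreen` and `moveTo` are the PyAutoGUI calls named in the video.

```python
def move_to_image(image_path, gui=None):
    """Move the mouse to the center of image_path; return True on success."""
    if gui is None:
        import pyautogui as gui  # imported lazily; needs a desktop session

    # A miss is reported either as None or as ImageNotFoundException,
    # depending on the installed PyAutoGUI release.
    not_found = getattr(gui, "ImageNotFoundException", ())
    try:
        point = gui.locateCenterOnScreen(image_path)
    except not_found:
        point = None
    if point is None:
        return False

    x, y = point
    gui.moveTo(x, y)
    return True
```

Returning a boolean rather than raising keeps the calling script free to decide whether to retry, for example with a lower confidence value.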

  • What is the role of the 'hotkey' function in the automation project described in the video?

    - The 'hotkey' function simulates keyboard shortcuts. In the video, it presses 'Ctrl+T' to open a new tab, and it is also used to press 'Enter' to load the typed URL.
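
The new-tab step might look like the sketch below. The helper name `open_site_in_new_tab` and the `gui` parameter are assumptions. The video sends Enter through `hotkey`; this sketch uses PyAutoGUI's `press`, which is the usual call for a single key.

```python
def open_site_in_new_tab(url, gui=None):
    """Open a new browser tab and load url using only the keyboard."""
    if gui is None:
        import pyautogui as gui  # imported lazily; needs a desktop session

    # Keystrokes go to whichever window has keyboard focus, so the
    # browser (not the editor or terminal) must be focused first.
    gui.hotkey("ctrl", "t")  # shortcut for a new tab
    gui.write(url)           # type the address
    gui.press("enter")       # load the page
```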

  • How does the 'prompt' function assist in taking user inputs during automation?

    - The 'prompt' function shows a pop-up with a text box and returns whatever the user types. In the video it collects the name of the YouTube channel to subscribe to, so the name does not have to be hard-coded.
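
A sketch of the pop-up step, assuming a hypothetical helper `ask_channel_name` and an illustrative message string. In `prompt`, `text` is the message shown inside the dialog and `title` is the window heading; the typed value comes back as the return value, and cancelling the dialog gives `None`.

```python
def ask_channel_name(gui=None):
    """Ask for a channel name in a pop-up; return None if nothing was given."""
    if gui is None:
        import pyautogui as gui  # imported lazily; needs a desktop session

    answer = gui.prompt(
        text="Which channel should the script open?",  # illustrative wording
        title="Enter the channel name",                # heading used in the video
    )
    if answer is None:  # the dialog was cancelled
        return None
    answer = answer.strip()
    return answer or None  # treat an empty box like a cancel
```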

  • What is the purpose of the 'confidence' attribute when locating images on the screen?

    - The 'confidence' attribute sets the threshold for image matching: 0.9 asks for a 90 percent match and 1 for a 100 percent match. A higher value demands a closer match, while a lower value tolerates slight differences between the saved image and what is on the screen.
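
The trial-and-error step from the video (0.9 failed, 0.8 worked) can be sketched as a loop over candidate values. The helper name `locate_center_with_fallback`, the default tuple of confidences, and the `gui` parameter are assumptions; the video adjusts the value by hand rather than in a loop.

```python
def locate_center_with_fallback(image_path, confidences=(0.9, 0.8), gui=None):
    """Try each confidence in order; return ((x, y), confidence) or (None, None)."""
    if gui is None:
        import pyautogui as gui  # imported lazily; needs a desktop session

    not_found = getattr(gui, "ImageNotFoundException", ())
    for confidence in confidences:
        try:
            point = gui.locateCenterOnScreen(image_path, confidence=confidence)
        except not_found:
            point = None
        if point is not None:
            # Report which threshold matched so it can be reused later.
            return tuple(point), confidence
    return None, None
```

Trying the strictest value first keeps the risk of a false match low; very low confidence values can match the wrong part of the screen.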

  • Why is it necessary to install the 'opencv-python' package when using the 'confidence' attribute?

    - PyAutoGUI's image matching supports the 'confidence' attribute only when OpenCV is installed, so the video installs it with 'pip install opencv-python' before running the script.
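
A script can check for OpenCV before passing 'confidence'. The sketch below uses only the standard library; the helper names `opencv_available` and `locate_options` are assumptions, and `cv2` is the module name that the opencv-python package installs.

```python
import importlib.util


def opencv_available():
    """True if the cv2 module (installed by opencv-python) can be imported."""
    return importlib.util.find_spec("cv2") is not None


def locate_options(confidence=0.9):
    """Keyword arguments for the locate functions, omitting confidence without OpenCV."""
    if opencv_available():
        return {"confidence": confidence}
    # Without OpenCV, passing confidence is expected to fail,
    # so fall back to exact matching.
    return {}
```

The result can be unpacked into a call, as in `pyautogui.locateCenterOnScreen("subscribe.png", **locate_options(0.8))`, where `subscribe.png` is a hypothetical file name.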

  • What is the final project demonstrated in the video using pyautogui?

    - The final project demonstrated in the video is an automated process to subscribe to a YouTube channel. It involves opening a new browser tab, navigating to YouTube, searching for the channel, opening it, and clicking the subscribe button using pyautogui functions.

Outlines

00:00

💻 Introduction to PyAutoGUI

The speaker begins by introducing the PyAutoGUI package for Python, emphasizing its importance in automation projects. They set up a new Python file in VS Code and install the package using 'pip install pyautogui'. The key function, 'locateOnScreen', is introduced; it finds the location of an image on the screen. As an example, the speaker captures the 'Edit' menu with the Snipping Tool, saves it as a PNG file, and passes the file name to the function to get its coordinates. The 'center' function, which returns the center coordinates of a located image, is explained next, followed by the 'moveTo' function, which moves the mouse cursor to a specified location. Finally, 'locateCenterOnScreen' is shown to do the work of 'locateOnScreen' and 'center' in a single call.

05:01

🔍 Advanced Automation with PyAutoGUI

The speaker continues by showing how to automate browser actions using PyAutoGUI. They use the 'hotkey' function to simulate key presses and the 'write' function to type text into a field. A problem arises: with the editor in front, the keystrokes are typed into the editor's terminal instead of the browser. This leads to the introduction of the 'prompt' function, which takes input from the user through a pop-up and is used here to get the name of the channel to subscribe to. The speaker also explains the 'confidence' attribute of the 'locateCenterOnScreen' function, which requires installing 'opencv-python'. The script then locates the YouTube search box, clicks it, types the channel name, and presses Enter.

10:06

🎯 Completing Automation with Confidence Adjustment

The final segment covers the completion of the YouTube subscription automation. The speaker captures images of the channel logo and uses it, together with the subscribe button image captured earlier, to locate and click both elements in the script, with a one-second sleep between steps. The first run fails to find the logo at a confidence of 90 percent because of a pixel mismatch. After the confidence is reduced to 80 percent, the automation opens YouTube, searches for the channel, opens it, and subscribes without manual intervention. The speaker concludes by encouraging viewers to follow along, subscribe to their channel, and provide feedback in the comments.
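
The whole flow might be sketched as follows. This is a reconstruction from the transcript, not the presenter's script: the function names, the `images` dictionary and its keys, the returned status strings, and the `gui` and `sleep` parameters are all assumptions made so the sketch can be read and tested on its own.

```python
import time


def click_image(gui, image_path, confidence):
    """Locate image_path on screen and click its center; False if not found."""
    not_found = getattr(gui, "ImageNotFoundException", ())
    try:
        point = gui.locateCenterOnScreen(image_path, confidence=confidence)
    except not_found:
        point = None
    if point is None:
        return False
    x, y = point
    gui.moveTo(x, y)
    gui.click()
    return True


def subscribe(images, confidence=0.8, pause=1.0, gui=None, sleep=time.sleep):
    """Run the steps described in the video; return a short status string."""
    if gui is None:
        import pyautogui as gui  # imported lazily; needs a desktop session

    channel = gui.prompt(text="Channel to open", title="Enter the channel name")
    if not channel:
        return "cancelled"

    gui.click()              # click after the pop-up closes, as in the video
    gui.hotkey("ctrl", "t")  # new browser tab
    gui.write("youtube.com")
    gui.press("enter")
    sleep(pause)             # give the page time to load

    if not click_image(gui, images["search"], confidence):
        return "search box not found"
    gui.write(channel)
    gui.press("enter")
    sleep(pause)

    # Open the channel from the results, then press its subscribe button.
    for step in ("logo", "subscribe"):
        if not click_image(gui, images[step], confidence):
            return step + " not found"
        sleep(pause)
    return "done"
```

A call such as `subscribe({"search": "search.png", "logo": "logo.png", "subscribe": "subscribe.png"})` would use three screenshots saved with the Snipping Tool; the file names here are hypothetical. The one-second pause mirrors the sleep added in the video and may need to be longer on a slow connection.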

Keywords

💡PyAutoGUI

PyAutoGUI is a Python package designed for automating computer GUIs (Graphical User Interfaces). In the video, the presenter uses this package to demonstrate how to automate tasks on a computer. It's a core tool for the tutorial, as it allows the creation of scripts that can interact with the graphical interface of other programs, such as opening menus, clicking buttons, and typing text.

💡locateOnScreen

The 'locateOnScreen' function is a crucial feature of the PyAutoGUI package. It finds the location of a specific element on the screen by matching it against an image file, and returns the element's left and top coordinates along with its width and height. In the video, this function is used to identify the position of the 'Edit' menu, which is then used to move the mouse cursor to that location, showcasing the automation of a common GUI interaction.

💡Center

The 'center' function is another utility within the PyAutoGUI package. It calculates the center coordinates of an element once its location has been identified using 'locateOnScreen'. This is particularly useful for mouse movement automation, as it allows the script to move the cursor to the middle of the element rather than its top-left corner. In the script, this function is applied to ensure accurate mouse positioning.

💡moveTo

The 'moveTo' function moves the mouse cursor to a specific set of coordinates. In the video, after the center coordinates of the 'Edit' menu are determined using the 'center' function, they are stored in a variable and passed to 'moveTo', which moves the mouse to that point and so automates navigation to the menu.

💡locateCenterOnScreen

This function combines the 'locateOnScreen' and 'center' functions into a single operation. Given an image file name, it directly returns the center coordinates of that image on the screen. The video's script uses this function to obtain the center coordinates in one step.

💡Hotkey

A 'hotkey' is a keyboard shortcut that can be used to perform certain actions quickly. In the video, the presenter uses the 'hotkey' function to simulate pressing 'Ctrl + T' to open a new tab in a browser, which is an example of automating keyboard input using the PyAutoGUI package.

💡Write Function

The 'write' function simulates typing text into a field or application. In the video, it types the YouTube URL into the browser's address bar and later the channel name into the YouTube search box, demonstrating how text input can be automated using the PyAutoGUI package.

💡Prompt

The 'prompt' function is used to create a pop-up that can take user input. In the video, it's utilized to get the name of the YouTube channel to subscribe to from the user, showing how the package can interact with the user to gather information needed for the automation process.

💡Confidence

The 'confidence' attribute (a keyword argument) of the 'locateCenterOnScreen' function sets how closely the screen must match the provided image for the match to count. A higher confidence level means the function will only return a match if the screen closely resembles the image. In the video, the value starts at 0.9 (90 percent) and is reduced to 0.8 after the channel logo is not found, which shows that the value may need adjusting in either direction.

💡opencv-python

opencv-python is the Python package for OpenCV, a library used for computer vision tasks, including image recognition. In the video, it's mentioned as a requirement for using the 'confidence' attribute. The presenter installs it with 'pip install opencv-python' so that PyAutoGUI can match images at less than 100 percent similarity, which is crucial for reliably locating elements on the screen.

💡Automation Project

An 'automation project' refers to a task or series of tasks that are automated using software. In the video, the presenter creates a simple automation project that involves subscribing to a YouTube channel automatically. This project serves as a practical example of how the various functions of the PyAutoGUI package can be combined to achieve a complex automation goal.

Highlights

Introduction to the most important function of the PyAutoGUI package.

Demonstration of creating a new Python file in VS Code for the project.

Explanation of installing the PyAutoGUI package using pip.

Importing the PyAutoGUI package in the Python script.

Description of the 'locateOnScreen' function to find screen coordinates.

Tutorial on capturing an image of the 'Edit' menu using the Snipping Tool.

Using the 'locateOnScreen' function to find the location of the 'Edit' menu.

Introduction to the 'center' function to get the center coordinates of an image.

Explanation of the 'moveTo' function to move the mouse cursor.

Combining 'locateOnScreen' and 'center' into the 'locateCenterOnScreen' function.

Creating a simple project to automate subscribing to a YouTube channel.

Using the 'hotkey' function to press 'Ctrl+T' for opening a new tab.

Utilizing the 'write' function to enter the YouTube URL into the browser's address bar.

Using the 'prompt' function to take the channel name through a pop-up, after the first run typed its keystrokes into the editor's terminal instead of the browser.

Capturing the YouTube search box and subscribe button using the Snipping Tool.

Using 'locateCenterOnScreen' to find the search box and subscribe button coordinates.

Automating the process of clicking, typing the channel name, and pressing enter.

Introduction to the 'confidence' attribute for more accurate image recognition.

Installation of the 'opencv-python' package, which the 'confidence' attribute requires.

Adjusting the 'confidence' level to handle pixel mismatch issues.

Final demonstration of the automated YouTube channel subscription process.

Encouragement for viewers to follow the tutorial and subscribe to the presenter's channel.

Transcripts

play00:00

hello coders

play00:02

in this video i am going to show you the

play00:05

most important function of the pyautogui

play00:08

package

play00:09

this function is so important that you

play00:11

will use it in almost every automation

play00:14

project that you create and i will also

play00:16

be creating a simple project using this

play00:19

function

play00:20

so let's get started

play00:27

i am inside my vs code i'll create a new

play00:31

file

play00:32

set the language to python

play00:35

and save the file on my desktop

play00:39

now i'll come here

play00:41

those who have not seen my previous

play00:42

video let me tell you i am using a

play00:45

python package named pyautogui which

play00:48

is used to automate computer guis

play00:52

so you will have to install it like this

play00:54

type pip install pyautogui

play01:01

i have already installed this package

play01:04

so now coming to the code

play01:06

i'll import the package

play01:09

now

play01:10

if you want to get the location or

play01:12

coordinates of any part on the screen

play01:15

then there is a function named locate on

play01:18

screen

play01:21

this function takes an image as a

play01:23

parameter and returns the full location

play01:26

of that image on the screen

play01:28

for example

play01:30

if i want to get the location of this

play01:32

edit menu

play01:34

then what i will do is

play01:36

i'll search for snipping tool on my

play01:38

computer like this open it

play01:42

click on new

play01:44

and then capture this edit menu area

play01:49

and save it as a png file

play01:51

or any image type file

play01:55

now i will pass the name of this image

play01:57

file over here

play02:00

let me print the response

play02:03

i'll come here

play02:05

and run the code

play02:07

you can see it has printed the full

play02:09

location it says from left it's at 116

play02:14

and from top it's at 6.

play02:16

these are the coordinates x and y

play02:19

where x is 116 and y is 6

play02:23

then it gives the width and height of

play02:25

that image

play02:27

but we just require the coordinates that

play02:29

is

play02:30

the x and y value so in order to do that

play02:33

there is another function named center

play02:37

let me show you

play02:38

i'll pass the response to the center

play02:40

function

play02:42

and it will return to me the center

play02:44

coordinates of that part

play02:47

let me run this

play02:49

and there you go it has given me the

play02:51

exact center coordinates

play02:55

now i'll move to that location using the

play02:58

move to function

play03:03

but before that i'll have to take the

play03:04

coordinates inside a variable like this

play03:08

and then pass it to the move to function

play03:11

by the way move to function i have

play03:13

explained it in detail and there are

play03:15

more functions that i have explained in

play03:16

my previous video so do check that out

play03:19

what move to function does is whatever

play03:22

the coordinates that you pass to this

play03:23

function it is going to move your mouse

play03:26

to that particular location or to that

play03:28

particular coordinate that's it

play03:30

okay

play03:31

let's run this

play03:32

did you see that my mouse automatically

play03:35

came to this edit menu

play03:37

so

play03:38

you had to perform two operations in

play03:40

order to get the location of a part on

play03:42

the screen but you can also do this

play03:45

using one operation that is by using a

play03:47

function named locate center on screen

play03:52

this function is the combination of

play03:54

locate on screen and center functions

play03:57

here you need to pass the image file

play03:59

name and it will return the center

play04:00

location of that image i'll just print

play04:03

the response

play04:05

and now run the code

play04:08

you got the same result let me just

play04:10

write the comments for you just to

play04:12

understand

play04:15

so you can use only this function and

play04:18

your work is done so coders

play04:20

this was the most important function of

play04:22

the pyautogui package which you will

play04:25

require in almost every automation

play04:28

project

play04:29

now is the time to create a simple

play04:31

project using this function and other

play04:33

functions that i showed you in my

play04:35

previous video

play04:37

the project that i'll create is

play04:38

subscribing a youtube channel

play04:40

automatically i have written down the

play04:42

steps already so the first step is to

play04:45

open a new tab on the browser see i know

play04:48

that shortcut for new tab is control

play04:51

plus t so to press ctrl and t i will use

play04:55

the hotkey function

play04:57

and i'll pass

play04:58

ctrl

play05:00

and

play05:01

t

play05:03

that's it the next step is to search

play05:05

youtube.com

play05:07

to do that i'll use the write function

play05:10

and pass the youtube url

play05:13

so this will write the url in the search

play05:15

bar

play05:16

and then i'll press enter using the

play05:18

hotkey function now we'll just run this

play05:20

much and see

play05:22

but there is one problem you'll see it

play05:24

when i run the code

play05:25

let's see i'll keep the browser open

play05:27

behind the editor like this

play05:30

i'm running the code and here is the

play05:33

problem you can see that it typed

play05:35

control plus t and the url inside the

play05:38

terminal itself

play05:39

so we need to find a way to minimize our

play05:42

editor and run the code in the

play05:44

background

play05:45

in order to do this there are some

play05:47

functions in the pyautogui package which can

play05:50

take inputs from you we will use one of

play05:52

those functions

play05:54

so the function i will use is

play05:57

prompt

play05:59

this function shows a pop-up and a text

play06:01

box to take inputs

play06:03

the attribute i'll pass is text

play06:08

which will store whatever i'll type in

play06:10

the text box

play06:12

and title

play06:14

to give a heading for the popup i'll

play06:16

type

play06:17

enter the channel name

play06:21

and this will return the text that i

play06:22

will enter so i'll store it in this

play06:24

variable

play06:26

i'll call it

play06:27

channel underscore name

play06:31

you can give whatever name you want i'll

play06:33

write the comment

play06:37

so let me show you the channel that i

play06:38

want to subscribe

play06:41

by the way i am not doing any paid

play06:43

promotions

play06:44

i've been following this channel

play06:47

so the channel name is

play06:49

the english scholar online camp

play06:53

if you want to learn grammar and

play06:54

punctuation

play06:56

improve your vocabulary get some useful

play06:58

tips and free resources

play07:00

then you can go ahead and check this

play07:02

channel and subscribe it

play07:05

one thing i love is that most of the

play07:07

videos on this channel are short ones

play07:10

okay so now i'll go back to the code

play07:13

here i'll print the channel name now

play07:16

i'll run the code again

play07:18

and here the popup is

play07:20

it says enter the channel name

play07:23

i'll type

play07:24

the english

play07:25

scholar

play07:27

online camp

play07:29

ok

play07:32

did you see that

play07:33

my code was running in the background

play07:35

and we were able to do what we wanted

play07:38

now the next step is to type the entered

play07:40

channel name in the search box

play07:43

so to locate the search box

play07:45

i'll capture this search

play07:47

using the snipping tool

play07:51

and i'll save it

play07:53

i'll also capture the subscribe button

play07:55

oh but for that first i need to

play07:58

unsubscribe

play07:59

okay now you can see the subscribe

play08:01

button

play08:02

now i will capture it using the snipping

play08:05

tool again

play08:07

and i'll save it

play08:12

going back to the code

play08:14

so use our important function again

play08:17

to locate the search box and also locate

play08:20

the subscribe button on the screen

play08:22

locate center on screen

play08:26

the file name

play08:33

once i locate it i'll move to that

play08:35

location so i'll use move to function

play08:40

then i need to click on the search box

play08:43

so i'll use the click method

play08:45

after clicking i need to type the

play08:47

channel name

play08:48

so i'll use the write function and then

play08:51

pass the channel name variable to that

play08:53

function i'll just write the comment

play08:59

okay then again to press enter

play09:02

pyautogui

play09:04

dot

play09:05

hotkey

play09:07

enter i forgot one thing i also need to

play09:09

click after i press ok on the pop-up

play09:12

so i will have to add a click function

play09:14

there also

play09:16

here okay

play09:18

now i want to show you one attribute for

play09:20

the locate center on screen function

play09:22

that is confidence

play09:24

i'll type confidence

play09:27

equal to

play09:29

0.9

play09:30

0.9 means 90 percent so 1 will be 100

play09:34

percent this means that even if there is

play09:38

a 90 percent match

play09:39

then still it should consider

play09:42

and to use this confidence attribute i

play09:44

need to install a package called opencv

play09:47

python

play09:48

so i'll type pip install

play09:51

opencv-python

play09:56

done

play09:57

now let's run this code

play10:05

did you see that i didn't touch my mouse

play10:08

or keyboard it automatically opened

play10:10

youtube and searched for the channel

play10:12

so now

play10:14

the next step would be to open the

play10:16

channel

play10:17

to do that i need to click on the

play10:18

channel logo

play10:20

so now you know what i'm going to do

play10:22

open snipping tool

play10:27

capture the logo

play10:30

and save it

play10:33

now come back to the code

play10:35

i'll just copy paste this over here

play10:38

and

play10:39

change the file name

play10:40

that's it

play10:41

i'll put the necessary sleep of one

play10:44

second for the next and final step

play10:48

again i'll copy paste

play10:51

change the file name

play10:54

this image is the subscribe button that

play10:55

we captured earlier

play10:57

and that's it

play10:59

now let's run it

play11:06

oops

play11:08

it did not search the logo this might

play11:10

happen sometimes due to pixel mismatch so

play11:13

here the confidence was 90 percent still

play11:16

it did not work so now i'll reduce it to

play11:19

80 percent and try again

play11:21

in such cases you'll have to decrease or

play11:23

increase the confidence and keep trying

play11:26

that's why the confidence attribute is

play11:28

very important

play11:30

now i think it should work let's see

play11:51

bingo it worked i did not touch my mouse

play11:54

or keyboard it automatically opened

play11:57

youtube searched for the channel name

play12:00

then opened the channel

play12:02

and subscribed it

play12:05

so by this i have completed almost all

play12:08

the functions of the pyautogui package

play12:11

and showed you how you can automate

play12:13

anything on your computer using this

play12:15

package

play12:17

now i want you to follow this video step

play12:20

by step

play12:21

and then using your code

play12:23

subscribe to my channel

play12:25

and let me know in the comment section

play12:27

if you learned something from this video

play12:29

then please do not forget to subscribe

play12:31

to my channel in order to be notified

play12:33

about my latest videos thank you so much

play12:36

for watching see you in my next video

Related Tags
Python Automation, PyAutoGUI, Screen Automation, Mouse Movement, YouTube Automation, Coding Tutorial, Hotkey Function, Image Recognition, Automation Project, Programming Guide