Webinar: Get Started with Browse AI (May 30, 2024)

Browse AI
7 Jun 2024 · 35:19

Summary

TLDR: In this webinar, Nick from Browse AI and founder/CEO Ry provide an overview of web scraping, demonstrating how their platform simplifies data extraction without coding. They walk through LinkedIn job listings and Redfin property searches, highlighting AI-assisted robot training, monitoring, and integration capabilities. The session also addresses common questions, emphasizing the ethical and legal aspects of web scraping and the platform's focus on public data. A promo code 'webinar20' offers a 20% discount on annual subscriptions.

Takeaways

  • 🌐 The webinar is a global event, with participants tuning in from Detroit, Michigan; British Columbia; Guadalajara, Mexico; and more.
  • 🕒 The host acknowledges the different time zones and assures that a recording will be sent out for those who cannot stay up late or tune in early.
  • 🤖 Nick introduces himself as a User Advocate at Browse AI, emphasizing his role in ensuring users have the best experience possible.
  • 🎥 Ry, the founder and CEO of Browse AI, makes an appearance with his daughter Raha, adding a personal touch to the presentation.
  • 📝 The webinar covers an overview of web scraping, including its definition, the process of data extraction, and potential uses of the collected data.
  • 🔍 Browse AI's mission is to democratize access to information on the internet, making data extraction more accessible and less expensive than traditional methods.
  • 🛠️ Browse AI lets users train a 'robot' to extract data through a visual interface without coding knowledge, simplifying the web scraping process.
  • 📈 The company has grown to over 420,000 users who have extracted 6.8 billion records since January 2023, highlighting the demand for its services.
  • 🏢 Browse AI is trusted by teams at well-known companies and has features that set it apart from basic web scraping tools, such as AI-assisted data selection and integration with over 7,000 apps.
  • 🔑 The webinar includes a demo of scraping job listings from LinkedIn and property details from Redfin, showcasing the practical application of the tool.
  • 🔄 Creating a workflow that connects two robots for deep scraping is demonstrated, along with integrating the extracted data into Google Sheets for easy access and organization.

Q & A

  • What is the main purpose of the webinar?

    -The main purpose of the webinar is to provide an overview of web scraping, introduce Browse AI, demonstrate its capabilities with examples, and address questions from the audience.

  • What is Browse AI?

    -Browse AI is a tool designed to democratize access to information on the internet by allowing users to easily extract and monitor data from websites without the need for coding knowledge.

  • How does Browse AI simplify the web scraping process?

    -Browse AI simplifies the web scraping process by enabling users to train a robot through a visual interface where they simply point, click, and name the data they want to extract.

  • What is the significance of the 'robot studio' feature in Browse AI?

    -The 'robot studio' is a new feature in Browse AI that allows users to train robots without needing to install a browser extension, making it more accessible for users in environments where extensions are restricted.

  • How does Browse AI handle websites with pagination?

    -Browse AI can automatically handle pagination by scrolling down to load more items and extracting data from each page as needed.
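
The pagination handling described above can be sketched as a simple loop. This is an illustrative stand-in for what Browse AI automates, not its actual implementation: `fetch_page` serves slices of an in-memory dataset instead of loading a live page on scroll.

```python
# Illustrative sketch of the pagination loop a no-code tool automates.
# fetch_page() is a hypothetical stand-in for loading one "page" of
# results; here it serves slices of an in-memory dataset.

DATASET = [f"item-{i}" for i in range(25)]
PAGE_SIZE = 10

def fetch_page(page):
    """Return one 'page' of items, like a site loading more on scroll."""
    start = page * PAGE_SIZE
    return DATASET[start:start + PAGE_SIZE]

def scrape_all(limit=None):
    """Keep requesting pages until an empty page comes back,
    or until an optional item limit (like an item cap) is reached."""
    items, page = [], 0
    while True:
        batch = fetch_page(page)
        if not batch:
            break
        items.extend(batch)
        if limit is not None and len(items) >= limit:
            return items[:limit]
        page += 1
    return items

print(len(scrape_all()))          # → 25
print(len(scrape_all(limit=10)))  # → 10
```

The same loop shape covers both "scroll to load more" and numbered pages; only `fetch_page` changes.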

  • Can Browse AI be used to extract data from social media sites like Instagram?

    -Browse AI generally does not recommend extracting data from social media sites that require login, as it may lead to account flagging due to different IP addresses and potential privacy concerns.

  • What types of data can Browse AI extract from websites?

    -Browse AI can extract various types of data, including text, images, and specific details from lists or individual webpages, such as job listings, property details, and product information.

  • How does Browse AI ensure the legality and ethical use of its web scraping services?

    -Browse AI focuses on extracting public data and has policies in place to avoid extracting sensitive or personally identifiable information. It also does not support extracting data from websites that could be in a legal gray area.

  • What is the difference between the old Chrome extension and the new robot studio in Browse AI?

    -The old Chrome extension required users to install it on their machines to train robots, while the new robot studio is a web-based interface that runs within Browse AI's platform, eliminating the need for local installation and allowing for faster updates.

  • How can users get assistance if they encounter issues with Browse AI on a specific website?

    -Users can reach out to Browse AI's customer success team via support forms or emails. The team is available in different time zones and can provide guidance, although priority may be given to users on paid plans.

  • What is the process for creating a workflow in Browse AI?

    -To create a workflow in Browse AI, users first create two separate robots (Robot A and Robot B). Robot A extracts a list of links or items, and Robot B is set up to scrape detailed data from each link or item provided by Robot A. The workflow connects these two robots to automate the data extraction process.
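
The Robot A → Robot B pattern above can be sketched as follows. This is a hedged illustration of the deep-scraping idea, not Browse AI's API: `robot_a`, `robot_b`, and the listing/detail data are all hypothetical stand-ins.

```python
# Sketch of the Robot A → Robot B "deep scraping" pattern.
# robot_a and robot_b are hypothetical stand-ins for trained robots;
# the link and detail data are simulated, not fetched from a live site.

LISTING_PAGE = ["/jobs/1", "/jobs/2", "/jobs/3"]   # what Robot A sees
DETAIL_PAGES = {                                   # what Robot B sees
    "/jobs/1": {"title": "Social Media Manager", "location": "Detroit"},
    "/jobs/2": {"title": "Content Strategist", "location": "Vancouver"},
    "/jobs/3": {"title": "Community Lead", "location": "Remote"},
}

def robot_a():
    """Robot A: extract the list of links from a results page."""
    return LISTING_PAGE

def robot_b(link):
    """Robot B: extract detailed fields from one link."""
    return {"link": link, **DETAIL_PAGES[link]}

def run_workflow():
    """The workflow feeds every link Robot A found into Robot B."""
    return [robot_b(link) for link in robot_a()]

for row in run_workflow():
    print(row["title"], "-", row["location"])
```

In the product, the "link" field chosen when configuring the workflow plays the role of the argument passed from Robot A's rows into Robot B.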

Outlines

00:00

🌐 Web Scraping Overview and Introduction to Browse AI

Nick, a User Advocate at Browse AI, welcomes viewers from around the world to a webinar, noting the time differences and assuring that a recording will be sent out. He introduces Ry, the CEO, and they discuss the webinar's agenda, which includes an overview of web scraping, an introduction to Browse AI, demonstrations, use cases, and a Q&A session. Web scraping is defined as data extraction from websites, which can be a competitive advantage for businesses. Browse AI aims to democratize access to information, making it easy to extract and monitor web data, which was historically difficult and expensive. The tool was first launched in 2021.

05:01

🤖 Features and Demonstration of Browse AI's Web Scraping

The presenter highlights Browse AI's unique features, such as emulating human interactions, solving captchas, and integrating with over 7,000 apps. A live demo is conducted to show how to use Browse AI for web scraping without coding knowledge. The presenter guides viewers through the process of extracting job listings from LinkedIn, showcasing the ease of training a robot to capture specific data fields and creating a monitor for automated data extraction.

10:03

🏡 Advanced Web Scraping Techniques with Redfin Example

In this segment, the presenter demonstrates advanced web scraping techniques using Redfin as an example. The process involves searching for homes, extracting property listings, and details using two connected robots. The presenter also explains how to create a workflow to automate the process and integrate the extracted data with Google Sheets. Additionally, the presenter shows how to perform a bulk run using a CSV file to extract data for multiple URLs at once.

15:05

πŸ” Use Cases of Browse AI and Monitoring Data

The presenter discusses various use cases of Browse AI, such as monitoring products and pricing, property listings, job postings, government websites, and member directories. The focus is on the tool's ability to automate the extraction of data that can provide businesses with valuable insights and a competitive edge. The presenter also addresses the importance of timely data extraction and how Browse AI can help users stay updated with minimal effort.

20:06

📸 Addressing Questions about Data Extraction and Browse AI Capabilities

The Q&A session begins with a question about the format required for data extraction, to which the presenter responds that Browse AI is designed to handle structured data and recommends using other tools for unstructured text. Another question about workflows and data extraction from multiple robots is addressed, explaining the concept of deep scraping, where one robot extracts links and another extracts details from those links. The presenter also clarifies that Browse AI does not recommend extracting data from social media sites that require login.

25:08

🛠️ Transition from Browser Extension to Robot Studio and AI Assistance

The presenter discusses the transition from using a browser extension to Robot Studio, which allows for faster release cycles and doesn't require installation. Robot Studio is more intelligent and offers an interface within Browse AI to train robots without installing extensions on local machines. The presenter also explains the AI features in Browse AI, emphasizing that AI is used to assist and automate individual steps of the scraping process, rather than automating the entire process from start to finish.

30:10

πŸ“ Legal Considerations and Future of Web Scraping with AI

The presenter addresses the controversial nature of web scraping, focusing on the importance of extracting only public data and avoiding sensitive or personally identifiable information. Browse AI has policies in place to guide users on appropriate data extraction. The presenter also touches on the future of AI in web scraping, acknowledging that while full automation is not yet possible, Browse AI is committed to improving individual steps to eventually achieve a fully automated process.

35:11

🎉 Conclusion and Promo Code Offer

The webinar concludes with a promo code offer for a 20% discount on an annual subscription, valid for the next 48 hours. The presenter expresses gratitude to the attendees, encourages feedback for future webinars, and assures that a recording of the session will be sent out soon.


Keywords

💡 Web Scraping

Web scraping refers to the method of extracting data from websites, typically using automated tools or software. In the context of the video, web scraping is the core functionality provided by Browse AI, allowing users to collect data from the internet without manually copying and pasting. The script mentions web scraping as the main service of Browse AI, emphasizing its importance for businesses to gain a competitive advantage by accessing the right data at the right time.
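
To contrast with Browse AI's point-and-click approach, here is what a tiny hand-coded scraper looks like using only Python's standard library. The HTML snippet is a made-up stand-in for a page you would normally download; the point is the manual field selection the tool replaces.

```python
# A minimal hand-written scraper using only the standard library.
# PAGE stands in for a downloaded job-listings page.

from html.parser import HTMLParser

PAGE = """
<ul>
  <li class="job"><a href="/jobs/1">Social Media Manager</a></li>
  <li class="job"><a href="/jobs/2">Content Strategist</a></li>
</ul>
"""

class JobParser(HTMLParser):
    """Collect (title, link) pairs from <a> tags in the listing."""
    def __init__(self):
        super().__init__()
        self.jobs = []
        self._href = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        if self._href and data.strip():
            self.jobs.append((data.strip(), self._href))
            self._href = None

parser = JobParser()
parser.feed(PAGE)
print(parser.jobs)
# → [('Social Media Manager', '/jobs/1'), ('Content Strategist', '/jobs/2')]
```

Every field a robot captures visually corresponds to handler logic like this, which is exactly the code a no-code tool spares you from writing and maintaining.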

💡 Browse AI

Browse AI is the company and platform being discussed throughout the video script. It specializes in web scraping and data extraction services. The script introduces Browse AI as a tool with a mission to democratize access to information on the internet, making it easier for users to extract and monitor data from websites. The platform is designed to be user-friendly, requiring no coding knowledge for operation.

💡 User Advocate

A User Advocate is a professional who works to ensure that the needs and experiences of users are considered and addressed within a company. In the script, Nick introduces himself as a User Advocate at Browse AI, indicating that he works closely with various teams, including marketing, support, product, and engineering, to enhance the user experience.

💡 Web Scripting

'Web scripting' appears in the auto-generated transcript as a mis-transcription of 'web scraping'; the two are used interchangeably in the script and refer to the same process of using scripts or software to automate the extraction of data from websites. The video aims to provide an overview of this process and demonstrate how Browse AI makes it accessible to users without coding skills.

💡 Robot Studio

Robot Studio is a feature within Browse AI that allows users to train robots for web scraping tasks. The script highlights Robot Studio as the new and future-oriented way to use Browse AI, moving away from the previous reliance on browser extensions. It provides a visual interface for users to train robots by pointing, clicking, and naming the data they wish to extract.

💡 Workflow

In the context of Browse AI, a workflow is a sequence of actions that connects multiple robots to perform a complex web scraping task. The script explains how workflows can be used for deep scraping, where one robot extracts a list of links and another robot uses those links to extract detailed information from each corresponding page. This demonstrates the capability of Browse AI to automate multi-step data extraction processes.

💡 Integration

Integration in the video script pertains to the ability of Browse AI to connect and transfer data to other platforms and tools. Examples given include Google Sheets, Zapier, and other apps, enabling users to seamlessly use the scraped data for various purposes. The script mentions integration as a key feature of Browse AI, allowing for the automation of data transfer to popular third-party services.
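
A webhook integration ultimately delivers extracted rows as JSON to an endpoint you control. The payload shape below is a simplified assumption for illustration, not Browse AI's documented schema; only the general decode-and-extract pattern is the point.

```python
# Decoding a scraping webhook payload into rows.
# The payload structure here (robot name + capturedLists) is an
# assumed example, not a documented Browse AI schema.

import json

payload = json.dumps({
    "robot": "Social Media Manager Jobs in the US",
    "capturedLists": {
        "job listings": [
            {"job title": "Social Media Manager", "location": "Detroit"},
            {"job title": "Content Strategist", "location": "Vancouver"},
        ]
    },
})

def rows_from_webhook(body, list_name):
    """Decode a webhook body and pull out one captured list."""
    data = json.loads(body)
    return data.get("capturedLists", {}).get(list_name, [])

rows = rows_from_webhook(payload, "job listings")
print(len(rows))  # → 2
```

From here the rows can be appended to a spreadsheet, inserted into a database, or forwarded to any of the apps mentioned above.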

💡 Monitoring

Monitoring, as discussed in the script, is a feature that allows users to set up their scraping tasks to run automatically at specified intervals. This eliminates the need for manual initiation of the scraping process, ensuring that data is consistently updated. The script demonstrates setting up a monitor for a web scraping task to run daily, except on weekends, to capture job listings.
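
The "daily at noon, except weekends" schedule from the demo can be expressed as a next-run-time calculation. This is a sketch of the scheduling logic such a monitor implies, written with the standard library, not Browse AI's internals.

```python
# Next-run calculation for a "weekdays at noon" monitor.

from datetime import datetime, timedelta, time

def next_run(now, at=time(12, 0)):
    """Return the next weekday occurrence of `at` after `now`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)
    while candidate.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        candidate += timedelta(days=1)
    return candidate

# A check at 13:00 on Friday 31 May 2024 rolls over the weekend:
print(next_run(datetime(2024, 5, 31, 13, 0)))  # → 2024-06-03 12:00:00
```

A monitor is just this calculation plus a trigger that runs the robot when the time arrives.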

💡 Bulk Run

A bulk run in Browse AI is the process of executing a web scraping task for a large number of URLs simultaneously. The script describes using the bulk run feature to process multiple property listings at once by importing a CSV file with URLs, showcasing the platform's capability to handle large-scale data extraction efficiently.
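
Conceptually, a bulk run reads a CSV of input URLs and queues one task per row. The sketch below illustrates that pattern with the standard library; `run_task` and the example URLs are hypothetical placeholders, not real Browse AI calls or Redfin links.

```python
# Sketch of a bulk run: one queued task per CSV row.
# run_task() is a hypothetical placeholder for invoking a robot.

import csv
import io

CSV_DATA = """url
https://www.redfin.com/property/1
https://www.redfin.com/property/2
https://www.redfin.com/property/3
"""

def run_task(url):
    """Placeholder for running one robot task against a URL."""
    return {"url": url, "status": "queued"}

def bulk_run(csv_text):
    """Queue one task per row of the uploaded CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [run_task(row["url"]) for row in reader]

tasks = bulk_run(CSV_DATA)
print(len(tasks))  # → 3
```

The header row names the input parameter (here `url`), which is how a CSV column maps onto the robot's input when a bulk run is configured.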

💡 AI Magic

The term 'AI Magic' is used in the script to describe the automated features of Browse AI that simplify the web scraping process. It refers to the platform's ability to intelligently detect and select data fields on a webpage without manual input from the user. The script illustrates this feature during a demo, where Browse AI quickly identifies and selects relevant data fields for extraction.

💡 Pre-built Robots

Pre-built robots in Browse AI are pre-configured scraping scripts that users can utilize for common data extraction tasks without having to create their own robots from scratch. The script mentions the availability of pre-built robots for websites like LinkedIn and Redfin, allowing users to start scraping tasks immediately with minimal setup.

Highlights

Introduction of Browse AI as a tool for web scraping and data extraction, emphasizing its user-friendly approach and no-code requirement.

Highlight of Browse AI's mission to democratize access to information on the internet, making it easily accessible to everyone.

Overview of the company's growth, with over 420,000 users and 6.8 billion records extracted since January 2023.

Explanation of Browse AI's unique features, including emulating human interactions, solving captchas, and auto-adapting to site layout changes.

Demonstration of LinkedIn job scraping, showcasing the process of training a robot to extract specific job listing data.

Introduction of Robot Studio as the future of Browse AI, offering a new way to train robots without the need for browser extensions.

Illustration of AI-assisted data extraction, where Browse AI automatically detects and selects relevant data fields.

Tutorial on creating a monitor for automated data extraction at scheduled times, reducing manual effort.

Discussion on integrating Browse AI with other tools like Google Sheets, Zapier, and more for extended functionality.

Example of deep scraping, where multiple robots work together to extract data from lists and individual items.

Explanation of workflows, showing how data from one robot can trigger another robot to perform further actions.

Introduction of bulk run feature, allowing for the extraction of data from a large number of URLs at once.

Presentation of various use cases for Browse AI, such as monitoring products and pricing, property listings, job postings, and government websites.

Addressing common questions about the legality and ethical considerations of web scraping with Browse AI.

Clarification on the difference between the Browse AI extension and Robot Studio, and the benefits of the latter.

Emphasis on Browse AI's commitment to public data extraction and the avoidance of sensitive or personally identifiable information.

Offering of a promo code 'webinar20' for a 20% discount on an annual subscription, valid for the next 48 hours.

Invitation for feedback to improve future webinars and the Browse AI tool, showing the company's dedication to user satisfaction.

Transcripts

play00:01

all right we got some people trickling

play00:02

in

play00:04

here as we waiting if you wouldn't mind

play00:06

popping in the chat where are you tuning

play00:07

in from always curious where in the

play00:10

world everyone

play00:11

is what time is it there

play00:31

and if you notice me looking down here

play00:33

that's my backstage screen so I can see

play00:36

the chat and whatnot not ignoring you

play00:38

just looking for answers Detroit

play00:40

Michigan cool a little bit later there

play00:44

we're on the west coast British

play00:50

Columbia hell Fox no Scotia nice I lived

play00:53

there for 10 years love it

play01:05

I think the weather's a little nicer

play01:07

there today than it has been here

play01:08

guadalahara Mexico

play01:10

awesome New York Newport Beach

play01:13

California we to live there also

play01:16

is Santa Katarina Brazil

play01:20

Indiana all over the world I would like

play01:23

to point out if it's really late where

play01:24

you are or super early and you're tuning

play01:26

in we will be sending out the recording

play01:28

so don't feel the need to

play01:30

stay up because you might miss something

play01:31

that you'll never see again it'll be in

play01:33

the

play01:33

recording India probably fairly late

play01:36

there Tunisia Florida Oregon

play01:40

nice give it a few more minutes for some

play01:42

more people to to join and then we'll

play01:44

kick it off

play02:23

can't think I'm just going to get

play02:24

started here um anybody who Tunes in a

play02:26

little bit late they can they can catch

play02:28

up so my name is is Nick as you can see

play02:30

on the screen here um I'm a user

play02:32

Advocate at browse AI I live on the

play02:34

marketing team but I work very closely

play02:36

with support and product and Engineering

play02:38

to make sure our users get the very best

play02:40

experience possible and also joining us

play02:42

I'm going to pop him on stage here um Ry

play02:44

the founder and CEO of browse AI R want

play02:47

to say hi real

play02:49

quick hey everyone I'm Ry um and I'm

play02:53

calling from Vancouver Canada and uh

play02:56

this is my daughter Raha who wants to

play02:58

say hi uh every now and then when I work

play03:00

from

play03:02

home thanks

play03:07

Nick all right

play03:12

perfect in this webinar it's a brief

play03:16

overview um overview of web scraping is

play03:18

what we're going to cover first

play03:22

here sorry if I'm muting Myself by

play03:25

accident it's because on this screen the

play03:27

space bar does mute but on this one it

play03:28

does the slids my apologies all right

play03:31

next introduction to browse AI for those

play03:33

of you who may not be familiar a demo of

play03:36

web scripting a fairly simple

play03:38

example a demo with a bit more

play03:41

complexity and then some use cases and

play03:43

examples what you know what can be done

play03:45

with browse

play03:46

Ai and then some Q&A and finally

play03:50

wrapping up so about 15 or 20 minutes or

play03:53

so until we get to the

play03:55

Q&A so what is web

play03:57

scripting essentially webrip in is the

play04:00

extraction of data from a website in

play04:03

case you're not aware that's that's the

play04:04

gist of it of what web scraping is what

play04:06

browse AI does and what can you do with

play04:09

the data once you've scripted once

play04:10

you've collected it you could send it

play04:12

into something like a spreadsheet or a

play04:13

database or even an API we have some

play04:16

people who power apps via the data

play04:17

they're

play04:18

scraping and why it's important for

play04:20

businesses if you can get the right data

play04:22

at the right time it can be a huge

play04:24

competitive Advantage historically

play04:26

collection of that data has been

play04:28

expensive time consum consuming and

play04:30

sometimes honestly impossible to scale

play04:34

and that's where browse AI came

play04:35

in the mission of browse AI is to

play04:38

democratize access to information on the

play04:40

internet simply put we think everyone

play04:42

should be able to get access to it much

play04:43

more easily than it has been in the

play04:46

past the origin is back in 2020 already

play04:50

started building the first piece of that

play04:51

which was an easy and affordable and

play04:53

reliable way to extract and monitor that

play04:56

data from the web as as a whole

play05:01

and it launched first publicly in 2021

play05:04

so we're a few years in and I think

play05:06

we're just really hitting our stride so

play05:08

whatever you've seen so far it's we're

play05:10

about to kick it up a

play05:12

notch browse the guy in a nutshell we're

play05:14

trying to allow anyone to train a robot

play05:17

by simply pointing clicking and naming

play05:19

the data you want to extract it's all

play05:21

Visual and there's no code needed that's

play05:24

that's the big one there's no no coding

play05:26

knowledge necessary whatsoever

play05:29

trusted by teams at some of these

play05:31

companies you may know people who work

play05:33

there you may work there yourself um we

play05:35

have over 420,000 users and just since

play05:38

January of

play05:39

2023 6.8 billion records have been

play05:43

extracted that's a lot of data just in a

play05:46

span of a couple years

play05:48

here so here's what makes our software

play05:51

difference I'll let you review this for

play05:53

a minute while I take a sip it's all

play05:54

kinds of features here

play06:00

some of the big ones are emulating human

play06:03

interactions solve captas um extract

play06:06

data on specific schedule handle

play06:08

pagination Auto adapt to site layout

play06:10

changes integrating with 7,000 plus apps

play06:13

scrape data with no code using AI um it

play06:17

really is quite remarkable what browse I

play06:18

can do compared to basic web scraping

play06:21

you might have seen someone do with with

play06:22

python or something like

play06:24

that so the first example we're going to

play06:26

cover is LinkedIn and jobs it's kind of

play06:30

a big use case with with data scraping

play06:32

these days so first we're going to

play06:34

search for a specific role in an

play06:38

area next we're going to train a robot

play06:42

to extract the data we

play06:46

want and finally we're going to create a

play06:49

monitor that runs on a schedule so you

play06:52

don't have to keep going to do it

play06:54

yourself so we'll go into this first

play06:56

demo and take myself off the screen here

play06:58

focus on what's being shown and I'll

play07:01

join you again when I'm done with the

play07:03

demo

play07:10

here go to linkedin.com jobs you'll be

play07:14

asked to sign in but if you go to

play07:15

linkedin.com and then click on jobs you

play07:18

don't have to sign in it's a cool little

play07:20

trick for you all right let's search for

play07:23

social media managers in the

play07:27

US and this list down on the left this

play07:29

is what we're going to try to extract so

play07:31

I'm going to copy this URL and then go

play07:34

over to browse AI paste it in a new

play07:39

robots and you'll notice down below

play07:41

there's a couple of pre-built robots

play07:42

that show up you could use these if you

play07:44

wanted them as starting points but I'm

play07:45

going to show you how to build one

play07:47

instead of using

play07:48

these so this website does not require

play07:51

me to log in I won't check this box and

play07:53

I'll click start training robot until

play07:56

recently the Chrome extension was how

play07:57

one would use browse AI but the robot

play08:00

studio is new and the future of browse

play08:02

AI so I'll show you how to use that

play08:04

instead for anybody who's used the

play08:06

Chrome extension in the past you'll no

play08:08

doubt notice some pretty big differences

play08:11

between robot studio and the

play08:14

extension and here's something that is

play08:16

different watch what happens after I

play08:18

select my list what used to happen is I

play08:21

select my list and then I have to go

play08:22

through and pick all my Fields but now

play08:25

we sprinkle in a little bit of AI magic

play08:27

so first let's go over here on the right

play08:29

and click on capture text and then from

play08:32

a list and then once I hover you'll see

play08:35

different things being highlighted when

play08:36

I get exactly what I want I'm going to

play08:38

click and the AI starts to detect my

play08:41

fields for me now in matter of moments

play08:43

here I will

play08:45

have a list with a name job listings and

play08:49

a bunch of fields already selected for

play08:51

me so I didn't have to pick any of those

play08:53

and if I wasn't doing a demo I would

play08:55

probably keep this but instead I'm going

play08:58

to scroll down here and click on select

play09:00

manually instead and then cancel my

play09:02

edits so I can show you how to do it

play09:04

manually even doing it manually we've

play09:07

made it pretty simple you just have to

play09:08

hover and click and select all the

play09:10

different pieces of data that you like

play09:12

I'll take visible text and then I'll

play09:14

click again to get the link and then

play09:16

I'll click to get the location here and

play09:18

the status and when it was posted let's

play09:21

get the logo image URL and the link for

play09:24

the

play09:25

job once I'm finished selecting my

play09:27

Fields I click on confirm or press

play09:29

Center and then I go through and I name

play09:31

all of these so job

play09:33

title

play09:36

company company

play09:39

link

play09:43

location

play09:45

status

play09:48

posted logo and Job Link no Job Link all

play09:56

right press enter and I'm ready to name

play09:57

my list here scroll to the left see all

play10:00

my Fields name my list job listings just

play10:03

again job listings just like the AI did

play10:06

and let's take a look and see to choose

play10:08

the number of items I want to extract so

play10:11

I happen to know that 60 is how many

play10:13

show up before I scroll down some more

play10:15

and the pagination method is scrolling

play10:18

down so I will pick that over here on

play10:21

the

play10:22

right and I will save my actually no

play10:24

first I will show you that you can click

play10:27

in here and remove a column so let's say

play10:29

I don't want company link it's gone and

play10:32

let's say I want to rename posted to

play10:35

date posted enter and that's changed so

play10:38

now let's save my captured list and once

play10:40

I've saved it I can go over on the right

play10:42

here hover and click on the trash can to

play10:44

remove something else so the logo should

play10:47

be gone here and let's take a quick look

play10:49

yep logo is now gone I'll click continue

play10:53

and then

play10:56

finish and now it's time to give the

play10:58

robot a name so I'll name it something

play11:01

like social media no let's capitalize

play11:04

that social media manager

play11:07

jobs in the US and then I'll click save

play11:11

and browse ey is now going to spin up a

play11:13

new server on the cloud in order to

play11:16

emulate the actions that I just took so

play11:18

here we go simulating user

play11:21

actions and you can see right here there

play11:23

three list items now 33 list

play11:27

items gives you a of idea of how many

play11:30

have been scraped so far and now 58 and

play11:33

just about

play11:36

done and there we go 60 list items you

play11:39

can scroll down you can see them all

play11:41

here and that took about what 20 seconds

play11:45

now down at the bottom you see a few

play11:46

options of what you can do next in this

play11:48

case I'm going to say yes this looks

play11:50

good and approve it now that's cool and

play11:52

all but what if we set up monitoring to

play11:54

do this automatically for us so every

play11:58

day except on weekends at a specific

play12:01

time of noon in my time zone I would

play12:06

like to go to this URL and get 60 job

play12:08

listings I'll give my monitor a name smm

play12:12

jobs week dat check and I don't need

play12:16

emails sent to me and I'll save

play12:18

it next we can integrate with any number

play12:21

of tools here in the integrate tab

play12:23

you've got Google Sheets zap your air

play12:25

table make.com workflows web hooks

play12:29

all kinds of options or you can view it

play12:31

in the tables tab here for your robots

play12:33

click on tables and go to the job

play12:36

listings here and you can see all of

play12:38

them right here without going to a

play12:39

separate tool or if you'd like to export

play12:43

to a CSV for example or to Json Json

play12:47

Json and you can choose whether to show

play12:50

data when the input parameters were

play12:51

exactly the same and finally up here at

play12:54

the very top you can see how many

play12:56

credits are used in order to run the

play12:57

tasks and how many lists are

play13:00

included okay next we'll get into red

play13:02

fin let me take a quick sip

play13:06

here all right in this example we're

play13:09

going to search for homes in a specific

play13:11

area with some criteria that I've

play13:14

entered we're then going to extract the

play13:16

individual properties as well as the

play13:18

details of those

play13:20

properties we'll create a workflow to

play13:23

connect two robots together and also

play13:25

integrate with Google

play13:26

Sheets and finally we will do a bulk run

play13:29

using tables so we'll import a CSV and

play13:32

see how bulk run

play13:37

works okay we're on redfin.com here and

play13:40

I've done a search for Portlands and on

play13:42

the right here I've set a couple

play13:43

parameters up to 600,000 and two-bedroom

play13:46

two bath so I'm first going to copy this

play13:48

URL up here and go to pre-built

play13:52

robots you can get here by clicking on

play13:54

pre-built robots on the marketing site

play13:56

and you can search for red fin if you'd

play13:58

like to find those pre-built robots or

play14:00

you can go in the left hand sidebar

play14:02

scroll down click on red fin and those

play14:05

same pre-built robots will show

play14:07

up. In this case, I'm going to go to the dashboard to do it. So we've got a new robot; I'll type in redfin.com and choose 'Extract list of properties from Redfin', and I'll replace this URL here with the one I copied. Ten is fine for the limit of properties, and at the next step we can review the configuration here. Looks good, so let's click Start
Extracting. While this one's running, let's go to a single property, copy the URL for that, and go to the dashboard again. We'll type in redfin.com and choose the 'Extract property details' pre-built robot. Okay, just as before: we go here, get rid of this URL, put ours in, and click Next Step. One thing to confirm, looks good, and Start
Extracting. Okay, so now this one is running as well, and it's time to make the workflow that connects these two robots. I will call it something very creative, like 'Redfin deep scraping
demo'. So, robot A is the one that will trigger the workflow to run; in this case, that's the one that gets the list of properties. I've actually pre-created a robot to make this process smoother, and I've called it something very creative: 'Redfin robot A'. Select that one and click Next. Now, robot B is the one that runs next and goes to scrape the details from each link that robot A provides. So we're going to click down here, select 'Redfin robot B', and pick the field that corresponds to the link I want to scrape; there's only one of those, and it's called 'link'. Then we'll click Next Step. Now I decide when I would like this to run: always; only if robot A's results change; only if robot A finds new items; or only if robot A finds new or changed items while monitoring. I'm going to leave it as 'always' for now, and then we'll click Next Step.
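The trigger options above amount to comparing robot A's latest results against its previous run. As a rough conceptual sketch (not Browse AI's actual implementation), the 'new or changed items' condition could be computed by diffing the two result lists, keyed on the link field:

```python
def find_new_or_changed(previous, current, key="link"):
    """Return items in `current` that are absent from `previous`
    or whose fields differ, keyed by a unique field (here: link)."""
    prev_by_key = {item[key]: item for item in previous}
    changed = []
    for item in current:
        old = prev_by_key.get(item[key])
        if old is None or old != item:
            changed.append(item)
    return changed

previous = [
    {"link": "/home/1", "price": "$599,000"},
    {"link": "/home/2", "price": "$550,000"},
]
current = [
    {"link": "/home/1", "price": "$589,000"},  # price changed
    {"link": "/home/2", "price": "$550,000"},  # unchanged
    {"link": "/home/3", "price": "$540,000"},  # new listing
]

# Under 'new or changed items', only two items would trigger robot B.
print(find_new_or_changed(previous, current))
```

Under this sketch, 'only new items' would be the subset where `old is None`, and 'always' would skip the comparison entirely.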
Now, with the workflow, robot B has to have an integration set up, a place to send the data, and we don't have one yet. So we'll click on this link here, 'Open robot B integration page', and choose Google Sheets; it's a very common one and easy to set up. I've got an account set up already, but if you've ever logged in or created an account with Google, this will look very familiar: the modal pops up and you choose your account. So I'll select my account here, and I'm going to create a new spreadsheet and give it a name like 'Redfin homes demo'. You can choose to only sync changes by checking that box if you'd like, and you can edit the data mapping. I'm going to change this sheet name to 'Homes' and leave everything else the same. Looks good; you could rename things on the right here if you wanted to. I'll click Create Spreadsheet and Activate Integration. The creation only takes a moment, and then we move on to the next step, which shows you the Google Sheet that was created, with a handy link that makes it very easy to open. Let's open that up, and there you go: it's empty currently, but this is the sheet the robot will integrate with.
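The data-mapping step is essentially a rename of the robot's fields to spreadsheet column headers. Here's a minimal sketch of the idea, with hypothetical field and column names rather than Browse AI's real schema:

```python
# Map each extracted record's field names to spreadsheet column
# headers, in a fixed column order (names here are illustrative).
FIELD_TO_COLUMN = {
    "link": "Listing URL",
    "price": "Price",
    "beds": "Bedrooms",
}

def to_sheet_rows(records):
    """Turn robot output (a list of dicts) into a header row plus
    one spreadsheet row per record, in mapping order."""
    header = list(FIELD_TO_COLUMN.values())
    rows = [[rec.get(field, "") for field in FIELD_TO_COLUMN] for rec in records]
    return [header] + rows

records = [{"link": "/home/1", "price": "$589,000", "beds": 2}]
print(to_sheet_rows(records))
# [['Listing URL', 'Price', 'Bedrooms'], ['/home/1', '$589,000', 2]]
```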
Rather than make you sit through the triggering of a workflow, I've already done it and populated a different sheet, so I'll show you in this tab what it would look like. If we could fast-forward what I just built, it would end up like this in a Google Sheet. And if we check out the history of that robot, it's shown as a bulk run, because multiple URLs run at the same time; that's what it's
called. And speaking of bulk runs, there's another way to do one: via tables. If you go to the tables for a particular robot, you can click on Import CSV, and on the right-hand side here you can download a sample CSV with the columns you'll need. You can do this for up to 50,000 new rows at once. I'm going to upload a CSV; it doesn't have quite that many rows, but I'll upload it here so we can see what that looks like. That URL is the correct one from the CSV, so I'll click just to make sure. Yep, that's the one I want, so I'll click Confirm. When I click Start Extracting Data, you'll notice there's a button to remove duplicate rows. I'll do this, and you'll see that we go from 4,900 or so rows down to 4,559. So you don't have to worry if you have duplicate rows: we'll recognize them and let you take care of it. I've got a bulk run in progress here.
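The duplicate-row check is a plain de-dup over the uploaded rows. If you wanted to clean a CSV yourself before importing, a stdlib-only sketch might look like this (the origin_url column name is made up for illustration):

```python
import csv
import io

def dedupe_rows(csv_text):
    """Drop exact duplicate rows from CSV text, keeping first
    occurrences and preserving the header and row order."""
    reader = csv.reader(io.StringIO(csv_text))
    seen = set()
    kept = []
    for row in reader:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

csv_text = (
    "origin_url\n"
    "https://example.com/a\n"
    "https://example.com/b\n"
    "https://example.com/a\n"  # duplicate, will be dropped
)
rows = dedupe_rows(csv_text)
print(len(rows) - 1)  # 2 unique data rows remain
```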
Let's go back to the dashboard. I'll click on the bulk-run robot and check out the history, and you'll see I've got one in progress here. Check the details and you'll notice there's a total down here of 4,558, with zero finished and zero failed. You can scroll down to see each individual run, scroll back up, pause the bulk run, or stop the bulk run, and you can also filter by successful, failed, or in progress, in addition to
all. And my face is back. All right, so you saw a couple of examples there; how else are people using Browse AI? No, I removed myself from the stage; I swear I will get the hang of these keyboard shortcuts. All right: how else are people using Browse AI? To monitor products and pricing, which is a big one; businesses can keep an eye on the products and prices of competitors, for example. Property listings, as I showed: house hunters can keep track, as well as realtors, to see what's out there. Job postings: LinkedIn is far from the only website that has jobs, and you could scrape a bunch of sites and put the listings all in one place if you wanted to. Government websites, things like construction permits or licenses: oftentimes, if you're not there within an hour of something being posted, you might lose out entirely, so that's another big use case. And member directories: you could find networking and collaboration opportunities on autopilot, without having to keep checking all these places. I'll let these run through as I take a sip here. And this is far from all the use cases; it's just some of the ways people use Browse AI. Aggregating reviews, financial news aggregation, lead generation, market trends, customer reviews: if you're creative, there are lots of ways you can use a tool like Browse
AI. So, time for some Q&A. We've got one of our teammates, Masoon, on the back end here checking out the questions, and we've got some that are public, I believe. Yes, we'll be emailing the recording; that's probably one of the questions on your mind, and actually I think it is one of the public questions. There will be a video of this after it's over, and we'll email it out to everyone who registered. So even if you're not here (in which case you won't hear this right now, but you'll hear it in the recording), we'll send that out. So, Ry's on
stage. Yeah, there was a question I wanted to answer, from Kathy. They're asking: does the information you're wanting to scrape need to be in a specific format? Most information that people want to extract from websites is in a structured format that Browse AI can easily recognize and extract. The exceptions are when, for example, you want to extract data from a company's About page and pull out the year they were founded and things like that, so you're trying to pull some information out of messy, unstructured text. Browse AI is not designed for that. In those cases, we recommend using Browse AI to extract the entire content (we already have a pre-built robot for that, which gives you the entire HTML of the page if you just give it the URL) and then passing that through something like GPT. If you use Zapier, for example, you can use it to pass that HTML through an LLM like GPT and extract the information you want from that blob of text. So that's the only exception, where Browse AI alone wouldn't be able to give you the information you're looking for.
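The pipeline Ry describes (grab the full page HTML with a pre-built robot, then let an LLM pull out the field you need) can be sketched like this. Note that call_llm is a stand-in for whatever model step you wire up in Zapier or elsewhere, not a real API client:

```python
def build_extraction_prompt(html, field):
    """Wrap the scraped page content in an instruction asking the
    model to return just the requested field."""
    return (
        f"From the page content below, extract the {field}. "
        "Reply with the value only, or 'unknown' if absent.\n\n"
        f"{html}"
    )

def call_llm(prompt):
    # Stand-in for a real model call (e.g. a GPT step in Zapier).
    # The response is faked here so the sketch is self-contained.
    return "2015"

html = "<html><body><p>Acme Corp was founded in 2015 in Portland.</p></body></html>"
prompt = build_extraction_prompt(html, "year the company was founded")
founded_year = call_llm(prompt).strip()
print(founded_year)  # 2015 (from the faked response)
```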
Awesome, thanks Ry. There's a longer question down here: 'When you create a workflow that is pulling data from two robots, how do you get them to pull the same product data? Robot A pulls product title, description, and price; robot B signs into the URL and pulls product title and product cost.' If I'm understanding correctly. Actually, Ry, do you understand the question? Yeah, I think maybe we should clarify how most people are using workflows. Most people use workflows for a concept called deep scraping, which means you have multiple layers of data on the same website. For example, you want one robot to go and grab a list of links from certain pages, and then you want another robot that goes into each of those links and extracts the details of those items. In those cases, you create robot A and robot B: robot A extracts the list of links and passes it along to robot B, and robot B goes through every one of those links and extracts all the details. That's the most common way people use it. There is a second, much less popular way to use this: you could have a robot A that extracts, for example, a list of keywords, and then a robot B that searches those keywords on Google and extracts the search results, or searches them on another site and extracts the results. That's also possible; it's just less common. I guess what I want to highlight is that with workflows, there's something robot A is extracting and then passing along to robot B, and robot B uses that to navigate to the page it needs to extract data from. Yeah, I think in the Redfin example I may not have been clear enough that the first robot is getting all those property links, and the second one is set up to scrape the data from each individual listing page, and those two work together as a workflow. That's something I could have highlighted a bit more clearly. And you can also set up monitors to do these things on a schedule as well.
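In code terms, the deep-scraping pattern Ry describes is a fan-out: robot A yields links, and robot B runs once per link. A conceptual sketch with stubbed robots (this is not the Browse AI API):

```python
def robot_a(search_url):
    """Stub: would extract the list of result links from a search page."""
    return [f"{search_url}/item/{i}" for i in range(1, 4)]

def robot_b(link):
    """Stub: would open one link and extract that item's details."""
    return {"link": link, "title": f"Listing at {link}"}

def run_workflow(search_url):
    # Robot A triggers the workflow; each link it finds becomes one
    # robot B task (Browse AI shows this fan-out as a bulk run).
    links = robot_a(search_url)
    return [robot_b(link) for link in links]

results = run_workflow("https://example.com/search")
print(len(results))  # 3
```

The less common keyword variant has the same shape: robot A would return search terms instead of links, and robot B would run one search per term.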
All right, any other questions? I see a couple of others. 'Can this product extract images from Instagram?' So, generally speaking, social media websites require you to log in to extract data, and by social media sites I mean sites like Facebook and Instagram. Typically, if you have a robot log in on your behalf, it's using a different IP address, so it could be flagged by their system, and we don't recommend it. If you look at our Help Center, we have an article that says we don't recommend using Browse AI for extracting logged-in data from social media websites. There's
another question that asks: 'I'm having a hard time getting Browse AI to work on a specific website. Is there a way to talk with someone to see if the tool will work on this particular website?' Yes, we have a customer success team in three different time zones that's happy to help. We are signing up thousands of people every day, so there is a bit of prioritization going on: if you're on a paid plan, your support tickets are prioritized, because we have a small team and wouldn't be able to answer every ticket in a very short amount of time. But we try to help everyone who reaches out to support via support@browse.ai or the support form on our website. If you're running into any particular challenge, we can usually guide you to a Help Center article or a video demo that shows you how to get around it, or sometimes we might even tweak your robot for you, to make it work more the way you
want. Awesome, thanks Ry. One question I only touched on briefly in the demo is the difference between the extension and Robot Studio. Do you want to talk about that: when the extension started, what's happening with Robot Studio, and what we're trying to improve on? Yeah. For the first couple of years that Browse AI was live, we only had a browser extension that people had to use: they had to install it on their machine and then use it to train these robots. The robot would run on our cloud servers, not on your machine, but the extension was our way of teaching the robot how to perform certain actions on a site to grab the information for you. Later, we found out that many users are not comfortable with that; for example, people who work at large enterprises aren't allowed to install browser extensions on their machines for security reasons. So we wanted a way that doesn't require installing anything, and that's why we built Robot Studio. As a side benefit, with Robot Studio we also get much faster release cycles and iterations, because we don't have to go through the Chrome Web Store's approval process. So there's an interface within Browse AI, a browser within Browse AI, where you open the website you want to extract data from through our servers, and then you train a robot to extract that data for you. There's a lot of intelligence baked in; it's much more intelligent than the Chrome extension we used to have. You still have the option to create robots with the Chrome extension; it's just being
deprecated. And I did want to point out real quick: the AI feature you saw, where I selected the list and it automatically selected the fields and named the list, is being rolled out. So if you don't yet have access, it is coming. I clearly have access because I work at Browse AI; one of the perks. If you don't see it yet, it's coming, and it'll look very much like what you saw in the demo. And I've got to say, it's pretty nice: the manual way is not difficult, but it's a lot easier to just have a robot literally do it for you. We're just getting started, and we're going to take all the feedback as you're using it. So please do: if you see something that isn't quite working right, or you have questions, or you'd like something to be different or better, please let us know.
Also, on Q&A: if you have any questions right now, you can submit them here. We've got Ry ready to go to answer your hard-hitting questions. I shouldn't challenge people. Oh, Masoon, do you want to add the link to the screen where people can submit feedback, like the product feedback form? Yeah. And we really appreciate your feedback on how we can make this webinar more valuable and more informative. Please send suggestions: what you would like to have seen, and what could have been done differently or better. This is only our second time really doing this recently, so we do want to hear from people on how to make these most valuable. Quiet bunch; no
questions. Anything else top of mind, Ry, that you wanted to put out there, common questions? Many people have questions about what AI scraping is. They think it's just one thing, and if you Google it, there are many people talking about it. Our approach to scraping using AI might be a bit different from what most people imagine. Most people imagine AI should automate the entire process from zero to one hundred: you just tell it 'I want to extract data from this site', and then you have all the data in front of you. We think the industry will get there, but it's not there yet, and every solution that tries to provide that today has a very low success rate. That's because people need different kinds of information: even when you're trying to extract data from the same website as someone else, you might want a different type of data. So what we believe is the right approach right now is to automate every step of the process individually, and then, over time, as we really perfect those little automations here and there, merge them together; through that, we'll be able to automate the full process. That's the approach we're taking. In every step, from selecting the data on the website you want to extract, to naming the robot, to making sure the robot keeps functioning properly over time, we're using AI. But it doesn't do all the work for you; it's more like an assistant that's with you and saves you time. Its first priority is to get you exactly what you're looking for, and its second priority is to make it as easy as possible, not the other way
around. So, I do see a question here that ties into something we get asked, and that you could answer much more eloquently, Ry: has there been much negative feedback from target websites? I guess it goes into the question of the legality of web scraping and all that. Yeah, it is a controversial topic, but something we focused on from day one was public data. We do support extracting logged-in data, but it's meant to be used on data that you own. For example, if you want to extract data from one of the tools you use that hosts your data but doesn't give you an easy way to export or integrate that data, you're more than welcome to use Browse AI for that. But our primary focus is public data. We also really limit access to extracting data that could be sensitive or in a gray area, for example health data or personally identifiable data; that's the kind of information we don't want to focus on extracting. And we have policies in place: if you look at our website, we have policies on what kind of data you should be extracting and what kind of data you
shouldn't. I don't see any other questions. I remember a question from the last webinar where someone asked if you could take a screenshot of a website, and yes, you can. That's one of the features I didn't demo, but screenshots are typically part of the standard setup of most robots, just taking a screenshot of the page. You can also select an area of a website that you'd like a screenshot of, either on its own or in addition to a list. Also, plain text: you can get text that's not part of a list, which is sometimes handy if there's some piece of information on the page outside of the list, like the title of the page, the number of results you care about, or what was searched. So you can copy text from the website outside of just list data as well. Yeah, and I would also say: if you just want to open a website and grab a screenshot, we have a pre-built robot for that, so you don't have to train a robot. You just create that robot from the library of pre-built
robots. That's a good point, yeah; it's much easier. Also, on pre-built robots: if you don't see one and you would like one, please let us know about that too. We do have quite a few, but we can't cover them all, and user feedback is a big factor in deciding what people would love to have pre-built rather than build themselves. I'll give it another second here for some more questions to trickle in, and
otherwise. Well, I'll just put this up now. There is one more thing. Oh, Ry, you had something? No? Just one more thing: how about a promo code? Why not. If you use the code 'webinar 20', you can save 20% off an annual subscription for the first year. It'll be valid for the next 48 hours, as a special thank-you for sticking with us, for tuning in, and for your questions and attention, so please do take advantage of that. It's a real code; get yourself a deal. Outside of that, thanks so much for attending the webinar. If you have any feedback, again, we'd love to hear it, because we want to do more of these and make sure they're valuable, not just putting stuff out there that we think people care about. What else do you want to see? So thank you again for tuning in. The recording will be sent out, probably today or tomorrow. Don't forget the promo code, and we'll see you next time.