Effortlessly Scrape Data from Websites using Power Automate and Power Apps
Summary
TLDRIn this tutorial, the presenter demonstrates how to scrape exchange rate data from a website using Power Automate cloud flow. They extract table data, convert it into a JSON array, and showcase how to utilize this data in various applications like Power Apps, Excel, or Dataverse. The process involves using HTTP actions, data manipulation expressions like substring and nth index of, and concludes with displaying the scraped data in a Power App gallery. The video is a practical guide for those interested in web data extraction and integration.
Takeaways
- 🌐 The demonstration focuses on scraping data from a website using Power Automate cloud flow.
- 📊 The example data is a table of exchange rate information which is extracted and converted into a JSON array.
- 🔄 The JSON array can be utilized for various purposes such as updating Excel tables, adding items to a list, or saving records to Dataverse.
- 🔧 The process involves using HTTP action to access the website and retrieve HTML content.
- 🔍 The script uses 'indexOf' and 'nthIndexOf' to locate the specific table containing the exchange rate data within the HTML.
- 📖 The data is extracted as a substring between the opening and closing tags of the table.
- 🔄 The extracted HTML table is converted to XML and then to JSON for easier manipulation and use.
- 🛠️ The JSON data is simplified using 'select' and 'compose' actions to extract only the necessary information.
- 📱 The final step involves passing the JSON array to a Power App where it can be displayed in a gallery or used for other purposes.
- 🔗 The demonstration also covers updating the Power Automate flow to respond to a Power App trigger and creating a JSON schema for the response data.
Q & A
What is the main focus of the demonstration in the video script?
-The main focus of the demonstration is to show how to scrape data from a website using a Power Automate cloud flow, specifically targeting a table of exchange rate data.
What is the final format of the scraped data in the demonstration?
-The final format of the scraped data is a JSON array, which can be used for various purposes such as adding data to a table in Excel, a list, or saving as records to Dataverse.
How does the presenter plan to use the scraped data in their scenario?
-In the scenario presented, the scraped data is planned to be passed to a Power App and displayed in a gallery.
What is the first step the presenter takes in building the flow for scraping data?
-The first step in building the flow is to use a manual trigger and an HTTP action to access the website via a GET request and retrieve the HTML content.
Why does the presenter use the 'nth index of' expression in the flow?
-The presenter uses the 'nth index of' expression to find the second occurrence of the opening table tag in the HTML content, as the data of interest is within the second table on the webpage.
How does the presenter extract the table data from the HTML content?
-The presenter extracts the table data by finding the positions of the opening and closing table tags, then using the 'substring' expression to get the data between these tags.
What conversion does the presenter perform on the extracted table data?
-The presenter converts the extracted table data into XML first and then into a JSON array for easier manipulation and use in Power Apps.
What Power Automate expression is used to simplify the JSON array data into a more usable format?
-The presenter uses the 'select' action in Power Automate to simplify the JSON array data by looping through each object and extracting the desired values based on column headers.
How does the presenter update the Power Automate flow to respond back to the Power App?
-The presenter updates the flow to use an HTTP response action to send the data back to the Power App, and creates a JSON schema based on the output of the 'select' action.
What is the method used in the Power App to trigger the flow and display the scraped data?
-In the Power App, the presenter uses a button to trigger the flow and populates a gallery with the data by using a collection that is created as a result of running the flow.
Outlines
🌐 Introduction to Web Scraping with Power Automate
The speaker introduces a tutorial on how to scrape data from a website using a Power Automate cloud flow. The example focuses on extracting exchange rate data from a table on a website and converting it into a JSON array. The potential uses of the data once converted are discussed, such as adding it to an Excel table, a list, or saving it as records in Dataverse. The speaker encourages viewers to watch the demonstration if interested and reminds them to like and subscribe.
🔗 Setting Up the Power Automate Flow
The tutorial continues with the setup of a Power Automate flow that starts with a manual trigger. The speaker plans to convert this into a PowerApps trigger for practical use but uses a premium connection for initial testing. The HTTP action is used to access the website's data via a GET request, and the URL is inputted into the flow. After saving and testing, the HTML of the webpage is retrieved and inspected for the table data. The speaker demonstrates how to find the table's opening and closing tags and extract the data between them using the 'substring' expression.
🔄 Converting HTML to XML and JSON
The speaker explains the next steps in processing the extracted HTML table data. The data is first converted into XML to facilitate easier conversion into JSON format. The XML conversion is done using the 'xml' expression, and then the JSON conversion is performed using the 'json' expression. The speaker then demonstrates how to extract the table rows from the JSON array and prepare the data for further use, such as sending it to a PowerApp or creating new list items.
📱 Integrating with PowerApp and Displaying Data
The final part of the tutorial involves updating the Power Automate flow to use a PowerApps trigger and responding back to the PowerApp with the scraped data. The speaker shows how to create a JSON schema based on the data's structure and use the HTTP response action to return the data to the PowerApp. In the PowerApp, a button is used to trigger the flow, and a gallery is populated with the data retrieved from the website. The speaker concludes the demonstration by showing the real-time updating of the data in the gallery and encourages viewers to like and subscribe for more content.
Mindmap
Keywords
💡Power Automate
💡Cloud Flow
💡JSON Array
💡Power App
💡HTTP Action
💡Table Tag
💡XPath
💡Substring
💡XML
💡Dataverse
💡Gallery
Highlights
Introduction to scraping data from a website using Power Automate cloud flow.
Focus on extracting exchange rate data from a table on a website.
Demonstration of pulling data into a JSON array for use in Power Apps.
Explanation of the versatility of using data arrays for various applications.
Step-by-step guide to building a flow from scratch to scrape data.
Use of HTTP action to access the website and retrieve HTML content.
Identification of the specific table containing the desired data.
Technique to find the nth occurrence of a tag using 'indexOf' and 'nthIndex'.
Extraction of the table's HTML content using substring and dynamic values.
Conversion of the HTML table into XML for easier data manipulation.
Transformation of XML data into a JSON array for Power Apps compatibility.
Accessing the table row objects from the JSON array for data extraction.
Use of 'select' action to loop through objects and extract specific values.
Creation of a simplified data array based on the table's column headers.
Integration of the scraped data into a Power App using a button trigger.
Displaying the scraped data in a Power App gallery for real-time updates.
Overview of the practical applications of the scraped data, such as updating Excel tables or lists.
Conclusion of the demonstration with a summary of the key learning points.
Transcripts
hi there folks in today's demonstration
i'm going to show you how we can scrape
data from a website using a power
automate cloud flow
so in today's example i'm actually going
to look at a table of exchange rate data
i'll then pull that from the website
into a json array which and then in my
scenario going to pass to a power app
and display in a gallery
now of course because the data is in an
array you could use it to add data to a
table in excel or maybe add some items
into a list or indeed save as records to
dataverse plenty of options with the
data once it's converted to an array
if that's something that interests you
please make sure you watch on if you
haven't already make sure you like and
subscribe and without further ado let's
jump into the demonstration
so here you can see the table in
question on our website here we've got
three columns of data
with the currency and then two different
values
i'm going to pull this across into a
flow which i'm going to build from
scratch today
now if i start with my flow you can see
that i have the manual trigger
and i will eventually convert that into
the powerapps trigger but for now for
testing i'm going to kick things off
with a premium connection so we do need
to use the http action and that will
allow us to access that website
via the get and then i can paste the url
that was keen to get out there into this
parameter here
now if i go ahead and save this and test
it we'll be able to see that just from
this quick action that we've added we're
able to get all of that html back into
our flow
so open that up we can have a look at
the body here and here we have all the
html so if i jump back onto that website
and want to right click and go to view
page source
i can then do a search for the table tag
which is what i'm going to be doing
today in power automate
so you can see that i have three
potential opening table tags there if i
jump down onto the second one and maybe
start scrolling across a bit hopefully
we'll start seeing some data that's uh
recognizable so i see at least one of
the columns there we've got monet
and we've also got h a h
um my french is not great so apologize
but definitely we're pulling through
some of those values we've got the
currency for euros and this is the the
string of data that we're going to look
for in our flow
so how do we get that well we're going
to have to use compose several times
throughout this solution
and i'm going to use index of and of
course that is going to allow me to
return
the
position of a string so i'm going to
look at that body and i want to return
that opening tag of table
but
i just did a search there on the page
source and it wasn't the first table it
was actually the second table
so i'm going to use the expression nth
index of which will allow me to find the
nth occurrence and in this case if i put
in a comma i want to find the second
occurrence of that tag
now i've highlighted that expression i'm
going to copy it and say okay
we'll rename this action to the opening
tag
and then i'm going to create another
compose to find the closing tag
and that's why i've copied that
expression
so closing tag if i jump into the
expression builder here and paste that
in
rather than this opening tag i now want
to find the closing tag we just got the
forward slash in front of it there and
if i hit okay we now have the position
of the second occurrence of the opening
tag and the second occurrence of the
closing tag
with the next compose we can go and get
that sub string
and we're going to use the expression
substring to do that
so in terms of dynamic values we want to
check that body for this particular
substring
and the substring expression looks for a
starting index which is our opening tag
so we can pick there the opening tag
and then we're looking for a length
now if the closing tag is at position
500 and the opening tag is at position
400 then the length is going to be the
closing tag minus the opening tag so we
can use sub
open close brackets and we're going to
insert the closing tag a comma
and then the opening tag and all being
well if i hit ok that should update and
save so that will get us our table
string
i'll call that table
i'm going to go ahead and save and test
and there's one thing to note at this
point the closing tag is the position of
the beginning of the strings the
beginning of that closing tag but i do
need to have everything up into the end
of that closing tag and if i expand this
compose action here we can see that
whilst i've got the opening tag if i go
all the way to the end i'm missing the
closing tag
so
simple simple step all i need to do is
just type in that closing tag and that
completes off our table or our html
table
next step is all about converting it
into xml because if we convert it into
xml i can convert it a lot easier into
json so another compose i'm going to
rename that as xml
and it's quite simple i type in the
expression xml open close brackets and
select the output from that compose
table and say okay
now that it's an xml
i can easily convert it into json
and i can do that using the json
expression so type in json open and
close brackets
and if i
insert that compose xml and hit ok i'm
just going to rename this as well so
that i know this is my my json array
if i go ahead and save and test that we
can have a look at the output of both
the xml and the json array action
now i've got a video on doing this in
more detail more complex situations but
this is a relatively straightforward
situation for converting it into xml and
then into
json so we can see now we have our json
array and of course to expand this
you can see that the table tag is now a
key for one of the objects and we've got
a table header
array here with the column names
and then the important bit is this table
row array because you can see now that
we have these repeating objects
that have
the currency
and then these two values that we're
looking to select
and so for sending this back to
powerapps or doing anything with it to
be honest if i'm wanting to
create a new list item i want to
simplify this data quite significantly
to be honest
so
i'm going to copy this
object
and i'll open it up in notepad plus
shortly to have a look at it but the
first thing i want to do is to get
access to this table row object so i can
get access to this full array
and if i look at the path i can see that
the path will be table and then follow
this line down is tr
so go back into edit
and i'm going to go into compose
and then i want to create a new
expression and a little tip here
i want to get the expression for this
action here i'm going to go into the
expression tab type in the number one
pick that compose json array and then
just
get rid of that one
and then
because we looked at that array a minute
ago i want to get the table and then the
tr object and hopefully if i say okay to
that and hit test it should return to me
all those objects with the row data
so test that and have a look
and we can see that we now have if i
expand that all of the individual
objects containing each of those rows
so this is where we can use a select and
the select will allow us to then pick
out
the
header we've got here for the currency
and then those two values and so a
select will let us loop through each of
these objects individually and grab
those values
so if i go back into edit and i'm going
to use my select action
and if we insert that compose as the
input
we then need to define our map so we
need those three column headers so i'm
going to jump back onto the website and
just grab
these column headers of course i could
call them anything that i want to be
honest but i'll just go with
the values that are on the website for
now
i'm not even going to try and pronounce
it because i cannot speak a word of
french
we'll grab that last one there
and chuck that one in and just tidy this
up with a few extra return lines and if
i bring across notepad plus and have a
look at one of these objects here
you can see that first of all we need to
get into this td key and then we need to
get into this p key in order to get the
euro string but it is actually different
for the other two objects beneath
so if you know how to access objects
within an array they're called by
integer indexes so this is integer index
zero this is one this is
two
and we want to loop through all of these
in our select so what we do here in the
expression tab is to type in item
question mark
and then we need to think about what we
want to retrieve from this object
so item is going to return
everything
we need to get into this td
key so if i type in now in single quotes
td
and then we want to get into the first
object in the next array so thinking
about inter integer indexes that will be
zero so i can put in question mark
and then in brackets zero
and then after that if i put another
question mark and jump back onto this
notepad plus
i want to get into the value p
for this particular one here so all i
need to do is square brackets single
quotes and the letter p so i'm going to
copy that
and think about the next one so the next
one if i paste that in and bring up the
notepad
if i was to return p i would get
everything in this object
but i want to get the text
so i can copy that key name and i've got
to remember also that we've got
several objects so i'm no longer an
object zero i'm in an object one
so if i go back here i need to change
this to object one and rather than it
just being p it needs to be p text so i
can use a forward slash
text the other option would be to put
another question mark and put the text
value in square brackets
so if i copy that say okay we'll go into
this one here i can paste that in all i
need to change here is the value 1 to
value 2 because we're now into the last
object here 0 1
and just just to demonstrate if i was to
remove this text here
i could put in the question mark and the
hash text so this would be the
alternative in single quotes if you're
not familiar using the forward slashes
so if i say okay and save that and test
that that should hopefully get us
a nicely repurposed array of data
based on those three columns
so it's run okay if i go to the select
here we go we can see we've got the
euros starlings dollars
etc so all those values have been nicely
pulled through
and based on your requirement you can
either now save that to excel into a
list dataverse etc but for today i'm
going to pass this back to my power nap
so for the power app the first thing i
need to do is i need to update the
trigger so if i delete that trigger
i can then go and select the powerapps
triggers and there's one and two we're
gonna go with version two even though it
doesn't really matter version two has
the improvements for the input
parameters we don't need that today but
i prefer the version two
and then in terms of response
whilst there is a respond to
power naps or flow that you see here it
will only currently allow you to return
a string there is a new feature that has
come out this week it's not yet reached
my tenant sadly that does allow you to
parse strings in
power apps to create an array when that
comes out i'll maybe do a quick video
for today's video because we're already
in the premium actions i'm going to use
the http response
which is here it's a premium action
and this allows me to respond back to
the powerapp with the data from the
select
so the data is like i mentioned from the
select we do need to however create this
json schema and the way we do that is
from running the flow
so i go ahead and test that again
and this should run the flow and then we
can jump into the select action and copy
the output to create our schema
so highlighting all of this we could do
a ctrl a and ctrl c
back into edit i can then go into
generate from sample and insert my
payload and say done and you'll see that
i have a nicely created schema here that
now will allow this action to return the
data back to powernaps
so go ahead and give that a save i also
better give it a more meaningful name or
i'll never find it
to scrape
data
from web
hit the save button i'll jump onto my
power app so with the powerapp i'm just
going to use a button today to run the
flow you could do it
via
the unvisible property of your screen
and i'm going to populate a gallery so
if i go and insert my vertical gallery
and i need to attach my flow to my power
app so if i go into the power out
automate button here on the left hand
side go to add flow and go and search
for this new solution scrape data from
web so that's just attaching that flow
into my power app
and then once that's done if i go to my
button
i should be able to
find that new
expression for that flow so pressing
that button now will run that flow but i
want that the results to be in a
collection so i can clear collect
i can type in my collection name which
can be my scraped
data
or scrapped data i've called it
scraped data i need to put the closing
bracket at the end there so that keeps
that expression happy and then i need to
update my gallery so that it's now using
that collection which is created as a
result of running my flow
so if i put it into play mode and hit
the button hopefully it will bring
through the data which is fantastic
and then if i want to display other
fields i can control c and control v in
order to create just a duplicate text
field which we can see here on the left
hand side
and then i could change the expression
here to return the other value
and there we go so we have scraped the
data from the website
returned it into a json array
then we've passed that onto our powerapp
into a gallery and that we can now see
that data so this will be
real time if this website gets an update
and these prices change if we were then
to trigger that flow it would of course
trigger this http action pull the data
as it is find the opening closing tag
get the table converted xml into json
and then we can start building out our
new select
action to create our nicely repurposed
array of data
so that marks the end of the
demonstration uh plenty to take in there
again and some good use of some of the
data operations select and item and
substring in index of or nth index of if
you're looking to find a particular
string or occurrence of a string
if you haven't already please make sure
you like and subscribe and hope to see
you again sometime soon thanks very much
for watching cheers
تصفح المزيد من مقاطع الفيديو ذات الصلة
How To Automatically Add Excel Data to a SharePoint List with Power Automate
This AI Agent can Scrape ANY WEBSITE!!!
Using QuerySurge to Test & Validate Data in Microsoft Power BI
Always Check for the Hidden API when Web Scraping
Can’t INPUT DATA in Power BI? Here is a WRITE BACK Option with Power Apps!
[Eurostat 1/1] Get Europe's data in Power BI with Eurostat API - @EU_Eurostat
5.0 / 5 (0 votes)