Web Scraping in Power Automate for desktop
Intro The Small Bookstore wants your help to scrape the Amazon book prices, so they better can compete. What you will learn Client Name The Small Bookstore Process Name WebScrapeAmazon Description The local book store wants to battle mighty Amazon, and you’re helping them automatically compare prices. So here we scrape Amazon book prices to […]
Table of Contents
Intro
The Small Bookstore wants your help to scrape the Amazon book prices, so they better can compete.
What you will learn
- Primarily:
- Web scraping
- Secondarily:
- Dynamic web scraping
- Excel
- Regex
Client Name
The Small Bookstore
Process Name
WebScrapeAmazon
Description
The local book store wants to battle mighty Amazon, and you're helping them automatically compare prices. So here we scrape Amazon book prices to use them in the comparison later.
- Download the course material here
- Read the Excel book called data
- Set the active Excel worksheet to Topics
- Do an Amazon search for each of the topics in the book section (pick 'Books' from the drop-down)
- Scrape the results (Title, Author, and Price) into a new Excel sheet with the topic name as the sheet name
- Close (and save) Excel
- Advanced:
- Create a loop that scrapes the first three pages of search search
- Can we tune the Excel results? Hint: Regex
Hints
- Launch Excel
- Open the Data Excel book
- Launch new Chome
- Set active Excel worksheet
- Topics
- Get first free column/row from Excel worksheet
- Read from Excel worksheet
- For each
- Go to web page
- Add new worksheet
- Name it %Topic%
- Set drop-down list value on the web page
- Books
- Populate text field on web page
- %Topic%
- Click link on web page
- Extract data from web page
- Write to Excel worksheet
- Close excel and save
- Advanced:
- Add headers
- Before the For each
- Set active Excel worksheet
- Headers
- Copy cells from Excel worksheet
- We're copying the headers
- Set active Excel worksheet
- In the start of the For each
- Add new worksheet
- Topic
- Paste cells to Excel worksheet
- Add new worksheet
- Before the For each
- First three pages
- Loop (1 to 3)
- Get first free column/row from Excel worksheet
- Move the Extract data and the Write to Excel in the loop since we want to do it three times
- Click link on web page
- The next button
- Loop (1 to 3)
- Regex
- Use Regex in the Extract Data to extract the author
- Add headers
((?<=^by ).*(?= \|))|((?<=\| by ).*(?= \|))