Skip to content
All posts

How to extract data from websites to spreadsheets automatically.

In this guide, we'll show you how to use Lutra to extract data from websites into Google Sheets.

 

How is Lutra different from traditional web scraping??

Lutra uses AI for data extraction, making it far more versatile than traditional web scraping tools. Traditional web scrapers rely on fixed rules and patterns to collect data from websites, which can break if the site's structure changes. In contrast, Lutra's AI can understand and adapt to various website layouts and extract meaningful information from unstructured data. This means Lutra can handle a wide range of websites and data formats.

 

Example: Extracting Zillow Data into a Google Sheet

We'll demonstrate how to extract real estate data from Zillow listings and automatically add it to a Google Sheet.

Imagine you're a realtor juggling multiple properties for sale. You need to keep track of all the listings, their prices, and updates to ensure you provide accurate and timely information to your clients. Whether you're representing buyers or sellers, having an organized system is crucial. By extracting Zillow data into a Google Sheet, you can easily monitor market trends, price changes, and new listings.

From a Zillow listing page, you might want to capture the following data: Price, Zestimate, Beds, Baths, Sq Ft (Home), Sq Ft (Lot), Type (Single Family, Condo, ...), Built When, Days on Zillow, Num Views, Num Saves, Listing Agent Name, Listing Agent Contact, Listing Agent Agency.

zillow-ss

 

Step 1: Create a Google Sheet with well-formatted headers

First, let's set up our Google Sheet to capture the data you need. Create a new sheet with headers that match the information from Zillow listings. Here’s a list of headers you might use: Zillow URL, Price, Zestimate, Beds, Baths, Sq Ft (Home), Sq Ft (Lot), Type (Single Family, Condo, ...), Built When, Days on Zillow, Num Views, Num Saves, Listing Agent Name, Listing Agent Contact, Listing Agent Agency

Here’s an example spreadsheet that has already been set up: Google Sheet Example. You can make a copy of it to follow along.

We will create a workflow that will visit each Zillow URL, extract the data from the page, and then update the spreadsheet with the data extracted.

Note: Lutra uses AI for data extraction. The AI works best when the data you need to extract is clear and specific. For example, if you want to extract “Sq Ft (Home)” from a page that also has “Sq Ft (Lot)”, it's better to extract both. This approach guides the AI to identify and separate the two numbers correctly, as it reasons through the context to distinguish between them.

 

zillow-listing-ss-empty

 

 

Step 2: Populate the spreadsheet with Zillow listings you care about

Add the Zillow listing URLs to the “Zillow URL” column in your spreadsheet. You can use Lutra to generate the listings in a separate workflow, or use an existing set of URLs you would like to track.

zillow-listing-ss-url-filled

 

Step 3: Create a new workflow on Lutra

create-new-workflow

 

Next, let's use the "Extract Zillow Data to Spreadsheet" template.

extract_zillow_1

Next, you need to connect the newly created spreadsheet with the workflow. We have an option to make a copy of Lutra's template sheets but for now, we want to use 

extract_zillow_2

When working with spreadsheets, Lutra will request looking at the spreadsheet before it makes a plan of action. You may need to give Lutra authorization to securely access your Google Sheets.
extract_zillow_3

 

After Lutra has permissions, you need to specify the Google spreadsheet to use and then click "Create workflow" to continue.

 

extract_zillow_4

Next, Lutra will convert the instructions into a detailed plan of action, and explain to you how it intends to achieve your goal with the tools it has access to. You can review these instructions and make modifications as necessary.

Click on “Looks Good” to let Lutra know to proceed.

extract_zillow_5

 

Step 4: Run the Workflow

Next, Lutra will implement the plan, creating a workflow just for your task. Behind the scenes, Lutra is writing software for your task and generating a custom form that you can use to configure settings when you run it.

extract_zillow_6

 

 

Congratulations! Your spreadsheet is now updated with the summaries!

zillow-listing-ss-complete
 
 

 

How to extract data from a website to Excel

If you prefer working with Excel, you can easily export your Google Sheets with extracted data to Excel. Here's how:

Open your Google Sheet:

  1. Click on “File” in the top menu.
  2. Select “Download” from the dropdown menu.
  3. Choose “Microsoft Excel (.xlsx)” from the list of formats.
 export-spreadsheet
 
Save the file: Your browser will download the file. Save it to your preferred location on your computer.
Now, you have your extracted data in an Excel file, ready for further analysis or sharing.