Lutra AI Blog

How to extract data from websites to spreadsheets automatically.

Written by Lutra | June 11, 2024

In this guide, we'll show you how to use Lutra to extract data from websites into Google Sheets.

 

How is Lutra different from traditional web scraping??

Lutra uses AI for data extraction, making it far more versatile than traditional web scraping tools. Traditional web scrapers rely on fixed rules and patterns to collect data from websites, which can break if the site's structure changes. In contrast, Lutra's AI can understand and adapt to various website layouts and extract meaningful information from unstructured data. This means Lutra can handle a wide range of websites and data formats.

 

Example: Extracting Zillow Data into a Google Sheet

We'll demonstrate how to extract real estate data from Zillow listings and automatically add it to a Google Sheet.

Imagine you're a realtor juggling multiple properties for sale. You need to keep track of all the listings, their prices, and updates to ensure you provide accurate and timely information to your clients. Whether you're representing buyers or sellers, having an organized system is crucial. By extracting Zillow data into a Google Sheet, you can easily monitor market trends, price changes, and new listings.

From a Zillow listing page, you might want to capture the following data: Price, Zestimate, Beds, Baths, Sq Ft (Home), Sq Ft (Lot), Type (Single Family, Condo, ...), Built When, Days on Zillow, Num Views, Num Saves, Listing Agent Name, Listing Agent Contact, Listing Agent Agency.

 

Step 1: Create a Google Sheet with well-formatted headers

First, let's set up our Google Sheet to capture the data you need. Create a new sheet with headers that match the information from Zillow listings. Here’s a list of headers you might use: Zillow URL, Price, Zestimate, Beds, Baths, Sq Ft (Home), Sq Ft (Lot), Type (Single Family, Condo, ...), Built When, Days on Zillow, Num Views, Num Saves, Listing Agent Name, Listing Agent Contact, Listing Agent Agency

Here’s an example spreadsheet that has already been set up: Google Sheet Example. You can make a copy of it to follow along.

We will create a workflow that will visit each Zillow URL, extract the data from the page, and then update the spreadsheet with the data extracted.

Note: Lutra uses AI for data extraction. The AI works best when the data you need to extract is clear and specific. For example, if you want to extract “Sq Ft (Home)” from a page that also has “Sq Ft (Lot)”, it's better to extract both. This approach guides the AI to identify and separate the two numbers correctly, as it reasons through the context to distinguish between them.

 

 

 

Step 2: Populate the spreadsheet with Zillow listings you care about

Add the Zillow listing URLs to the “Zillow URL” column in your spreadsheet. You can use Lutra to generate the listings in a separate workflow, or use an existing set of URLs you would like to track.

 

Step 3: Create a new workflow on Lutra

 

Next, let's use the "Extract Zillow Data to Spreadsheet" template.

Next, you need to connect the newly created spreadsheet with the workflow. We have an option to make a copy of Lutra's template sheets but for now, we want to use 

When working with spreadsheets, Lutra will request looking at the spreadsheet before it makes a plan of action. You may need to give Lutra authorization to securely access your Google Sheets.

 

After Lutra has permissions, you need to specify the Google spreadsheet to use and then click "Create workflow" to continue.

 

Next, Lutra will convert the instructions into a detailed plan of action, and explain to you how it intends to achieve your goal with the tools it has access to. You can review these instructions and make modifications as necessary.

Click on “Looks Good” to let Lutra know to proceed.

 

Step 4: Run the Workflow

Next, Lutra will implement the plan, creating a workflow just for your task. Behind the scenes, Lutra is writing software for your task and generating a custom form that you can use to configure settings when you run it.

 

 

Congratulations! Your spreadsheet is now updated with the summaries!

 
 

 

How to extract data from a website to Excel

If you prefer working with Excel, you can easily export your Google Sheets with extracted data to Excel. Here's how:

Open your Google Sheet:

  1. Click on “File” in the top menu.
  2. Select “Download” from the dropdown menu.
  3. Choose “Microsoft Excel (.xlsx)” from the list of formats.
 
 
Save the file: Your browser will download the file. Save it to your preferred location on your computer.
Now, you have your extracted data in an Excel file, ready for further analysis or sharing.