Find Deals for Puerto Rico Real Estate | API
So how do we analyze deals in Puerto Rico? How do we get the data for these listings? Well, in this video, I'm going to give you a step-by-step guide to get short- and long-term real estate listings in different cities within Puerto Rico.
Ariel Herrera 0:00
How to find deals for Puerto Rico real estate. Puerto Rico has had one of the hottest real estate markets in history. Americans have chosen Puerto Rico as a location to conduct business, vacation, and even establish permanent residence. The major draws of Puerto Rico include tax incentives, cheaper real estate, and year-round tropical weather. So how do we analyze deals in Puerto Rico? How do we get the data for these listings? Well, in this video, I'm going to give you a step-by-step guide to get short- and long-term real estate listings in different cities within Puerto Rico. We will query an API using Python. My name is Ariel Herrera, fellow data scientist with the Analytics Ariel channel, where we bridge the gap between real estate and technology. I am passionate about providing data solutions, so follow my channel to get the latest content on real estate analytics. And stay tuned to the end, where I provide you insights on how to visualize the data using Kepler maps. All right, let's get started.
Right now I'm on Clasificados Online. This is where people can buy and sell goods in Puerto Rico, similar to Craigslist. And here we have information on real estate that's for sale, as well as short-term and long-term rental listings. Now you may be thinking, why are we looking at this site and not at Zillow? Well, Zillow does have listings that are from brokers, but sometimes those can be outdated. As well, if we want to get a deal and work directly with sellers, or potentially get something off market, this could be a great way to do so. I highly suggest you check out my prior video on how to actually search for listings on this site. What we're going to do is use an API to extract the listing data, so we can analyze it in a spreadsheet. The API we're going to be requesting data from is Puerto Rico Real Estate, and this has been created by Analytics Ariel alongside a talented web scraper. In order to get started, it's important for us to just know quickly what this API is about and how we can get data. Then we'll jump into the Jupyter notebook. So on the left-hand side, we have our endpoints. Think of endpoints as folders where we can go and extract data. There are two different folders here: we have for-sale property data, and we have rental property data. Within this tutorial, I'm going to show you how to get for-sale listings as well as rental listings. To get started, use the link below to find the Puerto Rico Real Estate API. Next, you're going to sign up for a free RapidAPI account. RapidAPI is a site that has a plethora of APIs available. Instead of having to sign up for each API individually, all you need to do is sign up for RapidAPI once, for free, and you can use other APIs within the platform. Once you sign up for RapidAPI, subscribe to this API here. For this API, we provide 20 free searches per month. You could also go to a higher plan if you want to analyze deals across, say, the whole island of Puerto Rico.
On the right-hand side of the API page, we can see code snippets. Code snippets allow us to quickly copy this code, paste it into our own notebook, like Google Colab, and run it to get data. We also have example responses, where you can see what type of data returns from the API. In this case, we're going to get information like street address, bedrooms, bathrooms, list price, the town, image, location, and URL of the property. So let's jump right into the Python notebook. Here I'm in Google Colab. Google Colab is free to use; all you have to do is set up a Gmail account. Use the link below to open up this file and save a copy to your drive, then you can run the same code that I'm running. So here in this tutorial, we're going to get properties for sale and for rent from the Clasificados website. You can read more about the background and example use cases for this data, but let's jump right into it. So the first step is we want to install packages. We're going to be using Kepler to be able to visualize where our listings are. You can either press play here, or you can press Ctrl+Enter to run the cell. Once the library is installed, we're going to import the remaining libraries that we're going to be using, so click play here. This includes some libraries that are specific to Google Colab, data wrangling, and visualization. Next, we have several functions here. In order to run these functions all at once, we can go down a bit more. I know it's a lot of code, but I'll be walking through what the overall theme of these functions is. Now, at locals and constants, we can click on the cell and select Runtime, Run before. This is another option if we want to run multiple cells without having to click each one individually. Now we can see that there's a checkmark on the left-hand side, so that has completed. Now for our locals and constants. This part is optional. The goal here is that we want to have our RapidAPI key.
This is a unique key that will allow us to connect to the data. This key is found on the RapidAPI site. There are two locations: in the middle, where it says header, you can locate your RapidAPI key, and on the right-hand side, when you look at code snippets, it's there as well. For me, I have all my API keys stored in a file called API keys. I'm connecting to my Google Colab folder, then I'm reading API keys using pandas, and where that column says rapid, I go grab my key. If you want to skip this part, that's okay. You can just comment this out and then enter your API key here as a string. I'm going to run these two cells. When I run the cell, Google asks to connect to Google Drive. So I'll connect, select my name.
And click Allow. After, I can run the bottom cell to read in my file and get my API key. Great. So now we can go into my favorite part, which is actually getting this data: the listings that are for sale and the short-term listings that are on the website. We're going to extract this automatically using an API and the programming language Python. So as a first step, we're going to work with for-sale properties. Our first example is selecting one single city and searching for all of its listings. There are several ways we can find all of the cities. For one, we can go back to Clasificados and go to for sale, and here we can see all the options of different cities we can search data for. In addition, if we go back to the top and scroll a little bit down, we can see in useful resources Puerto Rico towns. Here I have a Google Sheet that actually gives you all of the cities we can search for, as well as each one's population, how large it is, and when it was founded. To start, I'm going to select a very popular city, San Juan, and specifically Condado and Miramar. Now, to query the actual API, I already have it all wrapped within a function, so it's super easy to use. We're going to pass in our RapidAPI key, our town ID, and our range of low price and high price. So in this example, I'm looking for listings that are between $5,000 and $500,000. Now I can click run, and we can see here that our API request was successful. In the next cell, I can view what this response looks like. I can see here that there are some street addresses, information on listings, bathrooms, but this is very hard to read. So how do we actually get this into a table format? Well, if we go down here, we can view some of the attributes of our response first. So let's transform our response into a JSON object, and we can look at all the keys that are available. So let's select play here. We can see that the keys available are status, properties, and meta. Likely, we want to get our data from properties.
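The request and the first look at the response can be sketched roughly as below. This is a minimal illustration, not the notebook's actual function: the host name, the `/forsale` path, the parameter names, the town ID string, and the sample payload are all assumptions. The real values come from the code snippet shown on the API's RapidAPI page. Only the `X-RapidAPI-Key` and `X-RapidAPI-Host` header names and the three top-level keys (status, properties, meta) are taken as given.

```python
import json

# Hypothetical host: copy the real one from the RapidAPI code snippet.
API_HOST = "example-puerto-rico-real-estate.p.rapidapi.com"


def build_request(api_key, town_id, low_price, high_price, page=1):
    """Return (url, headers, params) for one page of a for-sale search.

    The path and parameter names here are placeholders for illustration.
    """
    url = f"https://{API_HOST}/forsale"
    headers = {"X-RapidAPI-Key": api_key, "X-RapidAPI-Host": API_HOST}
    params = {
        "town_id": town_id,
        "low_price": low_price,
        "high_price": high_price,
        "page": page,
    }
    return url, headers, params


def summarize_response(body_text):
    """Parse a JSON response body and report its top-level structure."""
    payload = json.loads(body_text)
    return {
        "keys": sorted(payload),
        "n_properties": len(payload.get("properties", [])),
        "meta": payload.get("meta", {}),
    }


# Made-up response body with the three top-level keys named in the video.
sample_body = json.dumps({
    "status": "ok",
    "properties": [{"street_address": "Sample St", "list_price": "$250,000"}],
    "meta": {"current_page": 1, "max_page": 2},
})

url, headers, params = build_request("YOUR_KEY", "sample-town-id", 5000, 500000)
summary = summarize_response(sample_body)
print(summary["keys"], summary["n_properties"], summary["meta"])
```

With the `requests` library installed, the tuple from `build_request` could be passed straight to `requests.get(url, headers=headers, params=params)`, and the response's `.text` handed to `summarize_response`.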
But what's important about some of this information that's received is that we can see current page, max page, and listing URL. Let's click the listing URL. This is the page that's being scraped on the back end and served up as an API. We can see all the listings, 59 in total, that met our requirement of being between $5,000 and $500,000 within San Juan Condado. And if we click the first one, we can also see more information, such as square footage. Now, what this is telling us is that we just scraped the first page, but there are two pages available, since there are 59 listings in total. So how do we get data for all 59? Well, in the next step, we can use the same exact function, except we set all pages to true, and this will get data for all pages. This will also be two API requests, since there are two pages in total. Let's select run. Now we can see that our API request was successful. In only three seconds, we got all pages. The reason it took three seconds is that we pause between requests, since the API has a limit of two requests per second when using the basic plan. Now, to view this data in a data frame, we can run the next cell.
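The all-pages behavior described above can be sketched as a small loop that reads the max page from the first response and pauses between calls. This is an illustration under assumptions: `fetch_page` is a stand-in callable rather than the notebook's real request function, and the `max_page` key name and the half-second pause are guesses chosen to stay within two requests per second.

```python
import time


def fetch_all_pages(fetch_page, pause_seconds=0.5):
    """Collect every page of results.

    fetch_page(page_number) must return a dict shaped like
    {"properties": [...], "meta": {"max_page": N}} (assumed layout).
    Each page is one API request, so two pages cost two requests.
    """
    responses = [fetch_page(1)]
    max_page = int(responses[0].get("meta", {}).get("max_page", 1))
    for page in range(2, max_page + 1):
        time.sleep(pause_seconds)  # stay under the plan's rate limit
        responses.append(fetch_page(page))
    return responses


# Fake two-page data set standing in for the real API.
FAKE_PAGES = {
    1: {"properties": [{"id": 1}, {"id": 2}, {"id": 3}],
        "meta": {"current_page": 1, "max_page": 2}},
    2: {"properties": [{"id": 4}, {"id": 5}],
        "meta": {"current_page": 2, "max_page": 2}},
}


def fake_fetch(page):
    return FAKE_PAGES[page]


responses = fetch_all_pages(fake_fetch, pause_seconds=0.5)
total = sum(len(r["properties"]) for r in responses)
print(f"{len(responses)} pages, {total} listings")
```

Passing the fetcher in as an argument keeps the paging logic separate from the HTTP call, so the loop can be tried out without spending any of the monthly searches.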
In the next cell, we're actually transforming our list. We have a list of for-sale responses, one for each page, so there are two different responses. And we transform this into a data frame, which has 59 rows and 10 columns. The data we have is street address, number of bedrooms, bathrooms, list price, property type (say, if it's an apartment, a house, or a commercial building), town, image, URL, latitude, and longitude. Now let's take a step back to see how these functions actually work. What get for sale listings does is take in your RapidAPI key, town ID, and different optional parameters, and then it goes to get the data from RapidAPI. We can see here we have a request section. I have some logic up front so you don't have to worry about formats. Then down below is where we're actually shifting the logic to look for all pages. So in the case that you select yes, I want to get all pages (say there are five pages of listings on the website and I want to get all five), that then goes into this logic, where we go through every page that's available, get the data, and append it to a list. The other function that we used was transform listings to a data frame. It works on each response in our list. So say there are five pages, we'll have five different objects in our list. For each one of those, we're basically taking the properties and putting them into a data frame. Then we're using concat in pandas to put all these separate data frames, all these separate tables, into one single table that we can analyze. Great, now we have our data, but we want to analyze it and see what actually makes up these listings. So if we want to view the contents of a data frame, we can use group by in pandas to look at what the data is. In this case, if we group by property type and count the different addresses per property type, we can see that, for sale in this area, 83% of these properties are apartments.
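The transform-and-count step can be sketched with pandas roughly as follows. The function name, the column names, and the four sample listings are illustrative assumptions, not the notebook's real code or data. The pattern is the one described: one data frame per page, `pd.concat` into a single table, then a group by on property type.

```python
import pandas as pd


def listings_to_frame(responses):
    """Build one table from a list of page responses.

    Each response is assumed to hold its listings under "properties".
    """
    frames = [pd.DataFrame(r.get("properties", [])) for r in responses]
    return pd.concat(frames, ignore_index=True)


# Two made-up pages of listings.
responses = [
    {"properties": [
        {"street_address": "A", "property_type": "Apartment"},
        {"street_address": "B", "property_type": "Apartment"},
    ]},
    {"properties": [
        {"street_address": "C", "property_type": "House"},
        {"street_address": "D", "property_type": "Apartment"},
    ]},
]

df = listings_to_frame(responses)

# Count addresses per property type, then express as a percentage.
counts = df.groupby("property_type")["street_address"].count()
share = (counts / counts.sum() * 100).round(1)
print(share)
```

`ignore_index=True` gives the combined table a fresh 0..n-1 index instead of repeating each page's own row numbers.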
And that makes sense, because San Juan is right by the water with a lot of high-rises, so likely this would be the majority of listings available. We also have a pie chart, created using Plotly Express, to quickly visualize this as well. In the next cell, we're going to add some features. In our case, if we look back at the data, bedrooms and bathrooms are available, but we have text on the right-hand side that we want to remove. So within our add features function, we're basically normalizing our current columns and creating other columns towards the end for number of bedrooms, number of bathrooms, and list price as an integer, so now we can do some aggregations on this data. In data analysis, and analytics in general, you always want to have a good understanding of what comprises your data set, so it's very important to try to visualize and see where your data lies. In the next cell, let's look at the property attributes for bed and bath. Here we can see that 33% of these properties are one bed, one bath, and the others are studios. So about 50% or so of these properties are smaller in size. This makes sense for the same reason as before: it's in a city, it's close to the water, and it's likely high-rises with small spaces. Now, if we were searching for properties in the countryside, we may see more three-bed, four-bath, larger homes. In the next cell, we can view the distribution by looking at histograms and box plots.
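The add-features step, stripping the trailing text and producing numeric columns, might look something like this in pandas. The raw formats ("2 Beds", "$310,000"), the column names, and the new column names are assumptions for illustration. The API's actual strings may differ.

```python
import pandas as pd


def add_features(df):
    """Return a copy with numeric bedroom, bathroom, and price columns.

    Assumes raw values such as "2 Beds", "1 Bath", and "$310,000".
    Values with no digits (for example "Studio") become NaN.
    """
    out = df.copy()
    for raw, new in [("bedrooms", "n_bedrooms"), ("bathrooms", "n_bathrooms")]:
        digits = out[raw].str.extract(r"(\d+)", expand=False)
        out[new] = pd.to_numeric(digits, errors="coerce")
    # Drop "$" and "," first, then keep the leading whole-number part.
    price = out["list_price"].str.replace(r"[$,]", "", regex=True)
    out["list_price_num"] = pd.to_numeric(
        price.str.extract(r"(\d+)", expand=False), errors="coerce"
    )
    return out


# Made-up listings.
raw = pd.DataFrame({
    "bedrooms": ["2 Beds", "Studio", "1 Bed"],
    "bathrooms": ["2 Baths", "1 Bath", "1 Bath"],
    "list_price": ["$310,000", "$55,000", "$1,400.00"],
})

featured = add_features(raw)
print(featured[["n_bedrooms", "n_bathrooms", "list_price_num"]])
```

Keeping the original text columns and adding new ones at the end makes it easy to spot rows where the parsing did not match the raw value.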
Again, we see that mostly the number of bathrooms is one single bathroom. We can also look at the distribution of list price. If you're new to box plots, they're pretty simple: we just get to see where our data lies. On the lower bound, we can see that the cheapest property is $55,000, and on the higher end, $500,000 for the list price. Most of the data sits between $235,000 and about $400,000, with the median at $310K. If we want to get our data into a spreadsheet, we can uncomment this below and view it in, say, Excel or Google Sheets. Let's move on to long-term rentals. This can be really useful, say, if you want to be a backpacker, go straight to Puerto Rico, and live there for several years, or if you're looking to maybe acquire a vacation rental from someone who's already developed one. Well, the first step here is to view all rental towns available. We enter our RapidAPI key, and we can see all of the rental towns. There are 92 in total, and we could also use the percentage if we wanted to view all. I'm going to select two different towns to get data from: Carolina and Ponce. Now if I press play, I can see, when I filter on DF rental towns, that DF rental towns does contain these names. And now that I have these names, I'm going to iterate through a list to get data for each of these towns. I'm going to use the get long term rentals function to do so, and ultimately transform it into a data frame that I can read as a table. The function summarizes for us, when we're getting data, what the city is and how many pages we're getting data for. In this case, we're looking for all pages. Carolina had three pages and Ponce had four pages, and in total it took about eight seconds to run. And if we press play next, we can look at all of our long-term rentals.
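The town-by-town loop can be sketched as below. It is an illustration only: `fetch_town_pages` is a hypothetical stand-in for the notebook's long-term rental function, and the fake page contents are invented. The page counts for Carolina and Ponce mirror the ones mentioned above.

```python
def collect_rentals(town_names, available_towns, fetch_town_pages):
    """Gather listings for several towns into one list of row dicts.

    fetch_town_pages(town) is assumed to return a list of page dicts,
    each with its listings under "properties".
    """
    rows = []
    for town in town_names:
        if town not in available_towns:
            print(f"Skipping {town}: not in the rental town list")
            continue
        pages = fetch_town_pages(town)
        print(f"{town}: {len(pages)} page(s)")
        for page in pages:
            for listing in page.get("properties", []):
                rows.append({**listing, "town": town})
    return rows


# Fake data: 3 pages for Carolina, 4 for Ponce, one listing per page.
FAKE = {
    "Carolina": [{"properties": [{"list_price": "$1,400"}]} for _ in range(3)],
    "Ponce": [{"properties": [{"list_price": "$800"}]} for _ in range(4)],
}

rows = collect_rentals(
    ["Carolina", "Ponce", "Atlantis"],  # "Atlantis" is deliberately invalid
    available_towns=set(FAKE),
    fetch_town_pages=lambda town: FAKE[town],
)
print(len(rows), "listings collected")
```

Checking each name against the town list first avoids spending an API request on a town the endpoint does not know about.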
These long-term rentals are likely for six months to a year, so they're going to be for a longer period of time. We can see the address, bedrooms, bathrooms, list price, property type, town, sub-location, image, URL, latitude, and longitude. Please note that at times the street address isn't the actual address; it's just the name of the listing. Therefore, you can use latitude and longitude to really home in on where this property is, which I'll show you how to do on a map in just a bit. In order to analyze this data, we want to do a similar step as we did previously for for-sale listings: we want to add features so we can understand this data. We're going to take a similar approach, where we're normalizing number of bedrooms, bathrooms, and list price, so that they are all integers that we can then plot. We can see the new columns that we've added towards the end. And if we run the next cell, one of our questions may be: what is the median rental price? If I buy a home in Puerto Rico as an investment property, what can I potentially rent it for? Well, here we did a simple group by. We're looking at each location, and we're seeing, by bedrooms, the median, mean, min, and max for the rental properties. So the way we would read this: if we're looking to purchase in Carolina, and we want to get a three-bedroom property, the median rent it's going for is $1,400. So we'll need to back into our calculations to make sure that we have a property that ultimately cash flows and makes sense if we purchase it. Our last section here is getting properties for rent short term. It's going to follow very similar steps to the previous one. What we're doing differently here is that we're going to get information on all rental listings. So first, I'm just going to go through a simple example, and this is getting short-term rental listings for the town ID Metro. Here, we can see there are 10 rows. I didn't get all pages; it just got the first page.
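The rent summary by town and bedroom count is a standard pandas group by with several aggregations. A small sketch follows, with made-up rents and assumed column names. The Carolina three-bedroom rows are chosen only so the median matches the $1,400 figure mentioned above.

```python
import pandas as pd

# Invented rental rows: not real listings.
rentals = pd.DataFrame({
    "town": ["Carolina", "Carolina", "Carolina", "Ponce", "Ponce"],
    "n_bedrooms": [3, 3, 3, 2, 2],
    "list_price_num": [1200, 1400, 1600, 700, 900],
})

# One row per (town, bedrooms) pair with four summary statistics.
rent_summary = (
    rentals.groupby(["town", "n_bedrooms"])["list_price_num"]
    .agg(["median", "mean", "min", "max"])
)
print(rent_summary)

# Look up a single cell: median rent for a 3-bedroom in Carolina.
median_rent = rent_summary.loc[("Carolina", 3), "median"]
print(median_rent)
```

The median is usually the safer number to underwrite against, since a single luxury listing can pull the mean up.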
And this is the type of information that we will receive back, including price. Now beforehand, when we were looking at long-term rental listings, we were able to use DF rental towns to view all the potential cities that we could search for. Our question now is: if we want to look at short-term rental data, how do we know which cities have that data available?
What I'm doing here is a for loop. I'm going to go through each of these cities, and I'm going to query each one to see how many pages of data it has. If it has zero pages of data, that means there's nothing available, and we can't extract that data. But if we see that there are pages of information, we can then look to query that for our short-term listings. Once complete, we've basically queried each of these cities once to see how many listings each of them has. We have 92 rows in total. We can see that some cities, like Metro, Central, East, and North, actually have data available, where others, like the third-to-last row, have zero max pages, meaning no listings. So this helps us to get an overview of what data is available, and then you can query individual cities, similar to how I did up top, to get the data. Now for my favorite part. What happens if we don't get the actual address? How do we visualize where these listings are? Well, in this case, we can use Kepler maps. Within our API, we already have latitude and longitude. Let's take for-sale listings as an example. I'm going to copy our current data frame and create a new one called DF for sale map. Since we're going to map this, we already have latitude and longitude, but let's change these to numeric so they can be passed into our GeoDataFrame. Here we pass in our for-sale map data frame and state which columns are longitude and latitude. Now when we run this, it looks pretty similar to a regular data frame. However, we now have this new column called geometry. And if we run the next cell, we create our own Kepler chart. We have our own function that I created, called get Kepler map config, and that basically states the config requirements for our Kepler chart. And now we can see where all of our listings are, and we can hover over some of these as well. And there are different color schemes: we have lighter colors and darker colors.
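The availability check from the start of this section, one query per town to read its max page, can be sketched like this. The `get_max_page` callable and the sample town IDs and page counts are assumptions for illustration. In the notebook, the max page would come from the meta part of each API response.

```python
def scan_availability(town_ids, get_max_page):
    """Query each town once and record whether it has any listings.

    get_max_page(town_id) is assumed to return the number of result
    pages for that town, with 0 meaning nothing is available.
    """
    rows = []
    for town_id in town_ids:
        max_page = int(get_max_page(town_id))
        rows.append({
            "town_id": town_id,
            "max_page": max_page,
            "has_listings": max_page > 0,
        })
    return rows


# Invented page counts standing in for real API responses.
FAKE_MAX_PAGES = {"metro": 12, "central": 3, "sample-empty-town": 0}

availability = scan_availability(FAKE_MAX_PAGES, FAKE_MAX_PAGES.get)
searchable = [row["town_id"] for row in availability if row["has_listings"]]
print(searchable)
```

Because every town costs one request, a scan over all 92 towns uses far more than the free tier's 20 monthly searches, so it may be worth saving the result to a file and reusing it.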
This represents price: the darker the color, the higher the price. In this case, we see a price of $450,000. And the lighter the color, the lower the price. In this case, there's a house for $130,000. I hope this tutorial has been super useful in showing how you can use an API to get long-term and short-term listing data for Puerto Rico. You can then use this data to analyze rental prices as well as list prices, and ultimately whether a deal would cash flow or not. Please leave comments below on the tools that you use to search for real estate in Puerto Rico, like this video if you want to see more like it, and subscribe if you haven't already. Thanks.