Get Real Estate Property Data using Estated's API | Python Tutorial Part 1
Need to get quality real estate property data to analyze a deal or create an application? In this video, I will cover how to get historical property data in a Python tutorial using Estated's API.
Transcription
Ariel Herrera 0:00
Hey everyone, are you looking to get quality real estate data for properties that goes back historically? Well, I'm going to show you how to do that in a simple-to-follow Python tutorial using Estated's API. My name is Ariel Herrera with the Analytics Ariel channel, where we're bridging the gap between real estate and technology. If you enjoy these in-depth videos on how to use technology, coding, and Python to get real estate data, then please subscribe so you get the latest content, and like this video if you want to see more of it. Alright, let's get started.
As a quick refresher, in the last video I went over what Estated is and why you would choose to use Estated's API over some other third parties. We also went through the API and the user interface to select different addresses and return data for them. Now, that's really useful. However, the best way to use this data is programmatically, through its API. Say you want to get data for a list of leads, maybe potential motivated sellers. It's going to be very tedious to go into a user interface, manually input an address each time, and pull out the information that you want. Using Python, we can extract this data programmatically, get it into an Excel spreadsheet, even apply some filters beforehand, and analyze our data in one single shot.
Estated has lots of data, and what makes it special in comparison to other real estate data sources is that it goes back historically. So we have information on parcel and building data, historical deeds, taxes, assessments, and more. And what's really neat about Estated is that we're able to get information on properties on a one-off basis.
So for example, say you wanted to use ATTOM Data, which is another source. It's really more for when you're trying to build a full-blown application, and there are going to be some long-term costs there as well. If you're looking to use something like PropStream, they don't actually provide their API readily to work with, so a lot of working with their data is still more manual. Estated is a happy medium, where you're able to get that in-depth knowledge of a property while being able to work with it programmatically.
So right now I'm going to jump straight into Google Colab, where we're going to walk through the code to get this data. This code is also going to be in the show notes, so please click below. You'll be able to copy this notebook, input your own API key, and run it as well.
Let's go back to Estated itself. Beforehand, we went over the user interface, putting in an address and getting information back. Now, if you're using the API, what's really neat is that they have so much information on what data they provide, as well as the data coverage, which I found really neat. Here we can see the property object and some of the information we can get. If we want to dive, say, into the market assessments, we can see exactly which fields we'll receive: the year, land value, improvement value, and total value of the property, which normally you'd have to go to a county website to get. And if you wanted to automate that, you would need a web scraper to pull that data. So this is really neat, because you're able to do it at scale for properties in different cities.
Now, shifting over to this tab, we can see more in depth about the data dictionary, which may be a little granular depending on where you're at with your analytics skills.
But it does provide some really great information for understanding what fields we can get when we request information for a property, as well as descriptions and examples.
When you sign up for Estated, you get 100 free API calls, which is awesome, so you're able to play with the datasets before having to pay a fee. Once you get to your dashboard, you'll see on the right-hand side how many API calls you have left in your free version. On the left-hand side, you'll be able to get your API key, which gives you access to get this data programmatically. Once you have your API key, come back to the notebook, and we can walk through it so you understand how to get the data with Python.
I love using Google Colab because, one, it's free, and two, you don't have to have Python installed locally on your machine, so you don't have to worry about versioning issues. It's a great way to use Python for free in a collaborative notebook and also share it with others.
The first step here is importing the libraries that we're going to use. I'm going to touch on this next part in a moment, but since we want to get this data on a repeated basis, instead of rewriting our code each and every time we have a new property, we make a function so that we can quickly go and grab that data. The next step I have here is reading in some locals and constants. For me, my API key is stored in a CSV file that I have in Google Drive. That way, I can easily have all my API keys in one place and pull them out using pandas. I'm going to run the first couple of cells. You can do this two ways: you can either click the play button on the left, or you can press Shift+Enter, which I find to be a little bit faster. So I'm going to run through these.
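The key-loading step described here can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the CSV layout (columns named `service` and `api_key`), the helper name `load_api_key`, and the sample key are all assumptions made for the example.

```python
import io

import pandas as pd


def load_api_key(csv_source, service_name):
    """Return the key stored for service_name in a two-column CSV (hypothetical layout)."""
    keys = pd.read_csv(csv_source)
    match = keys.loc[keys["service"] == service_name, "api_key"]
    if match.empty:
        raise KeyError(f"No API key found for {service_name!r}")
    return match.iloc[0]


# In Colab, csv_source would be a path on the mounted Drive, available after:
#   from google.colab import drive; drive.mount("/content/drive")
# An in-memory file with a made-up key stands in for it so this runs anywhere.
sample_csv = io.StringIO("service,api_key\nestated,demo-key-123\n")
estated_key = load_api_key(sample_csv, "estated")
print(estated_key)
```

Keeping keys in a separate file rather than in the notebook itself means the notebook can be shared without exposing them.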
And if, like me, you have your API key stored in a separate file, it's going to ask to make that connection to Google Drive. I'm going to say yes, click my name, and allow the connection. Now that the drive is mounted, we can go to where I have my API keys file, and with pandas I'm going to state that I want my Estated key. And now I have that there.
Now, going straight into our transformations, we can see how we call the API for just a single address. The base URL to get property information is this string right here, and if there are future versions, then you would just change the version up there. Then you enter your API key, which is passed as the token, and whatever address you have, broken down into the street address, city, state, and zip code. These are the parameters that the API requires in order to give you a response back with that property's data. So I'm going to run this here, and I can see that I got text back, which includes data, publishing date, street number, and a lot more. It's a little hard to read right now, since we have it in a text format, so let's transform it to JSON. Now we can see our key-value pairs a little more easily. We can see we have information on the structure and more: taxes as well as valuation.
But we can make this even easier to read. What we can do here is use pandas and normalize this JSON, and we know we want to get the information within data. Let me take a step back to show you why we know we want the information there. If I look at the keys for this dictionary, or JSON object, I can see I have data, metadata, and warnings. If I dive deeper into data, we can see all the information that we want for the property. And if we look at the keys for that, we can see we could go even deeper to get the information that we care about, which includes the address, parcel, assessments, and more.
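The request and the key inspection described above might look something like the sketch below. The endpoint URL and the query parameter names (`token`, `street_address`, `city`, `state`, `zip_code`) are assumptions inferred from the narration and should be confirmed against Estated's documentation. The address, the key, and the trimmed response are made up, and no network call is made.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint; confirm the URL and version in Estated's API documentation.
BASE_URL = "https://apis.estated.com/v4/property"


def build_property_url(token, street_address, city, state, zip_code, base_url=BASE_URL):
    """Assemble the query string for a single-address lookup (parameter names assumed)."""
    params = {
        "token": token,
        "street_address": street_address,
        "city": city,
        "state": state,
        "zip_code": zip_code,
    }
    return f"{base_url}?{urlencode(params)}"


# Fictional key and address, for illustration only.
url = build_property_url("demo-key-123", "123 Main St", "Tampa", "FL", "33601")
print(url)

# With a real key, the call itself could be: response_text = requests.get(url).text
# A trimmed, made-up response stands in for it here.
response_text = (
    '{"data": {"address": {"street_number": "123"}, "deeds": []},'
    ' "metadata": {}, "warnings": []}'
)
response = json.loads(response_text)

# Inspect the top-level keys, then the keys nested under "data".
print(list(response.keys()))
print(list(response["data"].keys()))
```

Looking at the keys level by level is what shows that the property details sit under `data`, which is why that key is the one handed to pandas afterwards.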
So now let's run this using pandas. What we get is that same data, so everything within data gets returned to us in this data frame. We can see we have information on taxes, assessments, market assessments, and deeds for a single property. But it is truncated. Think about it this way: taxes could change when a property is sold, or depending on the area, they might change year to year, and the same goes for assessments and deeds. So in this case, that data is still very nested, and it's a little hard to read in a table, versus some of our other data, like street number and street name, and some of the parcel information, which is flattened because the parcel location shouldn't change over time. I'll show you how to get at the nested information in a moment.
But for now, let's say, okay, great, you had the property address, you ran it through the API, and now you have an output. How do you make this consumable so you can look at it in Excel? Well, Google Colab gives you a really easy way to take a data frame, write it to a CSV file, and then download it. So that same data that we put into a data frame, or table, we can now see in Excel. We see all the tax information as well as all those other columns that were returned, which go all the way out to column CT. So it has a lot of data, which is awesome. That's the whole reason we're querying Estated's API: we want a lot of historical data.
Now, I did tell you that this is still a little hard to read when looking at the taxes, assessments, and so on, since they're still nested. How do we look at those sub-tables? If we go back to our JSON response, we want to get the information under data, and if we want just the deeds data, we can run this. We see that now we have just our deed information, and it's several different dictionaries here.
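The flattening, export, and deeds lookup described above can be sketched as follows, assuming a response shaped the way the narration describes it. The sample values and the deed field names (`recording_date`, `sale_price`) are made up for illustration and are not taken from a real API response.

```python
import os
import tempfile

import pandas as pd

# Made-up response: a top-level "data" key holding flat address fields
# plus a nested list of deeds. Field names are illustrative.
response = {
    "data": {
        "address": {"street_number": "123", "street_name": "MAIN"},
        "deeds": [
            {"recording_date": "2010-05-01", "sale_price": 150000},
            {"recording_date": "2018-09-15", "sale_price": 240000},
        ],
    },
    "metadata": {},
    "warnings": [],
}

# Flatten everything under "data" into a one-row table. Nested dictionaries
# become dotted column names; lists such as "deeds" stay packed in one cell.
property_df = pd.json_normalize(response["data"])
print(property_df.columns.tolist())

# Write the table to a CSV file that Excel can open.
csv_path = os.path.join(tempfile.gettempdir(), "property.csv")
property_df.to_csv(csv_path, index=False)

# In Colab the file could then be downloaded with:
#   from google.colab import files; files.download(csv_path)

# Pull out just the nested deeds records for a closer look.
deeds = response["data"]["deeds"]
print(len(deeds), "deed records")
```

The `deeds` column holds a whole list in a single cell, which is why it reads poorly in the flattened table and in Excel.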
What we can do next, just for the deeds, is create a data frame. If we run this here, we can see that for this single property, there were five different deeds, going back to one recorded in 2010. Now we have all the information that was previously stored in just that one column we had right here, and it's flattened out. We could, again, also export it to an Excel file to view it and work with it.
So I hope this has been helpful. In the next video, we're going to dive into a case study to find potential investments. The idea is that we're going to take this single-property example and extend it to programmatically get data for not just one, but multiple properties. So please check out part three and subscribe if you haven't already. Thanks.
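As a companion to the deeds step in this walkthrough, here is a minimal sketch of turning a list of deed dictionaries into its own table. The records and field names are made up for illustration, and the date parsing and sorting are additions that the video does not describe.

```python
import pandas as pd

# Made-up deed records standing in for response["data"]["deeds"].
# Field names are illustrative, not taken from a real response.
deeds = [
    {"recording_date": "2018-09-15", "sale_price": 240000, "document_type": "WARRANTY DEED"},
    {"recording_date": "2010-05-01", "sale_price": 150000, "document_type": "WARRANTY DEED"},
    {"recording_date": "2014-03-20", "sale_price": 0, "document_type": "QUIT CLAIM DEED"},
]

# A list of dictionaries maps directly onto rows: one row per deed,
# one column per field.
deeds_df = pd.DataFrame(deeds)

# Parse the dates and order the history from oldest to newest.
deeds_df["recording_date"] = pd.to_datetime(deeds_df["recording_date"])
deeds_df = deeds_df.sort_values("recording_date").reset_index(drop=True)
print(deeds_df)
```

From here, `deeds_df.to_csv(...)` would export the sub-table in the same way as the full property table.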