How to Skip Trace Property Owners using BatchData with Python

Need to get property owner data to contact motivated sellers? In this video, we cover how to use the BatchData API to get contact information such as phone, email, and owner address for a property.

Ariel Herrera 00:00

As the market is crashing, the number of home sales is declining and price cuts are happening every day across real estate markets. As investors, how do we capitalize on the changing market? There are four steps. First, we need to find properties that we could buy at a discount, and the way we find these distressed properties is by finding people who are motivated to sell. We want to locate property owners through skip tracing and ultimately strike a deal off market for a property. In this video, I'm going to show you how we can do skip tracing automatically to find the contact information for property owners. This includes phone number, email, and address. That way we can target those who are going into pre-foreclosure, going through a divorce, or behind on taxes first, so we can find properties at a discount and expand our portfolio. My name is Ariel Herrera with the Analytics Ariel channel, where we bridge the gap between real estate and technology. If you enjoy data-driven solutions to expand your real estate business, please subscribe and like this video so I know to make more of it. All right, let's get started.

Ariel Herrera 01:22

There are four steps within this process of finding off-market deals and closing on them. The first is finding distressed property owners, which could be people who are behind on their mortgages. The second is finding those owners' names. Third, how do we contact them: email, cell phone, address? And fourth, striking a deal. How do we find motivated sellers? Do we just go around the neighborhood, start knocking on every door, and see who's behind on their mortgage? Definitely not. This data is already collected by our local county. However, it can be very difficult to extract data from a local county, so I like to use a system called PropStream. PropStream is a tool that already collects data nationwide from counties and provides it in an easy way, so that you can search for properties in your city. For example, say we wanted to look at Newark, New Jersey, and see all properties that are in pre-foreclosure. We can click pre-foreclosure, and this data has already been taken from the counties; it's about 481 properties. We could filter this down even more so that the properties we see are vacant. If we do this, we see we have 14 properties here. We can then save our search so that we can see the property owners for each of these properties. Once we have our list saved, we can export it into a CSV file. When we look at the CSV file, we can see we have a lot of information, including the address of each property. We also get information about the owner: first name and last name. If it's an LLC, it only comes up in the last name. All of this is great, but we still don't have the contact data, so we want to do skip tracing. That way we can find the owner's email, phone number, and home address. We can do this through PropStream.
However, BatchData does provide a really awesome API to extract this information with Python, which could be really useful if you're feeding this all into, say, a CRM system like Zoho or Podio. So let's do a walkthrough right now. I'm on BatchData's "get started" page for their API. BatchData is very similar to PropStream in that they ingest a ton of data nationwide, including information on properties and their status (whether a property is vacant, distressed, and so on), as well as information on the property owner, which is what we're looking for. The first step is to go to BatchData's website, which is linked below, and set up an account. In your account, you're going to create an API token and purchase credits. I believe at this time there's a minimum of $20 to get started, and each API call at this time is 12 cents. Here we can see that they have a really good overview of how to sign up, how to log in, and ultimately how to create your API token. When you get to this phase, you want to create two tokens. Name the first one prod, for production, and the second one mock, for mock data, which is this check mark right here. The reason we have both is that when we play around with this data, we don't want to waste money doing so. What's really awesome about BatchData is that they allow us to work with their data for free by using mock data and flagging that with their API. Once you get the API token, you'll see it here, and you'll want to copy it and save it in a private location. Then, when you're ready for production and want to pull real-life email, phone number, and owner address information from the API, you're going to purchase credits and just follow these steps. So let's dive into this API. On the left-hand side, we can see that there are a ton of APIs here.
Say you wanted to consistently look at properties in your area that are in foreclosure: you could use BatchData's API to do this. It does get a little bit pricey depending on how large your searches are, so definitely be aware of that ahead of time.

Ariel Herrera 05:51

Now, if we go to property skip trace, we can see the API endpoint, which is skip trace, and which is going to allow us to get data. On the right-hand side, we can see our parameters. In this example, we have a request that looks up information on this address, 1011 Rosegold Street in Franklin Square, New York, with the zip code passed in. There's also a second property address passed in, which is really nice, because we're able to query not just one single address but multiple addresses at the same time. It also shows you down here which combinations of fields are required. It's important to include at least a zip code, since the same street could exist in different areas of the United States. Now, if we go down here, we can see that BatchData provides a sample request, and we can drop down and select different programming languages. In our case, as in prior videos, we use Python and the requests library. We could technically copy this code and run it, but as you see, their payload is in a weird string, and that makes it really difficult when we're trying to do this for multiple properties. I'll show you in the notebook in a moment how we actually make the request. Going down a bit further, what's really cool is that BatchData shows us a response example. In this example, we can see that we'll get information on persons, so for the persons related to each address we'll be able to see if they are in bankruptcy, if there's been a death, whether they're on the do-not-call list (which is very important), what their email is, and what their mailing address is, as well as phone numbers. We also get information on whether each phone number is reachable, plus a score. I believe the score reflects how likely it is that the phone number is associated with that person. Once you have your API keys and your BatchData login set up, what you're going to do next is open Google Colab.
If you're new to Google Colab, it's basically an environment in the cloud where we're able to run Python code. That means you don't need to have Python installed on your local machine; you can do everything here. Essentially, open the link that I have, make a copy to your Drive, and run the same code. What we have here to start with is importing the necessary packages that we're going to be using. This is specific to Google Colab, so if you're running this in, say, Visual Studio or Jupyter Notebook, you may not need it. I also have the max columns option set. That way, when we start to get data and look at it in a table format, we can see all the columns and not just a truncated version. To run this, we can either press play or press Ctrl+Enter. The next step is specific to Google Colab, and to saving your API keys the way I do, within a CSV file. What I do in these two cells is connect my Google Drive account, where I have a CSV file that stores all of my API keys. I read my CSV file and then assign the key to a variable called batch data mock API key. If you don't want to take this step, that's fine. Just replace the mock data API key variable with your own key as a string. The next step is to actually retrieve the data from the API. As we saw previously, we have an endpoint here, but this is a production endpoint. If you want to query the mock data, which is what I suggest whenever you try out any of BatchData's endpoints, you can click mock server here and you'll see this endpoint change. This is free of cost. We take that and set it as our URL. Then, in our JSON data payload, we have the property address that we want to get information for. I just took the exact same property address that was in BatchData's example. As you see here, they have the payload in this string, which is very hard to read and work with.
That's why I've broken this out, so it's a little bit easier if you want to add your own property to it or add multiple. Next, we pass in our headers, and we specify "Bearer" before our actual mock API key. This took me a little bit of time to figure out; I had to message the team. Basically, in order to be authorized, you have to have this portion before your key string. Once you do, you can make the request and run this cell.
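
The request described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the URL is a placeholder for whatever mock-server endpoint the documentation page shows, the `requests` / `propertyAddress` field names are assumptions about the body shape, the address is made up, and the helper names are hypothetical. The one detail taken directly from the walkthrough is the `Bearer ` prefix in front of the key.

```python
import json

# Hypothetical placeholder: paste the mock-server URL shown on the docs page.
MOCK_SKIP_TRACE_URL = "https://example.com/mock/property/skip-trace"


def build_headers(api_key: str) -> dict:
    """Return request headers with the key prefixed by 'Bearer '."""
    return {
        "Accept": "application/json",
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }


def build_payload(addresses: list) -> dict:
    """Wrap one or more address dicts in the assumed request body shape."""
    return {"requests": [{"propertyAddress": addr} for addr in addresses]}


def send_skip_trace(url: str, api_key: str, addresses: list) -> str:
    """POST the payload and return the raw response text."""
    # Third-party library; imported here so the helpers above work without it.
    import requests

    response = requests.post(
        url,
        headers=build_headers(api_key),
        json=build_payload(addresses),
        timeout=30,
    )
    response.raise_for_status()
    return response.text


# Made-up address, for illustration only.
addresses = [
    {"street": "123 Main St", "city": "Newark", "state": "NJ", "zip": "07102"},
]
payload = build_payload(addresses)
headers = build_headers("my-mock-token")
print(json.dumps(payload, indent=2))
```

Building the payload from a list of dictionaries, rather than editing one long string, is what makes it easy to add more properties: append another address dictionary and the body grows with it.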

Ariel Herrera 10:49

We could see here that we have a ton of information within a string. But of course, this is not easy to read. So let's transform this into a JSON object with key value pairs. And let's look specifically at the results key.
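
As a rough sketch of that step, the snippet below parses a response string into a Python dictionary and pulls out the results key. The response text is made up and heavily trimmed; only the `results` and `persons` keys come from the walkthrough, and the other field names (`phoneNumbers`, `emails`, `name`) are assumptions for illustration.

```python
import json

# Made-up response text shaped like the description above (not real API output).
response_text = """
{
  "status": {"code": 200},
  "results": {
    "persons": [
      {
        "name": {"first": "Jane", "last": "Doe"},
        "emails": [{"email": "jane.doe@example.com"}],
        "phoneNumbers": [
          {"number": "5550000001", "reachable": true, "score": 100}
        ]
      }
    ]
  }
}
"""

data = json.loads(response_text)      # string -> dictionary of key/value pairs
results = data["results"]             # the key the walkthrough focuses on
persons = results.get("persons", [])  # one entry per person tied to the address

print(json.dumps(results, indent=2))
print(len(persons), "person record(s)")
```

With the requests library, calling `response.json()` on the response object does the same parsing as `json.loads(response.text)`.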

Ariel Herrera 11:07

Now it's starting to get a little bit more readable. We can see we have information on phone numbers, property, and more. But it would be even better if we could bring this into a table format, so let's use pandas in Python to transform this into a data frame. What we do here is take the response, transform it into a JSON object, get results, and look at persons specifically. If we run this, we'll see that we have one single row, since there's one person associated with this property, and we have information on their emails, phone numbers, and whether they're a litigator or have an involuntary lien. We have even more information at the end, such as whether the property is vacant, whether it's USPS deliverable, and what the equity percentage is. But of course, the whole idea of skip tracing is to get the property owner's contact information. That way, we can actually pick up our phone and reach out to this owner to possibly strike a deal. So we want to get information from emails and phone numbers. Let's start with phone numbers. What I first do is take the phone numbers and put them into a data frame, because right now the phone numbers are nested within a list, which is not useful. If we run this, we can see that all the information in this one cell is now broken out into four different rows, with columns for number, carrier, type, tested, reachable, and score. Of course, this is dummy data, so for carrier in the future you might see Verizon, T-Mobile, and so on. For type, I definitely try to reach out to mobile numbers instead of landlines; I see a lot of landlines actually blocking calls. In the next part, what we're going to do is filter on valid phone numbers. This is going to change based on what your requirements are. For me, I only want to look at phone numbers that are reachable, so I set this value to true. Then I only want to look at phone numbers that have a score of over 90.
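
The flattening and filtering described here might look something like the sketch below. The four phone records are made up, and the variable names are hypothetical; the column names follow the ones read out in the walkthrough, and the thresholds (reachable is true, score over 90) are the presenter's stated preferences rather than anything the API requires.

```python
import pandas as pd

# Made-up stand-in for the nested phone-number list of one person record.
phone_numbers = [
    {"number": "5550000001", "carrier": "Carrier A", "type": "Mobile",
     "tested": True, "reachable": True, "score": 100},
    {"number": "5550000002", "carrier": "Carrier B", "type": "Land Line",
     "tested": True, "reachable": False, "score": 95},
    {"number": "5550000003", "carrier": "Carrier C", "type": "Mobile",
     "tested": True, "reachable": True, "score": 95},
    {"number": "5550000004", "carrier": "Carrier D", "type": "Mobile",
     "tested": True, "reachable": True, "score": 80},
]

# One row per phone number instead of one list crammed into a single cell.
df_phones = pd.DataFrame(phone_numbers)

MIN_SCORE = 90  # adjust to your own requirements

# Keep only numbers flagged reachable with a score above the threshold.
mask = df_phones["reachable"].eq(True) & (df_phones["score"] > MIN_SCORE)
df_valid = df_phones[mask].copy()

# Store numbers as strings so they can be joined on later.
df_valid["number"] = df_valid["number"].astype(str)

print(df_valid[["number", "type", "reachable", "score"]])
```
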
My assumption here is that if the score is over 90, then the number is more likely associated with that person, and I won't be calling some random individual who says, "This is not my property, stop calling me." We see here that once we filter on this, we get just two rows back. We eliminated that second row here, and we eliminated that fourth row, where the score was below 85. I did some other transformations here, and this is specific to testing. Whenever we're doing skip tracing, we want to make sure we are not calling people who are on the do-not-call list; that's actually illegal. In the mock data, BatchData provides two different phone numbers for the do-not-call list. So, just to make this example complete, I am replacing those dummy numbers up here with the numbers that the mock data already has. If you use this in production, you won't be doing the same. Then I transform these numbers into strings so that we can join on them later on. Now, this second part here calls another endpoint. If we go back to BatchData, we can see in their endpoints, under phone number, that they have a phone do-not-call status endpoint. If we click that, we can see this is the mock data endpoint. We have an example of how to make a request, and then we have an example response. So if we go back, we can run this. What we're requesting is a check of all the phone numbers in our data frame, which we put into a list, to see if any of those are on the do-not-call list. We run this here, and we get a JSON response back. In our response, we have results, information on when we made the request, the match count, and the phone numbers. Let's transform this again into a data frame and only select the columns that we care about. In this case, we have number and do not call as our columns. We can see that for the first row, it is set to false.
This means that the person we're trying to reach is not on the do-not-call list, so yes, we can call them. The other numbers here are set to true. What we want to do now is bring this do-not-call data back into our original phone table. In this next cell, we're making sure that our phone numbers are set to strings, so that we'll be able to join on them. In this row right here, we're merging our data frames. We're saying: take our valid phone numbers, merge them with the do-not-call data, and match each row by joining on number. Think of this as a VLOOKUP in Excel. When we run this, we now have our original table with the numbers, plus an added do-not-call column. We see here that only one of our phone numbers is actually available to call. We can filter on this by setting do not call equal to false, and we return one single row. Now that we have our valid phone number, we want to bring it back into our original data frame. The thing here is that we could have three valid phone numbers or just one. Because the count varies, for each valid phone number we add a column dynamically to our data frame. If we have three phone numbers, we'll have three phone number columns; if there's just one, there'll be one phone number column. Then we also create an email column, where we grab the first email in that list within our data frame. Once we do this, if we go all the way to the end, we can see we have phone one as our new column. That's our valid phone number that we can call. We also have the email column, and this is our valid email. If we go back to the beginning, we can see that originally these columns were hard to deal with, because the values were nested within a dictionary and within a list. But now we can actually get this data easily.
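
A minimal sketch of the merge and the dynamic columns, under stated assumptions: all data below is made up, the `dnc` column name and the `phone_1` naming pattern are hypothetical stand-ins for whatever the notebook uses, and the nested email shape is a guess at how the list of emails is stored.

```python
import pandas as pd

# Made-up inputs: filtered phone numbers and a do-not-call lookup table.
df_valid = pd.DataFrame({"number": ["5550000001", "5550000003"], "score": [100, 95]})
df_dnc = pd.DataFrame({"number": ["5550000001", "5550000003"], "dnc": [False, True]})

# Both join keys must share a type, so cast them to strings first.
for frame in (df_valid, df_dnc):
    frame["number"] = frame["number"].astype(str)

# Left join on the phone number (similar in spirit to a VLOOKUP).
df_checked = df_valid.merge(df_dnc, on="number", how="left")

# Keep only numbers explicitly marked as not on the list;
# numbers with no match in the lookup are dropped as well.
df_callable = df_checked[df_checked["dnc"].eq(False)]

# One-row person table with a nested list of emails (assumed shape).
df_person = pd.DataFrame([
    {"first_name": "Jane", "last_name": "Doe",
     "emails": [{"email": "jane.doe@example.com"}]},
])

# Add one column per callable number: phone_1, phone_2, ...
for i, number in enumerate(df_callable["number"], start=1):
    df_person[f"phone_{i}"] = number


def first_email(emails):
    """Return the first email address in the nested list, or None."""
    if isinstance(emails, list) and emails:
        return emails[0]["email"]
    return None


df_person["email"] = df_person["emails"].apply(first_email)
print(df_person.drop(columns="emails"))
```

Because the loop creates columns from however many rows survive the filter, the same code handles one callable number or several without changes.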
So if you want to feed it into, say, a CRM system, you can. Let's also trim some of the columns: instead of having 50 columns, let's select just about 10 or so. We can see we have information on the owner's first and last name, and the original street, city, state, and zip code that we passed in to our skip tracing API. We can also see the equity percentage, whether they are an absentee owner (which in this case they are), whether the property is vacant, and whether it's USPS deliverable; because the mail can be delivered, we assume that it is not vacant. And lastly, we have what we were looking for: our phone and our email. We have successfully skip traced this property. If we want to output this into an Excel file, we can do so here as well. I hope this tutorial has been super useful and has shown you that you can use Python for skip tracing. And if you want to use multiple systems, you can. You don't have to stick to just BatchData or just PropStream; you can interchange them and use them however you'd like. You could also use other CRMs, and I highly suggest you check out my prior videos on how to set up Podio and future videos on how to set up Zoho, whether it's for your wholesaling, investing, flipping, or agent business. Thanks so much. And if you haven't already, please subscribe.