Python Visualizations Plotly | Real Estate Analytics

In this video we will cover visualizations using Plotly Express. We will apply graphs to property data to easily view trends, and you will learn how to create stunning graphics and what valuable insight can be derived from dashboards.


Ariel Herrera 0:00

In this video we will cover visualizations using Plotly Express. We will apply graphs to property data to easily view trends, and you will learn how to create stunning graphics and what valuable insight can be derived from dashboards. Let's jump into the Lesson Seven notebook from our table of contents.

What is data visualization? Data visualization is the discipline of trying to understand data by placing it in a visual context, so that patterns, trends and correlations that might not otherwise be detected can be exposed. There are numerous visualization tools available in Python, and you can use them in different interactive environments. In our scenario, we're using Google Colab, which works off of Jupyter Notebooks. We've explored different data manipulation libraries, like pandas, and we also used some plots in matplotlib. We are now going to explore Plotly, which is my favorite visualization tool and the most progressive for simple yet beautiful charts.

As an overview, you might be thinking: why choose Plotly? What is this library, and how does it compare to the rest? Right here I have a chart of dashboard visualization tools, and I'm going to show you more about dashboards. As you can see here, starting in 2017, since Plotly Dash launched, it has had a tremendous amount of usage in the community, as has Streamlit. We will cover both of these at the end.

For our first results, let's install the packages Plotly Express and pandas, and then import them. We import Plotly Express as plotly.express and nickname it px, so that we don't have to write plotly.express each and every time we make a chart. Then, in our next cell, we're going to read in a CSV file that I have stored in GitHub. The CSV file is based on the median sale price in the US.
If we copy the source, we can view it here. In the course, we will dive into detail on all the different US market economic fields that we can extract for free using sites like FRED. In this case, I have the median sales price of houses sold in the United States. This is quarterly data and dates all the way back to 1963. On the y-axis, we have dollars as the median sales price, and on the bottom axis we have the years. We can truncate this to look at just the last 20 years, and we can see how the median price of homes has really increased, especially after 2020.

I've downloaded this data and made it available in GitHub for us to use. In the first five rows, we have a column called DATE and one called MSPUS, which is the name of this particular series. If we want to fact-check one of these, we can look at 1963, when the median sales price in the US was $17,800. And if we go back and extend the plot, we can see whether this value matches. I really wish I had been alive during this time to invest in real estate.

I'm copying over some code that's going to allow us to view the number of rows and the number of columns in our data frame. We did this in the previous section, where we wrapped our data frame in the len function, and we do the same for the columns. We have 230 rows and two columns.

These column names aren't that intuitive, so let's rename them, which is easy to do in Python. Let's state our data frame; our new data frame is going to have the same name. Let's call rename, and what we want to rename is columns. Now we're going to pass a dictionary: our column MSPUS will be the key, and for the new value, let's set it to median sale price. Now let's view the head of our data frame. Great, now we have a more readable name for our column. But DATE is still uppercase, which is a little bit confusing, so let's make sure that all of our column names are lowercase. As a reminder, I'm going to add some code up here.
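As a rough sketch of the row and column counts and the rename step just described: the frame below is a two-row stand-in (the real data has 230 rows), and both the name df_ms and the exact spelling median_sale_price are assumptions, since the transcript only says "median sale price".

```python
import pandas as pd

# Tiny stand-in for the median sale price data. Only the 17800 value
# is from the lesson; the second row is a placeholder.
df_ms = pd.DataFrame(
    {"DATE": ["1963-01-01", "1963-04-01"], "MSPUS": [17800, 18000]}
)

# Wrap the data frame (and its columns) in len() to count rows and columns.
print("rows:", len(df_ms))
print("columns:", len(df_ms.columns))

# rename takes a dictionary: old column name -> new column name.
# Assigning back to the same variable keeps the data frame's name.
df_ms = df_ms.rename(columns={"MSPUS": "median_sale_price"})
print(df_ms.head())
```

Note that rename returns a new data frame rather than changing the original in place, which is why the result is assigned back to df_ms.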
We can take our data frame and look at its columns by specifying .columns. Now let's get this into a list.
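A minimal sketch of this column inspection, together with the lowercasing it is building toward, might look like the following. The one-row frame and the names df_ms and median_sale_price are assumptions carried over for illustration.

```python
import pandas as pd

# One-row stand-in frame with the mixed-case column names.
df_ms = pd.DataFrame({"DATE": ["1963-01-01"], "median_sale_price": [17800]})

# .columns returns an Index; list() turns it into a plain Python list.
column_names = list(df_ms.columns)

# List comprehension: lowercase each column name in turn.
lowercase_names = [x.lower() for x in column_names]

# Assigning to .columns replaces all column names at once.
df_ms.columns = lowercase_names
print(list(df_ms.columns))
```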

For all items in our list, we want to lowercase them. We can do this with a list comprehension, which we covered in one of the previous lessons. For each object x in our list, let's call x.lower(), so we lowercase this value, and wrap it all in a list. Now we can see that in our list, date is lowercased. Let's copy this and delete the cell. Now we can paste it to make all of our column names lowercase: we take our data frame's .columns and set it equal to these new lowercase columns. And now this is reflected in our data frame.

Now we have a second data frame that we're reading in, called df_prop. Let's look at it in more detail. In this data frame, we have 109 rows and 13 columns. Let's view one of these rows. We have information from California; this street address is in Palm Springs. We have information on the bedrooms, bathrooms, square footage, year, property type, property estimate, rent estimate, and some other location details. This is real data that has been pulled through one of my previous tools.

Now let's start answering some questions from our data. Let's view the relationship between square footage and property estimate. I assume that the larger the house is, the higher the property estimate is going to be, but it's difficult to say for certain without visualizing how this looks. If we go over to the Plotly documentation, Plotly Express has lots of documentation that makes it really easy to get started. If we scroll down, we can see there are different types of functions, and if we want to work with some basic charts, we can start with scatter. Here in the Plotly Express documentation, they give us very simple overviews of how to use their functions. If we go down a bit, we can see there's an example using the Iris dataset, which is flowers.
Then, in order to use the actual function, we write px, for Plotly Express, followed by .scatter to get a scatter plot. We pass in our data frame, then what's going to go on our x-axis (the horizontal) and on our y-axis, and then we show our figure. Let's copy this over so we can replicate similar code. In our case, the data frame is called df_prop, so let's copy that over. On our x-axis we want to see square footage, so we can copy our column name and paste it. Then we want to see property estimate on the y-axis; we believe that the larger the house, the more it's going to be valued. Now let's run this cell so we can test our theory.

Plotly creates a visualization where we have square footage on the bottom and property estimate on the y-axis. And yes, it is pretty much a linear relationship. We see that properties with lower square footage go for less, under a million, but as we get to larger homes, the value starts to trend upwards. So we can say with confidence, just by looking at the visualization, that there is likely to be a relationship between these two variables.

Going back to the documentation, we can scroll all the way down to the bottom and see more complicated charts. Not that they're complicated to create, but they get even more intriguing, since there's more information that we can add. Plotly Express has a lot of neat functionality; you can show the same data as a scatter plot or as a line chart. I highly suggest exploring their docs in more detail.

Let's create a basic line chart. Let's view median sale price year over year. As we saw in FRED, this is a line chart, so let's replicate it on our own. We go to Plotly Express and to line charts, and we can grab one of their examples. Let's copy this and paste it in our cell. Now we're going to change the data frame to our own name, which we established up here as df_ms, for median sale.
We place this here; our x-axis will have the date and our y-axis will have median sale price. We can change our title as well and run our cell. Now we have the same line chart. We can even dive deeper by hovering over the chart, selecting a certain time period, and viewing only, in this case, back to 2004, which is really cool. If I double-click in this line chart, we go back to the original view.

Now let's go to our bar chart. What if we want to see the data as a bar chart instead? We can go back to the Plotly documentation, to basic charts, and here we see bar charts. Let's copy an example piece of code and paste it below. This also uses the same data frame, since we're going to stick with median sale price. The x-axis will change to date, and our y-axis will be median sale price. Run this cell, and now we quickly see the same data in bar chart form. We can really see that since 2011 it has been pretty much a run-up for real estate, for a little over a 10-year timeframe.

Now remember, in our last session, we used some nifty matplotlib visuals within pandas. We can do the same using Plotly. I'm going to copy the code above and change a few things. Instead of a bar chart, it's going to be a histogram, which allows us to view the distribution of our data. In particular, let's look at the properties data frame, and specifically the bedrooms: how many properties are in our table, by number of bedrooms? Let's change the data frame to df_prop. Since it's a histogram, we won't have both an x and a y axis, so let's remove y and replace date with bedrooms. We can see that most of the properties in our data frame have three bedrooms; 55 of them do. Second is four bedrooms, with 32. These properties are likely in areas where families reside, since they're homes that usually have extra bedrooms for children.

Let's do one more view of distribution by looking at rent estimate with a box plot. We can copy our figure here, paste it for the box plot, change histogram to box, and use rent estimate as our column.

For all of our data, we can see where it lies. The median rent estimate for our properties is about $4,000, with the minimum being $1,700 and the maximum being almost $10,000 for rent, which is a lot, but I guess that's California for you. Then we have the quartiles, which on the lower end is about $3,000 for rent and on the higher end about $5,800.

From these last five visualizations, you can see how easy it is to create interactive charts when applying them to pandas data frames. I can tell you from experience that when I started about eight years ago, there were no libraries as easy to use. Matplotlib was probably number one, and it wasn't interactive. So this is pretty new in the programming space, and it really helps to get analysis and ideas across, especially when communicating with stakeholders.

The best part about visualizations is telling a story, and usually you need more than one visualization to do that. Together these constitute a dashboard, which is going to be your ultimate project: building a Tableau dashboard for US housing market data at the end of this course. Let's take a preview of what dashboards look like. Plotly also has Dash Enterprise. And back in the notebook, if we look at that visualization, Plotly Dash is one of the most popular open source dashboard tools. Open source means that it's free for developers to use to create these stunning visualizations. If we look at the gallery, you can see many different applications from different industries, as well as machine learning too.

I chose one example from Plotly Dash: spatial clustering. About this app: it applies spatial clustering and regionalization analysis to discover the dataset of Airbnb listings in the city of Austin. If we go to the right-hand side, we can select clustering based on house type, then run the clustering algorithm and update the map. On the left-hand side we see the map update, and we now have different regions shaded in different colors.
If we go down, we can see user satisfaction. Up top, there's one group in particular that has almost 100% satisfaction in every single category, and that is group five. If we go up to property types, where in total there are 11,000 properties, we can see that what makes up group five, here in purple, is house property types, then guest houses, and lastly farm stays, which is pretty new in comparison to the other charts. We can see in the chart on the left-hand side that the farmhouses are outside of the city. So maybe, if we're looking to get into Airbnb, we should look to get a farmhouse about 20 miles out of Austin. This is the power of using visualizations: you can make quick observations and get actionable results.

Going back into our notebook, we can see that not only has Plotly Dash had a lot of success in the last several years, but so has Streamlit. Streamlit is one of my favorite tools for spinning up quick web apps with Python and a limited amount of code. One of my favorite apps that they have is the Fidelity account overview. On the left-hand side are some parameters, and on the right-hand side we can see how to use the app. It's a quick overview: if we have a Fidelity account, where we're trading either in our 401(k) or on the side, we can download our file and then upload it to this web app. Then we can view different visualizations on ticker data, as well as on our account and how it's performing between symbols.

In this lesson, we've covered scatter plots, line charts, and bar charts, all in Plotly Express. We've also used Plotly Express to visualize the distribution of our data. In the next section we will be covering joining datasets, and later on in this course, we'll be building a visualization dashboard using Tableau. I'll see you in the next lesson.

Want full access to the Introduction to Real Estate Data Analytics course? Then sign up for the course in the link below.
You will learn Python programming, all with real estate related examples. This includes web scraping and retrieving data from sources like Zillow, Realtor, Redfin, Yahoo Finance, the US Census, and more.

If you haven't already, check out the introduction video to the course on YouTube to get a full understanding of what the course has to offer. Also, members of the free tech and real estate group on Facebook receive 20% off the course. See you in the next lesson.

Transcribed by https://otter.ai
