Using MapQuest API to Get Geo Data

A friendly tutorial on getting zip codes and other geographic data from street addresses.


Knowing how to deal with geographic data is a must-have for a data scientist. In this post, we will play around with the MapQuest Search API to get zip codes from street addresses along with their corresponding latitude and longitude to boot!

The Scenario

In 2019, my friends and I participated in CivTechSA Datathon. At one point in the competition, we wanted to visualize the data points and overlay them on San Antonio’s map. The problem is, we had incomplete data. Surprise! All we had were a street number and a street name — no zip code, no latitude, nor longitude. We then turned to the great internet for some help.

We found a great API by MapQuest that will give us exactly what we needed. With just a sprinkle of Python code, we were able to accomplish our goal.

Today, we’re going to walk through this process.

The Data

To follow along, you can download the data from here. Just scroll down to the bottom tab on over to the Data Catalog 2019. Look for SAWS (San Antonio Water System) as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

Download the file by clicking on the link to the Excel file.

Image for post
Screenshot by Ednalyn C. De Dios

OR, you can click on this.

MapQuest API Key

Head on over to https://developer.mapquest.com/ and create an account to get a free API key.

Image for post
Screenshot by Ednalyn C. De Dios
Image for post
Screenshot by Ednalyn C. De Dios
Image for post
Screenshot by Ednalyn C. De Dios
Image for post
Screenshot by Ednalyn C. De Dios
Image for post
Screenshot by Ednalyn C. De Dios

Copy the ‘Consumer Key’ and keep it in a safe place. We’ll need it later.

Jupyter Notebook

Now, let’s fire up a Jupyter notebook and get coding!

For starters, let’s set up the environment by doing a couple of imports.https://towardsdatascience.com/media/7d0f7ced4082761e995ecf8ce0213c3f

Don’t forget to replace the API_KEY (line#12) with your own key above.

Now. let’s read the Excel file with a simple df = pd.read_excel().

Image for post
Screenshot by Ednalyn C. De Dios

Next, we’ll combine the street number and street name columns.https://towardsdatascience.com/media/1696465db27770b7f2942ab707d2efa5

Image for post
Screenshot by Ednalyn C. De Dios

The ALL CAPS hurts my eyes. Let’s do something about it:

df['street_address'] = df.street_address.str.title() .
Image for post
Screenshot by Ednalyn C. De Dios

Below are two functions that call the API and returns geo data.https://towardsdatascience.com/media/3ec6009e8b6069387a9edde18bdad0d3

We can manually call it with the line below. Don’t forget to replace the ‘#####’ with your own API key. You can use any address you want (replace spaces with a + character).

get_zip('https://www.mapquestapi.com/geocoding/v1/address?key=####################&inFormat=kvp&outFormat=json&location=100+ Military+Plaza&thumbMaps=false&delimiter=%2C')

But we’ve got many addresses, so we’ll use a loop to call the API repeatedly.https://towardsdatascience.com/media/9a970862f0352997417c5211df359a9b

Let’s see what the result looks like:

Image for post
Screenshot by Ednalyn C. De Dios

Finally, let’s create a dataframe that will house the street addresses — complete with zip code, latitude, and longitude.https://towardsdatascience.com/media/adfcc23ff94f54877bc80b72e2537ed9

Voila! We’ve got ourselves geo data.

Image for post
Screenshot by Ednalyn C. De Dios

For extra credit, let’s import the data in Tableau and get a pretty spiffy visual:

Image for post
Screenshot by Ednalyn C. De Dios

And that’s it, folks!

You can find the jupyter notebook here.

Thanks for stopping by and reading my post. Hope it was useful 🙂

If you want to learn more about my journey from slacker to data scientist, check out the article below:From Slacker to Data ScientistMy journey into data science without a degree.towardsdatascience.com

And if you’re thinking about switching gears and venture into data science, start thinking about rebranding now:The Slacker’s Guide to Rebranding Yourself as a Data ScientistOpinionated advice for the rest of us. Love of math, optional.towardsdatascience.com

Stay tuned!

You can reach me on Twitter or LinkedIn.

This article was first published in Towards Data Science’ Medium publication.

Get Your Feet Wet in Power B I

A hands-on introduction to Microsoft Analytics Tool


As a data scientist, you’ll need to learn to be comfortable with analytics tools sooner or later. In today’s post, we will dive headfirst and learn the very basics of Power BI.

Be sure to click on the images to better see some details.

The Data

The dataset that we will be using for today’s hands-on tutorial can be found at https://www.kaggle.com/c/instacart-market-basket-analysis/data. This dataset is “a relational set of files describing customers’ orders over time.” Download the zip files and extract them to a folder on your local hard drive.

Download Power BI Desktop

If you haven’t already, go to https://powerbi.microsoft.com/desktop and click on the “Download free” button.

Image for post
Screenshot by Ednalyn C. De Dios

If you’re using Windows 10, it will ask you to open Microsoft Store.

Image for post
Screenshot by Ednalyn C. De Dios

Go ahead and click on the “Install” button.

Image for post
Screenshot by Ednalyn C. De Dios

And let’s get started by clicking on the “Launch” button.

A Thousand Clicks

Image for post
Screenshot by Ednalyn C. De Dios

Click on “Get data” when the splash screen appears.

Image for post
Screenshot by Ednalyn C. De Dios

You will be presented with a lot of file format and sources; let’s choose “Text/CSV” and click on the “Connect” button.

Image for post
Screenshot by Ednalyn C. De Dios

Select “order_products_prior.csv” and click on the “Open” button.

Image for post
Screenshot by Ednalyn C. De Dios

The image below shows what the data looks like. Click on the “Load” button to load the dataset into Power BI Desktop.

Image for post
Screenshot by Ednalyn C. De Dios

Load the rest of the dataset by selecting “Get Data” and choosing the “Text/CSV” option on the dropdown.

Image for post
Screenshot by Ednalyn C. De Dios

You should have these three files loaded into Power BI Desktop:

  • order_products_prior.csv
  • orders.csv
  • products.csv

You should see the following tables appear on the “Fields” panel of Power BI Desktop, as shown below. (Note: the image shows Power BI in Report View.)

Image for post
Screenshot by Ednalyn C. De Dios

Let’s see what the Data View looks like by clicking on the second icon on the left side of Power BI Desktop.

Image for post
Screenshot by Ednalyn C. De Dios

And now, let’s check out the Model View where we will see how the different tables are related to each other.

Image for post
Screenshot by Ednalyn C. De Dios

If we hover a line, it will turn yellow and the corresponding related fields are both highlighted as well.

Image for post
Screenshot by Ednalyn C. De Dios

In this case, Power BI Desktop is smart to infer the two relationships. However, most of the time, we will have to create the relationships ourselves. We will cover this topic in the future.

Image for post
Screenshot by Ednalyn C. De Dios

Let’s go back to the Report View and examine the “Visualizations” panel closely. Look for the “slicer” icon which looks like a square with a funnel at the bottom right corner. Click on it to add a visual to the report.

Image for post
Screenshot by Ednalyn C. De Dios

In the “Fields” panel, find the “department_id” and click the checkbox on its left.

Image for post
Screenshot by Ednalyn C. De Dios

This will cause the “department_id” field to appear under the “Visualizations” panel in the “Field” box.

Next, take your mouse cursor and hover over the top right corner of the visual in the Report View. Click on the three dots that appeared in the corner as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

Click on “List” in the dropdown that appeared.

Image for post
Screenshot by Ednalyn C. De Dios

While the “department_id” visual is selected, you should see corner marks indicating the visual as the active visual. While the “department_id” is active, press CTRL+C to copy it and then CTRL+V to paste it. Move the new visual to the right of the original visual.

Make the second visual active by clicking somewhere inside it. Then look for the “aisle_id” field in the “Fields” panel on the right of Power BI Desktop as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

Try selecting a value on the “department_id” visual and observe how the selection on “aisle_id” changes accordingly.

Image for post
Screenshot by Ednalyn C. De Dios
Image for post
Screenshot by Ednalyn C. De Dios

Now, examine the “Visualizations” panel again and click on the table visual as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

In the “Fields” panel, select “product_id” and “product_name” or drag them in the “Values” box.

Image for post
Screenshot by Ednalyn C. De Dios

Power BI Desktop should look similar to the image below.

Image for post
Screenshot by Ednalyn C. De Dios

This time, try selecting a value from both “department_id” and “aisle_id” — observe what happens to the table visual on the right.

Image for post
Screenshot by Ednalyn C. De Dios

Let’s create another visual by copying and pasting the table visual. This time, select (or drag) the following fields to the “Values” box of the visual.

  • order_id
  • user_id
  • order_number
  • order_hour_of_day
  • order_dow
  • days_since_prior_order

Power BI Desktop should now look similar to the image below.

Image for post
Screenshot by Ednalyn C. De Dios

Try clicking one of the selections in the table visual (where it’s showing “product_id” and “product_name”) and observe how the table on the right changes accordingly.

Image for post
Screenshot by Ednalyn C. De Dios

For a closer look, activate Focus Mode by clicking on the icon as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

The table displays the details of orders that have the product that you selected in the table with “product_id” and “product_name.”

Get out of Focus Mode by clicking on “Back to report” as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

Let’s rename this page or tab by right-clicking on the page name (“Page 1”) and selecting “Rename Page.”

Image for post
Screenshot by Ednalyn C. De Dios

Type in “PRODUCTS” and press ENTER.

Image for post
Screenshot by Ednalyn C. De Dios

let’s add another page or tab to the report by right-clicking on the page name again (“PRODUCTS”) and selecting “Duplicate Page.”

Image for post
Screenshot by Ednalyn C. De Dios

Rename the new page “TRANSACTIONS” and delete (or remove) the right-most table with order details on it.

Change the top-left visual and make update the fields as shown below. The “Fields” box should say “order_dow” while the top-left visual is activated.

Move the visuals around so it looks similar below.

Image for post
Screenshot by Ednalyn C. De Dios

Do the same thing for the next visual. This time, select “order_hour_of_day” and your Power BI Desktop should like the image below.

Image for post
Screenshot by Ednalyn C. De Dios

Do the same thing one last time for the last table and it should now contain fields as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

Let’s add another page or tab to the report by clicking on the “+” icon at the bottom of the report’s main work area.

Image for post
Screenshot by Ednalyn C. De Dios

Basic Exploration

In the “Visualizations” panel, select “Stacked column chart.”

Image for post
Screenshot by Ednalyn C. De Dios

Resize the chart by drabbing their move-handles.

Make sure the “Axis” box contains “order_dow” and the “Values” box with “order_id” respectively. Power BI Desktop should automatically calculate the count for “order_id” and display the field as “Count of order_id” as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

The graph above is interesting because it shows a higher number of orders for Day 0 and Day 1.

Let’s make another chart.

We will follow the same procedure of adding a chart and for this time, we’ll use “order_hour_of_day” in the “Axis” box as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

The graph shows the peak time for the number of orders.

One last graph!

We will add another chart with “days_since_prior_order” in the “Axis” box.

Image for post
Screenshot by Ednalyn C. De Dios

This last graph is the most interesting because the number of reorders peaks during these three time periods: 7 days, 14 days, and 30 days since prior order. This means that people are in a habit of resupplying every week, every two weeks, and every month.

That’s it, folks!

In the next article, we will “prettify” our charts and make them more readable to the others.

The procedures above have been drawn out. But if you’re a novice Power BI user, don’t despair! With regular practice, the concepts demonstrated in this article will soon become second nature and you’ll probably be able to do them in your sleep.

Thank you for reading. If you want to learn more about my journey from slacker to data scientist, check out the article below:From Slacker to Data ScientistMy journey into data science without a degree.towardsdatascience.com

And if you’re thinking about switching gears and venture into data science, start thinking about rebranding now:The Slacker’s Guide to Rebranding Yourself as a Data ScientistOpinionated advice for the rest of us. Love of math, optional.towardsdatascience.com

Stay tuned!

You can reach me on Twitter or LinkedIn.

This article was first published in Towards Data Science’ Medium publication.