Get Your Feet Wet in Power B I

A hands-on introduction to Microsoft Analytics Tool


As a data scientist, you’ll need to learn to be comfortable with analytics tools sooner or later. In today’s post, we will dive headfirst and learn the very basics of Power BI.

Be sure to click on the images to better see some details.

The Data

The dataset that we will be using for today’s hands-on tutorial can be found at https://www.kaggle.com/c/instacart-market-basket-analysis/data. This dataset is “a relational set of files describing customers’ orders over time.” Download the zip files and extract them to a folder on your local hard drive.

Download Power BI Desktop

If you haven’t already, go to https://powerbi.microsoft.com/desktop and click on the “Download free” button.

Image for post
Screenshot by Ednalyn C. De Dios

If you’re using Windows 10, it will ask you to open Microsoft Store.

Image for post
Screenshot by Ednalyn C. De Dios

Go ahead and click on the “Install” button.

Image for post
Screenshot by Ednalyn C. De Dios

And let’s get started by clicking on the “Launch” button.

A Thousand Clicks

Image for post
Screenshot by Ednalyn C. De Dios

Click on “Get data” when the splash screen appears.

Image for post
Screenshot by Ednalyn C. De Dios

You will be presented with a lot of file format and sources; let’s choose “Text/CSV” and click on the “Connect” button.

Image for post
Screenshot by Ednalyn C. De Dios

Select “order_products_prior.csv” and click on the “Open” button.

Image for post
Screenshot by Ednalyn C. De Dios

The image below shows what the data looks like. Click on the “Load” button to load the dataset into Power BI Desktop.

Image for post
Screenshot by Ednalyn C. De Dios

Load the rest of the dataset by selecting “Get Data” and choosing the “Text/CSV” option on the dropdown.

Image for post
Screenshot by Ednalyn C. De Dios

You should have these three files loaded into Power BI Desktop:

  • order_products_prior.csv
  • orders.csv
  • products.csv

You should see the following tables appear on the “Fields” panel of Power BI Desktop, as shown below. (Note: the image shows Power BI in Report View.)

Image for post
Screenshot by Ednalyn C. De Dios

Let’s see what the Data View looks like by clicking on the second icon on the left side of Power BI Desktop.

Image for post
Screenshot by Ednalyn C. De Dios

And now, let’s check out the Model View where we will see how the different tables are related to each other.

Image for post
Screenshot by Ednalyn C. De Dios

If we hover a line, it will turn yellow and the corresponding related fields are both highlighted as well.

Image for post
Screenshot by Ednalyn C. De Dios

In this case, Power BI Desktop is smart to infer the two relationships. However, most of the time, we will have to create the relationships ourselves. We will cover this topic in the future.

Image for post
Screenshot by Ednalyn C. De Dios

Let’s go back to the Report View and examine the “Visualizations” panel closely. Look for the “slicer” icon which looks like a square with a funnel at the bottom right corner. Click on it to add a visual to the report.

Image for post
Screenshot by Ednalyn C. De Dios

In the “Fields” panel, find the “department_id” and click the checkbox on its left.

Image for post
Screenshot by Ednalyn C. De Dios

This will cause the “department_id” field to appear under the “Visualizations” panel in the “Field” box.

Next, take your mouse cursor and hover over the top right corner of the visual in the Report View. Click on the three dots that appeared in the corner as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

Click on “List” in the dropdown that appeared.

Image for post
Screenshot by Ednalyn C. De Dios

While the “department_id” visual is selected, you should see corner marks indicating the visual as the active visual. While the “department_id” is active, press CTRL+C to copy it and then CTRL+V to paste it. Move the new visual to the right of the original visual.

Make the second visual active by clicking somewhere inside it. Then look for the “aisle_id” field in the “Fields” panel on the right of Power BI Desktop as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

Try selecting a value on the “department_id” visual and observe how the selection on “aisle_id” changes accordingly.

Image for post
Screenshot by Ednalyn C. De Dios
Image for post
Screenshot by Ednalyn C. De Dios

Now, examine the “Visualizations” panel again and click on the table visual as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

In the “Fields” panel, select “product_id” and “product_name” or drag them in the “Values” box.

Image for post
Screenshot by Ednalyn C. De Dios

Power BI Desktop should look similar to the image below.

Image for post
Screenshot by Ednalyn C. De Dios

This time, try selecting a value from both “department_id” and “aisle_id” — observe what happens to the table visual on the right.

Image for post
Screenshot by Ednalyn C. De Dios

Let’s create another visual by copying and pasting the table visual. This time, select (or drag) the following fields to the “Values” box of the visual.

  • order_id
  • user_id
  • order_number
  • order_hour_of_day
  • order_dow
  • days_since_prior_order

Power BI Desktop should now look similar to the image below.

Image for post
Screenshot by Ednalyn C. De Dios

Try clicking one of the selections in the table visual (where it’s showing “product_id” and “product_name”) and observe how the table on the right changes accordingly.

Image for post
Screenshot by Ednalyn C. De Dios

For a closer look, activate Focus Mode by clicking on the icon as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

The table displays the details of orders that have the product that you selected in the table with “product_id” and “product_name.”

Get out of Focus Mode by clicking on “Back to report” as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

Let’s rename this page or tab by right-clicking on the page name (“Page 1”) and selecting “Rename Page.”

Image for post
Screenshot by Ednalyn C. De Dios

Type in “PRODUCTS” and press ENTER.

Image for post
Screenshot by Ednalyn C. De Dios

let’s add another page or tab to the report by right-clicking on the page name again (“PRODUCTS”) and selecting “Duplicate Page.”

Image for post
Screenshot by Ednalyn C. De Dios

Rename the new page “TRANSACTIONS” and delete (or remove) the right-most table with order details on it.

Change the top-left visual and make update the fields as shown below. The “Fields” box should say “order_dow” while the top-left visual is activated.

Move the visuals around so it looks similar below.

Image for post
Screenshot by Ednalyn C. De Dios

Do the same thing for the next visual. This time, select “order_hour_of_day” and your Power BI Desktop should like the image below.

Image for post
Screenshot by Ednalyn C. De Dios

Do the same thing one last time for the last table and it should now contain fields as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

Let’s add another page or tab to the report by clicking on the “+” icon at the bottom of the report’s main work area.

Image for post
Screenshot by Ednalyn C. De Dios

Basic Exploration

In the “Visualizations” panel, select “Stacked column chart.”

Image for post
Screenshot by Ednalyn C. De Dios

Resize the chart by drabbing their move-handles.

Make sure the “Axis” box contains “order_dow” and the “Values” box with “order_id” respectively. Power BI Desktop should automatically calculate the count for “order_id” and display the field as “Count of order_id” as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

The graph above is interesting because it shows a higher number of orders for Day 0 and Day 1.

Let’s make another chart.

We will follow the same procedure of adding a chart and for this time, we’ll use “order_hour_of_day” in the “Axis” box as shown below.

Image for post
Screenshot by Ednalyn C. De Dios

The graph shows the peak time for the number of orders.

One last graph!

We will add another chart with “days_since_prior_order” in the “Axis” box.

Image for post
Screenshot by Ednalyn C. De Dios

This last graph is the most interesting because the number of reorders peaks during these three time periods: 7 days, 14 days, and 30 days since prior order. This means that people are in a habit of resupplying every week, every two weeks, and every month.

That’s it, folks!

In the next article, we will “prettify” our charts and make them more readable to the others.

The procedures above have been drawn out. But if you’re a novice Power BI user, don’t despair! With regular practice, the concepts demonstrated in this article will soon become second nature and you’ll probably be able to do them in your sleep.

Thank you for reading. If you want to learn more about my journey from slacker to data scientist, check out the article below:From Slacker to Data ScientistMy journey into data science without a degree.towardsdatascience.com

And if you’re thinking about switching gears and venture into data science, start thinking about rebranding now:The Slacker’s Guide to Rebranding Yourself as a Data ScientistOpinionated advice for the rest of us. Love of math, optional.towardsdatascience.com

Stay tuned!

You can reach me on Twitter or LinkedIn.

This article was first published in Towards Data Science’ Medium publication.

Forecasting in Power BI

A visual step-by-step guide to forecasting using Power BI.


In this post, we’ll go through the process of creating forecasting in Power BI.

Get the Data

You can download the dataset that I used here. It contains daily female births in California in 1959¹. For a list of other time-series datasets, check out Jason Brownlee’s article 7 Time Series Datasets for Machine Learning – Machine Learning Mastery.

Let’s load the data into Power BI. Open up Power BI and click on “Get data” on the welcome screen as shown below.

Image for post
Screenshot by the Author

Next, you’ll be presented with another pane that asks what type of data we want to get. Select “Text/CSV” as shown below and click on “Connect.”

Image for post
Screenshot by the Author

When the File Open window appears, navigate to where we saved the dataset and click on the “Open” button on the lower right-hand corner.

Image for post
Screenshot by the Author

When a preview appears, just click on “Load.”

Image for post
Screenshot by the Author

We’ll now see the main working area of Power BI. Head over to the “Visualizations” panel and look for “Line Chart.”

Image for post
.Screenshot by the Author

This is what the line chart icon looks like:

Image for post
Screenshot by the Author

Next, a visual placeholder will appear. Drag the hot corner marking on the lower right-hand corner of the placeholder and drag it diagonally down and to the right corner of the main working area.

Image for post
Screenshot by the Author
Image for post
Screenshot by the Author

Next, head over the “Fields” panel.

Image for post
Screenshot by the Author

With the line chart placeholder still selected, find the “Date” field and click on the square box to put a checkmark on it.

Image for post
Screenshot by the Author

We’ll now see the “Date” field under Axis. Click on the down arrow on the right of the “Date” as shown below.

Image for post
Screenshot by the Author

Select “Date” instead of the default, “Date Hierarchy.”

Image for post
Screenshot by the Author

Then, let’s put a checkmark on the “Births” field.

Image for post
Screenshot by the Author

We’ll now see a line graph like the one below. Head over the Visualization panel and under the list of icons, find the analytics icon as shown below.

Image for post
Screenshot by the Author

Scroll down the panel and find the “Forecast” section. Click on the down arrow to expand it if necessary.

Image for post
Screenshot by the Author

Next, click on “+Add” to add forecasting on the current visualization.

Image for post
Screenshot by the Author

We’ll now see a solid gray fill area and a line plot to the right of the visualization like the one below.

Image for post
Screenshot by the Author

Let’s change the Forecast length to 31 points. In this case, a data point equals a day so 31 would roughly equate to a month’s worth of predictions. Click on “Apply” on the lower right-corner of the Forecast group to apply the changes.

Image for post
Screenshot by the Author

Instead of points, let’s change the unit of measure to “Months” instead as shown below.

Image for post
Screenshot by the Author

Once we click “Apply,” we’ll see the changes in the visualization. The graph below contains forecast for 3 months.

Image for post
Screenshot by the Author

What if we wanted to compare how the forecast compares to actual data? We can do this with the “Ignore last” setting.

For this example, let’s ignore the last 3 months of the data. Power Bi will then forecast 3 months worth of data using the dataset but ignoring the last 3 months. This way, we can compare the Power BI’s forecasting result with the actual data in the last 3 months of the dataset.

Let’s click on “Apply” when we’redone changing the settings as shown below.

Image for post
Screenshot by the Author

Below, we can see how the Power BI forecasting compares with the actual data. The black solid line represents the forecasting while the blue line represents the actual data.

Image for post
Screenshot by the Author

The solid gray fill on the forecasting represents the confidence interval. The higher its value, the large the area will be. Let’s lower our confidence interval to 75% as shown below and see how it affects the graph.

Image for post
Screenshot by the Author

The solid gray fill became smaller as shown below.

Image for post
Screenshot by the Author

Next, let’s take into account seasonality. Below, let’s set it to 90 points which is equivalent to about 3 months. Putting this value will tell Power BI to look for seasonality within a 3-month cycle. Play with this value with what makes sense according to the data.

Image for post
Screenshot by the Author

The result is show below.

Image for post
Screenshot by the Author

Let’s return our confidence interval to the default value of 95% and scroll down the group to see formatting options.

Image for post
Screenshot by the Author

Let’s change the forecasting line to an orange color and let’s make the gray fill disappear by changing formatting to “None.”

Image for post
Screenshot by the Author

And that’s it! With a few simple clicks of the mouse, we got ourselves a forecast from the dataset.

Thank you for reading. If you want to learn more about my journey from slacker to data scientist, check out the article below:From Slacker to Data ScientistMy journey into data science without a degree.towardsdatascience.com

And if you’re thinking about switching gears and venture into data science, start thinking about rebranding now:The Slacker’s Guide to Rebranding Yourself as a Data ScientistOpinionated advice for the rest of us. Love of math, optional.towardsdatascience.com

Stay tuned!

You can reach me on Twitter or LinkedIn.

[1] Machine Learning Mastery. (June 21, 2020). 7 Time Series Datasets for Machine Learning. https://machinelearningmastery.com/time-series-datasets-for-machine-learning/

This article was first published in Towards Data Science’ Medium publication.

Create an N-Gram Ranking in Power BI

A quick start guide on building a Python visual with a few simple clicks of the mouse and a dash of code.


In a previous article, I wrote a quick start guide on creating and visualizing n-gram ranking using nltk for natural language processing. However, I needed a way to share my findings with others who don’t have Python or Jupyter Notebook installed in their machines. I needed to use our organization’s BI reporting tool: Power BI.

Enter Python Visual.

The Python visual allows you to create a visualization generated by running Python code. In this post, we’ll walk through the steps needed to visualize the results of our n-gram ranking using this visual.

First, let’s get our data. You can download the sample dataset here. Then, we could load the data into Power BI Desktop as shown below:

Select Text/CSV and click on “Connect”.

Select the file in the Windows Explorer folder and click open:

Click on “Load”.

Next, find the Py icon on the “Visualizations” panel.

Then, click on “Enable” at the prompt that appears to enable script visuals.

You’ll see a placeholder appear in the main area and a Python script editor panel at the bottom of the dashboard.

Select the ‘text’ column on the “Fields” panel.

You’ll see a predefined script that serves as a preamble for the script that we’re going to write.

In the Python script editor panel, place your cursor at the end of line #6 and hit enter twice.

Then, copy and paste the following code:

import re
import unicodedata
import nltk
from nltk.corpus import stopwordsADDITIONAL_STOPWORDS = ['covfefe']import matplotlib.pyplot as pltdef basic_clean(text):
wnl = nltk.stem.WordNetLemmatizer()
stopwords = nltk.corpus.stopwords.words('english') + ADDITIONAL_STOPWORDS
text = (unicodedata.normalize('NFKD', text)
.encode('ascii', 'ignore')
.decode('utf-8', 'ignore')
.lower())
words = re.sub(r'[^\w\s]', '', text).split()
return [wnl.lemmatize(word) for word in words if word not in stopwords]words = basic_clean(''.join(str(dataset['text'].tolist())))bigrams_series = (pandas.Series(nltk.ngrams(words, 2)).value_counts())[:12]bigrams_series.sort_values().plot.barh(color='blue', width=.9, figsize=(12, 8))plt.show()

In a nutshell, the code above transforms extracts n-grams from the 'text' column of thedataset dataframe and creates a horizontal bar graph out of it using matplotlib. The result of plt.show() is what Power BI displays on the Python visual.

For more information on this code, please visit my previous tutorial.From DataFrame to N-GramsA quick-start guide to creating and visualizing n-gram ranking using nltk for natural language processing.towardsdatascience.com

After you’re done pasting the code, click on the “play” icon at the upper right corner of the Python script editor panel.

After a few moments, you should now be able to see the horizontal bar graph like the one below:

And that’s it!

With a few simple clicks of the mouse, along with some help from our Python script, we’re able to visualize the results of our n-gram ranking.


I hope you enjoyed today’s post on one of Power BI’s strongest features. Power BI already has some useful and beautiful built-in visuals but sometimes, you just need a little bit more flexibility. Running Python code helps with this. I hope this gentle introduction will encourage you to explore more and expand your repertoire.

In the next article, I’ll share a quick-start guide to extracting named-entities from a Pandas dataframe using spaCy.

Stay tuned!

You can reach me on Twitter or LinkedIn.

This article was first published in Towards Data Science‘ Medium publication.

Create a Network Graph in Power BI

In a previous article, I wrote a quick start guide to visualize a Pandas dataframe using networkx and matplotlib. While it was fun to learn and explore more about network graphs in Python, I got to thinking about how to present the results to others who don’t have Python or Jupyter Notebook installed in their machines. At TaskUs, we use Power BI for most of our reporting so I began to search for a custom Power BI visualization that can take the data and transform it into a meaningful network graph.

Enter Network Navigator.

Network Navigator is a custom visual in Power BI that is created by Microsoft. It allows you to “explore node-link data by panning over and zooming into a force-directed node layout (which can be precomputed or animated live).”¹ In this post, we’ll walk through the steps needed to create a network graph using the custom visual.

First, let’s get our data. You can download the sample dataset here. Then, we could load the data into Power BI Desktop as shown below:

Select Text/CSV and click on “Connect”.

Select the file in the Windows Explorer folder and click open:

Click on “Transform Data”.

Click on “Use first Row as Headers”.

Click on “Close & Apply”.

Next, find the three dots at the end of the “Visualizations” panel.

And select “Get more visuals”.

Point your mouse cursor inside the search text box and type in “network” and hit the “Enter” key and click on the “Add” button.

Wait a few moments and you’ll see the notification below. Click on the “OK” button to close the notification.

You’ll see a new icon appear at the bottom of the “Visualizations” panel as shown below.

Click on the new icon and you will see something similar to the picture below.

With the new visual placeholder selected, click on “Reporter” and “Assignee” in the “Fields” panel and it will automatically assign the columns as the Source and Target Node.

Let’s add labels by clicking on the paintbrush icon.

Click on “Layout” to expand the section and scroll down until you see the “Labels” section.

Click on the toggle switch under “Labels” to turn it on

and voila!

That’s it! With a few simple clicks of the mouse, we’re able to create a network graph from a csv file.


I hope you enjoy today’s post on one of Power BI’s coolest visuals. Network graph analysis is a big topic but I hope this gentle introduction will encourage you to explore more and expand your repertoire.

In the next article, I’ll share my journey from slacker to data scientist and I hope it’ll inspire others instead of being dissuaded by haters.

Stay tuned!

You can reach me on Twitter or LinkedIn.

[1]: Business Apps — Microsoft AppSource. (May 16, 2020). Network Navigator Chart https://appsource.microsoft.com/en-us/product/power-bi-visuals/WA104380795?src=office&tab=Overview