How to use Incremental Refresh on ANY data source!

A few days ago, a colleague asked me if it would be possible to get more data out of the Azure Cost Management API than only the last 30 days. Obviously, my first thought was: sure, let's use Azure Synapse and store it in a Data Lake. But then the really challenging question was asked: Would it be possible purely with Power BI, without any other services and tools? So I brainstormed with my good colleague and co-organizer of the Power BI User Group Switzerland, Denis Selimovic. After a few minutes we (mainly him, but I'll never admit it 😀 ) came up with the idea of using a Datamart as staging area and a Dataflow afterwards to enable Incremental Refresh. With this workaround, we'll have an Azure SQL DB (that's what a Datamart technically is behind the scenes) as staging area, and therefore Incremental Refresh will work because Query Folding is possible! Denis already wrote a great article about how to set this up for the Power BI activity logs, which are only kept for the last 30 days. Check it out here: https://whatthefact.bi/power-bi/power-bi-datamart/persisting-temporary-accessible-data-via-power-bi-datamarts-with-the-example-of-power-bi-activity-logs/

In my blog post I'm going to use a SharePoint site and test the different scenarios (deleting, modifying, and adding data) at the end. I just want to highlight one more time: this approach will work with any data source that Power Query / Datamarts can connect to. So it will also work with Excel sheets, CSV files, BLOB storage, etc. What a game changer!

What is this Incremental Refresh, Query Folding, and why should I care?

Usually, when you connect to a data source with Power BI – and once your transformation and modelling is done – you set up an automatic refresh of the dataset. The beauty of this is that all data is refreshed every time. That works perfectly fine for small datasets. But what if you wish to update only the last few days because there is no need to refresh data from past years that never changes? Take, for example, a sales report showing my sales from 2012 – 2022. Sales from the years 2012 – 2021 usually don't change, so there is no need to update them on a regular basis; instead, we're looking for a way to update only the last 7 days of 2022 in this example. This speeds up the dataset refresh, and that's exactly what Incremental Refresh does. As the creator of a dataset, you can define how many days, months, or years you wish to refresh, while everything older than that is simply kept as it is. More details about Incremental Refresh can be found here: https://docs.microsoft.com/en-us/power-bi/connect-data/incremental-refresh-overview

And how does Query Folding play a role in this whole setup? Because we configure a specific date range in our refresh (in our example we wish to refresh only the last 7 days of 2022), this date has to be provided somehow to the data source. If we're talking SQL, this means there has to be a WHERE clause somewhere filtering the data to the last 7 days. When Power BI connects to the data source, it tries to create queries in the data source's language (so if we connect to a SQL DB, it will talk SQL) and, on top of that, it tries to push all the different transformations we did in Power Query down to the data source. As an example, if we rename a column from “Column A” to “Revenue” and our data source is SQL, it will generate something like SELECT [Column A] AS [Revenue] so that SQL does the transformation. This is exactly what Query Folding is: it pushes the transformations down to the data source. My friend and MVP Nikola Ilic wrote a great blog post about Query Folding which you can find here: https://data-mozart.com/what-is-a-query-folding-in-power-bi-and-why-should-i-care/ or, if you're more interested in the Microsoft Docs, follow this link: https://docs.microsoft.com/en-us/power-query/query-folding-basics
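To make this a bit more tangible, here is a minimal sketch (server, database, and table names are just placeholders) of a rename step in Power Query and the SQL it roughly folds into:

let
    // Placeholder connection - any foldable source behaves the same way
    Source = Sql.Database("myserver.database.windows.net", "SalesDB"),
    Sales = Source{[Schema = "dbo", Item = "Sales"]}[Data],
    // This rename folds into: SELECT [Column A] AS [Revenue] FROM [dbo].[Sales]
    Renamed = Table.RenameColumns(Sales, {{"Column A", "Revenue"}})
in
    Renamed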

Because Incremental Refresh requires Query Folding to fetch only the latest data we're looking for, not all data sources are supported. For example, Excel, BLOB, and CSV files couldn't be incrementally refreshed – until now!

Power BI Datamarts

During Build 2022 in May, Microsoft announced a new artefact called Power BI Datamarts (see https://powerbi.microsoft.com/en-us/blog/democratize-enterprise-analytics-with-microsoft-power-bi/) to democratize enterprise analytics for everyone. With Datamarts, every user has an intuitive no code / low code database solution at hand, as behind the scenes an Azure SQL Database is created.

A datamart creator can use the no code experience to extract, transform, and load data into a database that is fully managed by Power BI. There’s no need to worry about creating and managing dataflows or data refresh schedules—it’s all automatic. The user gets an intuitive SQL and visual querying interface for performing ad-hoc analysis on the data. Users can then connect to the datamart using external SQL-aware tools for further analysis.

Arun Ulagaratchagan

Therefore, we can connect to any data source, load it into a Datamart, and technically store it in a database. Because our data now sits in a database, we can connect to it with a Dataflow and set up Incremental Refresh, as Query Folding is now supported!

Let’s create a Datamart

As of today, Power BI Datamarts are in Public Preview and a Premium feature, so Premium, Premium per User, or Embedded is required. In my case I'm going to use a PPU license to create a Datamart. To do so, I log in to PowerBI.com and select my demo workspace PBI Guy. In there, I choose New and select Datamart.

For the purpose of this blog post, I’m going to use a SharePoint list but as mentioned already, you can easily use something else like an Excel Sheet, CSV file, etc.

Therefore, I select Get data from another source and choose SharePoint Online list afterwards. Once selected, I provide my SharePoint site and my credentials, select my list, and hit Transform data.

In Power Query Online I select only the needed columns (ID, Title, Date, and Revenue) and make sure that all data types are correct. As Incremental Refresh requires a DateTime column, please ensure your date column is set up correctly.
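For reference, the generated query looks roughly like the sketch below (the site URL and list name are placeholders); the important part is that the Date column really ends up as a datetime type:

let
    // Placeholder SharePoint site - replace with your own site URL and list name
    Source = SharePoint.Tables("https://contoso.sharepoint.com/sites/PBIGuy"),
    SalesList = Source{[Title = "Sales"]}[Items],
    // Keep only the needed columns and make sure Date is a proper datetime type
    SelectedColumns = Table.SelectColumns(SalesList, {"ID", "Title", "Date", "Revenue"}),
    ChangedTypes = Table.TransformColumnTypes(SelectedColumns, {{"ID", Int64.Type}, {"Date", type datetime}, {"Revenue", type number}})
in
    ChangedTypes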

Once done, I select to load the data into my Datamart, and rename it on the next screen to “Staging Datamart” by selecting the arrow at the top.

Next, I create a Dataflow which will connect to my Datamart. Before I do so, I go back to my workspace, select the three dots beside my newly created Datamart, and hit Settings.

In there, I expand Server settings and copy the string.

Now I head back to my workspace, select New, and choose Dataflow.

On the next screen, I select Add new Table, and search for Azure SQL Database.

Once selected, I provide the copied Datamart (Azure SQL) connection string as Server name, select Authentication kind “Organizational account”, and select Next.

On the next screen, I select my table, and check in the Preview window if the data is correct. Once approved, I select Transform data.

In the Power Query Online experience, I don’t have to adjust anything anymore, but it would be possible if needed. Therefore, I just select Save & close, and save my Dataflow on the next screen with the name “Incremental Refresh”.
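Just for reference, the query the Dataflow generates looks roughly like this sketch (the server string is the one copied from the Datamart settings; database, schema, and table name are placeholders):

let
    // Placeholder values - use the connection string copied from the Datamart's Server settings
    Source = Sql.Database("xyz123.datamart.pbidedicated.windows.net", "Staging Datamart"),
    Sales = Source{[Schema = "model", Item = "Sales"]}[Data]
in
    Sales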

As a next step, I have to configure Incremental Refresh. Luckily, this is pretty straightforward. I just select the Incremental Refresh button, turn it on, and choose my Date column within the Dataflow as the required DateTime column.

Lastly, I configure it to store the past 3 years and only refresh the last 7 days. After hitting Save, the configuration is finished.
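Conceptually, the Incremental Refresh policy filters the source on a rolling date window and folds that filter down to the Datamart's SQL endpoint – roughly like the sketch below. The Dataflow generates and manages this filter itself; RangeStart and RangeEnd stand for the datetime boundaries derived from the policy, and the table names are placeholders.

let
    Source = Sql.Database("xyz123.datamart.pbidedicated.windows.net", "Staging Datamart"),
    Sales = Source{[Schema = "model", Item = "Sales"]}[Data],
    // This date filter is what folds into a WHERE clause on the SQL endpoint,
    // so only the configured refresh window is re-queried
    Filtered = Table.SelectRows(Sales, each [Date] >= RangeStart and [Date] < RangeEnd)
in
    Filtered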

Once saved, a window pops up at the top right asking to refresh the Dataflow now. I do so by selecting the Refresh now button to load the data into the Dataflow.

It’s time to test

Now that we have set up everything (connecting a Datamart to our data source, connecting a Dataflow to our Datamart, and setting up Incremental Refresh), let's test if it works as expected. Today is the 14th of July 2022. In my demo list, I have some sales starting on the 1st of July until today. If I now connect Power BI to my Dataflow, I see all five entries coming indirectly from SharePoint – so far so good.

Now, let's make some changes in the SharePoint list. I delete two rows, one from the 1st of July and one from the 11th of July. Further, I change the name of Product 2 to Product 22 on the 4th of July and update the Revenue on the 7th. Lastly, I add a new sale for today.

Our first step is now to trigger a refresh of our Datamart. Once the refresh has finished successfully, we see a 1:1 copy of our SharePoint list.

Now, let’s trigger a refresh of our Dataflow. Once it’s finished, I hit the refresh button in my Power BI Desktop, which is connected to the Dataflow, to see the end result.

And as expected, Product 1 and 2 haven't changed! So we now have some historization in Power BI – awesome! But let's analyze each row to understand the behavior.

Because we set up the Incremental Refresh to refresh only the last 7 days, everything prior to that is ignored. Because Product 1 and 2 are older than 7 days, the changes didn't affect our data in the Dataflow. But what about Product 3, which is dated 7th of July? From an Incremental Refresh point of view, this is 8 days ago, because

  • 14. July = Day 1
  • 13. July = Day 2
  • 12. July = Day 3
  • 11. July = Day 4
  • 10. July = Day 5
  • 09. July = Day 6
  • 08. July = Day 7

and therefore the 7th of July hasn't been updated in our refresh either. Product 4, which was dated on the 11th of July, has been removed – this is as expected. And lastly, our newest sale from today has been added (Product 6), which is also as expected.

Great, this is a real game changer: with Power BI you can now create a real staging area and, on top, use Incremental Refresh to historize your data! But keep in mind, with this approach the data will only be available in the Dataflow. I would highly recommend using at least your own Azure Data Lake Storage Gen2 to store the Dataflow in (see https://docs.microsoft.com/en-us/power-bi/transform-model/dataflows/dataflows-azure-data-lake-storage-integration). This way, you can access and enhance it if needed. Further, you can make backups and ensure the data is not lost if you delete your Dataflow.

Please let me know if this post was helpful and give me some feedback. Also feel free to contact me if you have any questions.

If you're interested in the files used in this blog, check out my GitHub repo https://github.com/PBI-Guy/blog. This time I'll provide only the PBIT file, which contains neither the data nor the user and password. Thanks for understanding.

How to loop through an API with Power BI without knowing last page

Recently a customer reached out to me with a challenge because he knows I love challenges and especially solving them. In his case he got access to the Clarksons Research API, and he would like to connect Power BI to it. So far so good, pretty straightforward. But this API returns a maximum number of rows per page, and it doesn't tell you how many pages there are in total. And obviously this can change in the future as more data becomes available through the API, so he's looking for a dynamic approach to loop through the API, get all data from all pages, and import it into Power BI. Now we have a challenge, and I'm happy to walk you through my solution approach and how it can be solved.

Setting the scene

As described above, we're going to use the Clarksons Research API. Our goal is to connect Power BI to it and get all data available in a dynamic way, meaning that if more (or less) data becomes available in the future, the automatic refresh should react dynamically and get everything available.

First Steps

My first step was to connect with Power BI to check if a connection is possible at all. Once logged in to the Clarksons Research API, we can even find example code showing how to connect to it with Power BI – nice!

Following this approach, we first have to set up a dynamic authentication. This means we have to request a token which can be used to authenticate against the API. Because the token expires after a while, we have to create a function which is called each time to generate a new token. This is also well documented in the API's “Set up a dynamic authentication” example, so I'm going to use the same code. To create the function I open Power Query, select Get Data, and choose Blank Query. Once loaded, I select Advanced editor and copy & paste the code.

let
    Source = () => let
        url = "https://www.clarksons.net",
        body = "{""username"": ""YOUR_USERNAME"",""password"": ""YOUR_PASSWORD""}",
        Source = Json.Document(Web.Contents(url, [
            Headers = [#"accept" = "application/json", #"Content-Type" = "application/json"],
            Content = Text.ToBinary(body),
            RelativePath = "/api/user/ApiAuthentication/GenerateAuthenticationToken"
        ])),
        access_token = Source[token]
    in
        access_token
in
    Source

First thing I do is test whether I get a token from my newly created function. So, I select it, rename it to “Get Token”, and hit Invoke. I get a big string back which represents the token, so my function works.

Obviously, I have to provide a username and password in the function (I marked that part red in the screen shot above). To make my life easier – so that I don't have to update the username and password in the code every time I change my password – I created two text parameters which hold my values. To have a better structure in Power Query I created two folders to hold Parameters and Functions. This is purely for structuring my Power Query and has no effect on the code. Afterwards, I add the two new parameters to the function, replacing the hardcoded values.
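With the two text parameters in place (I call them Username and Password here – adjust to whatever you named yours), the function body looks roughly like this sketch:

let
    Source = () => let
        url = "https://www.clarksons.net",
        // Username and Password are the two text parameters created above (names assumed)
        body = "{""username"": """ & #"Username" & """, ""password"": """ & #"Password" & """}",
        Source = Json.Document(Web.Contents(url, [
            Headers = [#"accept" = "application/json", #"Content-Type" = "application/json"],
            Content = Text.ToBinary(body),
            RelativePath = "/api/user/ApiAuthentication/GenerateAuthenticationToken"
        ])),
        access_token = Source[token]
    in
        access_token
in
    Source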

A quick test by invoking the function again shows that it still works, and I get a token back. I copy the token as I'll need it again in a few seconds.

As a next step I can now call the API and authenticate with the token from the function. I used the Web connector, entered the example URL https://www.clarksons.net/api/vessels?Page=1&PageSize=20 and selected Advanced at the top. The reason is that we have to add an HTTP request header and provide the token. This is simply done by choosing Advanced, adding Authorization as parameter name at the bottom, and adding the value “Bearer ” followed by the token copied previously. Attention: the space after “Bearer” is required!

Once done, I hit OK and choose Anonymous to connect. Now I get the first 20 rows coming from the API.

My first test worked perfectly fine, but I need to adjust two more things in the M query. I hardcoded the token in the request, but I want to get it dynamically as it can expire, and I don't want to provide it manually every time. So, I choose Advanced Editor and add the function to the header details of my request. On top, I have to specify a RelativePath, otherwise my dataset will not refresh. This means my whole M code now looks as follows:
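Reconstructed as a sketch, the new query looks roughly like this (the Get Token function is the one created earlier; page and page size are still hardcoded at this point):

let
    Source = Json.Document(Web.Contents("https://www.clarksons.net", [
        // The token is generated on every refresh by the Get Token function
        Headers = [#"Authorization" = "Bearer " & #"Get Token"(), #"accept" = "application/json"],
        // RelativePath is required so the dataset can refresh in the Service
        RelativePath = "/api/vessels?Page=1&PageSize=20"
    ]))
in
    Source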

So far so good. This means I can now connect to the API, get a result, and the token is dynamically created and provided. Now I have to get all the data and not just the top 20 rows.

Understanding the API

As I don't get any indication of how many pages there are and how many results per page I can get (unfortunately the documentation is not really good…), my next step is to further parametrize the request so I can test out the limits of the API. To not lose my work done so far, I copy the whole M query of my request, select Get Data, choose Blank Query, and paste the whole query. This way I now have two tables. I rename one to “Hardcoded” and the other one to “Parametrized”. This way I can always check the result and make sure the API provides me the right data.

As my next step, I create two new parameters called Page Number and Rows, both of type Decimal Number. For Page Number I enter 1 (for the first page) and for Rows I enter 20. These are the values we see in the relative path URL. In my first try I want to make sure I get the same result as the hardcoded one. Afterwards I update the M code and parametrize the RelativePath by concatenating the two parameters into the path.

Once I hit enter, I get an error message saying Power Query can't apply the & operator to types Text and Number.

Because we decided to set our parameters as numbers, Power Query can't concatenate a number with text. This means we now have two options: change the parameters to Text or convert the parameters to text in our M code. I choose the second option, as there is a Number.ToText() function in M, so I update my code as follows:
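Wrapped in Number.ToText(), the parametrized relative path looks roughly like this sketch:

let
    Source = Json.Document(Web.Contents("https://www.clarksons.net", [
        Headers = [#"Authorization" = "Bearer " & #"Get Token"(), #"accept" = "application/json"],
        // Number.ToText() converts the numeric parameters so they can be concatenated into the text path
        RelativePath = "/api/vessels?Page=" & Number.ToText(#"Page Number") & "&PageSize=" & Number.ToText(#"Rows")
    ]))
in
    Source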

After hitting the Done button, I see the same result as the hardcoded one – perfect! Now let's test how many rows I can get back per page. By just updating the parameter “Rows” with a new number, I see a bigger result set. Once I enter a number that is too big, Power Query returns an error. After trying some different numbers, I figured out that the maximum number of rows per page is 1000 in this case. So I leave the parameter Rows at 1000.

Next, we have to figure out which page is currently the last one. Same procedure: I update the Page Number parameter until I get an error or an empty result and figure out what the maximum number is. In this case the last page is 202. This means if I get 1000 rows per page, there are 202 pages in total (so roughly 202'000 rows), and if I set the parameter to page 203, I don't get a “real data row” back.

Now I know how I can call the API, how many rows per page I can get back, and how many pages there are currently.

The dynamic approach

Until now I've been calling the API hardcoded through parameters. But what if I could call the API multiple times and combine the output into one big table? Of course, this would work manually by adding a new query for each page, but that's not really efficient (as Patrick LeBlanc from Guy in the Cube says: I'm not lazy, I'm efficient!). Therefore, I'm going to create another function which will call the API. To the function I'll provide a parameter which defines which page I wish to call. For example, if I provide the value 1, the first page of the API should be called, giving me the first 1000 rows back. If I provide the value 2, the second page of the API should be called, giving me the second 1000 rows back, and so on. To not lose my progress so far, I create another Blank Query (select New Source, Blank Query), rename it to Dynamic, and open the Advanced Editor. In there I copy and paste the first line of the Parametrized table's M code – see screen shot below. The upper M code shows the Parametrized table, the lower shows the new Dynamic M code.

Now I’m going to create a function out of it by simply putting (page as number) => at the top. This means my new function will expect a number parameter called page as input.

Lastly, I have to make sure the provided input is handed over to my API call. Therefore, I update the piece of code where I'm providing the previously created Page Number parameter and replace it with the page parameter of the function.

Now I have a function and if I enter a number, a new table will be created with the current data from the provided page.

As we can see there is still some work to do to get one nice and clean table. I'm interested only in the “results”, so I select “List” to navigate further. And because I'm efficient, I open the Advanced Editor again and copy the newly created step to paste it into my function as well. This way I don't have to navigate in my table; the function already gives me back what I'm looking for. If you do this, don't forget to add a comma at the end of the “Source” line.
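Putting the pieces together, my Dynamic function roughly looks like this sketch (page size still comes from the Rows parameter; the navigation to the results list is the step copied in at the end):

(page as number) =>
let
    Source = Json.Document(Web.Contents("https://www.clarksons.net", [
        Headers = [#"Authorization" = "Bearer " & #"Get Token"(), #"accept" = "application/json"],
        // The function's page argument replaces the former Page Number parameter
        RelativePath = "/api/vessels?Page=" & Number.ToText(page) & "&PageSize=" & Number.ToText(#"Rows")
    ])),
    // Navigate straight to the results list so the function returns only the rows of the requested page
    results = Source[results]
in
    results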

To make sure it works I test it by invoking the function again and yes, it works.

As a next step I create a list in which each row holds one number, counting upwards. In Power Query there is a function for that called List.Generate(). Let's test it by creating a list with the numbers 1 – 10.
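A minimal sketch of such a test list looks like this:

let
    // Start at 1, keep generating while the value is <= 10, increment by 1 per row
    Source = List.Generate(() => 1, each _ <= 10, each _ + 1)
in
    Source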

The first argument defines where the list starts (number 1), the second where it should end (10), and the third in which steps it should increment (+1 for each new row). Once done, we have to convert the list to a table. This is pretty straightforward in Power Query: select the list, hit the Transform menu in the ribbon, and choose “To Table”.

On the next screen we just confirm by selecting OK.

Now I want to test the Dynamic function by invoking it in my newly generated table. This way the function is called for each row, so each number is provided to the function as the page, and if everything works as expected I'll get 10 pages back, each containing 1000 rows. To do so I select Add Column in the ribbon and choose Invoke Custom Function. I name my new column “Result”, select Dynamic as the function query, and hit OK.

Awesome, I get a result per number (page) as a List. This means I would now just need to transform my data to extract the results into one big table, but I still have the issue that my approach is not dynamic. I hardcoded the list numbers to start at 1 and end at 10, but we have 202 pages. Of course, I could hardcode that (or pass the parameter) to create the list, but it's still hardcoded. I wish to create list entries until no more pages are available. Luckily, List.Generate() lets us test against a condition, and as long as this condition is true, it creates new rows. Once the condition is no longer true, it stops. In this case my condition should be something like “create a new number / row per page coming from the API until I don't receive any rows / pages from the API anymore”. Let me first test what I get back if I provide the number 203 to my Dynamic function, because that page doesn't exist. Once done, I see the result is empty.

This means I can check whether the result is empty and, if so, stop creating new rows. In M code this looks as follows:

List.IsEmpty([Result]) = false

Further, List.Generate() asks where to start the list. I wish to provide that dynamically from the API, but I also want to make sure that no error occurs during refresh if no page is available. So I try whether I get something back for page number 1 (that's where I start) by calling my Dynamic function and, if not, return null. On top, I create a pagenumber field indicating that I start at page number 1, which I'll use afterwards to count onwards until we reach the end. I save the whole result in the Result field. This piece of code looks as follows:

[Result = try Dynamic(1) otherwise null, pagenumber=1]

The next function in List.Generate() creates the next row in the list if the condition is met. Again, I wrap the Dynamic function in try … otherwise to make sure no error occurs if the API is somehow not reachable, and this time I provide the pagenumber field. If successful, pagenumber is increased by 1 for the next page. This piece of code looks as follows:

[Result = try Dynamic(pagenumber) otherwise null, pagenumber = [pagenumber] + 1]

Lastly, I wish to get the Result back, so I use the optional selector function of List.Generate(), providing just [Result]. This means my whole code now looks as follows:

let
    Source = List.Generate(
        () => [Result = try Dynamic(1) otherwise null, pagenumber=1],
        each List.IsEmpty([Result]) = false,
        each [Result = try Dynamic(pagenumber) otherwise null, pagenumber = [pagenumber] + 1],
        each [Result]),
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Changed Type" = Table.TransformColumnTypes(#"Converted to Table",{{"Column1", Int64.Type}}),
    #"Invoked Custom Function" = Table.AddColumn(#"Changed Type", "Result", each Dynamic([Column1]))
in
    #"Invoked Custom Function"

If I now hit Done, Power Query should loop through the whole API, going through each page and creating a new row for each page in my automatically created list. For each row it will call the API to get the data. Let's test it.

Once finished (this can take a while now!), I get an error in my query. The reason is that I did some transformations earlier which Power Query can't apply anymore. So I delete all steps until I see no error and then extract everything to new rows by hitting the two-arrows icon in the column.

Next, I expand the records by again hitting the two-arrows icon, select all columns I wish to include (in my case all of them), and uncheck “Use original column name as prefix”. By hitting OK, I now have my complete table with all data from the API!

Obviously, I can now do all kinds of transformations I wish and need to do and – even more importantly – set the correct data type for each column as well as follow best practices when it comes to data modelling. Before I hit Close & Apply, I rename my “Dynamic” function to “Get API Page”, delete the unnecessary invoked function lists, and rename my final table to “API Table”. Of course, you can choose another name, whatever suits you best. Lastly, I right-click on my Hardcoded and Parametrized tables and deselect the Enable load option so the data is not loaded into my data model but the queries are still kept. If you don't wish to keep them, just delete them as well.

Once done, I hit the Close & Apply button and wait until the table is loaded. If you keep an eye on the data load dialog, you'll see the number of rows loaded increasing every time by more or less exactly 1000. This means our paging from the API works (remember our Rows parameter in Power Query?).

One last tip before you leave. If the data load takes too much time and your token expires during the refresh (remember, we have to get a token to authenticate against the API, and this token has a lifespan), you can probably adjust the timeout of the token in the request. This means you have to update the function by adding the timeout at the end of your URL request to increase the lifespan. In my case this would work as the API provides a timeout option, and it would look like the following (don't forget to add the comma after the relative path):
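As a sketch only – the exact query option name and value are hypothetical here and depend on what the API expects – the Get Token function could be extended like this (using the same assumed Username / Password parameters as before):

let
    Source = () => let
        url = "https://www.clarksons.net",
        body = "{""username"": """ & #"Username" & """, ""password"": """ & #"Password" & """}",
        Source = Json.Document(Web.Contents(url, [
            Headers = [#"accept" = "application/json", #"Content-Type" = "application/json"],
            Content = Text.ToBinary(body),
            RelativePath = "/api/user/ApiAuthentication/GenerateAuthenticationToken",
            // Hypothetical parameter - check the API documentation for the real name and maximum lifespan
            Query = [timeout = "7200"]
        ])),
        access_token = Source[token]
    in
        access_token
in
    Source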

Please let me know if this post was helpful and give me some feedback. Also feel free to contact me if you have any questions.

If you're interested in the files used in this blog, check out my GitHub repo https://github.com/PBI-Guy/blog. This time I'll provide only the PBIT file, which contains neither the data nor the user and password. Thanks for understanding.

The Power BI Championship Switzerland

For the first time in Switzerland, we organized the Power BI Championship, which crowned the top 3 teams based on their solutions. In this post I'm going to share what the Championship is all about, what the challenge was (feel free to challenge yourself and create a report on your own), and how the jury scored all the amazing solutions we got. Lastly, I'm going to share the Top “10” solutions in the GitHub repo so everyone can get inspired by the different ideas and solutions.

What is the Power BI Championship

The whole idea of the Championship was to enable everyone to work with Power BI and see what's possible within just a few days. We organized a Briefing Session on the 20th of May 2022 to explain to everyone who registered what the Championship would look like and how – even if you're a new enthusiast – you could skill yourself during the upcoming days and weeks. Starting from the 20th, we posted a new enablement Guy in a Cube video every day for the following two weeks to make sure you know the basics of Power BI and how to create reports. (See All Enablement Videos from Guy in a Cube down below.)

On the 2nd of June we finally had our Kick-Off, where we introduced the datasets all Champions had to use. We gave each individual and team the choice between the GHO OData API (https://www.who.int/data/gho/info/gho-odata-api) and the Swiss Open Data about rent prices (https://opendata.swiss/en/dataset/durchschnittlicher-mietpreis-pro-m2-in-franken-nach-zimmerzahl-und-kanton2). The goal was to use at least one of these two datasets and create a report on top. Everyone was free to add more publicly available data to enhance the report and had to submit a solution by midnight on the 7th of June.

Once all teams had submitted their solutions, a jury of three technical Microsoft FTEs went through the PBIX files to score them based on a scoring system. This made sure scoring was as objective as possible and the winner was really determined by neutral criteria, not personal preferences. All three jury members scored different aspects of each report and the average was taken as the final points, e.g. How is the layout? How is the data model? Did the team follow best practices? Are all columns / tables from the data model used? How is the DAX structured? How complex is the DAX used? Did they think about a mobile layout? And so on. Nevertheless, some criteria are more important than others. Therefore, we decided to weight the different categories. An overview can be found in the image below.

The main idea was to determine a Top 10, announce them on the 10th of June, and invite them to the Finals on the 17th of June. But there was one “issue” – the different solutions were so amazing and the scoring so close that we just couldn't make a hard cut and invite only the Top 10! We decided during the scoring to invite the Top 14 plus 3 Golden Buzzers to the Finals. Golden Buzzer in this case means that each jury member could use their Golden Buzzer once to vote a solution into the Finals regardless of its score. This way we had 17 great teams presenting their solutions on the 17th of June.

Once the teams had been announced, they had seven days to prepare for the live demo and also collect some extra points through LinkedIn. This means each team had to collect likes, comments, and shares to score more points at the end. The idea behind this was to check whether others also liked the team's solution or only the jury did. But again, we wanted to make sure that someone with a huge LinkedIn community would not win based on that alone. So we again weighted all teams: the team with the most likes, comments, and shares got 10 points, the second best got 9 points, etc.

Lastly, each team had to sell their solution within a 10-minute live demo to the jury, who represented a CEO, CIO, and CMO. Again, all teams were scored and weighted (best team 10 points, second best team 9 points, etc.) to make it as fair as possible.

In a last step, we weighted each score (Solution Score 70%, Social Media Score 10%, Live Demo 20%) to determine the Top 3.

This is just an example and not the real scoring during the Championship

Based on these criteria, the winners of the Power BI Championship Switzerland 2022 are:

1st Place: Team BIdventure, Members: Karim Abdrakhmanov & Anthony Haberli
2nd Place: Team Clouds on Mars, Members: Greg Stryjczak & Wojtek Bak
3rd Place: Pascal Kiefer

I want to highlight one more time that this doesn't mean the other solutions weren't good – on the contrary! All finalists delivered awesome and outstanding reports, and it made the jury's life really hard to have “only three winners”! But in the end, the small details decided the scoring – just like in a championship. If you're interested in how close it really was, check the GitHub repo with all the final solutions and write in the comments which was your favorite one.

All Enablement Videos from Guy in a Cube

Please let me know if this post was helpful and give me some feedback. Also feel free to contact me if you have any questions.

If you're interested in the files used in this blog, check out my GitHub repo https://github.com/PBI-Guy/blog. This time I'll provide only the PBIT file, which contains neither the data nor the user and password. Thanks for understanding.

Power BI and Activity Logs with Python

One of the questions I get asked most when I talk with Power BI admins / service owners is how they can analyze which reports are (most) used. Power BI, or rather Microsoft, offers different ways to get such insights. In this blog post I'll give you an overview of the different options and walk you through how to leverage the Power BI Admin REST API with Python to extract log files and analyze them – obviously – with Power BI.

What’s available

Here is a quick overview of the different ways to get usage metrics for one or multiple reports.

  • Usage Metrics
  • M365 Audit Logs
  • Power BI & Azure Log Analytics
  • Power BI & PowerShell
  • Power BI REST API

Let's explore them in more detail. First of all, the Usage Metrics, which are available out of the box in the Power BI Service. As of this writing, we differentiate between the classic workspace and new workspace Monitor Usage. Both of them give you insights into the usage of your report. For example, you can see the number of views, unique viewers, and viewers per day for the specific report. Further, you could edit the report and delete the filter to get an overview of all reports within the same workspace. Unfortunately, it's not possible to have one Usage Metrics report across different workspaces this way, which makes it not the best solution for admins to get a total overview. Lastly, the Usage Metrics only cover the last 90 days, not more.

Moving on to the M365 Audit Logs. Each user who has View-Only Audit Logs, Audit Logs, Global Admin, and/or Auditor permissions has access to the general M365 audit log files. Power BI, like any other M365 service, logs all kinds of different activities in the M365 Audit Log for the last 90 days. Therefore, you can go to the M365 Compliance Center – Audit and search for different activities. In the screen shot below I'm searching for “Viewed Power BI report” and get a result per date and time of the different reports users have accessed (viewed). If wished, I can download the result as a CSV.

As nice as this possibility is, there are two pain points from my point of view. First, if I have sufficient rights to access the log files, I have access to every log entry from any other M365 service as well (e.g. SharePoint Online, Exchange Online, Dynamics 365, etc.), not only Power BI. Secondly, it's a manual process. As Patrick from Guy in the Cube says: I'm not lazy, I'm efficient. And a manual process is not really efficient. Therefore, let's explore the other options – perhaps we'll find a better solution.

Since June 2021 you can connect an Azure Log Analytics workspace with a Power BI workspace to automatically store activity information. The beauty of this option is that once it's set up, you don't have to run any kind of script or do anything manually, as everything is automatically saved in Azure Log Analytics. Further, it stores only Power BI-related data and you don't have access to other Microsoft services' logs. The downside is that Power BI Premium (or Premium per User) and Azure Log Analytics are required, which can increase your costs. Costs for Log Analytics are, from my point of view, not that high – they start at $2.76 per GB – but they still have to be considered. Prices for Azure Log Analytics can be found here. Another limitation (as of writing this post) is that you can connect only one Power BI workspace with one Azure Log Analytics workspace, meaning you would need to create one Azure Log Analytics workspace per Power BI workspace to store the log data. Of course, you can combine them afterwards in Power BI or other tools, but it's still not in one central place. How to use and set up Azure Log Analytics with Power BI can be found here.

Moving on to the next possibility: using PowerShell to extract the log files. Instead of manually extracting the log file in the Microsoft Compliance Center, as seen above, you can also create a PowerShell script to automatically get the needed data. Microsoft offers good documentation (see here) on how to use it and what's required. One important thing to keep in mind is that there are two different cmdlets to get your data. One is the Power BI Activity Event (Get-PowerBIActivityEvent) and the other is the Unified Audit Log (Search-UnifiedAuditLog). The first one gives you only Power BI-related data for the last 30 days, while the Unified Audit Log gives you access to the whole audit log for the last 90 days. This means you could also get data from other Microsoft services, not only Power BI. Therefore, you'll find a filter option with the Unified Audit Log to get only Power BI-related data (see screen shot below, marked with a red line).

I successfully tested the Unified Audit Log cmdlet, as you can see below. I extract one file for the last 90 days in this example, but of course you can extract only one day and run the script on a daily basis to get a history of your data (see line #20 in the script). The whole script is of course in my GitHub repo.

The best part of this approach is that you can specify exactly what you wish to export – not only Power BI-related logs. Further, if you're exporting it on your own, you can choose how to store the data (CSV, Excel, database, etc.) and create a historization. But like with manually extracting the log files, you require sufficient permissions. If you use the Power BI Activity Log (Get-PowerBIActivityEvent), you won't have access to the whole log file and would only require Power BI Admin, Power Platform Admin, or Global Admin rights. And of course, you need some tech skills for this approach to create and use a PowerShell script, but even I, with some guidance from Bing and YouTube, could create such a script to extract the data and store it as a CSV file. 🙂

Lastly, if you're more familiar with REST APIs, the last option would be your preferred way. As stated in the documentation, Power BI offers a REST API to get Activity Events. Let me walk you through how you can leverage the API with Python and store the data in an Azure Data Lake Gen2 to analyze it afterwards with Power BI.

Requirements

  • Create / Use a Service Principal
  • Enable Service Principals read-only access to Power BI admin APIs in the Tenant Settings
  • Azure Data Lake Gen2
  • Python Skills
  • Understanding REST APIs

First of all, let's make sure we have sufficient permissions to run our Python script. Because I wish to automate it and don't want to use my personal account, I'll need a Service Principal. How to register an app / create a Service Principal can be found in this guide, but let me walk you through it as well. Log in to your Azure Portal, search for “App Registration” and hit “New Registration”. Give your app a name and select the supported account type (I choose the first option, single tenant only). In my case I called my Service Principal “Power BI REST API”.

Note: No API permissions are required as the application’s Azure AD permissions will have no effect. The application’s permissions are then managed through the Power BI admin portal.

Afterwards I created a Security Group called Power BI REST API and added my Service Principal to the group.

Next, we have to make sure that Power BI in general allows Service Principals to use the Power BI Admin REST API. To do so, I log in to Power BI, go to the Admin Portal, and enable the necessary feature in the Admin API settings. Once enabled, I add my Power BI REST API security group.

Now that we're ready and have sufficient permissions, let's create our Python script. I'll do so in the Azure Synapse environment, but of course it would also work in any other tool like Visual Studio Code.

As usual with Python, let's import the needed libraries first. In this case we need the following libraries:

  • msal
    This is the Microsoft Authentication Library which we use to authenticate against Power BI.
  • requests
    I’m using the requests library to send my REST API requests.
  • json
    Because the result from the REST API is a JSON format I’m using this library to read the result.
  • pandas
    With this library it’s very easy to create tables (dataframes) and store them afterwards.
  • date & timedelta from datetime
    I need the current date to be able to automate my script and run it on a daily basis.
#Import necessary libraries
import msal
import requests
import json
import pandas as pd
from datetime import date, timedelta

Afterwards, let's parametrize our script. First, let's get yesterday's date. Because I want to run the script in the morning, it doesn't make sense to use the current date as not much has happened yet. Therefore, I'm focusing on the day before with the following code:

#Get yesterdays date and convert to string
activityDate = date.today() - timedelta(days=1)
activityDate = activityDate.strftime("%Y-%m-%d")

I'm converting the date into a string in the format YYYY-MM-DD, which is required for the REST API call. Our next step is to make sure we can authenticate with our Service Principal. For that we need the Client ID, Client Secret, Authority URL, and Scope. If you head over to the Azure Portal where the Service Principal was created (search for “App Registration”), you'll find the Client ID in the Overview pane.

To create a Client Secret, go to the Certificates & secrets tab and choose New client secret. Configure it and add a description as wished. Once created, make sure to copy the value because it won't be visible afterwards and you would need to create a new secret.

For the Authority URL please use https://login.microsoftonline.com/ adding your tenant name at the end. In my case it's kbubalo.com. Lastly, we need the scope, which is https://analysis.windows.net/powerbi/api/.default

#Set Client ID and Secret for Service Principal
client_id = ""
client_secret = ""
authority_url = "https://login.microsoftonline.com/kbubalo.com"
scope = ["https://analysis.windows.net/powerbi/api/.default"]

Now let's set up the REST API URL which we're calling to get our activity data. Looking at the doc here or in the Power BI Admin REST API doc here, we see the main request is GET https://api.powerbi.com/v1.0/myorg/admin/activityevents followed by the start and end datetime parameters. Because we parametrized our start and end date, my URL parameter in Python looks as follows:

#Set Power BI REST API to get Activities for today
url = "https://api.powerbi.com/v1.0/myorg/admin/activityevents?startDateTime='" + activityDate + "T00:00:00'&endDateTime='" + activityDate + "T23:59:59'"

As you can see, we add the time to the start and end date to get the whole day.

Lastly, we specify one more parameter for the CSV path where we wish to store the file.

#Set CSV path
path = 'abfss://powerbi@aiadadlgen2.dfs.core.windows.net/Activity Log/'

Now that we've parametrized everything we need, let's create our Python code. The first thing to do is to authenticate against the Power BI Service to make sure we can call the REST API. Therefore, I'm using msal with my parameters to authenticate and store the outcome in a result variable.

#Use MSAL to grab token
app = msal.ConfidentialClientApplication(client_id, authority=authority_url, client_credential=client_secret)
result = app.acquire_token_for_client(scopes=scope)

If we receive an access token, we proceed with calling the Power BI REST API. From the result we extract the access token, which is used to call the REST API, and specify the header. Once done, we can call the REST API URL specified above with the requests.get method.

#Get latest Power BI Activities
if 'access_token' in result:
    access_token = result['access_token']
    header = {'Content-Type':'application/json', 'Authorization':f'Bearer {access_token}'}
    api_call = requests.get(url=url, headers=header)

Reading the doc, it says the following:

Microsoft Doc

If the number of entries is large, the ActivityEvents API returns only around 5,000 to 10,000 entries and a continuation token. Call the ActivityEvents API again with the continuation token to get the next batch of entries, and so forth, until you’ve gotten all entries and no longer receive a continuation token. The following example shows how to use the continuation token. No matter how many entries are returned, if the results include a continuation token, make sure you call the API again using that token to get the rest of the data, until a continuation token is no longer returned. It’s possible for a call to return a continuation token without any event entries. […]

Therefore, we have to call the continuation URL as long as one is returned, and of course we wish to store the data already retrieved so we don't lose it. Because of that, we create an empty Pandas Dataframe to which we can append the results from each call. Further, we store the continuation URL in a variable and the first result in a second Dataframe, which we append to our empty one. Afterwards, we run a while loop until no continuation URL is available anymore. Obviously, for each call we again store the result in a Dataframe and append it to our first (now no longer empty) Dataframe. Lastly, we set the Id column as the Dataframe index.

    #Specify empty Dataframe with all columns
    column_names = ['Id', 'RecordType', 'CreationTime', 'Operation', 'OrganizationId', 'UserType', 'UserKey', 'Workload', 'UserId', 'ClientIP', 'UserAgent', 'Activity', 'IsSuccess', 'RequestId', 'ActivityId', 'ItemName', 'WorkSpaceName', 'DatasetName', 'ReportName', 'WorkspaceId', 'ObjectId', 'DatasetId', 'ReportId', 'ReportType', 'DistributionMethod', 'ConsumptionMethod']
    df = pd.DataFrame(columns=column_names)

    #Set continuation URL
    contUrl = api_call.json()['continuationUri']
    
    #Get all Activities for first hour, save to dataframe (df1) and append to empty created df
    result = api_call.json()['activityEventEntities']
    df1 = pd.DataFrame(result)
    df = pd.concat([df, df1])

    #Call Continuation URL as long as results get one back to get all activities through the day
    while contUrl is not None:        
        api_call_cont = requests.get(url=contUrl, headers=header)
        contUrl = api_call_cont.json()['continuationUri']
        result = api_call_cont.json()['activityEventEntities']
        df2 = pd.DataFrame(result)
        df = pd.concat([df, df2])
    
    #Set ID as Index of df
    df = df.set_index('Id')

Side note: During the creation of the Python code I called the REST API for test purposes to make sure it works. During the test phase I received a result with all the columns. That's how I know which columns, and in which order, I have to specify for the empty Dataframe.

Our last step is to save our Dataframe into our ADL Gen2. This is very simple with Pandas Dataframe:

    #Save df as CSV
    df.to_csv(path + activityDate + '.csv')

Make sure to add “.csv” at the end to have the proper file type.

That's it! Our whole code is done. Let me copy it for you so you have the whole code in the proper order.

#Import necessary libraries
import msal
import requests
import json
import pandas as pd
from datetime import date, timedelta

#Set parameters

#Get yesterdays date and convert to string
activityDate = date.today() - timedelta(days=1)
activityDate = activityDate.strftime("%Y-%m-%d")

#Set Client ID and Secret for Service Principal
client_id = ""
client_secret = ""
authority_url = "https://login.microsoftonline.com/kbubalo.com"
scope = ["https://analysis.windows.net/powerbi/api/.default"]

#Set Power BI REST API to get Activities for today
url = "https://api.powerbi.com/v1.0/myorg/admin/activityevents?startDateTime='" + activityDate + "T00:00:00'&endDateTime='" + activityDate + "T23:59:59'"

#Set CSV path
path = 'abfss://powerbi@aiadadlgen2.dfs.core.windows.net/Activity Log/'

#Use MSAL to grab token
app = msal.ConfidentialClientApplication(client_id, authority=authority_url, client_credential=client_secret)
result = app.acquire_token_for_client(scopes=scope)

#Get latest Power BI Activities
if 'access_token' in result:
    access_token = result['access_token']
    header = {'Content-Type':'application/json', 'Authorization':f'Bearer {access_token}'}
    api_call = requests.get(url=url, headers=header)

    #Specify empty Dataframe with all columns
    column_names = ['Id', 'RecordType', 'CreationTime', 'Operation', 'OrganizationId', 'UserType', 'UserKey', 'Workload', 'UserId', 'ClientIP', 'UserAgent', 'Activity', 'IsSuccess', 'RequestId', 'ActivityId', 'ItemName', 'WorkSpaceName', 'DatasetName', 'ReportName', 'WorkspaceId', 'ObjectId', 'DatasetId', 'ReportId', 'ReportType', 'DistributionMethod', 'ConsumptionMethod']
    df = pd.DataFrame(columns=column_names)

    #Set continuation URL
    contUrl = api_call.json()['continuationUri']
    
    #Get all Activities for first hour, save to dataframe (df1) and append to empty created df
    result = api_call.json()['activityEventEntities']
    df1 = pd.DataFrame(result)
    df = pd.concat([df, df1])

    #Call Continuation URL as long as results get one back to get all activities through the day
    while contUrl is not None:        
        api_call_cont = requests.get(url=contUrl, headers=header)
        contUrl = api_call_cont.json()['continuationUri']
        result = api_call_cont.json()['activityEventEntities']
        df2 = pd.DataFrame(result)
        df = pd.concat([df, df2])
    
    #Set ID as Index of df
    df = df.set_index('Id')

    #Save df as CSV
    df.to_csv(path + activityDate + '.csv')

Let's test our script by executing it. Once finished, I see that a CSV file is stored in my ADL Gen2 – awesome! Now I can automate it in Azure Synapse through an Azure Data Factory pipeline and run it on a daily basis. After a while I see multiple files, one for each day.

Let's quickly check the result by querying the CSV files through the serverless SQL pool built into Azure Synapse (yes, that's possible!).

Great, we have some data with different Activities!

In my next post I’ll walk you through how to create a Power BI Report on top to analyze the different log files at once – stay tuned!

Please let me know if this post was helpful and give me some feedback. Also feel free to contact me if you have any questions.

If you're interested in the files used in this blog, check out my GitHub repo https://github.com/PBI-Guy/blog. This time I'll provide only the PBIT file, which contains neither the data nor the user and password. Thanks for understanding.

Multilingual Reports in Power BI

Creating Power BI reports in Switzerland can be challenging when it comes to the question: in which language should we display the report? The reason is that Switzerland has 4 official languages, and if we consider English it's even 5. Therefore, it's not easy to fulfill this requirement with one report – or is it? Let me walk you through different scenarios of how you can create multilingual reports with Power BI.

Possible Solutions

As in most scenarios in Power BI, there are different ways to achieve the same goal, so let me list some solutions:

  • Create a separate report for each language
  • Create one report and translate data through a service (like Azure Cognitive Services)
  • Create a Dataset for each language
  • Create a report, use a “translation table”, and work with bookmarks
  • Create one report and a “translation table” to display the right language

Of course, each scenario has its pros and cons, so let me go through some of them quickly and afterwards focus on the main solutions I'm going to present in this post.

Create a separate report for each language

Creating a separate report for each language is, from my point of view, a maintenance nightmare. Imagine having 5 different languages and therefore 5 “different” reports (which all look the same but use a different language), and now you have to customize a visual. Not only do you have to do the work multiple times, you have to do it every time a modification is required! Besides that, you have to keep an overview of which users have access to which report, making sure they have access to the right language. Therefore, I wouldn't recommend going down this path; instead, look into other ways to fulfill the multilingual report requirement.

Create one report and translate data through a service

Using a service to translate your data is a nice and handy solution. The good side of this approach is that you do not have to create your own translation for each word and sentence in each language – this can be done automatically through an API. On the other hand, the translation itself could lead to unexpected results: sentences could be translated in a wrong or misleading way, which can cause other unexpected issues that could otherwise be avoided. Therefore, it's not a perfect solution, but for sure one worth trying if you do not have a “translation table” that can be used. If you're interested in how to call an API to translate your data, let me know and I'll be happy to post about it in the future.

Create a Dataset for each language

Chris Webb wrote a great article (see https://blog.crossjoin.co.uk/2021/02/21/implementing-data-as-well-as-metadata-translations-in-power-bi) about how you can create multiple Datasets – one for each language – and use DirectQuery mode (not a Live Connection! 😉 ) to connect to the needed Dataset. This approach also has its benefits, as the data and metadata like column names and titles can be translated. On the other hand, it again means maintenance effort for all the different Datasets and PBIX files.

Create a report, use a “translation table”, and work with bookmarks

I saw a post from Greg Philips about how you can handle multilingual reports with a translation table, multiple pages, and bookmarks. Everything is described here: https://blog.enterprisedna.co/create-a-multilingual-power-bi-report/

This solution gives you the full flexibility to translate data, columns, titles, etc. in your report. On the other hand, you would need to create a separate page for each language and maintain the bookmarks on top of it. If changes occur, you would need to apply them on every page. Still better than having multiple reports, but it could lead to a lot of maintenance hours. Nevertheless, it's a really nice solution to work with different languages in one report.

Create one report and a “translation table” to display the right language

Another approach is to have a “translation table” within Power BI which is used to translate the needed data into the right language. Let me walk you through how this could look in Power BI.

As an example imagine we have a table with some fruits, numbers, and an ID for each fruit. Further we have a second table with the fruit ID, display language, and the translated value.

Once imported into Power BI, you can create a relationship between the two tables through the ID, if that's not already done automatically during the import process.

As a next step you can add a Slicer visual using the Language column from the Language table and add a Table visual with the Number column from the Fruits table and the Value column from the Language table.

If you now select a language in the slicer, the Table visual will be filtered to the selected language. You can even force a selection through the slicer settings by turning on “Single select”.

To enhance this solution and automatically apply the correct language, you can even build a third table matching the language with a user and use RLS to apply the right language. In my simple solution I created the following table.

Tip: I blogged about RLS a while ago and if you’re interested in how to implement it check it out here: https://pbi-guy.com/tag/row-level-security/

As you can see, the table contains the language, name, and UPN of each user. Once imported into Power BI, create a relationship between the Language column from the User Language table and the Language column from the Language table. Our whole relationship model now looks as follows.

Our next step is to create a role and set a rule so users can only see their specific language. We're going to use USERPRINCIPALNAME() to identify which user is logged in and match the mail address with the User Language table.

Once this is set up, the slicer visual is not needed anymore, as the data is already filtered through RLS. I just added a card visual showing the current language and the USERPRINCIPALNAME (I created a DAX measure for this). If we now test RLS on behalf of another user – let's take Luke – we see the following.

If we change it to Chewbacca, the language is applied automatically.

This approach is very good if you do not have a lot of data and languages. As you can imagine, if you add a new fruit, your language table will need three more rows for the three languages in this case. If you add a new language, your language table will grow even more, which leads to a bigger data model in both cases. Further, performance can decrease due to the implemented RLS rule. On the other side, the user experience can improve as the language is applied automatically, but users can't change it on their own if RLS is in place (e.g. in my personal case I prefer to display everything in English even though I'm based in the German-speaking part of Switzerland).

Besides data translation, Power BI also offers a way to translate field and table names automatically. Let me show you how to add different languages to your model.

Translate Power BI field and table names

To be able to add multiple languages to your model you’ll need to work with third-party tools. This could be Visual Studio, Tabular Editor, or anything else that can handle and configure a tabular model. In my case I like to work with Tabular Editor because, as soon as you install it, it is integrated into Power BI Desktop and can be opened directly from the ribbon, already connected to your model.

Once opened you’ll find a “Translations” folder in your model. Right-click on it and hit New Translation.

A new window will pop up where you can choose which culture you wish to add. In my case I select de-DE and hit OK. This means that once translations are implemented for the de-DE language, Power BI will recognize the browser or Power BI Desktop language settings and automatically display the translation of my tables and columns in this language.

You’ll notice that the Translations folder now contains two different languages – in my case en-US and de-DE. Now I can expand the Tables folder, select a table or column, scroll down to the Translated Names property, and add a translation for each language. As you can see in my example below, I chose the table Fruits and translated it to Früchte in German.

To make the translation work easier, there is a very useful and easy-to-use tool from SQLBI which is available for free here: https://www.sqlbi.com/tools/ssas-tabular-translator. To use it, export the translations as a JSON file and import them again afterwards by right-clicking on the Translations folder in your model.

Once everything is translated as wished, save your model and publish the report to Power BI Service. There you have to change the browser language or choose your preferred language through the Power BI settings, and your table and column names will be displayed accordingly. If you wish to test it in Power BI Desktop, keep in mind that since the March 2021 release Power BI Desktop does not reflect the model translations, so you would need an earlier version to do so.

Edit: Further keep in mind that this works only with Premium. Therefore you’ll need Power BI Premium, Premium per User, or an Embedded capacity.

Please let me know if this post was helpful and give me some feedback. Also feel free to contact me if you have any questions.

If you’re interested in the files used in this blog, check out my GitHub repo (https://github.com/PBI-Guy/blog). This time I’ll only provide the PBIT file, which contains neither the data nor the user and password. Thanks for understanding.

Refresh a Power BI Dataset with Python

As you probably know, Power BI datasets can be refreshed manually, on a schedule, or through the Power BI REST API. Depending on the data source, you will require a Power BI Gateway to be able to update your dataset. Manually refreshing a dataset can be handy if you need an instant refresh, but usually you will at least schedule it to automate this process. In a few scenarios, however, you probably don’t want to wait until a certain scheduled time, or you wish to trigger a refresh immediately but programmatically, right after your ETL / ELT process has finished. In this blog post I’m going to show how to trigger a refresh with Python.

JFYI: You can also use Power Automate, PowerShell, or anything else that can call and authenticate against the Power BI REST API to trigger a dataset refresh, and there are plenty of good blog posts out there explaining how to do so.

Prerequisites

  • Power BI Dataset in Power BI Service
  • basic Python know-how
  • a registered Azure App
  • optional: Azure Synapse Analytics

How to

First of all we need a Service Principal on whose behalf we can trigger the dataset refresh programmatically. You could also use your user and password to log in and call the Power BI REST API (so-called Master User Authentication), but I wouldn’t recommend this approach for some obvious reasons: your password can change, what happens to the program if you leave the company, probably too many rights, etc. Therefore we’re going to register an app and give it just the needed permissions. A very good how-to guide can be found here: https://www.sqlshack.com/how-to-access-power-bi-rest-apis-programmatically/.

Checking the docs (https://docs.microsoft.com/en-us/rest/api/power-bi/datasets/get-refresh-history-in-group & https://docs.microsoft.com/en-us/rest/api/power-bi/datasets/refresh-dataset-in-group) we see that in our case we need at least Dataset.ReadWrite.All or Dataset.Read.All to be able to check the refresh history, and Dataset.ReadWrite.All to be able to trigger a refresh.

Before we refresh a dataset, we want to make sure that no refresh is happening right now. That’s the reason why we need the Refresh History API.

Once the app is registered you’ll get an App ID and Secret. Store this information somewhere securely, as we’re going to need it later on.

To be able to access the dataset, our Service Principal needs sufficient permissions on the dataset itself. In my case I’m just going to add the Service Principal as an admin to my workspace.

As a next step we’re going to write our Python Code, authenticate against Power BI and call the needed REST APIs. You can choose whatever environment suits you best to write Python. In my case I’m going to use Azure Synapse and create a Jupyter Notebook to run my Python Code.

If you’re interested in how to start with Azure Synapse feel free to check out this link: https://docs.microsoft.com/en-us/azure/synapse-analytics/get-started

I created a folder called PBI Guy and created a new Notebook with the name Power BI Dataset Refresh.

As a first step we need some code and afterwards a Spark Pool assigned to be able to run it. I like to start by importing the needed libraries. Because we need to authenticate against Microsoft Azure Active Directory, we’re going to use the MSAL library (https://docs.microsoft.com/en-us/azure/active-directory/develop/msal-overview). Further, we will call the Power BI REST API; for this purpose I’m going to use the requests library (https://docs.python-requests.org/en/latest/). Afterwards we need to convert the output into JSON format to make it easier to process the data, therefore we’re importing the JSON library as well (https://docs.python.org/3/library/json.html). Last but not least, I’m going to use the Pandas library (https://pandas.pydata.org/), simply because it’s so easy to transform and extract the necessary data with it. Luckily, importing libraries in Python is straightforward, and our first code snippet looks as follows.
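The original post shows this snippet as a screenshot; a minimal sketch of the imports, assuming just the four libraries mentioned above, could look like this:

# Libraries described above: authentication, REST calls, JSON handling, and data wrangling
import msal          # authenticate against Azure Active Directory
import requests      # call the Power BI REST API
import json          # work with JSON output
import pandas as pd  # turn the refresh history into a DataFrame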

I’m sure there are multiple, probably easier, ways to achieve the same goal of triggering a Power BI dataset refresh, but with future improvements of my code in mind (probably storing the refresh history, comparing it to other data, etc.) I’ll make it nice and clean from the beginning.

Our next goal is to set up all the needed parameters – and there are a few! We will need:

  • Client ID
  • Client Secret
  • Tenant Name
  • Workspace ID
  • Dataset ID

After we registered our App / Service Principal we got the Client ID and Secret. If you’re not sure what your tenant name is, just log in to your M365 Admin Center, go to Setup, click on Domains, and check your domain name ending – this is your M365 tenant name. The easiest way to get your Workspace and Dataset ID is to head over to Power BI Service, click on your dataset, and check the URL. The red part is the Workspace ID, the blue part shows the Dataset ID.

Besides that, we specify the Authority URL, which is needed to authenticate with the MSAL library. Basically, it points Python to the right Microsoft tenant. Further, we also need to specify the scope (Power BI in this case). Lastly, we also need our Power BI REST API URL. Because we’re interested only in the status of the last refresh, we filter the query to the top 1 entry. With all this information we create the following code snippet.
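Since the blog shows this snippet as an image as well, here is a sketch with placeholder values – the variable names and placeholders are my own:

# Parameters - replace the placeholders with your own values
client_id = "<app-client-id>"
client_secret = "<app-client-secret>"
tenant_name = "<yourtenant.onmicrosoft.com>"
workspace_id = "<workspace-guid>"
dataset_id = "<dataset-guid>"

# Authority URL (points MSAL to the right tenant), Power BI scope, and the Refresh History endpoint (top 1 only)
authority_url = "https://login.microsoftonline.com/" + tenant_name
scope = ["https://analysis.windows.net/powerbi/api/.default"]
url = ("https://api.powerbi.com/v1.0/myorg/groups/" + workspace_id
       + "/datasets/" + dataset_id + "/refreshes?$top=1")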

Now it’s time to authenticate and grab an access token which can be used to call the Power BI REST API and get the refresh history of the dataset. Once authenticated and an access token is available, we use it in the header of our request. After we set up the header correctly, we call the API through a GET request, providing the URL from above and the header we just created including the access token. As a next step we convert the result directly into JSON format, which is then used to create a Pandas DataFrame with the columns requestId, id, refreshType, startTime, endTime, and status. All this information is provided by the Power BI REST API. Lastly, we set the id column as the DataFrame index. Our code should now look as follows.
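As a hedged sketch, reusing the placeholder variables from above, this step could look roughly like this:

# Authenticate via MSAL (client credentials flow) and grab an access token
app = msal.ConfidentialClientApplication(
    client_id, authority=authority_url, client_credential=client_secret)
token = app.acquire_token_for_client(scopes=scope)

# Call the Get Refresh History endpoint with the token in the header
header = {"Authorization": "Bearer " + token["access_token"]}
response = requests.get(url, headers=header)

# Convert the response to JSON and build a DataFrame indexed by the refresh id
result = response.json()
df = pd.DataFrame(result["value"],
                  columns=["requestId", "id", "refreshType", "startTime", "endTime", "status"])
df = df.set_index("id")
df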

Let us now check if the code runs successfully. We’re going to execute the whole code to import all libraries, set the parameters, authenticate, and call the Power BI REST API to get the refresh history of our dataset. If everything works fine we should see a result like the one below.

Perfect, this means we now have the Power BI dataset refresh history and can see that the last refresh was executed on demand (meaning manually) and completed. The status is very important in our case because if a refresh is currently running we can’t trigger a new one – it would fail. From the documentation we can see there are four different statuses:

Status – Description
Unknown – State is unknown or a refresh is in progress
Completed – Refresh completed successfully
Failed – Refresh was unsuccessful
Disabled – Refresh is disabled by a selective refresh

So to trigger our refresh we want to make sure the status is not Unknown. Further, we can also specify what should happen if the last refresh status is Failed or Disabled (e.g. enable it through another API – see https://docs.microsoft.com/en-us/rest/api/power-bi/datasets/update-refresh-schedule-in-group). For demo purposes I’m just going to print out a different message based on the status. This means our last code snippet checks the status and, depending on it, executes different code.

I added an else section as well, in case a new status code appears at some point, so every situation is handled separately.
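As a sketch of how this last snippet could look – the exact messages and the choice to only trigger a new refresh when the last one completed are my own assumptions:

# Endpoint used to trigger a refresh (Refresh Dataset In Group)
refresh_url = ("https://api.powerbi.com/v1.0/myorg/groups/" + workspace_id
               + "/datasets/" + dataset_id + "/refreshes")

# Check the status of the last refresh and react accordingly
last_status = df["status"].iloc[0]

if last_status == "Unknown":
    print("A refresh is in progress or the state is unknown - not triggering a new one.")
elif last_status == "Disabled":
    print("Refresh is disabled - enable it first, e.g. via the Update Refresh Schedule API.")
elif last_status == "Failed":
    print("The last refresh failed - check the refresh history before triggering a new one.")
elif last_status == "Completed":
    print("Last refresh completed - triggering a new refresh.")
    requests.post(refresh_url, headers=header)
else:
    print("Unexpected status: " + last_status)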

If we now execute the last bit of our code we see the right message has been printed out – so far so good.

But let’s also check the Power BI dataset refresh history. If we now switch to Power BI Service, select our dataset, go to Settings, and check the refresh history, we see the following – our dataset has been refreshed, triggered via the API!

Imagine what you can do now – not only calling the dataset refresh REST API but any other API Power BI offers, all via Python! Because I’m in Azure Synapse I can even integrate my Python script into an Azure Data Factory pipeline and trigger a refresh at the end of my ETL / ELT process, and much more! I’m really excited about this opportunity and will definitely explore more possibilities in the near future.

Keep in mind that the limitations of your license still apply. This means that with a Power BI Free license you can programmatically refresh a dataset once a day. With a Pro license it’s eight times a day, and with Premium or Premium per User it’s 48 (see https://docs.microsoft.com/en-us/power-bi/connect-data/refresh-data#data-refresh).

Please let me know if this post was helpful and give me some feedback. Also feel free to contact me if you have any questions.

If you’re interested in the files used in this blog, check out my GitHub repo (https://github.com/PBI-Guy/blog). This time I’ll only provide the PBIT file, which contains neither the data nor the user and password. Thanks for understanding.

Power BI and Paginated Reports – beyond usual use

A lot of customers ask me what the differences between Power BI and Paginated Reports are, and why they should use one OR the other. My answer is mostly: it’s not one OR the other – why not combine them and get the most out of both worlds?! In many cases I suggest using Power BI as an interactive dashboard “entry tool” for end users to analyze their data dynamically. But once a pixel-perfect report should be created as a PDF (it’s mostly PDF, so we’re going to focus on this format), Paginated Reports are simply better. So why not create a button within your Power BI report that automatically takes all the selected filters and creates a Paginated Report PDF print-out out of the box? Most customers are first surprised that this kind of scenario is even possible, and of course wonder how it can be done. Let me walk you through how to add such a button to a Power BI report in this blog post.

Prerequisites

  • Power BI Report
  • Paginated Report
  • Power BI Desktop
  • Power BI Report Builder (recommended)
  • Power BI Premium / Power BI Premium per User / Power BI Embedded
  • Basic understanding of both worlds

I already have a Power BI and a Paginated Report ready to be combined. If you’re not familiar with how to create a Paginated Report, or where to get a simple demo file, I can highly recommend the Power BI Paginated Reports in a Day course. In this course you’ll learn the differences between Power BI and Paginated Reports, how to start and build your first report, and how to publish it to Power BI afterwards.

Further, Paginated Reports are only supported with Premium. Therefore you will need a Power BI Premium capacity, a Premium per User license, or Power BI Embedded.

How to

The very first thing we need to do is publish our Paginated Report to Power BI to get the unique ID of the report from the Service. In my case I open the RDL file with Power BI Report Builder and publish it to a workspace backed by a PPU license. I name it Sales Analysis.

Once done, the report will be available in Power BI Service. If we open it, we’ll see that the URL points to our specific workspace with a unique ID (1a8b6910-c4a2-4611-ae75-5d0b968eb6d3) and to our Sales Analysis Paginated Report, which also has a unique ID (66a1a70a-89cf-4d48-80c1-39d766f9892b). This means we can use this URL to get to our just-published Paginated Report.

If we check the Microsoft documentation about Power BI and how the URL is built (see https://docs.microsoft.com/en-us/power-bi/paginated-reports/report-builder-url-parameters), we see that the URL can be extended to provide values for parameters, to pass commands to get a specific format, and much more. So before building something in our Power BI report, let’s try out the URL to understand what’s possible.

Let’s first try to pass a value to the parameter. To initialize a parameter we have to add “?rp:parametername=value”. In our case the internal parameter name of the Paginated Report is DateFiscalYear, which can be checked in Power BI Report Builder. Checking the properties we also see that the parameter is of data type Text.

If we’re looking for possible values, we can check the “Available Values” section to see whether a query is used or something is hardcoded within the parameter settings. Alternatively, we can also open the report in Power BI Service and check the drop-down list of the parameter. If we do so, we can see that the following values are expected.

Let’s try to build the URL now with what we got so far:

URL to Report – https://msit.powerbi.com/groups/1a8b6910-c4a2-4611-ae75-5d0b968eb6d3/rdlreports/66a1a70a-89cf-4d48-80c1-39d766f9892b
Initializing Parameter – ?rp:
Parameter Name – DateFiscalYear
Parameter Value – FY2019
Whole URL – https://msit.powerbi.com/groups/1a8b6910-c4a2-4611-ae75-5d0b968eb6d3/rdlreports/66a1a70a-89cf-4d48-80c1-39d766f9892b?rp:DateFiscalYear=FY2019

If we call the URL now the parameter is automatically set to FY2019 and the report is loaded.

Let’s go further and try to get a PDF automatically. To do so we only need to add “&rdl:format=PDF” at the end of our URL. The “&” symbol combines different commands, and rdl:format=PDF is what’s needed to get a PDF automatically. Therefore our whole URL now looks as follows:

https://msit.powerbi.com/groups/1a8b6910-c4a2-4611-ae75-5d0b968eb6d3/rdlreports/66a1a70a-89cf-4d48-80c1-39d766f9892b?rp:DateFiscalYear=FY2019&rdl:format=PDF

If we call this URL Power BI will automatically generate a PDF.

So far so good! Now that we understand how the URL of a Paginated Report works and how we can modify it let’s try to implement it in our Power BI Report.

After opening the Power BI report in Power BI Desktop, we can add a simple DAX measure with our hardcoded URL to call the Paginated Report.

Paginated Report URL = "https://msit.powerbi.com/groups/1a8b6910-c4a2-4611-ae75-5d0b968eb6d3/rdlreports/66a1a70a-89cf-4d48-80c1-39d766f9892b?rp:DateFiscalYear=FY2019&rdl:format=PDF"

Once added make sure to mark it as Data Category Web URL.

If we now add the measure to our report, we see our hardcoded URL, and if we click on it the Paginated Report opens. Unfortunately it’s not “connected” to our Power BI report so far, meaning that if I change the slicer to FY2020, for example, the URL will still point to FY2019. Let’s fix this with some DAX magic.

I add a new measure to get the selected value of the slicer. In this case I use the following formula:

Selected Fiscal Year = SELECTEDVALUE('Date'[Fiscal Year])

Now I just replace the hardcoded FY2019 in the first measure with my second measure. The DAX measure now looks as follows:

Paginated Report URL = "https://msit.powerbi.com/groups/1a8b6910-c4a2-4611-ae75-5d0b968eb6d3/rdlreports/66a1a70a-89cf-4d48-80c1-39d766f9892b?rp:DateFiscalYear=" & KPIs[Selected Fiscal Year] & "&rdl:format=PDF"

Now every time I select another FY, my URL adapts automatically. That’s very simple with a single selection, but what if I want a multi-selection – will it still work? Let’s try it out. But before testing the URL we need to make sure the slicer allows multi-selection, as does the parameter in our Paginated Report. Therefore I change the settings of both.

Don’t forget to republish the Paginated Report once the Property has been modified.

Let’s now test our URL in Power BI by selecting two FYs. I added the Paginated Report URL measure to a Table visual to see it and selected two different FYs. Unfortunately, the URL does not show both years – even worse, it just disappeared. The reason is that the SELECTEDVALUE function expects exactly one value.

Luckily, we can provide an alternative result to the SELECTEDVALUE function, in which we can concatenate multiple values. To make sure we get each value just once, we need to use the DISTINCT function as well. Our Selected Fiscal Year measure now looks as follows.

Selected Fiscal Year = SELECTEDVALUE('Date'[Fiscal Year], CONCATENATEX(DISTINCT('Date'[Fiscal Year]), 'Date'[Fiscal Year]))

Unfortunately, it now combines FY2019 and FY2020 into one string, so the URL contains FY2019FY2020, which will not work. Even if we separate the two fiscal years with a comma or something else, it still won’t work, as the Paginated Report recognizes it as one value (e.g. “FY2019, FY2020” is one value and the report will not load). Therefore we need to repeat the parameter for each value, as described in the documentation (see https://docs.microsoft.com/en-us/power-bi/paginated-reports/report-builder-url-parameters#url-access-syntax). The syntax looks as follows:

powerbiserviceurl?rp:parametername=value&rp:parametername=value

In our case this means we need rp:DateFiscalYear=FY2019&rp:DateFiscalYear=FY2020 after the question mark. Let’s adjust our Selected Fiscal Year measure to get the right URL. If we look closely at the CONCATENATEX syntax, we see that a delimiter can be specified. We’re going to use this and add “&rp:DateFiscalYear=” as the delimiter. This way, every time two or more values are selected, the values will be separated by the right expression. Our final DAX measure now looks as follows:

Selected Fiscal Year = SELECTEDVALUE('Date'[Fiscal Year], CONCATENATEX(DISTINCT('Date'[Fiscal Year]), 'Date'[Fiscal Year], "&rp:DateFiscalYear="))

We can also see that the URL changes dynamically based on the FY selection. If we now click on the URL, the Paginated Report opens with the two FYs selected and automatically prints out a PDF.

Our last step is now to create a button in our Power BI Report and publish it afterwards.

In my case I choose the Pentagon Row shape and add it to my report. Of course you can modify it as you wish or even use a visual instead of a shape / button to achieve the same result (opening the Paginated Report).

Position the shape and configure it as needed. Lastly modify the Action property, set the Type to Web URL and configure our DAX Measure to be used as Web URL.

Now just publish the Power BI Report and use the button to open the Paginated Report based on your Slicer selection in Power BI.

Conclusion

As we’ve seen, we can use Power BI as an entry point and use it to set filters before opening a Paginated Report. Due to the flexibility of the Paginated Report URL, we can also specify in which format the report should be exported. This could also be a dynamic selection in Power BI. There are further integration possibilities thanks to the Paginated Report visual in Power BI, which displays a Paginated Report directly within Power BI.

Please let me know if this post was helpful and give me some feedback. Also feel free to contact me if you have any questions.

If you’re interested in the files used in this blog, check out my GitHub repo (https://github.com/PBI-Guy/blog). This time I’ll only provide the PBIT file, which contains neither the data nor the user and password. Thanks for understanding.

Power BI Object-Level-Security

Many users are aware that Power BI offers Row-Level Security to secure their data. As an example, you can use RLS so that users from a specific country can only see the sales numbers of this country and not others. I already wrote blog posts about the different possibilities (see https://pbi-guy.com/2021/10/12/power-bi-row-level-security/ & https://pbi-guy.com/2021/10/15/power-bi-row-level-security-organizational-hierarchy/ & https://pbi-guy.com/2021/10/15/power-bi-rls-configuration-in-service/). But many customers don’t only want to secure their data on a row basis – they also want to secure it on a “column” or “table” basis. Exactly for this purpose, Power BI offers Object-Level Security. Let me walk you through how to set up OLS in Power BI.

Prerequisites

  • Power BI Desktop
  • Power BI Service Account
  • Tabular Editor

How to

To enable OLS we start in Power BI and first create a data model. I’m going to use my standard Sales Report with Wide World Importers sample data. Further, I created three visuals with text boxes to show the different OLS options – no OLS applied, OLS only on one specific column (Customer), and OLS on a whole table (dimEmployee). Every visual shows the Profit by a different dimension: the first one by Sales Territory, the second one by Buying Group, and the third one by Employee.

As a next step we have to create the different roles so OLS can be applied to them. Go to the Modeling tab in the ribbon and select Manage Roles.

In here I created two different roles – one where only the OLS for the column Customer should be applied and one where the whole Table dimEmployee should be secured. No DAX expression or anything else is needed – just the two empty roles. Once done hit the Save button.

After the test page and the roles are set up, I connect to my model with Tabular Editor by selecting it from the External Tools ribbon.

Pro Tip: If you open Tabular Editor directly from Power BI Desktop you’ll be automatically connected to your data model.

Once Tabular Editor has opened, you should see a screen similar to the one below.

As a next step I expand Roles and first select “OLS on Table dimEmployee”. Once the role is selected, you’ll see a “Table Permissions” property in the “Security” section of the property pane. Expand it and set the table that should be secured to “None” – in our case dimEmployee. This means that every user who is added to the “OLS on Table dimEmployee” role afterwards will not see any data coming from the dimEmployee table.

Now I select the other role and, instead of “None”, I set the dimCustomer table to “Read”. The reason is that we just want to secure one specific column and not the whole table. Therefore the table can be read in general, but we have to configure the specific columns that should be secured. After you set the dimCustomer table to Read, the role can be expanded on the left-hand side, which lists all tables in “Read” or “None” mode.

Next, select the dimCustomer table below the role, head over to “OLS Column Permissions” under “Security” in the property pane, and set the column “Customer” to “None”. Every other column will use the default behavior of the table, which is “Read”.

After we have set everything up in Tabular Editor, we can save our model and close Tabular Editor. Back in Power BI Desktop, let’s test our roles. First I test the “OLS on Table dimEmployee” role by going to Modeling – View as – selecting OLS on Table dimEmployee – and hitting OK.

We see that our OLS works because the right-hand visual is not showing anything. Further, the whole dimEmployee table is no longer visible either.

That’s exactly what we expected – great! Let’s test the second role. After we switch the view, every visual shows up, but the “Customer” field in the “dimCustomer” table is hidden. This is also expected: we’re not using the Customer field in our report so far, therefore everything can still be shown.

Let’s turn off the role view and replace the “Buying Group” column with “Customer”.

Then we enable the role view again to see if the security applies.

And as we can see, it does! Because the visual now uses the Customer column, it’s not showing up anymore.

As a last step you would need to publish the report to the Service and assign user / groups to the desired role. One user / group can also be added to multiple roles if needed like with RLS.

Personally, I find OLS very useful for securing a data model, but the message that appears to end users is not very user-friendly. I would love to see an update here that at least says the data is secured instead of “Something went wrong”, because from an admin’s perspective it’s expected behavior and nothing went wrong. The best option would be if I could configure the message myself.

Please let me know if this post was helpful and give me some feedback. Also feel free to contact me if you have any questions.

If you’re interested in the files used in this blog, check out my GitHub repo (https://github.com/PBI-Guy/blog). This time I’ll only provide the PBIT file, which contains neither the data nor the user and password. Thanks for understanding.

Show “secured” Images in Power BI

Use Case

From time to time I get asked how you can display images in Power BI. The answer is pretty simple and straightforward (see https://docs.microsoft.com/en-us/power-bi/create-reports/power-bi-images-tables) if the image is publicly available. But customers can’t expose their internal pictures to the public and still want to display them in Power BI. The issue with “secured” images is that the Power BI visual can’t log in with an account to display them. So how can you keep your images in a secured environment and still display them in Power BI? Let’s check it out.

I tried different ways and found two that work. One would be to get the binaries of each image, transform them into a text field, and display the image afterwards with a custom visual. This solution has a big “but” from my point of view: because a text field has a limit of 32,766 characters, you would either have to resize your images or do some Power Query and DAX magic to split the field into multiple ones and put them together again afterwards. Chris Webb wrote a blog about how this can be done here: https://blog.crossjoin.co.uk/2019/05/19/storing-large-images-in-power-bi-datasets/

As good as this solution might be I would like to find an easy, low-code / no-code way to achieve the same. And luckily there is one!

Power Apps & Power BI – Better Together

Because Power BI is part of the Power Platform and Microsoft offers a seamless integration into the other Services (and vice versa) let’s leverage the benefits of it and try to display our images through a Power Apps app in Power BI.

Prerequisites

To be able to implement the solution you would need a few things:

  • Power BI Desktop
  • Power BI Service Account (optionally)
  • Power Apps Account
  • SharePoint Document Library with pictures

Power BI Desktop can be downloaded for free through the Power BI website. Power Apps is included in different licensing options like E3 or E5. In our case we’re going to use the SharePoint connector, which is a standard Power Apps connector, meaning no additional licensing is needed. If you don’t have a Power Apps license yet, you can sign up for a Developer Plan and use it for development purposes. Keep in mind that you can’t share apps created with a Developer Plan, nor can you show Power Apps apps from this specific environment in Power BI.

How to

We start in Power BI Desktop and connect to our SharePoint Document Library to get a few basic details. I’ll not guide you through how to connect to a SharePoint Document Library; these details can be found here. In my case I have a simple Document Library with some Star Wars pictures.

In Power BI I connect to the Document Library and get only the ID, Notes, Created, Modified, and ServerRelativeURL fields. You can of course select more or fewer fields, but make sure you get the ID, which will be needed later on in the Power Apps app.

Once loaded, I create a simple Table visual with the three fields ID, Notes, and ServerRelativeURL to display some data. As a next step I add the Power Apps visual to my canvas and position it on the right. Once you’ve added the Power Apps visual, you’ll see the necessary steps to get started.

So let’s do our first step and add the desired fields to the data section of the visual. These fields can be accessed later on through Power Apps. Make sure to add the Id field and also check that it doesn’t summarize! The behavior (Sum, Count, etc.) would be passed on to Power Apps, and we don’t want to summarize our Ids. As soon as you add your first field, the Power Apps visual changes so you can choose an existing app or create a new one. In our case we’re going to create a new one. A window will pop up asking if it’s OK to open your browser with the Power Apps URL. Hit OK.

Tip: If you’re facing some issues while opening Power Apps or you can’t choose your environment through the Visual, open Power Apps in your default browser, choose your environment in the browser, switch back to Power BI and try to hit the “Create New” button again.

Once Power Apps is loaded, you’ll see an object called “PowerBIIntegration” (besides some others). This object has been created automatically through Power BI and makes sure that Power BI and Power Apps can interact with each other. This means, for example, that if you now select a specific image to filter on, the same filter will also apply in Power Apps. That’s the reason why you have to start from Power BI and create the app from there – otherwise the “PowerBIIntegration” object will not be created.

Let’s quickly test if the integration really works. I select the Baby Yoda picture in Power BI and the list is automatically filtered in Power Apps – great!

Our next goal is to show the images from SharePoint in our Power Apps app and make sure the integration still works (filtering on a specific image in Power BI should also filter the image in our app). Therefore we first have to create a connection to our SharePoint Document Library. To do so, go to the Data tab in Power Apps and add SharePoint as a data source.

If you haven’t created a connection yet hit the “Add a connection” button. In my case I can choose an existing one.

Afterwards choose your Site and Document Library in which you stored your pictures. In my case I have a Doc Library called PBI Guy Pictures. I select it and hit “Connect”.

Now that we’re connected to our Doc Library we can display the images out of it. To do so insert a Gallery. I choose a vertical one but the layout can be modified afterwards as well.

Once inserted, I arrange the two galleries so that our freshly inserted one is at the top and the first gallery is at the bottom. By default, our new gallery shows sample text and images. We have to connect our gallery to our SharePoint Doc Library and then decide what we wish to display. So let’s connect it by selecting the gallery and setting the data source through the Properties pane.

Our next step is to display the right image. Select the first image object in the gallery and choose the “Image” property, either on the top left of the screen or in the Advanced section of the properties pane.

Replace "SampleImage" with ThisItem.'Link to item' to create the link to our needed image.

You can also modify the view of the gallery, add new fields to it, etc., but in my case I just want to display my notes together with the ID. Therefore I select the ID text box and change the code in the function window from ThisItem.ID to ThisItem.ID & " " & ThisItem.Notes

Our last step is to create a connection between our SharePoint gallery and the “Power BI gallery” from the beginning, so that filters from Power BI will also affect our SharePoint gallery. The best way to do this is to filter the new gallery by an ID to make sure we get the one specific image we’re looking for. That’s the reason why we need our ID field from the beginning! 🙂

To filter the new gallery, select it and choose the Items property. In my case I see 'PBI Guy Pictures'. This is the whole table, which we have to filter so that only the selected ID shows up. Therefore we wrap our table in a Filter() statement. The first argument in the Filter statement is the 'PBI Guy Pictures' table. The second argument is the filter condition, which should match the ID field coming from Power BI with the ID field in SharePoint. For this we use the PowerBIIntegration object to grab the necessary data. Unfortunately, this is a table data type, and we can’t compare a table with a single number data type. To get a single value, I extract just the first row of the whole table with the First() function, followed by the column I’m looking for (Id in this case). Our whole function now looks as follows:

Filter('PBI Guy Pictures', First([@PowerBIIntegration].Data).Id = ID)

And I immediately see that it works! Previously I selected Baby Yoda, and now only this picture shows up.

As a last step I want to make sure the first Gallery is not visible in my app. So I select the first Gallery and set the Visible property to false.

Of course you can now modify your app and make it more shiny, but for demo purposes I save it as it is through File, Save. Once done, the app is ready to be used in Power BI as well, and the cross-filtering works!

If you wish to share the report, make sure all users also have access to the Power Apps app and the SharePoint library. Otherwise people won’t be able to see the pictures in their report.

Please let me know if this post was helpful and give me some feedback. Also feel free to contact me if you have any questions.

If you’re interested in the files used in this blog, check out my GitHub repo (https://github.com/PBI-Guy/blog). This time I’ll only provide the PBIT file, which contains neither the data nor the user and password. Thanks for understanding.

Power BI Theme with Background Image

A few weeks ago I saw a small but very useful hack for working with Power BI Themes. Did you know that you can create a Theme with a background image? Let me show you how!

What is a Theme in Power BI?

Power BI report themes help you apply design changes to your entire report. With a theme you can specify the default colors of your visuals, change icon sets, or apply default visual formatting like title text size. Further details can be found here: https://docs.microsoft.com/en-us/power-bi/create-reports/desktop-report-themes

Once a theme is created, you can save it as a JSON file and distribute it to your whole organization. Like every other JSON file, you can of course modify it as you wish. In our case we wish to add a background image, because background images are not saved to the JSON file by Power BI Desktop.

How to create a Theme

Once Power BI Desktop is opened select the View Tab, expand Themes, and hit Customize current theme.

A new window will pop up in which you can customize your current Theme. In my case I choose to go with darker colors in general. You can also customize your Text, Visuals, Page, and Filter pane. Once done hit the Apply button.

You’ll see now that the current Theme is used in your Power BI report. Last step is to save the current Theme as JSON.

If you wish you can try to add a background image in your report but this will not be saved in the Theme / JSON file.

Once saved, I open the JSON file in Notepad++ and format it as JSON through the JSTool plugin. As we can see, only the data colors have been saved in our JSON file (and nothing else).

To be able to add a background image, we now need to add a code snippet and convert our background image to base64 format. Let me walk you through the steps. First, we add the following code snippet after the dataColors line.

,"visualStyles": {"page": {"*": {"background": [{"image": {"name": "Demo", "scaling": "Fit", "url": "data:image/jpg;base64, " }, "transparency": 50}]}}}

Now we can add our base64-formatted image after the comma following base64, but it has to go before the closing quote. To convert an image into base64, just search for an online service that can do so. In my case I’m using https://onlinejpgtools.com/convert-jpg-to-base64 and simply drag my image into the necessary field. Once done, you’ll get the code right away on the right side. Hit the “Copy to clipboard” button to have it in your clipboard.

The last step now is to paste the code into the needed code area (after the comma following base64, before the closing quote). I formatted my final code in Notepad++ to make it more readable, and it now looks like the following.

If your image is a PNG, just replace the red-marked part of the code with png instead of jpg.
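If you prefer to script this step instead of using an online converter, a small Python sketch could encode the image and build the visualStyles block for you. This is not part of the original workflow, and the file name is just a placeholder:

import base64
import json

# Placeholder file name - point this to your own background image
image_path = "background.jpg"

# Read the image and encode it as a base64 string
with open(image_path, "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")

# Build the visualStyles block described above (use image/png for a PNG file)
visual_styles = {
    "visualStyles": {
        "page": {
            "*": {
                "background": [{
                    "image": {
                        "name": "Demo",
                        "scaling": "Fit",
                        "url": "data:image/jpg;base64," + encoded
                    },
                    "transparency": 50
                }]
            }
        }
    }
}

# Print the snippet so it can be merged into the exported theme JSON after the dataColors line
print(json.dumps(visual_styles, indent=4))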

Finally, I save my JSON file and import it into every Power BI report in which I wish to use my new theme.

As you can see the background image is saved within the Theme with the correct settings. In my case I set the transparency to 50%.

Please let me know if this post was helpful and give me some feedback. Also feel free to contact me if you have any questions.

If you’re interested in the files used in this blog, check out my GitHub repo (https://github.com/PBI-Guy/blog). This time I’ll only provide the PBIT file, which contains neither the data nor the user and password. Thanks for understanding.