Automating PowerBI with Azure

Tushar Sinha
6 min read · Sep 12, 2021


Template to Automate PowerBI Using Azure SQL and CSV Files



Version: 1.0

Overview

This document outlines the key steps to automate a PowerBI data pipeline using Azure SQL as the data source, and also covers how to automate data updates into Azure SQL.

This is meant to be a companion document to the one published here.

Goals

  1. Build an Azure SQL server and prepare the update process using blob storage, CSV files and Azure Logic Apps
  2. Build a PowerBI dashboard

Azure Section

Azure SQL

  1. For a small dataset, choose a Basic Azure SQL database (5 DTU); this costs a flat US$ 4.95 per month, with a database size of up to 2 GB on this plan. Note that a vCore-based (serverless) database can work out more expensive, since compute charges accrue whenever queries run and wake the database (a T-SQL check of the provisioned tier is sketched after this list)
  2. To keep costs low, do not select the optional security features
  3. Choose SQL credentials, not Active Directory credentials, and make sure to keep a note of the user ID and password
  4. We will initialise the database with CSV files
  5. Further updates will also be made via CSV files
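You can confirm the tier you actually provisioned with a quick T-SQL check (a minimal snippet using standard metadata functions; run it against your new database):

-- Returns 'Basic' / 'Basic' for the 5 DTU tier discussed above
SELECT
    DATABASEPROPERTYEX(DB_NAME(), 'Edition')          AS Edition,
    DATABASEPROPERTYEX(DB_NAME(), 'ServiceObjective') AS ServiceObjective;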

Azure Storage Container

  1. Make sure to turn off public access in the storage account settings
  2. Create a blob container and again ensure public access is off
  3. Upload the CSV files into a particular folder in the container (say ‘folder’)
  4. Create a Shared Access Signature (SAS) and store the access string somewhere safe (set an expiry date as long as the app is needed, or make a note of the expiry date and regenerate the SAS upon expiry)
  5. Run the following queries in the database. Note that the leading ‘?’ must be removed from the SAS token, so that the SECRET value starts with ‘sp=’:

CREATE DATABASE SCOPED CREDENTIAL MyCred
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'sp=xxxxxx';

CREATE EXTERNAL DATA SOURCE MySource
WITH (
    TYPE = BLOB_STORAGE,
    LOCATION = 'https://[mystorageaccount].blob.core.windows.net/[mycontainer]/[folder]',
    CREDENTIAL = MyCred
);
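As an optional sanity check, both objects should now appear in the standard catalog views:

-- Each query should return exactly one row
SELECT name FROM sys.database_scoped_credentials WHERE name = 'MyCred';
SELECT name, location FROM sys.external_data_sources WHERE name = 'MySource';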

  6. These statements store the credentials and the folder path in the database, so they can be referenced in future queries
  7. Assuming we want to create a table MyTable and initialise it with the data from myfile.csv in the blob folder, run the following (exact entity names such as table and database names may vary):

CREATE TABLE MyTable (
    [Col1] VARCHAR(255),
    [Col2] INT
);

BULK INSERT [dbo].[MyTable] FROM 'myfile.csv'
WITH (
    CHECK_CONSTRAINTS,
    DATA_SOURCE = 'MySource',
    DATAFILETYPE = 'char',
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a',
    FIRSTROW = 2,
    KEEPIDENTITY,
    TABLOCK
);
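Once the insert completes, it is worth verifying that the rows actually landed (an optional check):

-- Row count plus a small sample of the freshly loaded data
SELECT COUNT(*) AS RowsLoaded FROM [dbo].[MyTable];
SELECT TOP (10) * FROM [dbo].[MyTable];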

Logic Apps to Automate Upload of CSV into Blob Storage

  1. While Logic Apps can also be triggered via email, for this use case I prefer to keep it simple and use OneDrive for Business
  2. In the Logic App creation panel, start with a blank template and search for the OneDrive for Business ‘When a file is created’ connector (it will ask for OneDrive account authorisation)
  3. Choose the folder to monitor and apply the connector
  4. Next choose the ‘Create blob (V2)’ connector, choose the blob folder name, and for Blob name enter the dynamic field File Name (chosen from the drop-down, not typed!); for Blob content choose File content
  5. Now create a file in the OneDrive folder and check that the trigger is working: an entry in the Trigger History window should say Fired (note that you can click the Run Trigger option instead of waiting for the trigger’s polling interval to lapse)
  6. Check that the file was created in the blob folder
  7. Strictly speaking, this step is not needed (one could insert into the database directly from OneDrive), but it is highly recommended, since SQL Server queries are not guaranteed to run successfully and can fail if the database is locked by another process, among other reasons. While I do not have an error-handling routine built in here, at least the data is saved as a CSV and can be manually inserted into the database (see the sketch below); subsequent versions of this document will flag errors and gracefully attempt recovery by scheduling fresh insertion queries for the failed file directly from blob storage
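Should an automated run fail, the saved CSV can be loaded manually using the same pattern as before (a minimal sketch; ‘failedfile.csv’ is a hypothetical file name, while MyTable and MySource are the objects created earlier):

-- Manually re-run the load for a file whose automated run failed
BULK INSERT [dbo].[MyTable] FROM 'failedfile.csv'
WITH (
    DATA_SOURCE = 'MySource',
    DATAFILETYPE = 'char',
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a',
    FIRSTROW = 2
);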

Logic Apps to Automate Updating the SQL Database with CSV Stored in Blob Storage

  1. Create a new blank Logic App and for the trigger choose ‘When a blob is added or modified (V2)’
  2. Enter the storage account name and the container with the folder to watch, as above (the folder name goes in the second box, titled Container)
  3. Next choose ‘Execute a SQL Query (V2)’ for the second step and choose the right database name and server name (the details are available in the Overview tab of the database created above and the server associated with it)
  4. Enter the SQL below as the query parameter and add List of Files Name as a dynamic field where the file name goes
  5. The SQL query is reproduced in text below for convenience:

BULK INSERT [dbo].[MyTable] FROM 'myfile.csv'
WITH (
    CHECK_CONSTRAINTS,
    DATA_SOURCE = 'MySource',
    DATAFILETYPE = 'char',
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a',
    FIRSTROW = 2,
    KEEPIDENTITY,
    TABLOCK
);

  • Note that the above assumes the first row is a header row, so the query uses FIRSTROW=2 to start from the second row of the file
  • Of course, if the CSV doesn’t match the database schema the insert will fail; if it does match, column names don’t matter
  • Take particular care not to use commas in numbers (which Excel sometimes adds), since commas are the field delimiters and will confuse the query (a quoting-aware variant is sketched after this list)
  • As before, try triggering the Logic App and then query the table manually to ensure the data is indeed inserted correctly
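If your files may legitimately contain commas inside quoted fields (for example Excel writing “1,234”), one option is BULK INSERT’s CSV format mode, which honours standard quoting. A minimal variant of the query above (the options are standard T-SQL, but verify support on your database tier):

-- FORMAT = 'CSV' treats a quoted "1,234" as a single field
-- rather than splitting it on the embedded comma
BULK INSERT [dbo].[MyTable] FROM 'myfile.csv'
WITH (
    DATA_SOURCE = 'MySource',
    FORMAT = 'CSV',
    FIELDQUOTE = '"',
    FIRSTROW = 2
);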

PowerBI Section

  1. In PowerBI Desktop, choose Get Data
  2. Next choose Azure and then Azure SQL
  3. Enter the server address (leave the database blank). Note that Import mode will import the data and will not refresh from the database until refreshed manually or via a scheduled refresh (to be defined in the PowerBI service)
  4. Create data connections and reports as usual, then hit Publish to send the report to the PowerBI service (choose an appropriate workspace in the dialog box)
  5. Note that the database credentials need to be entered in the PowerBI service as follows in order to enable scheduled refresh
  6. Click on the workspace and open the dataset’s settings (lineage view, selected via the View button on the top right, makes the dataset easy to locate)
  7. On the settings page, enter the credentials where the error is flagged
  8. It is likely you’ll see a connection error at first
  9. Go back to the Azure SQL server window in Azure and enable access to the server for Azure services (a T-SQL alternative is sketched after this list)
  10. Finally, choose a schedule under Scheduled refresh on the dataset settings page from step 6
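For reference, the ‘Allow Azure services’ setting in step 9 corresponds to a special 0.0.0.0 firewall rule and can also be created from T-SQL against the server’s master database (a sketch; the rule name is arbitrary):

-- Equivalent to enabling 'Allow Azure services and resources to access this server'
EXECUTE sp_set_firewall_rule
    @name = N'AllowAzureServices',
    @start_ip_address = '0.0.0.0',
    @end_ip_address = '0.0.0.0';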

Sources: https://www.sqlshack.com/azure-automation-use-azure-logic-apps-to-import-data-into-azure-sql-database-from-azure-blob-storage/


Tushar Sinha

Private Equity professional, passionate about applying data science and machine learning to alt assets