Flexible Data Source Management in Power BI with MySQL

📆 Published on Feb 7, 2025

by Macktireh

Introduction
Challenges Encountered
- 1. Connectivity Issues
- 2. Performance and Data Volume
Solution Implementation
Technical Implementation
Conclusion

Introduction

On a client project I was working on at the time of writing, we faced an interesting challenge: building a Power BI dashboard on top of a MySQL database containing millions of rows. Solving it led us to a simple way of switching data sources depending on the working environment. This article walks through the problems we encountered and how we set up the solution.

Challenges Encountered

1. Connectivity Issues

After installing the MySQL driver for Power Query, we could not connect to the database from the client’s work computer because of its firewall configuration. The same connection worked without issue from my personal computer and when refreshing reports in the Power BI service.

2. Performance and Data Volume

The database contained several million rows, making local data loading particularly time-consuming. This situation required an alternative approach to optimize performance in the development environment.

Solution Implementation

1. Solution Architecture

To resolve these issues, we developed a hybrid approach:

In the local environment (Power BI Desktop): CSV files stored on SharePoint
In production (Power BI Service): a direct connection to the MySQL database

2. Data Preparation

To simplify local development, we first wrote the SQL query that extracts the necessary data. We then wrote a Python script that connects to the database, executes the query, and exports the results to CSV. Finally, we uploaded the CSV files to a SharePoint site.
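The export step could look roughly like the sketch below. It is not the script used on the project: the function name `export_query_to_csv` and its `batch_size` parameter are hypothetical, and it assumes only a DB-API 2.0 connection, which is what the usual Python MySQL drivers provide. The file is written in Windows-1252 with a comma delimiter so that it matches the Csv.Document settings used later in this article; that pairing is an assumption made for the example.

```python
import csv


def export_query_to_csv(connection, query, csv_path, batch_size=10_000):
    """Run `query` on a DB-API 2.0 connection and write the result to `csv_path`.

    Returns the number of data rows written (the header row is not counted).
    """
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        # cursor.description has one entry per column; item 0 is the column name.
        header = [column[0] for column in cursor.description]
        row_count = 0
        # cp1252 + comma delimiter mirror the Power Query CSV options.
        # errors="replace" swaps characters cp1252 cannot encode for "?".
        with open(csv_path, "w", newline="", encoding="cp1252",
                  errors="replace") as handle:
            writer = csv.writer(handle, delimiter=",")
            writer.writerow(header)
            # Fetch in batches so millions of rows never sit in memory at once.
            while True:
                rows = cursor.fetchmany(batch_size)
                if not rows:
                    break
                writer.writerows(rows)
                row_count += len(rows)
        return row_count
    finally:
        cursor.close()
```

With a MySQL driver installed, the call would be along the lines of `export_query_to_csv(connection, query, "data.csv")`, after which the file is uploaded to SharePoint. NULL values are written as empty fields, which is the default behavior of Python's csv writer.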

3. Parameter Configuration

We set up five parameters to dynamically manage data sources:

Figure: Power Query parameters for data source management

ENVIRONMENT: choice between LOCAL and PRODUCTION
URL_CSV_FILE: link to the CSV file
MySQL_HOSTNAME: MySQL hostname
MySQL_DATABASE_NAME: database name
MySQL_QUERY: pre-built SQL query

Technical Implementation

1. Dynamic Environment Detection Variable

We created a dynamic Boolean variable, IS_PRODUCTION (implemented as a separate query), that simplifies environment detection:

let
    Source = Text.Contains(Text.Upper(ENVIRONMENT), "PROD")
in
    Source

This variable returns:

true if the ENVIRONMENT parameter contains “PROD” (the comparison ignores case)
false in all other cases

This approach allows us to simply use IS_PRODUCTION in our conditions rather than rewriting the complete formula Text.Contains(Text.Upper(ENVIRONMENT), "PROD") each time.
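To make the matching rule concrete, here is a small Python sketch that mirrors the M expression. It is an illustration only: the function name `is_production` is hypothetical and nothing in Power BI runs this code. Because the test is a case-insensitive substring match, a value such as "pre-production" also counts as production, which is worth keeping in mind if more environments are added later.

```python
def is_production(environment: str) -> bool:
    """Python mirror of Text.Contains(Text.Upper(ENVIRONMENT), "PROD")."""
    return "PROD" in environment.upper()


# Print the outcome for a few candidate parameter values.
for value in ("PRODUCTION", "prod", "LOCAL", "pre-production"):
    print(f"{value!r}: {is_production(value)}")
```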

2. GetData Function

We created a Power Query function named GetData, allowing data retrieval from MySQL in production and from a CSV file locally.

let
    GetData = (NumberColumnsCSV as number, MySQLQuery as text) =>
        let
            // Local source: CSV file downloaded from SharePoint
            CsvSource = Csv.Document(
                Web.Contents(URL_CSV_FILE),
                [
                    Delimiter = ",",
                    Columns = NumberColumnsCSV,
                    Encoding = 1252,
                    QuoteStyle = QuoteStyle.None
                ]
            ),
            CsvPromotedHeaders = Table.PromoteHeaders(CsvSource, [PromoteAllScalars = true]),

            // Production source: native SQL query against MySQL
            SourceMySQL = MySQL.Database(
                MySQL_HOSTNAME,
                MySQL_DATABASE_NAME,
                [
                    ReturnSingleDatabase = true,
                    Query = MySQLQuery,
                    CreateNavigationProperties = false
                ]
            ),
            // M evaluates lazily, so only the selected source is actually loaded
            Result = if IS_PRODUCTION then SourceMySQL else CsvPromotedHeaders
        in
            Result
in
    GetData

This function performs three essential operations:

Local data loading:
- Retrieves a CSV file from a URL
- Uses a comma delimiter
- Specifies the number of columns
- Handles encoding and quote style
- Automatically promotes the first row as headers
Database connection:
- Establishes a connection to MySQL in production
- Executes a custom SQL query
- Retrieves data directly from the database
Dynamic source selection:
- Automatically switches between local CSV and MySQL database
- Uses the IS_PRODUCTION variable as selection criteria

3. Using the GetData Function

To use the GetData function, simply create a new query and rename it as desired. In this example, we call it “MyData”.

let
    Source = GetData(5, MySQL_QUERY)
in
    Source

This approach allows data retrieval by specifying only two arguments:

The number of CSV file columns (5 in this example)
The MySQL query to execute (via MySQL_QUERY parameter)

Simple, isn’t it? 🙂

4. Optimizing Data Transformations

In some cases, we need to apply specific transformations to CSV data. For this, we identified two possible approaches:

— Modifying the GetData Function

A first approach involves integrating transformations directly into the GetData function:

let
    GetData = (NumberColumnsCSV as number, MySQLQuery as text) =>
        let
            CsvSource = Csv.Document(
                Web.Contents(URL_CSV_FILE),
                [
                    Delimiter = ",",
                    Columns = NumberColumnsCSV,
                    Encoding = 1252,
                    QuoteStyle = QuoteStyle.None
                ]
            ),
            CsvPromotedHeaders = Table.PromoteHeaders(CsvSource, [PromoteAllScalars = true]),

            // New step: replace "." with "," in the numeric column of the CSV data
            CsvReplacedValue = Table.ReplaceValue(
                CsvPromotedHeaders, ".", ",", Replacer.ReplaceText, {"my_column_numeric"}
            ),

            SourceMySQL = MySQL.Database(
                MySQL_HOSTNAME,
                MySQL_DATABASE_NAME,
                [
                    ReturnSingleDatabase = true,
                    Query = MySQLQuery,
                    CreateNavigationProperties = false
                ]
            ),
            // The local branch now returns CsvReplacedValue instead of CsvPromotedHeaders
            Result = if IS_PRODUCTION then SourceMySQL else CsvReplacedValue
        in
            Result
in
    GetData

However, this approach is not recommended because it goes against the single responsibility principle: GetData would both retrieve the data and transform it, and a transformation specific to one dataset would end up in a function shared by all of them.

— Transformation in the MyData Query

A more elegant approach involves separating data retrieval from transformation. Let’s update our “MyData” query to apply specific transformations to CSV data:

let
    Source = GetData(5, MySQL_QUERY),
    ReplacedValue = Table.ReplaceValue(
        Source, ".", ",", Replacer.ReplaceText, {"my_column_numeric"}
    ),
    ChangedType = Table.TransformColumnTypes(
        if IS_PRODUCTION then Source else ReplacedValue,
        {{"my_column_numeric", type number}}
    )
in
    ChangedType

This second approach offers several advantages:

Clear separation of responsibilities
Better code maintainability
Greater flexibility for modifying transformations
Ability to apply conditional transformations based on environment

Conclusion

In this article, we explored a way to manage data sources in Power BI by switching dynamically between local CSV files and a production MySQL database. The approach solved our immediate connectivity and performance problems, and it can be extended to other needs, such as adding new environments (test, pre-production) or supporting other data sources (PostgreSQL, Oracle, etc.).

If you also want to import and combine multiple Excel/CSV files in a clean and optimized way, I invite you to read my article “Import Multiple Excel/CSV Files into Power BI with a Custom Power Query Function”.
