Escolha uma Página

This blog post could be called “A better solution for Power BI Dataflow Sharing per Workspace,” as it takes a “Power BI” perspective but delves deeper into implementing OLS (object-level security) and/or RLS (row-level security) to share data through the SQL Endpoint or a Warehouse in Microsoft Fabric.

For this article, the focus will be on the LakeHouse workspace artifact item as a replacement for Power BI Dataflows, but it can also apply to the Warehouse item. If you’re unfamiliar with SQL, don’t worry, as this blog post aims to provide you with a basic understanding.

The Data Sharing (Dataflow) Problem

For those of us coming from the Power BI world, where only Power BI was available, using Power BI Dataflows was the gold standard for reusing our ETL (extract, transform, load) processes and leveraging them both for cloud-based compute engine use (giving analysts the freedom to schedule and run automated flows) and as a data source for reuse, following Roche’s Maxim:

“Data should be transformed as far upstream as possible, and as far downstream as necessary.”

While Power BI Dataflows were a blessing in terms of architecture, reaching as far upstream as the analyst could go, they introduced some technological problems. The main issue was setting up permissions to share data through a dataflow as a data source, which could only be done at the workspace level.

This is not ideal. In the real world, planning which items should be added to what workspace can be inconsistent because users often treat workspaces as department SharePoint folders rather than per-process containers, despite Melissa’s guidelines.

Sharing at a workspace level means that if you want to share data contained in a Dataflow, you need to give access to every other item (existing and future) in the workspace.

This leads to either oversharing content with people or being forced to create separate workspaces just to share the data that a user is allowed to see. This proliferation of workspaces can be even worse if you have OLS (meaning not every table within a dataflow can be seen by the user) or RLS (meaning not every row within a table is available to the user) requirements.

Microsoft Fabric brings new options

With the introduction of Microsoft Fabric, many new exciting features were introduced for those who previously only had access to Power BI Service. These features provide more professional and scalable tools for proper ETL / ELTL processes, moving away from just Power Query (and the M language). Fabric enables the use of other languages such as SQL, Python, R, and more. In a team, you can leverage your team members’ expertise and use different tools for extraction and transformations, reducing the time needed for implementing new projects. As Fabric is an extension of Power BI (sharing the Power BI Premium platform), most items, at the time of writing, can only be shared at the workspace level. However, this may change rapidly, as Fabric receives new features every week.

The Principle of Least Privilege

The “least privilege principle” should be our mantra, especially when dealing with highly confidential data. This means following Melissa’s advice when planning for workspace items. You should be very judicious about who will have member/contributor roles.

It’s not recommended to give viewer roles at the workspace level because users will gain instantaneous access to any future items created in that workspace, including Dataset, Report, and Dataflows. LakeHouses/Warehouses are an exception, as we’ll explain later.

Shouldn’t I be sharing content through APPS?

In the Power BI world, the ideal scenario is to share items via APPs. However, the focus of APP sharing is to share reports/dashboards, somewhat like a food court where clients can choose among all the content selected for them to access it in a single place instead of going to several different locations to consume different analysis.

The APP focus is on consumption of pre-made content at the decision-making level. While this helps, it doesn’t suit every data analyst’s need, as they often need to blend pieces of information for custom analysis. They’re a special class of consumers who actually need to go to the supermarket to make their own recipe, not just the restaurant.

“Per item” sharing solves the data sharing problem

The solution to this problem lies in “per item” sharing. Give users access to only the specific item they need, nothing else. This aligns with the least privilege principle. Users can use the OneLake data hub, a data catalog tool built into Microsoft Fabric, where the analyst can search for available data sources. It can take benefit of the full semantic layer of a dataset or have access to data via SQL Endpoint or Warehouse item.

A side note is that you could set up a Power BI Dataset with OLS/RLS configured and share it similar to how you would share a dataflow, but it requires Power BI Premium per capacity or per user to connect and query data through XMLA, or some Power Automate gymnastics like did to use it via Power Automate. In summary, it requires a high level of knowledge to integrate things and may not be suitable for every user who consumes the data.

OneSecurity is not available yet, but the available options are pretty good!

Fabric, which is still in public preview, so things might change a little, allows you to share just that one item of the workspace for some items, like what we longed for since the day Power BI Dataflows arrived at the appropriate level.

This is not the announced OneSecurity feature yet, where security will be configured once at the OneLake storage level. This will be Fabric’s killer feature once it’s released, providing peace of mind for those who work with highly confidential data daily.

The available way to enforce security now is at the engine level that has access to the data and delivers it to the users. Fabric’s main engines are Apache Spark for data engineering and data science, SQL for data transformation and queries, Analysis Services for semantic modeling, and Kusto for real-time analytics.

For the sake of scoping this already long post, let’s return to the dataflow alternative: the LakeHouse item

When you create a LakeHouse, you get two extra items with it: the SQL endpoint and the default dataset. It’s like a buy one, get three bundle. Hopefully, this will change in the future, as regular users don’t need to know that the SQL Endpoint is a special implementation of the Warehouse item with read-only SQL capabilities. Knowing that the LakeHouse can be read through SQL or Analysis Services should be enough.

By the way, having a SQL connection string is much better for sharing data for external consumption than via dataflows or datasets for data ingestion purposes. It allows secure sharing with every engine, not just Microsoft ones.

It’s important to remember that one of the pillars of Fabric is the separation of storage and compute of the various engines. OneLake is the unified storage, and every Fabric engine can read and write. In fact, any other engine that can read delta tables (open source) can read from and write into OneLake using the APIs.

  • The actual files (delta tables) can be accessed in the LakeHouse item’s storage at OneLake through ADLS Gen2 APIs or the Fabric’s Apache Spark engine.
  • The SQL endpoint or Warehouse items represent the Fabric’s SQL engine.
  • The default dataset item represents the Fabric’s Analysis Services engine (which is the same used by Power BI).

 

Now, let’s move on to the permissions point.

When you share the LakeHouse item, you can grant just the read permission or check specific boxes that give users more privileges

Remember the least privilege principle? By default, you should never tick those additional permissions options.

If you don’t select any of those options, you’re only giving Read permission to the SQL and Analysis Services engines. This means:

  • At the SQL level, you’re letting the user connect to the item’s SQL endpoint and nothing more.
  • At the Analysis Services level, you’re letting people see data that was built for them via a Report. Here the user might be able to see all the data unless you configure permissions at the default dataset level, which isn’t available yet.

Both SQL and Analysis Services engines currently use the item’s creator identity to access the data. So, that’s crucial to be careful if you’re dealing with sensitive data.

  • If you check the Read all SQL endpoint data box, you’re giving the ReadData permission to your SQL engine. This means that you’re giving access to every object (table or view) in it and the ability for the user to build their own queries.
  • If you check the Read all Apache Spark data box, you’re giving the ReadAll permission, which allows the user to access the tables and files (the actual OneLake storage) of your LakeHouse. Currently, you can only access data using OneLake’s API to export data or use an external engine to perform some task or Apache Spark engine to run notebooks for data engineering or data science tasks.
  • If you check the Build reports on the default dataset box, you’re giving the user Build permission to your Analysis Services engine, making the user able to connect to your Analysis Services and create their own DAX Queries or a Power BI reports (which can be described as the DAX Queries generator) on top of it.

Implementing OLS in a Fabric SQL Endpoint or Warehouse

Back to the SQL part, if you give just the Read permission, you’re following the least privilege principle. Giving ReadData means you’re not. Remember, Read means the user can connect but can’t see any object (table or view). On the other hand, ReadData grants permissions to see every available object by default.

Fabric currently doesn’t have a pretty interface (maybe never will) to select which objects you’re allowing or denying users to see. If you’re a Power BI person like me, this is where things get tricky. I actually had the help of my friend Jonatan Torres to help me.

To keep it simple, just open the SQL Query and type the code

It’s really that simple. Open the SQL Endpoint or Warehouse item, hit New SQL query and type and run the desired code.

  • Strings use ‘single quotes’
  • Commenting is done with —
  • All tables or views are available under Schemas -> DBO
  • You can rename it and move it to the shared queries for further use by your team
  • You can select part of the code and click run (or ctrl+enter) to run that piece of code (necessary if there are variations of the same code).

Controlling OLS with Read Permission

If the user (or security group) has Read permission to the SQL Endpoint or Warehouse item, you should add (GRANT) the objects they can see.

— To grant the ability to the User to create SELECT queries over the object
GRANT SELECT ON dbo.RestrictedAccessTableorView TO [userOne@contoso.com];

— To grant the ability to the Security Group to create SELECT queries over the object
GRANT SELECT ON dbo.RestrictedAccessTableorView TO [Security Group Name];

— To grant the ability to the Role to create SELECT queries over the object
GRANT SELECT ON dbo.RestrictedAccessTableorView TO [Role Name];

Controlling OLS with ReadData Permission

If they have ReadData permission, you should restrict (DENY) access to the objects they can’t see. Here you should be careful with the permissions of new objects added.

— To DENY the ability to the User to create SELECT queries over the object
DENY SELECT ON dbo.ForbiddenTable TO [userOne@contoso.com];

— To DENY the ability to the Security Group to create SELECT queries over the object
DENY SELECT ON dbo.ForbiddenTable TO [Security Group Name];

–To DENY the ability to the Role to create SELECT queries over the object
DENY SELECT ON dbo.ForbiddenTable TO [Role Name];

Removing previous permissions or restrictions

Whenever you wish to remove a previously granted or denied permission, you should use REVOKE instead of DENY or GRANT.

How to create and assign user to Roles

To create a new role (a best practice for giving access), use the command:

CREATE ROLE PrivilegedRole;

And to add a new member to the Role:

–Add a User to the already created Role
ALTER ROLE PrivilegedRole ADD MEMBER [userOne@contoso.com];

–Add a Security Group to the already created Role
ALTER ROLE PrivilegedRole ADD MEMBER [Security Group Name];

Implementing RLS in a Fabric SQL Endpoint or Warehouse

You don’t have a built-in RLS solution, but you can leverage the Views object to emulate one.

For those unfamiliar with the term, a View is like a virtual table, a pre-built query that can be used to encapsulate complex queries, provide a layer of abstraction,

or restrict access to underlying data. It’s like a DAX Query.

Here you give the user GRANT SELECT access to the VIEW, not to the original table.

The actual RLS implementation is written at the WHERE clause of your VIEW:

  • Dynamic RLS by using Roles: SUSER_SNAME() approach (that returns the current user’s email)
  • Static RLS by using the user’s email: IS_MEMBER(‘PrivilegedRole’) = 1 approach (that tests if the user is a member of a role).

To create a view, you can use the following code:

CREATE VIEW [dbo].[Name of the View] as
SELECT *
FROM dbo.SampleTable
WHERE
–Here you apply the RLS rule at the user level, applying Dynamic RLS
( SUSER_SNAME() =’userTwo@contoso.com’ AND test_region = ‘region_one_name’)

OR
–Here you apply the RLS rule at the Role level, applying Static RLS
( IS_MEMBER(‘PrivilegedRole’) = 1 AND test_region = ‘region_two_name’);

To alter a view, you can use the user interface to help you.

Unfortunately, the command EXECUTE AS is unavailable to Fabric’s SQL engine at this moment. It would be beneficial to be able to run a query as a specific user and test whether the OLS/RLS has been applied successfully

I’m lost. How do I control OLS permissions I already gave?

After running several commands, it might get easy to get lost. So, if you’re a member or admin of the Workspace, you can always run the command to see the current detailed permissions for the objects:

SELECT DISTINCT   pr.name, pr.type_desc, pe.state_desc, pe.permission_name, pe.class_desc, obj.name AS object_name,  obj.modify_date
FROM sys.database_principals AS pr
JOIN sys.database_permissions AS pe
    ON pe.grantee_principal_id = pr.principal_id
LEFT JOIN sys.objects AS obj
    ON pe.major_id = obj.object_id
WHERE
    permission_name = ‘SELECT’
—    AND pr.name = ‘Security Group Name’

and to see what members belong to each role:

SELECT p.name AS RoleName, m.name AS MemberName, m.type_desc as UserType, m.create_date as CreateDate, m.modify_date as ModifyDate, m.authentication_type_desc as AuthenticationType
FROM sys.database_role_members drm
JOIN sys.database_principals p ON drm.role_principal_id = p.principal_id
JOIN sys.database_principals m ON drm.member_principal_id = m.principal_id
–WHERE p.name = ‘YourRole’;

How should end-users consume this data?

Every Fabric Workspace has a unique SQL connection string. This address is somewhat lengthy ( ifrg6w2xtlsexdsqdx5hfk4-qwusgtsg42lefayozohosi.datawarehouse.pbidedicated.windows.net ), but thankfully, Power BI Desktop has a much better interface to search for it at OneLake data hub. By default, they’ll try to connect you to the default dataset of the LakeHouse (if you have Build permission), but you can switch to connect to its SQL endpoint, which automatically finds the address for you.

You can copy this address from the workspace by clicking at the … of the SQL Endpoint item also.

Conclusion

I hope this article helps you better architect secure solutions in Microsoft Fabric, as the LakeHouse item offers a very powerful way to securely share the right data with the right people.

This is a way better alternative than using Power BI Dataflows as a data source. By sharing through the SQL engine, you can have OLS and emulate RLS using a true universally accepted engine to share large volumes of data securely.

The lack of a proper UI for this might scare newbies like me, but after a while, it’s easier to get used to it, especially if you save the main control queries under Shared queries.

PS1: I’m really eager to see the OneSecurity implementation that might change this all for the best because when you set security at the engine level, you might have gaps between what the user sees depending on which engine he’s using.

PS2: The standard way to share content within Fabric is via shortcuts. This prevents unnecessary duplication of the same data, but I’m afraid it will only be feasible to share confidential data with OLS/RLS when OneSecurity gets fully implemented.