Permissions and Chill

Are you a Salesforce admin inundated with permission requests? Explore our experiment with collaborative filtering to see how we struck a balance between security and simplicity.

As a Salesforce admin for a large organization, part of your job is wrangling permissions for a constant flow of employees coming, going, or changing roles. Managing potentially hundreds or thousands of permissions across the organization, so that each employee can do their job while the organization stays secure, is neither easy nor efficient.

You could assign every permission to everybody, but it’s hardly mindful or secure for an intern to have write and delete access to the same objects as the CEO. You could instead take the principle of least privilege to the extreme and assign little to no privileges to everyone but the CEO. That would be secure, but you don’t want to be inundated with permission requests just so folks can do their jobs!

As an admin, how can you strike a balance between security and providing basic capabilities? If only there were a way to know what people will do in the future, so that you could configure the respective permissions and be done!

We get it, as an admin, you just want to chill! 

With this idea in mind, Salesforce’s Detection and Response (DnR) Data Science team set out to define a method that can peek into the future to predict and recommend just the right (and smallest) set of permissions for employees, so that they can do their jobs and be productive while your organization becomes more secure. We do that by monitoring the permissions exercised by all users of an organization, and then using this data to predict and recommend the subset of permissions that a new user, or any existing user, will likely need based on their activity.

Sound familiar? Think of it like watching and rating your favorite shows on a streaming service. The more you watch or rate, the better the service can determine your preferences and make smarter recommendations. Only, instead of the latest shows, we have permissions and permission sets, and instead of explicit ratings, we monitor usage activity. And all of this is automatic. So, instead of manually deciding on and assigning permissions and permission sets to users, an admin can binge on the new season of “Stranger Things”. Not during working hours, of course!

Let’s take a look under the hood to see how this is accomplished. 

First of all, the guiding principle we use to assign permissions is known as the principle of ‘least privilege’. Following this principle can increase security in your org. In the event of an adversary taking over an account, the “blast radius” is limited to the access privileges/permissions that are associated with that particular account. 

Second, we rely on a technique called collaborative filtering, which is primarily used by recommendation systems to learn from past user behavior and ‘recommend’ artifacts to new and existing users. The idea is to collaboratively filter for the permissions and permission sets that a user might need to be granted, on the basis of the permissions exercised by similar users.

Ready to dig a bit deeper? 

Given permission usage data for all users in an org for, say, the last month, we represent this activity using a matrix A of dimensions M x N. Here M is the number of users in the org and N is the number of distinct permissions or permission sets. Each entry of this matrix is the number of times a particular permission was exercised by a particular user.
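As a rough illustration of the matrix described above, here is a minimal sketch in Python with NumPy. The event log, user names, and permission names are made up for the example and are not taken from the team's data or pipeline.

```python
import numpy as np

# Hypothetical usage log: one (user, permission) pair per time a
# permission was exercised. All names here are illustrative only.
events = [
    ("alice", "ViewAllData"),
    ("alice", "ViewAllData"),
    ("alice", "ManageUsers"),
    ("bob", "ViewAllData"),
    ("carol", "ExportReport"),
]

users = sorted({user for user, _ in events})        # M distinct users
permissions = sorted({perm for _, perm in events})  # N distinct permissions
user_idx = {user: i for i, user in enumerate(users)}
perm_idx = {perm: j for j, perm in enumerate(permissions)}

# A[i, j] counts how often user i exercised permission j.
A = np.zeros((len(users), len(permissions)))
for user, perm in events:
    A[user_idx[user], perm_idx[perm]] += 1
```

Rows follow the order of `users` and columns follow the order of `permissions`, so the two lookup dictionaries are what tie a score in the matrix back to a real user and permission.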

This matrix A is then broken up into two smaller matrices that capture the critical information present in A. This process is referred to as factorization, and there are a number of techniques for performing it, such as ALS (alternating least squares) and NMF (non-negative matrix factorization). We end up with two matrices, F (M x k) and G (k x N), that can be multiplied together to form a new matrix A_recon.

Figure 1: Matrix factorization to get two matrices representing users and permissions
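To make the factorization step concrete, the sketch below implements one of the techniques named above, NMF, using the classic multiplicative update rules in plain NumPy. It is an illustration only: the post does not say which factorization or which settings the team used, and the function name `factorize_nmf`, the toy matrix, and the choices of k, iteration count, and seed are all assumptions.

```python
import numpy as np

def factorize_nmf(A, k, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative M x N matrix A as F (M x k) @ G (k x N)."""
    rng = np.random.default_rng(seed)
    M, N = A.shape
    F = rng.random((M, k)) + eps
    G = rng.random((k, N)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry of F and G non-negative
        # while reducing the squared reconstruction error.
        G *= (F.T @ A) / (F.T @ F @ G + eps)
        F *= (A @ G.T) / (F @ G @ G.T + eps)
    return F, G

# Toy usage counts: users 0-1 and users 2-3 form two "peer groups".
A = np.array([
    [5.0, 3.0, 0.0, 0.0],
    [4.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 6.0, 1.0],
    [0.0, 0.0, 3.0, 1.0],
])
F, G = factorize_nmf(A, k=2)
A_recon = F @ G  # same shape as A, built from k latent factors
```

Each row of F can be read as a user's affinity for k latent "roles", and each column of G as how strongly a permission belongs to those roles, which is why no one has to define the roles by hand.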

From this matrix A_recon, we obtain, for each user (represented by a row), a sorted list of the permissions they are likely to access. What is unique about the collaborative filtering approach is that it takes into account both the user’s own activity patterns and the activity patterns of their peers, that is, users who behave similarly. Instead of manually assigning users to groups or roles and provisioning permissions based on those roles, we let the matrix factorization take care of that grouping.
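Turning a reconstructed matrix into per-user ranked lists can be sketched as follows. The scores and permission names are invented for illustration, and `ranked_permissions` is a hypothetical helper, not a function from the team's code.

```python
import numpy as np

# Illustrative permission names, one per column of A_recon.
permissions = ["ExportReport", "ManageUsers", "ViewAllData", "EditTask"]

# Hypothetical reconstructed scores for two users (one row per user).
A_recon = np.array([
    [0.1, 2.3, 4.0, 0.7],
    [1.9, 0.0, 0.4, 3.2],
])

def ranked_permissions(scores, names):
    """Return permission names sorted from highest to lowest score."""
    order = np.argsort(-scores)  # negate so the largest score comes first
    return [names[j] for j in order]

ranking = [ranked_permissions(row, permissions) for row in A_recon]
```

Here the first user's list starts with ViewAllData and the second user's with EditTask, reflecting the highest score in each row.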

So how does this translate into actual provisioning of permissions? 

Consider the user ‘u’. Let {Puh} represent the set of permissions exercised by user ‘u’ in their history. Let {Pur} represent the set of recommended permissions, anticipated by the collaborative filtering model, that the user has not exercised yet. The union of these two sets, {Puh} U {Pur}, is the complete set of permissions that is likely sufficient for the user. This set is compared against the set of granted permissions {Pg} for that user. The set difference {Pg} minus ({Puh} U {Pur}) gives us the permissions that have been granted but that the user likely does not need. We refer to this set as {Plockdown}, and we can turn off these permissions for the user, in line with the least-privilege principle.
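The set arithmetic above maps directly onto Python sets. In this minimal sketch, the function name `lockdown_set` and the example permission names are assumptions made for illustration.

```python
def lockdown_set(granted, exercised, recommended):
    """Granted permissions that were neither exercised nor recommended."""
    likely_needed = exercised | recommended  # {Puh} U {Pur}
    return granted - likely_needed           # {Pg} minus ({Puh} U {Pur})

# Hypothetical sets for one user.
P_g = {"ViewAllData", "ManageUsers", "ExportReport", "EditTask"}
P_uh = {"ViewAllData"}   # exercised in the user's history
P_ur = {"EditTask"}      # recommended by the model, not yet exercised

P_lockdown = lockdown_set(P_g, P_uh, P_ur)
# P_lockdown == {"ManageUsers", "ExportReport"}
```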

We can also derive another set: the recommended permissions that the user does not have access to yet. The model predicts that these permissions are likely to be useful for the user, so we can recommend that the admin grant them. We refer to this set as {Pgrant}. For instance, when a new user joins the organization, the admin automatically has a suggested set of permissions ready for them! This method can be run periodically to keep up with users’ evolving roles and scope of responsibility.
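The grant recommendation is likewise a set difference, this time between what the model recommends and what the user already holds. The helper name `grant_set` and the example sets below are hypothetical.

```python
def grant_set(recommended, granted):
    """Recommended permissions that the user has not been granted yet."""
    return recommended - granted

# An existing user who already holds part of what the model recommends.
existing_grant = grant_set(
    recommended={"EditTask", "ExportReport"},
    granted={"ViewAllData", "EditTask"},
)
# existing_grant == {"ExportReport"}

# A brand-new user with nothing granted: every recommendation is a candidate.
new_user_grant = grant_set(
    recommended={"ViewAllData", "EditTask"},
    granted=set(),
)
# new_user_grant == {"ViewAllData", "EditTask"}
```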

The idea sounds neat, but how does it do in the real world? 

We evaluated this method on permission usage data collected from a Salesforce-owned org over a two-week period. This particular org has 6,128 distinct users and 286 distinct permissions.

We found that a majority of the permissions granted to users are rarely exercised, if at all. Hence, the matrix that we formed was very sparse (lots of zeros, a good thing!). We saw that by recommending the top-20 percentile permissions for all users, we were able to cover 98.60% of the users’ observed activity. That is, the recommendation model anticipated the permissions that users would require in the future 98.60% of the time. We managed to significantly reduce the set of permissions actually assigned to different users, tightening the security of this org.
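The post does not spell out exactly how coverage was computed, so the sketch below shows one plausible reading, stated as an assumption: for each user, keep the highest-scoring fraction of permissions, then measure what share of later usage counts falls inside that recommended set. The function name, the toy numbers, and the 40% cutoff are all illustrative.

```python
import numpy as np

def coverage_at_top_fraction(scores, future_usage, fraction):
    """Share of future usage counts that land in each user's top permissions."""
    n_perms = scores.shape[1]
    top_n = max(1, int(np.ceil(fraction * n_perms)))
    # Column indices of the top_n highest-scoring permissions per user.
    top_idx = np.argsort(-scores, axis=1)[:, :top_n]
    covered = np.take_along_axis(future_usage, top_idx, axis=1).sum()
    total = future_usage.sum()
    return covered / total if total > 0 else 0.0

# Hypothetical model scores (2 users x 5 permissions).
scores = np.array([
    [0.9, 0.1, 0.5, 0.0, 0.2],
    [0.0, 0.8, 0.1, 0.7, 0.3],
])
# Hypothetical usage counts observed after the model was trained.
future_usage = np.array([
    [4, 0, 1, 0, 1],
    [0, 3, 0, 0, 1],
])

coverage = coverage_at_top_fraction(scores, future_usage, fraction=0.4)
# 8 of the 10 future usage events are covered, so coverage is 0.8
```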

At Salesforce, we build Trust and Security into our products and platforms, and the same goes for exploring new ways to make them even more secure! Stay tuned as we expand our experimentation with this method to cover custom permissions and larger orgs. In the meantime, explore the latest tools and resources designed to empower you to be an #AwesomeAdmin!

This post was authored by members of Salesforce's Detection and Response (DnR) Data Science Team:

Regunathan (Regu) Raadhakrishnan, a principal data scientist on the applied science team working on conversational AI for Service Cloud Einstein. Before this role, he was with the Detection and Response (DnR) data science team, working on unsupervised models to detect malicious threats. He earned his Ph.D. in Electrical Engineering from NYU and has authored a book on video summarization. He has written several book chapters, journal papers, and conference papers and has over 34 granted patents.

Vijay Erramilli, a principal data scientist on the DnR Data Science team, where he has spent the past 4 years working on data-driven solutions that apply advanced statistics, machine learning, and deep learning to various security problems. He has published award-winning papers (with over 2900 citations) at top conferences and journals, given talks at venues around the world, and holds multiple patents. He graduated from BU with a Ph.D. in Computer Science.

Ping Yan, Director, Data Science, has spent a decade innovating ways of making sense of data in various domains, from consumer behavior modeling to algorithmic security threat detection. Her work has been published as journal articles, monographs, and books. Ping holds a Ph.D. in Management Information System from the University of Arizona. Ping has spoken at various data science and InfoSec conferences such as WITS, CanSecWest, OWASP AppSec, and Spark+AI Summits.
