As Salesforce adoption grows, so does the importance of understanding and managing data skew. Data skew is a situation where a disproportionately large number of records are owned by a single user or attached to a single parent record, producing an imbalanced distribution of data. This can have serious consequences, such as degraded performance and record-locking errors.
Data skew arises from a non-uniform distribution of data in a large data set. In Salesforce, it becomes a performance issue when more than 10,000 records of one object are owned by a single user, or when more than 10,000 child records are related to a single parent record.
For example, if a single account record has more than 10,000 child contact records, operations on those contacts can run into performance issues because of the work Salesforce does to preserve data integrity. This is a data skew issue.
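The threshold described above can be checked with a simple count per parent. The following is a minimal sketch in Python, assuming the child records have been exported as a plain list of parent IDs; the function name `find_skewed_parents` and the sample IDs are hypothetical and not part of any Salesforce API.

```python
from collections import Counter

SKEW_THRESHOLD = 10_000  # the figure quoted in this article


def find_skewed_parents(child_parent_ids, threshold=SKEW_THRESHOLD):
    """Return {parent_id: child_count} for parents with MORE than `threshold` children."""
    counts = Counter(child_parent_ids)
    return {parent_id: n for parent_id, n in counts.items() if n > threshold}


# Hypothetical export: one parent (account) ID per contact row.
contact_account_ids = (
    ["ACC-1"] * 10_001   # over the threshold
    + ["ACC-2"] * 250    # well under
    + ["ACC-3"] * 10_000 # exactly at the threshold, so not flagged
)

skewed = find_skewed_parents(contact_account_ids)
print(skewed)  # {'ACC-1': 10001}
```

The same count works for ownership skew if the list holds owner IDs instead of parent IDs.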
Let’s look at the different types of data skew in Salesforce. There are three:
- Account Skew
- Ownership Skew
- Lookup Skew
1. Account Skew
Account Skew occurs when too many child records are related to a single account record. If more than 10,000 contact or opportunity records sit under a single account record, you will run into Account Skew issues. The two main problems are described below.
a. Record Locking
Normally, when you update a contact or opportunity record in Salesforce, the parent account record is also locked to maintain data integrity. When you update many contacts or opportunities at once, through a trigger or a bulk data load, all of those updates compete for the lock on the single parent account. Some of them can fail with record-lock errors, and the bulk update may not complete as a result.
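To make the locking problem concrete, here is a deliberately simplified model in Python. It assumes that a bulk load is split into batches that run in parallel and that every batch needs the lock on each parent it touches; this is an illustration only, not a description of Salesforce's actual locking implementation. The helper names and the record IDs are hypothetical.

```python
from collections import defaultdict


def batches(records, size):
    """Split a list into consecutive batches of at most `size` records."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def contended_parents(record_batches):
    """Map each parent touched by more than one batch to the number of batches touching it."""
    batches_per_parent = defaultdict(set)
    for index, batch in enumerate(record_batches):
        for _child_id, parent_id in batch:
            batches_per_parent[parent_id].add(index)
    return {p: len(b) for p, b in batches_per_parent.items() if len(b) > 1}


# Hypothetical (child_id, parent_id) pairs.
skewed_load = [(f"C{i}", "ACC-1") for i in range(600)]           # one parent for everything
even_load = [(f"C{i}", f"ACC-{i // 200}") for i in range(600)]   # 200 children per parent

print(contended_parents(batches(skewed_load, 200)))  # {'ACC-1': 3}
print(contended_parents(batches(even_load, 200)))    # {}
```

In the skewed load every batch needs the same parent, so under this model the batches would block or fail against each other; in the even load no parent is shared between batches.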
b. Sharing Problems
When a huge set of child records is associated with a single account record, you will face issues with sharing rule recalculations. For example, if you move the account from one region to another or add a user to the existing account team, Salesforce has to recalculate sharing for all of the child records as well. This results in a large number of sharing recalculations and, eventually, performance problems.
How Can We Avoid Account Skew?
By now the answer is probably clear: distribute child records evenly across parent records, and do not allow all child records to accumulate under a single parent.
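One way to picture an even distribution is to plan how many parent records a set of children needs so that none exceeds the limit. The sketch below is in Python and assumes a simple round-robin split across generated parent names; `plan_parents` and the `ACC-BUCKET` naming are hypothetical, and in a real org the split would normally follow a business meaning such as region or business unit rather than arbitrary buckets.

```python
import math


def plan_parents(child_ids, max_children=10_000, parent_prefix="ACC-BUCKET"):
    """Assign children round-robin to as many hypothetical parents as needed
    so that no parent receives more than `max_children` children."""
    if max_children < 1:
        raise ValueError("max_children must be at least 1")
    parent_count = math.ceil(len(child_ids) / max_children)
    plan = {f"{parent_prefix}-{n + 1}": [] for n in range(parent_count)}
    parents = list(plan)
    for i, child_id in enumerate(child_ids):
        plan[parents[i % parent_count]].append(child_id)
    return plan


# Hypothetical child IDs that currently all sit under one account.
children = [f"CON-{i}" for i in range(25_000)]
plan = plan_parents(children)

for parent, assigned in plan.items():
    print(parent, len(assigned))
# ACC-BUCKET-1 8334
# ACC-BUCKET-2 8333
# ACC-BUCKET-3 8333
```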
2. Ownership Skew
Ownership Data Skew in Salesforce refers to an unequal distribution of record ownership among users within an organization. This happens when a considerable number of records are allocated to a single user or a limited group of users, leading to a skewed distribution of data.
For example, when a single user owns more than 10,000 records, this creates Ownership Skew. It is a common issue in Salesforce, because teams tend to assign all inactive or legacy records to one generic system user.
How Can We Avoid Ownership Skew?
- Distribute record ownership evenly across multiple users rather than assigning all ownership to a single user.
- If you must assign all inactive records to a single user, make sure that user has no role assigned. This will not eliminate the Data Skew issue, but it will give better performance.
- If the user must have a role, assign the user to a role at the top of the hierarchy and keep them there. This avoids moving the user around the role hierarchy, which would trigger sharing recalculations.
- Ensure that the user is not a member of any public group that is used as the source for sharing rules.
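The checklist above can be expressed as a small review function. This is a minimal sketch in Python over exported data, not a call into Salesforce; the `Owner` class, its fields, and the sample users and groups are all hypothetical stand-ins for whatever your own export contains.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Owner:
    user_id: str
    role: Optional[str] = None                  # None means no role assigned
    is_top_role: bool = False                   # True if the role sits at the top of the hierarchy
    sharing_rule_groups: Tuple[str, ...] = ()   # public groups used as sharing-rule sources


def review_owner(owner, owned_count, threshold=10_000):
    """Return a list of findings for an owner; empty if the owner is not over the threshold."""
    findings = []
    if owned_count <= threshold:
        return findings
    findings.append(f"owns {owned_count} records (more than {threshold})")
    if owner.role is not None and not owner.is_top_role:
        findings.append(f"has role '{owner.role}' below the top of the hierarchy")
    if owner.sharing_rule_groups:
        findings.append(
            "belongs to groups used as sharing-rule sources: "
            + ", ".join(owner.sharing_rule_groups)
        )
    return findings


# Hypothetical export: one owner ID per record, plus owner details.
record_owner_ids = ["U-LEGACY"] * 12_000 + ["U-ALICE"] * 300
owners = {
    "U-LEGACY": Owner("U-LEGACY", role="Sales Rep", sharing_rule_groups=("EMEA Sales",)),
    "U-ALICE": Owner("U-ALICE", role="Sales Rep"),
}

counts = Counter(record_owner_ids)
report = {uid: review_owner(owners[uid], n) for uid, n in counts.items()}
for uid, findings in report.items():
    for finding in findings:
        print(uid, "-", finding)
```

Only owners above the threshold are reviewed against the role and group guidance, since those settings matter most for a user who holds a very large number of records.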
3. Lookup Skew
The underlying technical problem for Lookup Skew is similar to that for Account Skew. Lookup Skew occurs when more than 10,000 child records are associated with a single parent record through a lookup field. Record locks on the parent can prevent the child records from being updated, and sharing recalculation takes longer, both of which hurt performance.
How Can We Avoid Lookup Skew?
Salesforce suggests not having thousands of records tagged to a single parent record. We should also remove any unnecessary workflow rules, flows, or Process Builder processes from the object, as they add to the performance problems, and make sure that our Apex triggers and classes follow best practices so that they do not make performance worse.
Summary
It is vital that no single user owns more than 10,000 records and that no single parent record has more than 10,000 child records. Treat this as a firm rule when designing and developing Salesforce solutions.
Frequently Asked Questions (FAQs)
1. What is the impact of data skew in Salesforce?
- Performance Degradation: Data skew can hinder Salesforce’s efficiency. For example, when a single parent has over 10,000 child records, it can compromise performance due to data consistency checks.
- System Glitches: An uneven data distribution can lead to system malfunctions.
- Record Lock Issues: Concurrent updates to numerous child records can lock the parent record because of data consistency checks, obstructing mass update operations.
- Sharing Complications: Associating a vast number of child records with one account record can disrupt sharing rule recalculations, causing performance setbacks.
2. What are some tools and resources that I can use to identify and prevent data skew in Salesforce?
Salesforce provides several diagnostic and monitoring tools to pinpoint performance challenges. The Salesforce Optimizer is a handy tool to evaluate your Salesforce setup’s health.
3. What are some examples of data skew in Salesforce and how to fix them?
- Account Skew: This occurs when a single account record is linked to an excessive number of child records (over 10,000). A solution is to distribute child records evenly among parent records.
- Ownership Skew: This happens when one user owns more than 10,000 records. To fix it, spread record ownership across several users instead of concentrating it under one user.
- Lookup Skew: This arises when over 10,000 child records relate to one lookup parent record. Salesforce recommends distributing records evenly and removing redundant workflow rules or Process Builder processes from the object.
4. What are the implications of data skew for Salesforce reporting and analytics?
Data skew can distort reports and analytics. For example, if records are predominantly owned by one user, user-based reports might be unbalanced. Moreover, the performance challenges stemming from data skew can delay the generation of reports and analytical tasks.
5. What are some common mistakes that people make when managing data skew in Salesforce?
- Child Record Pile-up: Permitting all child records to cluster under one parent, resulting in Account Skew.
- Centralized Record Ownership: Assigning a vast volume of records (especially dormant or legacy data) to one generic system user, leading to Ownership Skew.
- Neglecting Best Practices: Failing to ensure that Apex triggers, workflow rules, and other automation adhere to best practices can intensify Lookup Skew-related performance problems.