Key Takeaways: Sequencing Load Operations

Here are some highlights from my reading of
Extreme Data Loading, Part 4: Sequencing Load Operations

Configuring Your Organization for the Data Load

1) Consider enabling parallel sharing rule recalculation

To enable this feature, contact Salesforce Support

2) Create Role Hierarchy
3) Load users, assigning them to appropriate roles
4) Configure Public Read/Write organization-wide sharing defaults on the objects you plan to load. Note: some orgs have so much data that changing sharing rules from Public to Private takes a very long time; in that case, simply load the data with the sharing model you will use in Production.
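As a sketch of step 3, here is one way to assemble a Bulk API-style CSV that assigns each user to a role before loading. The role IDs, sample users, and the `build_user_csv` helper are all illustrative assumptions, not something from the original article:

```python
import csv
import io

# Hypothetical role hierarchy (UserRole record IDs) -- illustrative only.
ROLE_IDS = {"Sales Rep": "00E000000000001", "Sales Manager": "00E000000000002"}

users = [
    {"LastName": "Doe", "Email": "jdoe@example.com", "Role": "Sales Rep"},
    {"LastName": "Lee", "Email": "alee@example.com", "Role": "Sales Manager"},
]

def build_user_csv(users, role_ids):
    """Build a Bulk API-style CSV where each row carries its UserRoleId."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["LastName", "Email", "UserRoleId"])
    writer.writeheader()
    for u in users:
        writer.writerow({
            "LastName": u["LastName"],
            "Email": u["Email"],
            "UserRoleId": role_ids[u["Role"]],  # map role name -> record ID
        })
    return buf.getvalue()

print(build_user_csv(users, ROLE_IDS))
```

The point is simply that the role hierarchy must exist first, so the user rows can reference real role IDs at load time.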

Preparing to Load Data

1) Ensure clean data, especially foreign key relationships; note that invalid references cause parallel loads to switch to single execution mode, slowing down the load considerably.
2) Suspend events that fire on insert
3) Perform testing in advance to tune batch size for throughput. For both the Bulk API and the SOAP API, look for the largest batch size that is possible without generating network timeouts from large records, or from additional processing on inserts or updates that can't be deferred until after the load completes.
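A minimal pre-load check for step 1 might look like the sketch below. The `find_orphans` helper and the sample records are hypothetical; the idea is just to catch broken foreign keys before they force the load out of parallel mode:

```python
def find_orphans(child_rows, parent_ids):
    """Return child rows whose ParentId does not resolve to a loaded parent.

    Broken references cause batch retries that can drop a parallel load into
    single execution mode, so it is cheaper to catch them before loading.
    """
    parent_ids = set(parent_ids)
    return [row for row in child_rows if row["ParentId"] not in parent_ids]

parents = ["001A", "001B"]
children = [
    {"Name": "c1", "ParentId": "001A"},
    {"Name": "c2", "ParentId": "001X"},  # broken reference
]

print(find_orphans(children, parents))  # -> [{'Name': 'c2', 'ParentId': '001X'}]
```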

Executing the Data Load

1) Use the fastest operation possible: insert is faster than upsert, and even insert followed by update can be faster than upsert alone.
2) When updating, load only the fields that have changed for existing records.
3) Group child records by ParentId, and if you use separate batches, don't reference the same ParentId in more than one of them. Why? This reduces the risk of record-locking errors. If this cannot be done, you can use the Bulk API in serial execution mode to avoid locking from parallel updates.
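The grouping advice in step 3 can be sketched as a small batching helper. `batch_by_parent` is an illustrative name; the code only arranges local rows and does not talk to Salesforce:

```python
from itertools import groupby

def batch_by_parent(child_rows, batch_size):
    """Split child rows into batches of at most batch_size rows, never letting
    one ParentId span two batches (which would risk record-locking errors
    when batches run in parallel). A single parent group larger than
    batch_size is kept whole in its own oversized batch.
    """
    rows = sorted(child_rows, key=lambda r: r["ParentId"])
    batches, current = [], []
    for _parent_id, group in groupby(rows, key=lambda r: r["ParentId"]):
        group = list(group)
        if current and len(current) + len(group) > batch_size:
            batches.append(current)
            current = []
        current.extend(group)
    if current:
        batches.append(current)
    return batches

orders = [{"Id": n, "ParentId": p} for n, p in
          [(1, "001A"), (2, "001A"), (3, "001B"), (4, "001C")]]
for batch in batch_by_parent(orders, 3):
    print([r["ParentId"] for r in batch])
```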

Configuring Organization for Production

1) Defer sharing calculation before performing some or all of the operations below, depending on the results of your sandbox testing.
2) Change the Public Read/Write OWD to Public Read Only or Private, as appropriate.
3) Create or configure public groups and queues [todo: configure to what?]
4) Configure sharing rules
5) If Sharing Calculation is not deferred, create public groups, queues, and sharing rules one at a time, and allow sharing calculations to complete before moving on to the next one.
Note: resuming sharing calculation on large data volumes can take a long time.
6) Resume events such as insert triggers so that validation/data-enhancement processes (Flow, Workflow, etc.) can run properly in Production.
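The non-deferred path in step 5 can be sketched as a sequential create-and-wait loop. The client class below is a stand-in stub, not a real Salesforce API; a real implementation would create each rule via the Metadata API and poll the recalculation status instead:

```python
import time

class FakeSharingClient:
    """Stand-in for a real Metadata/Tooling API client -- entirely hypothetical."""

    pending = 0

    def create_sharing_rule(self, rule):
        self.pending = 2  # pretend each recalculation takes two polls to finish
        return rule

    def recalculation_running(self):
        self.pending -= 1
        return self.pending > 0

def create_rules_sequentially(client, rules, poll_interval=0.01):
    """Create sharing rules one at a time, waiting for each recalculation to
    complete before creating the next, as step 5 advises when sharing
    calculation is not deferred.
    """
    created = []
    for rule in rules:
        created.append(client.create_sharing_rule(rule))
        while client.recalculation_running():
            time.sleep(poll_interval)  # back off until recalculation finishes
    return created
```

The same loop shape applies to public groups and queues: create one, wait for the recalculation it triggers, then move on.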
