Bulk Data API Nick Simha Technical Alliance Manager
Agenda Bulk Data API basics Demo Best practices Resource list
What is the Bulk Data API? REST based, asynchronous API optimized for loading large sets of data.
Why is it useful? Enables high-volume integration with Salesforce (volume) Enables integration that has to finish within a fixed window of time (speed) Part of a suite of features that let our customers store very large data volumes in Salesforce –Batch Apex –Skinny tables –Divisions –Custom indexing –Etc.
How does the Bulk Data API work? Client: send 10,000 CSV rows in a batch over HTTPS. SFDC: stream the file to temp storage and return an ID to the client. Client: loop until all records are sent (e.g. 50 times for 500k rows). SFDC: grab each file and save it to the database in smaller batches. SFDC: generate a result file and store it in temp storage. SFDC: loop until all files are processed. The phases are decoupled: one doesn't wait for the other, and each can run in parallel.
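The client-side half of this flow, cutting a large CSV into batches of 10,000 rows, can be sketched in a few lines of Python. This is only an illustration: the names `csv_batches` and `_to_csv` are made up for this example, and repeating the header row in every batch is an assumption about what each uploaded file should look like.

```python
import csv
import io
from typing import Iterable, Iterator, List

BATCH_ROWS = 10_000  # rows per batch, as quoted on the slide


def _to_csv(header: List[str], chunk: List[List[str]]) -> str:
    """Serialize one batch: the header row followed by the data rows."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(chunk)
    return buf.getvalue()


def csv_batches(header: List[str], rows: Iterable[List[str]],
                batch_rows: int = BATCH_ROWS) -> Iterator[str]:
    """Yield CSV documents with at most batch_rows data rows each."""
    chunk: List[List[str]] = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == batch_rows:
            yield _to_csv(header, chunk)
            chunk = []
    if chunk:  # final, partially filled batch
        yield _to_csv(header, chunk)
```

With the default batch size, 500k rows produce 50 batches, which matches the loop count on the slide. Each yielded string would then be sent to SFDC as one batch.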
How can I call the Bulk Data API? Through Data Loader From any Web Services client – Java, C# etc. From the command line! Supported by our integration partners
Using the Bulk API from a client Create Job Create Batch(es) and add to Job –Number of batches is determined by the amount of data and the limits on batch size. Close Job Retrieve Batch Status Retrieve Batch Result
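The five steps map onto a short sequence of HTTP calls. The sketch below only lists them and sends nothing. The job URL comes from the sample request in this deck; the batch, status and result paths are assumptions based on the Bulk API guide and should be checked against it. `request_plan` is a hypothetical helper name.

```python
from typing import List, Tuple

API_VERSION = "17.0"  # version used in the deck's sample request
BASE = f"/services/async/{API_VERSION}/job"


def request_plan(job_id: str, batch_ids: List[str]) -> List[Tuple[str, str, str]]:
    """Return (step, HTTP method, path) tuples in the order a client issues them.

    Paths other than BASE are assumed, not taken from the slides.
    """
    plan = [("create job", "POST", BASE)]
    # One upload request per batch, all against the same job.
    plan += [("add batch", "POST", f"{BASE}/{job_id}/batch") for _ in batch_ids]
    plan.append(("close job", "POST", f"{BASE}/{job_id}"))
    for batch_id in batch_ids:
        plan.append(("batch status", "GET", f"{BASE}/{job_id}/batch/{batch_id}"))
        plan.append(("batch result", "GET",
                     f"{BASE}/{job_id}/batch/{batch_id}/result"))
    return plan
```

In a real client the job ID is only known after the first response and each batch ID after its upload, so the plan would be built step by step rather than up front.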
Sample Request
POST /services/async/17.0/job HTTP/1.1
User-Agent: curl/7.19.6 (i386-pc-win32) libcurl/7.19.6 OpenSSL/0.9.8k zlib/1.2.3
Host: na6.salesforce.com
Accept: */*
X-SFDC-Session: 00D80000000MD0n!AQgAQI1EfPPEyWvwuaD_IRpvSlrwm7Kr00e
Content-Type: application/xml; charset=UTF-8
Content-Length: 195

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
  <operation>insert</operation>
  <object>Contact</object>
  <contentType>CSV</contentType>
</jobInfo>

See the Getting Started chapter in the API guide. Use curl --trace-ascii to capture messages.
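The same "create job" request can be assembled from a Web Services client. Below is a minimal Python sketch using only the standard library. It builds the request object but does not send it. The `jobInfo` element names and namespace are assumptions based on the Bulk API guide, and `build_create_job_request` is a name invented for this example.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed namespace for jobInfo; verify against the API guide.
NS = "http://www.force.com/2009/06/asyncapi/dataload"


def build_create_job_request(host: str, session_id: str,
                             operation: str = "insert",
                             sobject: str = "Contact",
                             content_type: str = "CSV",
                             api_version: str = "17.0") -> urllib.request.Request:
    """Build (but do not send) the POST that creates a bulk job."""
    job = ET.Element("jobInfo", xmlns=NS)
    ET.SubElement(job, "operation").text = operation
    ET.SubElement(job, "object").text = sobject
    ET.SubElement(job, "contentType").text = content_type
    xml_text = ET.tostring(job, encoding="unicode")
    body = ('<?xml version="1.0" encoding="UTF-8"?>' + xml_text).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}/services/async/{api_version}/job",
        data=body,
        method="POST",
        headers={
            "X-SFDC-Session": session_id,  # session ID from a prior login
            "Content-Type": "application/xml; charset=UTF-8",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would issue the call. `Content-Length` is not set here because `urllib` adds it when the request is sent.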
Demo Load 100K Addresses of medical providers Cleanse the data
Bulk API - Some Additional Information Can monitor bulk loads in Builder –Monitoring -> Bulk Data Load Jobs Doesn't handle attachments Governor limits – 500 batches per 24-hour period and 10,000 records per batch, so a theoretical limit of 5M records per day. –Caveat – each batch also needs to be smaller than 10MB Batch limits can be increased by engineering –Need to contact your SE
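The limit arithmetic is easy to check before starting a load. The sketch below hard-codes the numbers from this slide. The function names are illustrative, and reading "10MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
import math

MAX_BATCHES_PER_DAY = 500          # batches per 24-hour period
MAX_ROWS_PER_BATCH = 10_000        # records per batch
MAX_BATCH_BYTES = 10 * 1024 * 1024  # "10MB", assumed to mean binary megabytes


def daily_row_ceiling() -> int:
    """Theoretical maximum number of records loadable per day."""
    return MAX_BATCHES_PER_DAY * MAX_ROWS_PER_BATCH


def batches_needed(total_rows: int) -> int:
    """Minimum number of batches, assuming every batch is filled to the row limit."""
    return math.ceil(total_rows / MAX_ROWS_PER_BATCH)


def days_needed(total_rows: int) -> int:
    """Minimum number of 24-hour periods needed at the default batch quota."""
    return math.ceil(batches_needed(total_rows) / MAX_BATCHES_PER_DAY)


def batch_within_limits(row_count: int, byte_count: int) -> bool:
    """True if one batch respects both the row limit and the size caveat."""
    return row_count <= MAX_ROWS_PER_BATCH and byte_count < MAX_BATCH_BYTES
```

For wide rows the size caveat can bite before the row limit does, so the real batch count may be higher than `batches_needed` suggests.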
Best practices Combine Bulk API with Batch Apex to get optimal performance –Faster than complex triggers –Similar to the demo – this is a generic pattern that you can use in many scenarios Stick with parallel processing unless there is a reason not to do so –See the FAQ for scenarios where you would use serial processing Handling very large data volumes requires a comprehensive, holistic approach –Bulk API is one part of the solution
Resource list Bulk API doc: http://www.salesforce.com/us/developer/docs/api_asynch/index.htm