Category Archives: HubSpot

HubSpot Synchronization and Deleted Leads

Don’t do this at home.

Our org had about 300,000 Salesforce Leads and around 200,000 HubSpot contacts. The Salesforce Leads had accumulated from 2006 and came from many sources, including previous Marketing Automation systems such as Marketo and Eloqua. For a variety of reasons, we decided to purge from Salesforce the obsolete Leads.

So, here’s what happened:

  • Run Data Loader mass delete job in Salesforce
  • Expect HubSpot to delete the corresponding contacts so the systems stay in sync
  • Discover over the course of a few months that HubSpot emails sent to reps on contact activity (like filling out a form) had broken links to the SFDC lead
  • Moderately annoying to the reps but at some point, sales insisted this be fixed


The big reveal was that HubSpot doesn’t delete HubSpot contacts associated with SFDC leads if those contacts aren’t in the SFDC inclusion list!

Here’s an example:

  • Contact comes into HubSpot from some form.
  • HubSpot adds contact to inclusion list (because form completed).
  • HubSpot contact syncs to Salesforce as a new Lead.
  • Sales marks lead with status Bogus Data.
  • HubSpot picks up change in status on next sync but ….
  • Because inclusion list rules say, exclude Status = Bogus Data, the contact is removed from the inclusion list. HubSpot maintains, as a property of the contact, the SFDC Lead Id in case the contact re-enters the inclusion list filter.
  • Salesforce mass delete removes the Lead.
  • Because the SFDC Lead is no longer in the inclusion list, the delete event is not recognized by HubSpot and contact remains in HubSpot. The HubSpot contact is an orphan from the point of view of synchronization.

We’re still exploring this but I believe the conceptual answer should be:

  1. If the HubSpot contact is ever sync’d to Salesforce, the contact should remain in the inclusion list. Thus, deletes from the Salesforce side will be deleted in HubSpot.
  2. Use HubSpot smart lists to filter out contacts marked as disqualified or bogus data or otherwise not worth engaging in new campaigns.
  3. Run periodic (monthly) Salesforce batch jobs to delete Leads that are bogus/spam after x days of existence
  4. Don’t delete from HubSpot as a HubSpot delete won’t delete in Salesforce, leaving you with an unbalanced system
  5. Make your inclusion list rules succinct – they need to be readable on one page, without scrolling

Rapid compare of two lists in Excel

The business problem was to mass delete Leads from Salesforce to cause a corresponding mass delete from HubSpot. The business wanted to know how many of the planned SFDC deletions were in the HubSpot SFDC sync inclusion list. The lists were very large (> 100,000)

  1. Export SFDC leads to be deleted – with email column in export
  2. Export HubSpot contacts in the SFDC integration settings inclusion list
  3. Create a new Excel workbook and place the Hubspot emails into column A and the SFDC emails in column B
  4. In column C, Write a VLOOKUP of value in column B to see if in column A, if not, error
  5. Count the non error cells in column C

Not so fast pard’ner!

Literally, Excel VLOOKUP exact match (4th argument set to false) is really sloooooowwwwwww on large spreadsheets. So, instead, you have to use VLOOKUP twice but with approximate matching and sorted lists.

Step 1 – sort (only) Column A, then sort (only) Column B
Step 2 – in cell C2, the Excel formula is (remember – HubSpot data is in column A, SFDC data in column B):

=IF(VLOOKUP(B2,$A:$A,1,TRUE)=B2, VLOOKUP(B2,$A:$A,1,TRUE), NA()) and then copy the formula down for all rows in column B (SFDC leads).

This runs lightning fast (the alternative VLOOKUP exact match would be in the minutes on a 64 bit quad processor 32 GB Windows 7 machine).

Why does this work so well?
When VLOOKUP uses approximate matching on sorted lists, it stops once it finds the match (or the very next value). So, the IF condition tests to see if VLOOKUP returns the same value as the lookup key, if yes, the true condition simply returns the lookup results again – because we know it was an exact match. If false, return the #N/A value because we know it was not an equal match.

Now, I could have done this faster with the following:

=IF(VLOOKUP(B2,$A:$A,1,TRUE)=B2, B2, NA())

This is because the result column is the same as the lookup column. The second VLOOKUP can be used to return any column from the search array by varying the third argument. I leave the two VLOOKUPs in for future proofing the general fast VLOOKUP technique on other tables.