Fixing and Normalising VCF files


VCF (vCard) files hold contact data for people. You usually see them in Outlook, but they are not limited to Outlook: they are a universal way of storing contact data, and a single file can hold one person or many people.

Note : This problem only occurs if you are using iCloud Contacts for your contacts; it does not occur if you use Google Contacts.

This is an example of a single contact in a VCF file. As you can see, each contact starts with "BEGIN:VCARD" and ends with "END:VCARD":

BEGIN:VCARD
VERSION:3.0
PRODID:-//Apple Inc.//iOS 15.6.1//EN
N:Buttered;Toast;;;
FN:Buttered Toast
ORG:Breakfast Inc.;
TITLE:Chief Crunch Officer
EMAIL;type=INTERNET;type=HOME;type=pref:buttered.toast@breakfast.com
TEL;type=WORK;type=VOICE;type=pref:123-456-7890
TEL;type=CELL;type=VOICE:098-765-4321
item1.ADR;type=WORK;type=pref:;;123 Cereal St.;Bowl City;BC;12345;USA
NOTE:Always ready to serve with a smile.
REV:2024-08-15T10:00:00Z
END:VCARD
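To make the layout of those lines a little more concrete, here is a minimal sketch that breaks one line of the card into its parts. The function name ConvertFrom-VCardLine is made up for this post, and the sketch assumes the first colon separates the property from its value, which will not hold for parameter values that contain quoted colons.

```powershell
# Hypothetical helper (not part of the script below): splits one vCard line
# into its optional group, property name, parameters and value.
function ConvertFrom-VCardLine {
    param ([string]$Line)

    # Assumption: everything before the first colon is the property and its parameters
    $head, $value = $Line -split ':', 2
    $parts = @($head -split ';')

    # A prefix such as "item1." groups related properties together
    $name = $parts[0]
    $group = $null
    if ($name -match '^([^.]+)\.(.+)$') {
        $group = $Matches[1]
        $name = $Matches[2]
    }

    [pscustomobject]@{
        Group      = $group
        Name       = $name
        Parameters = @($parts | Select-Object -Skip 1)
        Value      = $value
    }
}

$tel = ConvertFrom-VCardLine 'TEL;type=WORK;type=VOICE;type=pref:123-456-7890'
$adr = ConvertFrom-VCardLine 'item1.ADR;type=WORK;type=pref:;;123 Cereal St.;Bowl City;BC;12345;USA'
```

Here $tel.Name is TEL with three type parameters, and $adr carries the group item1 alongside the property name ADR.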

If you use iCloud Contacts, which is mostly used on iPhones and other Apple devices, then iCloud is not very efficient at keeping your contacts normalised. The key line from the card above is this one:

PRODID:-//Apple Inc.//iOS 15.6.1//EN

This contact was written by iOS 15.6.1. However, if you use iCloud Contacts it is not uncommon to end up with the same contact over and over again as you go through your iOS version updates, so that could look something like this:

PRODID:-//Apple Inc.//iOS 15.8.1//EN
PRODID:-//Apple Inc.//iOS 16.0.1//EN
PRODID:-//Apple Inc.//iOS 16.2.1//EN
PRODID:-//Apple Inc.//iOS 16.6.1//EN
PRODID:-//Apple Inc.//iOS 17.0.1//EN

This means that for the same contact you now have 6 entries: the original one from iOS 15.6.1 plus the additional ones from the newer iOS versions. Not every iOS version will cause this issue, but in this example you now have 6 x Buttered Toast contacts, all with the same data except for the PRODID field.
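A quick way to convince yourself that two cards are the same contact is to compare them with the PRODID line left out. This is only a sketch: Get-CardFingerprint is a made-up helper name, and the two cards are cut-down samples rather than real exports.

```powershell
# Hypothetical helper: returns a card's lines with blank lines and PRODID removed
function Get-CardFingerprint {
    param ([string]$Card)

    $lines = $Card -split '\r?\n' |
        Where-Object { $_.Trim() -ne '' -and $_ -notmatch '^PRODID:' }
    return ($lines -join "`n")
}

# Sample card written by iOS 15.6.1
$cardA = @(
    'BEGIN:VCARD'
    'VERSION:3.0'
    'PRODID:-//Apple Inc.//iOS 15.6.1//EN'
    'FN:Buttered Toast'
    'END:VCARD'
) -join "`r`n"

# The same card as rewritten by a later iOS version
$cardB = $cardA -replace 'iOS 15\.6\.1', 'iOS 17.0.1'

# The raw text differs, but the fingerprints match
$rawMatch = $cardA -eq $cardB
$sameCard = (Get-CardFingerprint $cardA) -eq (Get-CardFingerprint $cardB)
```

With these samples $rawMatch is false and $sameCard is true, which is exactly the situation described above.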

I am not sure about you, but I only want one "Buttered Toast" in my contact list. If this is left unchecked you can end up with all your contacts being duplicated over and over again. The problem is that some of the duplicates do not have all the correct data in them, which means you have mismatched contacts, and if you then sync this to Outlook or Outlook for iOS your problems get worse.

Outlook will also add the following to the VCF file when it cannot update a contact, creating a read-only contact in addition to the other faulty contacts:

NOTE:This contact is read-only. To make changes\, tap the link above to edi
 t in Outlook.\n\n\n\n\n\n
item2.URL;type=pref:ms-outlook://people/v2/bb8ecdacedbb8c0e245b26d608623d5e
 4b1b1577c9571aeea6c7a750fecaf4ab2de58221c4efaf35ed01ab2d8c50dbef2c05e0fa3b1
 f730134a4631cd271a9de5879e6b7a36f572619618539bc9156f0c68f52526617d7e0dfbc82
 c3f3072c5c5199ee1053d27e094a807e911c0a2f19?accountKey=602e4397785ac01401d38
 c99b661cc1fa81457adb5025774ea5dac9d9474dacc&checksum=81dbf27ca8761b8b4274
item2.X-ABLabel:Outlook
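If you want to strip those Outlook lines out of a card, one possible approach is sketched here. Remove-OutlookLines is a made-up helper name, the patterns are guesses based only on the sample above, and the URL in the test card is shortened. Long lines in a VCF are folded (a line break followed by a space continues the previous line), so the card is unfolded before filtering.

```powershell
# Hypothetical helper: removes the Outlook read-only note, link and label from one card
function Remove-OutlookLines {
    param ([string]$Card)

    # Unfold first: a line break followed by a space or tab continues the previous line
    $unfolded = $Card -replace '\r?\n[ \t]', ''

    # Assumption: these three patterns cover what Outlook adds, based on the sample above
    $kept = $unfolded -split '\r?\n' | Where-Object {
        $_ -notmatch '^NOTE:This contact is read-only' -and
        $_ -notmatch '^item\d+\.URL[^:]*:ms-outlook://' -and
        $_ -notmatch '^item\d+\.X-ABLabel:Outlook$'
    }
    return ($kept -join "`r`n")
}

# Cut-down sample card with a shortened URL
$card = @(
    'BEGIN:VCARD'
    'FN:Buttered Toast'
    'NOTE:This contact is read-only. To make changes\, tap the link above to edi'
    ' t in Outlook.\n\n'
    'item2.URL;type=pref:ms-outlook://people/v2/bb8ecdaced'
    ' 4b1b1577?accountKey=602e'
    'item2.X-ABLabel:Outlook'
    'END:VCARD'
) -join "`r`n"

$clean = Remove-OutlookLines $card
```

Note that this would also drop a genuine NOTE that happens to start with the same wording, so check the result before importing it anywhere.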

This is not ideal. As an example, let's take a sample VCF with multiple contacts inside. A multi-contact file is just single contacts stacked one after another, each with its own BEGIN:VCARD and END:VCARD. That will look like this, with the start and end markers in red boxes below:


This VCF file is normal, but let's say it had years of neglect and iOS duplication, and you want to remove all the duplication and normalise the data so you have one contact, not multiple.
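Before fixing anything it can help to see how bad the duplication is. The sketch below splits a small in-memory sample (not a real export) into cards and counts how often each full name (FN) appears. It assumes, like the script that follows, that the FN value is what identifies a contact.

```powershell
# Hypothetical sample: two copies of one contact plus one other contact
$vcf = @(
    'BEGIN:VCARD', 'VERSION:3.0', 'PRODID:-//Apple Inc.//iOS 15.6.1//EN', 'FN:Buttered Toast', 'END:VCARD',
    'BEGIN:VCARD', 'VERSION:3.0', 'PRODID:-//Apple Inc.//iOS 16.0.1//EN', 'FN:Buttered Toast', 'END:VCARD',
    'BEGIN:VCARD', 'VERSION:3.0', 'PRODID:-//Apple Inc.//iOS 16.0.1//EN', 'FN:Poached Egg', 'END:VCARD'
) -join "`r`n"

# Split after each END:VCARD so every card keeps its own BEGIN/END markers
$cards = @($vcf -split '(?<=END:VCARD)\r?\n' | Where-Object { $_.Trim() -ne '' })

# Pull the FN value out of each card
$names = foreach ($card in $cards) {
    if ($card -match '(?m)^FN:(.*?)\r?$') { $Matches[1].Trim() }
}

# Names that appear more than once are the duplicates
$duplicates = @($names | Group-Object | Where-Object { $_.Count -gt 1 })
$duplicates | ForEach-Object { "{0} appears {1} times" -f $_.Name, $_.Count }
```

To run the same check against a real file, replace the sample with the output of Get-Content -Raw for that file.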

Script : VCFContactFixer.ps1

# Define the input and output VCF file paths
$inputVCF = "CurrentContacts.vcf"
$outputVCF = "UpdatedContacts.vcf"
$reportFile = "Report.txt"

# Initialize counters for the report
$totalContacts = 0
$duplicateContacts = 0
$mergedContacts = 0
$skippedContacts = 0

# Function to merge two contacts by combining their unique fields
function Merge-Contacts {
    param (
        [string]$contact1,
        [string]$contact2
    )

    # Unfold continuation lines (a line break followed by a space or tab) so that
    # each field is one line, then split on CRLF or LF and drop blank lines
    $lines1 = ($contact1 -replace '\r?\n[ \t]', '') -split '\r?\n' | Where-Object { $_.Trim() -ne "" }
    $lines2 = ($contact2 -replace '\r?\n[ \t]', '') -split '\r?\n' | Where-Object { $_.Trim() -ne "" }

    # Use an ordered dictionary to store lines from both contacts, avoiding
    # duplicates while keeping the fields in their original order
    $mergedLines = [ordered]@{}
    $hasProdId = $false

    foreach ($line in @($lines1) + @($lines2)) {
        $line = $line.TrimEnd()

        # Skip the BEGIN/END markers; a single pair is added back below
        if ($line -match '^(BEGIN|END):VCARD$') {
            continue
        }

        # Keep only the first PRODID, the field that differs between iOS duplicates
        if ($line -match '^PRODID:') {
            if ($hasProdId) { continue }
            $hasProdId = $true
        }

        $mergedLines[$line] = $true
    }

    # Reconstruct the merged contact with one BEGIN:VCARD and one END:VCARD
    return (@("BEGIN:VCARD") + @($mergedLines.Keys) + @("END:VCARD")) -join "`r`n"
}

# Read the entire VCF file content as a single string
$vcfContent = Get-Content $inputVCF -Raw

# Split the content into individual contacts after each END:VCARD marker.
# The lookbehind keeps END:VCARD on each contact, and \r?\n accepts CRLF or LF line endings.
$contacts = $vcfContent -split '(?<=END:VCARD)\r?\n' | Where-Object { $_.Trim() -ne "" }

# Trim stray whitespace; each contact already has its own BEGIN:VCARD and END:VCARD
$contacts = $contacts | ForEach-Object { $_.Trim() }

# Initialize a hashtable to store unique contacts by their full name
$uniqueContacts = @{}
foreach ($contact in $contacts) {

    # Skip processing if the contact is empty
    if ([string]::IsNullOrWhiteSpace($contact)) {
        continue
    }

    # Extract the contact's full name (FN) to use as a unique identifier.
    # (?m) anchors the match to the start of a line, so "FN:" inside another field is ignored.
    try {
        if ($contact -match '(?m)^FN:(.*?)\r?$') {
            $fullName = $matches[1].Trim()
        } else {
            throw "Full name (FN) not found in contact."
        }
        if ([string]::IsNullOrWhiteSpace($fullName)) {
            throw "Full name is empty."
        }
    } catch {

        # Log a warning for the extraction error and increment the skipped contacts counter
        Write-Warning "Skipping contact due to extraction error: $_"
        $skippedContacts++
        continue
    }

    # If a contact with this full name already exists, merge the contacts
    if ($uniqueContacts.ContainsKey($fullName)) {
        $existingContact = $uniqueContacts[$fullName]

        # Merge the existing contact with the new contact
        $mergedContact = Merge-Contacts $existingContact $contact
        $uniqueContacts[$fullName] = $mergedContact
        $duplicateContacts++
        $mergedContacts++
    } else {

        # Add the new contact to the hashtable
        $uniqueContacts[$fullName] = $contact
    }

    # Increment the total contacts counter (skipped contacts are not counted)
    $totalContacts++
}

# Write the unique and normalized contacts to the output VCF file
$uniqueContacts.Values | Set-Content $outputVCF -Encoding UTF8

# Generate a report summarizing the normalization process
$report = @"
Normalization Report
=====================
Total Contacts Processed: $totalContacts
Duplicate Contacts Found: $duplicateContacts
Contacts Merged: $mergedContacts
Contacts Skipped: $skippedContacts
Unique Contacts Saved: $($uniqueContacts.Count)
Output File: $outputVCF
"@

# Write the report to a text file
$report | Set-Content $reportFile -Encoding UTF8

# Display the report on the console
Write-Host $report

# Notify that the normalization process is complete
Write-Host "Contacts have been normalized and saved to $outputVCF"
Write-Host "Normalization report saved to $reportFile"

If we take an example VCF file with zero corruption or duplication and run this script against it, nothing should change. Let's see...

Excellent, so that is a good test. Now let's duplicate some contacts and try the test again. What we expect is that the duplication is removed and the result is saved to UpdatedContacts.vcf...

Note : In this example we have duplicated Buttered Toast many times (23 to be precise)


Sweet, here you can see we have 28 contacts, of which 23 were duplicates of Buttered Toast. The script has removed those, and what was output is the original 5 contacts without duplication.
