If you have two files, in this example we have 2 files and the goal here is to take two files formatted as CSV files in this instance we have the original list and new list (maybe a fresh export) and then from that we want only new entries in a separate file.
full-list.csv - A full export list of dataupdated-list.csv - A delta list that should ideally contain new people
We therefore want Powershell to give us a new file that is called filtered-list.csv that contains only new people from the update-list.csv when compared to the full-list.csv, this is how I accomplished this:
# Define the file paths
$fullListPath = "full-list.csv"
$updateListPath = "updated-list.csv"
$outputPath = "filtered-list.csv"
# Import the CSV files
$fullList = Import-Csv -Path $fullListPath
$updateList = Import-Csv -Path $updateListPath
# Assume the common column is 'Email' (change as needed)
$commonColumn = "Email"
# Check if the common column exists in both files
if (-not $fullList[0].PSObject.Properties.Match($commonColumn)) {
Write-Error "The column '$commonColumn' does not exist in the full list file."
exit
}
if (-not $updateList[0].PSObject.Properties.Match($commonColumn)) {
Write-Error "The column '$commonColumn' does not exist in the update list file."
exit
}
# Create a hashset of the emails from the full list for quick lookup
$fullListEmails = @{}
foreach ($entry in $fullList) {
if ($entry.$commonColumn -and $entry.$commonColumn.Trim() -ne "") {
$fullListEmails[$entry.$commonColumn.Trim()] = $true
} else {
Write-Warning "Skipping an entry in 'fulllist.csv' with missing or empty '$commonColumn'."
}
}
# Filter the update list to include only the new entries that are in the update list but not in the full list
$newEntries = $updateList | Where-Object {
$_.$commonColumn -and $_.$commonColumn.Trim() -ne "" -and -not $fullListEmails.ContainsKey($_.$commonColumn.Trim())
}
# Export the new entries to a new CSV file
$newEntries | Export-Csv -Path $outputPath -NoTypeInformation
Write-Output "Filtered list saved to $outputPath"