Scripting : Searching Website for Content

Well, this is a fun one. I have been on a mission to find a new car for a while, and I thought scripting could make it easier to find one on a manufacturer's website, mainly in the used car market. In this example I will be using BMW (looking but never buying one, as the search is the interesting part).

This script will look for cars that meet the requirements I give it. For testing, however, let's start with a simple script that searches for the word "harman" in the specification of every car returned by the search. That makes a good test because it is an option we can already filter for on the BMW website.

The links I have used are live, so the results will change as cars are sold and acquired, and the exact link is not important. BMW gives you certain options that are searchable, but the ones I want are not, hence the script for this activity. For now, on with "harman".

Preflight

Before you can run this script you need some prerequisites downloaded and loaded into PowerShell. This is a one-time step, and these are the commands:

Install-Module -Name PowerShellGet -Force -SkipPublisherCheck
Install-Module -Name Selenium -Force
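
Since the install only needs to happen once, it can be handy to check what is already present before running it again. The snippet below is a minimal sketch of that check; Test-ModuleAvailable is a hypothetical helper name of mine, not part of either module.

```powershell
# Hypothetical helper: returns $true when a module with the given name is installed.
function Test-ModuleAvailable {
    param (
        [Parameter(Mandatory)]
        [string]$Name
    )
    # Get-Module -ListAvailable returns nothing when the module is not installed
    return [bool](Get-Module -ListAvailable -Name $Name)
}

foreach ($module in 'PowerShellGet', 'Selenium') {
    if (Test-ModuleAvailable -Name $module) {
        Write-Host "$module is already installed."
    } else {
        Write-Host "$module is missing, run the Install-Module command for it."
    }
}
```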

Preflight : ChromeDriver

ChromeDriver is what lets a script drive Chrome for testing, and you can get it from this link here

When you click that link, look in the new window for the latest "stable" release, which at the time of this article is v127.0.6533.72 as below, then click on the stable hyperlink:


We are then looking for the "chromedriver" entry for "win64". Ensure it shows an HTTP 200 in green as below, then copy the link and download it, which in this example is the link here


Once downloaded, extract the files to a folder. In this example I will use C:\ChromeDriver, which should look like this:
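
The scripts later in this article point at that folder, so a quick check that the executable really landed there can save some head scratching. This is only a sketch: Test-ChromeDriverPresent is a hypothetical helper, and it assumes the executable is called chromedriver.exe and sits directly in the folder rather than in a subfolder.

```powershell
# Hypothetical helper: $true when chromedriver.exe exists directly inside $Folder.
function Test-ChromeDriverPresent {
    param (
        [Parameter(Mandatory)]
        [string]$Folder
    )
    $exePath = [System.IO.Path]::Combine($Folder, 'chromedriver.exe')
    return [System.IO.File]::Exists($exePath)
}

if (Test-ChromeDriverPresent -Folder 'C:\ChromeDriver') {
    Write-Host 'chromedriver.exe found.'
} else {
    Write-Host 'chromedriver.exe not found, check where the zip was extracted.'
}
```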


Overview : Chrome-CarSearch.ps1 Runtime Logic

This uses ChromeDriver to start a browser session and then, as with a web transaction, navigates around the website based on the information you provide. You therefore need to understand HTML and its elements to customise it.

Note : This script works with a single car or multiple cars, however I have not had more than one page of cars. If you have multiple pages you will need to add that to the logic of this script!

Chrome Driver v Chrome Versions

Note : Please ensure the installed version of Chrome matches the ChromeDriver version. I had Chrome v120.x installed with the v127.x ChromeDriver and that failed to work, so ensure both are current on the stable branch.

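To check the version of ChromeDriver, open a PowerShell window and type the command below. This assumes the ChromeDriver folder is on your PATH; if it is not, call C:\ChromeDriver\chromedriver.exe with the same switch instead: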

ChromeDriver --version

This will tell you the version of ChromeDriver as below:


Then start Chrome, click the "three dots", then Help > About Chrome, and check that the version matches as closely as possible:
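
If you would rather compare the two version strings in PowerShell than by eye, something like the sketch below would do it. It assumes only the major version (the first number) needs to line up, and both function names are hypothetical helpers of mine. The Chrome value is a stand-in for the v120.x install mentioned in the note above.

```powershell
# Hypothetical helper: pulls the first four-part version number out of a string
# and returns its major part, e.g. 127 from "ChromeDriver 127.0.6533.72".
function Get-MajorVersion {
    param (
        [Parameter(Mandatory)]
        [string]$Text
    )
    if ($Text -match '(\d+)\.\d+\.\d+\.\d+') {
        return [int]$Matches[1]
    }
    throw "No version number found in '$Text'"
}

# Hypothetical helper: $true when both strings carry the same major version.
function Test-DriverMatchesChrome {
    param (
        [Parameter(Mandatory)][string]$DriverVersionText,
        [Parameter(Mandatory)][string]$ChromeVersionText
    )
    $driverMajor = Get-MajorVersion -Text $DriverVersionText
    $chromeMajor = Get-MajorVersion -Text $ChromeVersionText
    return ($driverMajor -eq $chromeMajor)
}

# Stand-in values: a v127 driver against a v120.x browser
Test-DriverMatchesChrome -DriverVersionText 'ChromeDriver 127.0.6533.72' -ChromeVersionText '120.0.0.0'
```

With those stand-in values the last line reports False, which is the mismatch I ran into.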



Define your Script Goals

In this example the goal is simple and outlined below in steps:

  1. Open the BMW Used Car website with the pre-populated search results
  2. Click the Accept Cookies button (damn Cookies in the EU)
  3. Scan the website
  4. Click the "View Specs" button
  5. In the specification look for the word "harman"
  6. Whether or not the word is found, close the "View specs" window with the cross
  7. If the word is found : Extract the HREF from "View details"
  8. If the word is not found : Ignore that car on the website
  9. Provide the full URL for the cars that match the keyword

Pre-crafted URL 

This is something the website does for you. When you visit the website you start with a URL called the base URL, and as you add your options the URL changes from the base URL to include a query string like this:

connectivity=DAB&connectivity=USB&connectivity=SAT_NAV&distance=600&drive=PDC&drive=REVERSE_CAMERA_ASS&exterior=XENON&exterior=ELEC_FOLD_MIRRORS&interior=HEATED_SEATS&interior=LUMBAR_SUPPORT&interior=HEAD_UP_DISPLAY&max_supplied_price=22000&source=home&transmission=Automatic

These are the options you select on the website, and together they form your pre-crafted URL. They need to be added to the base URL with a "?". This joining is not done in code, but the full URL is required by the code, so as an example:

https://scan.bear.local/results/?<pre-crafted-url>
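
I copy the URL straight from the browser, but if you wanted to build it in code instead, the sketch below shows one way. ConvertTo-SearchQuery is a hypothetical helper, and the option names are just a few taken from the query string above. An option with several values is repeated once per value, which is how the website's own URL does it.

```powershell
# Hypothetical helper: turns an ordered set of options into "key=value&key=value".
# An option holding several values is emitted once per value.
function ConvertTo-SearchQuery {
    param (
        [Parameter(Mandatory)]
        [System.Collections.Specialized.OrderedDictionary]$Options
    )
    $pairs = foreach ($key in $Options.Keys) {
        foreach ($value in @($Options[$key])) {
            # Escape both parts so spaces or symbols survive in a URL
            '{0}={1}' -f [uri]::EscapeDataString($key), [uri]::EscapeDataString([string]$value)
        }
    }
    return ($pairs -join '&')
}

$options = [ordered]@{
    connectivity       = 'DAB', 'USB', 'SAT_NAV'
    max_supplied_price = 22000
    transmission       = 'Automatic'
}

# Base URL as used in the example above
$baseUrl = 'https://scan.bear.local/results/'
$preCraftedUrl = $baseUrl + '?' + (ConvertTo-SearchQuery -Options $options)
Write-Host $preCraftedUrl
```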

Script : Chrome-CarSearch-Test.ps1

# Load Selenium Module
Import-Module Selenium

# Define log file path
$logFilePath = "log.txt"

# Function to log messages to file and console
function Log-Message {
    param (
        [string]$Message
    )
    $Message | Out-File -FilePath $logFilePath -Append
    Write-Host $Message
}

# Define ChromeOptions (if needed)
$ChromeOptions = New-Object OpenQA.Selenium.Chrome.ChromeOptions

# Specify the path to the ChromeDriver executable
$ChromeDriverPath = "C:\ChromeDriver"
$ChromeDriverService = [OpenQA.Selenium.Chrome.ChromeDriverService]::CreateDefaultService($ChromeDriverPath)

# Start the WebDriver for Chrome
$Driver = New-Object OpenQA.Selenium.Chrome.ChromeDriver($ChromeDriverService, $ChromeOptions)

# Open the URL
$Driver.Navigate().GoToUrl('<precrafted_url>')

# Wait for the page to load
Start-Sleep -Seconds 10
Log-Message "Navigated to URL."

# Accept cookies by clicking the button with the specified class and span text
try {
    $acceptCookiesButton = $Driver.FindElementByXPath("//button[@class='accept-button button-primary']//span[contains(text(),'Accept all')]")
    if ($acceptCookiesButton -ne $null) {
        $acceptCookiesButton.Click()
        Start-Sleep -Seconds 5
        Log-Message "Accepted cookies."
    }
} catch {
    Log-Message "No cookie consent button found or failed to click: $_"
}

# Find all 'View specs' <span> elements
$viewSpecsSpans = $Driver.FindElementsByXPath("//span[contains(text(),'View specs')]")

# Initialize an array to store matching URLs
$matchingCarUrls = @()

# Iterate over each 'View specs' span and check specifications
foreach ($span in $viewSpecsSpans) {
    try {
        # Click on the 'View specs' span
        Log-Message "Clicking on View specs..."
        $span.Click()

        # Wait for the modal to appear and load
        Start-Sleep -Seconds 5

        # Check if 'Harman' is mentioned in <li> elements (case-insensitive)
        $liElements = $Driver.FindElementsByXPath("//li[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'harman')]")
        if ($liElements.Count -gt 0) {
            Log-Message "Harman found in specifications."

            # Find the 'View details' button and extract the URL
            $viewDetailsButton = $Driver.FindElementByXPath("//a[contains(span/text(),'View details')]")
            if ($viewDetailsButton -ne $null) {
                # Get the href attribute and construct the full URL
                $href = $viewDetailsButton.GetAttribute('href')
                $fullUrl = $href
                Log-Message "Found View details button with URL: $fullUrl"
                $matchingCarUrls += $fullUrl
            } else {
                Log-Message "View details button not found."
            }
        } else {
            Log-Message "Harman not found in specifications."
        }

        # Close the modal by clicking the close button
        $closeButton = $Driver.FindElementByXPath("//button[contains(@class,'uvl-c-modal__close')]")
        if ($closeButton -ne $null) {
            Log-Message "Closing the modal..."
            $closeButton.Click()
        } else {
            Log-Message "Close button not found."
        }

        # Wait for the close action to complete
        Start-Sleep -Seconds 2
    } catch {
        Log-Message "An error occurred while processing details: $_"
    }
}

# Print all matching car URLs
if ($matchingCarUrls.Count -gt 0) {
    Log-Message "Matching URLs:"
    foreach ($url in $matchingCarUrls) {
        Log-Message $url
    }
} else {
    Log-Message "No matching URLs found."
}

# Close the browser
$Driver.Quit()
Log-Message "Browser closed."

This is what the results look like, and as you can see we have some magical links that take us directly to the car...


The log file captures this as well, along with any additional errors.
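
The script leans on fixed Start-Sleep delays, which are either too long or, on a slow day, too short. One possible refinement is to poll until something is ready. The sketch below is plain PowerShell and does not touch Selenium itself; Wait-Until is a hypothetical helper of mine, and the counter at the end only stands in for "the page is not ready yet".

```powershell
# Hypothetical helper: re-runs $Condition until it returns something truthy
# or the timeout expires. Returns that value, or $null on timeout.
function Wait-Until {
    param (
        [Parameter(Mandatory)][scriptblock]$Condition,
        [int]$TimeoutSeconds = 10,
        [int]$PollMilliseconds = 250
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    do {
        try {
            $result = & $Condition
            if ($result) { return $result }
        } catch {
            # Treat errors (for example an element that is not there yet) as "not ready"
        }
        Start-Sleep -Milliseconds $PollMilliseconds
    } while ((Get-Date) -lt $deadline)
    return $null
}

# Stand-in condition: becomes ready on the third attempt
$state = @{ Attempts = 0 }
$value = Wait-Until -Condition {
    $state.Attempts++
    if ($state.Attempts -ge 3) { 'ready' }
} -TimeoutSeconds 5
Write-Host "Result: $value after $($state.Attempts) attempts"
```

In the car script the condition would be a script block that looks up an element with $Driver, for example the close button, in place of the Start-Sleep before it.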

Mission control : Live Search 🔍 

So far we have tested with a car that has the keyword included. It’s good that it works, but we now need to do the live test with the actual specification required.

I have updated this from one keyword to multiple keywords to make sure the correct results come back. The keywords use an "or" condition, which means a car is returned when any one of the keywords is present.
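
To show the "or" idea on its own, away from the browser, here is a minimal sketch. Test-AnyKeyword is a hypothetical helper and the two specification lines are made-up examples. It does a case-insensitive substring test. PowerShell's -contains operator checks collection membership and is not a substring test.

```powershell
# Hypothetical helper: $true when $Text contains at least one of $Keywords,
# ignoring case.
function Test-AnyKeyword {
    param (
        [Parameter(Mandatory)][AllowEmptyString()][string]$Text,
        [Parameter(Mandatory)][string[]]$Keywords
    )
    foreach ($keyword in $Keywords) {
        # IndexOf with OrdinalIgnoreCase is a case-insensitive substring test
        if ($Text.IndexOf($keyword, [System.StringComparison]::OrdinalIgnoreCase) -ge 0) {
            return $true
        }
    }
    return $false
}

$keywords = @('active cruise control', 'driver assistance professional')

# Made-up specification lines for illustration
Test-AnyKeyword -Text 'Active Cruise Control with Stop & Go' -Keywords $keywords
Test-AnyKeyword -Text 'Heated front seats' -Keywords $keywords
```

The first call reports True because one keyword is present, and the second reports False because neither is.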

Script : Chrome-CarSearch.ps1

# Load Selenium Module
Import-Module Selenium

# Define log file path
$logFilePath = "C:\Quarantine\CarSearch\log.txt"

# Function to log messages to file and console
function Log-Message {
    param (
        [string]$Message
    )
    $Message | Out-File -FilePath $logFilePath -Append
    Write-Host $Message
}

# Define ChromeOptions (if needed)
$ChromeOptions = New-Object OpenQA.Selenium.Chrome.ChromeOptions

# Specify the path to the ChromeDriver executable
$ChromeDriverPath = "C:\ChromeDriver"
$ChromeDriverService = [OpenQA.Selenium.Chrome.ChromeDriverService]::CreateDefaultService($ChromeDriverPath)

# Start the WebDriver for Chrome
$Driver = New-Object OpenQA.Selenium.Chrome.ChromeDriver($ChromeDriverService, $ChromeOptions)

# Open the URL for cookie preferences
$Driver.Navigate().GoToUrl('https://usedcars.bmw.co.uk/eprivacy/')
Start-Sleep -Seconds 10
Log-Message "Navigated to the cookie preferences URL."

# Click the "Reject" button
try {
    $rejectButton = $Driver.FindElementByXPath("//button[@class='reject-button button-primary']//span[contains(text(),'Reject')]")
    if ($rejectButton -ne $null) {
        $rejectButton.Click()
        Start-Sleep -Seconds 5
        Log-Message "Clicked Reject button."
    }
} catch {
    Log-Message "No Reject button found or failed to click: $_"
}

# Navigate to the original URL
$Driver.Navigate().GoToUrl('https://usedcars.bmw.co.uk/result/?connectivity=DAB&connectivity=USB&connectivity=SAT_NAV&distance=600&drive=PDC&drive=REVERSE_CAMERA_ASS&exterior=XENON&exterior=ELEC_FOLD_MIRRORS&interior=HEATED_SEATS&interior=LUMBAR_SUPPORT&interior=HEAD_UP_DISPLAY&max_supplied_price=22000&source=home&transmission=Automatic')
Start-Sleep -Seconds 10
Log-Message "Navigated to the original URL."

# Find all 'View specs' <span> elements
$viewSpecsSpans = $Driver.FindElementsByXPath("//span[contains(text(),'View specs')]")

# Initialize an array to store matching URLs
$matchingCarUrls = @()

# Keywords to search for
$keywords = @("active cruise control", "driver assistance professional")

# Iterate over each 'View specs' span and check specifications
foreach ($span in $viewSpecsSpans) {
    try {
        # Click on the 'View specs' span
        Log-Message "Clicking on View specs..."
        $span.Click()

        # Wait for the modal to appear and load
        Start-Sleep -Seconds 5

        # Check if any of the keywords are mentioned in <li> elements (case-insensitive)
        $liElements = $Driver.FindElementsByXPath("//li")
        $found = $false
        foreach ($li in $liElements) {
            $text = $li.Text.ToLower()
            foreach ($keyword in $keywords) {
                # Substring test; $text and the keywords are both lower-case
                if ($text.Contains($keyword)) {
                    $found = $true
                    break
                }
            }
            if ($found) { break }
        }

        if ($found) {
            Log-Message "Keyword found in specifications."

            # Find the 'View details' button and extract the URL
            $viewDetailsButton = $Driver.FindElementByXPath("//a[contains(span/text(),'View details')]")
            if ($viewDetailsButton -ne $null) {
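                # Get the href attribute and construct the full URL
                $href = $viewDetailsButton.GetAttribute('href')
                # Selenium usually returns href as an absolute URL, so only prefix relative values
                $fullUrl = if ($href -match '^https?://') { $href } else { "https://usedcars.bmw.co.uk" + $href }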
                Log-Message "Found View details button with URL: $fullUrl"
                $matchingCarUrls += $fullUrl
            } else {
                Log-Message "View details button not found."
            }
        } else {
            Log-Message "Keywords not found in specifications."
        }

        # Close the modal by clicking the close button
        $closeButton = $Driver.FindElementByXPath("//button[contains(@class,'uvl-c-modal__close')]")
        if ($closeButton -ne $null) {
            Log-Message "Closing the modal..."
            $closeButton.Click()
        } else {
            Log-Message "Close button not found."
        }

        # Wait for the close action to complete
        Start-Sleep -Seconds 2

    } catch {
        Log-Message "An error occurred while processing details: $_"
    }
}

# Print all matching car URLs
if ($matchingCarUrls.Count -gt 0) {
    Log-Message "Matching URLs:"
    foreach ($url in $matchingCarUrls) {
        Log-Message $url
    }
} else {
    Log-Message "No matching URLs found."
}

# Close the browser
$Driver.Quit()
Log-Message "Browser closed."
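
The script builds the full URL from the href of the "View details" link. Depending on what the driver hands back, that href may be relative or already absolute, so as an alternative sketch System.Uri can resolve it against the site address in both cases. Resolve-CarUrl is a hypothetical helper and the /vehicle/123 path is made up for illustration.

```powershell
# Hypothetical helper: returns an absolute URL whether $Href is relative
# or already absolute.
function Resolve-CarUrl {
    param (
        [Parameter(Mandatory)][string]$BaseUrl,
        [Parameter(Mandatory)][string]$Href
    )
    # The (Uri, string) constructor keeps $Href as-is when it is absolute
    # and resolves it against $BaseUrl when it is relative
    $resolved = [System.Uri]::new([System.Uri]$BaseUrl, $Href)
    return $resolved.AbsoluteUri
}

$site = 'https://usedcars.bmw.co.uk'

# Made-up paths for illustration; both lines print the same absolute URL
Write-Host (Resolve-CarUrl -BaseUrl $site -Href '/vehicle/123')
Write-Host (Resolve-CarUrl -BaseUrl $site -Href 'https://usedcars.bmw.co.uk/vehicle/123')
```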
