Tiny PowerShell Project 3 - Checksum verification

Message Digest Algorithm 5

I cannot count the number of times I have had to work with engineers who held fancy titles but simply didn't know how to do a basic checksum verification.

MD5 stands for "Message Digest Algorithm 5." It is a widely used cryptographic hash function that takes an input (message or data) of arbitrary length and produces a fixed-size 128-bit (16-byte) hash value, usually written as 32 hexadecimal characters. MD5 was designed to produce a compact fingerprint of the input that makes it hard to recover the original data from the hash or to find two different inputs that produce the same hash (collision resistance).

It's important to note that MD5 is now considered cryptographically broken: practical collision attacks exist, so it is not recommended for security purposes such as digital signatures or password storage. However, MD5 still has non-security-related uses, like checksum verification to detect accidental corruption in non-critical applications or generating identifiers for non-security-critical tasks.

To get the MD5 hash of a single file using PowerShell, you can use the Get-FileHash cmdlet. Here's how you can do it:

# Define the file path of the image
$filePath = "C:\Path\To\Your\Image.jpg"

# Get the MD5 hash of the file
$hashInfo = Get-FileHash -Algorithm MD5 -Path $filePath

# Output the MD5 hash value
Write-Host "MD5 hash of $($filePath): $($hashInfo.Hash)"
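
Computing the hash is only half of a checksum verification; the other half is comparing it with the value the publisher lists. Here is a minimal sketch of that comparison. To stay self-contained it writes the text hello to a temporary file (the file name, its contents, and the variable names are made up for illustration); in practice $expectedHash would be the checksum published next to the download.

```powershell
# Create a throwaway file with known contents (illustrative only)
$filePath = Join-Path ([System.IO.Path]::GetTempPath()) "checksum-demo.txt"
[System.IO.File]::WriteAllText($filePath, "hello")

# The checksum we expect; this is the well-known MD5 of the text "hello"
$expectedHash = "5d41402abc4b2a76b9719d911017c592"

# Compute the actual MD5 hash of the file
$actualHash = (Get-FileHash -Algorithm MD5 -Path $filePath).Hash

# -eq ignores case for strings, so upper- and lowercase hex digests compare equal
$isMatch = $actualHash -eq $expectedHash

if ($isMatch) {
    Write-Host "Checksum OK: $actualHash"
} else {
    Write-Host "Checksum MISMATCH: expected $expectedHash but got $actualHash"
}

# Clean up the demo file
Remove-Item -Path $filePath
```

Because the comparison is case-insensitive, it does not matter whether the published checksum is written in upper- or lowercase.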

Before we dive into today's tiny project, let's talk about hashtables. In PowerShell, a hashtable is a data structure used to store key-value pairs. It allows you to associate a value (data) with a unique key (identifier). Hashtables are useful when you need to quickly look up values by key, which makes them efficient for data retrieval.

In PowerShell, you can create a hashtable using the @{} notation. Within each pair, the key and the value are joined by =, and pairs are separated by a line break or a semicolon. Here's the general syntax of a hashtable:

$hashtable = @{
    Key1 = Value1
    Key2 = Value2
    ...
}

Now, let's go through some CRUD (Create, Read, Update, Delete) examples with hashtables:

1. Create a hashtable: You can create a new hashtable and add key-value pairs to it using the @{} notation:

# Create a new hashtable
$personInfo = @{
    Name = "John Doe"
    Age = 30
    Occupation = "Software Engineer"
}

2. Read from a hashtable: To access the value associated with a specific key, you can use the key inside square brackets:

# Access specific values using keys
$personName = $personInfo["Name"]
$personAge = $personInfo["Age"]

Write-Host "Name: $personName, Age: $personAge"

3. Update a hashtable: You can update the value associated with a key or add new key-value pairs to an existing hashtable:

# Update values or add new key-value pairs
$personInfo["Age"] = 31  # Update Age to 31
$personInfo["City"] = "New York"  # Add a new key-value pair

# Print the updated hashtable
$personInfo

4. Delete from a hashtable: To remove a key-value pair from a hashtable, you can use the Remove() method:

# Delete a key-value pair
$personInfo.Remove("City")

# Print the hashtable after deletion
$personInfo
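
Two behaviours are worth knowing before hashtables go into a real script: by default, reading a key that doesn't exist returns $null rather than raising an error, and Add() throws if the key is already present, whereas index assignment silently overwrites. A short sketch, using a made-up $inventory hashtable:

```powershell
$inventory = @{ Apples = 3 }

# Reading a missing key yields $null instead of an error
$missing = $inventory["Pears"]
$missingIsNull = $null -eq $missing

# Index assignment overwrites an existing key without complaint
$inventory["Apples"] = 5

# Add() refuses to overwrite: it throws when the key already exists
$addFailed = $false
try {
    $inventory.Add("Apples", 10)
} catch {
    $addFailed = $true
}

Write-Host "Missing key is null: $missingIsNull"
Write-Host "Add() on an existing key failed: $addFailed"
Write-Host "Apples is still: $($inventory['Apples'])"
```

This is why a script that uses Add() should check ContainsKey() first, or use index assignment when overwriting is acceptable.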

Now that we have learned how to calculate the MD5 hash of a file and how to work with hashtables in PowerShell, let's write a script that detects duplicate images. The idea is that files with identical contents produce identical hash values, so a repeated hash value points to a duplicate copy. We store each computed hash, along with the name of the file it came from, in a hashtable. For every file we first check whether its hash is already a key in the hashtable: if it is not, we add it; if it is, we have found a duplicate image.

# Define the folder path where the images are located
$folderPath = "C:\Users\Username\Pictures"

# Initialize a hash table to store the MD5 hashes and their corresponding files
$hashTable = @{}

# Get all image files in the folder and subfolders
$imageFiles = Get-ChildItem -Path $folderPath -Recurse -Include *.jpg, *.jpeg, *.png, *.bmp, *.gif

# Loop through each image file
foreach ($imageFile in $imageFiles) {
    # Get the MD5 hash of the image file
    $hashInfo = Get-FileHash -Algorithm MD5 -Path $imageFile.FullName
    $hash = $hashInfo.Hash

    # Check if the hash already exists in the hash table
    if ($hashTable.ContainsKey($hash)) {
        # The hash already exists, so the image file is a duplicate of the stored file
        Write-Host "Duplicate image found: $($imageFile.FullName) (same content as $($hashTable[$hash]))"
    } else {
        # Add the hash to the hash table
        $hashTable.Add($hash, $imageFile.FullName)
    }
}
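
The loop above reports each duplicate as it is encountered. If you would rather see every set of identical files together, one alternative is to let Group-Object do the bookkeeping. The sketch below is a variation on the idea, not a replacement for the script above. It builds a small temporary folder with made-up file names (plain text files that only pretend to be images) so that it can run anywhere; point $folderPath at a real folder and drop the setup and cleanup lines to use it for real.

```powershell
# Build a throwaway folder with three files, two of which have identical contents
$folderPath = Join-Path ([System.IO.Path]::GetTempPath()) ("dupe-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $folderPath | Out-Null
[System.IO.File]::WriteAllText((Join-Path $folderPath "a.jpg"), "same bytes")
[System.IO.File]::WriteAllText((Join-Path $folderPath "b.jpg"), "same bytes")
[System.IO.File]::WriteAllText((Join-Path $folderPath "c.png"), "different bytes")

# Hash every file, then keep only the hashes shared by more than one file
$duplicateGroups = @(
    Get-ChildItem -Path $folderPath -Recurse -File |
        Get-FileHash -Algorithm MD5 |
        Group-Object -Property Hash |
        Where-Object { $_.Count -gt 1 }
)

# Print each set of identical files under its shared hash
foreach ($group in $duplicateGroups) {
    Write-Host "Files sharing MD5 $($group.Name):"
    foreach ($entry in $group.Group) {
        Write-Host "  $($entry.Path)"
    }
}

# Clean up the demo folder
Remove-Item -Path $folderPath -Recurse -Force
```

The trade-off is that every file is hashed before anything is printed, whereas the hashtable loop reports duplicates as it goes.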

There you have it! Before we wrap up, let's look at some of the useful methods and properties available to us.

In PowerShell, hashtables are implemented as System.Collections.Hashtable objects, which means they have several useful methods and properties that you can use to work with the data they contain. Here are some of the most commonly used ones:

  1. Add(key, value): Adds a new key-value pair to the hashtable.
$myHashtable = @{}
$myHashtable.Add("Name", "John")
$myHashtable.Add("Age", 30)
  2. Remove(key): Removes a key-value pair from the hashtable based on the specified key.
$myHashtable.Remove("Age")
  3. ContainsKey(key): Checks if the hashtable contains a specific key and returns True or False.
if ($myHashtable.ContainsKey("Name")) {
    Write-Host "Name exists in the hashtable."
}
  4. ContainsValue(value): Checks if the hashtable contains a specific value and returns True or False.
if ($myHashtable.ContainsValue("John")) {
    Write-Host "The value 'John' exists in the hashtable."
}
  5. Clear(): Removes all key-value pairs from the hashtable, making it empty.
# After this call the hashtable is empty; add entries again before trying the snippets below
$myHashtable.Clear()
  6. GetEnumerator(): Returns an enumerator that allows you to iterate through the key-value pairs in the hashtable.
$enumerator = $myHashtable.GetEnumerator()
while ($enumerator.MoveNext()) {
    Write-Host "$($enumerator.Key): $($enumerator.Value)"
}
  7. Count: A property that holds the number of key-value pairs in the hashtable.
$numberOfItems = $myHashtable.Count
Write-Host "Number of items in the hashtable: $numberOfItems"
  8. Keys: A property that returns a collection of all the keys in the hashtable. Wrap it in @() if you need an actual array.
$keysArray = @($myHashtable.Keys)
  9. Values: A property that returns a collection of all the values in the hashtable. Wrap it in @() if you need an actual array.
$valuesArray = @($myHashtable.Values)
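
In day-to-day scripts, a foreach loop over GetEnumerator() is a more common way to walk a hashtable than calling MoveNext() by hand, and sorting by key gives a predictable order, since a plain hashtable does not promise one. A small sketch with a made-up $settings hashtable:

```powershell
$settings = @{
    Theme    = "Dark"
    FontSize = 12
    Language = "en"
}

# Each entry exposes .Key and .Value; Sort-Object makes the output order predictable
$lines = foreach ($entry in ($settings.GetEnumerator() | Sort-Object -Property Key)) {
    "$($entry.Key) = $($entry.Value)"
}
$lines | ForEach-Object { Write-Host $_ }

# Copy the keys into a real array so they can be indexed or counted independently
$keyArray = @($settings.Keys)
Write-Host "Key count: $($keyArray.Count)"
```

If insertion order matters, PowerShell's [ordered]@{} literal is the usual alternative to sorting after the fact.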

For more detail, see Microsoft's about_Hash_Tables page and the .NET documentation for the Hashtable class.
