What's new

GUID COPY DETECTOR

Cator

Senior Member
Hi,
Inspired by a user here on the forum, I created, with the help of AI, a small script that checks whether the files in a folder share the same GUID, the source of so many unexpected problems.
I have attached the code and hope you can test it.
I have included options to save a report to the desktop and to include subfolders, but I advise against scanning an entire disk, as it may require more memory than Alibre has available.
Please test it and give me feedback on any potential improvements.
Regards,
Francesco
Python:
# ============================================================================
# Script Name: GUID COPY DETECTOR
# Author: Cator
# Date: 05/03/2025
# Function: This script detects and reports duplicate files within a specified directory by calculating a GUID based on the SHA1 hash of each file.
#           It provides options to include subfolders in the scan and to save a report of duplicate files to the desktop.
# ============================================================================

import clr
import os
from os.path import expanduser
from System import Guid
from System.IO import File, Directory
from System.Security.Cryptography import SHA1CryptoServiceProvider
import io

# Function to calculate a GUID based on the file hash
def get_file_guid(file_path):
    try:
        with File.OpenRead(file_path) as f:
            sha1 = SHA1CryptoServiceProvider()
            hash_bytes = sha1.ComputeHash(f)
            return Guid(hash_bytes[:16])  # Create a GUID from the first 16 bytes of the hash
    except Exception as e:
        print("Error reading {}: {}".format(file_path, e))  # format string so IronPython 2 does not print a tuple
        return None

# Function to enumerate files in subfolders
def get_files(directory_path, include_subfolders=False):
    files = []
    if include_subfolders:
        for root, dirs, filenames in os.walk(directory_path):
            for filename in filenames:
                files.append(os.path.join(root, filename))
    else:
        files = Directory.GetFiles(directory_path)
    
    return files

# Main function
def find_duplicate_guids(directory_path, include_subfolders=False, save_report=False):
    if not Directory.Exists(directory_path):
        print("The folder does not exist.")
        return
    
    guids = {}  # Dictionary GUID -> List of files
    found_duplicates = False
    
    files = get_files(directory_path, include_subfolders)
    
    for file_name in files:
        guid = get_file_guid(file_name)
        if guid is not None:
            guid_str = str(guid)
            if guid_str in guids:
                guids[guid_str].append(file_name)
                found_duplicates = True
            else:
                guids[guid_str] = [file_name]
    
    # Print files with duplicate GUIDs
    if found_duplicates:
        print("Duplicate files found:")
        report_content = "Duplicate files found:\n"
        for guid, files in guids.items():
            if len(files) > 1:
                print("Duplicate GUID: {}".format(guid))
                report_content += "Duplicate GUID: {}\n".format(guid)
                for f in files:
                    print("  - {}".format(f))
                    report_content += "  - {}\n".format(f)
        
        if save_report:
            desktop_path = os.path.join(os.environ['USERPROFILE'], 'Desktop')
            report_path = os.path.join(desktop_path, "duplicates.txt")
            with io.open(report_path, "w", encoding="utf-8") as f:
                f.write(report_content)
            print("Report saved to {}".format(report_path))
    else:
        print("No duplicate files found.")
        if save_report:
            desktop_path = os.path.join(os.environ['USERPROFILE'], 'Desktop')
            report_path = os.path.join(desktop_path, "duplicates.txt")
            with io.open(report_path, "w", encoding="utf-8") as f:
                f.write("No duplicate files found.")
            print("Report saved to {}".format(report_path))

# Add options for including subfolders and saving the report
Win = Windows()
Options = [
     ['Select folder to scan', WindowsInputTypes.Folder, expanduser('~')],
     ['Include subfolders', WindowsInputTypes.Boolean, False],
     ['Save report to desktop', WindowsInputTypes.Boolean, False],
]
Values  = Win.OptionsDialog("GUID Duplicate Detector", Options, 300)

folder_path = Values[0]
include_subfolders = Values[1]
save_report = Values[2]

find_duplicate_guids(folder_path, include_subfolders, save_report)
 
I haven't tested it, but I don't believe the Alibre GUID is based on any hash. This script seems to compare file contents, so it only finds exact copies. The actual Alibre GUID could still be the same for files that were once identical but one or both were subsequently edited; this script would not find those.
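The distinction above can be sketched in plain Python (using `hashlib` rather than the .NET classes; the file names and contents here are made up for illustration): an exact byte-for-byte copy produces the same content hash, while a copy that has since been edited does not, so a content-hash scan misses it.

```python
import hashlib
import os
import tempfile

def content_hash(path):
    # SHA-1 of the file's bytes: identical only for exact byte-for-byte copies
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical files: an original, an exact copy, and an edited copy
tmp = tempfile.mkdtemp()
a = os.path.join(tmp, "a.AD_PRT")
b = os.path.join(tmp, "b.AD_PRT")  # exact copy of a
c = os.path.join(tmp, "c.AD_PRT")  # copy of a that was later edited
data = b"example part data"
for path, contents in [(a, data), (b, data), (c, data + b" edited")]:
    with open(path, "wb") as f:
        f.write(contents)

print(content_hash(a) == content_hash(b))  # True: exact copy is detected
print(content_hash(a) == content_hash(c))  # False: edited copy slips through
```

An internal identifier stored inside the file (like the Alibre GUID) survives edits, which is exactly why the two approaches find different sets of duplicates.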

For the actual Alibre GUID, you can see this script I wrote a while ago (and the issues we had when files were on a network drive).
 
I'll try your script, but from my tests mine works. Thanks for showing me the link. If you have a way to test it, I'd be grateful!
 