PowerShell: Duplicate file finder (by File Hash)


There is plenty of software on the market for finding duplicate files. Almost all of it lists the duplicates and has you review and delete them ONE by ONE. So I decided to write my own PowerShell script to
* find the duplicates by file hash
* move the duplicate files to a given location

Moving all the duplicates to one directory in a single step was the easiest way to deal with them: I can then select the files and delete them all at once.

Here is the script (a usage example is in the comment block at the top of the script). ENJOY!

<#
  ___                       ___  
 (o o)                     (o o) 
(  V  ) Duplicate Remover (  V  )
--m-m-----------------------m-m--

This script finds duplicate files by
file hash and MOVES the duplicates to
a given location.

You can later review and delete all
duplicate files from the given location.

Script written by: Anand, the Awesome

Parameters:
Path: Path of the directory to check for duplicate files.
DuplicateFileMoveLocation: Path of the directory to move the duplicate files to.
If this directory doesn't exist, it will be created.

Example:
.\DuplicateFinder.ps1 -Path C:\Temp\Myfiles -DuplicateFileMoveLocation C:\Temp\Myfiles\Duplicates

#>
param(
    [Parameter(
        Mandatory = $true,
        ValueFromPipeline = $true,
        ValueFromPipelineByPropertyName = $true
    )]
    [string[]]$Path,

    $DuplicateFileMoveLocation = ""
)


Write-Host "Counting the files in $Path..."
# -File keeps subdirectories out of the total, matching the scan below.
$TotalCount = @(Get-ChildItem $Path -File).Count
Write-Host "$Path has $TotalCount files."
Write-Host "Started finding the Duplicates..."
# Get all duplicate files
$DuplicatePaths = 
    Get-ChildItem $Path -File | 
    Get-FileHash |
    Group-Object -Property Hash |
    Where-Object -Property Count -gt 1 |
    ForEach-Object {
        Write-Host "Duplicated File: $($_.Group.path)" -ForegroundColor Yellow
        $_.Group.Path | Select-Object -First ($_.Count -1)
    }

# @() makes .Count reliable when zero or one duplicate is found.
Write-Host "$(@($DuplicatePaths).Count) duplicate files found.`n"
Write-Warning ("The script found {0} duplicate files out of {1} total. The duplicates will be moved to {2}." -f @($DuplicatePaths).Count, $TotalCount, $DuplicateFileMoveLocation)
$answer = Read-Host -Prompt "Do you want to Proceed (Y or N)?"

if ($answer -eq 'y') {
    if ($DuplicateFileMoveLocation -ne "") { 
        # Create the duplicate file move directory if it does not exist.
        if (-not (Test-Path -Path $DuplicateFileMoveLocation)) {
            # Passing the full path avoids an empty parent when the location is a bare folder name.
            New-Item -Path $DuplicateFileMoveLocation -ItemType Directory | Out-Null
        }
        # Move the duplicates 
        $DuplicatePaths | ForEach-Object { 
            Write-Host "Moving $($_) to $DuplicateFileMoveLocation..." -ForegroundColor Red
            # -LiteralPath stops file names containing [ or ] from being treated as wildcards.
            Move-Item -LiteralPath $_ -Destination $DuplicateFileMoveLocation
        } 
    }
}
Write-Host "* * * Completed * * *" 
<#
End of the Script
#>
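
One thing to watch for when everything lands in one folder: `Move-Item` reports an error if the destination already holds a file with the same name. Below is a minimal sketch of one way around that. `Move-ItemWithUniqueName` is a hypothetical helper, not part of the script above, and the " (1)", " (2)" naming scheme is just an assumption.

```powershell
# Hypothetical helper (not part of the original script): moves a file and,
# if the name is already taken in the destination, appends " (1)", " (2)", ...
function Move-ItemWithUniqueName {
    param(
        [Parameter(Mandatory = $true)][string]$LiteralPath,
        [Parameter(Mandatory = $true)][string]$Destination
    )

    $file    = Get-Item -LiteralPath $LiteralPath
    $target  = Join-Path -Path $Destination -ChildPath $file.Name
    $counter = 1

    # Keep trying new names until one is free in the destination directory.
    while (Test-Path -LiteralPath $target) {
        $newName = '{0} ({1}){2}' -f $file.BaseName, $counter, $file.Extension
        $target  = Join-Path -Path $Destination -ChildPath $newName
        $counter++
    }

    Move-Item -LiteralPath $file.FullName -Destination $target
    $target   # return the final path so the caller can log it
}
```

In the script, the call inside the `ForEach-Object` loop could then become `Move-ItemWithUniqueName -LiteralPath $_ -Destination $DuplicateFileMoveLocation`.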

4 thoughts on “PowerShell: Duplicate file finder (by File Hash)”

    1. On Line 47: Add “-Recurse” option to Get-ChildItem. Replace line 47 with this line:
      Get-ChildItem $Path -File -Recurse

      That will do the sub-folders too.
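
      A caveat with `-Recurse`: the example in the script header puts the duplicates folder inside the scanned folder, so a recursive scan would also hash files that were already moved there. Here is a rough sketch of recursing while skipping one directory. `Get-FileExcludingDirectory` and its parameters are hypothetical names, not part of the script.

```powershell
# Hypothetical helper: lists files recursively, skipping one directory tree.
function Get-FileExcludingDirectory {
    param(
        [Parameter(Mandatory = $true)][string[]]$Path,
        [string]$ExcludeDirectory = ''
    )

    $files = Get-ChildItem -Path $Path -File -Recurse
    if ($ExcludeDirectory -eq '') { return $files }

    # Resolve to a full path even if the directory does not exist yet,
    # then end it with a separator so "Dup" does not also match "Duplicates2".
    $fullPath    = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($ExcludeDirectory)
    $excludeRoot = $fullPath.TrimEnd([char[]]'\/') + [System.IO.Path]::DirectorySeparatorChar

    $files | Where-Object {
        -not $_.FullName.StartsWith($excludeRoot, [System.StringComparison]::OrdinalIgnoreCase)
    }
}
```

      The result can be piped into `Get-FileHash` in place of the plain `Get-ChildItem` call. If you recurse, the file count at the start of the script needs `-Recurse` as well, or the totals will not match.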

    1. That’s neat. I always had issues with what to do when a script or program finds duplicates. Most scripts and programs ask the user which file to keep and which one to delete. My cases involved thousands of files, and I definitely don’t want to deal with hundreds of duplicates manually. That’s why I chose to move the duplicates to a folder, so I can review and delete them in one go.
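
      With thousands of files, hashing is usually the slow part. Files of different sizes cannot be identical, so one common shortcut (not used in the script above) is to group by size first and hash only the files that share a size with another file. A minimal sketch, assuming a hypothetical function name `Get-DuplicateGroup`:

```powershell
# Hypothetical helper: returns one group per set of identical files.
function Get-DuplicateGroup {
    param([Parameter(Mandatory = $true)][string[]]$Path)

    Get-ChildItem -Path $Path -File |
        Group-Object -Property Length |      # same size is required for equality
        Where-Object Count -gt 1 |           # unique sizes cannot be duplicates
        ForEach-Object { $_.Group } |        # unwrap back to the file objects
        Get-FileHash |                       # hash only the remaining candidates
        Group-Object -Property Hash |
        Where-Object Count -gt 1
}
```

      Each returned group has the shape the script already works with, so `$_.Group.Path | Select-Object -First ($_.Count - 1)` can pick the copies to move just as before.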
