Inventory Large Folders

This is a fairly simple setup using DCM (Desired Configuration Management). Say you want to find out where disk space is being used on machines, either to find user caches of data that aren’t backed up or to pick folders to clean up when drives get full.

Warning: This is a very CPU-intensive process. I recommend testing in your environment, as execution times will vary greatly between individual configurations.
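To get a rough feel for the cost before deploying anything, you could time a recursive scan of a single folder on a test machine. This is only a sketch: `Measure-FolderScan` is a hypothetical helper name, not part of the script below, and it uses the slower file-by-file method, so treat the numbers as an upper-bound estimate.

```powershell
# Hypothetical helper: times a file-by-file recursive scan of one folder.
function Measure-FolderScan {
    param([string] $Path)

    $timer = [System.Diagnostics.Stopwatch]::StartNew()

    # Collect files only; access errors are suppressed, as in the main script.
    $files = @(Get-ChildItem -LiteralPath $Path -Recurse -Force -ErrorAction SilentlyContinue |
        Where-Object { -not $_.PSIsContainer })

    $bytes = 0
    foreach ($file in $files) { $bytes += $file.Length }

    $timer.Stop()

    New-Object PSObject -Property @{
        Path    = $Path
        Files   = $files.Count
        SizeMB  = [Math]::Round($bytes / 1MB)
        Seconds = [Math]::Round($timer.Elapsed.TotalSeconds, 1)
    }
}

# Example: scan the current user's temp folder.
Measure-FolderScan -Path ([System.IO.Path]::GetTempPath())
```

Running it against a few representative folders on a slow machine should give you an idea of whether the default DCM timeout is likely to be enough.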

You may also want to extend the DCM script execution timeout so the item doesn’t report as failed whenever the script runs long. This is optional, however, as it will still run. See http://blogs.msdn.com/b/fei_xias_blog/archive/2013/10/21/system-center-2012-configmgr-using-vbs-to-extend-the-dcm-script-execution-timeout-value.aspx for instructions. I set mine to 1200 seconds, which is 20 minutes. You could also deploy this as an SCCM package that just runs the PowerShell file.
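If you go the package route, the program command line could look something like the following. This is a sketch under assumptions: it assumes the script is saved as Get-DirStats.ps1 (the name used in the script header) in the package source folder, and that bypassing the execution policy is acceptable in your environment.

```powershell
# Hypothetical package command line; the file name and location are assumptions.
powershell.exe -NoProfile -ExecutionPolicy Bypass -File .\Get-DirStats.ps1
```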

First, set up a configuration item:

[Screenshots: Large Folders 1 to 4]

Then put this in as the script to run. The inspirations and the parts of the codebase I borrowed are documented in the comments; all of it has been heavily modified for this purpose.

# Get-DirStats.ps1
# Written by Bill Stewart (bstewart@iname.com)
# Outputs file system directory statistics.
# https://gallery.technet.microsoft.com/scriptcenter/Outputs-directory-size-964d07ff
# http://blogs.technet.com/b/heyscriptingguy/archive/2012/05/25/getting-directory-sizes-in-powershell.aspx

#requires -version 2

<#
.SYNOPSIS
Outputs file system directory statistics.

.DESCRIPTION
Outputs file system directory statistics (number of files and the sum of all file sizes) for one or more directories.

.PARAMETER Path
Specifies a path to one or more file system directories. Wildcards are not permitted. The default path is the current directory (.).

.PARAMETER LiteralPath
Specifies a path to one or more file system directories. Unlike Path, the value of LiteralPath is used exactly as it is typed.

.PARAMETER Only
Outputs statistics for a directory but not any of its subdirectories.

.PARAMETER WriteWMI
Checks for and writes to custom WMI namespace

.PARAMETER Every
Outputs statistics for every directory in the specified path instead of only the first level of directories.

.PARAMETER FormatNumbers
Formats numbers in the output object to include thousands separators.

.PARAMETER ShowErrors
Disable error suppression

.PARAMETER Benchmarking
Show benchmarking statistics

.PARAMETER Size
Filter any directories under this amount in MB from being outputted

.PARAMETER Total
Outputs a summary object after all other output that sums all statistics.
#>

[CmdletBinding(DefaultParameterSetName="Path")]
param(
  [parameter(Position=0,Mandatory=$false,ParameterSetName="Path",ValueFromPipeline=$true)]
    [ValidateScript({Test-Path $_ -PathType 'Container'})] 
    $Path=(get-location).Path,
  [parameter(Position=0,Mandatory=$true,ParameterSetName="LiteralPath")]
    [ValidateScript({Test-Path $_ -PathType 'Container'})] 
    [String[]] $LiteralPath,
    [Switch] $Only,
    [Switch] $WriteWMI,
    [Switch] $Every,
    [Switch] $FormatNumbers,
    [Switch] $Total,
    [Switch] $ShowErrors,
    [Switch] $Benchmarking,
    [ValidateRange(1,90000000)] 
    [Int] $Size
)

begin {
  #We're going to return the seconds it took to run for DCM
  $script:startTime = Get-Date
  if($Benchmarking){
  write-host "Script Started at $script:startTime"
  }

  #Setting the variable here since we are using DCM
  $WriteWMI = $true
  #Silence error output during writeWMI
  If($WriteWMI){$ErrorActionPreference = "silentlycontinue"}
  if ($ShowErrors){$ErrorActionPreference ="Continue"}

  #Create FSO to use later
  $fso = New-Object -comobject Scripting.FileSystemObject
    
  $ParamSetName = $PSCmdlet.ParameterSetName
  if ( $ParamSetName -eq "Path" ) {
    $PipelineInput = ( -not $PSBoundParameters.ContainsKey("Path") ) -and ( -not $Path )
  }
  elseif ( $ParamSetName -eq "LiteralPath" ) {
    $PipelineInput = $false
  }

  if ( $Size -gt 0 ) { $SizeFilter = $true } 
  else { $SizeFilter = $false }

  #Check and setup WMI if we are going to use that
  if ( $WriteWMI ) {
    #Check if there is a size filter, if not set to 2000MB because we only want really big folders to work with
    if ($Size -le 0) {
      $Size = 2000
      $SizeFilter = $true
    }

    #Get local fixed drives to search (volumes without a drive letter are skipped)
    $script:LocalDrives = get-wmiobject win32_volume | ? { $_.DriveType -eq 3 -and $_.DriveLetter } | % { get-psdrive $_.DriveLetter[0] } | Select Root

    #Delete existing instances
    $script:Class = "LargeFolders"
    If (Get-WmiObject -List -Namespace "root\cimv2" | Where-Object {$_.Name -eq $script:Class}) {
      Get-WmiObject -Namespace "root\cimv2" -Class $script:Class | Remove-WMIObject
    }
  
    #Now rebuild it to make sure we have the right fields
    $subClass = New-Object System.Management.ManagementClass ("root\cimv2", [String]::Empty, $null);
    $subClass["__CLASS"] = $script:Class;
    $subClass.Qualifiers.Add("Static", $true)
    $subClass.Properties.Add("Path", [System.Management.CimType]::String, $false)
    $subClass.Properties["Path"].Qualifiers.Add("Key", $true)
    $subClass.Properties.Add("FileCount", [System.Management.CimType]::UInt32, $false)
    $subClass.Properties.Add("SubFolderCount", [System.Management.CimType]::UInt32, $false)
    $subClass.Properties.Add("SizeMB", [System.Management.CimType]::UInt32, $false)
    $subClass.Properties.Add("GatherMethod", [System.Management.CimType]::UInt8, $false)
    $subClass.Properties.Add("DateCreated", [System.Management.CimType]::String, $false)
    $subClass.Properties.Add("DateLastModified", [System.Management.CimType]::String, $false)
    $subClass.Properties.Add("SecondsToEnumerate", [System.Management.CimType]::UInt32, $false)
    #This would be WMI CIM_DateTime native if needed
    #$subClass.Properties.Add("DateCreated", [System.Management.CimType]::DateTime, $false)
    #$subClass.Properties.Add("DateLastModified", [System.Management.CimType]::DateTime, $false)
    $WMIOutput = $subClass.Put()
  }

  # Script-level variables used with -Total.
  [UInt64] $script:totalcount = 0
  [UInt64] $script:totalbytes = 0

  # Returns a [System.IO.DirectoryInfo] object if it exists.
  function Get-Directory {
    param( $item )

    if ( $ParamSetName -eq "Path" ) {
      if ((Test-Path -Path $item -PathType Container) -and ($item.Attributes -notmatch [System.IO.FileAttributes]::ReparsePoint)) {
        $item = Get-Item -Path $item -Force
      }
    }
    elseif ( $ParamSetName -eq "LiteralPath" ) {
      if ( (Test-Path -LiteralPath $item -PathType Container) -and ($item.Attributes -notmatch [System.IO.FileAttributes]::ReparsePoint) ) {
        $item = Get-Item -LiteralPath $item -Force
      }
    }
    if ( $item -and ($item -is [System.IO.DirectoryInfo]) -and ($item.Attributes -notmatch [System.IO.FileAttributes]::ReparsePoint)) {
      return $item
    }
  }

  #Do the benchmark calculations
  function GetElapsedTime([datetime]$starttime) 
  {
    $runtime = [Math]::Round(($(get-date) - $starttime).TotalSeconds)

    If($WriteWMI){
        $retStr = $runtime}
    Else{
        $retStr = [string]::format("{0} sec(s)", $runtime)}
    $retStr
  }
  
  # Filter that outputs the custom object with formatted numbers.
  function Format-Output {
    process {
      $_ | Select-Object Path,
        @{Name="Files"; Expression={"{0:N0}" -f $_.Files}},
        @{Name="Size"; Expression={"{0:N0}" -f $_.Size}}
    }
  }

  # Outputs directory statistics for the specified directory. With -recurse,
  # the function includes files in all subdirectories of the specified
  # directory. With -format, numbers in the output objects are formatted with
  # the Format-Output filter.
  function Get-DirectoryStats {
    param( $directory, $recurse, $format )
    
    #Break out of here if we don't care about this folder
    If ($WriteWMI) 
    {
        #Start counting how long it takes to get data
        $enumerationTime = Get-Date
        #Create Exclude array at runtime to speed comparison so we can exclude items
        #http://blogs.technet.com/b/heyscriptingguy/archive/2011/02/18/speed-up-array-comparisons-in-powershell-with-a-runtime-regex.aspx
        $ExcludedFolders = @("Windows"
                            ,"Users"
                            ,"inetpub"
                            ,"Program Files"
                            ,"Program Files (x86)"
                            ,"ProgramData"
                            ,"Windows\System32"
                            ,"System Volume Information"
                            ,"Windows\Installer"
                            ,"Windows\assembly"
                            ,"Windows\System32\DriverStore"
                            ,"Windows\System32\DriverStore\FileRepository"
                            ,"ProgramData\Package Cache"
                            ,"ProgramData\Microsoft")
        #Build one anchored, case-insensitive pattern: ^(root\A$|root\B$|...)$
        [regex] $excludeFolders_regex = '(?i)^(' + (($ExcludedFolders | foreach {[regex]::escape($directory.Root.ToString() + $_)}) -join '$|') + ')$'

        If($directory.FullName -match $excludeFolders_regex) {
        #Write-Warning "Skipping: " $directory.FullName
        Return}
        else {
        #Write-Information $directory.FullName " not in " $excludeFolders_regex
        }
    }

    Write-Progress -Activity "Get-DirStats.ps1" -Status "Reading '$($directory.FullName)'"

    #Use FSO if possible
    #https://gallery.technet.microsoft.com/get-foldersize-224916dd
    $folder = $fso.GetFolder($directory.FullName)

    #test if FSO worked, otherwise go to file by file
    if($folder.size -ge $size*1MB -AND $folder.Type -eq "File Folder") {
      $output = "" | Select Path, Files, SizeMB, GatherMethod
      $output.Path = $folder.Path.ToString()
      $output.Files = $folder.Files.Count
      $output.SizeMB = [Math]::Round($folder.Size/1MB)
      $output.GatherMethod = 1
    }
    elseif($null -eq $folder.size) {
      $files = $directory | Get-ChildItem -Force -Recurse:$recurse |
        ? {$_.Attributes -notmatch [System.IO.FileAttributes]::ReparsePoint} |
        Where-Object { -not $_.PSIsContainer }

      if ( $files ) {
        Write-Progress -Activity "Get-DirStats.ps1" -Status "Calculating '$($directory.FullName)'"

        If ( $SizeFilter ) {
          #Only keep the result when the total meets the size threshold
          $output = $files | Measure-Object -Sum -Property Length |
            Where-Object {$_.Sum -ge $Size*1MB} | Select-Object `
              @{Name="Path"; Expression={$directory.FullName}},
              @{Name="Files"; Expression={$_.Count; $script:totalcount += $_.Count}},
              @{Name="Size"; Expression={$_.Sum; $script:totalbytes += $_.Sum}},
              @{Name="SizeMB"; Expression={[Math]::Round($_.Sum/1MB)}},
              @{Name="GatherMethod"; Expression={2}}
        }
        Else {
          $output = $files | Measure-Object -Sum -Property Length | Select-Object `
            @{Name="Path"; Expression={$directory.FullName}},
            @{Name="Files"; Expression={$_.Count; $script:totalcount += $_.Count}},
            @{Name="Size"; Expression={$_.Sum; $script:totalbytes += $_.Sum}},
            @{Name="SizeMB"; Expression={[Math]::Round($_.Sum/1MB)}}
        }
      }
      elseif ( -not $SizeFilter ) {
        #No files found: report an empty directory when no size filter is set
        $output = "" | Select-Object `
          @{Name="Path"; Expression={$directory.FullName}},
          @{Name="Files"; Expression={0}},
          @{Name="Size"; Expression={0}}
      }
    } #end elseif $folder.size is null manual query

    #Write data to WMI
    if ($WriteWMI -And $output.SizeMB -ge $Size) {
      $WMIURL = 'root\cimv2:'+$script:Class
      $PushDataToWMI = ([wmiclass]$WMIURL).CreateInstance()
      $PushDataToWMI.Path = $output.Path
      $PushDataToWMI.FileCount = $output.Files
      $PushDataToWMI.SubFolderCount = $folder.SubFolders.Count
      $PushDataToWMI.SizeMB = $output.SizeMB
      $PushDataToWMI.GatherMethod = $output.GatherMethod
      $PushDataToWMI.DateCreated = $folder.DateCreated
      $PushDataToWMI.DateLastModified = $folder.DateLastModified
      $elapsed = GetElapsedTime $enumerationTime
      $PushDataToWMI.SecondsToEnumerate = $elapsed
      #WMI CIM_DateTime native if needed
      #$PushDataToWMI.DateCreated = [System.Management.ManagementDateTimeConverter]::ToDmtfDateTime($folder.DateCreated)
      #$PushDataToWMI.DateLastModified = [System.Management.ManagementDateTimeConverter]::ToDmtfDateTime($folder.DateLastModified)
      $WmiOutput = $PushDataToWMI.Put()
      #$TotalTime = GetElapsedTime $script:startTime
      #Write-Host $TotalTime ":" $output.Path "with time of" $elapsed "using method" $output.GatherMethod
    }

    #Only emit objects to the pipeline when not writing to WMI
    if ( -not $WriteWMI ) { if ( -not $format ) { $output } else { $output | Format-Output } }
  }
}

process {
  # Get the item to process, no matter whether the input comes from the
  # pipeline or not.
  if ( $PipelineInput ) {
    $item = $_
  }
  else {
    if ( $ParamSetName -eq "Path" ) {
      $item = $Path
    }
    elseif ( $ParamSetName -eq "LiteralPath" ) {
      $item = $LiteralPath
    }
  }


  if($WriteWMI){
    foreach($drive in $script:LocalDrives){
      if($Benchmarking -OR $WriteWMI){$iterTime = Get-Date}
      $item = $drive.Root

      # Write an error if the item is not a directory in the file system,
      # then move on to the next drive.
      $directory = Get-Directory -item $item
      if ( -not $directory ) {
        Write-Error -Message "Path '$item' is not a directory in the file system." -Category InvalidType
        continue
      }

      # Get the statistics for the first-level directory.
      Get-DirectoryStats -directory $directory -recurse:$false

      # Get the subdirectories of the first-level directory and get the statistics
      # for each of them.
      $directory | Get-ChildItem -Force -Recurse:$true |
        Where-Object { $_.PSIsContainer } | ? {$_.Attributes -notmatch [System.IO.FileAttributes]::ReparsePoint} |
        ForEach-Object {Get-DirectoryStats -directory $_ -recurse:$true}

      if($Benchmarking -OR $WriteWMI){
        $elapsed = GetElapsedTime $iterTime
        # Write-Warning takes a single message string
        write-warning "   Iteration Time: $elapsed for $item"
      }
    }
  }
   else {
    # Write an error if the item is not a directory in the file system.
    $directory = Get-Directory -item $item
    if ( -not $directory ) {
      Write-Error -Message "Path '$item' is not a directory in the file system." -Category InvalidType
      return
    }
  
    # Get the statistics for the first-level directory.
    Get-DirectoryStats -directory $directory -recurse:$false -format:$FormatNumbers
    # -Only means no further processing past the first-level directory.
    if ( $Only ) { return }

    # Get the subdirectories of the first-level directory and get the statistics
    # for each of them.
    $directory | Get-ChildItem -Force -Recurse:$Every |
      Where-Object { $_.PSIsContainer } | ? {$_.Attributes -notmatch [System.IO.FileAttributes]::ReparsePoint} | 
      ForEach-Object {Get-DirectoryStats -directory $_ -recurse:(-not $Every) -format:$FormatNumbers}
   }
}

end {
  # If -Total specified, output summary object.
  if ( $Total ) {
    $output = "" | Select-Object `
      @{Name="Path"; Expression={"<Total>"}},
      @{Name="Files"; Expression={$script:totalcount}},
      @{Name="Size"; Expression={$script:totalbytes}}
    if ( -not $FormatNumbers ) { $output } else { $output | Format-Output }
  }
  
  $elapsed = GetElapsedTime $script:startTime;
  if($Benchmarking){
  write-host "Script Ended at $(get-date)"
  write-host "Total Elapsed Time: " $elapsed
  }
  Write-Host $elapsed
}

You can modify any of the variables, such as the size threshold and the folders to exclude, to meet your needs. I didn’t want to focus on folders I probably could not clean out easily, like system folders. The Windows folder, for example, will always show over 2GB on machines.
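Before changing the exclusion list, it helps to see how the pattern built from it behaves. The sketch below mirrors the way the script builds its regex from a drive root and a list of folder names; `New-ExcludeRegex` is a hypothetical helper name, and `Temp\Scratch` is a made-up entry used only for illustration.

```powershell
# Hypothetical helper: builds an anchored, case-insensitive pattern
# equivalent to the one the script builds from its exclusion list.
function New-ExcludeRegex {
    param([string] $Root, [string[]] $Folders)

    # Escape each full path so backslashes, spaces and parentheses are literal.
    $escaped = $Folders | ForEach-Object { [regex]::Escape($Root + $_) }

    # ^( ... )$ anchors every alternative, so only exact paths match.
    [regex] ('(?i)^(' + ($escaped -join '|') + ')$')
}

$regex = New-ExcludeRegex -Root 'C:\' -Folders @('Windows', 'Program Files (x86)', 'Temp\Scratch')

$regex.IsMatch('C:\Windows')               # True: listed, so skipped
$regex.IsMatch('c:\windows')               # True: matching is case-insensitive
$regex.IsMatch('C:\Program Files (x86)')   # True
$regex.IsMatch('C:\Temp\Scratch')          # True
$regex.IsMatch('C:\Windows\Temp')          # False: subfolders are not matched
$regex.IsMatch('C:\Temp')                  # False: the parent is not matched either
```

As written, the match is on exact paths only, so a subfolder of an excluded folder still appears to be evaluated on its own unless you list it too. That would explain why the script lists both Windows and Windows\System32.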

Then set up the configuration baseline to run either on a simple schedule or whenever suits your environment. This is a very processor-intensive script that could easily use 50% or more of the CPU for 20+ minutes. I run it every Saturday here. You could also target a custom collection of machines that have less than X amount of free hard drive space.

[Screenshots: Large Folders 5 to 8]

Then add the WMI class to hardware inventory. This needs to be done from a system where the script has already run.
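To confirm the class actually has data on the machine you connect to, you could list the largest recorded folders first. This is a sketch: the class and property names (LargeFolders, Path, SizeMB, FileCount) come from the script above, while `Select-LargestFolders` is a hypothetical helper name. Get-WmiObject is only available in Windows PowerShell, so the query is guarded.

```powershell
# Hypothetical helper: sorts recorded folders by size, largest first.
function Select-LargestFolders {
    param([object[]] $Instances, [int] $Top = 10)

    $Instances |
        Sort-Object -Property SizeMB -Descending |
        Select-Object -First $Top -Property Path, SizeMB, FileCount
}

# Query the custom class only where Get-WmiObject exists and the class has data.
if (Get-Command Get-WmiObject -ErrorAction SilentlyContinue) {
    $found = @(Get-WmiObject -Namespace 'root\cimv2' -Class 'LargeFolders' -ErrorAction SilentlyContinue)
    if ($found.Count -gt 0) { Select-LargestFolders -Instances $found }
}
```

If nothing comes back, the script has probably not run on that machine yet, or no folder met the size threshold.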

[Screenshot: Large Folders 10]

Then wait for the data and profit.

[Screenshot: Large Folders 11]