PowerShell Script To Find and Extract Files From SharePoint That Have A URL Longer Than 260 Characters

If you’re here, it can only mean one thing: your users have created a folder and filename path in SharePoint that is so long that they’re now getting errors, and they can’t edit the document in Office applications. Like a burrowing parasite, the Office document has gone deeper and deeper into a convoluted folder structure until the URL exceeds the SharePoint limit of 260 characters.

It’s called Longurlitis, and while painful, it is curable.

Take this (not real-world) example:

http://sharepoint.company.com/Documents/Dave%20Documents/My%20Administrator%20Told%20Me%20To%20Use/
Metadata%20but%20what%20does%20he%20know/I’ll%20show%20him/
You’ll%20take%20these%20folders%20away%20from%20my%20cold%20dead%20hands/
Mwahahahahahahahahahahahahahahahahahaha/
Dave’s%20Word%20Document%20With%20Meeting%20Minutes%20From%201992%20
About%20That%20Upcoming%20Millenium%20Bug%20No%20I%20Can’t%20Get%20Rid%20Of%20These/
Meeting%20Minutes%20June%20Fourteenth%201992%20FINAL%20VERSION%20DRAFT%20v482.doc

Ouch, that hurt my everything.

Now when the user tries to do anything with this file, they get this:

The URL for this file is too long for the application. A temporary copy of this file will be opened on your computer. You must save this copy as a new file.
Every time this error appears, a SharePoint administrator sheds a tear.

This issue has been discussed at length in a number of places, including by SharePoint heavyweights like Joel Oleson.

The problem is fairly straightforward when it’s only one or two files, but what do you do when this isn’t a one-off, but a full-on infestation of Longurlitis? I recently came across more than 700 files with Longurlitis in an old site that’s being decommissioned. After running the SharePoint Site Extraction script, I found a number of files missing. The reason is that the script’s BinaryWriter chokes on the insane URL length.

To that end, I’ve created the script below, which finds all files in a given site with a URL that exceeds 260 characters. You can also download all of the files by un-commenting the designated line. And if you’d like to spit out an inventory CSV file with the URL and number of characters, just use the Out-File cmdlet like so:

.\FindLongPaths.ps1 | Out-File -FilePath C:\wherever\LongFiles.csv

Note: I had to use a character other than a comma for the CSV output, because it’s possible and likely that the same users who are creating these files are also putting commas in the filename. Because the hash character (#) is forbidden in SharePoint URLs, it works well for delimiting the output.
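Because the inventory uses # rather than a comma, it has to be read back with an explicit delimiter. The following is a minimal sketch of that, assuming the "URL#length" line shape described above. The sample URLs and the column names Url and Length are made up for illustration.

```powershell
# Made-up sample URLs; the second one contains a comma on purpose.
$urls = @(
    'http://sharepoint.company.com/site/Docs/a.doc',
    'http://sharepoint.company.com/site/Docs/Minutes,%20Final.doc'
)

# Build lines in the same "URL#length" shape as the inventory file.
$lines = $urls | ForEach-Object { $_ + '#' + $_.Length }

# Parse with '#' as the delimiter and supply column names,
# since the inventory has no header row.
$inventory = $lines | ConvertFrom-Csv -Delimiter '#' -Header 'Url', 'Length'

# Longest URLs first. Length is parsed as text, so cast it for sorting.
$inventory | Sort-Object { [int]$_.Length } -Descending | Format-Table -AutoSize
```

For a real inventory file, Import-Csv accepts the same -Delimiter and -Header parameters, for example: Import-Csv -Path C:\wherever\LongFiles.csv -Delimiter '#' -Header 'Url', 'Length'.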

Add-PSSnapin Microsoft.SharePoint.PowerShell -erroraction SilentlyContinue
 
#Where to Download the files to. 
$destination = "C:\Wherever"
 
#The site to extract from. Make sure there is no trailing slash.
$site = "http://sharepoint.company.com/site"
 
# Function: DownloadLongFiles
# Description: Downloads all documents with a URL > 260 characters
# Variables
# $folderUrl: The Document Library to Download
function DownloadLongFiles($folderUrl)
{
    $folder = $web.GetFolder($folderUrl)
 
    foreach ($file in $folder.Files) 
	{
		$encodedURL = $file.Url -replace " ", "%20"
		$FullURL = $site+'/'+$encodedURL
		$URLWithLength = $FullURL+'#'+$FullURL.Length
		$Filename = $file.Name
		$Downloadpath = $destination+'\'+$Filename
		#Flag only URLs that exceed the 260-character limit
		if ($FullURL.Length -gt 260)
		{
			#Uncomment the line below to download the files.
			#HTTPDownloadFile "$FullURL" "$Downloadpath"
			Write-Host $FullURL
			Write-Host $destination
 
			Write-Host $URLWithLength
			Write-Output $URLWithLength
		}
 
	}
}
 
# Function: DownloadSite
# Description: Calls DownloadLongFiles for every document library and folder in a site to find all documents with long URLs.
# Variables
# $webUrl: The URL of the site to download all document libraries
function DownloadSite($webUrl)
{
	$web = Get-SPWeb -Identity $webUrl
 
	foreach($list in $web.Lists)
	{
		if($list.BaseType -eq "DocumentLibrary")
		{
			DownloadLongFiles $list.RootFolder.Url
			#Download files in folders
			foreach ($folder in $list.Folders) 
			{
    			DownloadLongFiles $folder.Url
			}
		}
	}

	#Release the SPWeb object now that we're done with it
	$web.Dispose()
}
 
# Function: HTTPDownloadFile
# Description: Downloads a file using webclient
# Variables
# $ServerFileLocation: Where the source file is located on the web
# $DownloadPath: The destination to download to
 
function HTTPDownloadFile($ServerFileLocation, $DownloadPath)
{
	$webclient = New-Object System.Net.WebClient
	$webClient.UseDefaultCredentials = $true
	$webclient.DownloadFile($ServerFileLocation,$DownloadPath)
}
 
#Find (and optionally download) all files with long URLs in the site
DownloadSite "$site"
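The length check at the heart of the script can be tried without a SharePoint server. Here is a minimal sketch using the same string handling (spaces encoded as %20, then the site URL and path joined with a slash). Test-LongUrl, its parameters, and the sample paths are hypothetical and not part of the script above.

```powershell
# Hypothetical helper: builds the full URL the same way the script does
# and reports whether it exceeds the limit.
function Test-LongUrl([string]$SiteUrl, [string]$RelativePath, [int]$Limit = 260)
{
    # Each space becomes "%20", which adds two characters to the URL.
    $encodedPath = $RelativePath -replace " ", "%20"
    $fullUrl = $SiteUrl + '/' + $encodedPath

    New-Object PSObject -Property @{
        Url     = $fullUrl
        Length  = $fullUrl.Length
        TooLong = ($fullUrl.Length -gt $Limit)
    }
}

$site = "http://sharepoint.company.com/site"

# A short path, and a deliberately deep one built by repeating a folder name.
$short = Test-LongUrl $site "Documents/a.doc"
$deep  = Test-LongUrl $site (('Folder With Spaces/' * 15) + 'Minutes.doc')

$short, $deep | Select-Object Length, TooLong
```

Note that the encoded length is what counts here: a folder name with several spaces costs noticeably more than it appears to in the browser's breadcrumb.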

SharePoint PowerShell Script to Extract All Documents and Their Versions

Hey! Listen: This script doesn’t extract documents that suffer from Longurlitis (a URL longer than the SharePoint maximum of 260 characters). So you may also want to run the PowerShell Script To Find and Extract Files From SharePoint That Have A URL Longer Than 260 Characters.

Recently a client asked to extract all content from a SharePoint site for archival. A CMP file was out of the question, because this had to be a SharePoint-independent solution. PowerShell to the rescue! The script below will extract all documents and their versions, as well as all metadata and list data to CSV files.

The DownloadSite function will download all the documents and their versions into folders named after their respective document libraries. Versions will be named [filename]_v[version#].[extension].
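As an illustration of that naming scheme, here is a small sketch that builds a versioned file name from a file name and a version label. Get-VersionFileName is a hypothetical helper, not a function from the script. It assumes the extension is whatever follows the last dot, so file names containing extra dots keep their full base name.

```powershell
# Hypothetical helper: returns [filename]_v[version#].[extension]
function Get-VersionFileName([string]$FileName, [string]$VersionLabel)
{
    # Everything before the last dot
    $baseName = [System.IO.Path]::GetFileNameWithoutExtension($FileName)

    # The extension including its leading dot, or an empty string if there is none
    $extension = [System.IO.Path]::GetExtension($FileName)

    return $baseName + "_v" + $VersionLabel + $extension
}

Get-VersionFileName "Meeting Minutes.doc" "1.0"      # Meeting Minutes_v1.0.doc
Get-VersionFileName "Budget.2012.final.xlsx" "2.1"   # Budget.2012.final_v2.1.xlsx
```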

The DownloadMetadata function will export all of the site’s document library metadata and list data as CSV files, one file per list. If you don’t need the metadata or lists, just comment out the DownloadMetadata call at the end of the script.

There’s also ample commenting in case someone wants to modify or expand upon the script!

# This script will extract all of the documents and their versions from a site. It will also
# download all of the list data and document library metadata as a CSV file.
 
Add-PSSnapin Microsoft.SharePoint.PowerShell -erroraction SilentlyContinue
# 
# $destination: Where the files will be downloaded to
# $webUrl: The URL of the website containing the document library for download
# $listUrl: The URL of the document library to download
 
#Where to Download the files to. Sub-folders will be created for the documents and lists, respectively.
$destination = "C:\Export"
 
#The site to extract from. Make sure there is no trailing slash.
$site = "http://yoursitecollection/yoursite"
 
# Function: HTTPDownloadFile
# Description: Downloads a file using webclient
# Variables
# $ServerFileLocation: Where the source file is located on the web
# $DownloadPath: The destination to download to
 
function HTTPDownloadFile($ServerFileLocation, $DownloadPath)
{
	$webclient = New-Object System.Net.WebClient
	$webClient.UseDefaultCredentials = $true
	$webclient.DownloadFile($ServerFileLocation,$DownloadPath)
}
 
function DownloadMetadata($sourceweb, $metadatadestination)
{
	Write-Host "Creating Lists and Metadata"
	$sourceSPweb = Get-SPWeb -Identity $sourceweb
	$metadataFolder = $destination+"\"+$sourceSPweb.Title+" Lists and Metadata"
	$createMetaDataFolder = New-Item $metadataFolder -type directory 
	$metadatadestination = $metadataFolder
 
	foreach($list in $sourceSPweb.Lists)
	{
		Write-Host "Exporting List MetaData: " $list.Title
		$ListItems = $list.Items 
		$Listlocation = $metadatadestination+"\"+$list.Title+".csv"
		$ListItems | Select * | Export-Csv $Listlocation  -Force
	}
}
 
# Function: GetFileVersions
# Description: Downloads all versions of a single file
# Variables
# $file: The SPFile whose versions will be downloaded
# Relies on $webUrl and $destinationfolder being set by the calling functions
 
function GetFileVersions($file)
{
	foreach($version in $file.Versions)
	{
		#Add version label to file in format: [Filename]_v[version#].[extension]
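		#Split on the last dot only, so names like "a.b.docx" keep their full base name
		$fullname = [System.IO.Path]::GetFileNameWithoutExtension($file.Name)
		$fileext = [System.IO.Path]::GetExtension($file.Name)
		$FullFileName = $fullname+"_v"+$version.VersionLabel+$fileext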
 
		#Can't create an SPFile object from historical versions, but CAN download via HTTP
		#Create the full File URL using the Website URL and version's URL
		$fileURL = $webUrl+"/"+$version.Url
 
		#Full Download path including filename
		$DownloadPath = $destinationfolder+"\"+$FullFileName
 
		#Download the file from the version's URL, download to the $DownloadPath location
		HTTPDownloadFile "$fileURL" "$DownloadPath"
	}
}
 
# Function: DownloadDocLib
# Description: Downloads a document library's files; calls GetFileVersions to download versions.
# Credit 
# Used Varun Malhotra's script to download a document library
# as a starting point: http://blogs.msdn.com/b/varun_malhotra/archive/2012/02/13/10265370.aspx
# Variables
# $folderUrl: The Document Library (or folder) to Download
function DownloadDocLib($folderUrl)
{
    $folder = $web.GetFolder($folderUrl)
    foreach ($file in $folder.Files) 
	{
        #Ensure destination directory
		$destinationfolder = $destination + "\" + $folder.Url 
        if (!(Test-Path -path $destinationfolder))
        {
            $dest = New-Item $destinationfolder -type directory 
        }
 
        #Download file
        $binary = $file.OpenBinary()
        $stream = New-Object System.IO.FileStream($destinationfolder + "\" + $file.Name), Create
        $writer = New-Object System.IO.BinaryWriter($stream)
        $writer.write($binary)
        $writer.Close()
 
		#Download file versions. If you don't need versions, comment the line below.
		GetFileVersions $file
	}
}
 
# Function: DownloadSite
# Description: Calls DownloadDocLib for every document library and folder to download all document libraries in a site.
# Variables
# $webUrl: The URL of the site to download all document libraries
function DownloadSite($webUrl)
{
	$web = Get-SPWeb -Identity $webUrl
 
	#Create a folder using the site's name
	$siteFolder = $destination + "\" +$web.Title+" Documents"
	$createSiteFolder = New-Item $siteFolder -type directory 
	$destination = $siteFolder
 
	foreach($list in $web.Lists)
	{
		if($list.BaseType -eq "DocumentLibrary")
		{
			Write-Host "Downloading Document Library: " $list.Title
			$listUrl = $web.Url +"/"+ $list.RootFolder.Url
			#Download root files
			DownloadDocLib $list.RootFolder.Url
			#Download files in folders
			foreach ($folder in $list.Folders) 
			{
    			DownloadDocLib $folder.Url
			}
		}
	}

	#Release the SPWeb object now that we're done with it
	$web.Dispose()
}
 
#Download Site Documents + Versions
DownloadSite "$site"
 
#Download Site Lists and Document Library Metadata
DownloadMetadata $site $destination