Sunday, November 26, 2006 2:15 AM bart

PowerShell - A cmdlet that reports progress - A simple file downloader cmdlet

Introduction

In my previous post on file downloads in PowerShell I showed how to download a file in Windows PowerShell using System.Net.WebClient. No big deal if you know the Base Class Library. However, there was a drawback: the download didn't report any status while it was running. To fix this, I've created a simple sample cmdlet that reports progress (through WriteProgress) while performing a download. Enter download-file.

The code

The basic idea is pretty simple:

  • Inside a cmdlet we'll call some WebClient method to perform a file download from a specified URL (through some parameter) and to a local file location (which can be an optional parameter, since we can derive the file name from the source URL as well)
  • Instead of blocking on some DownloadFile call, we want to do things asynchronously (i.e. DownloadFileAsync) in order to be able to report progress during the download.
  • The events DownloadProgressChanged and DownloadFileCompleted of the System.Net.WebClient will help us to accomplish the progress reporting.
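Before wrapping this in a cmdlet, the last two bullets can be sketched as a small stand-alone helper. This is only a minimal sketch under a few assumptions: the names AsyncDownloadSketch, Download and onPercent are made up for illustration, and the calling thread simply blocks until completion instead of doing anything useful with the progress itself.

```csharp
using System;
using System.ComponentModel;
using System.Net;
using System.Threading;

public static class AsyncDownloadSketch
{
    // Starts an asynchronous download and blocks until it has completed.
    // Returns the error reported by DownloadFileCompleted, or null on success.
    // Note: onPercent is invoked from the event handler, so typically on a
    // background thread rather than on the caller's thread.
    public static Exception Download(Uri source, string targetFile, Action<int> onPercent)
    {
        Exception error = null;
        ManualResetEvent done = new ManualResetEvent(false);

        using (WebClient client = new WebClient())
        {
            client.DownloadProgressChanged +=
                (sender, e) => onPercent(e.ProgressPercentage);

            client.DownloadFileCompleted +=
                (sender, e) => { error = e.Error; done.Set(); };

            client.DownloadFileAsync(source, targetFile);

            // Wait for the completion handler before leaving the method.
            done.WaitOne();
        }

        return error;
    }
}
```

A call such as AsyncDownloadSketch.Download(url, "setup.exe", p => Console.WriteLine(p)) prints percentages as they come in. That is fine for a console application, but inside a cmdlet the callback thread is exactly the problem.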

We'll call our cmdlet download-file, so the declaration will be like this:

[Cmdlet("download", "file")]
public class DownloadFileCmdlet : PSCmdlet
{
}

Parameterization of a cmdlet is a piece of cake, as explained in a bunch of previous posts already, and looks like this:

private Uri _url;

/// <summary>
/// Url of the file that has to be downloaded.
/// </summary>
[Parameter(Mandatory = true, Position = 0)]
public Uri Url
{
    get { return _url; }
    set { _url = value; }
}

private string _file;

/// <summary>
/// Target file name (optional).
/// </summary>
[Parameter(Position = 1)]
public string To
{
    get { return _file; }
    set { _file = value; }
}

Next, we'll have to override the cmdlet's ProcessRecord method to do the processing. This is where things get tricky, for the following reasons:

  • Starting an asynchronous download using DownloadFileAsync doesn't surface HTTP errors as exceptions from the call itself. Since we want to catch such errors, we'll need a trick.
  • Events like DownloadProgressChanged and DownloadFileCompleted are raised on a background thread. From such a thread it's not valid to call the PSCmdlet's WriteProgress (or any other Write*) method.
  • We have to wait for the background download to complete before ProcessRecord is exited. In other words, DownloadFileCompleted has to run before we can exit ProcessRecord.

The solution to all of the problems above is a nice piece of thread synchronization. Basically the "main thread" (the one on which ProcessRecord was called) has to create the WebClient, hook up the event handlers and start the asynchronous download job. Once that's done, it has to wait for either of two events to occur: a ProgressRecord instance has become available, or DownloadFileCompleted has executed. In the first case, we can call WriteProgress to report progress on the right thread. In the second case, we can exit the ProcessRecord method because the download has completed.
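Stripped of everything PowerShell-related, the waiting pattern looks roughly like the sketch below. It's an illustration only: ProgressPump and Run are made-up names, a plain worker thread stands in for the WebClient's background events, and adding a number to a list stands in for the WriteProgress call.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ProgressPump
{
    // Runs a simulated background job of the given number of steps and
    // returns the percentages the calling ("main") thread got to report.
    public static List<int> Run(int steps)
    {
        ManualResetEvent exit = new ManualResetEvent(false); // job completed
        AutoResetEvent prog = new AutoResetEvent(false);     // new progress value available
        object sync = new object();                          // guards 'latest'
        int latest = 0;
        List<int> reported = new List<int>();

        Thread worker = new Thread(() =>
        {
            for (int i = 1; i <= steps; i++)
            {
                Thread.Sleep(10); // pretend to do some work
                lock (sync)
                {
                    latest = i * 100 / steps;
                    prog.Set();
                }
            }
            exit.Set();
        });
        worker.Start();

        // Index 0 (exit) ends the loop, index 1 (prog) means: report progress.
        WaitHandle[] evts = { exit, prog };
        while (WaitHandle.WaitAny(evts) != 0)
        {
            lock (sync)
            {
                reported.Add(latest); // stand-in for WriteProgress on the main thread
            }
        }

        worker.Join();
        return reported;
    }
}
```

One property of this design is worth knowing: when both handles are signaled at the same time, WaitAny returns the lowest index, so a final progress value can be dropped in favor of the exit signal. For a progress bar that's usually acceptable.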

Here's the complete code:

  1 using System;
  2 using System.ComponentModel;
  3 using System.IO;
  4 using System.Management.Automation;
  5 using System.Net;
  6 using System.Threading;
  7
  8 [Cmdlet("download", "file")]
  9 public class DownloadFileCmdlet : PSCmdlet
 10 {
 11     /// <summary>
 12     /// Wait handle to report download completion.
 13     /// </summary>
 14     private ManualResetEvent exit = new ManualResetEvent(false);
 15
 16     /// <summary>
 17     /// Wait handle to report availability of a ProgressRecord item in pr.
 18     /// </summary>
 19     private AutoResetEvent prog = new AutoResetEvent(false);
 20
 21     /// <summary>
 22     /// Array of the wait handles above (set in ProcessRecord) to perform WaitAny.
 23     /// </summary>
 24     private WaitHandle[] evts;
 25
 26     /// <summary>
 27     /// ProgressRecord indicating the current download status.
 28     /// </summary>
 29     private ProgressRecord pr;
 30
 31     /// <summary>
 32     /// Synchronization object for pr.
 33     /// </summary>
 34     private object pr_sync = new object();
 35
 36     private Uri _url;
 37
 38     /// <summary>
 39     /// Url of the file that has to be downloaded.
 40     /// </summary>
 41     [Parameter(Mandatory = true, Position = 0)]
 42     public Uri Url
 43     {
 44         get { return _url; }
 45         set { _url = value; }
 46     }
 47
 48     private string _file;
 49
 50     /// <summary>
 51     /// Target file name (optional).
 52     /// </summary>
 53     [Parameter(Position = 1)]
 54     public string To
 55     {
 56         get { return _file; }
 57         set { _file = value; }
 58     }
 59
 60     /// <summary>
 61     /// Entry-point for the cmdlet processing.
 62     /// </summary>
 63     protected override void ProcessRecord()
 64     {
 65         //
 66         // Construct wait handles array for WaitHandle.WaitAny calls.
 67         //
 68         evts = new WaitHandle[] { exit, prog };
 69
 70         //
 71         // If no target file name was specified, derive it from the url's file name portion.
 72         //
 73         if (_file == null)
 74         {
 75             string[] fs = _url.LocalPath.Split('/');
 76             if (fs.Length > 0)
 77                 _file = fs[fs.Length - 1];
 78         }
 79
 80         //
 81         // Construct web client object and hook in event handlers to report progress and completion.
 82         //
 83         WebClient clnt = new WebClient();
 84         clnt.DownloadProgressChanged += new DownloadProgressChangedEventHandler(webClient_DownloadProgressChanged);
 85         clnt.DownloadFileCompleted += new AsyncCompletedEventHandler(webClient_DownloadFileCompleted);
 86
 87         try
 88         {
 89             //
 90             // Check validity for download. Will throw an exception in case of transport protocol errors.
 91             //
 92             using (clnt.OpenRead(_url)) { }
 93
 94             //
 95             // Download the file asynchronously. Reporting will happen through events on background threads.
 96             //
 97             clnt.DownloadFileAsync(_url, _file);
 98
 99             //
100             // Wait for any of the events (exit, prog) to occur.
101             // In case of index 0 (= exit), stop processing.
102             // In case of index 1 (= prog), report progress.
103             //
104             while (WaitHandle.WaitAny(evts) != 0) //0 is exit event
105             {
106                 lock (pr_sync)
107                 {
108                     WriteProgress(pr);
109                 }
110             }
111
112             //
113             // Write file info object for the target file. Can be used for further processing on the pipeline.
114             //
115             WriteObject(new FileInfo(_file));
116         }
117         catch (WebException ex)
118         {
119             //
120             // Report an error. Could be more specific for what the ErrorCategory is concerned, by mapping HTTP error codes.
121             //
122             WriteError(new ErrorRecord(ex, ex.Status.ToString(), ErrorCategory.NotSpecified, clnt));
123         }
124     }
125
126     /// <summary>
127     /// Reports download progress.
128     /// </summary>
129     private void webClient_DownloadProgressChanged(object sender, DownloadProgressChangedEventArgs e)
130     {
131         //
132         // Construct a ProgressRecord with download state information but no completion time estimate (SecondsRemaining < 0).
133         //
134         ProgressRecord pr = new ProgressRecord(0, String.Format("Downloading {0}", _url.ToString()), String.Format("{0} of {1} bytes transferred.", e.BytesReceived, e.TotalBytesToReceive));
135         pr.CurrentOperation = String.Format("Destination file: {0}", _file);
136         pr.SecondsRemaining = -1;
137         pr.PercentComplete = e.ProgressPercentage;
138
139         //
140         // Report availability of a ProgressRecord item. Will cause the while-loop's body in ProcessRecord to execute.
141         //
142         lock (pr_sync)
143         {
144             this.pr = pr;
145             prog.Set();
146         }
147     }
148
149     /// <summary>
150     /// Reports download completion.
151     /// </summary>
152     private void webClient_DownloadFileCompleted(object sender, System.ComponentModel.AsyncCompletedEventArgs e)
153     {
154         //
155         // Signal the exit state. Will cause the while-loop in ProcessRecord to terminate.
156         //
157         exit.Set();
158     }
159 }
160
161 [RunInstaller(true)]
162 public class DownloadFileSnapIn : PSSnapIn
163 {
164     public override string Name { get { return "DownloadFile"; } }
165     public override string Vendor { get { return "Bart De Smet"; } }
166     public override string Description { get { return "Allows file download."; } }
167 }

Starting at the bottom of the cmdlet definition we can see the two event handlers, webClient_DownloadProgressChanged and webClient_DownloadFileCompleted. The first one creates a ProgressRecord (lines 134-137), which is the way for a cmdlet to communicate status to the PowerShell host application. The parameters and properties are self-explanatory. One could make this method more complex in order to provide a seconds-remaining estimate based on the observed download speed. In this basic sample we're happy with some status messages and a percentage.

In order to report progress, some thread synchronization is needed. Remember that the WriteProgress method can only be called on ProcessRecord's thread. So, we copy (line 144) the constructed ProgressRecord to the pr member of the class (line 29), which is protected by the pr_sync lock (line 34, used in line 142). Finally, the availability of the record is signaled using the wait handle prog in line 145. Notice it's an AutoResetEvent (line 19), which means that it gets reset automatically (to non-signaled) once the consuming thread (ProcessRecord) has been released by it (in WaitAny in our case, line 104). The webClient_DownloadFileCompleted event handler is straightforward and just signals the exit ("download completion") state on line 157, which will be picked up by the WaitAny call on line 104.
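As an illustration of the seconds-remaining idea, here's a minimal sketch of such an estimate. It is not part of the cmdlet above: DownloadEstimate and its SecondsRemaining method are hypothetical names, and the estimate naively assumes the average speed so far will hold for the rest of the download.

```csharp
using System;

public static class DownloadEstimate
{
    // Estimates the remaining download time in whole seconds, based on the
    // average speed observed so far. Returns -1 (meaning "unknown") when no
    // sensible estimate can be made, e.g. when the total size is unknown.
    public static int SecondsRemaining(long bytesReceived, long totalBytes, TimeSpan elapsed)
    {
        if (bytesReceived <= 0 || totalBytes <= 0 || elapsed.TotalSeconds <= 0)
            return -1;

        if (bytesReceived >= totalBytes)
            return 0;

        double bytesPerSecond = bytesReceived / elapsed.TotalSeconds;
        double remaining = (totalBytes - bytesReceived) / bytesPerSecond;

        // Guard against absurdly long estimates that don't fit in an int.
        return remaining > int.MaxValue ? -1 : (int)Math.Ceiling(remaining);
    }
}
```

In the progress event handler this could replace the fixed -1, for example by passing e.BytesReceived, e.TotalBytesToReceive and the Elapsed value of a Stopwatch that was started just before the download began (such a stopwatch field would have to be added to the cmdlet).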

The real work happens in ProcessRecord, of course. First (line 68) an array of wait handles gets constructed to be used in the WaitAny call (line 104). WaitAny waits until any one of these handles has been set. For example: if a progress record is available, prog will be set (by webClient_DownloadProgressChanged on line 145) and WaitAny will return 1 because prog is at index 1 in the evts handles array. In a similar way, WaitAny will return 0 if the 0th element of evts, i.e. exit, has been set (by webClient_DownloadFileCompleted on line 157). Next, the case of no specified target file is taken care of in lines 73 to 78, taking the source file name as the target name. Now (lines 83-85) the WebClient instance is created and the event handlers are hooked up. Finally, the download itself can be started. In order to cause an exception (for example in case of a 404 error code) before the download begins, we call OpenRead on line 92. The asynchronous download job is started on line 97.
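The file name derivation of lines 73 to 78 yields an empty name when the URL ends in a slash (for example a site's root). A slightly more defensive variant could look like the sketch below. TargetFileName, FromUrl and the fallback parameter are names I'm making up here; they're not part of the cmdlet above.

```csharp
using System;
using System.IO;

public static class TargetFileName
{
    // Derives a local file name from the last segment of the url's path.
    // Falls back to the given name when the path has no file name portion,
    // as in "http://example.com/" or "http://example.com/files/".
    public static string FromUrl(Uri url, string fallback)
    {
        // LocalPath is the unescaped path without the query string,
        // e.g. "/files/setup.exe".
        string name = Path.GetFileName(url.LocalPath);
        return string.IsNullOrEmpty(name) ? fallback : name;
    }
}
```

With this helper, a URL like http://example.com/files/setup.exe gives setup.exe, while http://example.com/ gives whatever fallback the caller chose. Unescaped URLs can still contain characters that aren't valid in a file name, so a real implementation would want to sanitize the result as well.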

The heart of our code is in lines 104-110, where ProgressRecord instances are consumed every time the prog wait handle is set. These are reported to the PowerShell host by means of a WriteProgress call (line 108), taking care of the required locking. Finally, in line 115, a FileInfo object for the target file is written to the pipeline, which might be useful for further processing. WebExceptions are caught on line 117 and reported via WriteError on line 122.
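The comment on line 120 hints at mapping HTTP error codes to a more specific ErrorCategory. One possible mapping is sketched below. To keep the sketch compilable without a reference to System.Management.Automation it returns the name of an ErrorCategory member as a string. The class and method names are hypothetical, and the choice of categories is just my assumption of what's reasonable.

```csharp
using System.Net;

public static class ErrorCategoryMapping
{
    // Returns the name of the ErrorCategory member that seems the best fit
    // for the given WebException. "NotSpecified" is the catch-all.
    public static string CategoryNameFor(WebException ex)
    {
        // For HTTP protocol errors the response carries the status code.
        HttpWebResponse response = ex.Response as HttpWebResponse;
        if (response != null)
        {
            switch (response.StatusCode)
            {
                case HttpStatusCode.NotFound:
                    return "ObjectNotFound";
                case HttpStatusCode.Unauthorized:
                case HttpStatusCode.Forbidden:
                    return "PermissionDenied";
                case HttpStatusCode.RequestTimeout:
                case HttpStatusCode.GatewayTimeout:
                    return "OperationTimeout";
                case HttpStatusCode.ServiceUnavailable:
                    return "ResourceUnavailable";
            }
        }

        // No HTTP response at all: look at the transport-level status.
        if (ex.Status == WebExceptionStatus.Timeout)
            return "OperationTimeout";

        return "NotSpecified";
    }
}
```

Inside the cmdlet one would return the ErrorCategory value directly instead of its name (or convert the name using Enum.Parse) and pass it to the ErrorRecord constructor on line 122.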

The snap-in for the cmdlet needs no further explanation; see lines 161 to 167.


Compilation

Compiling this goes as follows:

  • Make sure the PATH environment variable has csc.exe and installutil.exe on it (set PATH=%PATH%;%windir%\Microsoft.NET\Framework\v2.0.50727)
  • Save the file as downloadfilecmdlet.cs
  • Copy System.Management.Automation from the GAC (%windir%\assembly\GAC_MSIL\System.Management.Automation\<version_pktoken>\System.Management.Automation.dll) to the current folder with the downloadfilecmdlet.cs code file
  • Execute csc /t:library /r:System.Management.Automation.dll downloadfilecmdlet.cs
  • Install the snap-in using installutil -i downloadfilecmdlet.dll


Demo

See the pictures below for the cmdlet in action:

Download the code over here.

Have fun!


Comments

# re: PowerShell - A cmdlet that reports progress - A simple file downloader cmdlet

Sunday, November 26, 2006 11:10 PM by dotnetjunkie

I think you should add transfer speed and estimated remaining time, that would make it even more useful and cooler ! ;)
