November 2006 - Posts

Introduction

Today (11/29/06) I received a mail from one of my blog readers:

Hi Bart,

in the following Block you mention an powershell commandlet for creating an SHA hash:

http://community.bartdesmet.net/blogs/bart/archive/2006/06/23/4106.aspx

Could you please publish or mail me some code snipes? My own code wont work proper.

THX for sharing so much knowledge about .NET.

Regards,

Thomas

Apparently I promised some time ago to upload a cmdlet for file hashing, but it never made it to my blog. So here it is today.

 

A file hasher cmdlet

Let's create a cmdlet for file hashing, called get-hash. It should take two parameters: one for the desired algorithm (SHA1, MD5, SHA256, SHA384, SHA512) and one for the file. The latter can either come from the pipeline (e.g. dir *.cs | get-hash sha1 should work fine) or be specified on the command line as a string holding the file name, optionally through one of a few parameter aliases. Taking all these requirements together, we end up with the following:

 1 using System;
 2 using System.ComponentModel;
 3 using System.IO;
 4 using System.Management.Automation;
 5 using System.Security.Cryptography;
 6 using System.Text;
 7
 8 [Cmdlet("get", "hash")]
 9 public class HashCmdlet : PSCmdlet
10 {
11     private string algorithm;
12
13     [Parameter(Position = 0, Mandatory = true)]
14     public string Algorithm
15     {
16         get { return algorithm; }
17         set { algorithm = value; }
18     }
19
20     private string file;
21
22     [Alias("File", "Name")]
23     [Parameter(Position = 1, Mandatory = true, ValueFromPipelineByPropertyName = true)]
24     public string FullName
25     {
26         get { return file; }
27         set { file = value; }
28     }
29
30     protected override void ProcessRecord()
31     {
32         HashAlgorithm algo = HashAlgorithm.Create(algorithm);
33         if (algo != null)
34         {
35             StringBuilder sb = new StringBuilder();
36             using (FileStream fs = new FileStream(file, FileMode.Open))
37                 foreach(byte b in algo.ComputeHash(fs))
38                     sb.Append(b.ToString("x2"));
39             WriteObject(sb.ToString());
40         }
41         else
42         {
43             string s = String.Format("Algorithm {0} not found.", algorithm);
44             ErrorRecord err = new ErrorRecord(new ArgumentException(s), s, ErrorCategory.InvalidArgument, null);
45             WriteError(err);
46         }
47     }
48 }
49
50 [RunInstaller(true)]
51 public class HashSnapin : PSSnapIn
52 {
53     public override string Name { get { return "FileHasher"; } }
54     public override string Vendor { get { return "Bart"; } }
55     public override string Description { get { return "Computes file hashes."; } }
56 }

A few remarks:

  • The second parameter, FullName, can be taken from the pipeline (ValueFromPipelineByPropertyName set to true). The reason for the chosen name "FullName" is the fact that a System.IO.FileInfo object has such a property name, and we want that property name to match our property name.
  • Using aliases, the second parameter can be supplied using File or Name too.
  • Line 39 writes the string representation of the hash to the pipeline using WriteObject. Alternatively, one might also output the byte array retrieved from ComputeHash (line 37), but that would be less user-friendly in my opinion.
  • In case an unknown algorithm is specified, we construct an ErrorRecord object that's passed on to the shell using WriteError.
  • Stuff on lines 50 to 56 creates a simple snap-in that's used to make the cmdlet available.
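To try the hashing core outside of PowerShell, here is a minimal sketch of the same idea as a plain console program. The HashSketch class and its method names are made up for this illustration and are not part of the download; it assumes the name-based HashAlgorithm.Create factory used in the cmdlet, which newer .NET releases may flag as obsolete.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of the original download.
public static class HashSketch
{
    // Formats each byte as two lowercase hex digits, like line 38 of the cmdlet.
    public static string ToHex(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder(bytes.Length * 2);
        foreach (byte b in bytes)
            sb.Append(b.ToString("x2"));
        return sb.ToString();
    }

    // Returns the hex digest of the stream, or null when the algorithm name is unknown.
    public static string HashStream(string algorithm, Stream input)
    {
        HashAlgorithm algo = HashAlgorithm.Create(algorithm);
        if (algo == null)
            return null;
        using (algo)
            return ToHex(algo.ComputeHash(input));
    }

    public static void Main()
    {
        byte[] abc = Encoding.ASCII.GetBytes("abc");
        // MD5 of the three ASCII bytes "abc".
        Console.WriteLine(HashStream("MD5", new MemoryStream(abc)));  // 900150983cd24fb0d6963f7d28e17f72
        // An unknown name yields null, which the cmdlet turns into an ErrorRecord.
        Console.WriteLine(HashStream("bla", new MemoryStream(abc)) == null);  // True
    }
}
```

Returning null for an unknown algorithm mirrors the check on line 33; the cmdlet reports that case through WriteError instead.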

 

Compilation and installation instructions

Download the code and execute the following steps on a VS2005 command line:

  • Make sure the PATH environment variable has csc.exe and installutil.exe on it (set PATH=%PATH%;%windir%\Microsoft.NET\Framework\v2.0.50727)
  • Save the file as hashcmdlet.cs
  • Copy System.Management.Automation from the GAC (%windir%\assembly\GAC_MSIL\System.Management.Automation\<version_pktoken>\System.Management.Automation.dll) to the current folder with the hashcmdlet.cs code file
  • Execute csc /t:library /r:System.Management.Automation.dll hashcmdlet.cs
  • Install the snap-in using installutil -i hashcmdlet.dll

 

Demo

Below you can see a sample of our get-hash cmdlet:

  1 PS C:\temp> add-pssnapin filehasher
  2
  3 PS C:\temp> type extfi.ps1xml
  4 <Types>
  5   <Type>
  6     <Name>System.IO.FileInfo</Name>
  7     <Members>
  8       <ScriptProperty>
  9         <Name>MD5</Name>
 10         <GetScriptBlock>
 11           get-hash md5 $this
 12         </GetScriptBlock>
 13       </ScriptProperty>
 14       <ScriptProperty>
 15         <Name>SHA1</Name>
 16         <GetScriptBlock>
 17           get-hash sha1 $this
 18         </GetScriptBlock>
 19       </ScriptProperty>
 20     </Members>
 21   </Type>
 22 </Types>
 23
 24 PS C:\temp> update-typedata extfi.ps1xml
 25
 26 PS C:\temp> dir *.cs | gm -type P*
 27
 28    TypeName: System.IO.FileInfo
 29
 30 Name              MemberType     Definition
 31 ----              ----------     ----------
 32 PSChildName       NoteProperty   System.String PSChildName=bar.cs
 33 PSDrive           NoteProperty   System.Management.Automation.PSDriveInfo PS...
 34 PSIsContainer     NoteProperty   System.Boolean PSIsContainer=False
 35 PSParentPath      NoteProperty   System.String PSParentPath=Microsoft.PowerS...
 36 PSPath            NoteProperty   System.String PSPath=Microsoft.PowerShell.C...
 37 PSProvider        NoteProperty   System.Management.Automation.ProviderInfo P...
 38 Attributes        Property       System.IO.FileAttributes Attributes {get;set;}
 39 CreationTime      Property       System.DateTime CreationTime {get;set;}
 40 CreationTimeUtc   Property       System.DateTime CreationTimeUtc {get;set;}
 41 Directory         Property       System.IO.DirectoryInfo Directory {get;}
 42 DirectoryName     Property       System.String DirectoryName {get;}
 43 Exists            Property       System.Boolean Exists {get;}
 44 Extension         Property       System.String Extension {get;}
 45 FullName          Property       System.String FullName {get;}
 46 IsReadOnly        Property       System.Boolean IsReadOnly {get;set;}
 47 LastAccessTime    Property       System.DateTime LastAccessTime {get;set;}
 48 LastAccessTimeUtc Property       System.DateTime LastAccessTimeUtc {get;set;}
 49 LastWriteTime     Property       System.DateTime LastWriteTime {get;set;}
 50 LastWriteTimeUtc  Property       System.DateTime LastWriteTimeUtc {get;set;}
 51 Length            Property       System.Int64 Length {get;}
 52 Name              Property       System.String Name {get;}
 53 MD5               ScriptProperty System.Object MD5 {get=get-hash md5 $this;}
 54 Mode              ScriptProperty System.Object Mode {get=$catr = "";...
 55 SHA1              ScriptProperty System.Object SHA1 {get=get-hash sha1 $this;}
 56
 57 PS C:\temp> dir *.cs | format-table Name,MD5,SHA1
 58
 59 Name                  MD5                        SHA1
 60 ----                  ---                        ----
 61 bar.cs                d541e9719077844ba1fa136... 8662e86f3302578a59da5e...
 62 downloadfilecmdlet.cs 0c74a0c905f3b1cd6e22d52... ab3c4dcee4f9e3c48daded...
 63 hashcmdlet.cs         41b01139d6168df3f3cec13... dd478c60f77b19b64fa0d7...
 64 test.cs               477405d2be4a8f327d39a01... c632fe67a71baa0f333675...
 65
 66 PS C:\temp> dir *.cs | format-list Name,MD5,SHA1
 67
 68 Name : bar.cs
 69 MD5  : d541e9719077844ba1fa13626f5122cb
 70 SHA1 : 8662e86f3302578a59da5e9c936b69ab0d4ff9aa
 71
 72 Name : downloadfilecmdlet.cs
 73 MD5  : 0c74a0c905f3b1cd6e22d52831b92b31
 74 SHA1 : ab3c4dcee4f9e3c48daded97f01ee01e8c572a2a
 75
 76 Name : hashcmdlet.cs
 77 MD5  : 41b01139d6168df3f3cec13b9663e633
 78 SHA1 : dd478c60f77b19b64fa0d7c62944ec1b948419c9
 79
 80 Name : test.cs
 81 MD5  : 477405d2be4a8f327d39a015db255fdf
 82 SHA1 : c632fe67a71baa0f333675f5cdc16fc547772c33
 83
 84 PS C:\temp> get-hash MD5 bar.cs
 85 d541e9719077844ba1fa13626f5122cb
 86
 87 PS C:\temp> get-hash SHA1 bar.cs
 88 8662e86f3302578a59da5e9c936b69ab0d4ff9aa
 89
 90 PS C:\temp> get-hash SHA256 bar.cs
 91 a1e6764cf77d02804e909427aff62ade6b9894924a69284f3d83fd0d2904548b
 92
 93 PS C:\temp> get-hash SHA384 bar.cs
 94 875bde9e789f88e76aa9fe18f82adc8a8beb920cdf1d50692a0b9473ecc296a750e888844a184d7
 95 6e610d434a3bec3a5
 96
 97 PS C:\temp> get-hash SHA512 bar.cs
 98 f88b40bd4618dcb99e17af14c7f2368b00ea55a9b7d6d71e73d519e197ca3d6c3847fd46834fb1e
 99 c4acb9c45729441ac76611de2f7f86b032b59e1a3b7384a3a
100
101 PS C:\temp> get-hash bla bar.cs
102 get-hash : Algorithm bla not found.
103 At line:1 char:9
104 + get-hash <<<< bla bar.cs

This sample is available in the download as well. Let's explain it in a bit more detail:

  • On line 1, the snap-in (which should have been installed using installutil -i hashcmdlet.dll in the previous section) is loaded using add-pssnapin.
  • Next, we leverage the power of the ETS (Extended Type System) to add two script properties, MD5 and SHA1, to the FileInfo object. To do this, write an XML file which contains lines 4 to 22 and save it as a .ps1xml file. This file is loaded using update-typedata on line 24. This file is available in the download as well.
  • Now notice that FileInfo is extended with the MD5 and SHA1 ScriptProperty members, as shown by a get-member invocation on the output of get-childitem *.cs (assuming you have .cs files in your c:\temp folder). Lines 53 and 55 contain the newly added properties.
  • To use these script properties, observe the commands invoked in lines 57 and 66. These use format-table and format-list to visualize the additional properties.
  • Of course you can invoke get-hash directly too, as shown on lines 84, 87, 90, 93, 97 and 101. All supported algorithms are illustrated.

One drawback of our cmdlet is that it can't report progress when a hash operation takes a while, which is especially noticeable for larger files. So use the MD5 and SHA1 script properties with care, and prefer an explicit call to get-hash when you need to calculate a hash.
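One possible way to lift this limitation, sketched here and not part of the download, is to feed the file to the hash algorithm in chunks using TransformBlock and TransformFinalBlock, so there is a natural point to report progress after every chunk. The ChunkedHashSketch class, the HashProgress delegate and the chunk size are assumptions made for this illustration.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical callback type for this sketch: bytes hashed so far and total bytes (-1 if unknown).
public delegate void HashProgress(long bytesDone, long bytesTotal);

public static class ChunkedHashSketch
{
    // Hashes the stream chunk by chunk and invokes the callback after each chunk.
    public static byte[] ComputeHash(HashAlgorithm algo, Stream input, int chunkSize, HashProgress report)
    {
        byte[] buffer = new byte[chunkSize];
        long total = input.CanSeek ? input.Length : -1;
        long done = 0;
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Feed this chunk to the running hash computation.
            algo.TransformBlock(buffer, 0, read, buffer, 0);
            done += read;
            if (report != null)
                report(done, total);
        }
        // Finish with an empty final block; the digest is then available in Hash.
        algo.TransformFinalBlock(buffer, 0, 0);
        return algo.Hash;
    }

    public static void Main()
    {
        byte[] data = new byte[100000];
        using (SHA1 sha = SHA1.Create())
        using (MemoryStream ms = new MemoryStream(data))
        {
            byte[] hash = ComputeHash(sha, ms, 32768,
                delegate(long d, long t) { Console.WriteLine("{0} of {1} bytes hashed", d, t); });
            Console.WriteLine(BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant());
        }
    }
}
```

Inside a cmdlet, the callback could build a ProgressRecord and pass it to WriteProgress instead of writing to the console.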


Two little-known features of ILASM (the IL assembler for .NET) are #define and .typedef which can reduce typing significantly, just as these do in a classic programming language. Often people do "round-tripping", i.e. they write an application in C#, ildasm it, make some slight modification, and ilasm it again. Now, in such a cycle you won't see all of the features of IL, such as #define and .typedef. So, if you're interested in IL, read on.

A few weeks ago I was creating a simple sample of "IL from scratch", meaning you take Notepad, write some IL and feed it to the ILASM tool. This was what I produced:

 1 .assembly extern mscorlib {}
 2 .assembly Demo {}
 3
 4 .namespace Bar.Foo
 5 {
 6     .class Demo
 7     {
 8         .field private static class Bar.Foo.Base printer
 9
10         .method public static void Main()
11         {
12             .entrypoint
13             ldstr "Hello World"
14             newobj instance void Bar.Foo.Test::.ctor(string)
15             stsfld class Bar.Foo.Base Bar.Foo.Demo::printer
16             ldsfld class Bar.Foo.Base Bar.Foo.Demo::printer
17             call instance void Bar.Foo.Test::Print()
18             ret
19         }
20     }
21
22     .class abstract Base
23     {
24         .field family string name
25
26         .method public void .ctor(string name)
27         {
28             ldarg.0 //this
29             ldarg.1 //name
30             stfld string Bar.Foo.Base::name
31             ret
32         }
33
34         .method abstract virtual family void Print()
35         {
36         }
37     }
38
39     .class sealed Test extends Bar.Foo.Base
40     {
41         .method public void .ctor(string name)
42         {
43             ldarg.0 //this
44             ldarg.1 //name
45             call instance void Bar.Foo.Base::.ctor(string)
46             ret
47         }
48
49         .method public virtual void Print()
50         {
51             .override Bar.Foo.Base::Print
52             ldarg.0 //this
53             ldfld string Bar.Foo.Base::name
54             call [mscorlib]System.Console::WriteLine(string)
55             ret
56         }
57     }
58 }

As you can guess, this piece of code illustrates object-orientation in MSIL by creating some types with virtual methods, overriding, a family (~ protected) field, etc. However, there's quite a lot of typing to be done. For example, look at lines 15 and 16, where the printer field is referenced including its type and its location. Basically, there is no such thing as namespaces in IL: line 4 is just syntactic sugar to say that all the classes below (lines 6, 22 and 39) have to be prefixed with <namespace_name>., yielding Bar.Foo.Demo, Bar.Foo.Base and Bar.Foo.Test. The same rule leads to things like line 54, where the assembly (mscorlib), the namespace (System.), the class (Console) and the method (::WriteLine) plus its arguments ((string)) have to be written down to call the right overload. It would be much nicer to abbreviate these if you have to type them a lot.
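For readers who are more at home in C#, the IL above corresponds roughly to the following sketch. It is an approximation I'm adding for orientation only: the IL declares Print as family in Base and public in Test, which C# doesn't allow for an override, so Print is public in both classes here.

```csharp
using System;

namespace Bar.Foo
{
    public abstract class Base
    {
        // Corresponds to ".field family string name".
        protected string name;

        public Base(string name)
        {
            this.name = name;
        }

        // family in the IL; made public here because a C# override
        // cannot change the accessibility of the method it overrides.
        public abstract void Print();
    }

    public sealed class Test : Base
    {
        public Test(string name) : base(name)
        {
        }

        public override void Print()
        {
            Console.WriteLine(name);
        }
    }

    public static class Demo
    {
        // Corresponds to ".field private static class Bar.Foo.Base printer".
        private static Base printer;

        public static void Main()
        {
            printer = new Test("Hello World");
            printer.Print();  // prints "Hello World"
        }
    }
}
```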

#define will help us to reduce typing on lines 15 and 16. It's a precompiler directive that drives string replacement in what follows (just like the #ifdef...#else...#endif is a typical control directive). So, we can do something like this:

#define PRINTER "class Bar.Foo.Base Bar.Foo.Demo::printer"

.typedef is used for module-wide aliases. It allows you to write things like:

.typedef [mscorlib]System.Console as Console
.typedef method void Console::WriteLine(string) as WriteLine

So, we can just write call WriteLine instead of call [mscorlib]System.Console::WriteLine(string). The difference with a similar definition using #define is that aliases are more restrictive and can only be used for classes, methods, fields and attributes. Also, #define has no meaning to the MSIL compiler itself, since it is handled in the pre-compilation phase. Aliases, on the other hand, are part of the MSIL language and result in metadata, which means they can survive round-tripping. In other words, an alias defined in a piece of IL will remain there even after an ILASM-ILDASM cycle, whereas a #define will vanish during this process.

The end-result looks like this:

#define PRINTER "class Bar.Foo.Base Bar.Foo.Demo::printer"

.typedef [mscorlib]System.Console as Console
.typedef method void Console::WriteLine(string) as WriteLine

.assembly extern mscorlib {}
.assembly Demo {}

.namespace Bar.Foo
{
    .class Demo
    {
        .field private static class Bar.Foo.Base printer

        .method public static void Main()
        {
            .entrypoint
            ldstr "Hello World"
            newobj instance void Bar.Foo.Test::.ctor(string)
            stsfld PRINTER
            ldsfld PRINTER
            call instance void Bar.Foo.Test::Print()
            ret
        }
    }

    .class abstract Base
    {
        .field family string name

        .method public void .ctor(string name)
        {
            ldarg.0 //this
            ldarg.1 //name
            stfld string Bar.Foo.Base::name
            ret
        }

        .method abstract virtual family void Print()
        {
        }
    }

    .class sealed Test extends Bar.Foo.Base
    {
        .method public void .ctor(string name)
        {
            ldarg.0 //this
            ldarg.1 //name
            call instance void Bar.Foo.Base::.ctor(string)
            ret
        }

        .method public virtual void Print()
        {
            .override Bar.Foo.Base::Print
            ldarg.0 //this
            ldfld string Bar.Foo.Base::name
            call WriteLine
            ret
        }
    }
}

In the name of all IL freaks, enjoy MSIL 2.0!


Only a couple of days ago I posted about creating a file downloader cmdlet in Windows PowerShell which contained the following little sentence:

One could make this method more complex in order to provide a seconds remaining estimate based on the download speed observed.

One of my readers (dotnetjunkie) was kind enough to leave a piece of feedback:

I think you should add transfer speed and estimated remaining time, that would make it even more useful and cooler ! ;)

Well, I couldn't agree more. So I revisited the code I wrote some weeks ago (on the 12th of November, to be precise) and added download speed tracking and a seconds remaining indicator.

There are undoubtedly different ways to implement such an estimate. My approach is to measure the number of bytes transferred during intervals of approximately 5 seconds (a kind of "instant download speed") and to derive the estimated time remaining from this. I'll discuss the code changes in a few steps.

Step 1 - Add a few private members

We need 4 additional members to produce statistics:

  • The first one is a Stopwatch to measure elapsed time. We'll poll this counter every time the DownloadProgressChanged event is fired. In case it's above 5 seconds, we update the statistics and restart the counter.
  • Secondly, we'll cache an indicator that keeps the current transfer speed as a string of the format n [bytes|KB|MB|GB|...]/sec. This value will be updated every 5 seconds.
  • Next, the number of seconds remaining is kept. It is recalculated every 5 seconds; in between, it simply counts down by one each second.
  • Finally, a transferred bytes indicator is used to calculate the number of bytes transferred during the last 5 seconds.

Here's the piece of code:

/// <summary>
/// Stopwatch used to measure download speed.
/// </summary>
private Stopwatch sw = new Stopwatch();

/// <summary>
/// Bytes per second indicator (bytes/sec, KB/sec, MB/sec, ...).
/// </summary>
private string bps = null;

/// <summary>
/// Seconds remaining indicator.
/// </summary>
private int secondsRemaining = -1;

/// <summary>
/// Number of bytes already transferred.
/// </summary>
private long transferred = 0;

Step 2 - Let the count begin

In the ProcessRecord method we start our Stopwatch; just that:

//
// Check validity for download. Will throw an exception in case of transport protocol errors.
//
using (clnt.OpenRead(_url)) { }

//
// Start download speed stopwatch.
//
sw.Start();

//
// Download the file asynchronously. Reporting will happen through events on background threads.
//
clnt.DownloadFileAsync(_url, _file);

Step 3 - Calculate stats and report progress

Time for the real stuff. On to the DownloadProgressChanged event handler. When we observe that the Stopwatch has an elapsed time of 5 or more seconds, we'll stop it, update stats and restart it. The code is shown below:

 1 /// <summary>
 2 /// Reports download progress.
 3 /// </summary>
 4 private void webClient_DownloadProgressChanged(object sender, DownloadProgressChangedEventArgs e)
 5 {
 6     //
 7     // Update statistics every 5 seconds (approx).
 8     //
 9     if (sw.Elapsed >= TimeSpan.FromSeconds(5))
10     {
11         sw.Stop();
12
13         //
14         // Calculate transfer speed.
15         //
16         long bytes = e.BytesReceived - transferred;
17         double bps = bytes * 1000.0 / sw.Elapsed.TotalMilliseconds;
18         this.bps = BpsToString(bps);
19
20         //
21         // Estimated seconds remaining based on the current transfer speed.
22         //
23         secondsRemaining = (int)((e.TotalBytesToReceive - e.BytesReceived) / bps);
24
25         //
26         // Restart stopwatch for next 5 seconds.
27         //
28         transferred = e.BytesReceived;
29         sw.Reset();
30         sw.Start();
31     }
32
33     //
34     // Construct a ProgressRecord with download state information and, once available, a completion time estimate (SecondsRemaining < 0 means no estimate).
35     //
36     ProgressRecord pr = new ProgressRecord(0, String.Format("Downloading {0}", _url.ToString(), _file), String.Format("{0} of {1} bytes transferred{2}.", e.BytesReceived, e.TotalBytesToReceive, this.bps != null ? String.Format(" (@ {0})", this.bps) : ""));
37     pr.CurrentOperation = String.Format("Destination file: {0}", _file);
38     pr.SecondsRemaining = secondsRemaining - (int)sw.Elapsed.Seconds;
39     pr.PercentComplete = e.ProgressPercentage;
40
41     //
42     // Report availability of a ProgressRecord item. Will cause the while-loop's body in ProcessRecord to execute.
43     //
44     lock (pr_sync)
45     {
46         this.pr = pr;
47         prog.Set();
48     }
49 }

So, what's going on here? Basically we want to provide a seconds remaining estimate on line 38 and a download speed estimate on line 36. This should be pretty self-explanatory. The real work happens in lines 11 to 30, where the number of bytes transferred in the last 5 seconds is obtained and divided by the number of milliseconds elapsed during that interval (which should be around 5000, obviously). The rest is maths, except for the BpsToString call, which is shown below.
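To make the arithmetic concrete, here is a small stand-alone sketch of the two calculations with made-up figures. The TransferEstimateSketch class and its method names are hypothetical, and the guard against a zero transfer speed is an addition of this sketch, not something the event handler above does.

```csharp
using System;

// Hypothetical helper illustrating the arithmetic of lines 16 to 23.
public static class TransferEstimateSketch
{
    // Bytes per second over one measurement interval.
    public static double BytesPerSecond(long bytesInInterval, double elapsedMilliseconds)
    {
        return bytesInInterval * 1000.0 / elapsedMilliseconds;
    }

    // Seconds left at the given speed, or -1 (no estimate) when nothing arrived in the interval.
    public static int SecondsRemaining(long totalBytes, long receivedBytes, double bytesPerSecond)
    {
        if (bytesPerSecond <= 0)
            return -1;
        return (int)((totalBytes - receivedBytes) / bytesPerSecond);
    }

    public static void Main()
    {
        // Made-up figures: 5 MB arrived during the last 5000 ms of a 100 MB download,
        // of which 10 MB has been received so far.
        double bps = BytesPerSecond(5 * 1024 * 1024, 5000.0);
        int left = SecondsRemaining(100L * 1024 * 1024, 10L * 1024 * 1024, bps);
        Console.WriteLine("{0} bytes/sec, about {1} seconds remaining", bps, left);
        // prints: 1048576 bytes/sec, about 90 seconds remaining
    }
}
```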

Step 4 - A download speed indicator

BpsToString is the method to convert the bytes per second rate to a friendly string representation:

/// <summary>
/// Constructs a download speed indicator string.
/// </summary>
/// <param name="bps">Bytes per second transfer rate.</param>
/// <returns>String representation of the transfer rate in bytes/sec, KB/sec, MB/sec, etc.</returns>
private string BpsToString(double bps)
{
    string[] m = new string[] { "bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB" }; //dreaming of YB/sec
    int i = 0;
    while (bps >= 0.9 * 1024)
    {
        bps /= 1024;
        i++;
    }

    return String.Format("{0:0.00} {1}/sec", bps, m[i]);
}

I think the code fragment above is pretty optimistic as far as transfer speeds are concerned, but with the expected lifetime of PowerShell in mind this should be no luxury :-).
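The 0.9 * 1024 threshold means the unit switches as soon as a value reaches 90% of the next unit, so 1000 bytes/sec is shown as 0.98 KB/sec instead of 1000.00 bytes/sec. The sketch below reproduces that behavior in a stand-alone form so you can try a few values. SpeedFormatSketch and Describe are names invented for this illustration; the sketch formats with the invariant culture to get predictable output and bounds the unit index so extremely large inputs cannot run past the array.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-alone variant of BpsToString for experimenting with the threshold.
public static class SpeedFormatSketch
{
    private static readonly string[] Units =
        { "bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB" };

    public static string Describe(double bps)
    {
        int i = 0;
        // Move to the next unit once the value reaches 90% of it (921.6 of the current unit).
        while (bps >= 0.9 * 1024 && i < Units.Length - 1)
        {
            bps /= 1024;
            i++;
        }
        return String.Format(CultureInfo.InvariantCulture, "{0:0.00} {1}/sec", bps, Units[i]);
    }

    public static void Main()
    {
        Console.WriteLine(Describe(900));      // 900.00 bytes/sec
        Console.WriteLine(Describe(1000));     // 0.98 KB/sec
        Console.WriteLine(Describe(1048576));  // 1.00 MB/sec
    }
}
```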

Step 5 - The result & code download

This is the result (needless to say the figures are indicative only; they are estimates after all):

And here's the code download link.
