Monday, May 7, 2012

Defendant argues he has expectation of privacy in SHA-1 values. Wait, what?

In State v. Daigle, 2012 La. App. LEXIS 573 (2012), the defendant appealed the denial of a motion to suppress evidence of child pornography obtained through the use of the Gnutella network. His argument began with an assertion of privacy already foreclosed by both United States v. Stults, 575 F.3d 834 (8th Cir. 2009) and United States v. Ganoe, 538 F.3d 1117 (9th Cir. 2008). Both cases held that an individual does not have a reasonable expectation of privacy in the files that they share off their computer, because the whole of the Internet is free to view and download those files. However, in his next argument he states that even if the above is true, he had an expectation of privacy in the SHA values of the files:
Defendant next claims that he had an expectation of privacy in the SHA values for his files as SHA means Secure Hash Algorithm. By its very name, it implies an expectation of privacy. Moreover, Defendant had an expectation of privacy because his files were encrypted and firewall-protected. Defendant equates Detective Gremillion's viewing the SHA values for his files to a law enforcement officer climbing a fence to look inside someone's window. 
A little tech speak for a second - SHA-1 is a mathematical algorithm that transforms the contents of a file into a 160-bit value, usually written as a hexadecimal string 40 characters long. So, for example, the SHA-1 hash of the string "justin" is 0ce7911e6479995d6c346d6f03eb723b5135309e. The hash is non-reversible, and the likelihood of two sets of data having the same SHA-1 hash by accident is essentially nil. This mathematical algorithm can be run against any data set to generate the hash, and it is often run against files to ensure that they have not been modified in transit or altered from the original. So if I want to send a file to Jeffrey, I hash the file on my side, send it to him with the hash, and then when he gets it, he hashes what he received and checks the result against what I gave him to make sure he received exactly what I intended.
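
For the curious, the send-and-verify routine just described can be sketched in a few lines of Python using the standard library's hashlib module. This is only an illustration: the function names (sha1_hex, sha1_of_file, verify_transfer) are made up for this example and do not come from the case or from any particular tool.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of some bytes as 40 hexadecimal characters."""
    return hashlib.sha1(data).hexdigest()


def sha1_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_transfer(received: bytes, expected_hex: str) -> bool:
    """Recompute the hash on the receiving side and compare it to the sender's."""
    return sha1_hex(received) == expected_hex.strip().lower()


if __name__ == "__main__":
    original = b"contents of the file I am sending to Jeffrey"
    sent_hash = sha1_hex(original)          # computed on my side
    print(len(sent_hash))                   # 40 hexadecimal characters
    print(verify_transfer(original, sent_hash))         # unmodified copy
    print(verify_transfer(original + b"!", sent_hash))  # altered copy
```

Changing even a single byte of the data produces a completely different digest, which is why the comparison on Jeffrey's side catches any modification.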

In this case, the officer used the "Wyoming Tool Kit" to interact with the Gnutella network just as a client would, but instead of sharing files for sharing's sake, the tool kit runs SHA-1 hashes against files that are being shared to determine if they are child pornography. If a hash matches a value in a database of SHA-1 hashes of known child pornography, law enforcement typically stops, identifies the user, and then explores all of the files that the individual is sharing to determine if there is even more child pornography.
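
The matching step is conceptually nothing more than a set lookup. Here is a minimal sketch, assuming the shared files are available as bytes and the database is a simple set of hex digests. It is not the Wyoming Tool Kit's actual code, which I have not seen; known_hashes, shared, and flag_matches are hypothetical names, and the data is harmless placeholder text.

```python
import hashlib


def flag_matches(shared, known_hashes):
    """Return the names of shared files whose SHA-1 digest is in the known set.

    shared       -- dict mapping a file name to that file's bytes (hypothetical)
    known_hashes -- set of 40-character lowercase hex digests (hypothetical)
    """
    matches = []
    for name, data in shared.items():
        digest = hashlib.sha1(data).hexdigest()
        if digest in known_hashes:
            matches.append(name)
    return sorted(matches)


if __name__ == "__main__":
    # Placeholder "database": the digest of one piece of sample data.
    known_hashes = {hashlib.sha1(b"sample flagged data").hexdigest()}

    # Placeholder "shared folder" as seen by any other client on the network.
    shared = {
        "song.mp3": b"some unrelated data",
        "renamed.dat": b"sample flagged data",  # the name does not matter
    }
    print(flag_matches(shared, known_hashes))
```

Note that the lookup depends only on the contents of the file, so renaming a file does nothing to change its hash.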

Now, with respect to mere file sharing, a few circuits have already ruled that you have no expectation of privacy in files that you share out into the peer-to-peer world - see Stults and Ganoe above. Since law and analogies go hand-in-hand, I'll analogize what I believe the courts have said in this regard. By sharing your files out on the Gnutella network using programs such as LimeWire, BearShare, Phex, etc., you are essentially stopping on the side of the road at a busy intersection, putting up a table, and handing out free CDs, software, and pornography. Anyone is free to stop and ask you if you have a particular media file, and if you do, they can take a copy. You have a million copies on-site, so no big deal - you can always get more. And, if the person likes your tastes and wants to see more of what you have, you allow them to rifle through your entire collection in the back of your van, and take whatever they would like from there, too.

In the analogy above, a police officer would violate no reasonable expectation of privacy because what you were offering at the table was in plain view, and what you had in the back of the van was legally searchable because you had offered blanket consent allowing anyone to look inside and rifle around. While it is often hard to make analogies to how cyberspace works, in this instance, it is actually pretty easy. And the legal basis is solid - you have a whole lot of consent, plain view, and one could argue the third-party doctrine as well.

After reading the above, it should be clear why the SHA-1 argument the defendant uses is befuddling and technically inaccurate. First, SHA-1 has nothing to do with encryption in this case, but merely with verifying what a file is, which works as a kind of non-repudiation - i.e. the value of the hash makes it effectively impossible for you to argue the file isn't what they are alleging, because for all practical purposes no two different files have the same value. Additionally, using the Wyoming Tool Kit to obtain these hashes does nothing more than obtain his data and run this algorithm against it, which is functionally the same as mere file sharing.

While the "S" in SHA does stand for "secure," the defendant is arguing from a position of "huh." The SHA-1 values are post hoc values generated from data, not values that existed on the defendant's computer and were subject to a reasonable expectation of privacy. In essence, he cannot have privacy in a hash that was not calculated on his computer, never existed there, and even if it did, contained no more than 40 characters of hexadecimal garbage. SHA-1 is not encryption, it's hashing - the hash is not the file, and the file is not the hash. The analogy at the end of the quoted argument above should not be hopping a fence and looking in a window, but instead: taking the free media you got from the guy with the van on the side of the road and checking to see if it is what it is purported to be - an exact replica - or if he was a fibber, and it is nothing of the sort.
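
To make the "hash is not the file" point concrete, here is a short sketch (again plain Python with hashlib, with a made-up helper name, digest_summary) showing that the digest is the same fixed size whether the input is one byte or a megabyte. A 40-character value cannot contain the contents of an arbitrarily large file.

```python
import hashlib


def digest_summary(data: bytes):
    """Return (input size in bytes, digest length in hex characters)."""
    return len(data), len(hashlib.sha1(data).hexdigest())


if __name__ == "__main__":
    tiny = b"x"
    large = b"x" * (1024 * 1024)  # one megabyte of filler data

    print(digest_summary(tiny))
    print(digest_summary(large))
    # Both digests are 40 hex characters (160 bits), regardless of input size.
```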
