Sunday, October 6, 2013

Federal Ct. in web scraping case: accusations of "hacking" and "theft" could be defamatory, but privileged under facts


Can accusing someone of harvesting data from a publicly accessible webpage, by referring to that conduct as "hacking" and/or "theft," be a defamatory statement? Under the facts noted below, a federal court just said "yes," but ultimately found the statements privileged. There is an interesting discussion in the opinion about "protecting" website data with an exclusion in robots.txt (although, as an aside, robots.txt doesn't protect much of anything), and whether that choice to exclude makes any legal difference. The court also discusses the unsettled nature of CFAA law at the time the statement was made; to the court, the muddled precedent regarding whether scraping public web data was a CFAA violation was germane to determining if an accusation of "hacking" was accurate (i.e., whether a legal cause of action under the CFAA could be sustained).

As an initial matter, here is Merriam-Webster Online's definition of "hack":
intransitive verb
...
4
a :  to write computer programs for enjoyment
b :  to gain access to a computer illegally
noun (1)
...
6
:  a usually creative solution to a computer hardware or programming problem or limitation 
hack 1  (hăk)
v. hacked, hack·ing, hacks
v.tr.
...
3.
a. Informal To alter (a computer program): hacked her text editor to read HTML.
b. To gain access to (a computer file or network) illegally or without authorization: hacked the firm's personnel database.

v.intr.

a. To write or refine computer programs skillfully.
b. To use one's skill in computer programming to gain illegal or unauthorized access to a file or network: hacked into the company's intranet.
...
The American Heritage® Dictionary of the English Language, Fourth Edition copyright ©2000 by Houghton Mifflin Company. Updated in 2009. Published by Houghton Mifflin Company. All rights reserved. 
hack1
vb
...
7. (Electronics & Computer Science / Computer Science) to manipulate a computer program skilfully, esp, to gain unauthorized access to another computer system
...
Collins English Dictionary – Complete and Unabridged © HarperCollins Publishers 1991, 1994, 1998, 2000, 2003
And from the Oxford English Dictionary, Copyright © 2013 Oxford University Press
"hacking"
1.
...
 d. The use of a computer for the satisfaction it gives; the activity of a hacker (hacker n. 3). colloq. (orig. U.S.).
1976   J. Weizenbaum Computer Power & Human Reason iv. 118   The compulsive programmer spends all the time he can working on one of his big projects. ‘Working’ is not the word he uses; he calls what he does ‘hacking’.
1984   Times 7 Aug. 16/2   Hacking, as the practice of gaining illegal or unauthorized access to other people's computers is called.
1984   Sunday Times 9 Dec. 15/2   Hacking is totally intellectual—nothing goes boom and there are no sparks. It's your mind against the computer.
In Tamburo v. Dworkin, -- F.Supp.2d -- (N.D. Ill. Sept. 26, 2013), Judge Joan B. Gottschall granted the motion for summary judgment filed by Henry (another named defendant); the causes of action against Dworkin were (1) tortious interference with a contractual relationship, (2) tortious interference with prospective economic advantage, (3) defamation per se, and (4) defamation per quod.

The court stated the facts as follows:
The essential facts in this 2004 case are undisputed. Defendant Kristen Henry, a dog breeder and computer programmer, spent almost five years creating an extensive database of dog pedigrees, which she made freely available for use by fellow breeders through her web site. Plaintiffs John Tamburo and Versity Corporation (“Versity”) used an automated web browser to harvest the data from Henry’s website. They incorporated it into software which they attempted to sell to dog breeders for a profit. Henry was outraged. When the plaintiffs spurned her requests to cease using her data, she reached out to the dog breeding community, through emails and online messages, for assistance in responding to the plaintiffs’ misappropriation of her work. This lawsuit arose from her statements.
Henry (defendant) accused Tamburo of "hacking" in a Freerepublic.com article, as well as in an email; Henry also made various statements to a dog enthusiast message board using the words "theft" and "steal." One of the statements read: "[Tamburo] has written an agent robot to go to these individual sites and steal certain files...that were not offered to them except through a query user interface for page by page query of a single dog’s pedigree at a time."

Addressing the defamation allegations, the court analyzed whether the statements were non-actionable because they were either substantially true or protected by privilege. The court first discussed the defendant's use (or lack thereof) of robots.txt, which the court refers to as "the Robot Exclusion Standard." The court stated:
The parties dispute whether Tamburo and Versity evaded security measures to access Bonchien.com. Henry contends that a user could access the data on her site only through a query based search, by entering an individual dog name and the generations of ancestry to be displayed. Tamburo, however, states that the data could also be accessed through the site’s URL. He states in his affidavit that Henry admitted during her deposition that the URL used by the Data Mining Robot to access the web site was plainly visible, and that her allegations that the plaintiffs accessed data from non-public areas of the web site were false. 
Henry states in an affidavit that she did not give Tamburo or Versity express permission to access and gather the data on her website by any automated means, such as the Data Mining Robot. She contends that she placed a “robots.txt” header on the site to keep robots from indexing the site. The robots.txt protocol, or Robot Exclusion Standard, is a convention “to instruct cooperating web crawlers not to access all or part of a website that is publicly viewable. If a website owner uses the robots.txt file to give instructions about its site to web crawlers, and a crawler honors the instruction, then the crawler should not visit any pages on the website. The protocol can also be used to instruct web crawlers to avoid just a portion of the website that is segregated into a separate directory.”
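To make the court's description concrete, here is a minimal sketch of how a "cooperating" crawler might consult robots.txt before fetching a page, using Python's standard urllib.robotparser module. The robots.txt contents, the example.com URLs, the "ExampleBot" user agent, and the may_fetch helper are all hypothetical and chosen for illustration; none of them come from the record in this case. Note that the check is purely voluntary: nothing in the protocol stops a crawler that simply skips it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (an assumption for illustration only;
# the actual file, if any, on the site at issue is not reproduced here).
ROBOTS_TXT = """\
User-agent: *
Disallow: /pedigrees/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A page under the disallowed directory versus a page outside it.
    print(may_fetch(ROBOTS_TXT, "ExampleBot", "https://example.com/pedigrees/dog123"))
    print(may_fetch(ROBOTS_TXT, "ExampleBot", "https://example.com/about.html"))
```

A crawler that honors the convention would call something like may_fetch before each request and skip any URL for which it returns False. A crawler that never makes the call can still retrieve the page, which is why the file works as a request rather than as a security measure.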
As for allegedly defamatory statements regarding stealing of data, the court found them "substantially true." Here is the court's logic:
Tamburo argues that Henry’s statements that he stole from her are false because he did not commit theft. He did not delete or remove data from Henry’s site (thus depriving her of her property), Henry had made her data freely available, and no robots.txt file was visible on her site at the time the Data Mining Robot copied information on the site. According to Tamburo, because the data was not protected, either legally or by security protections on Henry’s web site, he could not have committed theft by appropriating it. 
Even so, the court concludes that no reasonable jury could find that Henry’s statements were not substantially true. ... It may be true that Tamburo could not be prosecuted or held liable for his actions because the data was publicly available and not protected by adequate security measures. But Tamburo’s argument relies on a narrow legal meaning of “theft.” Under Illinois law, the court must consider whether Henry’s use of the word “theft” is reasonably susceptible to a non-defamatory construction. (citation omitted) It is. To a lay person such as Henry, “theft” can also mean the wrongful act of taking the property of another person without permission. The data Henry had collected could be reasonably understood as her property—she had collected it, and it was her work in compiling it that gave it value. She did not give Tamburo permission to copy it and sell access to it. Although Henry might not be able to successfully sue Tamburo for using her data in this way, the gist of her statements was true: he took the data without her permission.
I can't say I agree with this -- the holding, in essence, means that anyone copying and pasting data from another individual's website is "stealing" that data if permission isn't obtained in advance. To me, the choice to post information on the internet, available to anyone in the world, means you assume the risk that your now "public" information will be used by others. You can't steal what is given away for free. And theft normally involves some deprivation of a property interest; what was the website owner deprived of, other than control of the information, control which was given up when it was posted on the web?

Ultimately, the court held the statements were covered by privilege because "they related to her interests in protecting the substantial time and effort spent accumulating her data and in making it freely available to the community of Schepperke breeders, to promote the health of the breed. She also had an interest in ensuring that her data was presented in a certain way and in controlling the manner in which it could be accessed. Furthermore, the statements were published to people who likewise had an interest in the way in which the dog pedigree data was made available, and they involved a public interest in how access to information available on the internet is regulated."

Also, relating to privilege, the court discussed the state of CFAA law at the time the statements were made:
...Henry was a lay person, and the record shows plainly that as of May 4 and 5, 2004, when she made the statements that her data was “stolen,” Henry believed that Tamburo had stolen her data and was attempting to determine whether the law afforded her any protection against that theft. 
Moreover, even had Henry immediately consulted with an attorney, no such actual knowledge that Tamburo’s actions were lawful would have been revealed. Rather, in May 2004, the law governing the automated harvesting of data from web sites was unsettled. For example, a number of courts had held that website owners might have a remedy under the Computer Fraud and Abuse Act (“CFAA”) against defendants who had accessed information on their websites using automated harvesting. (citation omitted) In 2003, the First Circuit reversed a district court that had issued an injunction pursuant to the CFAA against a company using an automated “web scraper” to copy pricing information from a travel website. The district court had relied in part on “the fact that the website was configured to allow ordinary visitors to the site to view only one page at a time.” (citation omitted) The First Circuit disagreed and noted, “It is . . . of some use for future litigation . . . in this circuit to indicate that, with rare exceptions, public website providers ought to say just what non-password protected access they purport to forbid.” ...
The First Circuit’s opinion suggests that it is unlikely Henry could have pursued a CFAA claim, given the state of the law, and Tamburo is correct that a collection of data is normally not subject to copyright protections. See Feist Publ’ns v. Rural Tel. Serv. Co., 499 U.S. 340, 364 (1991) (noting that “copyright rewards originality, not effort”). Even so, further investigation on Henry’s part would not have revealed that Tamburo’s actions were undisputedly legal or illegal. Thus, even if Henry’s lawyer advised her that Tamburo had acted legally and that she did not have a remedy against him, such advice is not dispositive as to whether she abused the qualified privilege in making the statements in question. Henry was entitled to disagree with the lawyer about whether Tamburo had any right to access her database, another lawyer might have held a different opinion, and her statements were made as part of her efforts to seek help in protecting her interests. Thus, the fact that the law has evolved in a way that does not protect Henry’s years of work is not evidence that she made the statements about Tamburo’s theft with “a high degree of awareness of the[ir] probable falsity or entertaining serious doubts as to [their] truth.” (citation omitted)
Finally, addressing hacking, the court stated:
Tamburo argues that Henry’s statement that he committed “hacking” and that he took data from non-public areas of her website are defamatory because they imply illegal activity. He claims that the statements are false because he did not evade any security measures employed on Henry’s site, and no prohibition on robotic browsing was visible on the site.
The statements that Tamburo “wrote an agent robot to take specific files off of specific sites” and that the “files were not in a public venue” are substantially true and thus not actionable. Although Tamburo argues that the files were accessible to him through a URL, it is undisputed that Henry’s site was designed to allow the user to search manually for the pedigree of an individual dog. Nothing in the record indicates that Henry intended to make the entire database available to the public. The “gist” of the statements is therefore true. (citation omitted)
As to the word “hacking,” Henry argues that the term is susceptible to innocent construction because “the term has positive connotations,” implying the development of “a creative solution” to a computer problem. (citation omitted) The innocent construction rule “requires a court to consider the statement in context and give the words of the statement, and any implications arising from them, their natural and obvious meaning.” (citation omitted) Courts “are to interpret the words of the statement as they appear to have been used and according to the idea they were intended to convey to a reader of reasonable intelligence,” and “should avoid straining” to give a term an innocent meaning. (citation omitted) Although Henry proposes that the word “hacking” can be used to convey an innocent meaning, it is clear from the context of her statement that she meant to imply that the way Tamburo accessed her database was unethical or illegal, not “creative.” Thus, the word, as used by Henry, was defamatory.
Even so, the statement is protected by the same qualified privilege that renders Henry’s statements about theft non-actionable. Tamburo has presented no evidence showing that Henry abused the privilege. Although she admitted during her deposition that Tamburo had not evaded any security measures on her site, nothing in the record indicates that, at the time she made the statement about “hacking,” on May 5, 2004, she had serious doubts about the truth of the statement. Rather, the evidence shows that Henry designed her website to make data available to the public through a query search, which would provide information about one dog pedigree at a time. There is no dispute that this was the way Henry intended the site to be used, and that Tamburo instead accessed the site in a way that allowed him to copy Henry’s entire database.

1 comments:

  1. 50 years ago the same thing could have happened:

    1. Henry gathers data over five years, and prints it in a booklet.
    2. She gets an exhibitor table in a dog show, and offers those booklets for free.
    3. Tamburo takes a booklet from her exhibit, reformats the same data into a new booklet, and sells copies at his own exhibitor table at the next dog show.

    The rest would be the same. Henry would be outraged, and would try to interfere with Tamburo's sales by saying (justifiably I think), "He stole my data." Tamburo would see his sales drop, and sue for defamation. The injection of new technology adds nothing to the principles of justice and, I would hope, the law's expression of those principles.


    Were there no precedents in print (and why not)?
