Plagiarize From Behind The Paywall


From http://davideharrington.com/?p=594

To test Turnitin’s crawlers, I uploaded the document containing the New York Times articles to my website a few months ago. Google now matches many of the plagiarized phrases from Shoplifting to the New York Times articles on my website and some of the phrases to articles in the archives of the paper. Google also matches them to Shoplifting itself, which has been scanned into Google Books.

Turnitin fails to match the plagiarized phrases to any of these sources. I e-mailed Turnitin’s help desk, essentially asking, “What’s going on? Why can’t Turnitin find these things?”

A few hours later, a guy at Turnitin’s product support sent me a detailed answer that boils down to three basic points—the Internet is a big place and it takes our crawlers time to scan it; we can’t scan the New York Times because it requires a subscription; and, we can’t scan images of text like those used by Google Books. In other words, our crawlers are puny compared to Google’s.

Steal This Computer Book 4.0

If a student plagiarizes from a source that is not freely available on the open web, their chances of getting caught (at least via TurnItIn) are smaller. This includes content published on Google Books, which is served as scanned images of text. However, books published on Project Gutenberg, which are text-based and not behind a paywall, would be found by TurnItIn.

There are other valid reasons not to use TurnItIn. TurnItIn demands the right to use student-generated content for free in perpetuity, which is pretty offensive. But the fact that TurnItIn isn't effective at detecting plagiarism from a large number of sources (in other words, it doesn't work as well as it claims) is another good reason not to subject students to this level of scrutiny and IP theft as a precondition for learning.

UPDATE: TurnItIn responded via Twitter that "Turnitin has tons of subscription content from pubs, journals, and library databases, but not everything."

I have requested some additional info about TurnItIn's criteria for indexing and not indexing content. Stay tuned.

UPDATE 2: TurnItIn responded: "We have a team that manages our content partnerships, but can't reveal all of'm. See https://turnitin.com/static/products/content.php"

Image Credit: "Steal This Computer Book 4.0" taken by Jordan and Lee, published under an Attribution-NonCommercial-ShareAlike license.
