Jan 6, 2006

Datamining 101: Amazon Communist hunt




“Data mining” of all that information and communication is at the heart of the furor over the recent disclosure of government snooping

Using a simple 6-line shell script and the popular wget command line tool, I configured two computers on two different DSL connections to begin downloading all 260,000 wishlists in increments of 25,000. Each group of 25,000 wishlists took about four hours to download, for a total download time of less than one day.

Using a pair of 5-year-old computers, two home DSL connections, 42 hours of computer time, and 5 man hours, I now had documents describing the reading preferences of 260,000 U.S. citizens.
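
The numbers quoted above hang together, and a quick back-of-the-envelope check shows how. The sketch below uses only the figures from the quote (260,000 wishlists, batches of 25,000, about four hours per batch, two machines). It assumes the work was split evenly between the two computers and that the final partial batch took proportionally less time; neither detail is stated in the source, and the function name is made up for illustration.

```python
# Figures taken from the quoted article.
TOTAL_WISHLISTS = 260_000
BATCH_SIZE = 25_000
HOURS_PER_BATCH = 4      # "about four hours" per batch
MACHINES = 2             # two computers on two DSL lines


def download_estimate(total, batch_size, hours_per_batch, machines):
    """Return (batches, computer_hours, wall_clock_hours).

    Assumption (not from the source): the last, partial batch takes
    proportionally less time, and the load is split evenly across machines.
    """
    batches = total / batch_size
    computer_hours = batches * hours_per_batch
    wall_clock_hours = computer_hours / machines
    return batches, computer_hours, wall_clock_hours


batches, computer_hours, wall_clock_hours = download_estimate(
    TOTAL_WISHLISTS, BATCH_SIZE, HOURS_PER_BATCH, MACHINES
)

# Per-machine download rate implied by "25,000 wishlists in four hours".
rate_per_machine = BATCH_SIZE / HOURS_PER_BATCH

print(f"batches:           {batches:.1f}")
print(f"computer hours:    {computer_hours:.1f}")
print(f"wall-clock hours:  {wall_clock_hours:.1f}")
print(f"wishlists/hour per machine: {rate_per_machine:.0f}")
```

Under those assumptions, roughly 10.4 batches at four hours each comes to about 42 hours of computer time, which two machines running in parallel finish in about 21 hours. That matches both the "42 hours of computer time" and the "less than one day" in the quote.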

Impressive, but I don't see how the Amazon reading habits of every geek in America would be of much use to anyone. Unless they're buying copies of 'Communism for Dummies'…

Source [AppleFritter]
