Presentation Downloads Top 1 Million

I was wondering how many people had downloaded my Agile Databases With Migrations talk from my web site, so I decided to check the logs. Given that there are sometimes many repeated downloads from the same IP, I wanted to filter out any duplicate IPs from the HTTP access_log.

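So, first we create a small Ruby file, “ip.rb”, that reads the relevant lines from STDIN, takes the first field of each line (the client IP), and prints the unique IPs followed by their count.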


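#!/usr/bin/env ruby
# Collect the first whitespace-separated field (the client IP) of each line.
result = []
STDIN.each_line do |line|
  ip = line.split.first
  result << ip if ip  # skip blank lines, where split returns an empty array
end
# uniq! returns nil when nothing was removed, so print the array itself.
result.uniq!
p result
puts result.size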

Next, we use grep to pick out the download lines:

grep "agilemigrations" access_log

Finally, we combine the two commands (ip.rb needs to be executable first, e.g. chmod +x ip.rb):

grep "agilemigrations" access_log | ./ip.rb
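
The grep step can also be folded into Ruby itself. Here is a minimal sketch, assuming the client IP is the first whitespace-separated field of each log line; the method name unique_ips and the script name unique_ips.rb are placeholders chosen for illustration, not part of the setup above.

```ruby
#!/usr/bin/env ruby
# Sketch: filter and de-duplicate in one script instead of grep + ip.rb.
# unique_ips and the command-line handling are illustrative assumptions.

# Returns the distinct first fields (client IPs) of every line that
# contains the plain string `pattern`.
def unique_ips(lines, pattern)
  lines
    .select { |line| line.include?(pattern) }  # plays the role of grep
    .map { |line| line.split.first }           # first field is the IP
    .uniq                                      # non-destructive, never nil
end

# Runs only when a log path is given, e.g.: ./unique_ips.rb access_log
if __FILE__ == $PROGRAM_NAME && !ARGV.empty?
  ips = unique_ips(File.foreach(ARGV[0]), "agilemigrations")
  p ips
  puts ips.size
end
```

Using uniq rather than uniq! here avoids the nil return when there are no duplicates, and File.foreach reads the log line by line instead of loading it all at once.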

Once again, Ruby's simplicity and expressiveness shine in accomplishing a useful task.

UPDATE: 7000+ downloads as of 2/1/07.