2008-03-29 11:50 UTC Latest referrers using Rack and Ruby
Like most bloggers, I like to keep an eye on where my traffic is coming from, especially when there are surges in traffic. I'm using both Google Analytics and Feedburner for stats, and they work great for trends, but not for seeing what's happening right now.
This morning I needed a distraction and figured I'd just throw together a quick and dirty Rack middleware class to keep track of the latest referrers.
What I ended up doing was keeping a rolling buffer in an array that holds the last N referrers, and generating a histogram from that as needed. I'm not interested in accuracy, since I have the logs + Google Analytics + Feedburner to get the daily totals, so I didn't bother persisting the buffer to disk or anything - if I restart my app the stats will reset. This is just to get a live image of what's going on right now.
The downside is that this approach does not scale beyond a single process. If you want it to, you really do want to persist the data to a database or something, though that adds a lot of overhead. Maybe I'll do that next - it's easy, but until my blog has a lot more traffic I don't really have the motivation.
Here's the class (yes, I know "referer" is misspelled, but it matches the HTTP header):
    module LatestReferers
      class Gather
        def initialize app, opts = {}
          @app = app
          @referers = []
          @limit = 100
          @exclude = []
          opts.each do |k,v|
            @limit = v.to_i if k == :limit
            @exclude = v if k == :exclude
          end
        end

        def call env
          ref = env["HTTP_REFERER"] || "-"
          req = env["REQUEST_URI"]
          if !@exclude.detect{|pat| req =~ pat || ref =~ pat }
            @referers << [ref,req]
            @referers.shift if @referers.size > @limit
          end
          env["hokstad.latestreferers"] = self
          @app.call(env)
        end

        def histogram
          h = {}
          @referers.each do |ref,req|
            h[ref] ||= {:total => 0}
            h[ref][req] ||= 0
            h[ref][req] += 1
            h[ref][:total] += 1
          end
          h.sort_by{|ref,pages| -pages[:total]}
        end
      end
    end

In turn:
- #initialize takes the next app and a hash of options. Currently it recognizes :limit, which controls how many referrers it will track, and :exclude, which takes an array of regexps to check against both the request URI and the referrer for patterns to reject - I'm not interested in internal referrals within my own site, or referrals to the page I use to view the referral stats.
- #call just gets the fields, checks them against the patterns, and if they don't match, adds the referrer and page to the end of the array and removes the first entry if the array exceeds the limit, creating a FIFO queue.
- #histogram builds a hash of hashes mapping each referrer to page names and the number of times each page has been accessed, plus a total, and returns it as an array of [referrer, pages] pairs sorted by total, highest first.
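To make the three steps concrete, here's a rough standalone sketch of the same logic without Rack. The `record` and `histogram` helpers, the limit of 3, and the sample URLs are all made up for illustration; it's only meant to show the FIFO trimming and the shape of the result.

```ruby
# Standalone sketch of what #call and #histogram do, without Rack.
# The helper names, limit, patterns and sample data are hypothetical.

# Record a [referer, page] pair unless either field matches an exclude
# pattern, then drop the oldest entry if the buffer is over the limit.
def record(buffer, ref, req, limit, exclude = [])
  return buffer if exclude.any? { |pat| req =~ pat || ref =~ pat }
  buffer << [ref, req]
  buffer.shift if buffer.size > limit
  buffer
end

# Same counting as Gather#histogram: per-referer page counts plus a :total,
# returned as [referer, counts] pairs sorted by total, highest first.
def histogram(buffer)
  h = {}
  buffer.each do |ref, req|
    h[ref] ||= { :total => 0 }
    h[ref][req] ||= 0
    h[ref][req] += 1
    h[ref][:total] += 1
  end
  h.sort_by { |_ref, pages| -pages[:total] }
end

buffer  = []
exclude = [/\/feed/]
record(buffer, "http://a.example/", "/post-1", 3, exclude) # oldest, dropped later
record(buffer, "http://b.example/", "/post-1", 3, exclude)
record(buffer, "http://c.example/", "/feed",   3, exclude) # excluded, never stored
record(buffer, "http://a.example/", "/post-2", 3, exclude)
record(buffer, "http://a.example/", "/post-2", 3, exclude) # pushes out the first entry

result = histogram(buffer)
# result is now:
# [["http://a.example/", { :total => 2, "/post-2" => 2 }],
#  ["http://b.example/", { :total => 1, "/post-1" => 1 }]]
```

Note that the :total key sits in the same hash as the page names, so anything that iterates over the pages will see it as one more entry.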
    use LatestReferers::Gather, {:exclude => [
      /\/referers/, /http:\/\/www\.hokstad\.com/, /\.xml/, /\/feed/, /\.rdf/
    ]}

The above is the config I use for this site.

If you just want a simple table of the results, you can use something like this. I just want the numbers, I don't care how the page looks:
    module LatestReferers
      class View
        def initialize app, page
          @app = app
          @page = page
        end

        def show(ref)
          return Rack::Response.new("Missing 'latestreferers' object",500).finish if !ref
          r = Rack::Response.new
          r.write("<html><head/><body>")
          r.write("<table border='1'><tr><th>Referer</th><th>Pages</th></tr>\n")
          ref.histogram.each do |k,v|
            # The referrer and request URI come from the client, so escape them
            r.write("<tr><td>#{Rack::Utils.escape_html(k)}</td> <td><table>")
            v.sort_by{|page,count| -count}.each do |page,count|
              r.write("<tr><td>#{count}</td><td>#{Rack::Utils.escape_html(page.to_s)}</td></tr>")
            end
            r.write("</table></td></tr>\n")
          end
          r.write("</table></body></html>")
          r.finish
        end

        def call env
          if env["REQUEST_URI"] == @page
            show(env["hokstad.latestreferers"])
          else
            @app.call(env)
          end
        end
      end
    end

That serves as a simple example of using Rack::Response too - it's completely optional, and you can stream out any template from your favorite templating system instead of hardcoding the HTML, but for this I just wanted something with no external dependencies other than Rack.

There are probably a lot of things I could do to the view code, but it's a throwaway hack - I just want to be able to see at a glance if anything interesting is happening. If you want a pretty page, it's easy enough to use the above as a starting point.

You can see the live result of using the above classes here with this config (expect it to be reset quite often, and I only track the last 100, so don't expect it to show a huge list):
    use LatestReferers::Gather, {:exclude => [
      /\/referers/, /http:\/\/www\.hokstad\.com/, /\.xml/, /\/feed/, /\.rdf/
    ]}
    use LatestReferers::View, "/referers"
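If you'd rather not hardcode the HTML in the view, a template works just as well. Here's a minimal sketch using ERB from Ruby's standard library. The sample data and variable names are made up, and this isn't what runs on this site; it just renders the same table from data in the shape that #histogram returns, escaping the referrer and page since both come from the client.

```ruby
require "erb"

# Made-up sample in the shape Gather#histogram returns:
# an array of [referer, {page => count, :total => n}] pairs.
histogram = [
  ["http://a.example/?q=<b>", { :total => 2, "/post-2" => 2 }],
  ["-",                       { :total => 1, "/post-1" => 1 }]
]

# The template mirrors the table that View#show writes by hand.
template = ERB.new(<<~HTML)
  <table border='1'><tr><th>Referer</th><th>Pages</th></tr>
  <% histogram.each do |ref, pages| %>
  <tr><td><%= ERB::Util.html_escape(ref) %></td><td><table>
  <% pages.sort_by { |_page, count| -count }.each do |page, count| %>
  <tr><td><%= count %></td><td><%= ERB::Util.html_escape(page.to_s) %></td></tr>
  <% end %>
  </table></td></tr>
  <% end %>
  </table>
HTML

# result_with_hash exposes the hash keys as local variables in the template.
html = template.result_with_hash(histogram: histogram)
puts html
```

Inside View#show, the rendered string could then be passed to r.write in place of the individual write calls.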