RSS + uTorrent + btc for Automated Torrent Downloading

Want to tell uTorrent what to do via Ruby? Read on.

First, you need to enable the uTorrent local interface (the Web UI) in the application's preferences. Then install btc (instructions at https://www.github.com/bittorrent/btc).
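If memory serves, btc is a Python command-line tool, so installing it is typically just a pip install -- but defer to the linked repo's README if it says otherwise:

pip install btc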

Make sure Ruby is installed, and get the rss gem:

gem install rss

Then create a file called rss.rb and throw this in there:

## START FILE rss.rb
require 'rss'
require 'open-uri'
require 'shellwords' # escapes torrent links before they hit the shell
 
PASSKEY = 'YOURPASSKEYHERE'
 
# URL to your Tracker -- Mine looked like
# https:///rss.php?feed=dl&bookmarks=on&passkey=#{PASSKEY}
# Note you need to define this or I'll raise an exception on you!
 
# url = "https:///rss.php?feed=dl&bookmarks=on&passkey=#{PASSKEY}"
 
raise "Did you define the url?" unless defined?(url)
 
# Grab the names of torrents uTorrent already knows about
torrent_list = `btc list`
 
URI.open(url) do |rss| # Kernel#open via open-uri was removed in Ruby 3.0
  feed = RSS::Parser.parse(rss)
  puts "#{Time.now} -- Title: #{feed.channel.title}"
  feed.items.each do |item|
    puts "Item: #{item.title}"
    if torrent_list.include? item.title
      puts "Skipping torrent, already in uTorrent"
    else
      puts "Downloading torrent #{item.title}"
      # Escape the link -- tracker URLs usually contain & and would break the shell command
      `btc add #{Shellwords.escape(item.link)}`
    end
  end
end
 
## END FILE rss.rb

To test, simply run this at the command line:

ruby rss.rb

To automate, add a job to your crontab. From the command line:

crontab -e

Add a rule to run this script every minute (make sure the cd path points at the directory that holds rss.rb; note that > overwrites runs.log on every run, so use >> if you want to keep a history):

* * * * * cd /Users/ephtoaster/torrenter/ && ruby rss.rb > runs.log

My whole crontab looks like:

MAILTO=""
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin:/opt/bin
* * * * * cd /Users/ephtoaster/torrenter/ && ruby rss.rb > runs.log

Local vs Remote MySQL Inserts

I wrote this really long blog post and then said to myself: TL;DR. Here are some takeaways:

  • Ruby does not have great support for reading large (100MB+) Excel files
    • Roo, Spreadsheet, and ParseExcel all try to load the whole workbook into memory -> 5GB+
  • Python has excellent Excel parsing support
    • OpenPyXL supports row-by-row iteration (see the sketch below)
  • Insert into a local MySQL database, then dump & load into your remote MySQL database (see the one-liner below)
    • ((24,000,000 rows / 250 remote inserts per second) / 60 seconds) / 60 minutes ≈ 26.7 hours
    • vs. ((24,000,000 rows / 3,000 local inserts per second) / 60 seconds) / 60 minutes ≈ 2.2 hours
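For reference, here is a minimal sketch of that row-by-row iteration, assuming a hypothetical report.xlsx. OpenPyXL's read_only mode streams rows instead of loading the whole workbook:

# Stream a large .xlsx one row at a time with openpyxl's read-only mode
from openpyxl import load_workbook

wb = load_workbook('report.xlsx', read_only=True)  # hypothetical filename
ws = wb.active
for row in ws.iter_rows(values_only=True):
    print(row)  # each row is a tuple of cell values -- insert it, don't buffer it
wb.close()  # read-only workbooks keep the file handle open until closed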
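And once the local database is full, the dump & load is a one-liner. The database, table, host, and user names here are placeholders:

mysqldump local_db big_table | mysql -h remote.example.com -u myuser -p remote_db

mysqldump emits multi-row INSERT statements by default, which is a big part of why the load side is so much faster than row-at-a-time inserts.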

Ultimately, dividing and conquering a problem is a great strategy and can significantly reduce your time to process large data sets. Just be careful that each stage in your division of labor is as fast as possible: it is way faster to fill up a local MySQL database and then transfer it to the remote host.