Ruby OpenURI open() returns StringIO or Tempfile
Ahh, the little things in life. I was hacking out some code the other day and I was doing something like...
report_data = open(report_url)
data_set = FasterCSV.read(report_data.path)
data_set.each { |row| coolness(row) }
And I ran into an error coming out of FasterCSV:
TypeError: can't convert nil into String
After a quick headache or two I realized that calling .path on an object of unknown class might be the problem. In my test code and production code I had always seen a file-backed Tempfile object returned from the open() method, but the particular use case I was now going through was returning a StringIO from open(). In Ruby 1.8.7, StringIO#path exists only for compatibility with IO and returns nil, which is exactly what FasterCSV choked on. The realization of why this was happening came from digging into the implementation of open-uri.rb in Ruby 1.8.7:
class Buffer # :nodoc:
  def initialize
    @io = StringIO.new
    @size = 0
  end
  attr_reader :size

  StringMax = 10240
  def <<(str)
    @io << str
    @size += str.length
    if StringIO === @io && StringMax < @size
      require 'tempfile'
      io = Tempfile.new('open-uri')
      io.binmode
      Meta.init io, @io if @io.respond_to? :meta
      io << @io.string
      @io = io
    end
  end
  # (excerpt; the rest of the class is not shown)
The Buffer implementation in open-uri checks the accumulated size every time data is appended, and only switches to a Tempfile once that size exceeds StringMax (10,240 bytes). Anything at or under 10 KB and you're looking at a StringIO object.
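The threshold behavior is easy to reproduce in isolation. Below is a minimal sketch, not open-uri's actual code: SketchBuffer and its constant name are made up for illustration, the Meta handling is left out, and it counts bytes with bytesize where the 1.8.7 excerpt uses length.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical stand-in for open-uri's internal Buffer, for illustration only.
class SketchBuffer
  STRING_MAX = 10_240 # same threshold as StringMax in the excerpt

  attr_reader :io, :size

  def initialize
    @io = StringIO.new
    @size = 0
  end

  # Append data; spill to a Tempfile once the in-memory limit is exceeded.
  def <<(str)
    @io << str
    @size += str.bytesize
    if @io.is_a?(StringIO) && @size > STRING_MAX
      file = Tempfile.new('sketch-buffer')
      file.binmode
      file << @io.string
      @io = file
    end
    self
  end
end

small = SketchBuffer.new << ('a' * 10_240)
large = SketchBuffer.new << ('a' * 10_241)

puts small.io.class # StringIO
puts large.io.class # Tempfile
```

One byte over the limit is enough to change the class of the object the caller ends up holding, which is why small test fixtures can hide this kind of bug.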
Fortunately, FasterCSV will operate on an IO object...
# from
FasterCSV.read(report_data.path)
# to
FasterCSV.read(report_data)
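Here is a minimal sketch of the same idea using Ruby's standard csv library (FasterCSV became the standard CSV library in Ruby 1.9) rather than FasterCSV itself. The helper name rows_from and the sample data are made up for illustration; the point is only that reading from the IO works whether open() hands back a StringIO or a Tempfile.

```ruby
require 'csv'
require 'stringio'
require 'tempfile'

# Hypothetical helper: parse CSV rows from any IO-like object without
# touching .path, so the concrete class no longer matters.
def rows_from(io)
  io.rewind
  CSV.parse(io.read)
end

csv_text = "id,name\n1,alpha\n2,beta\n"

# Small payload case: data held in memory.
in_memory = StringIO.new(csv_text)

# Large payload case: data held in a temporary file.
on_disk = Tempfile.new('report')
on_disk.write(csv_text)
on_disk.flush

p rows_from(in_memory)
p rows_from(on_disk)

on_disk.close! # close and delete the temporary file
```

Both calls return the same three rows, so the calling code does not need to know which kind of object it received.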
It was still a little startling to see such a behavior (/optimization?) going on in open-uri.rb. Pretty cool, but this reminded me that I need a few more test cases to uncover behaviors on different data set sizes.
SimplyFor.us – Making idea sharing super simple
A few months ago I had a particular need: to share a private idea with people, but only for a limited amount of time. The key ingredient was really the time limitation. That is, I often come up with an idea that I would not make public but still want advice and input on from a friend, and I want that feedback in a time-sensitive manner. So, for example, if I am looking to share idea x, then I sure hope that someone sees idea x and not some derivation of idea x, which would make the feedback less relevant and waste time for both parties. The other sub-need I had was centralized storage for all of my ideas, a scratch pad if you will, that would be dead simple to use, keep my ideas private, and be super fast to interact with.
Enter: SimplyFor.US
That's where my new web application comes in handy. The name says it all: it's simply for us and very private. I can store all of my ideas in one place where only I have access to the ideas I create (i.e. I can not see any other ideas -- only my own). At my will I can edit them, delete them, and most importantly share them. Sharing is done by sending a secure token to a list of e-mail addresses you provide in the "Share" form. Each token lasts for 24 hours from the time created and allows a visitor to see the idea and comment. At the bottom of each idea the visitor (non-idea owner) will be able to provide commentary back to the idea owner inline.
So the flow works pretty nicely for the idea creator:
- Create an idea
- Share your idea
- Receive feedback (from each person within 24 hours of dispatching your shared idea)
And the flow for the visitor:
- Receive an e-mail
- Check out the idea
- Comment on the idea
There is another subtlety here that I want to highlight, and it helps to explain why I chose 24 hours as the time limit: feedback should be quick or not at all. Delayed feedback is not great for the ideation process. Limiting the time to 24 hours gives the sharing of ideas an undertone that screams "go go go, time is limited." The reality is that the people we trust and who appreciate our efforts will give feedback quickly... And those who are late to the game will either miss out, request a new token, or maybe you'll be kind enough to share with them the latest iteration of your idea.
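The 24-hour rule is simple to express in code. This is a minimal sketch of an expiring share token, not the actual SimplyFor.us implementation: the ShareToken class, its method names, and the token length are all assumptions made for illustration.

```ruby
require 'securerandom'

# Hypothetical model of a share token; not the site's real code.
class ShareToken
  LIFETIME_SECONDS = 24 * 60 * 60 # 24 hours, per the post

  attr_reader :value, :created_at

  def initialize(created_at = Time.now)
    @value = SecureRandom.urlsafe_base64(32) # random, URL-safe string
    @created_at = created_at
  end

  # A token is valid from creation until 24 hours have elapsed.
  def valid_at?(now = Time.now)
    now >= created_at && now - created_at < LIFETIME_SECONDS
  end
end

issued = Time.utc(2011, 1, 1, 12, 0, 0)
token = ShareToken.new(issued)

puts token.valid_at?(issued + 60 * 60)      # true  (1 hour later)
puts token.valid_at?(issued + 25 * 60 * 60) # false (25 hours later)
```

Passing the current time in as an argument keeps the expiry check easy to test without waiting a day.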
Lastly, I used OpenID for this project to integrate with Y! Mail, Google Mail, and a few other providers for single-click authentication. It's been working pretty well and I like how easy it is to create an account (both for idea creators and visitors). The app runs on Ruby on Rails with Passenger (pretty much my go-to nowadays).
If you've got an idea, log on to SimplyFor.us and then share away... Happy Ideation!
New Rails Deployment Stack
This is a post more geared towards getting feedback. What do you guys think of this proposed Ruby on Rails infrastructure stack for deployment of a Facebook Rails app?
- Ruby 1.9.2 via RVM
- RubyGems 1.5
- Nginx
- Unicorn
- CentOS
- Git or Subversion -- to be decided. Weigh in if you have a preference!
- MySQL
The major changes from my current set-up are removing HAProxy, putting in Unicorn, and moving away from Env.Config's to Bundler for gem management.
Update 2/23/2011: I went with a compiled Ruby rather than RVM, and skipped Unicorn in favor of Passenger (for now). My reasoning behind both decisions was purely based on my experience with Passenger and inexperience with RVM/Unicorn. I'll continue to develop locally with RVM/Unicorn, and maybe the next version of the application using this stack will get to see Unicorn. Very happy so far with the deployment stack; it has been running about 4 days now with very minimal server load. I'll be interested to see how the new stack performs when the thousands of users via Facebook start hitting it.
Detecting when a new DOM Element is added in HTML/JavaScript
This post is a half reminder... I want to be notified when elements are inserted into a box. The box contains a list of elements. The user experience is: the user adds an element to the box; the element is prepended to the box with a "new_addition" class that styles its background; 4 seconds later the class, and with it the background, should be removed.
Using jQuery's (v 1.4) 'live' binding and the DOMNodeInserted event we can do this very easily:
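$('#set_list').live('DOMNodeInserted', function(event) {
  var inserted = event.target; // the element that was just added
  // Pass a function rather than a string so nothing is eval'd
  setTimeout(function() {
    $(inserted).removeClass('new_addition');
  }, 4000);
});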
Here's a JSFiddle.net example: http://jsfiddle.net/MZeLk/12/ -- Open your Firebug console to see the count of items you add in realtime.
As the piggy says, "Th-th-th-that's all folks!"
Stop ‘require’ing by hand in Ruby
While reading some source code from Sinatra I saw a neat line that made a lot of sense but I've never used before:
%w(rubygems rack).each { |req| require req }
Any cons to this approach of reducing the # of lines in our Ruby code? Let me know, otherwise I've found a new way to require files and avoid arthritis.
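A slightly expanded sketch of the same pattern follows. The library names are arbitrary standard-library picks for illustration, not what Sinatra requires, and the error handling is one possible way to address a downside of the loop: a failing require no longer points at a line that names the library.

```ruby
# Arbitrary standard libraries, chosen only to make the sketch runnable.
LIBRARIES = %w(json time tempfile)

LIBRARIES.each do |lib|
  begin
    require lib
  rescue LoadError => e
    # Name the library explicitly, since the backtrace only shows the loop.
    abort "Could not load #{lib}: #{e.message}"
  end
end

puts JSON.generate('loaded' => LIBRARIES) # {"loaded":["json","time","tempfile"]}
```

One trade-off to weigh: tools that look for literal require lines may not pick up names that are required dynamically like this.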
PHP array_diff vs foreach: a battle for speed
1000 runs w/ 1000 data elements in the two arrays (php array diff): 2.7389E-5 seconds
1000 runs w/ 1000 data elements in the two arrays (php foreach): 1.085E-7 seconds
php array diff slower by 2.728E-5 seconds
There were two arrays for this test: $big_set, which had 3147 string elements, and $to_diff, which had 1581 string elements. We needed the difference of the two arrays and found that array_diff is slower on average. The nice thing about using a foreach(){} is that we can do other computation within the loop, as opposed to having to loop a second time after the differential set is found. Note that the foreach loop in the script only measures one unset() per element rather than a full diff, so treat the comparison as rough. Code below...
<?php
// The run count is read from $argv[2] and the data size from $argv[4].
print_r($argv);
$runs = (int) $argv[2];
echo "$runs\n";
$data_size = (int) $argv[4];
echo "$data_size\n";

// Build two arrays of random strings to compare.
$big_set = array();
for ($i = 1; $i < $data_size; $i++) {
    $big_set[] = "Hello there " . rand(500, 5000);
}
$to_diff = array();
for ($i = 1; $i < $data_size; $i++) {
    $to_diff[] = "Hello there " . rand(500, 5000);
}

// Time array_diff, accumulating across all runs.
// ($time must be initialized outside the loop or only the last run counts.)
$time = 0;
for ($i = 0; $i < $runs; $i++) {
    $start = microtime(true);
    $diff = array_diff($big_set, $to_diff);
    $time += microtime(true) - $start;
}
echo $runs, ' runs (php array diff): ', sprintf('%2.9f', $time / $runs), " secs\n";

// Time a foreach loop. This measures one unset() per element of $big_set,
// not a full diff.
$time2 = 0;
for ($i = 0; $i < $runs; $i++) {
    $start = microtime(true);
    foreach ($big_set as $p) {
        unset($to_diff[0]); // cost of an associative array lookup & delete
    }
    $time2 += microtime(true) - $start;
}
echo $runs, ' runs (forloop): ', sprintf('%2.9f', $time2 / $runs), " secs\n";

if ($time2 > $time) {
    $seconds = $time2 - $time;
    echo "forloop slower by " . $seconds / $runs . " seconds";
} else {
    $seconds = $time - $time2;
    echo "php array diff slower by " . $seconds / $runs . " seconds";
}
Execute via command line :/>php file.php -runs
The arguments must be in that order: the script reads the run count from $argv[2] and the data size from $argv[4]. I did not make it handle fancy arguments.
Update: Adding PHP Bench for other cool benchmarks
Facebook Development: Hello Birthday V2 Open Graph
Well, it has been a few months since my original post on Developing Facebook applications (Hello Birthday's original post). During this time, Facebook has introduced a new API called Graph (Graph Docs). A big change is the authentication protocol moving to OAuth 2.0. Yay for OAuth, because it definitely simplifies things...
Facebook continues to support the legacy REST API, which is good because the new Graph API is still changing slightly here and there... If you have REST API keys stored, you can easily upgrade them using one of the new API's function calls.
The most exciting feature that Graph introduces is the ability to subscribe to user attributes, like friends, application permissions, and current location.
As you can imagine, this has huge performance gain implications and reduces the need for overly complex web apps with polling-like data integrity processes... Read: No longer do I have to ask Facebook if a user has given me permission, because I know now what I have when they sign up and if it changes at any given time.
Facebook is pushing iFrame Canvas apps. Beware of this. It is a difficult, and hack-prone, situation where you have to find a way to authenticate a user within an iFrame and use Javascript to do a top.window redirect. Facebook Developers say better support is coming. For now, JS is your friend.
Let's move on though... The new API is great and all, but I'm more interested in talking about some of the insights gained from my first real FB App. Here are some high level lessons learned from my non-iframed canvas app (v1):
- Javascript support is limited
- FBML is really a sad attempt at a markup language. I'd rather see HTML sensitive attributes that do the same thing FBML does, without the need for a whole new language
- Global styles were less-than-easy to work with due to FB rendering my page within theirs
- Rendering pages requires all of Facebook's shell to render. Boo
- Porting the app to be a stand-alone web app is not trivial, plan out your adventure ahead of time. If you think your app might become a standalone product plan for it (the best you can)
- Unable to fine-tune performance (you can use memcache though!)
- As the previous note hints at: use memcache. It makes life so much easier if you're dealing with a lot of data and constantly using lists of friends and information about each friend
- Get a friend to work on the app with you. It makes testing and continued efforts so much easier. Thanks Joe
- Get a good set of friends to bounce ideas off of. Thanks Jon, Erica, and others whose names escape me right now
- Ask your USERS for feedback early & often
- It's nice to have my app feel as though it belongs in the Facebook layout. I think this is a big win to Canvas as a whole
- The API has worked very well for me and I've experienced little to no hiccups
- Love FQL
- Facebook developers are somewhat easy to get a hold of but problem solving may take time.
During development of V1.5 (the transition to the Graph API) some of the changes made were:
- Move to an iFramed app
- Clean up transitional code that helped me get from boot-strap to version one
- Remove FBML mark up
- Implement more personalization options to the user experience
- Create more of an "experience" for that matter... Adding descriptions next to links to give context. An ongoing effort, by the way
- Became more transparent in implementation details: I told my users more about how Hello Birthday worked, what features were being built, and what bugs I found.
- Listened to feedback and added: Personalized Messages on a global level (defining a set of default personalized messages that are randomly selected during the wishing process)
- Listened to feedback and changed: Birthday messages to be delivered during the day at user specific time ranges.
- Fixed issues related to local time zones
- Identified users who have privacy settings that don't allow Hello Birthday to retrieve their birthday *big one*
One of the points above is really important (bullet point 6), and it hovers around the idea of creating an open dialogue between you and your users so that they feel comfortable complaining, suggesting, and critiquing features. My users are the debuggers, like with most applications, and just by letting people know when I was making changes, what they were, and showing screenshots of what to expect, I was able to generate immediate feedback. The best part is that the feedback was good, bad, and off topic! I was receiving (as comments to the posts on the App Wall and via e-mail) bug reports, thumbs up, and thumbs down on different ideas and feature enhancements. By continuously increasing the number of posts (to a certain degree) I was essentially getting into the daily lives of Hello Birthday's users, and it started an awesome feedback loop in which there are now a few users who love to let me know when things break! Awesome.
I'm very close to launching version two of Hello Birthday. As you can imagine, there is a bit of hesitation around the new authentication protocol and making sure my users' legacy API keys transfer over. I'll write a post on that (if it ends up not being trivial). At any rate, I'm looking forward to seeing the feedback on version two. There have been a lot of subtle changes and feature enhancements that I'm sure (because of feedback received already) will be welcomed nicely.
You can expect a post after the launch of V2 where I will go over some of the insights Facebook provides to developers for their apps. There's a lot of interesting data around conversion rates and what permissions people grant your application that I want to share! Thanks.
GMail Conversation Threading can now be disabled!
This is a follow-up post for those who read my previous blog entry on preventing GMail threading. Google Mail now has the option to disable "Conversation View". Go to your Gmail, click Settings in the upper right corner, then you should land on the General tab and about half way down the page is the option to turn off "Conversation View". Hope this helps those of you out there who have had issues related to e-mailing account owners of your web apps... Or just plain get tired of accidental threading.
Just a quick time comparison of PHP’s preg_replace vs. str_replace
Here's a short and succinct comparison of running time to replace multiple characters in a given string. I ran the test 20 times each, so the numbers you will find are average running times:
// microtime(true) returns a float, so the subtraction below is meaningful.
$start = microtime(true);
$str = "23ilrj23oirj23iorj o23irj23klfj23lkjr4ocimior 4r ioj234roij234r io34jrio4jrio34r jio4jr o34jr oi4jr io34 r";
// Note: \w matches every word character, so this pattern replaces more
// than the str_replace call below does.
$new_string = preg_replace('/[\w2]/', ',', $str);
$end = microtime(true);
echo($end - $start);
echo "\n";

$start = microtime(true);
$str = "23ilrj23oirj23iorj o23irj23klfj23lkjr4ocimior 4r ioj234roij234r io34jrio4jrio34r jio4jr o34jr oi4jr io34 r";
$new_string = str_replace(array('2', ' ', "\t"), ',', $str);
$end = microtime(true);
echo($end - $start);
The results:
preg_replace (regex): 0.000608 seconds
str_replace: 0.000241 seconds
Rails, jQuery UI (Sortable), and Ordering of Slides
I am partly sharing this issue and solution to the world, but mostly just recording somewhere what I came up with. Now, onto the quick read.
A web design client recently asked me to build a web site that would allow him to create slide shows of his art work. One of the criteria was a way to set the order of the slides and, at a later date, re-arrange them. Turning to jQuery UI, specifically Sortable (doc), and a simple Rails controller, this task was pretty trivial. My initial concern was that a slide show could consist of hundreds of slides and that doing any sort of Ajax updating of the slides would be too slow. Turns out this was a real concern, and after reading all kinds of blogs I was unable to find a workable, ajaxively awesome solution. So I turned back to the non-web-2.0 design of just having a Save Order button.

The methodology of my solution is straightforward: allow the user to drag and drop to any order they please and then save that order. Clicking save in the user interface invokes the following JavaScript function, which builds the query string and does a window relocation. I'd rather use GET than POST for no real reason other than avoiding having to hack around the authenticity token. After all, what's the point of having an auth token if you're just going to override it in a members-only section? Here is the complete code for the view, broken into pieces for clarity.
CSS:
#sortable {
  list-style-type: none;
  margin: 0;
  padding: 0;
  width: 100%;
}
#sortable li {
  margin: 0 3px 3px 3px;
  padding: 0.4em;
  padding-left: 1.5em;
  font-size: 1.4em;
  height: 18px;
}
#sortable li span {
  position: absolute;
  margin-left: -1.3em;
}
JS:
<script type="text/javascript">
  // Initialize sortable on the #sortable list
  $(function() {
    $("#sortable").sortable();
    $("#sortable").disableSelection();
  });

  // Build the URL for updating the order of slides and navigate to it.
  // Called via the HTML anchor tag by the user.
  function update_order() {
    window.location.href = "/slides/order?" + $('#sortable').sortable('serialize');
  }
</script>
Sample #sortable:
<ul id="sortable">
  <li id="slide_2" class="ui-state-default full-width"><span class="ui-icon ui-icon-arrowthick-2-n-s"></span>Lack of a better Title</li>
  <li id="slide_3" class="ui-state-default full-width"><span class="ui-icon ui-icon-arrowthick-2-n-s"></span>A fireplace for two</li>
  <li id="slide_1" class="ui-state-default full-width"><span class="ui-icon ui-icon-arrowthick-2-n-s"></span>My Super Sweet Villa</li>
  <li id="slide_4" class="ui-state-default full-width"><span class="ui-icon ui-icon-arrowthick-2-n-s"></span>A view from above</li>
</ul>
Anchor link:
<a href="#" onclick="update_order(); return false;">Save Order</a>
Controller:
def order
  if params[:slide]
    # params[:slide] is the list of slide ids in their new order
    params[:slide].each_with_index do |slide, index|
      Slide.update(slide.to_i, :slide_position => (index + 1))
    end
    flash[:notice] = "Slide order has been updated."
    redirect_to(slides_path)
  else
    @slides = Slide.ordered
  end
end
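To make the data flow concrete, here is a small sketch, outside of Rails, of what the controller receives. For the sample list above, jQuery UI's serialize produces a query string of the form slide[]=2&slide[]=3&slide[]=1&slide[]=4, which Rails exposes as params[:slide]. The helper name positions_from_query is made up for illustration, and URI.decode_www_form stands in for Rails' own parameter parsing.

```ruby
require 'uri'

# Hypothetical helper: map each slide id to its new 1-based position,
# following the order of the slide[] entries in the query string.
def positions_from_query(query)
  ids = URI.decode_www_form(query)
           .select { |key, _| key == 'slide[]' }
           .map { |_, value| Integer(value, 10) }
  ids.each_with_index.map { |id, index| [id, index + 1] }.to_h
end

query = 'slide[]=2&slide[]=3&slide[]=1&slide[]=4'

positions_from_query(query).each do |id, position|
  puts "slide #{id} -> position #{position}"
end
# slide 2 -> position 1
# slide 3 -> position 2
# slide 1 -> position 3
# slide 4 -> position 4
```

The position is simply the index of the id in the submitted list plus one, which is what the each_with_index call in the controller computes.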
Now, for some disclaimers... You should add error handling to all of these pieces! For brevity, I've included the main pieces of the task and not my error handling code. You can also use POST and check if request.post? instead of looking for params[:slide]... Just don't forget to include the authenticity_token in your Ajax or form post (forms do this automatically in Rails if protect_from_forgery is enabled).
Efficiency: This is an O(n) algorithm in terms of database work, because it has to iterate over each element being sorted and do three things: fetch from the database, update the attribute, and save. The last two steps can be combined. I'm using the handy ActiveRecord::Base class method in Rails called update to do this. Here's that code so you don't have to look it up:
# File vendor/rails/activerecord/lib/active_record/base.rb, line 744
def update(id, attributes)
  if id.is_a?(Array)
    idx = -1
    id.collect { |one_id| idx += 1; update(one_id, attributes[idx]) }
  else
    object = find(id)
    object.update_attributes(attributes)
    object
  end
end
I was unable to locate any kind of conditional MySQL update that would allow different values to be updated for different rows all in one swoop. I'll admit my research on this topic was about 30 minutes of blog and StackOverflow reading... If you have something more efficient please reply.
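For what it's worth, one commonly used pattern for this is a single UPDATE with a CASE expression, which MySQL supports. The sketch below only builds the SQL string and has not been run against a real database. The table name slides and the id column are assumptions, slide_position comes from the controller above, and the helper name reorder_sql is made up.

```ruby
# Hypothetical helper: build one UPDATE statement that assigns a new
# position to every slide id, in the order given.
# Integer(..., 10) raises on anything that is not a base-10 integer,
# which keeps user-supplied ids out of the SQL text.
def reorder_sql(slide_ids)
  ids = slide_ids.map { |id| Integer(id.to_s, 10) }
  raise ArgumentError, 'no slide ids given' if ids.empty?

  cases = ids.each_with_index.map do |id, index|
    "WHEN #{id} THEN #{index + 1}"
  end

  "UPDATE slides SET slide_position = CASE id #{cases.join(' ')} END " \
    "WHERE id IN (#{ids.join(', ')})"
end

puts reorder_sql(%w(2 3 1 4))
# UPDATE slides SET slide_position = CASE id WHEN 2 THEN 1 WHEN 3 THEN 2 WHEN 1 THEN 3 WHEN 4 THEN 4 END WHERE id IN (2, 3, 1, 4)
```

In a Rails controller a string like this could be handed to the model's database connection for execution, trading n fetch-and-save round trips for one statement. Whether that is actually faster for a few hundred slides would need measuring.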
Here are the link(s) to relevant information:
jQuery UI Demos (and download) -- I'm using the latest code base as of this post. The styles you see within the bullet list items are from the jQuery UI theme.



