We fixed rails redis store to rapidly expire 50,000 keys daily from a 3.5M key cache

In building out Blueprint, we have taken an approach of “cache everything” (or what we think is most important) to make the app super responsive. Our customers love this.

What we actually cache is, essentially, a data digest used in the reporting view (so not the view or the model per-se). These digests could be a daily, weekly, monthly or annual view of a perspective on a data cube. Each day we add new, updated data to our cubes and need to expire the cached items that pertain to that day (e.g. latest week, month, year). In addition, we’re often recalculating categorizations for clients which means expiring their entire cached history and rebuilding it.

And so, we will expire 50,000 cache keys each morning as we push in new data (and then we go rebuild them). For a given site we might delete 1000-2000 keys.

Up until this week we used the delete_matched method to remove all of the site’s keys in a single call (optimizing our redis time). However, delete_matched relies in the redis keys method which is slow, blocks the redis server and isn’t meant to be used in production. In fact, I could watch our redis server’s CPU go from 0.1% to 100.0% exactly during the call to expiring site caches. Because redis is mostly single threaded, pinning the CPU for processor-heavy operations holds back all the fast GET calls from our cache reads and slows the site to a crawl.

The solution was clearly we couldn’t rely on redis to tell us what keys were in the cache. Michael and I whiteboarded this and built (well, Michael mostly built) a replacement cache store that inherits from RedisStore and effectively does three changes: stores the key name into a set (we partition sets, which is hidden in the code below) when the cache entry is written, remove the key name from the set when the cache entry is deleted, use the set to find the matched keys for delete_matched and remove those from the cache and the set.

      def delete_matched(pattern)
        sub_set = find_matching_keys(@set, pattern)
        Redis.current.srem(@set, sub_set)
        Redis.current.del(sub_set)
      end

      private

      def write_entry(key, entry)
        Redis.current.sadd(@set, key)
        super
      end

      def delete_entry(key)
        Redis.current.srem(@set, key)
        super
      end

I was pretty surprised not to find other solutions for this already or that the redis-store implementation doesn’t do this automatically. However, when we implemented it, the tricks in partitioning the set (to keep that read time low) lead me to believe it’s not a generalized solution (although you could use hashing to generalize it). We also had to build a process to build up the set from the existing 3,541,239 keys in our cache, so we’ll see if that takes the weekend or just the night.

Adding a custom jQuery-UI control to Formtastic in Ruby on Rails

While working on OnCompare, over the last month, one of the controls we wanted to support was a slider, that let you choose between a range of values, as you can see in the example at right. jQuery-UI provides a slider control and Ruby on Rails already has jQuery built in (we’re using Rails 3). The challenge was that all of our controls are rendered from Formtastic.

Fortunately, Formtastic allows you to add new control types by extending Formtastic::SemanticFormBuilder. To do that I changed config/initializers/formtastic.rb to specify our own builder:

Formtastic::SemanticFormHelper.builder = Formtastic::OnCompareFormBuilder

Then I just needed to add a method that responds to :slider type.

class OnCompareFormBuilder < Formtastic::SemanticFormBuilder

  def slider_input(method, options = {})
    collection   = find_collection_for_column(method, options)
    html_options = strip_formtastic_options(options).merge(options.delete(:input_html) || {})

    input_id = generate_html_id(method,'')
    slider_id = "#{input_id}_slider"
    label_id = "#{input_id}_label"
    value_items = []
    label_items = []

    collection.each do |c|
      label_items << (c.is_a?(Array) ? c.first : c)
      value = c.is_a?(Array) ? c.last  : c
      value_items << value
    end

    label_options = options_for_label(options).merge(:input_name => input_id)
    label_options[:for] ||= html_options[:id]

    script_content = "$(function() {
        var option_values = [#{value_items.map {|v|"'#{v.to_s.gsub(/[']/, '\\\\\'')}'"}.join(',')}]
        var option_labels = [#{label_items.map {|l|"'#{l.to_s.gsub(/[']/, '\\\\\'')}'"}.join(',')}]
        $( '##{slider_id}' ).slider({
            value:100,
            min: 0,
            max: #{collection.count - 1},
            step: 1,
            slide: function( event, ui ) {
                $( '##{input_id}' ).val(
                        option_values[ui.value] );
                $( '##{label_id}' ).text(
                        option_labels[ui.value] );
            }
        });
        for(var i = 0; i < option_values.length; i++) {
          if(option_values[i] == $( '##{input_id}' ).val()) $( '##{slider_id}' ).slider( 'value', i )
        }
        $( '##{label_id}' ).text( option_labels[$( '##{slider_id}' ).slider( 'value' )] );
    });"

    label(method, label_options) <<
    template.content_tag(:script, Formtastic::Util.html_safe(script_content), :type => 'text/javascript') <<
    template.content_tag(:div,
        template.content_tag(:div, label_items[label_items.count - 1], :class => 'right-label') <<
        template.content_tag(:div, label_items[0], :class => 'left-label') <<
        template.content_tag(:div, nil, html_options.merge(:id => slider_id)), :class => 'slider-holder') <<
      hidden_input(method, options.delete(:as)
    ) <<
    template.content_tag(:span, nil, :id => label_id)
  end

end

Finally, I generate a ruby call in the ERB that adds a slider using something like this:

<%= f.input :value, :as => :slider, :label => question.text, :hint => question.description %>

Creating Test Data for Rails Apps

I wrote a guest post over at 3 Weeks to Live on our progress in building data sets for testing OnCompare under Ruby on Rails.

The biggest challenge has been generating a large number of records, but I think we can do that with loops and factory_girl:

The other place to use large data is to stress test the system. … I need to know that the algorithm will continue to be responsive when we have 300 products. At this point, I’m looking at running loops around factory_girl calls so that it will generate most of the non-essential information. [Settings] up a single product matrix may take a dozen records, but I can loop over that 300 times and factory_girl will automatically create records with new names (using the sequence method shown in the factory definitions above).
before do
  category = Factory.create(:category)
  300.times do
    product = Factory.create(:product, :category => category)
    12.times do
      Factory.create(:question, :category => category, :product => product)
    end
  end
end

As a follow-up, we also added Faker to our gem set, which has a ton of handy methods for mocking up data. So, for example, I can set up a factory definition with Faker::Lorem.words(3) for the name or description of a product and 300 products will all look different. It’s pretty sweet.