fhwang.net

testing

JsonDeepCompare

For a recent consulting project, I found myself comparing a lot of large JSON documents in tests, which can be frustrating since differences don't show up well when comparing the hashes normally. Hence JsonDeepCompare, a Ruby gem for comparing large JSON documents and showing the most specific points of difference if they are unequal.

Let's say you've got a test case:

class MyTest
  include JsonDeepCompare::Assertions

  def test_comparison
    left_value = {
      'total_rows' => 2,
      'rows' => [
        {
          'id' => 'foo',
          'doc' => {
            '_id' => 'foo', 'title' => 'Foo', 'sub_document' => { 'one' => 'two' }
          }
        }
      ]
    }
    right_value = {
      'total_rows' => 2,
      'rows' => [
        {
          'id' => 'foo',
          'doc' => {
            '_id' => 'foo', 'title' => 'Foo', 'sub_document' => { 'one' => '1' }
          }
        }
      ]
    }
    assert_json_equal(left_value, right_value)
  end
end

Running it will output this error:

RuntimeError: ":root > .rows :nth-child(1) > .doc > .sub_document > .one" expected to be "two" but was "1"

The selector syntax uses a limited subset of JSONSelect to describe where to find the differences.
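To give a feel for how a "most specific point of difference" can be found, here is a minimal sketch in plain Ruby. It is not the gem's actual implementation: `first_difference` is a hypothetical helper that walks both structures in parallel and builds a JSONSelect-style path as it goes. It assumes a missing key can be treated the same as a `nil` value.

```ruby
require 'json'

# Hypothetical helper, not part of JsonDeepCompare. Walks two parsed JSON
# values in parallel and returns [selector, left, right] for the first
# leaf-level difference, or nil if the values are equal.
def first_difference(left, right, selector = ':root')
  if left.is_a?(Hash) && right.is_a?(Hash)
    (left.keys | right.keys).each do |key|
      diff = first_difference(left[key], right[key], "#{selector} > .#{key}")
      return diff if diff
    end
    nil
  elsif left.is_a?(Array) && right.is_a?(Array)
    [left.size, right.size].max.times do |i|
      # :nth-child is 1-based, so the first element is :nth-child(1)
      diff = first_difference(left[i], right[i], "#{selector} :nth-child(#{i + 1})")
      return diff if diff
    end
    nil
  elsif left != right
    [selector, left, right]
  end
end

left  = JSON.parse('{"rows": [{"doc": {"sub_document": {"one": "two"}}}]}')
right = JSON.parse('{"rows": [{"doc": {"sub_document": {"one": "1"}}}]}')
selector, expected, actual = first_difference(left, right)
puts "#{selector.inspect} expected to be #{expected.inspect} but was #{actual.inspect}"
```

The point of recursing all the way down is that the failure message names one leaf, rather than dumping two large hashes and leaving you to spot the difference by eye.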

The law of constant testing hassle

Over time, technological progress makes it easier to write automated tests for familiar forms of technology.

Meanwhile, economic progress forces you to spend more time working with unfamiliar forms of technology.

Thus, the amount of hassle that automated testing causes you is constant.

How to simulate a POST request body in a Rails 3 functional test

This was surprisingly underdocumented out in the world. If you've got a Rails controller that ever uses request.body.read, you can set that in a functional test like so:

class ArticlesControllerTest < ActionController::TestCase
  test "creates a new article" do
    attributes = {
      :subject => "10 ways to get traffic by writing blog posts about arcane Rails tips",
      :body => "Just kidding, it's totally impossible"
    }
    @request.env['RAW_POST_DATA'] = attributes.to_json
    post(:create)
    assert Article.last
  end
end

@request is an instance variable defined by ActionController::TestCase, and it is used whenever you make a test request by calling post, get, etc.
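For context, this is roughly what the controller side of that round trip looks like, sketched in plain Ruby so it runs without Rails. `FakeRequest` and `article_attributes` are hypothetical names used only for illustration; the StringIO stands in for the IO-like object that request.body returns.

```ruby
require 'json'
require 'stringio'

# Stand-in for the request object. StringIO imitates the IO-like
# object that request.body returns.
FakeRequest = Struct.new(:body)

# Hypothetical controller logic: read the raw body and parse it as JSON.
def article_attributes(request)
  JSON.parse(request.body.read)
end

attributes = { :subject => 'Arcane Rails tips', :body => 'Just kidding' }
request = FakeRequest.new(StringIO.new(attributes.to_json))
parsed = article_attributes(request)
puts parsed['subject'] # keys come back as strings, not symbols
```

One thing to watch for: the body is serialized and parsed again, so symbol keys in the test become string keys by the time the controller sees them.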

New awesome coding thingy: Tddium

So, at Profitably we've been transitioning off our own custom CI build using Jenkins, to using a new service called Tddium. Tddium is a hosted, parallelized CI service that aims to run your test suite much faster than you likely could with your own CI setup. It took us a few weeks to get everything up and running, but now that we're using it as our main CI environment, allow me to offer my considered opinion as an engineer:

Holy crap this thing is fucking awesome.

Here, look:

Tddium screenshot

Our 34-minute test suite is now down to 13 minutes. This was the most recent run of master, and at times I've seen it get down to around eight minutes. The two guys behind Tddium have a ton of work to do, but I have no doubt that as they build in more intelligence about managing their nodes they'll be able to give us five-minute test runs on a regular basis.

It's sort of amazing what this does to your productivity. We're just now making the switch to Rails 3, which of course broke a ton of tests at first, and it's really gratifying to make one low-level change to a model, commit it, and find out shortly afterwards that the fix resolved errors in five other test files.

It's not an exaggeration to say I've been waiting for this service for literally years: At RubyConf 2008, I talked about this in my Testing Heresies talk. Of course, getting set up on Tddium reminded me of why this is hard in the first place. Any non-trivial app is going to have one binary gem or external dependency that's a hassle to get set up in an environment where the instances you used last time won't be the same as the ones you're using next time, and where you don't have SSH access regardless. For their part, the Tddium guys are overcoming that with tons of automation, documentation, and for the time being really great personal service. When we started getting our code base set up to work on Tddium I was pleasantly surprised to have them emailing me patches to use.

They're in limited release right now, so I don't even know if they can take on any more customers. But you should sign up anyway.

When to mock

Dan Fox on mocking:

... I think it’s a good idea to stub out a local instance of a software service like solr, since maintaining a test environment version of processes outside the rails environment can be time consuming, even if they are on the same machine. In this case it’s worth it to mock out such a separate software piece, even though it means possibly incomplete test coverage.

My experience has been that any time I’ve tried to mock away some other piece of the stack—SQL database, Redis, etc—I end up eventually tearing away that abstraction. There’s a definite cost in terms of setup, but what you really don’t want is to mock out the behavior of some component and then realize you didn’t understand how it works in certain scenarios.

Another way to ask the question is this: Are there components of your stack that are 1) so powerful they’re worth the trouble of including them in your production infrastructure, and yet 2) so simplistic in their behavior that you could replicate how they behave in a few lines of Ruby? I can’t think of much that satisfies that condition.

The other reason to resort to mocks is if your test suite takes too long to run. This should only be used with the most egregious test runtime offenders, however. It’s always best to run full-stack tests (including the database) whenever possible; fewer bugs will slip through the cracks.

I’d state the case more strongly: There are many ways you can optimize your test suite that do not run the risk of introducing errors into your tests the way that mocks do. Nick Gauthier gave a great talk at last year’s GoRuCo on this very subject. Definitely worth a watch if your test suite gets slow. And if your project is non-trivial, and you’re testing it right, it’s gonna get slow.

It’s interesting to think about what I said more than two years ago about mocking. I’ve actually become a little looser about when to mock, and now in the Profitably code base I occasionally mock in between model classes. This is primarily because there are times that test setup can be extremely cumbersome because of the underlying data setup, and maybe I just want to test an edge case.

But the thing to keep in mind when mocking is that every time you do it, you are strengthening the contract of that method name, the arguments it takes, the class it belongs to, etc., etc. Some people might say, “yeah, what’s the big deal, all code is sort of a contract”, but I think that misses the point: Not all contracts are created equal. Not all method names are created equal. Some have been around for a while, and they’ve settled nicely, so sure, wrap that up in a bunch of mock declarations if that makes your testing easy. But some are brand-new, and you don’t totally understand the domain, or what new features will be added in the next month. Wrapping mocks around that is only going to slow you down when—not if, but when—you have to change it.
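A small sketch of what that contract looks like in practice, using a hand-rolled stub instead of a mocking library. All of the class and method names here (`SalesLoader`, `MonthlyReport`, `StubLoader`, `rows_for`) are hypothetical and exist only for illustration.

```ruby
# Hypothetical collaborator whose interface is still settling.
class SalesLoader
  def rows_for(month)
    raise NotImplementedError, 'talks to the real data store'
  end
end

class MonthlyReport
  def initialize(loader)
    @loader = loader
  end

  def total(month)
    @loader.rows_for(month).map { |row| row[:amount] }.sum
  end
end

# Hand-rolled stub: it hard-codes the method name, the argument list,
# and the shape of the return value.
class StubLoader
  def rows_for(month)
    [{ :amount => 10 }, { :amount => 5 }]
  end
end

report = MonthlyReport.new(StubLoader.new)
puts report.total('2011-06') # => 15
```

If `SalesLoader#rows_for` is later renamed or gains an argument, `StubLoader` still answers the old signature, so this test keeps passing until somebody remembers to update the stub as well. That is the extra contract you take on each time you mock.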

As one extreme way to illustrate the difference between contracts: In Rails and elsewhere, we simulate controller hits by, at some level, simulating HTTP. But it’s not actually low-level HTTP—there’s no mucking around with TCP/IP or whatnot. We just run through Rack to simulate some request that gets routed through Rails so we can test how various controllers will act. This is fine, because HTTP is solid and boring (boring is a virtue in engineering) and unlikely to change any time soon. So maybe your entire test suite doesn’t have a single code path that simulates HTTP on the level of network traffic, and that’s great.

But if you’ve just written a big ball of data loading code, and you’re dealing with a lot of crazy edge cases in the data, and there might be a lot more edge cases coming tomorrow, you don’t know what your code is, and you might not have figured out the best way to even structure the method call. Probably best to avoid mocking that.

sample_models 1.0

I’ve just released version 1.0 of sample_models. This release has been a long time coming: I’ve been developing sample_models for more than two years as I’ve been trying to figure out the best interface for it. I use it now in a lot of projects, most notably Profitably, so I’m finally able to stabilize the interface and document how this thing works.

sample_models makes it extremely fast for Rails developers to set up and save ActiveRecord instances when writing test cases. It aims to:

  • meet all your validations automatically
  • only make you specify the attributes you care about
  • give you a rich set of features so you can specify associations as concisely as possible
  • do this with as little configuration as possible

Let’s say you’ve got a set of models that look like this:

  class BlogPost < ActiveRecord::Base
    has_many   :blog_post_tags
    has_many   :tags, :through => :blog_post_tags
    belongs_to :user

    validates_presence_of :title, :user_id
  end

  class BlogPostTag < ActiveRecord::Base
    belongs_to :blog_post
    belongs_to :tag
  end

  class Tag < ActiveRecord::Base
    validates_uniqueness_of :tag

    has_many :blog_post_tags
    has_many :blog_posts, :through => :blog_post_tags
  end 

  class User < ActiveRecord::Base
    has_many :blog_posts

    validates_inclusion_of    :gender, :in => %w(f m)
    validates_uniqueness_of   :email, :login
    # from http://github.com/alexdunae/validates_email_format_of
    validates_email_format_of :email
  end

You can get a valid instance of a BlogPost by calling BlogPost.sample in a test environment:

  blog_post1 = BlogPost.sample
  puts blog_post1.title             # => some non-empty string
  puts blog_post1.user.is_a?(User)  # => true

  user1 = User.sample
  puts user1.email                  # => will be a valid email 
  puts user1.gender                 # => will be either 'f' or 'm'

Since SampleModels figures out validations and associations from your ActiveRecord class definitions, it can usually fill in the required values without any configuration at all.

If you care about specific fields, you can specify them like so:

  blog_post2 = BlogPost.sample(:title => 'What I ate for lunch')
  puts blog_post2.title             # => 'What I ate for lunch'
  puts blog_post2.user.is_a?(User)  # => true

You can specify associated records in the sample call:

  bill = User.sample(:first_name => 'Bill')
  bills_post = BlogPost.sample(:user => bill)

  funny = Tag.sample(:tag => 'funny')
  sad = Tag.sample(:tag => 'sad')
  funny_yet_sad = BlogPost.sample(:tags => [funny, sad])

You can also specify associated records by passing in hashes or arrays:

  bills_post2 = BlogPost.sample(:user => {:first_name => 'Bill'})
  puts bills_post2.user.first_name  # => 'Bill'

  funny_yet_sad2 = BlogPost.sample(
    :tags => [{:tag => 'funny'}, {:tag => 'sad'}]
  )
  puts funny_yet_sad2.tags.size     # => 2

You can also specify associated records by passing them in at the beginning of the argument list, if there’s only one association that would work with the record’s class:

  jane = User.sample(:first_name => 'Jane')
  BlogPost.sample(jane, :title => 'What I ate for lunch')

For more documentation, see the README.

Testing Rails against a running Redis instance (and doing it with Hydra to boot)

Profitably has been running Redis since very early on. And as I’m planning to lean on Redis even more heavily going forward, particularly for Resque, I decided to beef up how Redis is tested in the Rails app. Since not a lot of people are talking about what this takes, here are my notes about my initial setup:

Testing approaches

If your Rails app uses Redis, there are basically three approaches you can take for when the code under test hits Redis.

Continue reading “Testing Rails against a running Redis instance (and doing it with Hydra to boot)” »