Techblog

Tech Blog

Our latest geek adventures!

Archive for the ‘Ruby on Rails’ Category

18 November Request-log-analyzer 1.5.0

Bart and I just released request-log-analyzer version 1.5.0. New features include:

  • MySQL slow query log format support to analyze what queries are slowing down your database.
  • Format autodetection: with all those supported file formats, remembering the right --format parameter gets tricky. With format autodetection, this usually is not needed anymore!

As always, run the following command to install or upgrade to the latest version:

$ gem install request-log-analyzer

No Comments - Tags: , , , ,

17 November Case-insensitive validates_uniqueness_of slowness

Watch out when using validates_uniqueness_of :field, :case_sensitive => false. Rails transforms this in a query that cannot be supported by an index, which will really slow validation down if the underlying table grows larger.

For example, we use validates_uniqueness_of to check for duplicate e-mail addresses. Because email addresses are case-insensitive, adding :case_sensitive => false seems like a natural choice. However, this results in the following queries:

# For a new User instance:
SELECT id FROM users 
 WHERE LOWER(users.email) = BINARY 'user@example.com'
 
# For an existing User instance:
SELECT id FROM users 
 WHERE LOWER(users.email) = BINARY 'user@example.com' 
   AND users.id <> 42

This query cannot be optimized by a (unique) index on the email field and thus has to scan the full table. As our users table grew larger, these queries started to show up in our slow query log.

However, MySQL uses case-insensitive comparison by default. (To be exact, case-sensitiveness depends on the current collation, which can vary. Rails generates the weird query to make sure the check works, regardless of the current collation.) The conversion to lowercase therefore is not necessary for a uniqueness check (as long as the field has a case-insensitive collation like utf8_general_ci). I decided to write my own validation method that issues a query that can be optimized by a query.

  # Alternative for validates_uniqueness_of :email, :case_sensitive => false
  validate do |user|
    conditions = "users.email = :email"
    conditions << " AND users.id != :id" unless user.new_record?
    conditions = [conditions, { :email => user.email, :id => user.id }]
    if User.find(:first, :select => :id, :conditions => conditions)
      user.errors.add(:email, 'Already in use')
    end
  end

There is a ticket for this issue in Rails’s Lighthouse, but as of yet this issue is unresolved. For now, this solution works to keep our slow query log nice and quiet!

5 Comments - Tags: , , , , , ,

27 September Adyen payment services for Rails

Michel and I have been playing around with integrating Adyen payment services in Rails applications. We have assembled some of the pieces of code we have written, combined them, written specs for them and released the result as a gem. The package is also included on the Adyen support site.

Currently, the gem provides the following:

  • Simple configuration and setup.
  • Uses Adyen’s test or production environment based on your Rails environment.
  • Generating hidden form fields for redirecting to Adyen for a payment.
  • Calculating the signature to sign these redirects.
  • Checking Adyen’s signature when the user gets redirected back.
  • Matchers to easily test your payment forms using RSpec.
  • Receiving and storing notifications from Adyen.
  • Calling the Adyen SOAP services (requires the Handsoap gem).

Currently, not all SOAP services are implemented (because we didn’t need them all). It should be quite easy to implement them as well based on the other services that are implemented already. Don’t hesitate to submit patches!

No Comments - Tags: , , , , , , ,

27 September Performance tweaking of Ruby algorithms

I have been working on request-log-analyzer quite a lot recently. One of the things I focused on was improving the parsing performance: because it parses log files that often are very big, processing times tend to be long. So all savings are very welcome.

Improving the performance of a command line application that does a lot of processing is very different from optimizing the performance of a web application. Request-log-analyzer basically reads a file line by line and processes it, so small performance improvements in the line processing algorithm can really add up when the file has a lot of lines. I used ruby-prof to get information about the performance of our algorithms split out by method to focus my tweaking efforts. I have written down some of my findings below; hopefully they can be helpful.

Parallelization

My first thought for improving performance was parallelization: parse multiple lines at the same time. Unfortunately, this did not yield the results I was hoping for: it instead become slower. Probably, this is because Ruby implements its own in-process threading and thus only uses one core of my processor.

Block overhead

The overhead of using a block should not be underestimated. Consider the following simple change, which improved the performance of request-log-analyzer by 1.5-2% on large log files:

# with block
io.each_line { |line| process_line(line) }
 
# without block
process_line(line) while line = io.gets

Regular expressions

If you’re using complex regular expressions, and you do not expect that every string will match it successfully, it can be beneficial to test the string with a simpler regexp first. For example, request-log-analyzer uses a complex regexp to see if a line in a log file is a “Completed in…” line with the request duration in it:

# Check every line to see if it is a "completed" line and capture the values
if line =~ /Completed in (\d+)ms \((?:View: (\d+), )?DB: (\d+)\) \| (\d\d\d).+\[(http.+)\]/
  # do something with the captured values
end

I improved this by first checking for a superficial regexp that tells me with 99% certainty that the complex regexp will match the line successfully as well:

if line =~ /Completed in/
  if line =~ /Completed in (\d+)ms \((?:View: (\d+), )?DB: (\d+)\) \| (\d\d\d).+\[(http.+)\]/
    # do something with the captured values
  else
    $stderr.puts "#{line.inspect} expected to match 'Completed' regexp!"
  end
end

Depending on the log file, this can increase performance by ~3%. Another benefit of this approach is that it will give feedback on lines that matched the simple regexp, but not the complex one. This information can be used to correct the regular expressions.

Calculate things that do not change only once

Calculate things that do not change only once is easier said than done. For instance, request-log analyzer can aggregate durations in a category. A category can be based on a request field that is parsed from the log, or a Proc that calculates it:

# during parsing:
if categorizer.respond_to?(:call)
  category = categorizer.call(request)
else
  category = request[categorizer]
end
categories[category].aggregate_request(request)

With this implementation, categorizer will be checked to be a Proc for every request, but as the value of categorizer will not change, the result of the check will be constant as well. I solved this my making sure that it is always a Proc beforehand, so the check is no longer necessary during parsing:

# before parsing:
if categorizer.respond_to?(:call)
  categorizer_proc = categorizer
else
  categorizer_proc = lambda { |request| request[categorizer] }
end
 
# during parsing:
category = categorizer_proc.call(request)
categories[category].aggregate_request(request)

Performance gain: ~1%! And because on several instances a similar technique could be applied, the performance got improved by about 4% in total.

Check the most common case first

Consider the following example, which converts a number in any traffic unit (kilobytes, MB, etc) to bytes:

def convert_traffic(value, unit)
  case unit
  when :GB, :G, :gigabyte      then (value.to_f * 1000_000_000).round
  when :MB, :M, :megabyte      then (value.to_f * 1000_000).round
  when :KB, :K, :kilobyte, :kB then (value.to_f * 1000).round
  # ... even more units here
  else value.to_i
  end
end

In most cases, the value will simply given in bytes, which will be returned by the final else after all possibilities have been checked. This can be improved by checking for this possibility first:

# Converts traffic in any unit to bytes.
def convert_traffic(value, unit)
  case unit
  when nil, :B, :byte          then value.to_i
  when :GB, :G, :gigabyte      then (value.to_f * 1000_000_000).round
  when :MB, :M, :megabyte      then (value.to_f * 1000_000).round
  when :KB, :K, :kilobyte, :kB then (value.to_f * 1000).round
  # ... even more units here
  else raise "Unknown unit: #{unit.inspect}!"
  end
end

Again this change adds up to a ~1% performance increase if this method is called very often.

These kinds of changes really improved the performance of request-log-analyzer by quite a bit, so upgrade to the latest version to get some of your valuable time back! :-)

6 Comments - Tags: , , , , , , , , ,

31 August New version of Scoped search

After an almost complete rewrite, I am proud to present version 2.0 of scoped_search, the ActiveRecord plugin that makes it easy to find records using a simple query language. This new version support a new query language that supports more complex constructs, and can therefore be used to conduct more fine-grained queries on your models.

New query language

  • Logical operators: AND (&, &&), OR (|, ||) and NOT (!, -) operators, and parentheses to structure the boolean logic: police AND (car || uniform), -"village people". By default, the AND operator is used to combine different segments of your query.
  • Comparison operators: the most common comparison operators are supported, and to what you expect on integer and date field.
  • Explicit field support: only search in the specified field instead of all fields: age >= 21, created < 2009-01-01, username != "root".
  • Check for NULL fields: null? parent, set? error_message
  • Commas are supported to separate the different parts of the query.

More information about the query language can be found in the project wiki on GitHub.

New definition syntax

The new version supports a new syntax to define what fields of your model can be searched and in what cases. An example:

class User < ActiveRecord::Base   
  belongs_to :account_type
 
  scoped_search :on => [:first_name, :last_name]
  scoped_search :on => :created_at, :alias => :created, :only_explicit => true
  scoped_search :in => :account_type, :on => [:name, :description]
end

After the fields have been defined, the search_for method can be used to search your models using a named scope, just like it was before. The project wiki has more information about this new syntax. The search syntax itself hasn’t changed:

@users = User.search_for(params[:q]).paginate(page => params[:page])

Installation or upgrade

Include the gem in your environment.rb configuration and run rake gems:install to install it:

config.gem 'scoped_search', :source => 'http://gemcutter.org'

Backwards compatibility

The new version has a new syntax to define the fields that can be searched with a query. This new syntax gives you more fine-grained control over the queries that will be generated, so I urge you to adopt this new syntax. However, the old searchable_on syntax is still available for backwards compatibility.

Please contact me if you have any issues with the new version.

(Updated with new gemcutter installation instructions.)

2 Comments - Tags: , , , , , ,

25 February Creating a REST API for a Flash application

We have been working hard to implement an XML-based REST-style API for Floorplanner and some of our partners are using it already to access their users and plans. We now have started to use this API ourselves.

The Floorplanner Flash application communicates with our servers to load and save projects and designs. The backend for this functionality used to be written in PHP. Eliminating this PHP application simplifies our server setup, eases development and reduces our maintenance burden. To rewrite this functionality in Rails, we decided to eat our own dogfood and use the REST API to load and save designs. Why reinvent the wheel?

However, while implementing the changes it the Flash application, we found that it did not work out of the box, because of some limitations in ActionScript. Note that we still use ActionScript 2, so some of these issues may not be relevant for ActionScript 3. In this post, we will detail what these issues are and how we overcame them.

Using a separate format

Our REST API uses the XML format supplied by Rails. Because we had to make some changes to make it work from ActionScript, we decided to add a distinct format that we could implement differently without altering the behavior of the default XML API. Adding a new format, called “flash” can be done in Rails by adding a MIME alias to your environment:

Mime::Type.register_alias "application/xml", :flash

Now we can send different responses based on this format:

respond_to do |format|
  format.xml   { ... } # default API behavior
  format.flash { ... } # do something different
end

HTTP status codes

REST APIs use HTTP status codes to return whether a call succeeded, and if not, why. Flash however uses the browser to perform HTTP requests. The browser only returns something to Flash if the request was successful. So, if an error code is used if a request failed together with some error messages, these error messages will not be delivered to Flash and cannot be displayed to the user. We resolved this by always sending the HTTP OK status.

respond_to do |format|
  format.xml   { render :xml => @project.errors, :status => 422 }
  format.flash { render :xml => XML.failure(@project.errors, 422) }
end

Our XML.failure method will return something like:

<failure status="422">
  <error on="name">A project should have a name!</error>
</failure>

Note that other HTTP success statuses than 200 work in Safari and Firefox, but not in Internet Explorer. So, never return a 201 (:created), because Internet Explorer will not send the result to Flash!

PUT and DELETE requests

REST-style APIs use HTTP PUT requests to alter objects and DELETE requests to destroy objects. Most browsers do not support these request type. It is not supported by ActionScript either, because Flash uses the browser to send the request.

To overcome this problem, these types of requests can be simulated in Rails by sending a _method parameter along with a POST request. Unfortunately, this does not work when calling the REST API. The POST request body cannot be used to send additional variables, because it is used for the XML payload.

We solved this issue by creating additional routes for POST requests to the update and destroy actions of our resource controllers. Our routes.rb file now looks like this:

  map.resources :projects, :member => { :update => :post, :destroy => :post }

These routes route to exactly the same methods as the default REST actions (ProjectsController#update and ProjectsController#destroy), so no additional code is needed. The following calls are now equivalent:

PUT    /projects/123.xml
POST   /projects/123/update.xml
POST   /projects/123/update.flash

DELETE /projects/123.xml
POST   /projects/123/destroy.xml
POST   /projects/123/destroy.flash

The result

It requires some stretching of the pure REST principles, but doing so is worth it: we can now reuse the code we use for our API to handle Flash application calls and we can eliminate the PHP backend.

3 Comments - Tags: , , , , , , , ,

12 January Request-log-analyzer 1.0

After a complete rewrite, Bart and I are proud to present request-log-analyzer version 1.0! Request-log-analyzer is an open-source command-line tool to analyze production log files from your Rails application to produce a performance report.

What’s new?

  • More robust log parser. It parses more lines and it now combines all lines that belong to the same request, which greatly improves the amount of information available. 
  • It produces more detailed and more beautiful reports
  • A database builder is included, which will create an SQLite 3 database with all parsed request information, so you can roll your own queries.
  • Request filtering options, so you can exclude irrelevant data. An example on how this can be applied in practice can be found in the wiki. 
  • Better, more modularized design under the hood. The parser is now fully log file format-agnostic. Developing extensions and modifications, or adding support for other log file formats should be much easier now. See the development-page for some pointers.
  • Documentation in the project’s wiki. Hopefully, this helps people get up to speed with the new version and answers most questions about using the tool. If you still have questions, please contact us so we can keep improving it!

Installation

Install or upgrade to the new version with the following command:

$ sudo gem install wvanbergen-request-log-analyzer 
                --source http://gems.github.com

To get the best results out of request-log-analyzer, it is important to configure logging correctly for your application. Some pointers on how to set things up correctly can be found in the wiki.

1 Comment - Tags: , , , , , , ,

24 December Rails and Merb merge!

Good luck to the merged Rails team and hopefully Rails 3 will kick ass! Let’s hope git will really deliver on this gig! Try to refrain from using git blame too much when resolving merge conflicts. ;-)

git checkout rails && git merge merb

4 Comments - Tags: , , , , , , ,

11 December Rails 2.2 support for request-log-analyzer

I just released version 0.2.0 of request-log-analyzer, our tool to analyze request log files that are generated by Rails and Merb for performance tweaking. This new version supports the new log format of Rails 2.2, which has changed slightly.

An updated gem should be available any minute now. Run sudo gem update to upgrade the newest version.

No Comments - Tags: , , , ,

21 September HTTP status exception handling plugin

Some time ago, I wrote about putting HTTP status code to use for your Rails application. For my reinvigorated project, I wanted to apply the same technique. Instead of re-implementing it once again, I created a Rails plugin called http_status_exceptions to easily add this functionality and I have put it on Github. For more information on how to install and use the plugin, see the project’s wiki.

No Comments - Tags: , , , , ,