Techblog

Tech Blog

Our latest geek adventures!

18 November Request-log-analyzer 1.5.0

Bart and I just released request-log-analyzer version 1.5.0. New features include:

  • MySQL slow query log format support to analyze what queries are slowing down your database.
  • Format autodetection: with all those supported file formats, remembering the right --format parameter gets tricky. With format autodetection, this usually is not needed anymore!

As always, run the following command to install or upgrade to the latest version:

$ gem install request-log-analyzer

Tags: , , , , | No Comments »

17 November Case-insensitive validates_uniqueness_of slowness

Watch out when using validates_uniqueness_of :field, :case_sensitive => false. Rails transforms this in a query that cannot be supported by an index, which will really slow validation down if the underlying table grows larger.

For example, we use validates_uniqueness_of to check for duplicate e-mail addresses. Because email addresses are case-insensitive, adding :case_sensitive => false seems like a natural choice. However, this results in the following queries:

# For a new User instance:
SELECT id FROM users 
 WHERE LOWER(users.email) = BINARY 'user@example.com'
 
# For an existing User instance:
SELECT id FROM users 
 WHERE LOWER(users.email) = BINARY 'user@example.com' 
   AND users.id <> 42

This query cannot be optimized by a (unique) index on the email field and thus has to scan the full table. As our users table grew larger, these queries started to show up in our slow query log.

However, MySQL uses case-insensitive comparison by default. (To be exact, case-sensitiveness depends on the current collation, which can vary. Rails generates the weird query to make sure the check works, regardless of the current collation.) The conversion to lowercase therefore is not necessary for a uniqueness check (as long as the field has a case-insensitive collation like utf8_general_ci). I decided to write my own validation method that issues a query that can be optimized by a query.

  # Alternative for validates_uniqueness_of :email, :case_sensitive => false
  validate do |user|
    conditions = "users.email = :email"
    conditions << " AND users.id != :id" unless user.new_record?
    conditions = [conditions, { :email => user.email, :id => user.id }]
    if User.find(:first, :select => :id, :conditions => conditions)
      user.errors.add(:email, 'Already in use')
    end
  end

There is a ticket for this issue in Rails’s Lighthouse, but as of yet this issue is unresolved. For now, this solution works to keep our slow query log nice and quiet!

Tags: , , , , , , | 5 Comments »

9 November Database replication in 5 easy steps

Running an online service like Floorplanner is not without risks. There are a lot of things that can go wrong. One of them is downtime of your service due to a crashing system. It doesn’t have to happen of course, but when it does, it can have some nasty consequences. A good way to reduce the risk of a crashing system is to set up a redundant system.

In short, a redundant system is an exact copy of your main system. When your main system goes down (for whatever reason), the redundant system can take over. One of the most important parts of any online service is the database. A way to maintain an exact copy of your database is setting up database replication. This post gives you the basics on how to set up database replication using a MySQL database.

Database replication isn’t probably something you’d setup right from the start. This means that you have to work with a running system when you decide to do so. From past experience we know that it’s not a good idea to replicate your database to the same machine, let alone to the same disk. To make this work, you need a separate machine with MySQL installed. This is your slave machine and database.

Let’s get down to business. These are the steps for setting up database replication:

  1. Enable binary logging on master database
  2. Create a backup of master database
  3. Transfer backup to slave machine
  4. Import backup to slave database
  5. Start slave database

1. Enable binary logging on master database

The binary log contains all statements that update data [..]. Statements are stored in the form of “events” that describe the modifications. The binary log also contains information about how long each statement took that updated data.

Every write action that’s performed on your database, like an insert or an update, is stored in these binary logs (bin logs). The master dumps all actions in these logs and the slave database can read them and perform the same actions. This way your master and slave databases will always have the same data.

To enable binary logging you have to create a config file, for example master.cnf. An important thing here is the server-id, which is needed for replication to work. Then there is log-bin which specifies base name of binary logs, and finally binlog-do-db, which specifies which databases should be bin logged.

[mysqld]
datadir=/home/yourname/mysql/myisam/data
basedir=/usr/local/mysql
port=3306
server-id=1
log-bin=mysql-bin
binlog-do-db=my_fat_db1
binlog-do-db=ma_other_fat_db

Then start (or restart) the master database using the config file:

~ $ mysqld --defaults-file=master.cnf&amp;

The slave database needs a userid/password in order to access the master machine. It’s best practice create a dedicated user with just the required privileges. This can be done with the following SQL statement.

GRANT replication slave ON *.* TO 'repl'@'slave-ip-here'

2. Create a backup of master database

The next step it to create a backup of your master database. The type of your database is very important for this action. If you use MyISAM, you need to schedule some downtime to dump your database. Otherwise the dumped data isn’t consistent and therefor useless. With InnoDB it’s possible to dump a consistent backup of your database while keeping it up and running. We stumbled on this feature by accident, but we haven’t been able to in the official documentation.

MyISAM

~ $ mysqldump --lock-all-tables

InnoDB

~ $ mysqldump --single-transaction

3. Transfer backup to slave machine

Now that the backup is created on the master machine, it needs to be transferred to the slave machine.

4. Import backup to slave database

Once the backup is present on the slave machine, it can be imported into the slave database:

~ $ mysql &lt; filename

5. Start slave database

The last step is to tell the database that it’s a slave database. First we have to give it an server id. We use a config file for this, slave.cnf:

[mysqld]
datadir=/home/yourname/mysql/myisam/data
basedir=/usr/local/mysql
port=3306
server-id=2
replicate-do-db=my_fat_db1
replicate-do-db=my_other_fat_db

Then we start the database with this config file:

~ $ mysqld --defaults-file=slave.cnf&amp;

and we issue the following SQL statement:

CHANGE MASTER TO
-&gt; MASTER_HOST='master-ip-here',
-&gt; MASTER_PORT=3306,
-&gt; MASTER_USER='repl',
-&gt; MASTER_PASSWORD='repl',
-&gt; MASTER_LOG_FILE='bin-log-filename-here',
-&gt; MASTER_LOG_POS=4;

It’s hard to predict the MASTER_LOG_POS. You can find this number by issuing the following SQL statement on the master database:

SHOW MASTER STATUS\G;

Once the slave is configured correctly it will get the bin logs from the master machine, parse them and execute each action. It takes some time to have an exact copy of the master database, depending on the size and number of bin logs. To get an idea about the delay, issue the following query:

SHOW SLAVE STATUS\G;

With this post we hope to contribute to a less riskier world and more peaceful state of mind for the sys admins out there :-D

Tags: | 2 Comments »

3 October Evaluating static ParseTree subtrees

ParseTree is a very useful to gem that can translate Ruby code into a syntax tree. I recently needed to evaluate a static part of this tree to return the original hash it represented. I wrote a simple method called ParseTree.eval_static_tree for this purpose.

The method can only evaluate trees that have a static value composed of hashes, arrays, strings, symbols and numerics. You can however pass a block to the function that will be called for every dynamic part the method encounters (function calls, etc.)

A quick sample on how to use it:

code = '{ :static_array => ["str", 123, 4.5], :dynamic => method_call }'
tree = ParseTree.translate(code)
# => [:hash, [:lit, :static_array], [:array, [:str, "str"], 
#       [:lit, 123], [:lit, 4.5]], [:lit, :dynamic], [:vcall, :method_call]]
 
ParseTree.eval_static_tree(tree)
# => RuntimeError: tree is not static: :vcall ...
 
# Pass a block to simply return nil for every dynamic item in the tree.
ParseTree.eval_static_tree(tree) { |dynamic_subtree| nil }
# => {:dynamic=>nil, :static_array=>["str", 123, 4.5]}

Tags: , , , | No Comments »

30 September Request-log-analyzer 1.4.0

Bart and I have been working a lot on request-log-analyzer lately, our tool to produce performance reports for web applications based on their log files. Today, we released version 1.4.0, which boasts many new features since I last blogged about a release. The changelog contains all changes we have implemented recently with some additional information, but these are the highlights:

  • New and improved log formats: r-l-a can now handle Apache access logs, Rack CommonLogger logs and Amazon S3 access logs. Moreover, the Rails format has been restructured to offer more flexibility.
  • Improved database support: the database supports other databases than SQLite3 as well, and r-l-a can append information to an existing database instead of overwriting it. Moreover, a console tool similar to Rails’s script/console is bundled to inspect the database and run queries on it easily.
  • Added standard deviation to reports: the standard deviation measure has been added to duration and traffic reports to get some feel of the variation in values besides the mean.
  • E-mailing reports: r-l-a can email the performance report to a given e-mail address. This can be useful when running r-l-a in a cron job.
  • Compressed log support: r-l-a will decompress compressed logs automatically.
  • Speed improvements: we have profiled request-log-analyzer itself and significantly improved its performance.
  • API: we created a basic API so it is possible to use the r-l-a engine as a library as well.
  • Monitoring integration: integrate performance information into your Munin dashboard or your Scout account.

As always, use sudo gem install request-log-analyzer to install or upgrade.

Ruby en Rails 2009 conference

Bart and I will be presenting request-log-analyzer and performance tuning of Rails applications in general at the Ruby en Rails conference in Amsterdam, October 30-31 2009. We hope to see you there!

Tags: , , , , , , , | 2 Comments »

27 September Adyen payment services for Rails

Michel and I have been playing around with integrating Adyen payment services in Rails applications. We have assembled some of the pieces of code we have written, combined them, written specs for them and released the result as a gem. The package is also included on the Adyen support site.

Currently, the gem provides the following:

  • Simple configuration and setup.
  • Uses Adyen’s test or production environment based on your Rails environment.
  • Generating hidden form fields for redirecting to Adyen for a payment.
  • Calculating the signature to sign these redirects.
  • Checking Adyen’s signature when the user gets redirected back.
  • Matchers to easily test your payment forms using RSpec.
  • Receiving and storing notifications from Adyen.
  • Calling the Adyen SOAP services (requires the Handsoap gem).

Currently, not all SOAP services are implemented (because we didn’t need them all). It should be quite easy to implement them as well based on the other services that are implemented already. Don’t hesitate to submit patches!

Tags: , , , , , , , | No Comments »

27 September Performance tweaking of Ruby algorithms

I have been working on request-log-analyzer quite a lot recently. One of the things I focused on was improving the parsing performance: because it parses log files that often are very big, processing times tend to be long. So all savings are very welcome.

Improving the performance of a command line application that does a lot of processing is very different from optimizing the performance of a web application. Request-log-analyzer basically reads a file line by line and processes it, so small performance improvements in the line processing algorithm can really add up when the file has a lot of lines. I used ruby-prof to get information about the performance of our algorithms split out by method to focus my tweaking efforts. I have written down some of my findings below; hopefully they can be helpful.

Parallelization

My first thought for improving performance was parallelization: parse multiple lines at the same time. Unfortunately, this did not yield the results I was hoping for: it instead become slower. Probably, this is because Ruby implements its own in-process threading and thus only uses one core of my processor.

Block overhead

The overhead of using a block should not be underestimated. Consider the following simple change, which improved the performance of request-log-analyzer by 1.5-2% on large log files:

# with block
io.each_line { |line| process_line(line) }
 
# without block
process_line(line) while line = io.gets

Regular expressions

If you’re using complex regular expressions, and you do not expect that every string will match it successfully, it can be beneficial to test the string with a simpler regexp first. For example, request-log-analyzer uses a complex regexp to see if a line in a log file is a “Completed in…” line with the request duration in it:

# Check every line to see if it is a "completed" line and capture the values
if line =~ /Completed in (\d+)ms \((?:View: (\d+), )?DB: (\d+)\) \| (\d\d\d).+\[(http.+)\]/
  # do something with the captured values
end

I improved this by first checking for a superficial regexp that tells me with 99% certainty that the complex regexp will match the line successfully as well:

if line =~ /Completed in/
  if line =~ /Completed in (\d+)ms \((?:View: (\d+), )?DB: (\d+)\) \| (\d\d\d).+\[(http.+)\]/
    # do something with the captured values
  else
    $stderr.puts "#{line.inspect} expected to match 'Completed' regexp!"
  end
end

Depending on the log file, this can increase performance by ~3%. Another benefit of this approach is that it will give feedback on lines that matched the simple regexp, but not the complex one. This information can be used to correct the regular expressions.

Calculate things that do not change only once

Calculate things that do not change only once is easier said than done. For instance, request-log analyzer can aggregate durations in a category. A category can be based on a request field that is parsed from the log, or a Proc that calculates it:

# during parsing:
if categorizer.respond_to?(:call)
  category = categorizer.call(request)
else
  category = request[categorizer]
end
categories[category].aggregate_request(request)

With this implementation, categorizer will be checked to be a Proc for every request, but as the value of categorizer will not change, the result of the check will be constant as well. I solved this my making sure that it is always a Proc beforehand, so the check is no longer necessary during parsing:

# before parsing:
if categorizer.respond_to?(:call)
  categorizer_proc = categorizer
else
  categorizer_proc = lambda { |request| request[categorizer] }
end
 
# during parsing:
category = categorizer_proc.call(request)
categories[category].aggregate_request(request)

Performance gain: ~1%! And because on several instances a similar technique could be applied, the performance got improved by about 4% in total.

Check the most common case first

Consider the following example, which converts a number in any traffic unit (kilobytes, MB, etc) to bytes:

def convert_traffic(value, unit)
  case unit
  when :GB, :G, :gigabyte      then (value.to_f * 1000_000_000).round
  when :MB, :M, :megabyte      then (value.to_f * 1000_000).round
  when :KB, :K, :kilobyte, :kB then (value.to_f * 1000).round
  # ... even more units here
  else value.to_i
  end
end

In most cases, the value will simply given in bytes, which will be returned by the final else after all possibilities have been checked. This can be improved by checking for this possibility first:

# Converts traffic in any unit to bytes.
def convert_traffic(value, unit)
  case unit
  when nil, :B, :byte          then value.to_i
  when :GB, :G, :gigabyte      then (value.to_f * 1000_000_000).round
  when :MB, :M, :megabyte      then (value.to_f * 1000_000).round
  when :KB, :K, :kilobyte, :kB then (value.to_f * 1000).round
  # ... even more units here
  else raise "Unknown unit: #{unit.inspect}!"
  end
end

Again this change adds up to a ~1% performance increase if this method is called very often.

These kinds of changes really improved the performance of request-log-analyzer by quite a bit, so upgrade to the latest version to get some of your valuable time back! :-)

Tags: , , , , , , , , , | 6 Comments »

5 September Build ActionScript3 projects with TextMate in 5 easy steps

The text editor of choice of our Rails team is TextMate. Our Flash team is a bit divided between Eclipse+FDT and FlexBuilder. I wanted to see if TextMate is a good alternative for building Flash/ActionScript projects.

There are a couple of good sources on using TexMate for ActionScript projects, but it’s a little fragmented and sometimes outdated. I found pixelate’s blog post and Simon’s blog very useful.

1. TextMate

First of all, you need have MacroMates’ TexMate of course. If you don’t already own a copy, you can download a 30-day-trial.

2. ActionScript3 bundle

Simon Gregory did some fine work by creating a TextMate bundle for ActionScript3. The most important things that the bundle handles are auto-completion and auto-import. You can download it directly from his blog, or fetch it with Git or SVN.

To install via Subversion:

export LC_CTYPE=en_US.UTF-8
cd ~/"Library/Application Support/TextMate/Bundles/"
svn co http://svn.textmate.org/trunk/Review/Bundles/ActionScript\ 3.tmbundle
osascript -e 'tell app "TextMate" to reload bundles'

To install via Git:

cd ~/"Library/Application Support/TextMate/Bundles/"
git clone git://github.com/simongregory/actionscript3-tmbundle.git "ActionScript 3.tmbundle"
osascript -e 'tell app "TextMate" to reload bundles'

3. Flex SDK

The actual building of your project is done by the free Flex 3 SDK. Download it and move the extracted folder into your /Developer/SDKs/ folder. The SDK has to be accessible throughout your whole system. You can do this by adding it to the PATH variable in the /etc/profile file. Open terminal and type:

sudo mate /etc/profile

Then add the folder “/Developers/SDKs/flex_sdk_3/bin” to the file and save it. It should look something like this:

PATH="/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/mysql/bin:/Developer/SDKs/flex_sdk_3/bin"

4. TextMate settings

Before we can start the ActionScript fun we have to setup a few things in TextMate. Open TextMate and select File→New Project. Click on the info button located in the bottom of the Project Drawer. Add two shell variables so that the ActionScript Bundle knows where to look for your files:

TM_FLEX_FILE_SPECS    src/Main.as
TM_FLEX_OUTPUT        bin/Main.swf

We also need to let TextMate know where the Flex SDK is located. Go to TextMate→Preferences→Advanced→Shell Variables and add a new global variable:

TM_FLEX_PATH    Developer/SDKs/flex_sdk_3

5. Hello World

With all that out of the way, we can finally start working on a ActionScript project: Hello World. Create a folder on your system that will hold this project. Drag this folder to TextMate’s Project Drawer. Create two new folders named bin and src in your project folder. Then create a new file in the src folder and name it Main.as. It should look something like this:

package {        
    import flash.display.Sprite;
    import flash.text.TextField;
 
    [SWF( backgroundColor='0xFFFFFF', frameRate='30', width='200', height='200')]
 
    public class Main extends Sprite {
        private var textField: TextField;
 
        public function Main() {
            textField = new TextField();
            textField.text = "Hello World.";
 
            addChild(textField);
        }        
    }
}

Choose Bundles→ActionScript 3→Build to build your project. A terminal window pops up which runs the Flex SDK to build your project. You can find the resulting Main.swf file in your bin folder. Yay!

Notes

If you want to build Flash Player 10 specific stuff (like me) you have to let the Flex SDK know. Open /Developer/SDKs/flex_sdk_3/frameworks/flex-config.xml and set the target-player tag to 10.0.0

<target-player>10.0.0</target-player>

The easiest way to work with libraries, the so called SWC files, in this setup is to drop them directly into /Developer/SDKs/flex_sdk_3/frameworks/libs folder. That way the Flex SDK has no problem finding them.

Tags: | 2 Comments »

31 August New version of Scoped search

After an almost complete rewrite, I am proud to present version 2.0 of scoped_search, the ActiveRecord plugin that makes it easy to find records using a simple query language. This new version support a new query language that supports more complex constructs, and can therefore be used to conduct more fine-grained queries on your models.

New query language

  • Logical operators: AND (&, &&), OR (|, ||) and NOT (!, -) operators, and parentheses to structure the boolean logic: police AND (car || uniform), -"village people". By default, the AND operator is used to combine different segments of your query.
  • Comparison operators: the most common comparison operators are supported, and to what you expect on integer and date field.
  • Explicit field support: only search in the specified field instead of all fields: age >= 21, created < 2009-01-01, username != "root".
  • Check for NULL fields: null? parent, set? error_message
  • Commas are supported to separate the different parts of the query.

More information about the query language can be found in the project wiki on GitHub.

New definition syntax

The new version supports a new syntax to define what fields of your model can be searched and in what cases. An example:

class User < ActiveRecord::Base   
  belongs_to :account_type
 
  scoped_search :on => [:first_name, :last_name]
  scoped_search :on => :created_at, :alias => :created, :only_explicit => true
  scoped_search :in => :account_type, :on => [:name, :description]
end

After the fields have been defined, the search_for method can be used to search your models using a named scope, just like it was before. The project wiki has more information about this new syntax. The search syntax itself hasn’t changed:

@users = User.search_for(params[:q]).paginate(page => params[:page])

Installation or upgrade

Include the gem in your environment.rb configuration and run rake gems:install to install it:

config.gem 'scoped_search', :source => 'http://gemcutter.org'

Backwards compatibility

The new version has a new syntax to define the fields that can be searched with a query. This new syntax gives you more fine-grained control over the queries that will be generated, so I urge you to adopt this new syntax. However, the old searchable_on syntax is still available for backwards compatibility.

Please contact me if you have any issues with the new version.

(Updated with new gemcutter installation instructions.)

Tags: , , , , , , | 2 Comments »

26 May Papervision3D 2.1 – alpha

Just committed rev. 911 to the Papervision3D SVN trunk. Rev. 911 and upwards will become Papervision3D 2.1 because the changes made are quite big.
Major changes where made to the DAE, MD2 and animation classes.

NOTE:
This revision is considerably different then previous revisions. Use with care!
At this point its not advised to use rev. 911 for production.

This revision fixes several issues regarding the DAE class:

  1. vertex-animation
  2. nested animations
  3. Cinema4D support
  4. morph-weight animation
  5. splines
  6. cloning
  7. play(), play(”clipName”), stop(), pause(), resume()
  8. more…

The whole org.papervision3d.core.animation.* package has been revamped completely to allow for the changes in the DAE class.

DAE Example:

var autoPlay : Boolean = false; // don't play animations automatically
 
var dae : DAE = new DAE( autoPlay, "myCollada" );
 
dae.addEventListener(FileLoadEvent.LOAD_COMPLETE, onDaeComplete);
dae.addEventListener(FileLoadEvent.LOAD_PROGRESS, onDaeLoadProgress);
 
// optionally pass materials to DAE
// NOTE: here's a change with previous revs :
// 1. lookup the <material> elements in the COLLADA file (inside <library_materials>).
// 2. write down / remember the @id attribute of the <material> element.
// 3. materials.addMaterial( myMaterial, materialElementID ).
// ==> this will probably change in future revs
var materials : MaterialsList = new MaterialsList();
 
// If textures fail to load optionally add some search-paths 
// (relative to the swf):
dae.addFileSearchPath( "images" );
dae.addFileSearchPath( "textures" );
 
// set to true if you get a script-timeout error
var asyncParsing : Boolean = false;
 
// load it!
dae.load( "/path/to/dae", materials, asyncParsing );
 
/**
 * The DAE has loaded completely
 */
private function onDaeComplete(event : FileLoadEvent) : void
{
     var dae : DAE = event.target as DAE;
 
     // add to scene
     scene.addChild( dae );
 
     // start playing animation (if any available)
     // other animation controls include :
     // 1. play( "clipName ")
     // 2. stop()
     // 3. pause()
     // 4. resume()
     // 5. playing (getter: bool indicating if playing)
     dae.play();
 
     // lets create a clone
     // NOTE: DAE#clone() is somewhat bugged still, 
     // but seems to work in most cases
     var clone : DAE = dae.clone() as DAE;
 
     // add clone to scene
     scene.addChild( clone );
 
     // move it a bit
     clone.x = 200;
}
 
private function onDaeLoadProgress(event : FileLoadEvent) : void
{
 
}

MD2 Example:

var md2 : MD2 = new MD2();
 
var material : MaterialObject3D = new WireframeMaterial();
 
md2.addEventListener(FileLoadEvent.LOAD_COMPLETE, onMD2Complete);
 
md2.load("/path/to/md2", material);
 
scene.addChild(md2);
 
private function onMD2Complete(event : FileLoadEvent) : void
{
       var md2 : MD2 = event.target as MD2;
 
       md2.play();
       // or play some clip :
       // md2.play( "run" )
}

Animation:

Click the image below to show an example of the new animation controls.
Animation Test
Download the source of this example here.

The DAE and MD2 class implement IAnimatable, IAnimationProvider and IControllerProvider, which can be found in the org.papervision3d.core.animation.* package.

As so much has changed I’m sure some bugs are introduced. Please let me know!

PS: I’ll be on vacation until june 8th, so its unlikely I’ll be able to fix any bugs before that time.
PS2: Many people have helped by submitting code-snips, reporting bugs etc. I’ll credit you all when I’m back :-)

38 Comments »