Techblog

Tech Blog

Our latest geek adventures!

Posts Tagged ‘request-log-analyzer’

18 November Request-log-analyzer 1.5.0

Bart and I just released request-log-analyzer version 1.5.0. New features include:

  • MySQL slow query log format support to analyze what queries are slowing down your database.
  • Format autodetection: with all those supported file formats, remembering the right --format parameter gets tricky. With format autodetection, this usually is not needed anymore!

As always, run the following command to install or upgrade to the latest version:

$ gem install request-log-analyzer

No Comments - Tags: , , , ,

30 September Request-log-analyzer 1.4.0

Bart and I have been working a lot on request-log-analyzer lately, our tool to produce performance reports for web applications based on their log files. Today, we released version 1.4.0, which boasts many new features since I last blogged about a release. The changelog contains all changes we have implemented recently with some additional information, but these are the highlights:

  • New and improved log formats: r-l-a can now handle Apache access logs, Rack CommonLogger logs and Amazon S3 access logs. Moreover, the Rails format has been restructured to offer more flexibility.
  • Improved database support: the database supports other databases than SQLite3 as well, and r-l-a can append information to an existing database instead of overwriting it. Moreover, a console tool similar to Rails’s script/console is bundled to inspect the database and run queries on it easily.
  • Added standard deviation to reports: the standard deviation measure has been added to duration and traffic reports to get some feel of the variation in values besides the mean.
  • E-mailing reports: r-l-a can email the performance report to a given e-mail address. This can be useful when running r-l-a in a cron job.
  • Compressed log support: r-l-a will decompress compressed logs automatically.
  • Speed improvements: we have profiled request-log-analyzer itself and significantly improved its performance.
  • API: we created a basic API so it is possible to use the r-l-a engine as a library as well.
  • Monitoring integration: integrate performance information into your Munin dashboard or your Scout account.

As always, use sudo gem install request-log-analyzer to install or upgrade.

Ruby en Rails 2009 conference

Bart and I will be presenting request-log-analyzer and performance tuning of Rails applications in general at the Ruby en Rails conference in Amsterdam, October 30-31 2009. We hope to see you there!

2 Comments - Tags: , , , , , , ,

27 September Performance tweaking of Ruby algorithms

I have been working on request-log-analyzer quite a lot recently. One of the things I focused on was improving the parsing performance: because it parses log files that often are very big, processing times tend to be long. So all savings are very welcome.

Improving the performance of a command line application that does a lot of processing is very different from optimizing the performance of a web application. Request-log-analyzer basically reads a file line by line and processes it, so small performance improvements in the line processing algorithm can really add up when the file has a lot of lines. I used ruby-prof to get information about the performance of our algorithms split out by method to focus my tweaking efforts. I have written down some of my findings below; hopefully they can be helpful.

Parallelization

My first thought for improving performance was parallelization: parse multiple lines at the same time. Unfortunately, this did not yield the results I was hoping for: it instead become slower. Probably, this is because Ruby implements its own in-process threading and thus only uses one core of my processor.

Block overhead

The overhead of using a block should not be underestimated. Consider the following simple change, which improved the performance of request-log-analyzer by 1.5-2% on large log files:

# with block
io.each_line { |line| process_line(line) }
 
# without block
process_line(line) while line = io.gets

Regular expressions

If you’re using complex regular expressions, and you do not expect that every string will match it successfully, it can be beneficial to test the string with a simpler regexp first. For example, request-log-analyzer uses a complex regexp to see if a line in a log file is a “Completed in…” line with the request duration in it:

# Check every line to see if it is a "completed" line and capture the values
if line =~ /Completed in (\d+)ms \((?:View: (\d+), )?DB: (\d+)\) \| (\d\d\d).+\[(http.+)\]/
  # do something with the captured values
end

I improved this by first checking for a superficial regexp that tells me with 99% certainty that the complex regexp will match the line successfully as well:

if line =~ /Completed in/
  if line =~ /Completed in (\d+)ms \((?:View: (\d+), )?DB: (\d+)\) \| (\d\d\d).+\[(http.+)\]/
    # do something with the captured values
  else
    $stderr.puts "#{line.inspect} expected to match 'Completed' regexp!"
  end
end

Depending on the log file, this can increase performance by ~3%. Another benefit of this approach is that it will give feedback on lines that matched the simple regexp, but not the complex one. This information can be used to correct the regular expressions.

Calculate things that do not change only once

Calculate things that do not change only once is easier said than done. For instance, request-log analyzer can aggregate durations in a category. A category can be based on a request field that is parsed from the log, or a Proc that calculates it:

# during parsing:
if categorizer.respond_to?(:call)
  category = categorizer.call(request)
else
  category = request[categorizer]
end
categories[category].aggregate_request(request)

With this implementation, categorizer will be checked to be a Proc for every request, but as the value of categorizer will not change, the result of the check will be constant as well. I solved this my making sure that it is always a Proc beforehand, so the check is no longer necessary during parsing:

# before parsing:
if categorizer.respond_to?(:call)
  categorizer_proc = categorizer
else
  categorizer_proc = lambda { |request| request[categorizer] }
end
 
# during parsing:
category = categorizer_proc.call(request)
categories[category].aggregate_request(request)

Performance gain: ~1%! And because on several instances a similar technique could be applied, the performance got improved by about 4% in total.

Check the most common case first

Consider the following example, which converts a number in any traffic unit (kilobytes, MB, etc) to bytes:

def convert_traffic(value, unit)
  case unit
  when :GB, :G, :gigabyte      then (value.to_f * 1000_000_000).round
  when :MB, :M, :megabyte      then (value.to_f * 1000_000).round
  when :KB, :K, :kilobyte, :kB then (value.to_f * 1000).round
  # ... even more units here
  else value.to_i
  end
end

In most cases, the value will simply given in bytes, which will be returned by the final else after all possibilities have been checked. This can be improved by checking for this possibility first:

# Converts traffic in any unit to bytes.
def convert_traffic(value, unit)
  case unit
  when nil, :B, :byte          then value.to_i
  when :GB, :G, :gigabyte      then (value.to_f * 1000_000_000).round
  when :MB, :M, :megabyte      then (value.to_f * 1000_000).round
  when :KB, :K, :kilobyte, :kB then (value.to_f * 1000).round
  # ... even more units here
  else raise "Unknown unit: #{unit.inspect}!"
  end
end

Again this change adds up to a ~1% performance increase if this method is called very often.

These kinds of changes really improved the performance of request-log-analyzer by quite a bit, so upgrade to the latest version to get some of your valuable time back! :-)

6 Comments - Tags: , , , , , , , , ,

28 March Counter feedback

Request-log-analyzer watchers on Github

This week, request-log-analyzer obtained its 100th watcher on GitHub!

Bart and I have worked hard to make r-l-a a useful product for many people in various situations. The fact that more than 100 people are following the project’s progress and that at this moment, the gem has been download almost 200 times, shows that we are somewhat successful in this regard. Numbers like these, in combination with the e-mail messages we have received, motivate us to keep spending time on the project and keep improving it, even if these improvements are not directly useful for our own projects. It has grown beyond scratching our own itch

On a related note, my Flickr photostream recently welcomed its 10,000th visitor.

10,000 visitors on my Flickr photostream!

What started as a convenient utility to backup and share my holiday photos with my family and friends, now has become somewhat of a showcase of what I am about and what I am up to. I would not consider myself a “professional” photographer and I am not an active member of the Flickr community, but still I get feedback on my photographs by visitors, because their visits increase the view counters of my photos.

These counters have really motivated me to make more of an effort when I put photos on Flickr. I started by adding titles, descriptions and tags, so that my photos are easier to find. I also became much more critical of the pictures I upload to Flickr: new uploads have to add something significant to my collection. Analyzing why some pictures got more attention than others made me a better photographer, although there still is a lot of room for improvement. :-)  

Ignite the lazyweb, kick-start a quality improvement loop

What interests me in these examples is that simple counters like watchers on GitHub or views on Flickr can be valuable feedback and can motivate people to put in effort. The end result is quality improvement: write better software, make better pictures, etc.. Even an inherently lazy person like me can get motivated to keep putting in effort and to keep improving myself, because of such a simple feedback loop! :-)

Additionally, it creates a dependence on the website in question. I look at my Flickr stats page every day, and I am subscribed to my activity feed on GitHub to get notified when new people start watching my projects. I now simply have to publish tools I write on GitHub to boost its quality, immediately and in the long run. And I have to upload pictures to Flickr as it is vital for my photography learning curve.

I guess I finally figured out what Web 2.0 is all about! :-) Can we use a similar technique on Floorplanner to boost the quality of the designs our visitors make?

No Comments - Tags: , , , , , ,

12 January Request-log-analyzer 1.0

After a complete rewrite, Bart and I are proud to present request-log-analyzer version 1.0! Request-log-analyzer is an open-source command-line tool to analyze production log files from your Rails application to produce a performance report.

What’s new?

  • More robust log parser. It parses more lines and it now combines all lines that belong to the same request, which greatly improves the amount of information available. 
  • It produces more detailed and more beautiful reports
  • A database builder is included, which will create an SQLite 3 database with all parsed request information, so you can roll your own queries.
  • Request filtering options, so you can exclude irrelevant data. An example on how this can be applied in practice can be found in the wiki. 
  • Better, more modularized design under the hood. The parser is now fully log file format-agnostic. Developing extensions and modifications, or adding support for other log file formats should be much easier now. See the development-page for some pointers.
  • Documentation in the project’s wiki. Hopefully, this helps people get up to speed with the new version and answers most questions about using the tool. If you still have questions, please contact us so we can keep improving it!

Installation

Install or upgrade to the new version with the following command:

$ sudo gem install wvanbergen-request-log-analyzer 
                --source http://gems.github.com

To get the best results out of request-log-analyzer, it is important to configure logging correctly for your application. Some pointers on how to set things up correctly can be found in the wiki.

1 Comment - Tags: , , , , , , ,

11 December Rails 2.2 support for request-log-analyzer

I just released version 0.2.0 of request-log-analyzer, our tool to analyze request log files that are generated by Rails and Merb for performance tweaking. This new version supports the new log format of Rails 2.2, which has changed slightly.

An updated gem should be available any minute now. Run sudo gem update to upgrade the newest version.

No Comments - Tags: , , , ,

29 August Rails-log-analyzer matures

Since I announced rails-log-analyzer some weeks ago, quite a lot has happened! Apparently there is some interest in such a tool: on this blog we get a lot of traffic looking for more info, the github project already has 22 watchers and it even has been forked!

In the mean time, Bart and I worked hard to add new functionality and refactored the internal design. As a result, I have released request-log-analyzer 0.1.0 today!

Changes: 

  • The project is renamed to request-log-analyzer, because we plan to support log files from other frameworks as well; Merb is planned to be supported in the near future.
  • The tool is distributed as a gem, making it much easier to install and update.
  • More reports, colorized output, parsing progress bars, command line arguments, etc…
  • Added a tool to create a SQLite database with all the parsed info from the log file, so you can do your own analysis.

Installation:

gem sources -a http://gems.github.com
sudo gem install wvanbergen-request-log-analyzer

Usage:

request-log-analyzer  [LOG FILES*]
request-log-analyzer -c 20 -z log/production.log

Please let me know what you think! If you have any problems using the tool, do not hesitate to contact me!

13 Comments - Tags: , , , ,

15 August Rails log analyzer

My friend Bart from movesonrails.com just blogged about Rails log analyzer, a command line tool to get performance statistics for your Rails application by parsing its log file.

What started as an exercise for me to write a command line ruby program, has been extended and improved by Bart to be actually useful! We decided to release it under an MIT license. You can found the source on github. The project’s wiki contains usage information and an example of the output it will produce.

5 Comments - Tags: , , ,