Techblog

Contributions by Willem van Bergen

About Willem van Bergen

My website, online since 1999: www.vanbergen.org. I can also be found on Flickr, GitHub, LinkedIn and - of course - on Google.

26 November Generating thumbnails

I would like to thank you all for helping us build our thumbnail database!
I presume this statement might be in need of some clarification, so bear with me when I go into the technical details on this one.

For every design that is saved on Floorplanner, we create a thumbnail in JPEG format. We use these thumbnails for the gallery, and now we have included them on everyone’s dashboard. For various reasons, however, we do not have a thumbnail available for every design. With your help, we are improving the thumbnail database while you are browsing the site!

The thumbnail images are stored on Amazon S3. We know the URL that the thumbnail of a design should have, but we do not know whether it is actually available. If it is not, the result is a nasty "image not found" placeholder on the dashboard. This is of course not acceptable! The only way to know whether a thumbnail exists is to request the URL and see whether we get an image back from Amazon or an HTTP 404 status. Doing this server-side is very time-consuming, so we chose to find a client-side solution.
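
For comparison, a server-side check could look roughly like the Ruby sketch below. It is only an illustration of the approach we decided against: the method names head_status and thumbnail_available? are made up for this example, and every design on a page would cost one extra HTTP round trip.

```ruby
require 'net/http'
require 'uri'

# Performs a HEAD request and returns the numeric HTTP status code.
def head_status(url)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.head(uri.request_uri).code.to_i
  end
end

# True when the thumbnail URL answers with 200 OK.
# status_lookup is injectable (hypothetical parameter) so the logic can be
# exercised without touching the network.
def thumbnail_available?(url, status_lookup = method(:head_status))
  status_lookup.call(url) == 200
end
```

Calling thumbnail_available? for a whole dashboard of designs means waiting for each response in turn, which is why moving the check to the browser is attractive.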

We found that JavaScript can be used to check whether an image exists. An AJAX call cannot be used, because cross-site calls are not supported. However, the JavaScript Image object can be used for this purpose:

var img = new Image();
img.onload = function(event) {
  // image was found and loaded successfully
  document.getElementById('img-tag').src = img.src;
};
img.onerror = function(event) {
  // An error occurred while loading the image
  document.getElementById('img-tag').src = '/images/thumb-unavailable.jpg';
};
 
// Setting the src property will trigger the events.
img.src = 'http://link.to.amazon.s3/design/thumbnail.jpg';

A nice thumbnail not available image will be shown if the thumbnail file cannot be found on S3. This is much nicer, and the check is completely done client-side! However, we found a way this could even be better by changing the onerror event handler. Instead of displaying a thumbnail not available image, we can simply load a small instance of the Floorplanner application to display a small version of the design. Moreover, we can instruct it to generate a thumbnail JPEG and save it to S3!

So, every time you see a small Floorplanner loading on your dashboard, you are creating a missing thumbnail. Next time, the thumbnail will be available on S3 and there will be no need to start the Floorplanner application. A nice example of distributed computing, mixed with a hint of SETI@home. I like it!

21 September HTTP status exception handling plugin

Some time ago, I wrote about putting HTTP status codes to use in your Rails application. For my reinvigorated project, I wanted to apply the same technique. Instead of re-implementing it once again, I created a Rails plugin called http_status_exceptions to easily add this functionality, and I have put it on GitHub. For more information on how to install and use the plugin, see the project’s wiki.
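
To give an idea of the general technique, here is a minimal sketch in plain Ruby. The class and method names (StatusError, NotFound, Forbidden, respond) are made up for this illustration and are not the plugin's actual API; the project’s wiki describes the real thing.

```ruby
# Hypothetical names throughout; this is NOT the plugin's actual API.
# An exception that carries the HTTP status it should result in.
class StatusError < StandardError
  attr_reader :status

  def initialize(status, message = nil)
    @status = status
    super(message || "HTTP #{status}")
  end
end

class NotFound < StatusError
  def initialize(message = 'Not Found')
    super(404, message)
  end
end

class Forbidden < StatusError
  def initialize(message = 'Forbidden')
    super(403, message)
  end
end

# Runs the block and turns any StatusError into a [status, body] pair,
# which is roughly what a controller-level rescue handler would render.
def respond
  [200, yield]
rescue StatusError => e
  [e.status, e.message]
end

status, body = respond { raise NotFound, 'No such project' }
puts "#{status}: #{body}"
```

The nice part of this style is that any method deep inside a request can simply raise, without knowing how the response is rendered.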

20 September Batch file renaming

I just started working on an old Rails project after having neglected it for 15 months. Most of the view files still had the good old .rhtml extension. I was too lazy to rename these files by hand, both on my file system and in the git repository. I used the following Bash commands to do the job:

First, I renamed all the partials to the .erb extension. Note: I am not using .html.erb, as some of these partials are used in js-formatted responses as well:

for i in $(find app/views -name '_*.rhtml'); do
  git mv "$i" "$(echo "$i" | sed 's/\.rhtml$/.erb/')"
done

The remaining files could now be renamed to .html.erb with a similar command:

for i in $(find app/views -name '*.rhtml'); do
  git mv "$i" "$(echo "$i" | sed 's/\.rhtml$/.html.erb/')"
done

Note that this technique works with Subversion as well: just substitute git with svn in the command above. A regular rename is possible as well by leaving out git altogether!
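
If you would rather review the renames before running them, the same mapping can be sketched in Ruby. This is just an illustration of the two rules above (partials get .erb, everything else gets .html.erb); the method name renamed_view is made up for this example.

```ruby
# Returns the new file name for an .rhtml view, or nil if the path
# does not end in .rhtml. Partials (file names starting with "_")
# get .erb, all other views get .html.erb.
def renamed_view(path)
  return nil unless path.end_with?('.rhtml')

  base = path.sub(/\.rhtml\z/, '')
  partial = File.basename(path).start_with?('_')
  partial ? "#{base}.erb" : "#{base}.html.erb"
end

# Print the git commands instead of running them, so they can be checked first.
Dir.glob('app/views/**/*.rhtml').sort.each do |path|
  puts "git mv #{path} #{renamed_view(path)}"
end
```

Because both rules live in one method, a single pass over the files is enough here, whereas the shell version needs the partials to be renamed first.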

Now my file names are Rails-compliant again, I can start refactoring all the code that is not up to current Rails standards anymore. Ah, the virtues of developing with a rapidly evolving framework…

13 September Remote branches in git

I have been using git for a while now, and I believe I have the basic workflow under control. Committing, reverting, using local branches for major refactoring work: been there, done that! ;) However, I recently got some collaborators on my GitHub projects, so I had to start working with other remote repositories and branches.

I found this blog post, which was really helpful. I am sharing some other things that helped me in the last couple of weeks. Hopefully, this saves other people some time Googling. If you know better ways to accomplish these tasks, please let me know!

Things to remember about remote branches

Because I had some trouble discovering how to properly work with remote repositories, I am sharing what I found. The most important things to remember:

  • Never forget to switch to the correct local branch (using git checkout local-branch). This is easier if you set up your command prompt to include your current branch.
  • The names of the local and remote branches do not need to match, but matching names are highly recommended if you do not want to go crazy ;) .
  • Before you can push changes from a local branch to a remote branch, all the commits of the remote branch have to be included in your local branch. This can be done by executing git pull remote-name remote-branch in your local branch.

Checking out a newly created branch in a remote repository

Bart is implementing Merb log parsing for request-log-analyzer. He has put his progress in a separate branch of the github project. My local repository does not yet include this branch, but I want to check it out. Note that I am using a different name than the branch name on the github project.

$ git branch merb origin/generic_base
$ git checkout merb

Update: The same can be accomplished with a single command, which sets up remote tracking as well:

$ git checkout -b merb origin/generic_base

When the new functionality is finished, the following commands will merge the changes in the merb branch to the master branch.

  # go to my local master branch and merge the merb-branch
$ git checkout master
$ git merge merb
 
  # push the changes to the master branch on github
$ git push origin master

Merging back a fork

Wes Hays is helping me out on the scoped_search plugin. He implemented OR in the query language in his own fork. I wanted to merge his changes back into my master branch:

  # add a reference to the remote repository 
  # and fetch the latest data from that repository
$ git remote add gbdev git://github.com/gbdev/scoped_search.git
$ git fetch gbdev
 
  # create a local branch for the fork to follow a remote branch
$ git branch gbdev-fork gbdev/master
$ git checkout gbdev-fork

Now, my local gbdev-fork branch contains Wes’s code. Because Wes’s repository was forked from my repository, git will know that most of the history of my master branch and gbdev-fork branch is the same.

After some testing, I was ready to include his changes by merging the gbdev-fork branch into my local master branch:

  # go back to my master branch, and merge the changes
$ git checkout master
$ git merge gbdev-fork
  # push the changes to the master branch at github
$ git push origin master

Update: I found out that you do not have to create a local branch when merging a remote branch. You can do read-only work directly on the remote branch:

  # Add a new remote to your repository and fetch updates
$ git remote add gbdev git://github.com/gbdev/scoped_search.git
$ git fetch gbdev
 
  # Checkout the remote branch for testing (read-only)
$ git checkout gbdev/master
 
  # After successful testing, merge the branch into the master branch:
$ git checkout master
$ git merge gbdev/master
$ git push origin master

Pushing tags to a remote repository

You can create tags locally, but you probably want to send them to the remote repository as well:

  # create a local tag "tagname" with the given message.
$ git tag -a "tagname" -m "message" 
  # send your tags to the remote repository "origin"
$ git push origin --tags

Deleting tags

Sometimes, you want to remove a faulty tag. If you already pushed your tags to a remote repository, you probably want to delete it from that repository, too:

  # remove local tag
$ git tag -d tagname
  # remove tag from the remote repository "origin" using colon syntax
$ git push origin :tagname

29 August Rails-log-analyzer matures

Since I announced rails-log-analyzer some weeks ago, quite a lot has happened! Apparently there is some interest in such a tool: on this blog we get a lot of traffic looking for more info, the github project already has 22 watchers and it even has been forked!

In the meantime, Bart and I worked hard to add new functionality and refactor the internal design. As a result, I have released request-log-analyzer 0.1.0 today!

Changes: 

  • The project is renamed to request-log-analyzer, because we plan to support log files from other frameworks as well; Merb is planned to be supported in the near future.
  • The tool is distributed as a gem, making it much easier to install and update.
  • More reports, colorized output, parsing progress bars, command line arguments, etc…
  • Added a tool to create a SQLite database with all the parsed info from the log file, so you can do your own analysis.

Installation:

gem sources -a http://gems.github.com
sudo gem install wvanbergen-request-log-analyzer

Usage:

request-log-analyzer  [LOG FILES*]
request-log-analyzer -c 20 -z log/production.log

Please let me know what you think! If you have any problems using the tool, do not hesitate to contact me!

15 August Rails log analyzer

My friend Bart from movesonrails.com just blogged about Rails log analyzer, a command line tool to get performance statistics for your Rails application by parsing its log file.

What started as an exercise for me to write a command line Ruby program has been extended and improved by Bart to be actually useful! We decided to release it under an MIT license. You can find the source on github. The project’s wiki contains usage information and an example of the output it will produce.

29 July Active OLAP released

Remember my post about easy OLAP queries in Rails? I rewrote it almost completely and published it as a Rails plugin on github for anyone to use! It is now called Active OLAP.

Although it is a complete rewrite, the API I demoed in my previous post should still work with some small changes. The most important one: you have to enable it for every class you want to use it on with the enable_active_olap method. You can provide a block with dimension definitions to this method, but this is not mandatory:

class User < ActiveRecord::Base
 
  enable_active_olap do |olap|
 
    # create a simple dimension on the account_type field
    olap.dimension :account_type
 
    # create a dimension with custom categories
    # the order of the categories will be kept in the results 
    # if you use an array to define the categories.
    olap.dimension :nationality, :categories => [
      [:usa, { :country => 'US' }],
      [:china, { :country => 'CN' }]
      # other is automatically added
    ]
 
    # Easily create a trend dimension
    olap.dimension :created_daily, :trend => {
      :timestamp_field => :created_at,
      :period_length => 1.day, 
      :period_count => 20
    }
  end
end

Now, we can use these dimensions for our OLAP queries. Multiple dimensions are supported too!

# simple query
@result = User.olap_query(:nationality)
# @result[:usa] == 123, @result[:china] == 456, @result[:other] == 789
 
# do drilldown using will_paginate to paginate the results
# olap_drilldown is implemented as a named_scope
@users = User.olap_drilldown(:nationality => :china).paginate(:page => 1)
 
# multiple dimensions!
@result = User.olap_query(:nationality, :created_daily)
@users = User.olap_drilldown(:nationality => :china, 
                        :created_daily => :period_19)

I am working on a generic controller that can easily be added to your Rails project. Just define dimensions for your models and the controller will let you execute OLAP queries and display the results as a table or a graph.

Keep an eye on this weblog or the github project if you want to stay up-to-date! Or, contact me if you have questions, suggestions or want to help out.

26 July Easy search with ActiveRecord

A couple of minutes ago I released scoped_search, a Rails/ActiveRecord plugin that makes it easy to search your models. It is very easy to use:

  1. Install the plugin in your vendor/plugins directory from http://github.com/wvanbergen/scoped_search, or add the gem to your Rails environment.rb:

    config.gem 'wvanbergen-scoped_search', :lib => 'scoped_search', 
        :source => 'http://gems.github.com'

    Call rake gems:install afterwards to ensure the gem is installed.

  2. Define in what fields your model should be searched by calling
    searchable_on :some, :field, :names
  3. Find your records by calling search_for("query keywords")

That’s all! A short example:

class Project < ActiveRecord::Base
  searchable_on :name, :description
end
 
Project.search_for("search keywords").each do |project| 
  puts project.name
end
 
# SELECT * FROM projects WHERE 
#      (name LIKE '%search%' OR description LIKE '%search%') 
#  AND (name LIKE '%keywords%' OR description LIKE '%keywords%')
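
To make the generated query a bit more concrete, here is a minimal sketch in plain Ruby of how such a WHERE clause could be assembled from the keywords and the searchable fields. It is an illustration only, not the plugin's actual implementation; the method name search_conditions is made up, and the result uses the [sql, bind values...] shape that ActiveRecord accepts for :conditions.

```ruby
# Builds conditions in which every keyword must match at least one field.
# Returns nil for an empty query, otherwise [sql, *bind_values].
def search_conditions(query, fields)
  keywords = query.split
  return nil if keywords.empty?

  # One "(field1 LIKE ? OR field2 LIKE ? ...)" group per keyword
  per_keyword = '(' + fields.map { |field| "#{field} LIKE ?" }.join(' OR ') + ')'
  sql = Array.new(keywords.size, per_keyword).join(' AND ')

  # Each keyword is bound once for every field in its group
  binds = keywords.flat_map { |keyword| Array.new(fields.size, "%#{keyword}%") }
  [sql, *binds]
end

p search_conditions('search keywords', [:name, :description])
```

Using placeholders instead of interpolating the keywords keeps user input out of the SQL string itself.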

This functionality is built completely upon named_scope. The search_for method is actually a named scope that was created by the call to searchable_on. Because these scopes can be chained, this offers some great possibilities.

For example, in Floorplanner, we only want you to search on the projects you have access to. We have implemented this access logic in another named scope. The calls can simply be chained:

class Project < ActiveRecord::Base
  searchable_on :name, :description
 
  named_scope :accessible_by, lambda { |user| ... }
  named_scope :published, :conditions => 'published_at IS NOT NULL'
end
 
@projects = Project.accessible_by(current_user).published.search_for('query')
@projects.each { |project| ... }

This plugin is released under the MIT license, so please use it for any purpose you see fit. There are some TODOs: you currently cannot search on fields in other tables, and splitting the search string into keywords is very basic. Please contact me if you have implemented any of these features and you are willing to share them! Do not hesitate to contact me in case of problems either.

Update: I added support for quotes and the minus sign to the query language:
Project.search_for('willem -"van bergen"').count
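
A minimal sketch of how such a query string could be split into terms, assuming a simple regular-expression tokenizer. This is not the plugin's actual parser: the method name parse_query and the :include / :exclude markers are made up for this illustration.

```ruby
# Splits a query into [:include | :exclude, phrase] pairs.
# A leading minus sign excludes the term; double quotes keep a phrase together.
def parse_query(query)
  query.scan(/(-)?(?:"([^"]*)"|(\S+))/).map do |minus, quoted, bare|
    [minus ? :exclude : :include, quoted || bare]
  end
end

p parse_query('willem -"van bergen"')
```

Every :include term would become a LIKE condition and every :exclude term a NOT LIKE condition in the generated SQL.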

Update #2: Wes Hays implemented the OR keyword:
Project.search_for('wes OR hays').count
A big thanks to Wes for helping out on this project!!

19 July Snack 2.0

Earlier this week we discussed a fast food snack that is available in Rotterdam called “Kapsalon”, literally “Hairdresser’s”. According to the urban legend, the name came into existence after employees from a hairdresser’s composed their favorite meal at the shoarma place next door. The “calorie bomb” contains french fries, shoarma, cheese and lettuce, all thrown together. Unboxing pictures of it can be seen here.

Within a couple of months, it became rather popular in Rotterdam and most shoarma places include the dish on their menu, next to döner kebab and Turkish pizza. At Floorplanner we hope it will spread and become a national phenomenon. Not because the dish is so tasty or healthy, because it is not. It is, however, a very buzzword compliant meal:

  • It could be described as a mashup, as it just consists of some existing dishes thrown together in a unique manner.
  • It is user-generated content, as the customers of the shoarma place invented the dish instead of the shoarma shop itself.
  • Its popularity is because of a grassroots campaign, instead of a major marketing undertaking by one of the big Dutch snack producers like Mora or Beckers.

It’s almost a shame that the birth of this phenomenon is taking place in the city of Feyenoord instead of the city of AJAX. ;-)

14 July Easy OLAP queries in ActiveRecord

Because I love statistics so much, I decided to add some neat statistics functionality to the Floorplanner administration interface, so we can get better insight into what is going on. Instead of writing complete OLAP SQL queries myself and adding a custom interface for each one of them so our management can use them (yes Jeroen, that means you!), I built an ActiveRecord extension to ease the work. Right now, I only have to define some categories, and it automagically generates the right SQL query to generate charts and tables with the number of records that fall in each category. Moreover, by clicking on these numbers, I can drill down to the individual records.

I can define the categories like this:

olap_definition = { :categories => {
  :project_is_private   => { :public => false, :published_at => nil },
  :project_is_public    => { :public => true,  :published_at => nil },
  :project_is_published => 'projects.published_at IS NOT NULL'
}}

Not too hard, was it? Now, I can easily feed this to Project.olap_query:

@query_result = Project.olap_query(olap_definition) 
# @query_result == {
#   :project_is_private   => 123,
#   :project_is_public    => 456,
#   :project_is_published => 3,
#   :other                => 2
# }

Note that the category other is added automatically, but can be omitted if you wish. (I found that the other category is a nice way to spot data integrity problems in your dataset that you didn’t think of beforehand.) The result can be used to create a table or to plot a pie chart with the Google Charts API. Because this setup is completely generic, this functionality only has to be written once. DRY!

The SQL for the other category is “calculated” by OR-ing all the categories and checking whether the result is false or NULL. The check for NULL is necessary if you have NULL values in your table: it is a weird characteristic of SQL that TRUE AND NULL equals NULL (see Wikipedia).
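
The effect of NULL is easy to try out in plain Ruby, with nil standing in for NULL. The helper names below (sql_not, sql_or, sql_and, other?) are made up for this sketch; they only mimic SQL's three-valued logic and are not part of the extension.

```ruby
# SQL-style three-valued logic, with nil standing in for NULL.
def sql_not(a)
  a.nil? ? nil : !a
end

def sql_or(a, b)
  return true if a == true || b == true
  return nil if a.nil? || b.nil?
  false
end

def sql_and(a, b)
  return false if a == false || b == false
  return nil if a.nil? || b.nil?
  true
end

# A record belongs to "other" when the OR of all categories is FALSE or NULL.
def other?(category_results)
  matched = category_results.reduce(false) { |acc, result| sql_or(acc, result) }
  sql_not(matched) == true || matched.nil?
end

p sql_and(true, nil)   # TRUE AND NULL
p other?([nil, false]) # a record for which one category evaluates to NULL
```

Without the extra NULL check, a record for which the OR of all categories evaluates to NULL would be counted in no category at all, not even in other.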

The actual SQL query for this example would be:

SELECT 
  SUM(projects.public = 0 AND projects.published_at IS NULL) AS project_is_private,
  SUM(projects.public = 1 AND projects.published_at IS NULL) AS project_is_public,
  SUM(projects.published_at IS NOT NULL) AS project_is_published,
  SUM( NOT (
    (projects.public = 0 AND projects.published_at IS NULL) OR
    (projects.public = 1 AND projects.published_at IS NULL) OR
    (projects.published_at IS NOT NULL)
  ) OR (
    (projects.public = 0 AND projects.published_at IS NULL) OR
    (projects.public = 1 AND projects.published_at IS NULL) OR
    (projects.published_at IS NOT NULL)
  ) IS NULL) AS other
FROM projects

Some notes about this query:

  • It is built completely from the fragments of the categories. The fragment for the other category is a little verbose, but what do I care? It works and is generated automatically! :-)
  • Note that a record can be in multiple categories, depending on the category definitions. The other category only contains records that conform to none of the provided categories.
  • SUM is used instead of COUNT. This way, I can query all the categories at once and it solves the problems with NULL values, while keeping my WHERE and GROUP BY clause nice and clean :-)
  • The query is built completely using ActiveRecord’s find method by using anonymous scopes. Therefore, Rails 2.1 is required, but this makes some neat tricks possible as well.
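
As an illustration of the first note, a query like this can be put together from the category fragments with a few lines of plain Ruby. This is a sketch under my own assumptions, not the extension's actual code: the method name olap_sql is made up, and the categories are passed in as ready-made SQL fragments.

```ruby
# Builds a SELECT with one SUM per category plus the automatic "other" column.
# categories is an array of [name, sql_fragment] pairs.
def olap_sql(table, categories)
  fragments = categories.map { |_name, sql| "(#{sql})" }
  any = "(#{fragments.join(' OR ')})"

  columns = categories.map { |name, sql| "SUM(#{sql}) AS #{name}" }
  # "other": the OR of all categories is either FALSE or NULL
  columns << "SUM((NOT #{any}) OR (#{any} IS NULL)) AS other"

  "SELECT #{columns.join(', ')} FROM #{table}"
end

puts olap_sql('projects', [
  [:project_is_private,   'projects.public = 0 AND projects.published_at IS NULL'],
  [:project_is_public,    'projects.public = 1 AND projects.published_at IS NULL'],
  [:project_is_published, 'projects.published_at IS NOT NULL']
])
```

The extra parentheses around the NOT and IS NULL parts are there so the result does not depend on the operator precedence of the database.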

I also have a Project.olap_drilldown method that I can use to find the individual projects in a category:

@projects = Project.olap_drilldown(olap_definition, :project_is_public)
# SELECT projects.* FROM projects 
# WHERE (projects.public = 1 AND projects.published_at IS NULL)
 
@projects.each do |project|
  puts project.name
end

Because this functionality is built on anonymous scopes, it offers some interesting additional functionality. You can use your own scopes to limit the input dataset:

class Project < ActiveRecord::Base
  named_scope :recent, lambda { { :conditions => 
              ['created_at > ?', Time.now - 7.days]} }
  ...
end
# This will add a WHERE-clause to the OLAP query
results = Project.recent.olap_query(olap_definition)
 
# Or, use :conditions for the same effect
results = Project.olap_query(olap_definition.merge(
            :conditions => ['created_at > ?', Time.now - 7.days]))

As I noted before, the GROUP BY clause is not used by these queries. I already built an extension that uses the GROUP BY clause to group the results in periods of a given timestamp field of the model (e.g. created_at). When I pass the result of such a query to the Google Chart API, I can generate trend graphs to see how my dataset is evolving.
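
The grouping itself is done in SQL, but the idea can be sketched in plain Ruby by bucketing timestamps in memory. Everything here is an assumption made for the illustration: the method name trend_counts, numbering the periods from 0 (oldest) upwards, and letting the last period end at the given now.

```ruby
# Counts timestamps per period of period_length seconds.
# Periods are numbered from 0 (oldest) and the last one ends just before `now`;
# timestamps outside the covered range are ignored.
def trend_counts(timestamps, period_length, period_count, now)
  start = now - period_length * period_count
  counts = Array.new(period_count, 0)

  timestamps.each do |timestamp|
    index = ((timestamp - start) / period_length).floor
    counts[index] += 1 if index >= 0 && index < period_count
  end
  counts
end

day = 24 * 60 * 60
now = Time.utc(2008, 7, 14, 12, 0, 0)
created_at = [now - 3600, now - day - 3600, now - 2 * day - 3600, now - 10 * day]
p trend_counts(created_at, day, 3, now)
```

An array like this, one count per period, is the kind of series a trend graph needs.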

If I have time and there is any interest, I may release this extension as a gem or Rails plugin.

UPDATE: I rewrote it and released this project on github.
