<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Floorplanner Tech Blog &#187; sql</title>
	<atom:link href="http://techblog.floorplanner.com/tag/sql/feed/" rel="self" type="application/rss+xml" />
	<link>http://techblog.floorplanner.com</link>
	<description>Our latest geek adventures!</description>
	<lastBuildDate>Tue, 16 Mar 2010 18:45:44 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Case-insensitive validates_uniqueness_of slowness</title>
		<link>http://techblog.floorplanner.com/2009/11/17/case-insensitive-validates_uniqueness_of-slowness/</link>
		<comments>http://techblog.floorplanner.com/2009/11/17/case-insensitive-validates_uniqueness_of-slowness/#comments</comments>
		<pubDate>Tue, 17 Nov 2009 14:01:02 +0000</pubDate>
		<dc:creator>Willem van Bergen</dc:creator>
				<category><![CDATA[Databases]]></category>
		<category><![CDATA[Ruby on Rails]]></category>
		<category><![CDATA[ActiveRecord]]></category>
		<category><![CDATA[case insensitive]]></category>
		<category><![CDATA[index]]></category>
		<category><![CDATA[rails]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[sql]]></category>
		<category><![CDATA[validates_uniqueness_of]]></category>

		<guid isPermaLink="false">http://techblog.floorplanner.com/?p=787</guid>
		<description><![CDATA[Watch out when using validates_uniqueness_of :field, :case_sensitive => false. Rails transforms this into a query that cannot use an index, which slows validation down considerably as the underlying table grows.
For example, we use validates_uniqueness_of to check for duplicate e-mail addresses. Because email addresses are case-insensitive, adding :case_sensitive => false seems like [...]]]></description>
			<content:encoded><![CDATA[<p>Watch out when using <code>validates_uniqueness_of :field, :case_sensitive => false</code>. Rails transforms this into a query that cannot use an index, which slows validation down considerably as the underlying table grows.</p>
<p>For example, we use <code>validates_uniqueness_of</code> to check for duplicate e-mail addresses. Because e-mail addresses are treated as case-insensitive in practice, adding <code>:case_sensitive => false</code> seems like a natural choice. However, this results in the following queries:</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family: Monaco,monospace;"><span style="color: #808080; font-style: italic;"># For a new User instance:</span>
<span style="color: #993333; font-weight: bold;">SELECT</span> id <span style="color: #993333; font-weight: bold;">FROM</span> users 
 <span style="color: #993333; font-weight: bold;">WHERE</span> LOWER<span style="color: #66cc66;">&#40;</span>users<span style="color: #66cc66;">.</span>email<span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">=</span> <span style="color: #993333; font-weight: bold;">BINARY</span> <span style="color: #ff0000;">'user@example.com'</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># For an existing User instance:</span>
<span style="color: #993333; font-weight: bold;">SELECT</span> id <span style="color: #993333; font-weight: bold;">FROM</span> users 
 <span style="color: #993333; font-weight: bold;">WHERE</span> LOWER<span style="color: #66cc66;">&#40;</span>users<span style="color: #66cc66;">.</span>email<span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">=</span> <span style="color: #993333; font-weight: bold;">BINARY</span> <span style="color: #ff0000;">'user@example.com'</span> 
   <span style="color: #993333; font-weight: bold;">AND</span> users<span style="color: #66cc66;">.</span>id <span style="color: #66cc66;">&lt;&gt;</span> <span style="color: #cc66cc;">42</span></pre></div></div>

<p>Because the comparison is made against <code>LOWER(users.email)</code> rather than the column itself, this query cannot use a (unique) index on the <code>email</code> field and has to scan the full table. As our users table grew larger, these queries started to show up in our slow query log.</p>
<p>However, MySQL uses case-insensitive comparison by default. (To be exact, case sensitivity depends on the collation in use, which can vary. Rails generates this odd-looking query to make sure the check works regardless of the collation.) The conversion to lowercase is therefore not necessary for a uniqueness check, as long as the field has a case-insensitive collation like <code>utf8_general_ci</code>. I decided to write my own validation method that issues a query that can use the index.</p>

<div class="wp_syntax"><div class="code"><pre class="ruby" style="font-family: Monaco,monospace;">  <span style="color:#008000; font-style:italic;"># Alternative for validates_uniqueness_of :email, :case_sensitive =&gt; false</span>
  validate <span style="color:#9966CC; font-weight:bold;">do</span> <span style="color:#006600; font-weight:bold;">|</span>user<span style="color:#006600; font-weight:bold;">|</span>
    conditions = <span style="color:#996600;">&quot;users.email = :email&quot;</span>
    conditions <span style="color:#006600; font-weight:bold;">&lt;&lt;</span> <span style="color:#996600;">&quot; AND users.id != :id&quot;</span> <span style="color:#9966CC; font-weight:bold;">unless</span> user.<span style="color:#9900CC;">new_record</span>?
    conditions = <span style="color:#006600; font-weight:bold;">&#91;</span>conditions, <span style="color:#006600; font-weight:bold;">&#123;</span> <span style="color:#ff3333; font-weight:bold;">:email</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> user.<span style="color:#9900CC;">email</span>, <span style="color:#ff3333; font-weight:bold;">:id</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> user.<span style="color:#9900CC;">id</span> <span style="color:#006600; font-weight:bold;">&#125;</span><span style="color:#006600; font-weight:bold;">&#93;</span>
    <span style="color:#9966CC; font-weight:bold;">if</span> User.<span style="color:#9900CC;">find</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#ff3333; font-weight:bold;">:first</span>, <span style="color:#ff3333; font-weight:bold;">:select</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> <span style="color:#ff3333; font-weight:bold;">:id</span>, <span style="color:#ff3333; font-weight:bold;">:conditions</span> <span style="color:#006600; font-weight:bold;">=&gt;</span> conditions<span style="color:#006600; font-weight:bold;">&#41;</span>
      user.<span style="color:#9900CC;">errors</span>.<span style="color:#9900CC;">add</span><span style="color:#006600; font-weight:bold;">&#40;</span><span style="color:#ff3333; font-weight:bold;">:email</span>, <span style="color:#996600;">'Already in use'</span><span style="color:#006600; font-weight:bold;">&#41;</span>
    <span style="color:#9966CC; font-weight:bold;">end</span>
  <span style="color:#9966CC; font-weight:bold;">end</span></pre></div></div>
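<p>To make the difference concrete, here is a toy model in plain Ruby. It is only a sketch under stated assumptions, not how MySQL works internally: a sorted array of precomputed keys stands in for the index, and the names <code>ROWS</code>, <code>INDEX</code>, <code>indexed_lookup</code> and <code>scanning_lookup</code> are made up for this illustration.</p>

```ruby
# Toy model only: a sorted array stands in for an index on users.email
# under a case-insensitive collation. All names here are hypothetical.

# 10,000 fake addresses in mixed case.
ROWS = (1..10_000).map do |i|
  i.even? ? "User#{i}@Example.com" : "user#{i}@example.com"
end.freeze

# The "index": [case-folded key, original value] pairs, sorted by key.
INDEX = ROWS.map { |email| [email.downcase, email] }.sort_by(&:first).freeze

# Index-style lookup: compare against the stored key, so a binary search
# can discard half of the remaining entries at every step.
def indexed_lookup(index, email)
  key = email.downcase
  steps = 0
  pair = index.bsearch do |entry|
    steps += 1
    key <=> entry.first
  end
  [pair&.last, steps]
end

# LOWER(column) = value: the function must be evaluated for each row,
# so every row is visited until a match turns up.
def scanning_lookup(rows, email)
  steps = 0
  hit = rows.find do |row|
    steps += 1
    row.downcase == email
  end
  [hit, steps]
end

_, indexed_steps = indexed_lookup(INDEX, "user9999@example.com")
_, scan_steps = scanning_lookup(ROWS, "user9999@example.com")
puts "index-style lookup: #{indexed_steps} comparisons"
puts "full scan:          #{scan_steps} comparisons"
```

<p>On this toy data the binary search needs at most 14 comparisons, while the scan has to look at 9,999 rows before it finds the match.</p>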

<p>There is <a href="https://rails.lighthouseapp.com/projects/8994/tickets/2503-validates_uniqueness_of-is-horribly-inefficient-in-mysql">a ticket for this issue in Rails&#8217;s Lighthouse</a>, but as of yet this issue is unresolved. For now, this solution works to keep our slow query log nice and quiet!</p>
]]></content:encoded>
			<wfw:commentRss>http://techblog.floorplanner.com/2009/11/17/case-insensitive-validates_uniqueness_of-slowness/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>How to remove hidden tab characters</title>
		<link>http://techblog.floorplanner.com/2008/12/17/how-to-remove-hidden-tab-characters/</link>
		<comments>http://techblog.floorplanner.com/2008/12/17/how-to-remove-hidden-tab-characters/#comments</comments>
		<pubDate>Wed, 17 Dec 2008 13:31:37 +0000</pubDate>
		<dc:creator>Gert-Jan van der Wel</dc:creator>
				<category><![CDATA[Databases]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[sql]]></category>
		<category><![CDATA[tab character]]></category>

		<guid isPermaLink="false">http://techblog.floorplanner.com/?p=367</guid>
		<description><![CDATA[At this moment, all the language translations of the Floorplanner 2D app are stored in a database table. Today we discovered that a couple of these translations didn&#8217;t align properly in the interface. After some investigation we discovered that they all contained a hidden tab character at the end of each string. This was probably [...]]]></description>
			<content:encoded><![CDATA[<p>At this moment, all the language translations of the Floorplanner 2D app are stored in a database table. Today we discovered that a couple of these translations didn&#8217;t align properly in the interface. After some investigation we discovered that they all contained a hidden tab character at the end of each string. This was probably caused by importing a malformed CSV file.</p>
<p>I thought a simple <strong>REPLACE</strong> query would fix this problem, but (as usual) it was a little more complicated than that. First I had to find the fields with the tab character&#8230; <a href="http://twitter.com/wvanbergen/status/1060635881" target="_blank">Willem</a> pointed me in the right direction with his weapon of choice, <strong>REGEXP</strong>. According to the <a href="http://dev.mysql.com/doc/refman/5.1/en/regexp.html" target="_blank">MySQL docs</a> I could find tab characters with something like this:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family: Monaco,monospace;">SELECT <span style="color: #339933;">*</span> FROM table WHERE field REGEXP <span style="color: #0000ff;">'[[.tab.]]'</span></pre></div></div>

<p>The next step was to remove the tab characters. My first thought was to do this by replacing them with an empty string. It turns out that MySQL&#8217;s <strong>REPLACE</strong> only accepts literal strings, so you can&#8217;t combine it with a <strong>REGEXP</strong> pattern in a query. So I used good ol&#8217; PHP for the job. A nice advantage was that I didn&#8217;t have to do any replacing: I could just use the <em>trim()</em> function.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family: Monaco,monospace;"><span style="color: #000088;">$res</span> <span style="color: #339933;">=</span> <span style="color: #990000;">mysql_query</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;SELECT id, field FROM table WHERE field REGEXP '[[.tab.]]'&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #b1b100;">if</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$res</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
	<span style="color: #b1b100;">while</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$row</span> <span style="color: #339933;">=</span> <span style="color: #990000;">mysql_fetch_assoc</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$res</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #666666; font-style: italic;">// Cast the id to an integer so it is safe to interpolate into the query</span>
		<span style="color: #000088;">$id</span> <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span>int<span style="color: #009900;">&#41;</span> <span style="color: #000088;">$row</span><span style="color: #009900;">&#91;</span><span style="color: #0000ff;">'id'</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
		<span style="color: #666666; font-style: italic;">// trim() strips the trailing tab; escape the result so quotes in a translation cannot break the UPDATE</span>
		<span style="color: #000088;">$field</span> <span style="color: #339933;">=</span> <span style="color: #990000;">mysql_real_escape_string</span><span style="color: #009900;">&#40;</span><span style="color: #990000;">trim</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$row</span><span style="color: #009900;">&#91;</span><span style="color: #0000ff;">'field'</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #990000;">mysql_query</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;UPDATE table SET field = '<span style="color: #006699; font-weight: bold;">$field</span>' WHERE id = <span style="color: #006699; font-weight: bold;">$id</span>&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>
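
<p>One thing to keep in mind: <em>trim()</em> removes all leading and trailing whitespace, not just the tab. If that matters for your data, you can strip only the trailing tabs instead. Here is a minimal sketch of the difference, written in Ruby rather than PHP; the helper name <code>strip_trailing_tabs</code> and the sample value are made up for this illustration.</p>

```ruby
# Hypothetical helper: remove only tab characters at the very end of a string.
def strip_trailing_tabs(value)
  value.sub(/\t+\z/, "")
end

# Sample value with a deliberate leading space and an unwanted trailing tab.
dirty = " Kitchen\t"

p dirty.strip                 # Ruby's counterpart to trim(): all surrounding whitespace goes
p strip_trailing_tabs(dirty)  # only the trailing tab goes; the leading space stays
```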

<p>Rather simple, when you know what to do&#8230; Another bug bites the dust!</p>
]]></content:encoded>
			<wfw:commentRss>http://techblog.floorplanner.com/2008/12/17/how-to-remove-hidden-tab-characters/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
