WordPress 2.5 + Apache disk_cache = A complete mess
June 24, 2008
Avoid using WordPress and the Apache disk_cache extension together … unless you understand what is going on and you know how to fix the problem.
Using both together led one of the sites I have worked on (http://www.gdium.com) to display RSS feeds instead of posts and pages. After a forced refresh of the page (CTRL+R in Firefox) the correct page was displayed. This had been driving me crazy for several days until I remembered I had enabled Apache disk caching for another website on the same server.
After disabling disk_cache I did a search and found someone who was having the same problem, which confirmed what I thought.
Handling Drupal form submissions is dead simple. You basically write three functions: a form generator, a form validator and a form submitter.
The form generator is called every time the form needs to be built, and it can be used to populate the fields with default values or with the values previously entered by the user.
In the form generator you can specify the functions that should be called to validate the form and to submit it (to actually do something with the form values, like adding a new row to a table for instance):
$form['#validate'][] = 'car_edit_validate';
$form['#submit'][] = 'car_edit_submit';
Where car_edit_validate is the name of the function that will be called to do the validation and car_edit_submit is the function that will be called after the form validates.
Every time you submit the form the validate function is called first and if the validation does not encounter any errors the submit function is called.
The problem I had is that I wanted to return an error and keep the form values if an error occurred in my submit function. By default, after the submit function is called an empty form will be rendered and user entered values will be lost even if you add an error to the form with form_set_error. The error is displayed but this does not prevent the form values from being lost. If you want to keep the form values you must also add this line:
$form_state['redirect'] = FALSE;
Example:
function car_edit_submit($form, &$form_state) {
  global $user;
  $car = (object) $form_state['values'];
  $car->uid = $user->uid;
  if (car_save($car)) {
    drupal_set_message(t('car successfully saved'));
    drupal_goto('car/' . $car->cid);
  }
  else {
    form_set_error('form', t('The car could not be saved, please try again'));
    // Prevent Drupal from clearing the form values.
    $form_state['redirect'] = FALSE;
  }
}
YAML vs Marshal performance
January 29, 2008
A colleague of mine has built a quite sophisticated mechanism that allows components to reload automatically when a user interaction requires it. In order to do that, though, it needs to store a significant amount of information for each reloadable component in a page. This information contains the different parameters needed to reload each component. This context information is associated with a particular browser window and is stored in a hash. This hash is persisted using the serialize method provided by Rails. This method uses YAML for serialization.
We are currently working to improve the performance, and thanks to the excellent ruby-prof profiler I detected that an important amount of time was spent serializing the hash before persisting it. I decided to look for alternatives and the first one I came across was Marshal.dump.
I wrote a simple test case:
#!/usr/bin/ruby
require 'yaml'
hash = {:key1 => 'value1', :key2 => 'value2', :key3 => 'value3', :key4 => {:key41 => 'value41', :key42 => 'value42'}}
iterations = 10000
serialized_hash = nil
start = Time.now
1.upto(iterations) { serialized_hash = Marshal.dump(hash) }
puts "Marshal hash: #{Time.now - start} seconds"
start = Time.now
1.upto(iterations) { reloaded_hash = Marshal.load(serialized_hash) }
puts "Reload marshalled hash: #{Time.now - start} seconds"
start = Time.now
1.upto(iterations) { serialized_hash = hash.to_yaml }
puts "YAMLize hash: #{Time.now - start} seconds"
start = Time.now
1.upto(iterations) { reloaded_hash = YAML::load(serialized_hash) }
puts "Reload YAMLlized hash: #{Time.now - start} seconds"
The results show that YAML is awfully slow. I will not put here the complete report, but here are the timings:
Marshal hash: 0.13829 seconds
Reload marshalled hash: 0.184913 seconds
YAMLize hash: 4.792248 seconds
Reload YAMLlized hash: 1.046568 seconds
In my tests, YAML is 34.65 times slower in serialization and 5.66 times slower in unserialization.
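For reference, the two ratios can be recomputed from the timings printed above. This is only the arithmetic, with the four numbers copied by hand into a hash whose key names are made up for this sketch:

```ruby
# Timings in seconds, copied from the output above.
timings = {
  :marshal_dump => 0.13829,
  :marshal_load => 0.184913,
  :yaml_dump    => 4.792248,
  :yaml_load    => 1.046568
}

# How many times slower YAML is than Marshal in each direction.
dump_ratio = (timings[:yaml_dump] / timings[:marshal_dump]).round(2)
load_ratio = (timings[:yaml_load] / timings[:marshal_load]).round(2)

puts "Serialization:   #{dump_ratio}x"
puts "Unserialization: #{load_ratio}x"
```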
So be careful when serializing big objects with YAML, as the performance impact can be significant.
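If you want to repeat the comparison on your own data, here is a slightly more reusable sketch of the same test. It is not the script I ran: the helper names (time_it, compare_serializers) and the sample hash are made up for illustration, it uses the monotonic clock rather than Time.now, and it uses string keys because recent YAML libraries may refuse to load symbols with a plain YAML.load.

```ruby
require 'yaml'

# Run the block `iterations` times and return the elapsed seconds,
# measured with the monotonic clock.
def time_it(iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Hypothetical sample data (string keys on purpose, see above).
SAMPLE = {
  'key1' => 'value1',
  'key2' => 'value2',
  'key4' => { 'key41' => 'value41', 'key42' => 'value42' }
}

# Time dumping and loading with both serializers.
def compare_serializers(data, iterations)
  marshalled = Marshal.dump(data)
  yamlized   = data.to_yaml
  {
    :marshal_dump => time_it(iterations) { Marshal.dump(data) },
    :marshal_load => time_it(iterations) { Marshal.load(marshalled) },
    :yaml_dump    => time_it(iterations) { data.to_yaml },
    :yaml_load    => time_it(iterations) { YAML.load(yamlized) }
  }
end

timings = compare_serializers(SAMPLE, 1000)
timings.each { |name, seconds| puts format('%-14s %.4f s', name, seconds) }
```

The absolute numbers will depend on your Ruby version and your data, so only the ratio between the two serializers is worth comparing.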
Ever wondered how to easily add image attachment support to your Rails application? Then you should definitely give attachment_fu a go, a very easy to use Rails plugin by Rick Olson.
(Note: This article would not have been possible without Mike Clark’s excellent attachment_fu tutorial.)
Step 1: Installation (on Ubuntu 6.10)
Installing the plugin is as easy as it gets:
script/plugin install http://svn.techno-weenie.net/projects/plugins/attachment_fu/
In order to do some image processing you need to install one of the following packages as well:
- ImageScience
- RMagick
- minimagick
ImageScience is the simplest of the three, as it only allows you to resize images. It depends on FreeImage and RubyInline.
This is the one I ended up using, as it is enough for me.
It is not available in the Ubuntu repositories, so I had to install it manually following the instructions on its website:
sudo gem install -y image_science
which also installs RubyInline, hoe and rubyforge gems.
Installing FreeImage required me to install cvs (to check out the sources) and g++ first:
sudo apt-get install cvs g++
cvs -z3 -d:pserver:anonymous@freeimage.cvs.sourceforge.net:/cvsroot/freeimage login (just type enter when asked for a password)
cvs -z3 -d:pserver:anonymous@freeimage.cvs.sourceforge.net:/cvsroot/freeimage co -P FreeImage
cd FreeImage
make
sudo make install
Step 2: Preparing your Rails application
In my application I have a Work model to which I want to associate images. Images are submitted by users and are associated to one single Work, a has_many / belongs_to association between a Work and the associated images. My application has also users and I want to know who added a particular image (to prevent abuse).
In order to make use of the functionality provided by attachment_fu you need to create an ActiveRecord model with at least the following attributes:
- content_type: what sort of content you are storing. This is used by web browsers to know how to present this information to users (open an external application, show embedded using a plugin, etc).
- filename: a pointer to the image location
- size: the size in bytes of the attachment
When you store images, attachment_fu makes use of some other useful fields:
- parent_id: if you store thumbnails to associate them to the parent image (this could actually be used for other type of content as well)
- thumbnail: as you can have more than one thumbnail, this field contains the identifier assigned to each type of thumbnail.
- width: the width of the image.
- height: the height of the image.
In my case I have also added the following attributes:
- work_id: the work that the image is associated to.
- user_id: the user that added the image
- default: whether this is the default image to be used when displaying the work
- created_at: when the image was added
Let’s create the model:
script/generate model WorkImage
My migration file looks like this:
class CreateWorkImages < ActiveRecord::Migration
  def self.up
    create_table :work_images, :options => 'ENGINE=InnoDB DEFAULT CHARSET=utf8' do |t|
      t.column :work_id, :integer, :null => false
      t.column :user_id, :integer, :null => false
      t.column :default, :boolean, :null => false, :default => false
      t.column :created_at, :datetime, :null => false
      t.column :parent_id, :integer, :null => true
      t.column :content_type, :string, :null => false
      t.column :filename, :string, :null => false
      t.column :thumbnail, :string, :null => true
      t.column :size, :integer, :null => false
      t.column :width, :integer, :null => true
      t.column :height, :integer, :null => true
    end
    execute "alter table work_images add constraint fk_wi_works foreign key (work_id) references works(id)"
    execute "alter table work_images add constraint fk_wi_user foreign key (user_id) references users(id)"
  end

  def self.down
    drop_table :work_images
  end
end
Let’s edit the WorkImage model to make use of the attachment_fu plugin:
class WorkImage < ActiveRecord::Base
  has_attachment :content_type => :image,
                 :storage => :file_system,
                 :max_size => 100.kilobytes,
                 :resize_to => '200x200>',
                 :thumbnails => { :thumb => '50x50>' },
                 :processor => 'ImageScience'
  validates_as_attachment

  belongs_to :work
  belongs_to :user

  #The block will be executed just before the thumbnail is saved.
  #We need to set extra values in the thumbnail class as
  #we want it to have the same extra attribute values as the original image
  #except for the default flag that is always set to false
  before_thumbnail_saved do |record, thumbnail|
    thumbnail.user_id = record.user_id
    thumbnail.work_id = record.work_id
    thumbnail.default = false
  end
end
I wanted to be able to attach an image by providing its URL, rather than asking the user to download the image and upload it to the system. This can also be used when querying e-commerce APIs (like the Amazon one) to retrieve and store the images they return. So I enriched my WorkImage model with an extra method (which I guess would be a good feature to add to the attachment_fu plugin):
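#Needed for Net::HTTP and URI (put these at the top of the model file)
require 'net/http'
require 'uri'

def source_url=(url)
  return nil if not url
  http_getter = Net::HTTP
  uri = URI.parse(url)
  #Fetch the image (note that only the path is requested, any query string is dropped)
  response = http_getter.start(uri.host, uri.port) {|http|
    http.get(uri.path)
  }
  case response
  when Net::HTTPSuccess
    file_data = response.body
    return nil if file_data.nil? || file_data.size == 0
    self.content_type = response.content_type
    self.temp_data = file_data
    #Use the last segment of the path as the filename
    self.filename = uri.path.split('/')[-1]
  else
    return nil
  end
end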
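The filename step of that method can be tried on its own, without any network access. The sketch below is a standalone variation, not part of attachment_fu: filename_from_url is a name made up for illustration, it uses File.basename instead of split, and the example.com URL is a placeholder.

```ruby
require 'uri'

# Return the last path segment of a URL, or nil when the URL has no
# usable path (for example http://example.com/).
def filename_from_url(url)
  path = URI.parse(url).path.to_s
  name = File.basename(path)
  (name.empty? || name == '/') ? nil : name
end

puts filename_from_url('http://example.com/images/cover.jpg?size=large')
puts filename_from_url('http://example.com/').inspect
```

The query string is ignored here because URI keeps it separate from the path.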
I also enrich my Work model to easily retrieve associated images. You can easily add new relationships for easy access to thumbnails.
class Work < ActiveRecord::Base
  ...
  #The condition avoids retrieving thumbnails
  has_many :images, :class_name => 'WorkImage', :conditions => ["work_images.parent_id is null"]
  #Easily retrieve the default image
  has_one :default_image, :class_name => 'WorkImage', :conditions => ["work_images.default"]
  ...
end
Step 3: Make use of the new model in the controller and view
In my controller, when I want to add an image to a model I do something like the following:
def add_image
  ...
  #Store the image if any
  if params[:image_source_url]
    image = WorkImage.new(:source_url => params[:image_source_url])
    image.work_id = @work.id
    image.user_id = self.current_user.id
    image.default = true if params[:is_default_image]
    image.save!
  end
  ...
end
Images will be saved in public/work_images using a scheme that Jamis Buck of 37signals called id partitioning.
That way you can theoretically store 9999 * 10000 attachments (thumbnails are not counted as attachments), which for standard purposes is enough. Anyway, this can easily be changed to support more files if you need it. Look for a method named partitioned_path in vendor/plugins/attachment_fu/lib/technoweenie/attachment_fu/backends/file_system_backend.rb.
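To give an idea of what id partitioning means, here is a rough sketch: the record id is zero-padded to eight digits and split into two four-digit directory names. This is my own illustration of the idea, not a copy of the plugin code, so check the partitioned_path method mentioned above for the real thing.

```ruby
# Split a zero-padded id into two 4-digit directory names and append
# any extra path parts (such as the filename).
def partitioned_path(id, *parts)
  ('%08d' % id).scan(/..../) + parts
end

# 'public' and 'work_images' match the storage location described above.
puts File.join('public', 'work_images', *partitioned_path(1234, 'cover.jpg'))
```

With this layout no directory ends up holding more than 10000 entries, which keeps file system lookups fast.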
In order to display the default image in a view I just need to do the following:
<%= image_tag(@work.default_image.public_filename()) %>
If what you want to display is the thumbnail, just pass the thumbnail identifier (in our case :thumb) to public_filename:
<%= image_tag(@work.default_image.public_filename(:thumb)) %>
And that should be it really. If you have questions, leave a comment.
Note:
I found a small bug in the plugin. It was not storing resized image sizes properly. I had to edit the vendor/plugins/attachment_fu/lib/technoweenie/attachment_fu/processors/image_science_processor.rb file and set the correct size just after the image is saved in the resize_image method:
...
img.save self.temp_path
self.size = File.size(self.temp_path)
...
I also noticed that images that do not need to be resized are still processed in some way: the file size changes although the dimensions remain the same. I have a file of 5 KB that has a size of 12 KB after the resizing process!!! The dimensions of the image are the same, so it should not have been modified. Not sure what is going on here, but I guess this is an ImageScience issue.
This is my first post sent using Deepest Sender as it will make updating this log much easier.
When a request asks for an action in a controller that does not exist in your application, a "not found" error page is displayed. You can use routes to redirect these requests to a default page.
Just add the following line as the last rule in your config/routes.rb file:
map.connect '*path', :controller => 'main', :action => 'redirect_to_default'
Whenever a request asking for an action in a controller that you have not defined hits your application, rails will call the action ‘redirect_to_default’ in the ‘main’ controller (you can obviously change the controller and the action to fit your needs).
The code for the redirect_to_default action is a simple rails redirect:
def redirect_to_default
  redirect_to :action => 'index'
end
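The catch-all rule has to be the last one because routes are tried in order and the first match wins. The following plain Ruby sketch models that behaviour. It is not the Rails router: the ROUTES table, the resolve helper and the works paths are made up for illustration.

```ruby
# Ordered list of [pattern, target] pairs; the first match wins.
ROUTES = [
  [%r{\A/works/\d+\z}, 'works#show'],
  [%r{\A/works\z},     'works#index'],
  [%r{\A/.*\z},        'main#redirect_to_default'] # catch-all, must stay last
]

# Return the target of the first route whose pattern matches the path.
def resolve(path, routes = ROUTES)
  match = routes.find { |pattern, _target| pattern =~ path }
  match && match[1]
end

puts resolve('/works/42')
puts resolve('/no/such/page')
```

If the catch-all were listed first, every request would end up in redirect_to_default, including the ones for pages that do exist.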
If you want to pass specific options, like the table type and the charset, when creating tables through Rails migrations, pass an options parameter to the create_table method:
create_table :my_table, :options => 'ENGINE=InnoDB DEFAULT CHARSET=utf8', :force => true do |t|
  t.column :column1, :string
  t.column :column2, :string
end
Setting a maintenance page in apache 2
March 15, 2007
In order to set a maintenance page in apache 2 you need to:
- Enable mod_rewrite. On my Debian server I just need to do the following:
ln -s /etc/apache2/mods-available/rewrite.load /etc/apache2/mods-enabled/rewrite.load
- Create the maintenance page somewhere on your server's disk. I created it under /srv/www/maintenance
- Set up apache2 to redirect all requests to your site to the maintenance page (you will need to comment out the current apache2 directives for your website). In my case I have a /etc/apache2/sites-available/mysite file that is linked from /etc/apache2/sites-enabled/mysite.
#Maintenance page
<VirtualHost *:80>
  ServerName mysite.com
  ServerAdmin postmaster@mysite.com
  RewriteEngine on
  RewriteCond %{REQUEST_URI} !/index\.html$
  RewriteCond %{REQUEST_URI} !/logo\.gif$
  RewriteRule ^(.*)$ /index.html [L]
  DocumentRoot "/srv/www/maintenance"
</VirtualHost>
- Reload the apache2 configuration:
/etc/init.d/apache2 reload
In my case I only have an index.html and a logo.gif file in the /srv/www/maintenance folder. If more files are needed to render the maintenance page, you will need to add an extra "RewriteCond %{REQUEST_URI} !/yourfile\.extension$" condition for each of them.
Note that if you do not use the rewrite engine, the maintenance page will show up when your users access http://mysite.com or http://mysite.com/index.html, but if they access http://mysite.com/something_else they will get a nice "Page not found" error.
Restoring a software RAID 1 on Linux
March 12, 2007
When one of the hard disks in a RAID 1 gets out of the RAID because it is no longer in sync with the other disk, you can easily resynchronize it with the following command:
raidhotadd /dev/mdX /dev/sdY
X and Y should be set to the appropriate values.
‘cat /proc/mdstat’ would tell you if your RAID system is healthy.
The configuration of your RAID is set in the file /etc/raidtab which will tell you the disks in the RAID and you can compare the results with the cat command above to see which disk is missing.
I use this script to verify if all disks in my RAID 1 are working fine:
#!/bin/bash
#Check if both drives are up
#The brackets are escaped and quoted so that grep matches the literal string [UU]
if [ "$(grep -c '\[UU\]' /proc/mdstat)" != 2 ] || [ "$(grep -c '2/2' /proc/mdstat)" != 2 ]; then
  cat /proc/mdstat
  exit 1
fi
exit 0
More info here.
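If you would rather know which array is degraded instead of just getting an exit code, the same check can be sketched in Ruby. This is only an illustration: degraded_arrays is a made-up helper, and the sample text is a hand-written imitation of /proc/mdstat output, not output from my server.

```ruby
# Return the names of the md arrays whose status line reports a
# missing member, e.g. [2/1] [U_].
def degraded_arrays(mdstat_text)
  degraded = []
  current = nil
  mdstat_text.each_line do |line|
    if line =~ /\A(md\d+)\s*:/
      current = $1
    elsif current && line =~ /\[(\d+)\/(\d+)\]\s+\[([U_]+)\]/
      expected, active, flags = $1.to_i, $2.to_i, $3
      degraded << current if active < expected || flags.include?('_')
    end
  end
  degraded
end

# Hand-written sample: md0 is healthy, md1 has lost one disk.
SAMPLE_MDSTAT = <<-TEXT
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]
md1 : active raid1 sda2[0]
      2096384 blocks [2/1] [U_]
unused devices: <none>
TEXT

puts degraded_arrays(SAMPLE_MDSTAT).inspect
```

On a real machine you would pass File.read('/proc/mdstat') instead of the sample text.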