Know what ORMs and Query Builders do!
Filed in: ExpressionEngine, CodeIgniter, Django
September 2, 2011
ORMs and Query Builders such as the Django ORM and CodeIgniter Active Record classes are great time savers. But you can get yourself into performance trouble if you're not paying attention to what these are doing behind the scenes. I think the biggest mistake people make is they use tools like this as crutches for not understanding how to write simple SQL statements. Don't fall into this group! If you're in it, work hard to get out!
I'll show a couple of examples from both to illustrate how you can give your applications a performance boost and not beat up on your Database Server.
Let's look at CodeIgniter in the context of ExpressionEngine first.
Wide tables can cripple you
The exp_channel_data table can get extremely 'wide' (lots of columns) depending on the number of custom fields you have. So let's pretend you have a hundred custom fields, and you're needing to get IDs to do something special in your custom plugin or module.
<?php public function get_ids() { $q = $this->EE->get_where('channel_data', array('site_id' => (int) $this->EE->config->item('site_id')) ); $ids = array(); foreach ($q->result() as $row) { $ids[] = $row->entry_id; } return $ids; }
Under the covers, CodeIgniter is converting this to the following query:
SELECT * FROM exp_channel_data WHERE site_id = 1;
You are only using the ID in the result set. So chances are, as your site, or your customers site grows, you're going to start getting fatal errors with memory being exhausted because you are SELECTing EVERYTHING from the largest table in the ExpressionEngine database. Let's refactor to something sane.
<?php public function get_ids() { $q = $this->EE->db->select('entry_id') ->get_where('channel_data', array( 'site_id' => (int) $this->EE->config->item('site_id') ) ); $ids = array(); foreach ($q->result() as $row) { $ids[] = $row->entry_id; } $q->free_result(); // oh, even better! return $ids; }
So the query is changed to only select the entry_id column, and we free the query result after we use it. Your page loads faster, and your sysadmin no longer curses your name. It's a win win for everyone.
Django ORM
This is easy to translate over to the Django ORM. For purposes of this example, we'll pretend we have a Post model for our blog and just want to grab the IDs of the objects in the model.
If you're being lazy, you'd do:
return render_to_response( 'template.html', { post_ids=Post.objects.all() }, context_instance=RequestContext(context) )
So same situation here. We're getting everything from that table just to work with IDs. Not good, Sam. Time to refactor.
return render_to_response( 'template.html', { post_ids=Post.objects.only('id') }, context_instance=RequestContext(context) )
This converts it to a SELECT id FROM // query under the covers. So we're doing less bad things to the database server, and depending on the site of that table, will help speed things up!
So moral of the story, don't use ORM's & query builders as a crutch to not knowing SQL. They are supposed to help make life easier and save time, not be a replacement for knowledge.