Forcing an SQL JOIN in CakePHP

Update (10/7/2009)… this functionality has been rolled into Habtamable behavior

You’ve probably noticed that by default CakePHP will only create a JOIN query if you have a hasOne or belongsTo associations between your models. There are cases, however, when a JOIN is necessary to get just the right data.

So how do you tell cake to create a JOIN?

Well, you have to rely on some crafty methods, but all in all it’s not very hard, once you get the hang of it…

I will summarize here a few tricks I’ve encountered on the google group, so big thanks to the original authors of these ideas.

Let’s take our favorite sample models:

User hasMany Post
and
Post hasAndBelongsToMany Tag

If we were to do: $this->User->find(‘all’);
CakePHP will run a few selects and return you all the User data. The problem, however, is that if a User doesn’t have any posts CakePHP will still return you the User and an empty array for Post (I’m sure you’ve noticed that before)…

So let’s try to force cake to do a JOIN…

First things first, we have to make cake forget about our previous bindings between the models:

$this->User->unbindModel(array('hasMany'=>array('Post')));
$this->User->Post->unbindModel(array('hasAndBelongsToMany'=>array('Tag')));

Now, we need to trick CakePHP into thinking that we’ve got a hasOne relationship between our models, so that it will build a JOIN query…

We can use bindModel() to achieve that:

$this->User->bindModel(array('hasOne'=>array('Post'=>array(),
                    'PostsTag'=>array(
                        'foreignKey'=>false,
                        'conditions'=>array('PostsTag.post_id = Post.id')),
                    'Tag'=>array(
                        'foreignKey'=>false,
                        'conditions'=> array('PostsTag.tag_id = Tag.id')
))));

So what’s going on here?

We are telling cake to bind the User model using a hasOne relationship to the Post model, which forces cake to build the JOIN query like:

LEFT JOIN `posts` AS `Post` ON (`Post`.`user_id` = `User`.`id`)

Next, we need to ensure that we get all related information from our joinTable (i.e. posts_tags), so that we can eventually get the relevant Tags. Just as before we tell cake to bind PostsTag using a hasOne relationship. Note, we need to tell CakePHP how to do the JOIN condition by specifying the ‘conditions’ key. Therefore, cake will do:

LEFT JOIN `posts_tags` AS `PostsTag` ON (`PostsTag`.`post_id` = `Post`.`id`)

Last, but not least, we have to grab all the Tags. Again, we tell cake to join our Tag model using hasOne (of course here we also have to specify the conditions).

The complete query looks like:

SELECT `User`.`id`, `User`.`username`, `User`.`password`, `User`.`name`, `User`.`created`, `Post`.`id`, `Post`.`title`, `Post`.`post`, `Post`.`created`, `Post`.`modified`, `Post`.`user_id`, `PostsTag`.`id`, `PostsTag`.`post_id`, `PostsTag`.`tag_id`, `PostsTag`.`status`, `Tag`.`id`, `Tag`.`tag`, `Tag`.`status` FROM `users` AS `User` LEFT JOIN `posts` AS `Post` ON (`Post`.`user_id` = `User`.`id`) LEFT JOIN `posts_tags` AS `PostsTag` ON (`PostsTag`.`post_id` = `Post`.`id`) LEFT JOIN `tags` AS `Tag` ON (`PostsTag`.`tag_id` = `Tag`.`id`) WHERE 1 = 1

OK, this is much better and pretty much what we need. However, we are still getting records with empty values, yet we need only the records where the User has a Post and a Post has a Tag.
Well, this is actually due to the fact that we are using a LEFT JOIN, what we really need is an INNER JOIN. You can look up the difference, if you are not sure, but basically INNER JOIN ensures that matching records must exist in all tables.

Remember that for a hasOne (or belongsTo) relationship you can specify a ‘type’ key, so go ahead and modify the above bindModel() call to include ‘type’ => ‘INNER’ for both PostsTag and Tag association.
How? Here’s a hint: ‘foreignKey’=>false, ‘type’=>’INNER’, ‘conditions’=>array(‘PostsTag.post_id=Post.id’)).

Now, we’ve got just the records we were looking for.

A quick note on ‘foreignKey’=>false…
It’s necessary to specify that, so that CakePHP does not attempt to automagically establish a relationship between the models, instead it forces cake to use our conditions for the JOIN
(i.e. ‘conditions’=>array(‘PostsTag.post_id = Post.id’)).

P.S. as josoroma pointed out, if you use this method with paginate() be sure to supply a ‘false’ param, to your bindModel() so that the binding persists for all subsequent methods, i.e. paginate() and paginateCount()

%d bloggers like this: