HABTM and JOIN trickery with CakePHP

Update (10/7/2009)… this functionality has been rolled into the Habtamable behavior.

An interesting question came up on IRC today…

If I have Post HABTM Tag, how do I select only Posts that have Tags “new” AND “cakephp”?

A seemingly simple problem actually required a pretty tricky solution. I do have to say that it’s not because of some cake limitation; it’s just that (at least for me) this type of query is not at all straightforward to write in good ol’ SQL (yeah, I don’t really know SQL all that well… good thing we’ve got Google).

Before going further, I’ll assume that you have a solid understanding of the way HABTM works, concepts of join tables, auto “with” models and cake conventions. If not, you should probably refer to my previous posts about HABTM and/or read up in the manual.

Alright, let’s analyze the problem for a second… basically we need to grab all Posts where, in the join table, a single post_id matches two tag_ids (i.e. the tag_ids of “new” and “cakephp”). Note that we cannot match just one or the other; both tag_ids must match the same post_id. Not only that, we don’t know the tag_ids up front, as we only know the search terms (“new” and “cakephp”).

Looks like we’ll need some creative SQL and JOINs to get this working right…

Surely, we could rely on Model::query(), but let’s see if we can get this working with find() instead.

First of all we have to force cake to build a join query.

So let’s do something like this:

$searchTerms = array('cakephp', 'new');

    'conditions' => array('PostsTag.post_id = Post.id')
        'Tag.id = PostsTag.tag_id',

I will briefly explain what’s going on here (if you need more details, see the post linked above on how to force CakePHP to do a JOIN)…

We are telling cake to JOIN our Post model with the PostsTag model (join table: posts_tags) and then JOIN our PostsTag model with our Tag model (tag table: tags).
The JOIN conditions are pretty simple: we ensure that Post.id matches PostsTag.post_id and Tag.id matches PostsTag.tag_id. Of course, we also need to ensure that we only grab the tags where Tag.tag is IN our search terms (see the $searchTerms array).
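
Since the two JOINs are easier to follow in one piece, here is a minimal sketch of what the bindings could look like. It assumes the usual trick of binding a hasOne with 'foreignKey' => false to force a JOIN, assumes the association honours a 'type' key, and assumes the tag name lives in Tag.tag. The helper name buildTagJoinBindings() is made up for this illustration; it only builds the array you would hand to bindModel().

```php
<?php
// Hypothetical helper: builds 'hasOne' bindings that should make cake JOIN
// posts -> posts_tags -> tags, using the conditions described above.
function buildTagJoinBindings($searchTerms) {
    return array('hasOne' => array(
        'PostsTag' => array(
            'foreignKey' => false,   // no automatic key, we supply the ON clause
            'type'       => 'INNER', // assumption: join type is configurable here
            'conditions' => array('PostsTag.post_id = Post.id'),
        ),
        'Tag' => array(
            'foreignKey' => false,
            'type'       => 'INNER',
            'conditions' => array(
                'Tag.id = PostsTag.tag_id',
                'Tag.tag' => $searchTerms, // an array value is expected to become IN (...)
            ),
        ),
    ));
}

$searchTerms = array('cakephp', 'new');
$bindings = buildTagJoinBindings($searchTerms);

// Inside a controller this would then be passed along as:
// $this->Post->bindModel($bindings);
```

Keeping the array in a helper is purely for readability; passing the same array to bindModel() inline should behave the same way.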

Once all of that is accomplished, we build our find() method:

$this->Post->find('all', array(
    'group' => array('Post.id', 'Post.title HAVING COUNT(*) = ' . $numCount)
));


Let’s break it down…

If we were to do a simple find('all'), we’d get all Posts that happen to have Tags matching either one of our search terms. This is not what we need.

By adding the GROUP BY and HAVING COUNT(*) = $numCount, we ensure that we match both of our Tag ids and not just one or the other. In other words, COUNT(*) must equal the number of search terms.
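
To see what the GROUP BY and HAVING part does in plain SQL, here is a rough, self-contained sketch that runs the same kind of query through PDO against an in-memory SQLite database. The rows are made up, the table names just follow cake conventions, and the SQL is only an approximation of the idea, not necessarily what cake generates.

```php
<?php
// Throwaway in-memory database with hypothetical sample data.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$db->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)');
$db->exec('CREATE TABLE tags (id INTEGER PRIMARY KEY, tag TEXT)');
$db->exec('CREATE TABLE posts_tags (post_id INTEGER, tag_id INTEGER)');

$db->exec("INSERT INTO posts VALUES (1, 'Both tags'), (2, 'Only new'), (3, 'Only cakephp')");
$db->exec("INSERT INTO tags VALUES (1, 'new'), (2, 'cakephp'), (3, 'php')");
$db->exec('INSERT INTO posts_tags VALUES (1, 1), (1, 2), (1, 3), (2, 1), (3, 2), (3, 3)');

$searchTerms = array('cakephp', 'new');
$numCount = count($searchTerms);
$placeholders = implode(', ', array_fill(0, $numCount, '?'));

// The two JOINs: posts -> posts_tags -> tags, keeping only the searched tags.
$joins = ' FROM posts AS Post'
    . ' INNER JOIN posts_tags AS PostsTag ON PostsTag.post_id = Post.id'
    . ' INNER JOIN tags AS Tag ON Tag.id = PostsTag.tag_id'
    . ' AND Tag.tag IN (' . $placeholders . ')';

// Without HAVING: any post carrying at least one of the tags comes back.
$either = $db->prepare('SELECT Post.title' . $joins
    . ' GROUP BY Post.id, Post.title ORDER BY Post.id');
$either->execute($searchTerms);
$eitherTitles = $either->fetchAll(PDO::FETCH_COLUMN);

// With HAVING: only posts whose matching-tag count equals the number of terms.
$both = $db->prepare('SELECT Post.title' . $joins
    . ' GROUP BY Post.id, Post.title HAVING COUNT(*) = ' . (int) $numCount
    . ' ORDER BY Post.id');
$both->execute($searchTerms);
$bothTitles = $both->fetchAll(PDO::FETCH_COLUMN);

print_r($eitherTitles);
print_r($bothTitles);
```

With this sample data the first query returns all three titles, while the second returns only 'Both tags'. Note that the count only works out if posts_tags holds no duplicate (post_id, tag_id) rows.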

OK, so what is $numCount?
In our example we know that we searched for two terms ($searchTerms array), therefore we could have done:

$numCount = 2;

However, if our $searchTerms array had an unknown number of items, we’d do something like this prior to our find() call, to determine how many Tag.id’s must be matched in order for our query to be correct:

$numCount = count($searchTerms);
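
One caveat, which is an assumption on my part rather than something from the IRC discussion: if the search terms come from user input they may contain duplicates or empty strings, and then count() would overshoot and the HAVING clause would never match. A small sketch of cleaning the terms up before counting (prepareSearchTerms() is a hypothetical helper):

```php
<?php
// Hypothetical helper: trims each term, drops empty strings and duplicates,
// and reindexes the array so it can be used directly as a condition value.
function prepareSearchTerms($rawTerms) {
    $terms = array_map('trim', $rawTerms);
    $terms = array_filter($terms, 'strlen');
    return array_values(array_unique($terms));
}

$searchTerms = prepareSearchTerms(array('cakephp', ' new', 'cakephp', ''));
$numCount = count($searchTerms); // number of distinct tags each post must match
```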

I can imagine that this is probably a bit over the top, but it certainly taught me a few things and reinforced some others. Hopefully you’ll learn a thing or two from this as well ;)
