December 9, 2016

Motivation

Using hook_views_query_alter to alter a query
When you first hear of Drupal's hook_views_query_alter you think you get it. You understand Drupal's hook system (presumeably, otherwise start here), you've used other hooks before, maybe you even experience a twinge of guilty excitement that it's one of the hooks that hasn't been removed from Drupal 8. You think to yourself, okay, this function will be passed a query argument, and I'll alter it -- and you're right! But it turns out it's a little more complicated than it sounds. First of all, your typical support system is conspicuously absent: most of the comments on api.drupal.org are about just getting the hook to fire; there's very little on stackoverflow about what to actually do once you get that $query object; and what little information there is out there is about adding new conditions/sorting/etc to your query, not changing what's already there. Then, once you give up on finding a meaningful example and dig into that $query object, you're confronted with an intimidating mess of arrays nested inside of objects nested inside of arrays. In this blog post we will be looking at how to use the $query object passed in to hook_views_query_alter to alter existing where conditions in a Views-generated sql query.

Use Case

In a side project I'm working on, I have a View that uses multiple filters, one of which is the Search Keywords filter. This filter works almost exactly as I'd like it to, except that it doesn't allow for partial word searches, e.g. if someone were to search for "burger" I would want results containing the word "cheeseburger" to show up, but they don't. Inspecting the query (clicking the "show the sql query" checkbox in admin/structure/views/settings should let you view the query on the view edit page) we see the following WHERE condition:

WHERE (((node_field_data.status = '1')
AND (node_field_data.type IN ('dish'))
AND ((node_search_index.type = 'node_search')
AND (node_search_dataset.data LIKE '% burger %' ESCAPE '\\')
AND ((node_search_index.word = 'burger')))))

If you're familiar with SQL syntax you might have spotted our issues: 1) There are spaces around the word "burger" in our LIKE statement. A SQL LIKE statement interprets the % symbol as a wildcard representing an arbitrary string of arbitrary length. So this clause will match any text containing the literal string ' burger ' (with spaces) somewhere in the middle. It will match 'good burger ' and ' burger good', but not 'goodburger' or 'cheeseburger'. So we'll want to alter this statement to remove the spaces around our search term.

2) The node_search_index.word clause is using = instead of LIKE. This clause will only accept words that exactly match the search string, and since the search index only contains complete words, this means no partial word search. Solution: swap out the = for a LIKE, and wrap the search string in % wildcard.

Okay so we've got a plan! Let's dig into some code.

DISCLAIMER: A SQL LIKE operation is far more computationally expensive than an = operation; altering this query as described here could result in prohibitive performance regressions, especially when executed on large tables. Since my side project currently has very little data to search through, and premature optimization is the root of all evil, I chose to make this compromise in the name of getting stuff done. If you are working on a medium to large site and/or a View with lots of content, I might recommend against this approach, or at least break out a profiler (see here or here). That said, the approach we're using here can easily be adapted to other alterations, so I'm gonna keep talkin!

Solution

First things first, let's take a look at that $query object in a debugger:

query object

Wow! What a load of garbage. Okay so, we've got our WHERE clause with a group of conditions, and each condition can either be a bottom level condition, or itself another condition group. Now, our first instinct might be to just find the condition that we want to edit, and just get to drupalin' with something like this:

$value = &$query->where[1]['conditions'][2]['field']->conditions[1]['value'];

But let's not do that. First of all, it's fragile. What if I add a filter to the View later on? Will my condition still be in the exact same position in the query tree? Probably not. Secondly, if we went this route, we'd have to specifically target each condition we want to edit, but (spoiler alert) new conditions are added for each extra word in the search phrase.

So, we're faced with a tree structure of arbitrary size, where each item could potentially be a condition or a new condition group, and we want to check each condition to see if it matches the portion of the query we want to alter. How do we do this? Enter recursion! If you're not familiar with recursion it might sound scary, but really all we're doing is creating a helper function that calls itself in certain circumstances.

Here's how it will work in our example: we'll start by looping over all the top level conditions, then if a condition is actually a condition group (represented by a Drupal\Core\Database\Query\Condition object), we'll call our helper function on it. The helper function will then loop over all conditions in the condition group object, and if one of those conditions happens to be a condition group object, we'll call the helper function again. This will repeat until we reach a condition group that doesn't contain other condition groups.

That was a lot of words, let's take a look at some pseudo code:

foreach top level condition
    if condition is a condition group
        call helper function on it
    else
        check if we want to alter the condition and if so alter it
        
helper function: (takes in conditions group argument)
    foreach condition in condition group
        if condition is itself a condition group
            call helper function on it
        else
            check if we want to alter the condition and if so alter it

Hopefully it's relatively clear that this code would eventually touch every condition in the query tree. If not, refer back to this overview of recursion, or ask me some questions in the comments.

Okay we're almost there. I'm going to inflict one more block of pseudocode on you, and then we'll look at the actual Drupal implementation. Why more pseudocode? Well if you look at what we have written above, you probably noticed that the helper function and the original foreach loop have a lot in common, and when you don't keep your code "DRY" @bollskis kills a kitten. Okay that's not true, but if you don't stay DRY you will definitely get a code review from some smartass on the internet. So here we go:

foreach top level condition group
    call helper function to recursively alter conditions
    
helper function: (takes in conditions group argument)
    foreach condition in condition group
        if condition is itself a condition group
            call the helper function on it
        else
            check if we want to alter the condition and if so alter it

Okay, finally, some real code!

function MYMODULE_views_query_alter(ViewExecutable $view, QueryPluginBase $query) {
  // make sure we're altering the correct view
  if ($view->storage->get('id') === 'MY_VIEW' && $view->current_display === 'MY_VIEW_DISPLAY') {
    // foreach top level condition group
    foreach ($query->where as &$condition_group) {
      // call helper function to recursively alter conditions
      _recursively_alter_query_conditions($condition_group['conditions']);
    }
  }
}

// helper function: (takes in conditions group argument)
function _recursively_alter_query_conditions(&$conditions) {
  // foreach condition in condition group
  foreach ($conditions as &$condition) {
    // if condition is itself a condition group
    if (isset($condition['field']) && is_a($condition['field'], 'Drupal\Core\Database\Query\Condition')) {
      // call the helper function on it
      _recursively_alter_query_conditions($condition['field']->conditions());
    }
    else {
      // check if we want to alter the condition and if so alter it
      _alter_query_condition($condition);
    }
  }
}

// separate helper function to determine if the condition is one we want to alter
function _alter_query_condition(&$condition) {
  if (isset($condition['field']) && ($condition['field'] === 'node_search_index.word')) {
    $condition['value'] = "%{$condition['value']}%";
    $condition['operator'] = 'LIKE';
  }
  if (isset($condition['field']) && ($condition['field'] === 'node_search_dataset.data')) {
    // here we're using trim to eliminate both the unwanted whitespace and the '%' symbols,
    // which then get added back via string concatenation
    $condition['value'] = "%" . trim($condition['value'], " \t\n\r\0\x0B%") . "%";
  }
}

Phew, we did it! Here's a recursion themed gif to celebrate:

whoa

As you can see, once we got our pseudocode sorted out, porting it over to Drupal was pretty straightforward -- just a matter of figuring out the implementation details, like how to determine whether a condition is an instance of the Drupal\Core\Database\Query\Condition class, and how to get the conditions list from said object.

The final piece of the puzzle was another helper function to alter bottom level query conditions, which simply checked to see if the condition field was one we were interested in and then altered the condition as described earlier. Take note that we were only able to act directly on the condition because we made sure to pass everything by reference -- you can see the & operator in both function argument lists and foreach loops in the code above.

I hope this was helpful, please let me know about any questions, typos, code reviews, or general snark in the comments.