Learn how repetitive words within products affect search results and how you can change this behavior.
Update: Starting with version 3.06, the plugin gives a lower score to repeated terms. This partially fixes the problem described in this article, but the article is still relevant if you need to increase or decrease the impact of repeated words on the relevance score.
In this article we will learn how the plugin calculates relevance scores for repeated words and how to change that default behavior.
Problem: when a search query contains multiple terms, every occurrence of a word adds the same score as any other word. A product that repeats one search word many times can therefore rank higher than a product that contains all the words of the query but repeats them less often.
Example: we have the search query Hoodie logo. With the default OR search logic, the plugin shows all products that contain at least one of these words.
Then the plugin orders these products by relevance score. The relevance score is calculated from several criteria. Let's simplify things: in our example we search only inside the product title, and each of the search words, Hoodie and logo, has the same relevance value of 100.
For example, here is how the total score is calculated for different products:
Hoodie with logo - total score = 200 (contains Hoodie once and logo once)
Hoodie with pocket - total score = 100 (contains Hoodie once)
Hoodie with hoodie and some hoodie - total score = 300 (contains Hoodie three times)
This is the default algorithm for ordering products. As we can see, the product titled Hoodie with hoodie and some hoodie takes first place in the search results.
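The simplified scoring above can be reproduced with a few lines of PHP. This is only a sketch of the example's arithmetic, not the plugin's actual implementation: the function simple_relevance_score() is a hypothetical helper, and it assumes title-only search, whole-word matching, and 100 points per term.

```php
<?php
// Hypothetical helper for illustration only; it is not part of the plugin.
// Assumes title-only search, whole-word matching, and 100 points per term.
function simple_relevance_score( string $title, array $terms, int $term_score = 100 ): int {
    // Split the title into lowercase words and count each word's occurrences.
    $words  = preg_split( '/\W+/', strtolower( $title ), -1, PREG_SPLIT_NO_EMPTY );
    $counts = array_count_values( $words );

    $total = 0;
    foreach ( $terms as $term ) {
        // Default behavior: term score multiplied by the number of occurrences.
        $count  = $counts[ strtolower( $term ) ] ?? 0;
        $total += $term_score * $count;
    }
    return $total;
}

$terms = array( 'Hoodie', 'logo' );
echo simple_relevance_score( 'Hoodie with logo', $terms ), "\n";                   // 200
echo simple_relevance_score( 'Hoodie with pocket', $terms ), "\n";                 // 100
echo simple_relevance_score( 'Hoodie with hoodie and some hoodie', $terms ), "\n"; // 300
```

The product that repeats Hoodie three times wins even though it does not contain logo at all.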
Perhaps this behavior will suffice in most cases. But it is also possible to change it - reduce the impact of repeated words or even remove it entirely.
We want to ensure that products that include all search words rank higher in search results than products that have multiple repetitions of the same word.
The next section covers how to do that.
Now let's try to reduce the impact of repeated words on a product's relevance score.
The default formula looks like this:
{term} * number
This means that the plugin simply multiplies the score of each term by the number of those terms.
We want to make the repeated word less meaningful. In our example we will use the following formula instead:
1 + (count-1)/5
For 2 repeated words, the relevance multiplier will be 1.2. For 3 it is 1.4, for 4 it is 1.6, and so on.
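To see how this formula changes the ordering in the Hoodie logo example, here is a minimal sketch. It assumes the same simplified model as before (title-only search, 100 points per term); reduced_multiplier() and reduced_relevance_score() are hypothetical helpers and do not exist in the plugin.

```php
<?php
// Hypothetical helpers for illustration only; they are not part of the plugin.

// Multiplier from the formula 1 + (count-1)/5.
// A term that does not occur in the title contributes nothing.
function reduced_multiplier( int $count ): float {
    return $count > 0 ? 1 + ( $count - 1 ) / 5 : 0.0;
}

// Total score for a title, assuming title-only search and 100 points per term.
function reduced_relevance_score( string $title, array $terms, int $term_score = 100 ): float {
    $words  = preg_split( '/\W+/', strtolower( $title ), -1, PREG_SPLIT_NO_EMPTY );
    $counts = array_count_values( $words );

    $total = 0.0;
    foreach ( $terms as $term ) {
        $count  = $counts[ strtolower( $term ) ] ?? 0;
        $total += $term_score * reduced_multiplier( $count );
    }
    return $total;
}

$terms = array( 'Hoodie', 'logo' );
printf( "%.0f\n", reduced_relevance_score( 'Hoodie with logo', $terms ) );                   // 200
printf( "%.0f\n", reduced_relevance_score( 'Hoodie with pocket', $terms ) );                 // 100
printf( "%.0f\n", reduced_relevance_score( 'Hoodie with hoodie and some hoodie', $terms ) ); // 140
```

Under these assumptions, Hoodie with logo (200) now ranks above Hoodie with hoodie and some hoodie (140), while the repeated word still earns a small bonus over Hoodie with pocket (100).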
To apply such changes just use the following code snippet:
add_filter( 'aws_search_query_array', 'my_aws_search_query_array' );
function my_aws_search_query_array( $query ) {
    // Replace the default "* count" multiplier with the softer 1 + (count-1)/5.
    $query['relevance'] = str_replace( ' * count', ' * ( 1 + (count-1)/5 )', $query['relevance'] );
    return $query;
}
Update: if you are using plugin version 3.06 or above then you don't need to apply these changes - it is a default formula starting from that version.
Another possible change is to eliminate the impact of repeated words completely. In this case, no matter how many times a word is repeated inside one product, the plugin counts only the number of different words from the search query.
To make this change please use the following code snippet:
add_filter( 'aws_search_query_array', 'my_aws_search_query_array' );
function my_aws_search_query_array( $query ) {
    // Remove the "* count" multiplier so each matched term is counted once.
    $query['relevance'] = str_replace( ' * count', '', $query['relevance'] );
    return $query;
}
If you are using plugin version 3.06 or above, use the following code instead:
add_filter( 'aws_relevance_count_multiplier', 'my_aws_relevance_count_multiplier' );
function my_aws_relevance_count_multiplier( $formula ) {
    // A constant multiplier of 1 ignores how many times a term is repeated.
    $formula = '1';
    return $formula;
}
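As a rough illustration of what a constant multiplier means for the Hoodie logo example, the sketch below counts each matched search word only once. It uses the same simplified assumptions as the example earlier in the article (title-only search, 100 points per term), and distinct_terms_score() is a hypothetical helper, not a plugin function.

```php
<?php
// Hypothetical helper for illustration only; it is not part of the plugin.
// Each search term found in the title adds its score once, regardless of repeats.
function distinct_terms_score( string $title, array $terms, int $term_score = 100 ): int {
    $words = preg_split( '/\W+/', strtolower( $title ), -1, PREG_SPLIT_NO_EMPTY );

    $total = 0;
    foreach ( $terms as $term ) {
        if ( in_array( strtolower( $term ), $words, true ) ) {
            $total += $term_score;
        }
    }
    return $total;
}

$terms = array( 'Hoodie', 'logo' );
echo distinct_terms_score( 'Hoodie with logo', $terms ), "\n";                   // 200
echo distinct_terms_score( 'Hoodie with pocket', $terms ), "\n";                 // 100
echo distinct_terms_score( 'Hoodie with hoodie and some hoodie', $terms ), "\n"; // 100
```

With repetitions ignored, the two products that contain only Hoodie get the same score, and the product containing both search words ranks first.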