Imagine being tasked with extracting valuable information from a pile of raw text: company names, phone numbers, URLs, ZIP codes, financial reports, and so on. The phrase “finding a needle in a haystack” comes to mind, right? But that won’t be the case if you know about regular expressions, or regex!
A regex is a pattern, built from literal characters and operators (like . ? + * {} | () [] - ^), that describes which character combinations to match. Applied to a text document, it extracts the data that fits those rules, even from a massive volume of data.
For instance, suppose you want to filter out the website visits from your corporate network and analyze only the external visitors or prospects. Without regex, you’d have to compile the internal IP addresses manually from a big muddle of data. With regex, it becomes a breezy affair. Let’s say your internal IP addresses range from 195.71.100.1 to 195.71.100.25. Rather than entering all of them, you could simply create a regex like 195\.71\.100\.\d* that matches every address in that range. Note that it also matches any other address beginning with 195.71.100., so tighten the last part of the pattern if the rest of that block is not yours.
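To make the difference concrete, here is a minimal Python sketch. It assumes the internal range is 195.71.100.1 to 195.71.100.25 as described above; the sample visitor list and the tighter STRICT pattern are illustrative additions, not part of any particular product.

```python
import re

# Loose pattern from the text: any run of digits (even none) in the last octet.
LOOSE = re.compile(r"195\.71\.100\.\d*")

# Tighter, illustrative alternative: last octet restricted to 1-25.
STRICT = re.compile(r"195\.71\.100\.(?:[1-9]|1\d|2[0-5])")

# Made-up sample of visitor IP addresses.
visits = ["195.71.100.7", "195.71.100.25", "195.71.100.200", "203.0.113.9"]

# fullmatch() requires the whole string to fit the pattern.
internal_loose = [ip for ip in visits if LOOSE.fullmatch(ip)]
internal_strict = [ip for ip in visits if STRICT.fullmatch(ip)]
external = [ip for ip in visits if not STRICT.fullmatch(ip)]

print(internal_loose)
print(internal_strict)
print(external)
```

The loose pattern treats 195.71.100.200 as internal, while the strict one does not. Which behaviour you want depends on how your network is actually laid out.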
Regex is used everywhere: data pre-processing, NLP, data extraction, pattern matching, web scraping, string parsing, syntax highlighting, and now cognitive search. Wait, cognitive search? How, you may ask. This blog explains how regex patterns help admins and search users under the hood.
Looking at Regex Through Cognitive Search Lens
Organizations turn to intelligent cognitive search to support their data-hungry culture. But sometimes, it may not surface relevant documents due to a lack of search tuning and query boosting. Enter regex.
Regex is the de facto standard for parsing text from large documents. When used in cognitive search, it supports query boosting, improves keyword tuning, and optimizes search relevance. Let me explain with an example.
A document boosted to rank one for “KPI” might not show up at all when its plural sibling “KPIs” is the search query. To boost the doc for both, the admin would have to enter two separate queries manually, unless they know regex. Instead of entering multiple keywords, they can boost the document with a single regex pattern, /kpis?/i, which produces the same result for both “KPI” and “KPIs”.
Similarly, if document X is boosted for the query “launch” through a pattern such as /launch\w*/i, it will be boosted for “launch,” “Launch,” “LAUNCH,” “launches,” “LAUNCHES,” “Launching,” and several other matching words.
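Here is a small Python sketch of the idea. It assumes the engine tests each pattern against the whole query, and it uses launch\w* as one illustrative way to cover the inflected forms; the rule list, the document names, and the boosted_docs helper are hypothetical and not taken from any product.

```python
import re

# Hypothetical boost rules: (query pattern, document to pin at rank one).
# Document names are placeholders.
BOOST_RULES = [
    (re.compile(r"kpis?", re.IGNORECASE), "kpi-guide"),
    (re.compile(r"launch\w*", re.IGNORECASE), "document-x"),
]

def boosted_docs(query):
    """Return the documents whose rule matches the entire query."""
    return [doc for pattern, doc in BOOST_RULES if pattern.fullmatch(query)]

for q in ["KPI", "KPIs", "LAUNCHES", "Launching", "lunch"]:
    print(q, "->", boosted_docs(q))
```

One pattern per document replaces a hand-maintained list of keyword variants, and the case-insensitive flag takes care of “launch,” “Launch,” and “LAUNCH” in one go.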
How Cognitive Search Leverages Regex Patterns to Augment Relevancy
Regex can automate a lot of mundane, manual tasks. It is also flexible, so customizing it to fit your particular use case is rather simple. And when coupled with cognitive search, it can work wonders to amplify keyword and query boosting and elevate relevance manifold.
1. Boost Contextual Results
Regex is a rule-based pattern. If you fail to lay down the proper rules, a pattern will match not only what you specify but also any adjacent characters. For example, in a partial match for “site”, the results may contain “mysite,” “website,” “yoursite,” “theirsite,” “parasite,” and so on. A potent cognitive platform can help circumvent this issue by understanding the context of the query and populating relevant results on top accordingly. This is made possible by technologies like machine learning and natural language understanding (NLU).
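The over-matching itself is easy to reproduce. The Python sketch below contrasts a partial match with a word-boundary match, which is the rule-based way to tighten the pattern. It is separate from the contextual ranking a cognitive platform applies, and the sample strings are made up.

```python
import re

# Made-up candidate strings.
words = ["site", "mysite", "website", "parasite", "Site map"]

# Partial match: "site" anywhere in the string.
partial = re.compile(r"site", re.IGNORECASE)

# Whole-word match: \b marks a boundary between a word and a non-word character.
whole_word = re.compile(r"\bsite\b", re.IGNORECASE)

partial_hits = [w for w in words if partial.search(w)]
whole_hits = [w for w in words if whole_word.search(w)]

print(partial_hits)
print(whole_hits)
```

The partial pattern hits all five strings, while the bounded one keeps only “site” and “Site map”.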
2. Keyword Tuning & Handling Query Inflexion
Usually, regex clings to specific data patterns. However, when combined with cognitive search, it acts as a tuning tool that can boost or bury search results. For instance, SearchUnify’s recent release enables enterprises to choose the top ten results for one search term or a whole set of search terms. This allows them to boost a subset of documents for a set of query patterns. Additionally, you can enable synonym boost, which automatically applies the boost to defined synonyms such as “SearchUnify” and “SU.” Lastly, you can customize the tuning per search client too.
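As a rough illustration of boost, bury, and synonym handling working together, consider the Python sketch below. Everything in it is an assumption made for the example: the rule format, the score multipliers, the document IDs, and the retune helper are hypothetical and do not describe how SearchUnify implements tuning.

```python
import re

# Hypothetical synonym group: all terms should trigger the same boost.
SYNONYMS = {"searchunify": ["SearchUnify", "SU"]}

def synonym_pattern(terms):
    """Build one case-insensitive pattern matching any term as a whole word."""
    alternation = "|".join(re.escape(t) for t in terms)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)

# Hypothetical rules: (query pattern, document id, score multiplier).
# A multiplier above 1 boosts the document; below 1 buries it.
RULES = [
    (synonym_pattern(SYNONYMS["searchunify"]), "su-admin-guide", 2.0),
    (re.compile(r"\bkpis?\b", re.IGNORECASE), "old-kpi-memo", 0.5),
]

def retune(query, scored_results):
    """Apply matching rules to (doc_id, score) pairs and return them re-ranked."""
    tuned = dict(scored_results)
    for pattern, doc_id, factor in RULES:
        if doc_id in tuned and pattern.search(query):
            tuned[doc_id] *= factor
    return sorted(tuned.items(), key=lambda item: item[1], reverse=True)

print(retune("how to configure SU", [("faq", 1.0), ("su-admin-guide", 0.8)]))
print(retune("kpi dashboard", [("old-kpi-memo", 1.0), ("kpi-guide", 0.9)]))
```

In the first query the synonym “SU” lifts the admin guide above the FAQ; in the second, the bury rule pushes the old memo below the current guide.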
3. Test Your Tuning
It is better to test the waters before jumping in, right? That’s precisely what Test Your Tuning is all about. You should look for cognitive platforms that allow you to measure the impact of your regex tuning before applying the changes to the live environment. SearchUnify enables admins to beta test new tuning in a simulation before implementing it on the search client(s). This ensures it delivers the desired results with no whoopsies for your end-users. On top of that, you can share the results of your tuning by email.
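Even without a platform feature, you can do a quick dry run of a pattern against sample queries. The Python sketch below is a generic illustration, not a description of SearchUnify’s simulation; the preview_rule helper and the sample queries are hypothetical, and it assumes the rule must match the whole query.

```python
import re

def preview_rule(pattern, sample_queries):
    """Split sample queries into those a rule would affect and those it would not."""
    rule = re.compile(pattern, re.IGNORECASE)
    affected = [q for q in sample_queries if rule.fullmatch(q)]
    untouched = [q for q in sample_queries if not rule.fullmatch(q)]
    return affected, untouched

# Made-up sample of real-looking queries.
affected, untouched = preview_rule(r"kpis?", ["KPI", "kpis", "KPI report", "OKR"])

print("affected:", affected)
print("untouched:", untouched)
```

A dry run like this surfaces surprises early. Here “KPI report” falls outside the rule, which may or may not be what you intended.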
To Conclude
At first, regex can sound a tad complicated, but the results it produces are tremendous. It empowers you to present the most relevant results at the top for each user query, which can increase your conversion and case deflection rates.
Our Client Improved Deflection Rate from 39% to 50% in Just 3 Months! Want to Know How?
Join Lynette Ledoux, Customer Specialist, SearchUnify, and Cheryl Zupke, Technical Content Developer, Cornerstone InDemand, in a fireside chat to hear the complete story. In the live event, Cheryl will share how the customer success team at SearchUnify worked with her to improve the cumulative case deflection from 39% to 50% in just three months! Register now.