This following post was originally published on emerj.com and is based on a presentation by Daniel Faggella for Sinequa‘s INFORM 2019 client event.
Traditional Search – Then
Older search applications would usually search through structured documents, such as loan application forms. They emphasized predictable formats and matching keywords directly to their appearances in enterprise documents. Also, at the time, only natively digital text was searchable, as opposed to scanned print and handwriting. It would take some years before scanned documents and other unstructured data types became searchable.
Before machine learning, “intelligent” search applications could not handle as much metadata as current systems. This made searching for complex topics difficult. In addition, metadata was applied to documents manually. This was a time-consuming process that was required for documents that a company wished to be able to search in the future. In many cases, this continues to be the case.
Intelligent Search – Now
Current search applications can now handle all kinds of structured and unstructured content in various file types with an emphasis on classification for further accessibility. These applications could also enrich documents with metadata, allowing for concept searching and automatic document organization.
Past Difficulties Persist Today
Artificial intelligence and machine learning are not the solution to every search-related business problem. Despite how much search applications have developed over the years, companies still face some of the same difficulties as in the past. The difficulties with adopting an intelligent search application include integration, defining metadata, and determining what data is needed to search the documents a bank or financial institution wants to search.
AI startups and other vendors that are new to the intelligent search space often underestimate the difficulties their clients are likely to face with adoption. Overcoming these challenges can be hard work, and we find that many companies that are just starting out with intelligent search do not consider the commitment required to do so.
These companies often market their AI applications as easy to deploy within the enterprise. However, it is likely that they do this because they have not finished the thorough process of bringing an AI application into the enterprise. They may not have run into the common problems with data infrastructure (an ML problem that almost every enterprise data science leader struggles with) or defining their use cases (easier said than done, requires lots of business context from subject-matter experts).
What AI and ML Bring to Enterprise Search
The potential influence of artificial intelligence and machine learning on enterprise search can be understood as two important capabilities:
Making more information accessible – Making data digitally accessible using techniques such as optical character recognition, machine vision, scanning documents, and analyzing more data types. An AI application can also accomplish this by automatically adding metadata to backlogs of enterprise data.
Enabling companies to ask deeper questions – Enabling the capability of searching for broader concepts as opposed to strict keywords. This is helpful for finding insights on a general topic instead of simply every document including a few terms. Employees could search for documents and information beyond what directly pertains to a single keyword.
When observing the differences between search applications of the past and those of the present, one can see that artificial intelligence could help broaden a bank’s access to data. At the same time, the technology could transform the way in which employees search for that data, thus capitalizing on that access even more.
Enrichment and Classification
One use case of intelligent search for banks and financial institutions is in data enrichment and classification. Documents need to be tagged with metadata, or data that describes the data within those documents. Metadata is what allows employees to search for documents using search queries with keywords and filters.
Traditionally, these documents need to be manually tagged with metadata, often upon uploading or creating them in the ideal situation. But that doesn’t always happen, and as a result, a bank’s digital ecosystem can end up very disorganized. Employees forget to tag documents or tag them incorrectly, making them difficult to find when needed.
Artificial intelligence could improve this process, but leaders at the bank will still need to decide what kind of metadata they want documents tagged with. For example, leaders at the customer service department may want to tag call center logs with metadata about the kind of problem the customer is facing and the emotional state of the caller.
Once they determine categories of metadata, subject matter experts at the department can start tagging documents with this metadata, and once this is complete, they can feed these tagged documents into the machine learning algorithm that will power the intelligent search engine. The bank will then be left with a search application that could automate and improve two parts of the search and discovery process:
Enrichment – When employees upload or create a document, the intelligent search application could automatically tag the documents with metadata, immediately preparing them for search. The application could also run through older documents and automatically add metadata to them as well.
Classification – The machine learning algorithm could also cluster the metadata into broader categories. As a result, documents that are uploaded and created could be automatically organized into folders and allow for easier search with keywords.
Example: Data Confidentiality
Banks and financial institutions could use an intelligent search application to restrict access to enterprise data based on different levels of confidentiality.
They could use these groups as thresholds for documents so that the higher one’s threshold, the more access they have. The top-level would be the most confidential, where nearly no one has access unless it is specifically defined.
The middle level might allow certain categories of people to access certain documents based on what they need to do their job. For example, an account executive for financial services may not have access to the bank’s profit and loss information. The bottom level would allow most or all employees to access openly accessible data, such as customer service agents.
Once thresholds are decided, the company’s subject matter experts and data scientists can begin to label various documents in the database according to their level of confidentiality. The company can then use that labeled data to train an algorithm to go through the rest of the database and find commonalities between all of the documents labeled under a certain threshold. The algorithm could then determine which other documents fit those patterns or involve similar topics.
Unified View of the Customer
Another use-case for intelligent search is gaining what vendors market as a unified view of customers. Customer data is often scattered across various data silos and in structured and unstructured formats, such as a history of transactions or a mortgage application respectively.
This makes it difficult for company employees, especially those that deal with customers every day, to know whether or not they have all of the information a company has on a customer when dealing with them. A wealth manager, for example, may have trouble finding all of the information about a client they need to make the best decision for their portfolio.
When we studied the vendor landscape of intelligent search applications in the banking industry, we found that 75% of the products in the space included capabilities for customer information retrieval. The unified view seems to be a point of resonance for banks and financial institutions in customer service and wealth management use-cases.
Example: Call Centers
A unified view of a customer may allow a call center agent to not only pull up a customer’s contact record in a CRM, but also their past emails with the company, call logs on their past phone calls with the company, and, in some cases, sentiment analysis information on these conversations.
As a result, the call center agent would have a better idea of how to deal with the customer; they may learn that an angry customer has been calling in frequently about overdraft fees and decide it’s better to refund the customer for those fees than to allow them to keep calling in to the support line and take up agent time.
In the future, this use-case may evolve into automated coaching for call center or live chat employees. Employees would get recommendations for how to best handle the customer and even what to sell them on. Instead of deciding for themselves whether or not to refund the irate customer, the AI software might recommend this to the employee.
Concept and Advanced Entity Search
A third use-case for intelligent search is the capability to search for broader concepts and phrases as opposed to individual words or entities. Employees could search for documents with more contextual natural language phrases, as opposed to just searching for specific keywords.
For example, an employee could search “angry customers with an account login issue between June and August” into the search application, and the software could present a list of call logs for customers fitting the criteria. Such a capability is useful for finding more information relating to concepts that could appear in various documents scattered throughout a database, especially when those concepts are discussed in tangential ways.
Example: Searching For Documents Related to LIBOR
In banking, the 2021 sunset of LIBOR may have compliance departments scrambling to search for contracts that reference it so that they might update or manage them for a post-LIBOR state of affairs. In many cases, it may still be very simple to find all LIBOR-related documents and update them via strict keyword searches.
However, there may be many documents within a database that contain LIBOR-related discussions that don’t specifically mention any keywords one might normally associate with LIBOR. Employees using traditional keyword-based search software might miss these documents,
Intelligent enterprise search software could help employees find these documents. Subject matter experts could first find documents that appear to only suggest LIBOR-related discussion and label these documents.
Data scientists could then run this labeled data through the machine learning algorithm behind the search software, and this would train the software to pick up on the patterns that tend to constitute LIBOR-related discussion within a document. As a result, employees could type “LIBOR” into the search application, and the software would return LIBOR-related documents that compliance officers would want to stay on top of.
This way, employees do not have to guess which of the results actually reference LIBOR without mentioning it directly, manually reading through documents to find LIBOR-related discussion. Instead, they would search for LIBOR as a concept, and the algorithm would search the enterprise database for entities/phrases related to that concept.