This article was originally published on Information Management.
This article was originally published in Database Trends & Applications.
The deadline looms on the horizon. On May 25, 2018, the European Union will enact some of the most stringent data privacy regulations the world has ever seen. These regulations will impact thousands of companies around the world, not only EU-based organizations but any company that collects or processes personal data on EU residents. The General Data Protection Regulation (GDPR) recognizes the “fundamental right” of people to control what data is stored about them and how it is used.
Organizations must be ready for this date since the fines for non-compliance could be as high as 4% of annual revenue or $21 million, whichever is higher. To put this in perspective, small companies could go out of business with a $21 million fine, and for a company with revenue of $10 billion, the fine could be a staggering $400 million.
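The fine ceiling is simple arithmetic, and a few lines of code make the two examples above easy to check. This is a minimal sketch using only the figures quoted in this article (4% of annual revenue and a $21 million fixed amount); the function name and parameters are hypothetical, and the regulation itself states the fixed amount in euros.

```python
def max_gdpr_fine(annual_revenue: float,
                  rate: float = 0.04,
                  fixed_amount: float = 21_000_000) -> float:
    """Return the fine ceiling: the greater of rate * revenue and the fixed amount.

    The default values are the article's figures, not legal advice.
    """
    return max(annual_revenue * rate, fixed_amount)


# A company with $10 billion in revenue versus one with $50 million.
print(f"{max_gdpr_fine(10_000_000_000):,.0f}")  # 400,000,000
print(f"{max_gdpr_fine(50_000_000):,.0f}")      # 21,000,000
```

For the smaller company, the fixed amount dominates, which is why the article notes that such a fine could put a small business out of operation.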
No organization with large datasets can sift through them manually to find personal data and judge its GDPR compliance. Companies need sophisticated technology to deal with their data effectively, enabling them to search, discover, and review. Most organizations find it challenging to quickly and accurately identify and find personal data.
Under GDPR guidelines, people can request to be informed about the data that organizations store about them and can demand rectification, erasure, or the restriction of how their data is used. They can also ask to receive their personal data in a common format that allows them to transfer it to another organization.
The impending deadline and the fear of painful fines put organizations under a great deal of pressure, such that they may forget about pursuing the potential business benefits of conformity measures. For example, the prospect of thousands or even millions of people demanding to know what data is stored about them may seem daunting. Since an organization is obliged to answer within 30 days, this might result in thousands of cases per day being handled by customer service.
On the other hand, many large enterprises with millions of individual customers—banks, wireless providers, etc.—need to provide a 360-degree view of a customer to their sales and service personnel—in seconds, not in a month. This is a business requirement independent of GDPR compliance. When customers contact the company, they expect the sales or service reps to know them and give them knowledgeable recommendations and advice.
One way of providing such a 360-degree customer view is using cognitive technologies that can ingest structured data from enterprise applications such as CRM and billing and unstructured data such as emails and other correspondence. Companies often have hundreds of such data sources. Cognitive capabilities, such as natural language processing and machine learning, are necessary to extract relevant information from structured and unstructured data: what kinds of contracts the organization has with customers; service and payment history; whether the latest exchanges were friendly or aggressive; suggestions from past experience with other customers to help solve the current customer’s problems; etc.
In a call center, operators need a complete picture of the person on the line in less than 2 seconds, according to industry standards. If a company has 20 million customers, more than 200 enterprise applications with customer data, and 10,000 call center agents, that is a daunting challenge, but one that companies have successfully overcome.
ROI: BUSINESS BENEFITS—NOT JUST COMPLIANCE
Gartner estimates that European companies will each spend an average of 1.3 million euros to comply with GDPR personal data protection requirements while U.S. businesses are setting aside at least $1 million for GDPR readiness, with some assigning up to $10 million. What do they get for it, apart from avoiding fines?
Let us look at a concrete example of a wireless telecom company that implemented a 360-degree view strategy using cognitive technologies. The first objectives of the project were reduced average call handling time, increased customer satisfaction and loyalty, and increased up- and cross-selling. All these goals have been achieved, but there is another aspect to the project that offered massive savings: Call center employees now have a single, intuitive user interface to access customer data.
They no longer need to understand the roughly 30 enterprise applications they previously had to navigate to access this data. This reduces the need for training from 30 days to 1 day. With 10,000 employees and a turnover rate that often approaches 50%, that means 5,000 x 29 workdays saved per year, i.e., 145,000 workdays or 29,000 person-weeks. The company can certainly offer a lot of customer service during that time! The overall ROI of the project would be approximately 60 million euros over 3 years.
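The training-savings figure can be reproduced with a short calculation. The sketch below uses the numbers given above; the function name is hypothetical, and the five-day work week used to convert workdays into person-weeks is an assumption implied by the article's 145,000 / 29,000 ratio.

```python
def training_days_saved(employees: int, turnover_rate: float,
                        days_before: int, days_after: int) -> int:
    """Workdays saved per year on training new hires."""
    new_hires_per_year = round(employees * turnover_rate)
    return new_hires_per_year * (days_before - days_after)


WORKDAYS_PER_WEEK = 5  # assumption: standard five-day work week

days = training_days_saved(employees=10_000, turnover_rate=0.5,
                           days_before=30, days_after=1)
weeks = days // WORKDAYS_PER_WEEK
print(days, weeks)  # 145000 29000
```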
NEW PARADIGM: CUSTOMER SELF-SERVICE FOR INFORMATION RETRIEVAL
One of the 10 biggest banks in the world has implemented a similar project to provide a 360-degree view of customers to its customer-facing employees. Its objective from the outset was also to provide its customers with a 360-degree view of their own dealings with the bank: accounts, share deposits, insurance contracts, etc. It is easy to extend this interface to answer the question, “What data does the company have on me?” In this way, the company improves its service to customers and fulfills its GDPR obligations without a single employee being involved.
GDPR is coming, but instead of seeing it only as a costly burden, organizations should view the regulation as an opportunity. By implementing advanced cognitive technologies to derive deep customer insights, organizations can ensure compliance while reaping the business benefits of greatly improved customer service that can have a tremendous impact on the bottom line.
The quest for actionable insights and answers from within vast troves of data is never-ending within the modern enterprise. There’s good reason for that – it is the end goal of all information work – but the process is anything but optimized. Global analytics firm Forrester revealed as much in a 2017 report, which found that more than 54% of global information workers are interrupted from their work a few times or more per month by time wasted trying to gain access to information, insights, and answers.
It’s a problem that goes far beyond the limitations of conventional enterprise search technology – it’s a Sisyphean challenge, thanks to the sheer volume of data being created every single second.
“As organizations in data-intensive industries strive to create value, enhance customer experiences, and differentiate themselves from their competition, they are placing demands on their knowledge workers in unprecedented ways,” explains Laurent Fanichet, VP of Marketing for Sinequa. “Frequently, the data and knowledge they are looking for is isolated, segmented, and fractured. It’s difficult to surface the right information at the right time to see the patterns in the data.”
Fanichet has a clear grasp of the key problem that Sinequa, an independent software vendor specialising in cognitive search & analytics, is trying to address. In its recent report, Forrester Wave: Cognitive Search and Knowledge Discovery Solutions (Q2 2017), Forrester defines cognitive search as ‘the new generation of enterprise search that employs AI technologies such as natural language processing and machine learning to ingest, understand, organize, and query digital content’ – and, in the same report, goes on to highlight Sinequa for the applications of its NLP technology in enterprise search.
The kind of cognitive search and analytics platform Sinequa offers, Fanichet explains, refers to an information system that is capable of automatically extracting relevant insights from diverse enterprise datasets for users within a specific work context. “Cognitive search brings the power of AI to enterprise search,” he says. “It helps organizations in data-intensive industries to become information driven.”
A recent IBM Watson report highlights the applications of cognitive search in the aerospace sector. One company uses these enhanced search capabilities “to improve supply chain visibility and reduce cycle time, saving millions of dollars on critical parts deliveries.” Furthermore, the system enables aircraft technicians to search through “reams” of maintenance records and technical documentation. “Now, if a worker needs to know what’s causing high hydraulic oil temperatures, the [cognitive solution] identifies historical cases with similar circumstances, finding patterns that point to the root cause of the overheating.” The report goes on to note that the solution in question saves the airline manufacturer up to $36 million per year.
Cognitive search and analytics likewise has its applications in the health and pharma sector. AI Business recently spoke to Karenann Terrell, GlaxoSmithKline’s first ever Chief Data and Analytics Officer, and former CIO of Walmart. She explained that a big component of what it takes to develop medicine can benefit from next-generation computing and machine learning. “Approximately 1/3 of the total cost of developing a medicine (>$2.5bn) is spent during the time it takes from identifying your target (the process in the body that you want to affect) to testing your molecule in humans for the first time,” she explained. “This process can take around five years. [GSK’s] goal with artificial intelligence is to reduce this time to just one year in future.”
“These are just a few of the many business areas where surfacing the information from within their data can drive better decisions,” Fanichet argues. He explains that cognitive search and analytics also have a range of powerful potential applications within customer service, enabling organizations to:
- Provide personalized and highly relevant communication to their customers
- Nurture customer relationships and prevent customer churn
- Improve productivity, reduce operating expenses, and gain operational efficiencies
- Minimize customer service representative turnover and knowledge loss
The Challenges Ahead for Cognitive Search
The potential use cases speak for themselves, but that doesn’t mean there aren’t challenges ahead for enterprises looking to incorporate cognitive search technology into their work. While working with clients, Fanichet explains, Sinequa helps them to understand that there is a set of common machine learning challenges along the path ahead. Expertise is often the first hurdle, but he maintains that there are many different types of AI implementation challenges. “Assuming that enterprises are able to resolve a dearth of expertise, there are still other challenges – most of which are specific to the type of AI being pursued.”
Take supervised machine learning, where the system learns to recognize patterns by observing ‘correct’ patterns provided by humans. “The greatest challenge is around providing sufficiently labelled training datasets from which the system can learn,” Fanichet explains. This is something Matt Buskell highlights in his ‘10 keys to AI implementation’, recommending that, following the initial loading of data and the knowledge base, the system go through a phase of refinement once the software has launched. “During this phase, things like gain and variance for Machine Learning, or intent training for NLP and maybe model refinement to cognitive reasoning need to be improved. During this phase, it is essential to carefully release the software and measure how well it’s performing over a 6-12 week period, at the least.”
Fanichet likewise highlights the obstacles unique to unsupervised machine learning, in which the system identifies existing patterns and a human determines their usefulness. “The greatest challenge is balancing the system’s need for sufficient data with the proper human guidance and interpretation needed to train the system,” Fanichet argues. This is as much an issue of skills and process culture as it is technical expertise, and is reflected in a recent Genpact survey of over 300 senior executives, which argues “AI cannot be implemented piecemeal. It must be part of the organisation’s overall business plan, along with aligned resources, structures, and processes.” Collaboration is therefore key.
Finally, there’s a need to formulate clear goals and outcomes, Fanichet says. “When pursuing reinforcement learning, where the system makes many attempts and learns from the outcome to take better actions, the greatest challenge is providing the system with a defined goal and sufficient practice in a dynamic environment so that the system can effectively learn from trial and error.”
Sinequa Brings the Power of AI to Enterprise Search
Fanichet believes Sinequa offers a range of unique intelligent capabilities within the analytics space:
- Robust Indexing Engine: “If cognitive search was all about matching a keyword, a single index would suffice. The best results are obtained when multiple indexes are combined, each providing a different perspective or emphasis, providing a comprehensive overview of the information available. This provides the best possible understanding of the meaning it carries.”
- Enterprise Grade: “Sinequa was designed from the start to support the complexities and multiple security layers of today’s enterprises. It was also designed to be immersed in diverse enterprise environments and can operate within the context of a specific industry and the language of the specific organization.”
- Topically Aware: “Connecting information along topical lines across all repositories surfaces the collective expertise of the organization and makes it transparent. This is especially valuable in large organizations that are geographically distributed. By connecting people with expertise, the overall responsiveness of the organization increases.”
- Natural Language Processing: “Sinequa’s world-class NLP offers automated language detection; lexical and syntactical analysis; and automatic extraction of dozens of entity types, including concepts and named entities like people, places, companies, etc. It also supports text mining agenda that is integrated into the indexing engine. This enables the extraction of virtually any function, relationship, or complex concept from the content.”
- Machine Learning: “Sinequa leverages ML to enhance and improve search results and relevancy. This is done during ingestion but also constantly in the background as humans interact with Sinequa. It has become an essential part of the platform since it can handle complexity beyond what’s possible with rules.”
- Well Designed User Experience: “Sinequa’s front-end serves as an intelligent agent that employees can consult for institutional knowledge that can be readily applied to the task or situation at hand. The experience is well designed in the sense that it is aesthetically pleasing, it is understandable in that it makes use of the user’s intuition, it is unobtrusive, and perhaps most importantly, it is contextual to the user’s goals.”
- Ubiquitous Connectivity: “Sinequa’s product comes with over 160 ready-to-use connectors, all of which were developed in-house, thus ensuring consistency, quality control, and high performance.”
A new IDC report recognizes Sinequa for our Cognitive Search & Analytics platform around critical technologies, including machine learning and advanced natural language processing. This Vendor Spotlight looks at how Sinequa leverages artificial intelligence and cognitive computing-based analytics to meet the needs of companies that are looking to address complex problems with easy-to-use, powerful solutions featuring simplified interfaces.
According to the report’s author David Schubmehl, Research Director for IDC’s Cognitive/Artificial Intelligent Systems and Content Analytics research, “The capabilities being offered by cognitive knowledge discovery systems, such as Sinequa, provide many opportunities for enterprises to innovate and advance their organization using approaches that were either not possible or not easily implemented several years ago. Within many enterprises, these opportunities are limited only by the imagination and creativity of those seeking to improve their business and information handling processes.”
The report states that Sinequa’s software provides organizations with real-time, relevant results from unstructured and structured internal data, and that we are developing our Cognitive Search & Analytics platform on an extensive foundation of unstructured information access technologies that include advanced natural language processing capabilities in 21 different languages.
Schubmehl adds: “While Sinequa has offered a flexible information collection, access, and analysis architecture for many years, it has now built capabilities around cognitive technologies, such as machine learning, advanced natural language processing, improved relevance, and better decision support while offering strong user and data interaction capabilities.”
The advancement of natural language processing and increased maturity of machine learning are creating substantial demand for cognitive search and analytics solutions. At the same time, the growth of unstructured data and pressure to improve worker productivity make it even more critical to find the right information at the right time. This report highlights the fact that Sinequa’s platform meets this demand and that, by combining our solution with human ingenuity, we can produce the best possible search and analytics results.
The full report is available here: https://www.sinequa.com/idc-vendor-spotlight-2017/
As the data-driven age gives way to an information-driven economy where context is critical to surfacing useful insights from data, taking in relevance feedback from users, especially expert users, will play a major role in driving the benefits. This article explains the concept of a relevance feedback model and why you should care.
What is a Relevance Feedback Model?
Assume you ask a person or a system to provide you with information on a certain topic. There may be many facets to this topic, and you may get information from a whole range of different aspects. If you are working with that person or that system on a permanent basis, you may want to tell them that only certain aspects and hence certain kinds of information are relevant to you – in the hope of getting only the more relevant answers from them the next time you ask. You give the person or system “relevance feedback”.
Now let’s concentrate on a system, a cognitive information retrieval system or a “system of insight”. In that context, a relevance feedback model (RFM) is the capability of the system to take your relevance feedback and “internalize” it in order to tune the results of your future queries to what is most relevant.
The system performs and automates this task by adjusting weights attributed to certain terms and their equivalents (i.e. terms with the same or similar meanings) within the data it processes.
Imagine you asked “what do we have on MRO”, and you got information back on maintenance, repair and operations, but you told the system that you are only interested in anything pertaining to “Mars Reconnaissance Orbiter”. The next time you ask, you will get information only pertaining to the latter and possibly on related topics like Mars landing craft, automated robots for planetary exploration, etc.
For one person and one query, that seems rather simple. But now imagine that you have tens of thousands of colleagues and thousands of topics to cover. That is when the RFM benefits from machine learning algorithms, which detect not only the preferences of each person but also those of groups of people with similar interests, similarities between documents, etc., in order to spread user relevance feedback to other documents, queries and people on an ongoing basis in an automated way.
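The single-user case described above, adjusting weights on terms and their equivalents, can be sketched in a few lines. This is an illustrative toy and not Sinequa's implementation: the equivalence table, class name, step size, and document names are all hypothetical, and a real system would learn equivalences and propagate feedback across users with machine learning.

```python
from collections import defaultdict

# Hypothetical equivalence table: surface terms mapped to one canonical concept.
EQUIVALENTS = {
    "mars reconnaissance orbiter": "mro_spacecraft",
    "orbiter": "mro_spacecraft",
    "maintenance": "mro_maintenance",
    "repair": "mro_maintenance",
}


class FeedbackModel:
    def __init__(self, step: float = 0.5):
        self.step = step
        self.weights = defaultdict(lambda: 1.0)  # every concept starts neutral

    def concepts(self, terms):
        """Map terms to canonical concepts so equivalents share one weight."""
        return {EQUIVALENTS.get(t, t) for t in terms}

    def feedback(self, terms, relevant: bool):
        """Raise or lower the weight of each concept the user judged."""
        delta = self.step if relevant else -self.step
        for c in self.concepts(terms):
            self.weights[c] = max(0.0, self.weights[c] + delta)

    def rank(self, docs):
        """Order document ids by the summed weight of their concepts."""
        def score(doc_id):
            return sum(self.weights[c] for c in self.concepts(docs[doc_id]))
        return sorted(docs, key=score, reverse=True)


docs = {
    "maintenance_manual": ["maintenance", "repair"],
    "orbiter_report": ["mars reconnaissance orbiter"],
}
model = FeedbackModel()
model.feedback(["orbiter"], relevant=True)       # "I meant the spacecraft"
model.feedback(["maintenance"], relevant=False)  # "not maintenance, repair and operations"
print(model.rank(docs))  # ['orbiter_report', 'maintenance_manual']
```

Because feedback is stored per concept instead of per literal term, marking "orbiter" as relevant also lifts documents that mention the full name "Mars Reconnaissance Orbiter".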
Why use a Relevance Feedback Model?
A key benefit of a relevance feedback model is to enable users, in particular expert users, to affect relevance appropriate to their environment without the IT department having to implement rules for relevance according to specific user groups. It allows administrators to decide by configuration which specific users within the organization will contribute as well as the exact factor of relevancy improvement.
The relevance feedback model can also go a long way towards improving the human-machine interaction. As the relevance of certain content increases significantly due to relevance feedback, the user experience starts to feel much more “conversational” (i.e., offering one to three suggestions as “answers” to a query) than a traditional search interface offering a list of documents in response to a query.
The RFM provides a way to discover from everyone’s experience the information that best answers the question. Take the real-world case of a customer service representative (CSR) seeking an answer to a customer’s product question using the product name or code. In this case, the CSR will obtain a diverse set of documents including parts catalogs, how-to information, product specifications, packaging information, marketing material, etc. All of this information is relevant but only some of it may help the CSR answer the customer’s question.
Thanks to the RFM, the CSR would immediately see information she has already viewed when she searched similar things in the past because the RFM takes into account the user’s “click actions” and applies a tiny relevance boost accordingly. Perhaps even more powerfully, the RFM will also modify the order of the results by observing (over time) what information other CSRs spend time to discover, even when they dive deeply into the results list for relevant information. Organizations striving to take full advantage of the RFM will configure it so that the experts’ interactions with the system provide bigger boosts for important content and even ban inaccurate information from appearing in results lists.
As you can see from the example above, the RFM provides a collaborative way to modify search result order. It is neither a tagging nor a classification approach, both of which can be done at indexing time (extracting metadata from the source, entity extraction with Natural Language Processing) or afterwards (classification through ML algorithms such as clustering, similarity computation, and so forth). The RFM arguably represents a smarter approach by directly incorporating human decisions when presenting information that will best address a user’s query.
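The collaborative re-ranking in the CSR scenario (small boosts from ordinary clicks, larger boosts from experts, and expert bans) could look roughly like the sketch below. Everything here is an assumption made for illustration: the role names, boost factors, class, and sample documents are hypothetical and do not reflect any actual product configuration.

```python
from collections import defaultdict

# Hypothetical configuration: which roles contribute and how much each click counts.
BOOST_BY_ROLE = {"expert": 0.5, "csr": 0.05}


class ClickFeedback:
    def __init__(self):
        self.boost = defaultdict(float)  # (query, doc_id) -> accumulated boost
        self.banned = set()              # doc ids that experts removed from results

    def record_click(self, query, doc_id, role):
        """Add a role-dependent boost; roles not in the config contribute nothing."""
        self.boost[(query, doc_id)] += BOOST_BY_ROLE.get(role, 0.0)

    def ban(self, doc_id, role):
        """Only experts may ban inaccurate content."""
        if role == "expert":
            self.banned.add(doc_id)

    def rerank(self, query, results):
        """results: list of (doc_id, base_score) pairs from the search engine."""
        kept = [(d, s) for d, s in results if d not in self.banned]
        kept.sort(key=lambda r: r[1] * (1 + self.boost.get((query, r[0]), 0.0)),
                  reverse=True)
        return [d for d, _ in kept]


results = [("marketing_brochure", 1.0), ("how_to_guide", 0.9), ("outdated_spec", 0.8)]
fb = ClickFeedback()
for _ in range(3):
    fb.record_click("X200", "how_to_guide", "csr")  # colleagues keep opening the guide
fb.ban("outdated_spec", "expert")
print(fb.rerank("X200", results))  # ['how_to_guide', 'marketing_brochure']
```

Multiplying the engine's base score by one plus the accumulated boost keeps the original relevance signal in play, so feedback reorders results gradually instead of overriding the search engine outright.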
As information-driven organizations strive for ever higher degrees of accuracy for end users seeking knowledge, the ability to leverage relevance feedback from users, especially expert users, automatically at scale becomes increasingly mission-critical for optimal business performance.