Mind the Information Gap

The following was originally published on the Benelux Intelligence Community website.

Over the last several years, data analytics has become a driving force for organizations wanting to make informed decisions about their businesses and their customers.  With further advancements in open source analytic tools, faster storage and database performance and the advent of sensors and IoT, IDC predicts the big data analytics market is on track to become a $200 billion industry by the end of this decade.

Many organizations now understand the value of extracting relevant information from their enterprise data and using it for better decision-making, superior customer service and more efficient management. But to realize their highest potential in this space, organizations will have to evolve from being “data-driven” to being “information-driven.” While these two categories might sound similar, they’re actually quite different.

To make a data-driven decision, a user must somehow find the data relevant to a query and then interpret it to resolve that query. The problem with this approach is that there is no reliable way to know whether the data found is complete and accurate.

Being information-driven means having all of the relevant content and data from across the enterprise intelligently and securely processed into information that is contextual to the task at hand and aligned with the user’s goals.

An information-driven approach is ideal for organizations in knowledge-intensive industries such as life sciences and finance, where the number and volume of data sets are increasing and arriving from diverse sources. The approach has repeatedly proven to help research and development organizations within large pharmaceutical companies connect experts with other experts and with knowledge across the organization, accelerating research, lab tests and clinical trials so they can be first to market with new drugs.

Or think of maintenance engineers working at an aircraft manufacturer trying to address questions over an unexpected test procedure result. For this, they need to know immediately the particular equipment configuration, the relevant maintenance procedures for that aircraft and whether other cases with the same anomaly are known and how they were treated. They don’t have time to “go hunting” for information. The information-driven approach draws data from multiple locations, formats and languages for a complete picture of the issue at hand.

In the recent report, “Insights-Driven Businesses Set the Pace for Global Growth,” Forrester Research notes organizations that use better data to gain business insights will create a competitive advantage for future success. They are expected to grow at an average of more than 30 percent each year, and by 2020 are predicted to take $1.8 trillion annually from their less-informed peers.

To achieve this level of insight, here are several ways to evolve into an information-driven organization.

Understand the meaning of multi-sourced data

To be information-driven, organizations must have a comprehensive view of information and understand its meaning. If it were only about fielding queries and matching on keywords, a simple indexing approach would suffice.

The best results are obtained when multiple indexes are combined, each contributing a different perspective or emphasis. The indexes are designed to work in concert: for example, a full-text index for key terms and descriptions, a structured index for metadata and a semantic index that focuses on the meaning of the information.
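As a rough illustration of how several indexes might contribute to a single ranking, the following Python sketch blends a full-text score, a structured-metadata score and a semantic score using fixed weights. The `Doc` class, the three scoring functions and the weights are hypothetical and chosen only for illustration; they do not describe any particular product's implementation.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    text: str        # content covered by the full-text index
    metadata: dict   # fields covered by the structured index
    concepts: set    # concepts covered by the semantic index


def full_text_score(query_terms, doc):
    """Fraction of query terms that appear as words in the document text."""
    if not query_terms:
        return 0.0
    words = set(doc.text.lower().split())
    return sum(1 for t in query_terms if t.lower() in words) / len(query_terms)


def structured_score(filters, doc):
    """Fraction of requested metadata fields that match exactly."""
    if not filters:
        return 0.0
    hits = sum(1 for k, v in filters.items() if doc.metadata.get(k) == v)
    return hits / len(filters)


def semantic_score(query_concepts, doc):
    """Fraction of query concepts that the document is tagged with."""
    if not query_concepts:
        return 0.0
    return len(query_concepts & doc.concepts) / len(query_concepts)


# Illustrative weights only; a real system would tune these.
WEIGHTS = {"text": 0.4, "structured": 0.2, "semantic": 0.4}


def combined_score(doc, query_terms, filters, query_concepts):
    """Weighted blend of the three per-index scores."""
    return (
        WEIGHTS["text"] * full_text_score(query_terms, doc)
        + WEIGHTS["structured"] * structured_score(filters, doc)
        + WEIGHTS["semantic"] * semantic_score(query_concepts, doc)
    )


def search(docs, query_terms, filters, query_concepts):
    """Return documents ordered from highest to lowest combined score."""
    return sorted(
        docs,
        key=lambda d: combined_score(d, query_terms, filters, query_concepts),
        reverse=True,
    )
```

A document that matches the keywords, the metadata filter and the query concept scores near 1.0, while one that matches on none of them scores 0.0. The value of blending is that a document can still surface when it matches on meaning but not on exact keywords.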

Maintain strong security controls and develop contextual abilities

Being information-driven also requires a tool that is enterprise-grade with strong security controls to support the complexities and multiple security layers, and contextual enrichment to learn an organization’s vernacular and language.

Capture and leverage relevant feedback from searches

As queries are performed, information about how end users interact with the system is captured and leveraged in all subsequent searches. This approach ensures the quality of information improves as the system learns which documents are most used and most valued.
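One simple way to picture this feedback loop is a reranker that counts which results users open for a given query and nudges those documents upward next time. The sketch below is a minimal, assumed design: the `FeedbackRanker` class, the logarithmic boost and the `boost` parameter are illustrative inventions, not a description of how any specific platform records or applies feedback.

```python
import math
from collections import defaultdict


class FeedbackRanker:
    """Rerank search results using click counts recorded per (query, document)."""

    def __init__(self, boost=0.1):
        self.clicks = defaultdict(int)  # (normalized query, doc_id) -> click count
        self.boost = boost              # hypothetical strength of the feedback signal

    def record_click(self, query, doc_id):
        """Remember that a user opened doc_id after running this query."""
        self.clicks[(query.lower(), doc_id)] += 1

    def rerank(self, query, scored_docs):
        """scored_docs is a list of (doc_id, base_score) pairs from the engine.

        Each score is raised by boost * log(1 + clicks), so early clicks matter
        most and heavily clicked documents cannot grow without bound.
        """
        q = query.lower()
        adjusted = [
            (doc_id, score + self.boost * math.log1p(self.clicks.get((q, doc_id), 0)))
            for doc_id, score in scored_docs
        ]
        return sorted(adjusted, key=lambda pair: pair[1], reverse=True)
```

The logarithm is a design choice to dampen popularity effects; a production system would likely also decay old clicks over time and guard against position bias.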

Connect information along topical lines

Connecting information along topical lines across all repositories allows information-driven organizations to expose and leverage their collective expertise. This is especially valuable in large organizations that are geographically distributed.

As more people are connected, the overall organization becomes more responsive, drawing in research and development, service and support, and marketing and sales as needed. New and existing employees alike can become proficient in less time as they learn new skills and gain access to the expertise needed to take their work to the next level.

By connecting related information across dispersed applications and repositories, employees can leverage 360-degree views and have more confidence they are getting holistic information about the topic they are interested in, whether it be a specific customer, a service that is provided, a sales opportunity or any other business entity critical to driving the business.

Leverage natural language processing

A key to connecting information is natural language processing (NLP), which performs essential functions, including automated language detection and lexical analysis for part-of-speech tagging and compound word detection.

NLP also provides the ability to automatically extract dozens of entity types, including concepts and named entities such as people, places and companies. It also enables text-mining agents, integrated into the indexing engine, that detect regular expressions and complex “shapes” that describe the likely meaning of specific terms and phrases and then normalize them for use across the enterprise.
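To make the idea of detecting a “shape” and normalizing it more concrete, here is a small Python sketch that recognizes dates written in two different surface forms and normalizes both to ISO 8601. The patterns, the `extract_dates` function and the choice of dates as the entity type are assumptions made for illustration; a real text-mining agent would cover many more entity types and formats.

```python
import re

# Month names mapped to their numbers, e.g. "March" -> 3.
MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}

# Two "shapes" for the same kind of entity.
ISO_SHAPE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")                        # 2018-04-12
LONG_SHAPE = re.compile(r"\b(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4})\b")  # 3 March 2018


def extract_dates(text):
    """Find dates in either shape and return them as YYYY-MM-DD strings,
    in order of appearance. Calendar validity is not checked in this sketch."""
    found = []
    for match in ISO_SHAPE.finditer(text):
        year, month, day = match.groups()
        found.append((match.start(), f"{year}-{month}-{day}"))
    for match in LONG_SHAPE.finditer(text):
        day, month_name, year = match.groups()
        found.append((match.start(), f"{year}-{MONTHS[month_name]:02d}-{int(day):02d}"))
    return [value for _, value in sorted(found)]
```

Once both surface forms map to the same normalized value, documents that mention the same date differently can be connected in search and filtering.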

Put machine learning to work

Machine learning (ML) is becoming increasingly critical to enhancing and improving search results and relevancy. This is done during ingestion but also constantly in the background as humans interact with the system. The reason ML has become essential in recent years is that it can handle complexity beyond what’s possible with rules.

ML helps organizations become information-driven by analyzing and structuring content to both enrich and extract concepts such as entities and relationships. It can modify results through usage, incorporating human behavior into the calculation of relevance. And it can provide recommendations based on what is in the content (content-based) and on users’ interactions (collaborative filtering).
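The difference between the two recommendation styles can be shown with a toy Python sketch. The content-based function compares documents by the words they contain, while the collaborative function looks only at which users opened which documents. All function names and data structures here are hypothetical, and the similarity measures are deliberately simple stand-ins for what a trained model would do.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def content_based(target_id, docs, top_n=1):
    """Recommend documents whose text is most similar to the target document.

    docs maps doc_id -> text.
    """
    vectors = {doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()}
    scores = {
        doc_id: cosine(vectors[target_id], vec)
        for doc_id, vec in vectors.items()
        if doc_id != target_id
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


def collaborative(user, interactions, top_n=1):
    """Recommend documents opened by users with overlapping history.

    interactions maps user -> set of doc_ids that user opened. Each unseen
    document gets votes weighted by how much the other user's history
    overlaps with this user's history.
    """
    seen = interactions[user]
    votes = Counter()
    for other, opened in interactions.items():
        if other == user:
            continue
        overlap = len(seen & opened)
        if overlap == 0:
            continue
        for doc_id in opened - seen:
            votes[doc_id] += overlap
    return [doc_id for doc_id, _ in votes.most_common(top_n)]
```

Content-based recommendations work even for a brand-new document that nobody has opened yet, whereas collaborative filtering can surface documents that share no vocabulary with what the user has read but are valued by similar colleagues.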

Taking these steps will help organizations become information-driven by connecting people with the relevant information, knowledge, expertise and insights necessary to ensure positive business outcomes.

 

Data Doesn’t Drive Finance

Your people need information, not data. On average, they waste a day a week searching across silos, systems, and clouds for information. It’s pre-digital-age work. Learn how AI-Powered Search gives your employees the information and intelligence they need.

View the white paper

Keeping Secrets Secret: How to Industrialize Information Privacy

Banks run on trust. At the core of trust is protecting the privacy of client information. Clients expect it. Regulators require it. Though a challenge for any financial institution, this challenge amplifies at complex global banks. Traditional approaches rely on human skill and craft, rather than on software. This means the average information privacy process isn’t industrialized or providing systematic assurance that it’s working.

Click here to download the solution white paper to learn how one of the world’s top 20 banks addressed this challenge.

Sinequa 2018 Roundup… 2019 Here We Come!

Sometimes it helps to look at an entire year to gauge just how far you’ve come in a relatively short period of time. Sinequa experienced some very positive developments in 2018 that are worth highlighting. Our software platform evolved on several fronts to help us accelerate our mission to power the information-driven economy. In parallel, our customers demonstrated what the platform can do, even when stretched in creative and unexpected ways.

On the Technology Front

The Sinequa platform evolved with some very useful and powerful new capabilities in 2018.

Content-related Capabilities

Many of the new capabilities improved on the platform’s ability to integrate with even more enterprise applications and content formats, including:

  • New connectors to support the goal of ubiquitous connectivity across the enterprise. Among these were connectors to Atlassian products to incorporate information from software development projects, including source code files. Also addressed were new versions of popular repositories like SiteCore (a leading web content management platform according to Gartner), along with the likes of Azure storage, AODocs, Beezy, and Teamcenter.
  • New converters to index more formats, such as OCR on PDFs and images, AutoCAD and Windchill files, and Visio, along with improvements to PowerPoint conversion and a dedicated converter for source code files
  • Tighter integration with Salesforce.com
  • In a year marked by reports of major data privacy breaches, the Sinequa platform continued to strengthen its encryption support to better serve customers operating in highly secure environments. Additions include in-flight encryption between all components in a distributed deployment and encryption at indexing time to secure the document cache, which contains elements like HTML previews and thumbnails.

Further Automation for the Interpretation of Meaning

The platform’s ability to interpret the meaning of content also evolved in 2018.

  • Query Intent: It is now possible to configure rules to be applied on queries to change the behavior of the underlying search process. This new query intent capability analyzes the query to detect certain words and entities and triggers actions based on the specified rules and classifications. New default entities were also introduced in the platform in 2018 that can be leveraged by the query intent capability and for enrichment during indexing.
  • Enhanced Linguistics: Several language-specific improvements were added to the platform to help automate the interpretation of meaning. These included enhanced linguistic processing for compound words in French, improved lexical disambiguation in English and enhanced detection of ordinal numbers for Danish and Swedish.
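The query intent idea described above, rules that inspect a query and trigger actions that change how the search runs, can be sketched in a few lines of Python. This is a generic illustration only: the `IntentRule` class, the rule names, the patterns and the action parameters such as `tab` and `sort` are all hypothetical and are not Sinequa's actual configuration format or API.

```python
import re
from dataclasses import dataclass


@dataclass
class IntentRule:
    name: str
    pattern: re.Pattern  # words or phrases that signal the intent
    action: dict         # search parameters to apply when the rule fires


# Hypothetical rules for illustration.
RULES = [
    IntentRule(
        "people_lookup",
        re.compile(r"\b(who is|expert in|contact for)\b", re.IGNORECASE),
        {"tab": "people"},
    ),
    IntentRule(
        "recent_only",
        re.compile(r"\b(latest|recent|newest)\b", re.IGNORECASE),
        {"sort": "date_desc"},
    ),
]


def apply_intent_rules(query, rules=RULES):
    """Return search parameters for the query, modified by every matching rule."""
    params = {"query": query}
    for rule in rules:
        if rule.pattern.search(query):
            params.update(rule.action)
    return params
```

A query such as "latest expert in turbine coatings" would fire both rules in this sketch, while a plain keyword query passes through unchanged. A real implementation would also match on detected entities, not just on literal words.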

Improvements in Machine Learning

The year 2018 brought several significant improvements in the Sinequa platform’s ability to leverage machine learning, including:

  • The platform evolved to embed Online Machine Learning, applying machine learning models based on Spark or TensorFlow directly in the indexing pipeline. This represents the first of many new components that can serve machine learning models in real time. Deep learning is also used during indexing to detect new entities or concepts. These are immediately fed into machine learning algorithms, for example in the classification of incoming documents.
  • Packaged with the platform is a new unsupervised Deep Learning application for text analysis that detects the key words, key phrases, and key sentences of a document.
  • The platform now supports the Spark 2.3 implementation.
  • Packaged integration with third-party Spark distribution providers, e.g. AWS EMR and Hortonworks
  • Battle testing of supervised classification algorithms: Sinequa reached a threshold training set size of over 10M documents
  • First machine learning customers are now in production
  • Packaging of hierarchical classification
  • Ongoing transition to the Software 2.0 paradigm, where software is effectively “trained” rather than manually programmed, by packaging the model lifecycle and the model feedback from search-based applications into the Sinequa platform

Presentation Enhancements for End Users and Admins

Sinequa invested significantly during 2018 to evolve the way the platform presents insights to end users as well as status information and optional settings to administrators. Here are a couple of the most significant developments:

  • A very exciting development from 2018 involved a complete overhaul of the user interface framework to a responsive design based on Angular 7. This will not only ensure optimal flexibility and performance for end users on all kinds of devices, but will open up Sinequa application development to a much wider audience.
  • On the Admin front, components have been reshaped to offer administrators of the platform more functionality and a better user experience for their work behind the scenes.

On the Customer Front

There were a few compelling themes driven by our customer base in 2018, each of which was rewarding in its own way.

Customer satisfaction and retention is a predominant theme for Sinequa. We are extremely pleased by the sheer number of existing satisfied customers that came back to us in 2018 with additional use cases to accelerate their information-driven journey. For instance, business drivers related to governance, risk and compliance with the advent of GDPR and related regulatory demands spurred a lot of activity this past year.

We also had a significant number of customers who experienced that “light bulb moment”, which often occurs when they realize their existing return on Sinequa investment could be significantly amplified by extending the use of the platform with information-driven applications in other parts of the business – e.g. areas like customer service, R&D, supply chain, and other knowledge-intensive arenas.

We even had a few long-time customers take a pause to re-evaluate their vendor choice and, without exception, decide to double down on their commitment to Sinequa for years to come.

Of course, the disappearance of the Google Search Appliance brought some new customers into the fold, most of them fiercely determined to go beyond their previous use of a dying application and truly become information-driven.

Possibly the single most exciting development for Sinequa in 2018 was the surge in machine learning projects, which contributed significant business value back to the respective organizations, especially in the Financial Services industry. As the underlying technology matures, we see a steady trend for machine learning projects going from research to production stages. Some of the projects from 2018 focused on applying machine learning models to automate the curation of enterprise content and improve relevance. For example, one customer demonstrated how trained machine learning models could be used to make the enrichment of their enterprise corpus more efficient. It turns out that by proactively identifying what content qualifies as “scientific”, both time and money can be saved by preventing non-scientific content from even being considered for scientific enrichment during ingestion. Another customer took a completely different tack, using machine learning to automatically reproduce confidentiality policies to classify large volumes of banking documents with measurably higher quality at a fraction of the cost they would have spent to do it manually or even with a more traditional rules-based approach.

Now it’s on to 2019!

As we turn the corner into 2019, we are grateful for both the accomplishments of our R&D team and for all of our partners and customers, especially those who provide the challenges, creativity and critical feedback necessary for Sinequa to continue providing the leading platform for information-driven applications and solutions.

We wish you all the best and look forward to serving all of you in 2019 and beyond.

How organizations can evolve from data-driven to information-driven

This article was originally published on Information Management.

Over the last several years, data analytics has become a driving force for organizations wanting to make informed decisions about their businesses and their customers.
With further advancements in open source analytic tools, faster storage and database performance and the advent of sensors and IoT, IDC predicts the big data analytics market is on track to become a $200 billion industry by the end of this decade.