Inclusion of Unstructured Data into Business Intelligence Analytics

Today’s large enterprises and administrations accumulate ever bigger volumes of ever more heterogeneous and volatile data. More than 90 percent of this data is unstructured, consisting of texts in many different languages. Information workers spend too much time searching for the information they need for their work, and companies lose business opportunities when employees are drowning in these mounds of data.

Enter Sinequa.

Sinequa’s real-time Big Data Search & Analytics platform helps its customers meet this challenge. But what makes Sinequa different from other big data analytics platforms? See what Sinequa’s Vice President of Marketing, Hans-Josef Jeanrond, says about the value of Sinequa’s platform and how companies can best leverage the software.

Q: What differentiates Sinequa from other big data analysis platforms?

A: There are two basic reasons why Sinequa is different from other platforms. The first is the data sources we deal with: we are not tackling the World Wide Web or the complete Internet of Things. We focus on the enterprise world, with different analytical tools.

The second reason we stand apart from other big data platforms is that we do not choose between structured and unstructured data. We offer combined statistical and semantic analysis of big data and use the structured information to refine linguistic analysis. With Sinequa, the semantic analysis of unstructured data is used to create structured data; one analytic method is used to refine the other. What sets Sinequa apart here is our capability to query an index of up to 200 million documents in real time with sub-second answers, and our ability to deal with semantically related subjects in 19 different languages.

Q: Can you please further explain the importance of semantic analysis?

A: If you limit yourself to statistical data analysis, enterprise data often poses problems in sample size, sample error and sample bias. This is why tools from the Web don’t work well with enterprise data.

Keyword search is not a sufficient option either: the keywords you use in an information request may not occur in many of the documents that are actually relevant for your work. These keywords may not have been added as metadata by the people who classified and stored the documents in your document management system, who may have looked at the documents from a different perspective. You need to find documents dealing with concepts that are semantically related to your request, and that requires semantic analysis.
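To make the keyword problem concrete, here is a minimal toy sketch in Python. It is an illustration only, not a description of how Sinequa's platform works: the documents, the `RELATED` concept map, and the function names are all invented for this example, and a real system would derive term relations through linguistic analysis rather than a hand-written table.

```python
# Toy illustration only: the documents and the concept map are invented,
# and nothing here reflects Sinequa's actual implementation.
DOCS = {
    "memo-1": "Patients reported headaches after taking the new analgesic.",
    "memo-2": "Quarterly sales of the painkiller rose in Europe.",
    "memo-3": "The cafeteria menu changes next week.",
}

# Hypothetical hand-written map of semantically related terms.
RELATED = {
    "painkiller": {"painkiller", "analgesic"},
}


def tokens(text):
    """Lowercase the words of a text and strip trailing punctuation."""
    return {word.strip(".,").lower() for word in text.split()}


def keyword_search(query):
    """Return documents that contain the literal query word."""
    return sorted(doc for doc, text in DOCS.items()
                  if query.lower() in tokens(text))


def concept_search(query):
    """Return documents that contain the query word or a related term."""
    terms = RELATED.get(query.lower(), {query.lower()})
    return sorted(doc for doc, text in DOCS.items()
                  if terms & tokens(text))
```

In this sketch, `keyword_search("painkiller")` finds only `memo-2`, while `concept_search("painkiller")` also finds `memo-1`, which uses the word "analgesic" and never mentions the query term.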

 

As you can see, the Sinequa platform offers functionality that truly differentiates it from competitors. Sinequa combines deep content analytics, including Natural Language Processing, with an extremely scalable IT architecture, offering users simple and secure access to the most relevant information. Stay tuned for our next post, which will explain how a leading bio/pharma company implemented Sinequa to index millions of R&D documents and break down barriers between information silos worldwide.
