The usefulness of data storage derives from the ability to retrieve, on command, a relevant subset of the data. So, as data sets become exponentially more abundant, how does this affect the outlook for search technologies?
The term Big Data came to prominence in much the same way as the information it describes: the digital world was quickly saturated with a flood of interest in how to manage Big Data complexity. The label is relative, and loosely describes data that are too voluminous, too varied in type, too rapidly changing or too inconsistent in quality to handle with conventional applications.
Since the 1940s there have been references to an “information explosion”, in which captured data sets far exceed the resource budgets of the processing and analysis tools at hand. The 3Vs used to define Big Data (volume, velocity and variety) were first set out in a 2001 report by the Meta Group. In the last 5 years, the buzz around Big Data has come to dominate discussion among large data generators, such as research, the public sector and enterprise, and for good reason.
Ever more sensing tools are in development, with the goal of capturing every last drop of information from any given situation. Processing technologies have made huge leaps forward; however, with the persistent acceleration of information-sensing technologies and information resolution, the explosion very much remains in motion.
The falling price of storage and the abundance of varied and detailed data input have contributed to large amounts of far less refined data being piled into disorganised storage, in the knowledge that somewhere therein lies the potential for value. Extracting that potential relies on current or future technology to retrieve the relevant aspects of the data in a meaningful way. This is the changing face of search technology.
In enterprises, executives are beginning to recognise efficient retrieval of information as paramount. High-profile takeovers of enterprise search companies by the IT giants IBM, Oracle and HP reflect this sentiment. To Google’s credit, it has succeeded both in ensuring that nearly all users can now use a search engine intuitively and in setting a high expectation for search functionality. Enterprise search is known for being more troublesome by comparison, and users accustomed to web search are quick to tire of irrelevant results.
Applying web search methods to enterprise search generally produces a poorly functioning enterprise search. With a large variety of data and numerous types of file storage system with variable security, a web crawler will only be able to index a small, narrow fraction of the data. A company wishes to extract, on demand, as much meaningful information as possible from the full extent of its available information. For an enterprise search to return relevant results to a diverse set of queries in a short time, and to structure those queries so that they are relevant across a diverse set of data types, it is necessary to look elsewhere.
The volume aspect of Big Data makes a time-efficient enterprise search tool mandatory. Here, distributed parallel processing is necessary to minimise processing time. Batch processing in clusters leaves a gap that needs to be filled by real-time systems. As regards Big Data velocity, data that not only changes rapidly but can also carry wide-reaching knock-on effects is pushing up the real-time indexing requirements of enterprise search.
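To make the contrast between batch and real-time indexing concrete, here is a minimal sketch of an inverted index that is updated one document at a time, so that a change is searchable as soon as it is written rather than after the next batch rebuild. The class name, method names and whitespace tokeniser are illustrative assumptions, not taken from any particular product, and a real system would add persistence, ranking and distribution across machines.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


class IncrementalIndex:
    """Hypothetical in-memory inverted index with per-document updates."""

    def __init__(self) -> None:
        # term -> ids of the documents that contain it
        self._postings: Dict[str, Set[str]] = defaultdict(set)
        # document id -> terms currently indexed for that document
        self._doc_terms: Dict[str, Set[str]] = {}

    @staticmethod
    def _tokenise(text: str) -> Set[str]:
        # Assumption: lower-cased whitespace splitting is good enough here.
        return set(text.lower().split())

    def delete(self, doc_id: str) -> None:
        # Remove the document from every posting list it appears in.
        for term in self._doc_terms.pop(doc_id, set()):
            self._postings[term].discard(doc_id)
            if not self._postings[term]:
                del self._postings[term]

    def upsert(self, doc_id: str, text: str) -> None:
        # Drop any stale version first, then index the new text immediately.
        self.delete(doc_id)
        terms = self._tokenise(text)
        self._doc_terms[doc_id] = terms
        for term in terms:
            self._postings[term].add(doc_id)

    def search(self, query: str) -> Set[str]:
        # Return the ids of documents containing every query term.
        result: Optional[Set[str]] = None
        for term in self._tokenise(query):
            docs = self._postings.get(term, set())
            result = set(docs) if result is None else result & docs
        return result if result is not None else set()
```

Calling `upsert` again with the same document id replaces the old version in place, which is the property a batch-only pipeline lacks between runs.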
The need for enterprise search to deal with variety is driving development in Unified Information Access, a closer integration between structured and unstructured information processing in a search. Similarly, federated search allows a single enterprise search query to be passed to different types of data storage. Developments in query analysis, such as research in natural language processing, are geared towards refining the generated queries so that they return the relevant hits across systems. A key issue in integrating varying data types is finding a suitable way to structure the enterprise search output, so that the user can make sense of it effectively. This alone is no trivial task.
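The federated search idea described above can be sketched in a few lines: one query is sent to several back ends in parallel, and the hits are merged into a single list. Everything named here is an assumption made for illustration. The two back-end functions are toy stand-ins for a document store and a database, and scaling each back end's scores by its own top score is only one simple way of making results comparable, not an established standard.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str   # which back end produced the hit
    title: str
    score: float  # relevance on that back end's own scale


# Hypothetical back ends with hard-coded data and incompatible score scales.
def search_documents(query: str) -> List[Hit]:
    docs = {"Q3 sales report": 8.0, "Sales onboarding guide": 4.0, "Holiday rota": 1.0}
    return [Hit("documents", t, s) for t, s in docs.items() if query.lower() in t.lower()]


def search_database(query: str) -> List[Hit]:
    rows = {"sales_2023 table": 0.9, "hr_contacts table": 0.2}
    return [Hit("database", t, s) for t, s in rows.items() if query.lower() in t.lower()]


BACKENDS: Dict[str, Callable[[str], List[Hit]]] = {
    "documents": search_documents,
    "database": search_database,
}


def normalise(hits: List[Hit]) -> List[Hit]:
    # Scale scores so that the best hit from each back end scores 1.0.
    top = max((h.score for h in hits), default=0.0)
    if top <= 0:
        return hits
    return [Hit(h.source, h.title, h.score / top) for h in hits]


def federated_search(query: str, backends: Dict[str, Callable[[str], List[Hit]]]) -> List[Hit]:
    # Send the same query to every back end in parallel, then merge the hits.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fn, query) for fn in backends.values()]
        merged: List[Hit] = []
        for future in futures:
            merged.extend(normalise(future.result()))
    # Highest score first; ties are broken by source, then title.
    return sorted(merged, key=lambda h: (-h.score, h.source, h.title))
```

Calling `federated_search("sales", BACKENDS)` returns one ranked list in which every hit still carries its `source`, which is one small step towards presenting mixed results in a form the user can make sense of.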
As things stand, enterprise search is a requirement for any company with large stores of data, and in any scenario that relies heavily on knowledge workers’ ability to retrieve information. For small companies, an effective enterprise search is a useful aid to productivity, one that will become essential with growth and key to keeping in step with modern business. Big Data poses some difficult challenges for enterprise search but, more excitingly, opens new possibilities for sophisticated analysis and for unlocking those stores of data for more powerful information usage.