Posts Tagged ‘Content indexing’

If you are interested in intelligent Enterprise search, I suggest you listen to Steve Akers being interviewed by BeyeNetworks. It is a long session but well worth the time.

Steve discusses a number of things, including the challenges faced by Enterprises, how Digital Reef solves these problems, and some customer use cases. There were also two slides that I thought were a great overview of the Digital Reef solution:


  • Automatically identify and index all unstructured data
  • Provide tools to find and understand the data:
      • Boolean searches (freeform, fuzzy, metadata, phrase, proximity)
      • Similarity searches using example files
      • Email thread reconstruction
      • Exact and near duplicate identification
      • Pattern expression recognition
  • Organize the data using automatic classification


  • Transform files into common file types
  • Collect and move data
  • Manage data retention policies


Designed for Scale and Security

  • Grid-based, distributed architecture provides performance and resiliency
  • Multi-tenant, role-based security model
  • Easily deployed and maintained
  • Indexes and prepares the full content and metadata of up to 10 TB of data in 24 hours with a standard configuration



The recent announcement of Digital Reef and Microsoft FAST is an important one. At first glance it might be a bit confusing, since both companies provide search capabilities. Digital Reef does provide a search engine, but it also offers capabilities that make its search smarter than other solutions, along with features that live above the search stack.

As a result, Digital Reef can be a completely turnkey solution, or it can work with third-party search and indexing engines to protect the investment that customers have already made. If an organization has already indexed tons and tons of content, why go through the process again? In addition to its own indexing capability, Digital Reef can leverage existing popular indexes (in this case the FAST index) and bring the Digital Reef federation, performance, scalability, archiving and similarity engine to Microsoft FAST and SharePoint customers.

Digital Reef is a search and indexing solution, but it is much more. If it weren’t, why bother building a new company? Digital Reef is Enterprise-class search and an information application with the goal of providing relevant data to its users rapidly and efficiently.


I was an industry analyst for many years. I focused heavily on storage systems and was convinced that search and storage would eventually be like peanut butter and jelly. Although we have not seen this realized yet, I am still convinced that it needs to happen and ultimately will. However, like all things in the data center, it just takes time.

There are practical reasons why storage and search aren’t more tightly bound together. First, search solutions haven’t been scalable or intelligent enough to provide the value that IT professionals are looking for. Second, most search solutions have been associated with specific storage systems and not the entire storage complex. That is very limiting. We need Enterprise search solutions that can access all storage within an organization. Third, storage administrators haven’t figured out why they need it. There are some applications and use cases that are a priority, such as eDiscovery. But storage admins have not found the killer app that gives them that “aha” moment where they just need to have it and are willing to invest time and money.

What is the killer app for search and storage? I believe one killer app is using a universal search application as a tool to give Enterprise end users greater access to the company’s data. We create so much content, using any number of applications. Instead of looking for data via the various application interfaces, having a single pane of glass to get to any and all content in the Enterprise would provide huge increases in productivity and efficiency.

This concept should not be a leap for most people, but since no one is complaining about it or demanding it, it isn’t a priority. However, if you think about the power of being able to easily access content, data and information, we all know that mountains can be moved when this ability is provided. This is where storage admins have to transcend their nuts-and-bolts view of the world and think about the business and how they can apply technology to elevate the companies they work for. It is “right brain” (creative) thinking versus the “left brain” logical and rational thinking that is typically needed in the data center. Only by combining the creative and the logical can real leaps forward be made.


Digital Reef recently came out of stealth mode and is now talking to the press openly about its solution. I spoke to a few trade press editors about Digital Reef and they wanted to know what made Digital Reef uniquely valuable in a market that seems to have a wide range of solutions for customers to choose from. It is not enough for a vendor to be valuable or unique; it has to be both. If competitors offer the same value then the solution may have no real market traction. If a solution is unique but that singular capability offers no real value then customers will not pay for it.

In the world of high-tech there is often confusion because we use the same terms to mean different things and different terms to mean the same thing. For example, vendor XYZ may say that it provides Enterprise-class search and indexing and is able to scale and provide rapid access to content for users. So when Digital Reef states that it is “Enterprise-class”, it is important to distinguish and articulate what makes it uniquely valuable.

The ability to provide Enterprise-class search and indexing requires two very different core competencies. The first requirement is to build a platform (IT infrastructure) that addresses the needs of the Enterprise. Those needs include handling massive amounts of content stored on heterogeneous storage that is most likely geographically dispersed. How do you index all of the existing content, which consists of hundreds of terabytes and perhaps even petabytes, while new data is created continuously? How long will it take the solution to catch up? Days? Weeks? Months? Years? Ever?

Digital Reef has built a scalable system that works like a grid or cluster, enabling you to add more compute resources to tackle this huge challenge. In other words, they have developed sophisticated infrastructure: expandable grid technology that applies massive amounts of compute power in a unified fashion to index mountains of content.
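The "how long to catch up" question can be made concrete with some back-of-the-envelope arithmetic. The sketch below is only an illustration: the function name, the workload numbers and the assumption that indexing throughput grows linearly with the number of nodes are all hypothetical, not figures or behavior published by Digital Reef.

```python
import math


def catch_up_days(backlog_tb, new_tb_per_day, tb_per_node_per_day, nodes):
    """Days until the index is current, or math.inf if it never catches up.

    Assumes (hypothetically) that throughput scales linearly with node count.
    """
    capacity = tb_per_node_per_day * nodes   # total TB indexed per day
    net = capacity - new_tb_per_day          # progress made against the backlog
    if net <= 0:
        return math.inf                      # new data arrives as fast as we index
    return backlog_tb / net


# Hypothetical workload: 500 TB of existing content, 8 TB of new data per day,
# and an assumed 10 TB per day of indexing throughput per node.
for nodes in (1, 2, 4, 8):
    days = catch_up_days(500, 8, 10, nodes)
    print(f"{nodes} node(s): {days:.1f} days")
```

With these made-up numbers a single node needs 250 days to catch up, while eight nodes need about a week. If new data arrives as fast as the system can index it, the answer really is "never", which is why being able to add compute matters.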

The other core competency is to quickly access relevant content. Digital Reef provides this through keyword search and their unique similarity engine, which I discussed in greater depth in my last blog post, The Power of Similarity. Their search capability enables you to get results based on context. Consider the sentence “I’m feeling blue”. It has nothing to do with the actual color, but a pure keyword search would be swimming with content that includes a myriad of references to the color blue: paints, fabrics, the sky, the ocean, etc.
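To make the difference tangible, here is a minimal sketch that contrasts a plain keyword match with ranking by similarity to an example text. It uses simple bag-of-words cosine similarity as a stand-in; Digital Reef's actual similarity algorithms are not described here, and the sample documents and function names are invented for illustration.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_hits(word, docs):
    """Names of all documents that contain the keyword at least once."""
    return [name for name, text in docs.items() if word in tokens(text)]


def cosine(a, b):
    """Cosine similarity between the word-count vectors of two texts."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


# Invented sample documents; each one mentions "blue".
docs = {
    "paint": "The blue paint matches the blue fabric and the sky.",
    "mood": "She was feeling blue and sad, a little down after the news.",
    "ocean": "The ocean looked deep blue under a clear sky.",
}
example = "I'm feeling blue today, sad and down."

# Keyword search: every document matches, whether or not it is about mood.
print(keyword_hits("blue", docs))

# Similarity search: rank documents by closeness to the whole example text.
ranked = sorted(docs, key=lambda name: cosine(example, docs[name]), reverse=True)
print(ranked)
```

The keyword search returns all three documents, while the similarity ranking puts the "mood" document first because it shares the surrounding words (feeling, sad, down) and not just the word "blue". A production engine would need far more than word counts, but the principle of comparing whole contexts is the same.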

Digital Reef excels when looking for abstract concepts, metaphors, idioms, specifics, vertical terminology, and word associations. The magic behind all of this is mathematics: complex, reasoned, considered and sophisticated algorithms.

It is the combination of their scalable clustered architecture and similarity engine that makes Digital Reef uniquely valuable.
