Dormant and untapped within every law firm resides a broad range of technology tools that are begging to be exploited. These assets have been licensed at substantial cost to the firm and, too often, their capabilities are used to only a small extent. High on this list of under-utilised assets is the enterprise search tool.

Enterprise search has graduated beyond simply matching queries to sections of document text – these tools are about finding answers to business questions.

Enterprise search tools help the business to combine disparate systems (exceptionally useful during a merger), extract concepts and entities from the text, cluster related documents and recommend complementary resources. In addition to this, search engines can correct some of the users' human frailties – for example correcting spelling mistakes, suggesting alternative queries and focussing attention on relevant results. Enterprise search tools can provide a consolidated access point into structured information (such as a firm's practice management system), semi-structured information (such as document management or client relationship management [CRM] systems) as well as traditional unstructured information (file systems and websites), making it a 'portal' into content stored across the firm.

All this and you can still match queries to sections of document text.

Adding the term 'enterprise' as a precursor to 'search' not only implies an increase in the scale of the solution, it can also define different requirements for the firm.

If the law firm has internally focused business needs – such as finding documents across its document and knowledge management systems – then the requirements for the search engine are to index and manage the retrieval of the different document formats and structures to be found within the firm. Such an internally focused search implementation has a high degree of control over both content and search results since the system created all of the search indices being queried.

If the firm has a wider view of enterprise content – incorporating all of the content that a lawyer within the firm may need to research, whether from within or outside of the firm – then the search engine needs to create its own index of 'internal' content, as well as work with the indices of external/third party search tools, managing the retrieval and organisation of search results from across all of these systems. Since the enterprise search tool has to work with search indices it did not create, which have a structure that will not match its self-built indices, this is a more complex requirement that enterprise search tools aim to address.

The difficulty of the 'enterprise lens' – sending or 'federating' queries across different search engines – is not distributing the search, but intelligently managing and displaying the results of that search, providing consistency and a holistic view to the searcher.

The Google factor

In the early days of web searching, success was measured by the number of 'hits' a search engine returned. When a public search engine increased its number of hits following an example query from 500,000 to 750,000 the web search tools proclaimed success – while knowledge managers groaned and saw a problem getting worse. The success of Google – the most popular web search engine – has been its simplicity, its speed and its ability to rank results in a way that fits the needs of those searching public information, bringing the most relevant content to the top of the results list. The Google factor has set the expectations of lawyers at a high level, albeit against a set of requirements that do not match those of lawyers.

The public web content search tools, such as Google, solve a different problem. These search tools do not concern themselves with the risk of omission of entries in their result list – no-one is going to sue them for a legal opinion based on the results of a Google search. Public search engines do not concern themselves with the need to secure sections of content from search results. In addition, the vast infrastructure behind a public search site far exceeds the budget allowance of any law firm. Expecting a full 'Google experience' from your in-house enterprise search tool is not realistic – although lessons should be learned regarding the ease and intuitive nature of the users' interaction with the search tool.

When researching, lawyers typically do not want a simple list of search results, rather they want to see their search results in 'context' – organised in hierarchical clusters around practice group structures or legal processes i.e. the firm's 'taxonomy' structure. Law firms need to be explicit about how content is organised into these clusters (after all, they may be sued on the advice given based on their legal research) and they need to ensure that only users with the correct rights can see secure content.

Organising content into taxonomy structures requires the creation of 'business rules' that govern how and why a particular item of content appears in a node of the taxonomy. These business rules act on information gathered and stored in the search index – enabling a common set of rules to be applied against all content indexed.

A problem for 'enterprise' search tools that aim to federate queries to multiple search systems and then meaningfully organise the results is that the indices they are interrogating will have different structures, making it difficult to apply a set of business rules consistently. Effectively, when trying to organise the results of this federated search, the enterprise search tool finds that each federated result set returned speaks a different language.

There are three potential methods of working around this problem:

Re-index everything. This is a 'brute force' solution that, while producing ideal results, is largely impractical. After all, who would want to recreate the Google or LexisNexis search index?

Build translators. Rather than owning the indexing process, the enterprise can accept that the system it is federating its query to is in a better position to maintain its search index. By mapping metadata and taxonomy terms, the third party system's search results can be mapped onto the firm's taxonomy structure. This will need to be done on a system by system and 'best fit' basis.

Categorise 'on the fly'. Many search tools return some content-relevant metadata as part of the search result. This data could be passed through the enterprise search engine's categorisation module prior to being displayed to the user, where the firm's business rules can be applied to the content. The risk with this approach is that the level of metadata passed to the enterprise search tool is insufficient to categorise the content properly, leading to inaccurate results.
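The 'build translators' method can be illustrated with a minimal Python sketch. It assumes a third-party system returns a 'subject' field with each result; that field name, the mapping table and the taxonomy node names are all hypothetical, invented purely for illustration. Terms with no mapping fall into a catch-all node so that they can be reviewed rather than silently lost.

```python
# Hypothetical mapping from one third-party system's subject terms to the
# firm's own taxonomy nodes. Every term and node name here is invented.
VENDOR_TO_FIRM_TAXONOMY = {
    "mergers & acquisitions": "Corporate/M&A",
    "employment law": "Employment",
    "ip - patents": "Intellectual Property/Patents",
}

# Assumed catch-all node for terms the translator does not recognise.
UNMAPPED_NODE = "Uncategorised"


def translate_result(result: dict) -> dict:
    """Return a copy of a federated result with a firm taxonomy node added."""
    vendor_term = result.get("subject", "").strip().lower()
    node = VENDOR_TO_FIRM_TAXONOMY.get(vendor_term, UNMAPPED_NODE)
    return {**result, "taxonomy_node": node}


# Example result set, as it might come back from the external system.
results = [
    {"title": "Share purchase checklist", "subject": "Mergers & Acquisitions"},
    {"title": "Unfair dismissal update", "subject": "Employment Law"},
    {"title": "Shipping bulletin", "subject": "Maritime"},
]
translated = [translate_result(r) for r in results]
for r in translated:
    print(r["taxonomy_node"], "<-", r["title"])
```

A separate mapping table would be needed for each federated system, which is why this is a 'best fit' exercise rather than an exact one.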

Whichever method is used, lawyers using the enterprise search tool to do research will want to browse, search and intersect these categories to find the information that they seek. The lawyer is effectively layering one search filter over another, creating what is called a 'parametric' query, the result of which is to be found in the intersection of all of the filters, as shown in the Venn diagrams (see page 40).
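A parametric query of this kind can be thought of as a set intersection. The short Python sketch below is illustrative only: the documents, field names and values are hypothetical, and a real search engine would do this work against its index rather than an in-memory list.

```python
# Hypothetical document metadata; field names and values are invented.
DOCUMENTS = [
    {"id": 1, "practice_group": "Corporate", "jurisdiction": "England", "doc_type": "Precedent"},
    {"id": 2, "practice_group": "Corporate", "jurisdiction": "Scotland", "doc_type": "Precedent"},
    {"id": 3, "practice_group": "Employment", "jurisdiction": "England", "doc_type": "Advice note"},
    {"id": 4, "practice_group": "Corporate", "jurisdiction": "England", "doc_type": "Advice note"},
]


def matching_ids(field: str, value: str) -> set:
    """Return the ids of all documents whose metadata field equals value."""
    return {d["id"] for d in DOCUMENTS if d[field] == value}


def parametric_query(filters: dict) -> set:
    """Layer each filter over the last; the answer is the intersection."""
    result = {d["id"] for d in DOCUMENTS}
    for field, value in filters.items():
        result &= matching_ids(field, value)
    return result


hits = parametric_query({"practice_group": "Corporate", "jurisdiction": "England"})
print(sorted(hits))  # [1, 4]
```

Adding a third filter, such as a document type of 'Precedent', narrows the intersection further to document 1 alone.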

The prioritisation rules for search results within a law firm will differ from those of public tools like Google. For example, a law firm may wish to highlight content from specific sources – such as the internal know-how database – by 'cooking' the query i.e. adding additional criteria to a search to promote content meeting those criteria. For example, know-how content may be given a 100% rating, whilst document management system content is given only a 60% rating; when query results are ranked, know-how content is boosted relative to document management content in the rankings.
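One simple way to picture this boosting is to multiply each result's raw relevance score by a weight for its source. The Python sketch below uses the 100% and 60% ratings from the example above; the scoring formula, the default weight and the sample results are assumptions made for illustration, and real products may combine source ratings with relevance quite differently.

```python
# Source ratings from the example in the text: know-how 100%, DMS 60%.
SOURCE_WEIGHTS = {"know-how": 1.0, "dms": 0.6}

# Assumed weight for any source without an explicit rating.
DEFAULT_WEIGHT = 0.5


def boosted_score(result: dict) -> float:
    """Scale the engine's raw relevance (0 to 1) by the source weight."""
    weight = SOURCE_WEIGHTS.get(result["source"], DEFAULT_WEIGHT)
    return result["relevance"] * weight


def rank(results: list) -> list:
    """Return results ordered from highest to lowest boosted score."""
    return sorted(results, key=boosted_score, reverse=True)


# Hypothetical result set with raw relevance scores.
results = [
    {"title": "Client engagement letter", "source": "dms", "relevance": 0.90},
    {"title": "Drafting note: warranties", "source": "know-how", "relevance": 0.70},
    {"title": "Old matter memo", "source": "dms", "relevance": 0.50},
]
ranked = rank(results)
for r in ranked:
    print(r["title"])
```

Here the know-how drafting note overtakes the document management result even though its raw relevance is lower, which is the intended effect of the weighting.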

Relevance through metadata

While the Googles of the public content search world provide good results for the web browser, the legal researcher requires 'provable' accuracy and completeness from their search results. When researching, lawyers have been trained to follow a series of thought processes – involving legal subjects, practice group structures, jurisdictions, etc. This contextual information is stored in content metadata. Obtaining quality metadata has traditionally been a resource-intensive process of using professional support lawyers (PSLs) to tag content – not the most fulfilling or cost-effective use of these professionals' time.

Enterprise search tool marketing materials propose an automated replacement to this human resource requirement, although in reality what can effectively be achieved is not a replacement of PSLs, but a substantial improvement in the productivity of these PSLs by focusing their attention on content where their professional opinion is required in order to make a decision on how to categorise content. There are two capabilities offered within many enterprise search products that help to provide quality metadata: these are profiling and entity extraction tools.

1. Profiling content inverts the typical search process so that, instead of queries being passed over a static (indexed) set of documents, a new document is passed over a static set of stored queries. The stored queries are the 'business rules' which identify the 'fit' or 'probability' of that document belonging to the stored query's node in the firm's taxonomy structure. The firm may agree that any probability greater than (say) 98% results in that content being automatically assigned to the taxonomy node, any probability from 85% to 98% will route the document to a PSL for their opinion, and any probability below 85% will reject the content from that taxonomy node.

2. Entity extraction offers a technology form of document 'comprehension'. The entity extraction tools are programmed with the semantic and syntactical rules for an 'entity', for example, a person's name, setting out the style and context of how that entity would be used within the document text. Once identified, the identified text can be tagged and highlighted to the enterprise search index and its profiling tool – adding (or removing) weight to the relevance of that text. Examples of entity extraction rules for a person's name would be (i) the words follow 'Mr', 'Miss', 'Ms' or 'Mrs'; and, (ii) capital letter followed by capitalised word (e.g. S. Levene).
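Both capabilities can be sketched in a few lines of Python. The example below is a simplified illustration, not how any particular product works: the routing function applies the 98% and 85% thresholds from the profiling example, and the two regular expressions are rough approximations of name rules (i) and (ii). The function names and the sample sentence are hypothetical.

```python
import re

# Thresholds taken from the profiling example in the text.
AUTO_ASSIGN_ABOVE = 0.98
REVIEW_FROM = 0.85


def route(probability: float) -> str:
    """Decide what happens to a document for one taxonomy node."""
    if probability > AUTO_ASSIGN_ABOVE:
        return "assign"
    if probability >= REVIEW_FROM:
        return "refer to PSL"
    return "reject"


# Rule (i): a title followed by a capitalised word.
TITLE_RULE = re.compile(r"\b(?:Mr|Mrs|Miss|Ms)\.?\s+[A-Z][a-z]+")
# Rule (ii): a single capital initial, a full stop, then a capitalised word.
INITIAL_RULE = re.compile(r"\b[A-Z]\.\s?[A-Z][a-z]+")


def extract_names(text: str) -> list:
    """Return candidate person names in the order they appear in the text."""
    matches = [
        (m.start(), m.group())
        for rule in (TITLE_RULE, INITIAL_RULE)
        for m in rule.finditer(text)
    ]
    return [name for _, name in sorted(matches)]


print(route(0.99), "|", route(0.90), "|", route(0.40))
print(extract_names("The memo from Mrs Patel was reviewed by S. Levene."))
```

Simple patterns like these will miss many names and produce false matches, which is one reason commercial entity extraction tools also take the surrounding context into account.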

Wider exploitation

This article has highlighted a small number of major factors directing the use of enterprise search in aggregating and organising content used by the enterprise. Given that the opportunities offered by enterprise search should not be restricted to the traditional matching of text terms in a set of documents, what other uses can be made of this tool? A few examples would be:

Expertise identification – by using entity extraction tools to identify references to people, along with other metadata (such as author name), and linking into the information stored in practice management systems, CRM systems and matter management systems, the firm can use enterprise search to answer questions such as:

. Who is [colleague]?

. Who has skills in [share purchase acquisitions]?

. Who knows [colleague/client]?

This use of enterprise search can implement the 'small world theory' (of six degrees of separation between any two people) into a powerful relationship management tool – imagine if you could plot the shortest set of relationship links to gain access to a target client.
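Plotting the shortest set of relationship links is a standard graph problem. The Python sketch below uses a breadth-first search over a small, entirely hypothetical relationship graph; in practice the links would have to be derived from the firm's CRM, practice management and matter management data.

```python
from collections import deque

# Hypothetical 'who knows whom' graph; every name here is invented.
RELATIONSHIPS = {
    "Partner A": ["Associate B", "Client X"],
    "Associate B": ["Partner A", "Contact C"],
    "Contact C": ["Associate B", "Target Client"],
    "Client X": ["Partner A"],
    "Target Client": ["Contact C"],
}


def shortest_path(graph: dict, start: str, target: str):
    """Breadth-first search: return the shortest chain of people, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        person = path[-1]
        if person == target:
            return path
        for neighbour in graph.get(person, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no chain of relationships connects the two people


path = shortest_path(RELATIONSHIPS, "Partner A", "Target Client")
print(" -> ".join(path))
```

Because breadth-first search explores the graph one degree of separation at a time, the first chain it finds is guaranteed to be one of the shortest.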

Matter centricity – using the enterprise search index to organise information from diverse information systems around matter specific meta-data allows a matter centric view of the information held across the firm.

E-discovery – a law firm's enterprise search tool licence can be used as a fee generator when applied against discovered client data. Using the automatic taxonomy generation tools incorporated into most enterprise search engines, relationships between discovered documents/e-mails can be identified and visualisation tools can be used to display these relationships graphically for presentation purposes.

Online legal services – as part of their legal service offering, a firm may decide to capitalise on its know-how collections, making them available to clients via direct access or through an email alert function. The enterprise search tool provides capabilities to sift through, categorise and produce alerts for these services. Expert-system-style services use search in an innovative way, matching client questions to 'canned' questions and their pre-constructed answers.

As with many technologies, it is exciting to see new software products. However, before buying new tools it is worth looking at your firm's existing software stable and seeing how you can extract greater value from these tools. Enterprise search is an excellent starting point.

Simon Levene is a senior consultant with Baker Robbins & Company, based in their London office.