Categorized and Tabbed Indexes: Searching for a Specific Information Type
Discussions about the problems with search engines often revolve around the precision and recall of search results and, more importantly, the ability of search engines to understand what type of information the user is looking for. Traditionally, search engines have chosen either the categorization route (a la Yahoo) or the tabbed route (a la Google) to serve up searches. Categories and tabs help users create context for their searches. The SearchEngineWatch article has, in a way, opened up a few data points from which to understand and hypothesize about the future of categorized or tabbed indexes.
When a user searches the Internet, he/she is looking for a variety of things: news, articles, audio, video, games, images, "web sites", "home pages", etc. Search engines have already started catering to various categories of searches, so it would make sense in the long run to have separate indexes for each of these sources. It will be useful for users to choose directly from "websites", news, discussion forums and weblogs rather than wade through one common index for everything.
Issues with Categorized and Tabbed Indexes
Given that tabbed or categorized indexes are a very natural development of contextualized search and an unavoidable eventuality, we need to find ways to circumvent the user-interface and navigability clutter that these indexes can create. The SearchEngineWatch article has put this in the right context, giving the example of "tab blindness". As long as the tabs are limited in number, users won't complain about them, but as tabs and indexes subsume extended taxonomies, users are bound to become more wary of, and irritated by, having to switch tabs and indexes.
Categorized Indexes: An Alternative
One way out of this problem is for search engines to adopt one of the strategies Google is pursuing. Google's recent launch of its "define" feature is an example: you can now prefix a search with "define:" to get a definition of the term being searched. The define parameter turns your search into a contextualized zone and gets you the desired results, because Google searches only the definition and glossary pages available on the web. To understand what Google is doing, pay close attention to the URLs of the pages from which the definitions are culled, and you will realize that Google is using some source-based filtering technique to limit the search to glossary/definition pages. The "define" innovation marks an entry into contextualized search (Google now does the same with "Phonebook:" as well). Opportunities for such constrained search are unlimited.
It is also important to note that using words like "define" or "news" or "blog" is a much neater way to contextualize searches than tabbed interfaces and indexes. Besides, as search becomes a familiar interface, users will not shy away from a relatively extended set of "define" queries to get what they want.
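To make the idea of operator-based contextualization concrete, here is a minimal sketch of how a query prefix could select a constrained index. The index names and the OPERATOR_INDEXES mapping are assumptions made up for illustration; nothing here describes how Google actually implements "define:" or "Phonebook:".

```python
# Hypothetical mapping from query operators to named indexes (illustrative only).
OPERATOR_INDEXES = {
    "define": "definitions",
    "phonebook": "phonebook",
    "news": "news",
    "blog": "weblogs",
}
DEFAULT_INDEX = "web"


def route_query(raw_query):
    """Split 'operator: terms' into (index_name, terms).

    Queries with no operator, an unknown operator, or an operator with
    nothing after it fall back to the general web index unchanged.
    """
    prefix, sep, rest = raw_query.partition(":")
    operator = prefix.strip().lower()
    if sep and operator in OPERATOR_INDEXES and rest.strip():
        return OPERATOR_INDEXES[operator], rest.strip()
    return DEFAULT_INDEX, raw_query.strip()
```

The user never sees a tab: the same search box serves every index, and the operator alone decides where the query goes.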
Categorized Indexes and Page Ranking Algorithms
Another way to arrive at an easy-to-use interface and hide the tabbed clutter is to look for a solution embedded in the variety of page ranking algorithms used by search engines. Usually we only get to hear of Google PageRank, which, apart from its efficiency, its user-friendliness and its ability to hide the entrails of search internals, is the best known and by far the most sophisticated ranking algorithm that we know of. In information retrieval terms, the PageRank system assumes (to a great extent) that the more linked a web page is, the greater its value. Whatever Google does to normalize this effect, bringing in other aspects such as keywords, relatedness of the content and so forth, the basic system is still PageRank, so the results Google produces tilt towards a theory in which the more "networked" you are, the more popular and trustworthy you are. In the early days of web information retrieval, PageRank served a very important purpose: it gave us one criterion by which to judge web pages and retrieve them.
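For readers who want to see the "more linked means more valuable" assumption in action, the following is a small sketch of the textbook PageRank iteration on a toy link graph. It is the published, simplified formulation, not Google's production system; the damping factor of 0.85 and the fixed iteration count are conventional illustrative choices.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> rank, with ranks summing to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                # Each page passes an equal share of its rank to every page it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

In a graph where three pages all link to one page, that page ends up with the highest rank regardless of what it is about, which is exactly the single-criterion behaviour described above.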
But now these page ranking algorithms could themselves be extended to hide the tabbed interfaces of search and make searches more intuitive. Eventually, as searchers evolve along with search engines, they will be interested not just in the authority of a page as judged by Google PageRank or any other system, but in the relative authority of the page in a more constrained environment, one in which the user's personal choices or a particular field of knowledge, domain or subject category are taken into account. In effect, the user is interested in judging the page in a variety of ways. He/she does not always want Google to calculate the "relative authority" of a page across the whole web, in a manner that is redundant or of no importance to him/her. What he/she is interested in is ranking the page within a constrained search: "if I am interested in baseball, then rank my results only in baseball terms", and so on.
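One known way to compute such a "relative authority" is topic-sensitive PageRank, in which the random jump lands only on pages belonging to the chosen topic, so rank flows outward from that topic instead of from the whole web. The sketch below is offered as one possible approach, not as something any search engine is confirmed to use; the page names and topic labels in the usage note are hypothetical.

```python
def topic_pagerank(links, topic_pages, damping=0.85, iterations=50):
    """PageRank variant whose random jump lands only on topic_pages.

    links maps each page to the list of pages it links to.
    topic_pages is the set of pages known to belong to the user's context.
    """
    topic = set(topic_pages)
    if not topic:
        raise ValueError("topic_pages must not be empty")
    pages = set(links) | topic
    for targets in links.values():
        pages.update(targets)
    # Jump probability: uniform over topic pages, zero everywhere else.
    jump = {p: (1.0 / len(topic) if p in topic else 0.0) for p in pages}
    rank = dict(jump)
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is returned to the topic pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) + damping * dangling
        new_rank = {p: base * jump[p] for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

With a baseball topic set, a general-purpose portal that is popular across the whole web can rank below a baseball history page, because only links reachable from baseball pages carry weight.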
Personalized and contextualized searches could help search engines (to a great extent) solve the problem of tab and navigation clutter on search engines. This could also help in making the search experience more efficient without having to switch tabs. The new breed of page ranking algorithms will help search engines (at any given time) to compute three very important aspects of a web page:
a) its total rank across the web,
b) its content, through content analysis, and
c) its relative rank within a specific context.
This could make searches more meaningful without any apparent clutter or navigability issues for searchers.
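The three aspects listed above could be folded into a single ordering along the following lines. This is only a sketch under stated assumptions: the term-overlap content score is a crude stand-in for real content analysis, the weights are arbitrary, and the global and context ranks are assumed to have been scaled to the 0 to 1 range beforehand.

```python
def content_score(query_terms, page_text):
    """Fraction of query terms that appear as words in the page text."""
    words = set(page_text.lower().split())
    terms = [t.lower() for t in query_terms]
    if not terms:
        return 0.0
    return sum(1 for t in terms if t in words) / len(terms)


def combined_score(global_rank, content, context_rank, weights=(0.2, 0.4, 0.4)):
    """Weighted sum of the three signals; the weights are illustrative guesses."""
    w_global, w_content, w_context = weights
    return w_global * global_rank + w_content * content + w_context * context_rank


def rank_results(query_terms, pages):
    """Sort pages (dicts with url, text, global_rank, context_rank) best first."""
    def score(page):
        content = content_score(query_terms, page["text"])
        return combined_score(page["global_rank"], content, page["context_rank"])
    return sorted(pages, key=score, reverse=True)
```

Because all three signals are computed behind the scenes, the searcher sees one result list and no tabs; changing the context only changes the context_rank values fed into the same function.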
Future of Search: Hidden Taxonomies, Personalization and Contextualization
The future of search is indeed interesting. It will involve hiding the entrails of algorithms even further: no apparent clusters and no apparent category drill-downs (if you have used Internet Yellow Pages, you know what it means to drill down 3-5 levels to find a particular piece of information).
Besides the obvious benefit of hiding the clutter, it could be much more interesting (from a user perspective) to see how search engines cross-link between different categories/indexes of information sources. For instance, it will be much more efficient to be able to go to blogs from a news article and vice versa (Technorati-style).
Applications and Implications of Advances in Search Engines
These advances in search technology will not be limited to Internet search; they could also help shopping sites, Internet Yellow Pages, online community sites, and the list could go on.