Making sense of online textual information and information management technologies
   
 

   
Some Recent Trends in Information Retrieval
December 20, 2002

Over the last month, since the previous News Analysis dispatch, there have been huge developments in the information management, collaboration and retrieval world. I will list some of these developments first and then offer my observations:

  1. Google launched two new features on its already interesting Google Labs site: Google WebQuotes and Google Viewer.
  2. This gives huge ammunition to those of us "Google Freaks" who argue that Google can do everything in information retrieval terms - at least in the Internet information retrieval scene, which can be duplicated in an enterprise. Along with earlier Google Labs features such as Google Glossary and Google Sets, one can really see the potential of what Google can do with its assets - a huge and ever-bulging web document farm.
  3. After opening a new chapter in online, automated news aggregation and delivery with Google News, Google has now added a shopping search engine called Froogle - playing on the idea of frugality just before the holiday season.
  4. The importance of blogs seems to be spreading like wildfire; I could be more newsily trite and say like Australian bush fires :) The latest entrant into what is called Enterprise Blogging is Traction Software, with a tool called TeamPage. Read this article in Line56: Blogging for e-Business
  5. FAST, the biggest competitor to Google search, added new relevancy algorithms to its search, which it claims improve FAST results by 12%. See this item on Internet.com: FAST Fine Tunes Features.

Comments

  • Yes, Google does have great advantages: a) a 3-billion document farm, and b) an atmosphere where innovation is the air one breathes. But that does not mean Google can cater to everyone and everything in the world of information management and retrieval. It is true that even to write this small post I have relied heavily on Google searches, but while I searched, my past experience of the richness and quality of the sources helped me find relevant, quality information on the Internet. A great deal of peering goes on while you search - consciously and subconsciously - and most of us do rely on people pointing us to sources.
  • And when it comes to searching on Google with just keywords, we all know how difficult it is. I think search fortified by an accurate auto-classification and categorization algorithm, a detailed and granular taxonomy, and some kind of peer-review blogging system is the need of the day. Anybody who provides all these capabilities will surely pose a huge challenge to Google.
  • Of course, none of us knows whether Google has technology to read entire pages, beyond its famous PageRank relevancy technology, as some other technologies that use machine learning and statistical natural language processing can do. As an avid searcher with some understanding of these technologies, I don't think they use such a technology. But of course, their recruitment pages do talk of hiring machine learning experts!
  • On the importance of granular taxonomies, it was interesting to see what Google did with a very rudimentary taxonomy on Google News. It was as if Google was saying: taxonomies don't matter so much; what matters is relating pages to pages, the similarity between documents. Of course document similarity matters, but it is easy to create this effect in the world of online news. When you click on "Related" next to a Google News headline, you would surely wonder at the coolness of automation that shows news emanating from the Salt Lake Tribune in Utah to the Taipei Times in Taiwan to the Hindustan Times in India. But if you looked carefully, you would notice that the news might be coming from a few well-known sources: AP or Reuters. Besides, online news, or news articles in general, have a nice format - a well-structured title, an introductory summary, a few paragraphs of core news item and a nice closing byline. Hence Google News does not have to worry about topics, or a taxonomy of topics. Froogle, on the other hand, deals with semi-structured data, which makes it imperative that Google lets you browse both by keywords and by categories. So I guess we have to wait before we write the epitaph for taxonomies.
  • I think blogs are developing an entirely new information ecosystem, which lets us tap the recommendations of others. If collaboration technologies inside enterprises boast of expertise management systems and document referral systems, I have found blogs doing that work for me on the Internet.
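
To make the point about "Related" news stories concrete, here is a minimal sketch of document similarity using plain word counts and cosine similarity. This is only an illustration of the general idea; how Google News actually groups stories is not known to me, and the function names and sample headlines below are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    a, b = term_counts(text_a), term_counts(text_b)
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical headlines, invented for illustration only.
wire_copy_1 = "Bush fires threaten suburbs of Sydney as winds pick up"
wire_copy_2 = "Winds pick up as bush fires threaten Sydney suburbs"
unrelated = "Search engine adds shopping service before the holidays"

print(cosine_similarity(wire_copy_1, wire_copy_2))  # high: near-identical wording
print(cosine_similarity(wire_copy_1, unrelated))    # zero: no shared words
```

Two papers running lightly edited copy from the same wire service share almost all of their words, so even this crude measure pairs them up without any notion of topic. That is why the effect is easy to create for news and much harder for semi-structured product data.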

One note about this site: Google does seem to have some affinity for blogs and bloggy sites; I can say that from my own experience. Just type K-Praxis into the Google search box and the first result you get is THIS SITE. And that happened within 3 days of this website going online.

In true blog-style peer recommendation, somebody else corroborates this too. Read this article in Microcontent: Google Time Bomb

And if you are looking for more scoop on Google, I thought this article in Wired was an interesting read: Google vs. Evil. The article also points to a self-proclaimed Google watchdog site, which has some rather thought-provoking stuff on Google - and Google has not removed this site from its index.

 
Copyright © 2003-2004 K-Praxis. All rights reserved.