Making sense of online textual information and information management technologies
   
 
Intelligent Information Extraction and Discovery through RSS
June 15, 2003

RSS has established itself as a strong alternative to spidering the Internet. Now Blogstreet, an Indian company cashing in on the popularity of blogs, has introduced a beta version of its RSS Generator, which lets you generate an RSS feed for a blog. K-Praxis thinks that although Blogstreet's efforts are limited to blogs, "superimposing" an RSS feed onto an information source could lead to new ways of extracting and discovering information from various structured and semi-structured sources.

RSS: Beyond the metaphor of search

Think of information retrieval and we are still trapped in the metaphor of search. Search assumes that a pool of information is available out there and that the best way to get at it is to "scoop" out the information by sending in a search query. RSS (Really Simple Syndication, Rich Site Summary, or RDF Site Summary; take your pick!), on the other hand, can send site updates to users virtually instantaneously. News syndication sites such as Moreover, as well as blogs, use RSS to push feeds to users. This helps users, who are automatically notified of site updates, and allows syndication sites to integrate seamlessly with other information access systems.
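To make the contrast with search concrete, here is a minimal sketch of how a reader program might notice site updates through a feed. In practice the "push" is usually implemented by the reader re-fetching the feed periodically and comparing it with what it has already seen. The sample feed, its URLs, and the helper names `parse_items` and `new_items` are all made up for illustration; the feed is inlined so the sketch runs without network access.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed (RSS 2.0), inlined for illustration only.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <description>A made-up feed for illustration</description>
    <item>
      <title>Second post</title>
      <link>http://example.com/2003/06/second</link>
    </item>
    <item>
      <title>First post</title>
      <link>http://example.com/2003/06/first</link>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        items.append((title, link))
    return items


def new_items(feed_xml, seen_links):
    """Return items whose link is not yet in seen_links, then mark them as seen."""
    fresh = [(title, link) for title, link in parse_items(feed_xml)
             if link not in seen_links]
    seen_links.update(link for _, link in fresh)
    return fresh


if __name__ == "__main__":
    seen = set()
    # First check: every item is new to this reader.
    print(new_items(SAMPLE_FEED, seen))
    # Second check of the unchanged feed: nothing new to report.
    print(new_items(SAMPLE_FEED, seen))
```

The user never formulates a query here: the reader only remembers which links it has already shown and reports whatever else appears in the feed.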




RSS, in a way, adds a new dimension to information access, in which users are automatically sent the information they want. RSS also provides an alternative to crawler-based spidering of the web: a number of blog search engines (Feedster, RSSSearch, DayPop, etc.) use RSS feeds to search through the data those feeds provide.
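As a rough illustration of feed-based search, the sketch below builds a tiny inverted index over feed items instead of crawled pages. It is not a description of how Feedster, RSSSearch, or DayPop actually work; the items, URLs, and function names are hypothetical, and the matching rule (every query word must appear in the item) is an assumption chosen for simplicity.

```python
import re
from collections import defaultdict

# Hypothetical feed items as (title, link, description) tuples.
ITEMS = [
    ("RSS generator released", "http://example.com/a",
     "A tool that builds RSS feeds for blogs"),
    ("Crawling the web", "http://example.com/b",
     "Why spiders struggle with dynamic pages"),
    ("Feeds and search", "http://example.com/c",
     "Searching blogs through their RSS feeds"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(items):
    """Map each word to the set of item links whose title or description contains it."""
    index = defaultdict(set)
    for title, link, description in items:
        for word in tokenize(title + " " + description):
            index[word].add(link)
    return index


def search(index, query):
    """Return the sorted links of items that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)


if __name__ == "__main__":
    index = build_index(ITEMS)
    print(search(index, "rss feeds"))
```

Because the feed already delivers titles, links, and summaries in a structured form, the indexer does not need to fetch pages or strip layout markup first.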

It is important to note that although an RSS-based search engine is definitely a more efficient way to access information than crawling, it still sits well within the search metaphor. This is what makes Blogstreet's RSS Generator an interesting new beginning. Until now, RSS generation was built into some of the mainstream blogging software (Userland, Blogger, Movable Type), but the ability to "superimpose" an RSS feed takes the idea much further, close to a paradigm shift. For now, Blogstreet's RSS Generator works only with blogs and assumes that the web page it is processing contains permanent links.
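The idea of "superimposing" a feed on a page that does not publish one can be sketched as follows. This is not Blogstreet's implementation, which the article does not describe; it is a minimal illustration that assumes each post's permanent link is an anchor carrying a `permalink` class. That class name, the sample page, and the names `PermalinkParser` and `page_to_rss` are all hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical blog page; the "permalink" class is an assumed convention.
SAMPLE_PAGE = """
<html><body>
  <h1>Example Blog</h1>
  <div class="post"><a class="permalink" href="http://example.com/2003/06/second">Second post</a></div>
  <div class="post"><a class="permalink" href="http://example.com/2003/06/first">First post</a></div>
  <a href="http://example.com/about">About</a>
</body></html>
"""


class PermalinkParser(HTMLParser):
    """Collect (title, href) pairs from anchors marked as permalinks."""

    def __init__(self):
        super().__init__()
        self.links = []      # collected (title, href) pairs
        self._href = None    # href of the permalink anchor currently open
        self._text = []      # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "permalink" in classes and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def page_to_rss(html, channel_title, channel_link):
    """Build an RSS 2.0 document with one <item> per permalink found in the page."""
    parser = PermalinkParser()
    parser.feed(html)
    parser.close()

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = "Feed generated from " + channel_link
    for title, href in parser.links:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = href
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(page_to_rss(SAMPLE_PAGE, "Example Blog", "http://example.com/"))
```

The sketch also shows why permanent links matter: without a stable URL per post, the generator has nothing reliable to use as an item's link, and a reader cannot tell new items from old ones.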

RSS: Proactive Information Extraction and Discovery

K-Praxis thinks that RSS could be used to extract dynamic information from a variety of information sources, facilitating effortless real-time information feeds and intelligent information analytics. This comes very close to the idea of the Semantic Web, in which future web pages will be based on semantically rich XML and will be able to interact with users based on an understanding of their needs. But instead of waiting for webmasters to get there first, an intelligent RSS feed generator should be able to "superimpose" an RSS feed onto any information source, whether internal or external. The potential for this type of information extraction is enormous, and it could take away the huge problems one faces with crawler-based extraction and discovery.