Making sense of online textual information and information management technologies
   
 

   
Pre-IPO Google Dossier: An Experiment in Opinion Mining (Updated)
November 25, 2003

The storm of headlines, analysis and speculative rumors kicked off by the recent news about the Google IPO and the New York Times article about Microsoft's proposed offer to buy out Google has now subsided, and the ripple effects of this headline news are settling back to normalcy. We at K-Praxis treated this as an opportunity to try out opinion mining based on text analysis, using a combination of text analysis software, methodologies and text analysis processes-metrics to create what we call a "Pre-IPO Google Dossier". This is an attempt to understand a company, its technology and the buzz it has been able to generate through and across the dot-com boom-and-bust era. More than looking at just the business aspect of the company, this is an attempt to look at the whole phenomenon of Google in a holistic and multi-dimensional manner.

Note that we are republishing this dossier because some users had trouble accessing it earlier.

Google Dossier: A Challenge

It is usually a big challenge to analyze a privately owned company, even one that is not as popular and dynamic as Google has become. There are no regulatory SEC filings, no quarterly analyst calls, and no revenue forecasts. The task is even harder in the case of Google, as the company has come to be seen as an iconic figure that has almost single-handedly elevated web information retrieval to a religious dimension. Because of this iconic status, Google has become a speculative engine. Many commentators cite the proverbial verb "to google" entering the English language as an example of its popularity. Entering the public lexicon is certainly an indication of popularity, but one would require much "harder" facts to predict where this privately owned company is going.

In the absence of harder facts filed with the regulatory authorities, and with very few concrete data points available for any analysis, we at K-Praxis set out to carry out an experiment in opinion mining and text analysis using our proprietary methodologies, software and processes-metrics designed for these types of analysis. To achieve this goal, we decided to use the hundreds of news stories, analyses and opinion pieces spun off by the Google IPO announcements and the NYT article in publications across the Internet.

The complete Pre-IPO Google Dossier: An Experiment in Opinion Mining is available on request. The complete package includes:
  • Opinion Clusters
  • The hypotheses discovered through this text analysis and opinion mining experiment
  • Opinion clusters and hypotheses visualization
  • An executive summary in PowerPoint presentation format
Contact us to get your copy of the report. K-Praxis can provide a new dimension to content leverage through text analysis and text mining. Contact us for custom reports, technologies and consultancy driven by text analysis and actionable intelligence.

The Methodology: Opinion Mining Through Publicly Available Information

To analyze the opinions expressed in online information related to these two events, treating the events as a pivot, we at K-Praxis analyzed text chunks from more than 500 news, opinion and analysis pieces, blog postings and discussion forum threads (ranging from stories in the NYT, The Register, The Economist and SearchEngineWatch to discussion forums like Slashdot and a number of blogs). These opinions and chunks were clustered around the following dimensions:


  1. Google the Brand

  2. Google Search and Search Technology

  3. Problems with Google

  4. Google the Media Agency

  5. Google, Innovation and Web Information Retrieval

  6. Google Competition/Rivals

    • Google and other Search Players

    • Google and eBay/Amazon

    • Google and Yahoo-Overture

    • Google and Microsoft

  7. Google IPO and Google Business

These opinion clusters were cleansed, classified and analyzed to form a few constrained hypotheses. We believe that these hypotheses could be useful for anybody who is interested in tracking Google; this study also opens the door to more studies of this nature in the future. It is important to note that many of these opinions were of a very speculative nature. The methodology was therefore designed so that a constrained analysis could counterbalance some of the speculative opinion expressed in the documents used for this purpose.
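The K-Praxis methodology and software are proprietary and are not described in detail here, so the following is only a minimal sketch of the general idea of grouping text chunks under predefined dimensions. It assumes a simple keyword-overlap rule. The keyword lists, function names and the `min_hits` threshold are all hypothetical, chosen purely for illustration, and are not taken from the dossier.

```python
import re
from collections import defaultdict

# Hypothetical keyword sets for some of the dimensions listed above.
# The keywords are illustrative assumptions, not K-Praxis data.
DIMENSION_KEYWORDS = {
    "Google the Brand": {"brand", "verb", "popularity"},
    "Google Search and Search Technology": {"pagerank", "index", "ranking", "algorithm"},
    "Problems with Google": {"spam", "privacy", "complaint"},
    "Google Competition/Rivals": {"microsoft", "yahoo", "overture", "amazon", "ebay"},
    "Google IPO and Google Business": {"ipo", "revenue", "valuation", "shares"},
}


def tokenize(chunk):
    """Lowercase a text chunk and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", chunk.lower())


def assign_dimensions(chunk, min_hits=1):
    """Return every dimension whose keyword set shares at least
    `min_hits` distinct words with the chunk."""
    tokens = set(tokenize(chunk))
    return [
        dimension
        for dimension, keywords in DIMENSION_KEYWORDS.items()
        if len(tokens & keywords) >= min_hits
    ]


def cluster_chunks(chunks):
    """Group chunks by dimension; one chunk may land in several clusters."""
    clusters = defaultdict(list)
    for chunk in chunks:
        for dimension in assign_dimensions(chunk):
            clusters[dimension].append(chunk)
    return dict(clusters)
```

Calling `cluster_chunks` on a list of sentences returns a dictionary mapping each matched dimension to the chunks assigned to it. A real opinion mining pipeline would need far more than keyword overlap (cleansing, classification and weighting of speculative statements, for instance), which this sketch does not attempt.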

Sales Marketing Intelligence: Is your company looking to buy a Sales or Marketing Intelligence solution? Then it's time you analyzed the solution from a Text Analysis point of view. A report by K-Praxis on Sales and Marketing Intelligence provides a roadmap for integrating Text Analysis with traditional data mining. The complete report (Sales and Marketing Intelligence: The Need for Integrating Textual Analytics with Traditional Solutions) is available for purchase through InfoSphere AB.

Google the Speculative Engine

Before we embark on the study, it is worth pointing out how important it was for us to design the methodology of constrained analysis in order to deal with the very speculative nature of information about Google.

Getting an insider's point of view on the company has been especially important to the Search Engine Optimization (SEO) community, a completely new cottage industry "almost" born out of Google, because Google wields huge power as far as search referrals are concerned. The SEO community tracks and speculates about everything that happens at Google in various online bulletin boards, blogs and discussion forums, religiously watching the whole lifecycle of Google indexing, since getting sites listed and highly ranked on Google is their lifeblood. This has led to some innovative attempts at opinion tracking. One such attempt is particularly interesting: on WebMasterWorld (one of the most respected search engine discussion boards), a Google executive answers questions and gives his "insider" point of view under the pseudonym "GoogleGuy". So intense is the desire to know what the company says about various indexing events that there is another website (GoogleGuy Says) feeding off the "utterances" of this GoogleGuy.

Besides, watching Google has been a favorite pastime for many bloggers, Google lovers and Google-bashers alike. A huge number of blogs have sprung up, constantly lapping up everything that is said about Google: Google Watch, Google Weblog, Watching Google Like Hawk, MicrodocNews, etc.

So much speculative information makes the task of mining opinion much more difficult, and that is where automation through software, rigorously tested methodologies and processes-metrics becomes very important for such an experiment.

The complete dossier is available on request. Contact us to get your copy.

These opinion clusters and hypotheses, derived through an experiment in opinion mining and text analysis using our proprietary methodologies, software and processes-metrics designed specifically for these types of analyses, go on to explain how Google will develop in the near future, proposing a constrained analysis of the directions open to Google.

 
Copyright © 2003-2004 K-Praxis. All rights reserved.