Content Analysis: From Classical Text Coding to Advanced Text Analysis
One difficulty in understanding the varied usage scenarios is that the face of content analysis has changed a great deal over the last few decades. From academic text coding, which tried to determine the author of a document using simple methods such as noun or word counts, to more sophisticated text analytics and text mining that uncover themes and support context-based information extraction and text modeling, technology has made a huge leap forward in grappling with the unstructured, free-form nature of text as opposed to more structured numerical data.
Here is an early definition of content analysis:
"Any technique for making inferences by objectively and systematically identifying specified characteristics of messages" (Holsti, O.R. (1969). Content Analysis for the Social Sciences and Humanities. Reading, MA: Addison-Wesley)
But even the early adopters of content analysis for academic purposes were convinced that content analysis is a far broader term than text analysis. They understood that content analysis goes well beyond text analysis, although text analysis constitutes a major part of it.
Before turning to recent uses of automated content analysis, let us summarize a few usage scenarios for traditional content analysis:
- Author Identification
- Identifying Author Intention
- Identifying Document Differences
- Identifying Author Style and Style Comparison
- Identifying Trends and Patterns in Open-Ended Survey Answers
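To make the word-count approach to author identification concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a method described by Holsti or any other source: the short `FUNCTION_WORDS` list and the function names are hypothetical, and real stylometric studies use far larger word lists and more careful statistics. The sketch builds a relative-frequency profile of common words for each text and picks the known author whose profile is closest by cosine similarity.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately short list of common function words.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "is", "was"]


def profile(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_author(unknown_text, known_texts):
    """Pick the author (a key of known_texts) whose word profile is closest."""
    target = profile(unknown_text)
    return max(known_texts, key=lambda author: cosine(target, profile(known_texts[author])))
```

In use, `known_texts` would map each candidate author to a sample of that author's writing, and `most_similar_author` would be called with the disputed document. The same profiles could also serve for the style comparison scenario listed above.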
Content Analysis and Legal/Regulatory Compliance: Analyzing Recent News and Business Wires
Now let us examine a few recent news items and product announcements (listed in chronological order):
- AmikaNow! and Authentium Announce Partnership and Speakeasy Anti-Spam/Anti-Virus Server Launch: The AmikaGuardian Compliance Server ensures that email complies with legislated privacy requirements and corporate policies, including anti-spam handling, for government, healthcare and financial customers. AmikaGuardian screens both incoming and outgoing email, identifying concepts based on the company’s US-patented content analysis technology.
- Aungate Launches Enterprise Spam Filtering Product: Aungate was started by Autonomy Corp to focus specifically on the regulatory compliance segment. It uses Autonomy's automated text analysis technology to provide solutions including spam detection, total communications compliance management and unauthorized information detection.
- iLumin Announces the General Availability of Assentor Enterprise 3.2: The Assentor platform is one of the most advanced of the newer content analysis solutions. Assentor Enterprise combines content analysis with archive and records management to provide an integrated solution that can help companies with "archiving, compliance, supervision, discovery, litigation support and mailbox management", as well as document management and records management. Particularly interesting from a content analysis perspective is the list normalization solution (Assentor List Examiner), which helps in managing lists of entities and is a very good example of iLumin's innovative approach to content analysis.
- ChoiceStream Launches The MyBestBets Personalization Platform: An interesting example of advanced audio/video content analysis. ChoiceStream uses content classification along with Bayesian choice modeling to provide a unique online television personalization solution.
- Proofpoint Protects Corporate Email From Recent Fraudulent Spam "Phishing" Epidemic: The Proofpoint Protection Server is a solution that protects corporate messaging infrastructure from external email threats, such as spam and viruses, and ensures compliance with corporate email policies and external regulations.
- Watchfire Introduces 'Online Business Management' Software and Services: Watchfire's newest solutions, which integrate brand, risk and cost management, provide firms with insight into privacy, compliance and external protection issues. They help firms manage exposure to risk and privacy breaches, manage internal compliance with corporate risk governance standards, protect online trademarks and brand names, ensure compliance with regulatory and industry guidelines, avoid negative exposure from objectionable or risky content and data leaks, automate critical compliance management processes, and improve customer acquisition.
- Thomson ISI ResearchSoft Releases New Data Analysis Tool - RefViz: Thomson ISI's RefViz was covered extensively on K-Praxis earlier (see: Intelligent & Automated Research Tools & Text Analysis). Through content analysis and visualization, RefViz can create a digest of topics discussed in the reference literature; recognize and reveal relationships between topics; analyze trends and associations among topic areas; and allow the researcher to store, manage and retrieve research in a flexible manner, integrating seamlessly with bibliographies in EndNote, Reference Manager and ProCite (other Thomson ISI products).
Content Analysis: Some Recent Usages
Apart from Thomson's RefViz and ChoiceStream's MyBestBets, all the other vendors are zooming in on the anti-spam, regulatory and compliance segment, in a way heralding a new opening for content analysis solutions. Here are the new uses of content analysis apparent from the news analysis above:
- Email Management and Anti-Spam
- Content Compliance
- Legal and Industry Compliance
- Offensive, Harassing and Discriminatory Content Detection and Content Inspection
- Defamatory, Libelous or Fraudulent Content Detection
- Intellectual Property Protection and Analysis
- Brand Management and Customer Retention
- Content Personalization
- Security and Virus Detection
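Several of the uses above, such as anti-spam filtering and content compliance, come down to assigning a message to a category based on its words. The vendors' actual technologies are proprietary and not described here, so the following is only a generic sketch of one well-known technique, a multinomial naive Bayes classifier with add-one smoothing, written in Python. The class name, labels and training messages are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Minimal multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word occurrences
        self.doc_counts = Counter()  # label -> number of training messages
        self.vocabulary = set()

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(tokens)
        self.doc_counts[label] += 1
        self.vocabulary.update(tokens)

    def classify(self, text):
        """Return the label with the highest log-probability, or None if untrained."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior: share of training messages carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            label_total = sum(counts.values())
            for token in tokens:
                # Smoothed log likelihood of the token under this label.
                score += math.log((counts[token] + 1) / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A filter built this way would be trained on messages already labeled by people (for example "spam" versus "ok", or "compliant" versus "review") and then applied to incoming and outgoing mail. A production system would need much more training data, better tokenization and careful handling of false positives.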
Conclusion
Content analysis could become an important segment for software vendors, as companies are looking for solutions not just to fight spam and viruses but also to protect their online brands, manage business risks and achieve legal, regulatory and industry compliance. Automated content analysis could help a great deal in achieving these objectives.