Understanding the thoughts and feelings of Internet users

TextImi is a system that helps people understand the thoughts of others from large volumes of text through a collaboration between computer analysis and human interpretation. The system can be used for marketing surveys coordinated with network research systems, customer voice analysis, public comment analysis and more.

Development of TextImi, a Text Semantic Space Analysis System:
- Reading people's thoughts and feelings from the Internet -

FUKAYA, Masahiro. Professor
Faculty of Policy Management, Keio University

Introduction

The development of the Internet and information technology has now enabled us to collect a huge volume of text data on various matters discussed by people themselves. By processing this enormous quantity of text data, we can learn how people think and feel about different issues. TextImi is a system that helps people understand the thoughts of others from large volumes of text through a collaboration between computer analysis and human interpretation. The system can be used in a wide range of applications such as marketing surveys coordinated with network research systems, customer voice analysis, and public comment analysis.

Human semantics research: Sociosemantics and development of TextImi

Development research for this project originated in the Fukaya Laboratory as part of research on "The Human Semantic World" and is now proceeding in the Policy COE as the project "Sociosemantics: Development of Research Procedures for Web Society". A social phenomenon is a composition and integration of human actions. Human actions reflect the individual issues that give rise to them. We customarily call the internal world where meaning is developed a "semantic world". Human actions are, so to speak, products of the human semantic world. Consequently, the semantic viewpoint is an essential element in the study of society.

Until very recently, however, full-fledged study of the human semantic world was impeded, mainly by limited data availability. The development of the Internet and IT has made it possible to collect a large volume of text data on various issues discussed by people in their own words. A theory and method for extracting human semantic worlds from this enormous volume of meaning-bearing data will allow serious research into the human semantic world.

The theory needed is a language and communication theory that captures the meaning of words contained in text data as the meaning they hold for people. Fukaya and Tanaka built a new theoretical paradigm to embody that doctrine (Theory of Word Significance, awarded a Japan Linguistics and Language Education Institution Prize, 1996). This theory opened a clear prospect that experimental study based on the paradigm would be promising, and that the study of the human semantic world would be greatly advanced by computer processing of large volumes of text data (approx. 2000). Collaboration with the Research Headquarters of Fuji Xerox Co., Ltd. on development of a text analysis system also began (approx. 2002). This project was adopted as a Policy COE project, and development research continues (Keio Gijuku Prize, 2003).

Work on the text semantic space analysis system is now proceeding as part of efforts to create a new academic discipline, sociosemantics, that investigates the human semantic world empirically. What do people think or feel about a certain matter? What concepts are contained in the conventional common sense that underlies everyday human thoughts and behaviors? With the aid of computers, large volumes of text data can help answer these types of questions, which are at the heart of the new discipline of sociosemantics that we seek. This system is a core technology of that effort and, at the same time, has broad applicability.

Contents and features of technology

The text semantic space analysis system that I am introducing today has been developed to its current functional state with these considerations in mind and is termed TextImi (a nickname), Version 3. When the system is connected to a network research system, for example, its applications include marketing surveys and analysis of policy-related public comments. Our development research has now reached the stage of demonstration in practical applications. The technical features of TextImi may be summarized as follows:

1. Based on a paradigm of language and communication theory that handles word meanings as understood by people.
2. Designed around collaboration between computer analysis and a human analyst's interpretation.
3. Extracts statements (chunks of fundamental meaning) that are not so small as to cause loss of meaning.
4. Efficiently supports tagging of small statements (aftercoding).
5. Builds a semantic space through text mining and a human analyst's interpretation.
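
As a rough illustration of features 3 to 5, the following Python sketch splits free-text responses into sentence-sized statements, tags each statement against a small codebook, and counts which tags occur together in the same response. It is not TextImi's actual method: the function names, the codebook, and the sample responses are all hypothetical, sentence splitting stands in for statement extraction, and a simple co-occurrence count stands in for the semantic space. In practice a human analyst would supply and revise the codebook.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical codebook (tag -> cue words); an analyst would define and refine it.
CODEBOOK = {
    "price": {"price", "expensive", "cheap"},
    "design": {"design", "color", "shape"},
    "support": {"support", "staff", "helpdesk"},
}

def split_statements(text):
    """Split free text into sentence-sized statements (a crude stand-in for chunking)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def tag_statement(statement, codebook=CODEBOOK):
    """Return the sorted tags whose cue words appear in the statement."""
    words = set(re.findall(r"[a-z]+", statement.lower()))
    return sorted(tag for tag, cues in codebook.items() if words & cues)

def cooccurrence(responses, codebook=CODEBOOK):
    """Count how often two tags appear within the same response."""
    pairs = Counter()
    for response in responses:
        tags = set()
        for statement in split_statements(response):
            tags.update(tag_statement(statement, codebook))
        pairs.update(combinations(sorted(tags), 2))
    return pairs

# Made-up sample responses, for illustration only.
responses = [
    "The price is too expensive. I like the design though.",
    "Support staff were helpful! The design is nice.",
    "Cheap, but the color faded.",
]

for pair, count in cooccurrence(responses).most_common():
    print(pair, count)
```

Statements that receive no tag are the ones an analyst would review by hand, extending the codebook as new themes emerge; that loop is the computer and human collaboration the feature list describes.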

####################################

Masahiro FUKAYA
Affiliation : Faculty of Policy Management
Position : Professor
Research areas / Keywords : Sociosemantics / text semantic space analysis / significance theory / cognitive linguistics

For inquiries about this technology, contact the Intellectual Property Center, Keio University.
Direct line: +81-(0)35-427-1678
E-mail address: [email protected]