Abstract: Controversial news issues draw much interest from the public. But it is not simple for an ordinary user to search for and contrast the opposing arguments and gain a complete understanding of an issue. The disputant relation-based method classifies the opposing views on a news issue, which helps readers understand the issue more easily. It is known that the disputants of a contention are an important feature for understanding the discourse, so the method performs unsupervised classification of news articles based on disputant relations. It helps readers naturally view the articles through the opponent-based frame and attain a balanced understanding, free from a specific biased viewpoint. The method is performed in three stages: disputant extraction, disputant partitioning, and article classification. A modified version of the HITS algorithm is used in disputant partitioning, and an SVM classifier is used for article analysis.
Keywords: Classification, document analysis, text mining.

Introduction

In today’s world, news has become an important part of everyone’s life. It is essential to know what is happening in one’s surroundings and to remain updated. Covering news issues is the work of journalism. Covering controversial issues is an important function of journalism, as such issues arise in areas such as business, politics, and sports, and involve various participants with differing views. It has been noticed that news articles are sometimes biased and fail to present the conflicting views of an issue. That is why it is very difficult for ordinary readers to analyze the conflicting views and understand the controversy.
Readers generally form their views from single articles, so advanced news presentation models are needed to increase awareness of differing views. In this paper, we survey a disputant relation-based method for classifying news articles on controversial issues. It is observed that the disputants, i.e., the participants who take a position and take part in the controversy, such as businessmen, politicians, sportsmen, experts, and news writers, are an important aspect for understanding the discourse. News producers primarily shape articles on a contention by selecting and covering specific disputants [2]. Readers also naturally understand the controversy by identifying who the opposing disputants are. The method helps readers naturally view the news articles through the ‘opponent-based frame’ [1]. It performs classification in an unsupervised manner: it dynamically identifies the opposing disputant groups and classifies the articles according to their positions. As such, it helps readers compare articles of a controversy and achieve a balanced understanding, free from a specific biased viewpoint.
The surveyed method differs from those used in related tasks in that it aims to achieve classification under the opponent-based frame. Most research on sentiment classification and debate stance recognition takes a topic-oriented view and attempts to perform classification under the ‘positive versus negative’ frame for the given topic, for example, positive versus negative about television. However, news articles of a controversy are hardly classified under such frames.

Argument Frame Comparison

Establishing a suitable argument frame is important. It provides a framework that allows readers to naturally understand the controversy.
It also determines how classification methods should classify articles of the issue.

Related Work in Document Classification

Turney and Pang et al. have researched sentiment classification at the document level. It aims to automatically recognize and classify the sentiment of documents as positive or negative. Opinion summarization pursues a similar goal: to identify distinct opinions on a topic and generate summaries of them [2, 3].

Paul et al. developed an unsupervised method for generating summaries of contrastive opinions on a common topic. Many of these works make a number of assumptions that are difficult to apply to the discourse of controversial news issues. They usually apply a single static classification frame, ‘positive versus negative,’ to the topic [4].

A number of works, such as that of Somasundaran et al., deal with debate stance recognition, which is a closely related task. They try to identify a position in a debate, such as an ideological one.
This debate frame is often not appropriate for controversial issues, for similar reasons as the positive/negative frame. In contrast, the surveyed method does not assume a fixed debate frame, and rather develops one based on the opponents of the controversy at hand [5].

Several works, such as those of Thomas et al. and Agrawal et al., have used the relation between speakers or authors for classifying their debate stance. However, these works also assume the same debate frame and use debate corpora, e.g., floor debates in the House of Representatives and online debate forums. Their approaches are also supervised and require training data for relation analysis, e.g., voting records of congresspeople [6, 7].

Schon et al. observe that the discourse of controversial issues in news articles shows different characteristics from that studied in sentiment classification tasks. First, the opponents of a controversial issue often discuss different topics.
Second, the frame of argument is not fixed as positive versus negative [9].

Disputant Relation-Based Classification Method

The disputant relation-based method implements the opponent-based frame for classification. It attempts to identify the two opposing groups of the issue at hand, and determines whether an article reflects the position of a specific side more.
The method is based on the observation that there are usually two opposing groups of disputants, and the groups compete for news coverage. They strive to influence readers’ understanding and evaluation of the issue, and to gain support from them [2]. In this competitive process, news articles may give a specific side more chance to speak, explain or elaborate on its position, or supply facts supportive of that side. The surveyed method is performed in three stages:

The first step, disputant extraction, mines the disputants appearing in an article set.
The second step, disputant partitioning, divides the mined disputants into two opposing groups.
Lastly, the news classification step classifies the articles into three categories, i.e., two for the articles biased toward each group, and one for the others.

This method assumes polarization for controversial issues. This assumption was valid for most of the tested issues. For a few issues, there were some participants who did not belong to either side; however, they usually did not take a particular position or make strong arguments. For example, in the second issue, the entrance of retailers to the supermarket business, the government was in the middle between the big retail companies and the small store owners. The government commented that this is a difficult problem to solve and did not support a specific side. Based on this observation, the method is designed to identify two opposing groups of disputants and recognize articles biased toward a specific side.

Disputant Extraction

In this step, the participants in the controversy have to be extracted (mined).
We observe that many disputants appear as the subject of quotes in the news article set. The articles actively quote them or cover their actions to convey the controversy vividly. Straightforward methods were used to extract the subjects. The methods were effective in practice, as quotes in articles often follow a regular pattern. The subjects of both direct and indirect quotes are mined. Sentences that include a statement inside double quotes are considered direct quotes. Sentences that express a statement without double quotes, and those describing the action of a disputant, are considered indirect quotes (see example 1 below).
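The direct-quote rule above can be sketched in a few lines of Python. This is a minimal illustration, not the surveyed implementation: the function name `split_direct_quote`, the regular expression, and the English example sentence are assumptions. The sketch does not identify the subject itself; it only returns the text outside the quotation marks, which is where a subject would be looked for.

```python
import re

# Assumption: a direct quote is a statement enclosed in straight or curly
# double quotes, as stated in the text. The pattern itself is illustrative.
DIRECT_QUOTE = re.compile(r'["\u201c]([^"\u201c\u201d]+)["\u201d]')


def split_direct_quote(sentence):
    """Return (statement, remainder) for a direct quote, or None otherwise.

    `statement` is the text inside the double quotes; `remainder` is the
    rest of the sentence, where the quote's subject would be searched for.
    """
    match = DIRECT_QUOTE.search(sentence)
    if match is None:
        return None
    statement = match.group(1).strip()
    # Remove the quoted span and collapse the leftover whitespace.
    remainder = " ".join(DIRECT_QUOTE.sub(" ", sentence).split())
    return statement, remainder
```

For instance, `split_direct_quote('The minister said, "We will not negotiate."')` separates the quoted statement from `The minister said,`, while a sentence without double quotes yields `None` and would have to be checked against the indirect-quote rule instead.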
The indirect quotes are identified based on the morphology of the ending word. The ending word of an indirect quote often has a verb as its root or includes a verbalization suffix. Other sentences, typically those describing the reporter’s explanation or comments, are not considered quotes (see example 2).

(1) The government clarified that there would not be any talks unless Pakistan apologizes for the attack.
(2) The government’s belief is that a demanding response is the only solution for the current emergency.

Disputant Partitioning

A “key opponent-based partitioning” method is developed for the disputant partitioning step. The method initially identifies two key opponents, each representing one side, and uses them as pivots for partitioning the other disputants [1]. The other disputants are divided according to their relation with the key opponents, i.e., which key opponent they stand for or against. Disputant partitioning is a meaningful task in which to explore more optimized methods. The presented method is a first-round solution to the task, based on the observation of the criticizing structure that is frequent in news article sets on controversial issues, i.e., the existence of key opponents who actively criticize and are criticized by others, and the existence of other minor disputants who commonly criticize the key opponents but are not often criticized themselves. The presented key opponent-based partitioning is an initial algorithm that exploits this criticizing structure.
The intuition behind the method is that there frequently exist key opponents who represent the controversy, and many participants argue about the key opponents, whereas they rarely recognize and talk about minor disputants.

Selecting key opponents: To identify the key opponents of the issue, we search for the disputants who frequently criticize, and are also criticized by, other disputants. As the key opponents get more news coverage, they have more chances to articulate their argument, and also more chances to face counter-arguments from other disputants. This is done in two steps. First, for each disputant, the method analyzes whom he or she criticizes and by whom he or she is criticized.
The method goes through each sentence of the article set and searches for both the disputant’s criticisms and the criticisms about the disputant. Based on the criticisms, it analyzes the relationships among disputants. If the disputant is not the subject but appears in the quote, the sentence is considered to convey a criticism about the disputant from another disputant. Second, a modified version of the HITS graph algorithm is applied to find the major disputants. For this, the criticizing relationships obtained in the first step are represented in a graph. Each disputant is modeled as a node, and a link is made from a criticizing disputant to a criticized disputant.
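As a rough sketch of the graph construction just described, the links can be collected as counts per (criticizing, criticized) pair. The input format (pairs of a quote’s subject and the disputants named inside the quote) and the name `build_criticism_graph` are hypothetical, and the sketch assumes the sentences passed in have already been judged to convey criticism.

```python
from collections import Counter


def build_criticism_graph(quotes):
    """Build directed criticism links from quote records.

    `quotes` is an iterable of (subject, mentioned) pairs, where `subject`
    is the disputant who speaks and `mentioned` lists the disputants that
    appear inside the quote. Both are assumed inputs for this sketch.
    Returns a Counter mapping (criticizer, criticized) to the number of
    sentences that carry that criticism.
    """
    edges = Counter()
    for subject, mentioned in quotes:
        # Count each criticized disputant once per sentence.
        for target in set(mentioned):
            if target != subject:  # a disputant does not criticize itself
                edges[(subject, target)] += 1
    return edges
```

Keeping the sentence count per link, rather than a plain yes/no link, leaves room for using the count as a link weight when the graph is scored.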
Figure 3.2: Modified HITS algorithm.

Originally, the HITS algorithm was designed to rate web pages based on their link structure. A feature of the algorithm is that it separately models the value of outlinks and inlinks. Each node, i.e., a web page, has two scores: the authority score, which reflects the value of inlinks toward itself, and the hub score, which reflects the value of its outlinks to others. The hub score of a node increases if it links to nodes with high authority scores, and the authority score increases if it is pointed to by many nodes with high hub scores. The HITS algorithm is adopted because of this feature. It enables us to separately measure the significance of a disputant’s criticism (using the hub score) and the criticism about the disputant (using the authority score).
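The mutual reinforcement of hub and authority scores can be illustrated with a small power-iteration sketch. It is a generic HITS loop under stated assumptions, not the algorithm of Fig. 3.2: the function names, the fixed iteration count, the L2 normalization, and the optional `init_hub` / `init_auth` arguments are all choices made for this example. Links are given as a mapping from (source, target) to a weight; unit weights reproduce plain HITS.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {node: v / norm for node, v in scores.items()}


def hits(edges, init_hub=None, init_auth=None, iterations=50):
    """Return (hub, authority) score dicts for a weighted directed graph.

    `edges` maps (source, target) to a link weight. The initial scores
    default to 1.0 for every node; passing other values is a hypothetical
    hook for adapted variants.
    """
    nodes = sorted({node for pair in edges for node in pair})
    hub = {n: float((init_hub or {}).get(n, 1.0)) for n in nodes}
    auth = {n: float((init_auth or {}).get(n, 1.0)) for n in nodes}
    for _ in range(iterations):
        # Authority: weighted sum of the hub scores of nodes pointing here.
        new_auth = {n: 0.0 for n in nodes}
        for (src, dst), weight in edges.items():
            new_auth[dst] += weight * hub[src]
        # Hub: weighted sum of the authority scores of nodes pointed to.
        new_hub = {n: 0.0 for n in nodes}
        for (src, dst), weight in edges.items():
            new_hub[src] += weight * new_auth[dst]
        auth, hub = normalize(new_auth), normalize(new_hub)
    return hub, auth
```

In a criticism graph, a disputant who is never criticized ends up with an authority score of zero but can still have a positive hub score, which matches the description of minor disputants who criticize others without being criticized.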
The aim is to find the nodes that have both a high hub score and a high authority score; the key opponents will have many links to others and also be pointed to by many nodes.

The adapted HITS algorithm is shown in Fig. 3.2. Some adaptations are made so that the algorithm reflects the disputants’ characteristics.
The initial hub score of a node is set to the number of quotes in which the corresponding disputant is the subject. The initial authority score is set to the number of quotes in which the disputant appears but not as the subject. In addition, the weight of each link (from a criticizing disputant to a criticized disputant) is set to the number of sentences that convey such criticism.

Partitioning minor disputants: Given the two key opponents, we have to partition the rest of the disputants based on their relations with the key opponents. For this, we identify whether each disputant has a positive or negative relation with the key opponents. The disputant is classified to the side of the key opponent with whom the disputant shows a more positive relation. If the disputant shows a negative relation, the disputant is classified to the opposite side.

Here are the four features used to capture the positive and negative relationships between the disputants:

Positive Quote Rate (PQRab): Given two disputants (a key opponent a and a minor disputant b), the feature measures the ratio of positive quotes between them.
Negative Quote Rate (NQRab): This feature is the opposite version of PQR. It measures the ratio of negative quotes between the two disputants.
Frequency of Standing Together (FSTab): This feature attempts to capture whether the two disputants share a position.
Frequency of Division (FDab): This feature is the opposite version of FST. It counts how many times the two disputants are not co-located in the sentences.

Article Classification

Each news article of the set is classified by evaluating which side is predominantly covered. The method classifies the articles into three categories: one of the two sides, or the category “other”.

It is observed that the major components that shape an article on a controversy are quotes from disputants and journalists’ commentary. Thus, the method considers two points for classification: first, from which side the article’s quotes came; second, for the rest of the article’s text, the similarity of the text to the arguments of each side.

As for the quotes of an article, the method computes the proportion of quotes from each side based on the disputant partitioning step’s result.
As for the rest of the sentences, a similarity analysis is conducted with an SVM classifier [8]. The SVM classifier receives a sentence as input and assigns it to one of three classes, i.e., one of the two sides, or other. It is trained with the quotes from each side. The quotes used for training are obtained automatically from the partitioning result of the earlier stage.

According to the survey, an article is classified to a specific side if more of its quotes are from that side and more of its sentences are similar to that side. Given an article a and the two sides b and c:

Classify a to b if [condition not preserved in the source text],
Classify a to c if [condition not preserved in the source text],
Classify a to other, otherwise,

where
Su: Number of all sentences of the article.
Qi: Number of quotes from side i.
Qij: Number of quotes from either side i or j.
Si: Number of sentences classified to i by the classifier.
Sij: Number of sentences classified to either i or j.
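The verbal rule above (an article goes to a side only if that side leads in both quotes and classified sentences) can be written as a short decision function. This is only one literal reading of that sentence, offered as a sketch: the function name `classify_article` is hypothetical, and the sketch ignores Su and any thresholds or ratios that the exact classification conditions may contain.

```python
def classify_article(q_b, q_c, s_b, s_c):
    """Assign an article to side "b", side "c", or "other".

    q_b, q_c: number of quotes from side b and side c (Qb, Qc).
    s_b, s_c: number of non-quote sentences the sentence classifier
              assigned to side b and side c (Sb, Sc).
    Assumption: a side wins only if it strictly leads on both counts;
    this is an illustrative rule, not the exact condition of the method.
    """
    if q_b > q_c and s_b > s_c:
        return "b"
    if q_c > q_b and s_c > s_b:
        return "c"
    # Mixed or tied evidence falls into the third category.
    return "other"
```

Under this reading, an article whose quotes lean toward one side while its remaining sentences lean toward the other is placed in “other”, which is consistent with the three-category output described for the method.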
Conclusions and Future Work

In this paper, the problem of classifying news articles on controversial issues is studied. It involves new challenges, as the discourse of controversial issues is complex and news articles show different characteristics from commonly studied corpora, such as product reviews. The discourse involves many topics, and the arguments often do not fit the ‘positive versus negative’ frame.

The survey suggests that the opponent-based frame is a clear and effective frame for understanding controversial issues. The frame requires neither that articles cover a common topic nor that arguments explicitly express positive or negative sentiments. In this study, it was found that participants easily identified the opposing disputants and classified articles written for or against the disputants. Instead of taking a topic-oriented view, the disputant relation-based method focuses on the disputants of the controversy.
For better performance of the disputant relation-based method, the Naive Bayes algorithm can be used in the article classification step, as the SVM classifier cannot classify issues that involve three or more disputants.

References

[1] S. Park, J. Kim, K.S. Lee, and J. Song, “Disputant Relation Based Classification for Contrasting Opposing Views of Contentious News Issues,” IEEE Transactions on Knowledge and Data Engineering, Vol. 25, No. 12, December 2013.
[2] P. Turney, “Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews,” Proc. 40th Ann. Meeting Assoc. Computational Linguistics (ACL ’02), pp. 417-424, 2002.
[3] B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? Sentiment Classification Using Machine Learning Techniques,” Proc. Conf. Empirical Methods Natural Language Processing (EMNLP ’02), pp. 79-86, 2002.
[4] M.J. Paul, C. Zhai, and R. Girju, “Summarizing Contrastive Viewpoints in Opinionated Text,” Proc. Conf. Empirical Methods in Natural Language Processing (EMNLP), pp. 66-76, 2010.
[5] S. Somasundaran and J. Wiebe, “Recognizing Stances in Ideological Online Debates,” Proc. NAACL HLT Workshop Computational Approaches Analysis and Generation Emotion in Text (CAAGET ’10), pp. 116-124, 2010.
[6] M. Thomas, B. Pang, and L. Lee, “Get Out the Vote: Determining Support or Opposition from Congressional Floor-Debate Transcripts,” Proc. Conf. Empirical Methods Natural Language Processing (EMNLP ’06), pp. 327-335, 2006.
[7] R. Agrawal, S. Rajagopalan, R. Srikant, and Y. Xu, “Mining Newsgroups Using Networks Arising from Social Behavior,” Proc. 12th Int’l Conf. World Wide Web (WWW ’03), pp. 529-535, 2003.
[8] T. Joachims, “Making Large-Scale SVM Learning Practical,” Advances in Kernel Methods-Support Vector Learning, B. Scholkopf, C. Burges, and A. Smola, eds., MIT Press, 1999.

Books:

[9] D.A. Schon and M. Rein, Frame Reflection: Toward the Resolution of Intractable Policy Controversies, Basic Books, 1994.