THE IMPACT ANALYSIS OF THE IMPLEMENTATION OF TEXT SUMMARIZATION METHOD ON DOCUMENT CLASSIFICATION ACCURACY USING TF

Wahyu Kurnia Dewanto, Hermawan Arief P, Taufiq Rizaldi

Abstract


Text classification, the classification of textual documents, is a task in Natural Language Processing (NLP) that is usually performed with machine learning techniques. The goal of text classification is to assign one or more classes or categories to a document, making it easier to manage and sort. This study addresses two main issues: first, how to reduce the attributes (features) of a text document using text summarization; second, how effective this method is.

To assess the impact of document summarization on classification accuracy, two approaches were used in this research. The first approach performs classification without automatic document summarization; the second performs classification with it. The results of the two approaches are compared to determine whether automatic document summarization is useful as a feature reduction method that can improve classification accuracy.
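The idea of summarization as feature reduction can be illustrated with a minimal sketch. It assumes an extractive summarizer that scores each sentence by the average term frequency of its words and keeps the top-scoring half; the abstract does not specify the summarizer used, so the function names (`summarize`, `vocabulary`), the `ratio` parameter, and the sample text are all hypothetical. The sketch shows only how the vocabulary (the attribute set a classifier would see) shrinks after summarization, not the classification step itself.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    # Lowercase alphabetic tokens only; no stemming or stop-word removal.
    return re.findall(r"[a-z]+", text.lower())


def summarize(text, ratio=0.5):
    # Assumed scoring: average document-level term frequency of a sentence's tokens.
    sentences = split_sentences(text)
    tf = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(tf[t] for t in tokens) / len(tokens) if tokens else 0.0

    keep = max(1, int(len(sentences) * ratio))
    top = sorted(range(len(sentences)),
                 key=lambda i: score(sentences[i]), reverse=True)[:keep]
    # Restore original sentence order in the summary.
    return " ".join(sentences[i] for i in sorted(top))


def vocabulary(text):
    # The set of distinct terms, i.e. the attributes a bag-of-words model would use.
    return set(tokenize(text))


doc = ("The market rose sharply today. "
       "Investors bought market shares as the market rallied. "
       "The weather was mild. "
       "Analysts expect the market to keep rising.")
summary = summarize(doc, ratio=0.5)
print(len(vocabulary(doc)), "terms before,", len(vocabulary(summary)), "terms after")
```

In the second approach described above, a classifier would be trained on the summaries rather than on the full documents, so it would see only the reduced vocabulary.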

From the experiment, it can be concluded that there is a marked increase in accuracy, from 72.22% to 83.33%, a gain of about 11 percentage points. This suggests that using automatic document summarization as an attribute reduction method can improve classification accuracy with a Support Vector Machine.
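The reported gain can be read in two ways, as an absolute difference in percentage points or as a relative improvement over the baseline. The short sketch below computes both from the two accuracy figures given in the abstract; the helper name `accuracy_gain` is hypothetical.

```python
def accuracy_gain(before_pct, after_pct):
    # Absolute gain is measured in percentage points.
    absolute = after_pct - before_pct
    # Relative gain expresses the same difference as a share of the baseline.
    relative = absolute / before_pct * 100
    return absolute, relative


abs_gain, rel_gain = accuracy_gain(72.22, 83.33)
print(f"{abs_gain:.2f} percentage points, {rel_gain:.2f}% relative to the baseline")
```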




