Document Plagiarism Detection Application Using Web-Based TF-IDF and Cosine Similarity Methods

Authors

    Jimmy Halim (1), Desiyanna Lasut (2)

    (1) Buddhi Dharma University
    (2) Buddhi Dharma University

DOI:

https://doi.org/10.32877/bt.v7i2.1697

Keywords:

Cosine Similarity, Detection, Plagiarism, TF-IDF, Website

Abstract

The rapid development of information technology has made information easy to access. This supports learning in education, but it also encourages plagiarism, which is a serious threat to science. Plagiarism is the act of taking someone else's work without proper attribution, that is, without citing the original author. To address this problem, a plagiarism detection application was developed that uses TF-IDF (Term Frequency-Inverse Document Frequency) weighting and cosine similarity. The application uses both methods to compute the similarity between a user's document and the documents stored in the database, and reports the result as a percentage. Processing consists of three stages: preprocessing, TF-IDF calculation, and cosine similarity calculation. The test results can be considered consistent: the manual and application tests produced percentages of 4% and 4.34%. The application is website-based and is designed to be used for plagiarism detection.
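
The three stages named in the abstract (preprocessing, TF-IDF calculation, cosine similarity calculation) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and the abstract does not state which preprocessing steps or TF-IDF variant the application uses. The function names, the simple lowercase tokenizer, and the smoothed weighting idf = 1 + ln(N / df) are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def preprocess(text):
    """Assumed preprocessing: lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf_vectors(documents):
    """Return one {term: weight} dict per document.

    Assumed weighting: raw term frequency times idf = 1 + ln(N / df),
    so terms shared by every document still carry some weight.
    """
    tokenized = [preprocess(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * (1 + math.log(n_docs / df[t])) for t in tf})
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def similarity_percent(user_doc, stored_docs):
    """Similarity (in percent) of user_doc against each stored document."""
    vectors = tf_idf_vectors([user_doc] + stored_docs)
    return [round(100 * cosine_similarity(vectors[0], v), 2) for v in vectors[1:]]


if __name__ == "__main__":
    scores = similarity_percent(
        "the cat sat",
        ["the cat sat", "dogs bark loudly"],
    )
    print(scores)
```

In this sketch an identical stored document scores 100 percent and a document with no shared terms scores 0 percent. Intermediate scores depend on the weighting variant, so percentages from a different TF-IDF formula will generally not match these.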

Published

2024-12-27

How to Cite

[1]
J. Halim and D. Lasut, “Document Plagiarism Detection Application Using Web-Based TF-IDF and Cosine Similarity Methods,” bit-Tech, vol. 7, no. 2, pp. 202–213, Dec. 2024.

Section

Articles