Document Plagiarism Detection Application Using Web-Based TF-IDF and Cosine Similarity Methods
DOI:
https://doi.org/10.32877/bt.v7i2.1697
Keywords:
Cosine Similarity, Detection, Plagiarism, TF-IDF, Website
Abstract
Rapid advances in information technology have made information easy to access, which supports the learning process in education. The same ease of access also encourages plagiarism, which is a serious threat to science. Plagiarism is the act of taking someone else's work without proper attribution, that is, without citing its author. To address this problem, a plagiarism detection application was developed that uses TF-IDF (Term Frequency-Inverse Document Frequency) weighting and cosine similarity. The application implements both methods to compute a similarity score, reported as a percentage, between a user's document and the documents stored in its database. Processing consists of three stages: preprocessing, TF-IDF calculation, and cosine similarity calculation. Testing gave consistent results: the manual and application tests produced similarity percentages of 4% and 4.34%. The application is website-based and is designed for use in detecting plagiarism.
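The three stages named in the abstract (preprocessing, TF-IDF calculation, cosine similarity calculation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the smoothed IDF formula, and every function name below are assumptions made for illustration, since the abstract does not specify them.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not say which one is used.
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to", "in"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption): log((1 + n) / (1 + df)) + 1 is always positive.
    idf = {t: math.log((1 + n) / (1 + d)) + 1 for t, d in df.items()}
    # Raw term count times IDF.
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_u * norm_v)


def similarity_percent(user_text, db_texts):
    """Similarity of the user document to each database document, in percent."""
    docs = [preprocess(user_text)] + [preprocess(t) for t in db_texts]
    vectors = tf_idf_vectors(docs)
    return [round(100 * cosine_similarity(vectors[0], v), 2) for v in vectors[1:]]


if __name__ == "__main__":
    database = [
        "Plagiarism detection with cosine similarity",
        "Unrelated bananas grow quickly",
    ]
    print(similarity_percent("Plagiarism detection with cosine similarity", database))
```

In this sketch a document compared with an identical copy scores 100 percent, and two documents with no terms in common score 0 percent. A different IDF variant or preprocessing pipeline (for example stemming) would change the intermediate weights, so the percentages here are not expected to reproduce the paper's figures.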
License
Copyright (c) 2024 bit-Tech
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
I hereby assign and transfer to bit-Tech all exclusive copyright ownership rights to the above work. This includes, but is not limited to, the right to publish, republish, downgrade, distribute, transmit, sell, or use the work and other related materials worldwide, in whole, or in part, in all languages, in electronic, printed, or any other form of media, now known or hereafter developed and reserves the right to permit or license a third party to do any of the above. I understand that this exclusive right will belong to bit-Tech from the date the article is accepted for publication. I also understand that bit-Tech, as the copyright owner, has sole authority to license and permit reproduction of the article. I understand that, except for copyright, any other proprietary rights associated with the work (e.g. patents or other rights to any process or procedure) must be retained by the author. In addition, I understand that bit-Tech permits authors to use their papers in any way permitted by the applied Creative Commons license.