A Comparative Study On Hadoop Ecosystem: Hive And HBase – A Literature Review
Main Article Content
Moh Rifqi Zamzami
Moh Riswandha Imawan
Imam Ghozali
As large data volumes become a central challenge for companies, this study presents a comprehensive literature review comparing Hive and HBase within the Hadoop ecosystem. Using a systematic literature review approach, it collects selected articles from a range of sources, including SINTA-accredited national journals, international journals, and Scopus-indexed journals. The articles were screened for relevance to three focus topics: the architecture, data processing, and applications of HBase and Hive. The results give a deeper understanding of the role and contribution of each component in processing big data in the Hadoop ecosystem. By understanding how HBase and Hive process data and what role each plays in securing information, companies can choose the solution best suited to their data processing and security needs. The study also compares the performance of the two tools. The results show that HBase performs better for fast, random read/write operations, whereas Hive is more efficient for querying data. The performance of both tools, however, also depends on factors such as data size, number of nodes, and system configuration. These findings provide a foundation for further development in this area and are expected to offer valuable insight to practitioners and researchers in information technology, particularly those working on big data processing with the Hadoop ecosystem.