A Comprehensive Review of Data Collection Methods and Challenges in Machine Learning
ID:155
Updated time: 2025-12-23 13:21:19
Abstract
Data collection significantly influences key aspects of machine learning (ML): generalizability, reliability, and model accuracy. This work presents a thorough review of data collection methods and the challenges of using them effectively. Current methods compromise the quality of ML outputs through bias, inconsistency, scalability limits, and lack of standardization. To address these issues, the work proposes Taxonomy Analysis with ML (TA-ML), a framework for selecting data-gathering strategies. The framework organizes methods by intended use, data type, source dependability, and collection size. This structured approach helps practitioners choose strategies suited to the conditions of their task. With the recommended strategy, users can reduce noise, improve data relevance, and reduce bias, thereby improving model performance. The study also highlights how this structure supports sensible decision-making across numerous ML disciplines. Results indicate that the proposed taxonomy-based strategy addresses common data collection issues and supports more accurate and efficient ML development. The reported results for the recommended strategy are 98.32% for decision-making, 97.6% for accuracy, and 96.3% for efficiency.
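As an illustration only, the idea of arranging methods by intended use, data type, source dependability, and collection size could be sketched as a simple filter over a catalogue of methods. This is a minimal sketch under stated assumptions, not the TA-ML framework itself: the class `CollectionMethod`, the function `select_methods`, the 0 to 1 reliability scale, and every catalogue entry and value are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionMethod:
    """One catalogue entry, described along the four taxonomy dimensions."""
    name: str
    intended_use: str          # e.g. "classification", "forecasting"
    data_type: str             # e.g. "text", "image", "tabular"
    source_reliability: float  # hypothetical scale: 0.0 (unreliable) to 1.0 (trusted)
    max_records: int           # largest collection size the method handles comfortably


# Hypothetical catalogue; names and numbers are illustrative, not from the paper.
CATALOGUE = [
    CollectionMethod("web_scraping", "classification", "text", 0.5, 10_000_000),
    CollectionMethod("crowdsourcing", "classification", "text", 0.7, 100_000),
    CollectionMethod("expert_annotation", "classification", "text", 0.95, 5_000),
    CollectionMethod("sensor_logging", "forecasting", "tabular", 0.9, 1_000_000),
]


def select_methods(catalogue, intended_use, data_type, min_reliability, needed_records):
    """Keep methods that fit all four dimensions, most reliable source first."""
    matches = [
        m for m in catalogue
        if m.intended_use == intended_use
        and m.data_type == data_type
        and m.source_reliability >= min_reliability
        and m.max_records >= needed_records
    ]
    return sorted(matches, key=lambda m: m.source_reliability, reverse=True)


if __name__ == "__main__":
    chosen = select_methods(CATALOGUE, "classification", "text", 0.6, 50_000)
    print([m.name for m in chosen])
```

A hard filter followed by a sort is the simplest possible reading of "arranging methods" along these dimensions; a weighted score across dimensions would be an equally plausible alternative.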
Keywords
Data Collection, Machine Learning, Taxonomy Analysis, Data Quality, Framework Design, Model Accuracy.