
Scientia et Technica Año XXVIII, Vol. 30, No. 01, enero-marzo de 2025. Universidad Tecnológica de Pereira
Based on this, an accuracy of 87% was achieved, indicating a
high-performance model. Per-class performance metrics were then
analyzed, finding precision and recall values of 71% and 90%
and F1-scores of 64% and 92% across class 0 and class 1. It was
therefore concluded that the supervised KNN algorithm is the
most accurate of those studied in this research.
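As an illustration of how the accuracy and per-class figures above are defined, the following minimal sketch derives them from true and predicted labels. The function name `per_class_metrics` and the toy label lists are hypothetical and are not the study's data or code.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return overall accuracy and per-class precision, recall, and F1."""
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(pairs[(c, c)] for c in labels) / len(y_true)
    report = {}
    for c in labels:
        tp = pairs[(c, c)]                                   # correctly predicted as c
        fp = sum(pairs[(t, c)] for t in labels if t != c)    # predicted c, truly other
        fn = sum(pairs[(c, p)] for p in labels if p != c)    # truly c, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, report


# Hypothetical toy labels (assumption): 0 = class 0, 1 = class 1.
y_true = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 1, 1, 1, 0]
accuracy, report = per_class_metrics(y_true, y_pred)
```

Because F1 is the harmonic mean of precision and recall, a class whose precision and recall differ will show an F1 below their arithmetic mean, which is why all three figures are usually reported per class.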
The results obtained in this research demonstrate the high
effectiveness of the KNN model, achieving an accuracy of 87%
and an F1-Score of 92% for efficient suppliers. This validates the
model’s ability to accurately classify categorical data in a
business context.
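For readers unfamiliar with the classifier, the sketch below shows the basic KNN decision rule: a query point receives the majority label of its k nearest training points. It is illustrative only; the Euclidean distance, the value k = 3, and the toy feature vectors are assumptions and do not reproduce the optimized model of this study.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical scaled feature vectors (assumption), two per-class groups.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]

pred_low = knn_predict(train_X, train_y, (0.05, 0.05))
pred_high = knn_predict(train_X, train_y, (1.0, 0.95))
```

Since the rule depends directly on distances, features are normally brought to a common scale before KNN is applied.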
These findings are consistent with studies such as that of Zhang
et al. [17], who applied hybrid models for supplier selection in
the manufacturing industry, reporting high levels of precision.
Similarly, Kim and Lee [18] showed that clustering algorithms
applied prior to supervised learning significantly improved
segmentation performance in complex industrial environments.
Unlike those studies, which were developed in advanced
organizational contexts in Asia, this research adapts the models
to a Colombian setting using data extracted from a local ERP
system and tailored to the conditions of a real manufacturing
company in Pereira. This contextual difference may help explain
why the KNN model proved more effective than SVM or ANN in this
case, particularly given its lower computational cost and its
compatibility with visual tools such as Dash and Plotly [16].
In conclusion, this study offers a methodological contribution by
demonstrating that classification models like KNN, when
supported by robust feature reduction and interactive
visualization processes, can yield results comparable to
international research and remain applicable to emerging
business environments.
V. CONCLUSIONS AND FUTURE WORK
This work presented an approach for characterizing and grouping
categorical data using ordinal variables. The variable "Línea,"
considered key by the company expert, was used for
dimensionality reduction and supplier classification.
Preprocessing techniques such as Z-score scaling were applied,
improving experimental results and providing structured data for
supervised and unsupervised algorithms.
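The Z-score scaling mentioned above can be sketched as follows: each feature column is shifted to zero mean and divided by its standard deviation. This is a minimal illustration; the helper name `zscore_columns`, the use of the population standard deviation, and the toy rows are assumptions.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to zero mean and unit population std."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        # A constant column (std == 0) is mapped to 0.0 to avoid division by zero.
        [(x - m) / s if s else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical ordinal-encoded rows (assumption): two features on different scales.
rows = [[1, 10], [2, 20], [3, 30]]
scaled = zscore_columns(rows)
```

After scaling, both columns contribute comparably to distance-based algorithms such as K-Means and KNN, regardless of their original units.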
The clustering approach used K-Means to generate labels, which
closely matched the target variable. These labels were compared
with those of the Gaussian Mixture Models (GMM) algorithm,
reaching 97% accuracy with 20 features, 96% with 13 features
(PCA), and 100% with 9 features. However, the expert preferred
the 20 features from K-Means because of their business
relevance. Finally, an optimized KNN model was built with various
kernels and features, and it identified and classified suppliers
with high accuracy. The model is useful to analysts and investors
making financial decisions for the vehicle manufacturing company
in Pereira, and the results demonstrate the effectiveness of the
K-Means and GMM algorithms in improving data separability and
reducing computation time.
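Comparing two sets of cluster labels requires some care, because cluster identifiers are arbitrary: cluster 0 in one algorithm may correspond to cluster 1 in the other. One possible way to compute an agreement score is sketched below; it is an assumption about how such a comparison can be made, not necessarily the procedure used in this study, and the label lists are hypothetical.

```python
from itertools import permutations


def best_label_agreement(labels_a, labels_b):
    """Fraction of samples on which two clusterings agree, after relabeling
    labels_b with the permutation that best matches labels_a."""
    ids_a = sorted(set(labels_a))
    ids_b = sorted(set(labels_b))
    if len(ids_a) != len(ids_b):
        raise ValueError("both clusterings must use the same number of clusters")
    best_hits = 0
    for perm in permutations(ids_a):
        mapping = dict(zip(ids_b, perm))  # candidate relabeling of labels_b
        hits = sum(a == mapping[b] for a, b in zip(labels_a, labels_b))
        best_hits = max(best_hits, hits)
    return best_hits / len(labels_a)


# Hypothetical labels (assumption): same grouping except for the last sample,
# with the cluster identifiers swapped between the two algorithms.
kmeans_labels = [0, 0, 0, 1, 1, 1, 1, 1]
gmm_labels = [1, 1, 1, 0, 0, 0, 0, 1]
agreement = best_label_agreement(kmeans_labels, gmm_labels)
```

The exhaustive search over permutations is only practical for a small number of clusters; for larger cluster counts an assignment method such as the Hungarian algorithm is the usual choice.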
Future work will focus on advanced predictive analysis and
Machine Learning techniques to optimize the identification of
critical financial variables and improve operational efficiency.
Deep neural networks, real-time data processing, advanced
interactive dashboards, and the expansion of the model to other
business areas will be explored. A feedback loop system for
continuous improvement, explainable artificial intelligence
techniques to ensure transparency, and interdisciplinary
collaborations are also proposed to enhance financial
management and establish a solid foundation for the application
of AI and data analysis in the organization.
The performance of both supervised and unsupervised algorithms
in this study is inherently conditioned by the characteristics of the
dataset, which originates from a single manufacturing company
in Pereira. As such, the predictive models may not generalize
directly to other organizations without conducting similar
analyses to determine the most suitable algorithms for each
specific context. While the current approach effectively classifies
supplier types within the studied company, its applicability to
other environments requires contextual adaptation and possible
retraining of the models.
VI. ACKNOWLEDGMENTS
Thanks to the Master’s program in Systems and Computing
Engineering at the Universidad Tecnológica de Pereira, to the
Director of the Institutional Corporation for Administration and
Finance (CIAF), Luis Ariosto Serna, and to Ph.D. Julián David
Echeverry.
VII. REFERENCES
[1] Gonzalez Disla, R. R. (2013). Big data: El cambio en los paradigmas
de la información. ResearchGate.
https://www.researchgate.net/profile/Renato-GonzalezDisla/publication/311950584_BIG_DATA_El_Cambio_en_el_Paradigma_de_la_Informacion/links/58645e6208ae329d6203a9d5/BIG-DATA-El-Cambio-en-el-Paradigma-de-laInformacion.pdf
[2] Sablón, B., et al. (2019). Gestión de la información y toma de
decisiones en organizaciones educativas. Revista de Ciencias Sociales,
XXV(2), 120–130. https://doi.org/10.31876/rcs.v25i2.27341
[3] Amo Cubillo, A. (n.d.). Research data management. Universidad de
Valladolid. Retrieved August 25, 2024, from
http://uvadoc.uva.es/handle/10324/31269
[4] Chien, C.-F., Chang, Y.-J., & Wang, W.-C. (2018). AI and big data
analytics for wafer fab energy saving and chiller optimization to
empower intelligent manufacturing. IEEE Xplore.
https://ieeexplore.ieee.org/document/8374411