Model for Predicting Suppliers in the Financial Sector of the Vehicle Manufacturing Company in the City of Pereira


Authors

D. C. Romero Cardenas, A. Ospina Mejía, L. A. Serna Cardona

DOI: https://doi.org/10.22517/23447214.25692

Abstract

This work presents a model that uses data mining techniques to identify financial variables in a company in Pereira that manufactures automotive vehicle bodies. The study is structured in four phases. The first phase covers data preprocessing: characterization, normalization, and dimensionality reduction using PCA, Relief, and correlation analysis. The second phase applies unsupervised learning with K-means and Gaussian Mixture Models (GMM) to cluster the data and validate it against a defined target variable. The third phase uses supervised classifiers (a Bayesian classifier, artificial neural networks, support vector machines, and KNN) to predict supplier efficiency, in support of optimizing investment and costing processes. The fourth phase integrates preprocessing and prediction into a practical application, using the Plotly and Dash libraries for detailed visualizations and GitHub and Heroku for application development. The study highlights the importance of artificial intelligence in business decision-making and shows how data science techniques and visualization tools can make the results of data analysis easier to interpret and use.
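The first three phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, covers just min-max normalization, K-means, and KNN (PCA, Relief, GMM, and the other classifiers are omitted), and every feature name, record, and label below is hypothetical data invented for illustration.

```python
import math
import random

# Hypothetical supplier records (assumption, not from the study):
# [monthly purchase volume, average payment delay in days]
suppliers = [
    [120, 5], [135, 7], [128, 4], [140, 6],   # assumed "efficient"
    [60, 30], [55, 35], [70, 28], [65, 33],   # assumed "inefficient"
]
labels = ["efficient"] * 4 + ["inefficient"] * 4  # assumed target variable


def fit_min_max(rows):
    """Return per-column minimums and ranges for min-max normalization."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    # A constant column has range 0; use 1.0 to avoid dividing by zero.
    spans = [(max(col) - low) or 1.0 for col, low in zip(columns, lows)]
    return lows, spans


def scale(row, lows, spans):
    """Apply the fitted min-max scaling to one record."""
    return [(value - low) / span for value, low, span in zip(row, lows, spans)]


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's K-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        assignment = [
            min(range(k), key=lambda j: euclidean(p, centers[j])) for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return assignment


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted(zip(train_x, train_y), key=lambda t: euclidean(t[0], query))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


# Phase 1: normalize the features.
lows, spans = fit_min_max(suppliers)
X = [scale(row, lows, spans) for row in suppliers]

# Phase 2: cluster, then compare each cluster with the target variable.
clusters = kmeans(X, k=2)
cluster_labels = {
    c: {lab for lab, a in zip(labels, clusters) if a == c} for c in set(clusters)
}
print(cluster_labels)

# Phase 3: classify a new (hypothetical) supplier with KNN.
new_supplier = scale([130, 8], lows, spans)
print(knn_predict(X, labels, new_supplier))
```

If each cluster contains a single target label, the unsupervised grouping agrees with the target variable, which is the kind of check the second phase describes. New records are scaled with the parameters fitted on the training data so that the distances used by KNN stay comparable.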

Published

2025-04-02

How to Cite

Romero Cardenas, D. C., Ospina Mejía, A., & Serna Cardona, L. A. (2025). Model for Predicting Suppliers in the Financial Sector of the Vehicle Manufacturing Company in the City of Pereira. Scientia Et Technica, 30(01), 26–35. https://doi.org/10.22517/23447214.25692

Section

Sistemas y Computación (Systems and Computing)