WALKING THROUGH MILLIONS OF DIMENSIONS

Authors

  • Alan Martins da Cruz

DOI:

https://doi.org/10.56238/arev8n3-119

Keywords:

Machine Learning, Error Geometry, High Dimensionality, Optimization, Bayesian Inference, Sociotechnical Systems

Abstract

This work proposes a geometric interpretation of machine learning as a process of navigation through a high-dimensional parameter space whose relief is determined by the error function. Neural networks are analyzed as systems that explore an abstract optimization landscape, in which the architecture, the data, the training algorithms, and the sources of uncertainty shape the topology of learning. The approach draws connections between optimization dynamics, generalization capacity, and Bayesian inference, suggesting that these phenomena can be understood within a unified geometric framework. Beyond the technical domain, the paper examines how this perspective informs the interpretation, governance, and sociotechnical impact of artificial intelligence systems, offering an integrated conceptual language for analyzing their performance in human and computational contexts.
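To make the navigation metaphor concrete, the sketch below (not from the article; the toy error surface, step size, and noise scale are illustrative assumptions) shows stochastic gradient descent walking downhill on a two-parameter error landscape. In a real network the same walk takes place across millions of dimensions.

import numpy as np

rng = np.random.default_rng(0)

def error(w):
    # Toy non-convex error landscape over two parameters (x, y).
    x, y = w
    return (x**2 - 1)**2 + 0.5 * y**2 + 0.3 * np.sin(3 * x)

def grad(w, eps=1e-5):
    # Central-difference estimate of the local slope of the landscape.
    g = np.zeros_like(w)
    for i in range(len(w)):
        d = np.zeros_like(w)
        d[i] = eps
        g[i] = (error(w + d) - error(w - d)) / (2 * eps)
    return g

w = rng.normal(size=2)      # random starting point in parameter space
lr, noise = 0.05, 0.02      # step size and minibatch-like gradient noise
for step in range(200):
    g = grad(w) + noise * rng.normal(size=2)  # noisy gradient, as in SGD
    w = w - lr * g                            # one step downhill
print(f"final parameters {w}, error {error(w):.4f}")

The injected noise stands in for minibatch sampling, which is precisely the mechanism that links the optimization dynamics to the Bayesian perspective the abstract mentions.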

Published

2026-03-25

Issue

Vol. 8 No. 3 (2026)

Section

Articles

How to Cite

DA CRUZ, Alan Martins. WALKING THROUGH MILLIONS OF DIMENSIONS. ARACÊ, [S. l.], v. 8, n. 3, p. e12664, 2026. DOI: 10.56238/arev8n3-119. Available at: https://periodicos.newsciencepubl.com/arace/article/view/12664. Accessed: 29 Mar. 2026.