A new framework for training a CNN with a hardware-software architecture

dc.contributor.advisorCamargo Bareño, Carlos Ivan
dc.contributor.authorParra Prada, Dorfell Leonardo
dc.contributor.researchgroupGrupo de Física Nuclear de la Universidad Nacionalspa
dc.date.accessioned2023-08-14T15:43:03Z
dc.date.available2023-08-14T15:43:03Z
dc.date.issued2023-04
dc.descriptionillustrations, diagrams, color photographsspa
dc.description.abstractFacial Expression Recognition (FER) systems classify emotions by using geometrical approaches or Machine Learning (ML) algorithms such as Convolutional Neural Networks (CNNs). However, designing these systems can be challenging, since it depends on the data set's quality and the designer's expertise. Moreover, CNN inference requires a large amount of memory and computational resources, making it unfeasible for low-cost embedded systems. Hence, although GPUs are expensive and have high power consumption, they are frequently employed because they considerably reduce the inference time compared to CPUs. On the other hand, SoCs implemented on FPGAs can consume less power and support pipelining. However, floating-point representation may result in large, intricate designs that are only suitable for high-end FPGAs. Therefore, custom hardware-software architectures that maintain acceptable performance while using simpler data representations are advantageous. To address these challenges, this work proposes a design methodology for CNN-based FER systems. The methodology includes preprocessing, the Local Binary Pattern (LBP), and data augmentation. In addition, several CNN models were trained with TensorFlow and the JAFFE data set to validate the methodology. In each test, the relationship between parameters, layers, and performance was studied, as were the overfitting and underfitting scenarios. Furthermore, this work introduces the model M6, a single-channel CNN that reaches an accuracy of 94% in fewer than 30 epochs. M6 has 306,182 parameters in 1.17 MB. The work also employs the quantization methodology from TensorFlow Lite (tflite) to compute the inference of a CNN using integer numbers. After quantization, M6's accuracy dropped from 94.44% to 83.33%, the number of parameters increased from 306,182 to 306,652, and the model size decreased almost 4x, from 1.17 MB to 0.3 MB. The work also presents the FER SoC, a custom hardware-software architecture that accelerates CNNs by reproducing the main tflite operations in hardware. Hence, as the integer numbers are fully mapped to hardware registers, the accelerator's results are similar to their software counterparts. The architecture was tested on a Zybo-Z7 development board with 1 GB of RAM and the Zynq-7000 device XC7Z020-CLG400. It achieved the same accuracy as a laptop equipped with a 16-thread AMD CPU, 16 GB of RAM, and an Nvidia GTX1660Ti GPU, but was 20% slower. Therefore, it is recommended to assess whether the trade-off between quantization and inference time is worthwhile for the target application. Lastly, another contribution is Resiliency, a framework for training CNNs on custom hardware-software architectures. It has been used to train and run the inference of the single-channel M6 model. Resiliency provides the required design files as well as the Pynq 2.7 image created for running ML frameworks such as TensorFlow and PyTorch. Although the training time was slow, the accuracy and loss were consistent with traditional approaches. The execution time could be improved by using larger FPGAs with MPSoCs, such as the Zynq UltraScale+ family. (Text taken from the source)eng
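As an illustration of the tflite quantization flow described in the abstract, the following is a minimal Python sketch of full-integer post-training quantization. `model` and `rep_images` are hypothetical names for the trained Keras network (e.g., M6) and a representative sample of preprocessed training images; the exact converter settings used in the thesis are not specified here.

    # Minimal sketch (assumed setup): full-integer post-training quantization
    # with the TensorFlow Lite converter. `model` and `rep_images` are
    # hypothetical stand-ins, not names taken from the thesis.
    import tensorflow as tf

    def quantize_full_integer(model, rep_images):
        def representative_dataset():
            # The converter calibrates activation ranges from these samples.
            for img in rep_images[:100]:
                yield [img[None, ...].astype("float32")]

        converter = tf.lite.TFLiteConverter.from_keras_model(model)
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
        converter.representative_dataset = representative_dataset
        # Request integer-only kernels so inference needs no floating point.
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
        converter.inference_input_type = tf.int8
        converter.inference_output_type = tf.int8
        return converter.convert()  # serialized .tflite flatbuffer

Storing weights as int8 rather than float32 is what yields the roughly 4x size reduction reported for M6 (1.17 MB to 0.3 MB), while the added scales and zero points account for the small growth in parameter count.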
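To make concrete why integer numbers mapped to hardware registers yield results that track the software inference, here is a hedged numpy sketch of the integer-only multiply-accumulate with fixed-point rescaling that tflite specifies (after Jacob et al., cited in the references); the function and variable names are illustrative and not taken from the FER SoC design.

    # Hedged sketch of tflite-style integer-only arithmetic: a real value r is
    # represented as r = S * (q - Z), with scale S and zero point Z. A dense
    # layer (or an im2col'd convolution) then reduces to int32
    # multiply-accumulates plus one rescale.
    import numpy as np

    def quantized_dense(q_x, q_w, bias_i32, S_x, Z_x, S_w, S_y, Z_y):
        # int8 operands widened to int32 accumulators; tflite weights are
        # symmetric (zero point 0), so only the input offset is subtracted.
        acc = (q_x.astype(np.int32) - Z_x) @ q_w.astype(np.int32) + bias_i32
        # M = S_x * S_w / S_y is applied as a fixed-point multiply and shift
        # in hardware; a float is used here only for clarity.
        M = (S_x * S_w) / S_y
        q_y = np.round(M * acc).astype(np.int32) + Z_y
        return np.clip(q_y, -128, 127).astype(np.int8)

Because every operand is an integer with a deterministic rounding rule, a hardware implementation of the same operations can match the software results, which is the property the FER SoC relies on.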
dc.description.degreelevelDoctoradospa
dc.description.degreenameDoctor en Ingenieríaspa
dc.description.researchareaDigital design, embedded systems.spa
dc.description.sponsorshipN/Aspa
dc.format.extentxiii, 100 pagesspa
dc.format.mimetypeapplication/pdfspa
dc.identifier.instnameUniversidad Nacional de Colombiaspa
dc.identifier.reponameRepositorio Institucional Universidad Nacional de Colombiaspa
dc.identifier.repourlhttps://repositorio.unal.edu.co/spa
dc.identifier.urihttps://repositorio.unal.edu.co/handle/unal/84550
dc.language.isoengspa
dc.publisherUniversidad Nacional de Colombiaspa
dc.publisher.branchUniversidad Nacional de Colombia - Sede Bogotáspa
dc.publisher.facultyFacultad de Ingenieríaspa
dc.publisher.placeBogotá, Colombiaspa
dc.publisher.programBogotá - Ingeniería - Doctorado en Ingeniería - Ingeniería Eléctricaspa
dc.relation.referencesY. Tian, T. Kanade, and J. Cohn, “Recognizing action units for facial expression analysis,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 2, pp. 97–115, 2001.spa
dc.relation.referencesM. Lyons, M. Kamachi, and J. Gyoba, “Coding facial expressions with gabor wavelets (ivc special issue),” modified version of a conference article, invited for publication in a special issue of Image and Vision Computing dedicated to a selection of articles from the IEEE Face and Gesture 1998 conference; the special issue never materialized, 2020. [Online]. Available: https://zenodo.org/record/4029680spa
dc.relation.referencesK. Anas, “Facial expression recognition in jaffe database,” https://github.com/anas-899/facial-expression-recognition-Jaffe, last accessed 12 Dec 2021.spa
dc.relation.referencesC. Sagonas, G. Tzimiropoulos, S. Zafeiriou, and M. Pantic, “300 faces in-the-wild challenge: The first facial landmark localization challenge,” Proceedings of IEEE Int’l Conf. on Computer Vision (ICCV-W), 2013. [Online]. Available: https://ibug.doc.ic.ac.uk/resources/facial-point-annotations/spa
dc.relation.referencesB. Yang, J. Cao, R. Ni, and Y. Zhang, “Facial expression recognition using weighted mixture deep neural network based on double-channel facial images,” IEEE Access, vol. 6, pp. 4630–4640, 2018.spa
dc.relation.referencesJ. Kim, B. Kim, P. Roy, and D. Jeong, “Efficient facial expression recognition algorithm based on hierarchical deep neural network structure,” IEEE Access, vol. 7, pp. 41273–41285, 2019.spa
dc.relation.referencesDigilent, “Zybo z7,” https://digilent.com/reference/programmable-logic/zybo-z7/start, last accessed 28 Jan 2022.spa
dc.relation.references——, “Zybo z7 reference manual,” https://digilent.com/reference/programmable-logic/zybo-z7/ reference-manual, last accessed 28 Jan 2022.spa
dc.relation.referencesXilinx, “7 series fpgas configurable logic block,” https://www.xilinx.com/support/documentation/user_guides/ug474_7Series_CLB.pdf, last accessed 28 Jan 2022.spa
dc.relation.references——, “7 series fpgas memory resources,” https://www.xilinx.com/support/documentation/user guides/ ug473 7Series Memory Resources.pdf, last accessed 28 Jan 2022.spa
dc.relation.references——, “7 series dsp48e1 slice,” https://www.xilinx.com/support/documentation/user guides/ug479 7Series DSP48E1.pdf, last accessed 28 Jan 2022.spa
dc.relation.referencesAMD-Xilinx, “Pynq: Python productivity,” http://www.pynq.io/, last accessed 19 May 2022.spa
dc.relation.referencesB. Jacob, S. Kligys, B. Chen, M. Zhu, M. Tang, A. Howard, H. Adam, and D. Kalenichenko, “Quantization and training of neural networks for efficient integer-arithmetic-only inference,” Google Inc., pp. 1–14, December 2017.spa
dc.relation.referencesS. Maloney, “Survey: Implementing dense neural networks in hardware,” https://pdfs.semanticscholar.org/b709/459d8b52783f58f1c118619ec42f3b10e952.pdf, March 2013, last accessed 15 Feb 2018.spa
dc.relation.references“Face image analysis with convolutional neural networks,” https://lmb.informatik.uni-freiburg.de/papers/download/du_diss.pdf, last accessed 15 Feb 2018.spa
dc.relation.referencesZ. Saidane, Image and video text recognition using convolutional neural networks: Study of new CNNs architectures for binarization, segmentation and recognition of text images. LAP LAMBERT Academic Publishing, 2011.spa
dc.relation.referencesJ. Misra and I. Saha, “Artificial neural networks in hardware: A survey of two decades of progress,” Neurocomputing, vol. 74, no. 1–3, pp. 239–255, December 2010.spa
dc.relation.referencesV. Bettadapura, “Face expression recognition and analysis: The state of the art,” CoRR, vol. abs/1203.6722, pp. 1–27, 2012.spa
dc.relation.referencesM. Z. Uddin, W. Khaksar, and J. Torresen, “Facial expression recognition using salient features and convolutional neural network,” IEEE Access, vol. 5, pp. 26146–26161, 2017.spa
dc.relation.referencesS. Xie and H. Hu, “Facial expression recognition using hierarchical features with deep comprehensive multipatches aggregation convolutional neural networks,” IEEE Transactions on Multimedia, vol. 21, no. 1, pp. 211–220, 2019.spa
dc.relation.referencesC. Zhang, P. Wang, K. Chen, and J. Kamarainen, “Identity-aware convolutional neural networks for facial expression recognition,” Journal of Systems Engineering and Electronics, vol. 28, no. 4, pp. 784–792, 2017.spa
dc.relation.referencesP. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar, and I. Matthews, “The extended cohn-kanade dataset (ck+): A complete dataset for action unit and emotion-specified expression,” IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops, pp. 94–101, 2010.spa
dc.relation.referencesM. Lyons, M. Kamachi, and J. Gyoba, “The japanese female facial expression (jaffe) dataset,” https://doi.org/10.5281/zenodo.3451524, last accessed 06 Dec 2021.spa
dc.relation.referencesP. Viola and M. J. Jones, “Robust real-time object detection,” International Journal of Computer Vision, vol. 57, no. 2, pp. 137–154, December 2004.spa
dc.relation.referencesD. E. King, “Dlib-models,” https://github.com/davisking/dlib-models/, last accessed 12 Dec 2021.spa
dc.relation.referencesD. E. King, “Dlib-ml: A machine learning toolkit,” Journal of Machine Learning Research, vol. 10, pp. 1755–1758, 2009.spa
dc.relation.referencesItseez, “Open source computer vision library,” https://github.com/itseez/opencv, last accessed 12 Dec 2021.spa
dc.relation.referencesS. van der Walt, J. L. Schönberger, J. Nunez-Iglesias, F. Boulogne, J. D. Warner, N. Yager, E. Gouillart, T. Yu, and the scikit-image contributors, “scikit-image: image processing in Python,” PeerJ, vol. 2, p. e453, June 2014. [Online]. Available: https://doi.org/10.7717/peerj.453spa
dc.relation.referencesskimage, “local_binary_pattern,” https://scikit-image.org/docs/stable/api/skimage.feature.html?highlight=local_binary_pattern#skimage.feature.local_binary_pattern, last accessed 17 Dec 2021.spa
dc.relation.referencesA. Buslaev, V. I. Iglovikov, E. Khvedchenya, A. Parinov, M. Druzhinin, and A. A. Kalinin, “Albumentations: Fast and flexible image augmentations,” Information, vol. 11, no. 2, 2020. [Online]. Available: https://www.mdpi.com/2078-2489/11/2/125spa
dc.relation.references“Tensorflow: An open-source software library for machine intelligence,” https://www.tensorflow.org/, last accessed 15 Feb 2018.spa
dc.relation.referencestensorflow, “Quantization aware training,” https://blog.tensorflow.org/2020/04/quantization-aware-training-with-tensorflow-model-optimization-toolkit.html, last accessed 28 Jan 2022.spa
dc.relation.references——, “Tensorflow lite 8-bit quantization specification,” https://www.tensorflow.org/lite/performance/ quantization spec, last accessed 28 Jan 2022.spa
dc.relation.referencesXilinx, “Field programmable gate array (fpga),” https://www.xilinx.com/products/silicon-devices/fpga/what-is-an-fpga.html, last accessed 28 Jan 2022.spa
dc.relation.referencesTensorFlow, “Transfer learning with tensorflow hub,” https://www.tensorflow.org/tutorials/images/transfer_learning_with_hub, last accessed 17 May 2022.spa
dc.relation.referencesAmazon, “Alexa,” https://www.amazon.com/b?node=21576558011, last accessed 17 May 2022.spa
dc.relation.referencesGoogle, “Hey google,” https://assistant.google.com/, last accessed 17 May 2022.spa
dc.relation.referencesAnki, “Vector by anki: A giant roll forward for robot kind,” https://www.kickstarter.com/projects/anki/vector-by-anki-a-giant-roll-forward-for-robot-kind, last accessed 17 May 2022.spa
dc.relation.referencesD. D. Labs, “Vector 2.0,” https://www.digitaldreamlabs.com/products/vector-robot, last accessed 17 May 2022.spa
dc.relation.referencesL. AI, “Emo: The coolest ai desktop pet with personality and ideas,” https://living.ai/emo/, last accessed 17 May 2022.spa
dc.relation.referencesGoogle, “Coral,” https://coral.ai/, last accessed 17 May 2022.spa
dc.relation.referencesIntel, “Intel neural compute stick 2 (intel ncs2),” https://www.intel.com/content/www/us/en/developer/tools/neural-compute-stick/overview.html, last accessed 17 May 2022.spa
dc.relation.referencesAMD-Xilinx, “Vitis ai,” https://www.xilinx.com/products/design-tools/vitis/vitis-ai.html, last accessed 19 May 2022.spa
dc.relation.references——, “Xilinx alveo,” https://www.xilinx.com/products/boards-and-kits/alveo.html, last accessed 19 May 2022.spa
dc.relation.references——, “Board support package settings page,” https://docs.xilinx.com/r/en-US/ug1400-vitis-embedded/ Board-Support-Package-Settings-Page, last accessed 29 May 2022.spa
dc.relation.references——, “Dpu on pynq,” https://github.com/Xilinx/DPU-PYNQ, last accessed 29 May 2022.spa
dc.relation.referencesDorfell, “Pynq 2.7 for zybo-z7,” https://discuss.pynq.io/t/pynq-2-7-for-zybo-z7/4124, last accessed 02 Jun 2022.spa
dc.relation.referencesAMD-Xilinx, “Retargeting to a different board,” https://pynq.readthedocs.io/en/latest/pynq sd card. html#retargeting-to-a-different-board, last accessed 31 May 2022.spa
dc.relation.referencesLogictronix, “Installing tensorflow in pynq,” https://logictronix.com/wp-content/uploads/2019/04/TensorFlow_Installation_on_PYNQ_Nov_6_2018.pdf, last accessed 03 Jun 2022.spa
dc.relation.referencesK. Hyodo, “Tensorflow-bin previous versions,” https://github.com/PINTO0309/Tensorflow-bin/tree/main/previous_versions, last accessed 03 Jun 2022.spa
dc.relation.referencesY. Shen, T. Ji, M. Ferdman, and P. Milder, “Argus: An end-to-end framework for accelerating cnns on fpgas,” IEEE Micro, vol. 39, no. 5, pp. 17–25, 2019.spa
dc.relation.referencesS. Sabogal, A. George, and G. Crum, “Recon: A reconfigurable cnn acceleration framework for hybrid semantic segmentation on hybrid socs for space applications,” in 2019 IEEE Space Computing Conference (SCC), 2019, pp. 41–52.spa
dc.relation.referencesS. Mouselinos, V. Leon, S. Xydis, D. Soudris, and K. Pekmestzi, “Tf2fpga: A framework for projecting and accelerating tensorflow cnns on fpga platforms,” in 2019 8th International Conference on Modern Circuits and Systems Technologies (MOCAST), 2019, pp. 1–4.spa
dc.relation.referencesJ. Zhu, L. Wang, H. Liu, S. Tian, Q. Deng, and J. Li, “An efficient task assignment framework to accelerate dpu-based convolutional neural network inference on fpgas,” IEEE Access, vol. 8, pp. 83224–83237, 2020.spa
dc.relation.referencesY. Liang, L. Lu, and J. Xie, “Omni: A framework for integrating hardware and software optimizations for sparse cnns,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 40, no. 8, pp. 1648–1661, 2021.spa
dc.relation.referencesXilinx, “Zynq ultrascale+ mpsoc,” https://www.xilinx.com/products/silicon-devices/soc/zynq-ultrascale-mpsoc.html, last accessed 12 Sep 2022.spa
dc.relation.referencesC. Zhang, P. Li, G. Sun, Y. Guan, B. Xiao, and J. Cong, “Optimizing fpga-based accelerator design for deep convolutional neural networks.” Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - FPGA’15, February 2015, pp. 161–170.spa
dc.relation.referencesS. I. Venieris and C. S. Bouganis, “Fpgaconvnet: A framework for mapping convolutional neural networks on fpgas.” Proceedings - 24th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2016, May 2016, pp. 40–47.spa
dc.relation.referencesA. Dundar, J. Jin, B. Martini, and E. Culurciello, “Embedded streaming deep neural networks accelerator with applications,” IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 7, pp. 1572–1583, July 2017.spa
dc.relation.referencesN. Li, S. Takaki, Y. Tomioka, and H. Kitazawa, “A multistage dataflow implementation of a deep convolutional neural network based on fpga for high-speed object recognition.” 2016 IEEE Southwest Symposium On Image Analysis and Interpretation (SSIAI), 2016, pp. 165–168.spa
dc.relation.references“Caffe: Deep learning framework,” http://caffe.berkeleyvision.org/, last accessed 15 Feb 2018.spa
dc.relation.references“Mathworks: Matlab,” https://www.mathworks.com/products/matlab.html, last accessed 15 Feb 2018.spa
dc.relation.references“Microsoft cognitive toolkit,” https://www.microsoft.com/en-us/cognitive-toolkit/, last accessed 15 Feb 2018.spa
dc.relation.referencesT. Chen, Z. Du, N. Sun, J. Wang, C. Wu, Y. Chen, and O. Temam, “Diannao: a small-footprint high-throughput accelerator for ubiquitous machine-learning.” Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS’14, March 2014, pp. 269–284.spa
dc.relation.referencesY. Zhou and J. Jiang, “An fpga-based accelerator implementation for deep convolutional neural networks.” 4th International Conference on Computer Science and Network Technology (ICCSNT), December 2015, pp. 829–832.spa
dc.relation.referencesY. Murakami, “Fpga implementation of a simd-based array processor with torus interconnect.” 2015 International Conference on Field Programmable Technology, FPT 2015, May 2015, pp. 244–247.spa
dc.relation.referencesB. Kitchenham, O. P. Brereton, D. Budgen, M. Turner, J. Bailey, and S. Linkman, “Systematic literature reviews in software engineering - a systematic literature review,” Information and Software Technology, vol. 51, no. 1, pp. 7–15, November 2008.spa
dc.relation.referencesB. Kitchenham, R. Pretorius, D. Budgen, O. P. Brereton, M. Turner, M. Niazi, and S. Linkman, “Systematic literature reviews in software engineering-a tertiary study,” Information and Software Technology, August 2010.spa
dc.relation.referencesA. Krizhevsky, “One weird trick for parallelizing convolutional neural networks,” https://arxiv.org/abs/1404.5997, April 2014, last accessed 15 Feb 2018.spa
dc.relation.referencesS. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran, B. Catanzaro, and E. Shelhamer, “cudnn: Efficient primitives for deep learning,” https://arxiv.org/abs/1410.0759, December 2014, last accessed 15 Feb 2018.spa
dc.relation.referencesF. Ortega-Zamorano, J. M. Jerez, D. U. Munoz, R. M. Luque-Baena, and L. Franco, “Efficient implementation of the backpropagation algorithm in fpgas and microcontrollers,” IEEE Transactions on Neural Networks and Learning Systems, vol. 27, no. 9, pp. 1840–1850, August 2016.spa
dc.relation.referencesC. Farabet, B. Martini, B. Corda, P. Akselrod, E. Culurciello, and Y. Lecun, “Neuflow: A runtime reconfigurable dataflow processor for vision.” IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, June 2011, pp. 109–116.spa
dc.relation.referencesM. R. D. Abdu-Aljabar, “Design and implementation of neural network in fpga,” Journal of Engineering and Development, vol. 16, no. 3, September 2012.spa
dc.relation.referencesG. H. Shakoory, “Fpga implementation of multilayer perceptron for speech recognition,” Journal of Engineering and Development, vol. 17, no. 6, December 2013.spa
dc.relation.referencesE. Z. Mohammed and H. K. Ali, “Hardware implementation of artificial neural network using field programmable gate array,” International Journal of Computer Theory and Engineering, vol. 5, no. 5, October 2013.spa
dc.relation.referencesS. Singh, S. Sanjeevi, S. V., and A. Talashi, “Fpga implementation of a trained neural network,” IOSR Journal of Electronics and Communication Engineering (IOSR-JECE), vol. 10, no. 3, May-June 2015.spa
dc.relation.referencesZ. Du, R. Fasthuber, T. Chen, P. Ienne, L. Li, T. Luo, X. Feng, Y. Chen, and O. Temam, “Shidiannao: Shifting vision processing closer to the sensor.” Proceedings of the 42nd Annual International Symposium on Computer Architecture-ISCA’15, June 2015, pp. 92–104.spa
dc.relation.referencesM. Motamedi, P. Gysel, V. Akella, and S. Ghiasi, “Design space exploration of fpga-based deep convolutional neural networks.” 21st Asia and South Pacific Design Automation Conference, 2016, pp. 575–580.spa
dc.relation.referencesL. B. Saldanha and C. Bobda, “Sparsely connected neural networks in fpga for handwritten digit recognition.” Proceedings - International Symposium on Quality Electronic Design (ISQED), May 2016, pp. 113–117.spa
dc.relation.referencesY. Wang, L. Xia, T. Tang, B. Li, S. Yao, M. Cheng, and H. Yang, “Low power convolutional neural networks on a chip,” no. 1. IEEE International Symposium on Computer Architecture, April 2016, pp. 129–132.spa
dc.relation.referencesC. Kyrkou, C. S. Bouganis, T. Theocharides, and M. M. Polycarpou, “Embedded hardware-efficient real- time classification with cascade support vector machines,” IEEE Transactions on Neural Networks and Learning Systems, vol. 27, no. 1, January 2016.spa
dc.relation.referencesT. Luo, S. Liu, L. Li, Y. Wang, S. Zhang, T. Chen, Z. Xu, O. Temam, and Y. Chen, “Dadiannao: A neural network supercomputer,” IEEE Transactions on Computers, vol. 66, no. 1, pp. 73–88, January 2017.spa
dc.rights.accessrightsinfo:eu-repo/semantics/openAccessspa
dc.rights.licenseReconocimiento 4.0 Internacionalspa
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/spa
dc.subject.ddc620 - Ingeniería y operaciones afines::629 - Otras ramas de la ingenieríaspa
dc.subject.lembComputadores neuronalesspa
dc.subject.lembNeural computerseng
dc.subject.lembSupercomputadoresspa
dc.subject.lembSupercomputerseng
dc.subject.proposalFEReng
dc.subject.proposalCNNeng
dc.subject.proposalFPGAeng
dc.subject.proposalHNNeng
dc.titleA new framework for training a CNN with a hardware-software architectureeng
dc.title.translatedNuevo framework para el entrenamiento de CNN usando una arquitectura hardware-softwarespa
dc.typeTrabajo de grado - Doctoradospa
dc.type.coarhttp://purl.org/coar/resource_type/c_db06spa
dc.type.coarversionhttp://purl.org/coar/version/c_ab4af688f83e57aaspa
dc.type.contentTextspa
dc.type.driverinfo:eu-repo/semantics/doctoralThesisspa
dc.type.redcolhttp://purl.org/redcol/resource_type/TDspa
dc.type.versioninfo:eu-repo/semantics/acceptedVersionspa
dcterms.audience.professionaldevelopmentStudentsspa
dcterms.audience.professionaldevelopmentResearchersspa
oaire.accessrightshttp://purl.org/coar/access_right/c_abf2spa
oaire.awardtitleN/Aspa
oaire.fundernameN/Aspa

Files

Original bundle

Name: 1098679415.2023.pdf
Size: 16.75 MB
Format: Adobe Portable Document Format
Description: Doctoral thesis in Engineering - Electrical Engineering
