IoT Device for Preventing Gender-Based Violence Using TinyML


Mónica Tamara Avila Rodríguez
Elsa Marina Quizhpe Buñay
Wilson Gustavo Chango Sailema
Stalin Arciniegas

Abstract

 


This study develops a solution based on the Internet of Things (IoT) and machine learning to detect and prevent dangerous situations related to gender-based violence (GBV). The goal is to provide a useful and accessible tool for women at risk, thereby contributing to the prevention and reduction of GBV.


The problem addressed is gender-based violence, an issue of great social and humanitarian relevance. The study seeks to use digital technologies and machine learning to detect words associated with dangerous situations and so help prevent GBV in real time.


To address the problem, the study uses a public dataset created by Microsoft that contains audio samples of different words. The words "yes" and "no" are treated as the words associated with dangerous situations; the dataset also includes other words and static noise.


The audio data is in WAV format, sampled at 16000 Hz and divided into uniform one-second windows. Mel-frequency cepstral coefficients (MFCC) are extracted from each window to highlight the human voice and reduce background noise.
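As an illustration only, the windowing step described above can be sketched in a few lines of Python. The function names `split_into_windows` and `hz_to_mel` are hypothetical and are not taken from the study. The zero-padding of a trailing partial window is an assumption, and the mel conversion shown is the commonly used formula underlying MFCC extraction, not necessarily the exact variant the authors applied.

```python
import math

SAMPLE_RATE_HZ = 16000  # sampling rate stated in the abstract
WINDOW_SECONDS = 1.0    # window duration stated in the abstract


def split_into_windows(samples, sample_rate=SAMPLE_RATE_HZ, seconds=WINDOW_SECONDS):
    """Split samples into consecutive, non-overlapping windows of equal length.

    Assumption: a trailing partial window is zero-padded so that every
    window has the same number of samples.
    """
    size = int(sample_rate * seconds)
    windows = []
    for start in range(0, len(samples), size):
        chunk = list(samples[start:start + size])
        chunk.extend([0.0] * (size - len(chunk)))  # pad the last window if short
        windows.append(chunk)
    return windows


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


# Synthetic 2.5-second, 440 Hz tone standing in for a WAV recording.
tone = [
    math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ)
    for n in range(int(2.5 * SAMPLE_RATE_HZ))
]
windows = split_into_windows(tone)
print(len(windows), "windows of", len(windows[0]), "samples each")
print("Nyquist frequency in mel:", round(hz_to_mel(SAMPLE_RATE_HZ / 2)))
```

Each resulting window would then be passed to an MFCC extractor; the mel conversion indicates how the filter bank spaces its bands more densely at the low frequencies where speech energy is concentrated.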


The developed model showed good overall performance, with an average efficiency of 91.3% on the training set and 85.83% on the evaluation set. High accuracy was obtained in the detection of the words associated with dangerous situations, "yes" and "no". The study recognizes that technology has a significant role to play in addressing GBV, but also emphasizes the need for a commitment from society and governments to achieve lasting and significant change in eradicating this problem worldwide.


Article Details

How to Cite
Avila Rodríguez, M. T., Quizhpe Buñay, E. M., Chango Sailema, W. G., & Arciniegas, S. (2023). Dispositivo IoT para prevenir la violencia de género usando TinyML. AXIOMA, 1(29), 77-84. https://doi.org/10.26621/ra.v1i29.920
Section
TECNOLOGÍAS DE LA INFORMACIÓN Y COMUNICACIÓN (TIC)
