Revista de Estudios Sociales

rev. estud. soc. | eISSN 1900-5180 | ISSN 0123-885X

Gender Semantics and Historical Feminisms: An Interdisciplinary Approach through Natural Language Processing

No. 93 (2025-07-25)
  • Laura Manrique-Gómez
    Universidad de los Andes, Colombia
    ORCID iD: https://orcid.org/0000-0003-0843-8157
  • Tony Montes
    Universidad de los Andes, Colombia
    ORCID iD: https://orcid.org/0009-0002-9012-8384
  • Rubén Manrique
    Universidad de los Andes, Colombia
    ORCID iD: https://orcid.org/0000-0001-8742-2094

Abstract

This article explores the evolution of gender semantics in 19th-century Latin America, focusing on the semantic nuances of the word women. The study’s primary aim is to present the results of an interdisciplinary methodology that reveals historical gender concepts embedded in language, analyzed through the dual lens of social sciences and artificial intelligence. This study uses Machine Learning techniques to analyze historical texts, specifically employing Natural Language Processing to detect semantic shifts, part-of-speech tagging, and named entity recognition to identify key gender vocabulary. Additionally, an n-gram approach was employed to recognize the most frequent terms associated with target words. The methodology was applied to a self-collected historical corpus from Latin American Spanish newspapers, demonstrating the effectiveness of these technologies in processing extensive collections of written sources. The article reveals how artificial intelligence tools can elucidate underlying gender ideas in historical written texts, offering empirical insights into the historical inequalities in linguistic representation. By comparing the newspaper dataset results with a specific literary work, the Colombian novel Manuela by Eugenio Díaz Castro (1859), the research highlights latent feminist tensions and revolutionary ideas contesting societal norms. The article does not provide a definitive historical or literary analysis. Instead, it invites social scientists to engage in the next phase of research by illustrating the potential of artificial intelligence in enhancing interpretability and critical analysis of historical narratives through interdisciplinary collaboration. This article contributes to historical feminist debates by presenting an original framework synthesizing qualitative and computational methods, open datasets, and code, thus expanding the possibilities of traditional historiography. It concludes with reflections on improving gender semantic studies, emphasizing how this integration of disciplines can propel future research directions in critical cultural inquiries.

Keywords: artificial intelligence, gender semantics, historical feminisms, machine learning, natural language processing, 19th-century Latin America

References

Primary Sources

Acosta de Samper, Soledad. 1878. Prologue to La Mujer. Revista Quincenal n.° 1, September 1. Bogotá: Biblioteca Nacional de Colombia (BNC). Hemeroteca Digital. https://bnco.ent.sirsi.net/custom/web/content/conservacion/html/visorFicheros.html?idFichero=88932

Colombia Ilustrada. 1891. Year 1, January 31. Bogotá: Biblioteca Nacional de Colombia (BNC). Hemeroteca.

Díaz Castro, Eugenio. 1889. Manuela: novela de costumbres colombianas. Biblioteca Virtual Miguel de Cervantes. https://www.cervantesvirtual.com/obra-visor/manuela-novela-de-costumbres-colombianas-tomo-primero--0/html/ff1e97e4-82b1-11df-acc7-002185ce6064_2.html

Díaz Castro, Eugenio. (1859) 2011. Manuela: novela bogotana. Edited by Flor María Rodríguez-Arenas. Miami: Stockero.

Secondary Sources

Alzate, Carolina. 2015. Soledad Acosta de Samper y el discurso letrado de género: 1853-1881. Madrid; Frankfurt: Iberoamericana Vervuert.

Blei, David M., Andrew Y. Ng, and Michael I. Jordan. 2003. “Latent Dirichlet Allocation.” Journal of Machine Learning Research 3: 993-1022. https://dl.acm.org/doi/10.5555/944919.944937

Castro Carvajal, Beatriz. 2014. “La escritura de las monjas francesas viajeras en el siglo XIX.” Anuario Colombiano de Historia Social y de la Cultura 41 (1): 91-126. https://doi.org/10.15446/achsc.v41n1.44765

Clark, Emily Joy. 2014. “The Caged Bird and the Female Writer: A Recurring Metaphor in Women’s Hispanic Prose from the Mid-Nineteenth Century.” Letras Femeninas 40 (2): 199-215. https://doi.org/10.2307/44733729

Cañete, José. 2019. “Compilation of Large Spanish Unannotated Corpora.” Zenodo. https://doi.org/10.5281/zenodo.3247731

Cañete, José, Gabriel Chaperon, Rodrigo Fuentes, Jou-Hui Ho, Hojin Kang, and Jorge Pérez. 2020. “Spanish Pre-Trained BERT Model and Evaluation Data.” Paper presented at the Practical ML for Developing Countries (PML4DC) Workshop, ICLR 2020, online. https://doi.org/10.48550/arxiv.2308.02976

Davies, Catherine, Claire Brewster, and Hilary Owen. 2006. South American Independence: Gender, Politics, Text. Liverpool: Liverpool University Press.

Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171-4186. Minneapolis, Minnesota: Association for Computational Linguistics.

Jockers, Matthew, and Gabi Kirilloff. 2016. “Understanding Gender and Character Agency in the 19th Century Novel.” Journal of Cultural Analytics 2 (2): 1-26. https://doi.org/10.22148/16.010

Lux, Martha, and María Cristina Pérez Pérez. 2020. “Los estudios de historia y género en América Latina.” Historia Crítica 77: 3-33. https://doi.org/10.7440/histcrit77.2020.01

Manrique-Gómez, Laura, Tony Montes, Arturo Rodríguez-Herrera, and Rubén Manrique. 2024. “Historical Ink: 19th Century Latin American Spanish Newspaper Corpus with LLM OCR Correction.” In Proceedings of the 4th International Conference on Natural Language for Digital Humanities, 132-139. November, Miami, United States. https://doi.org/10.18653/v1/2024.nlp4dh-1.13

Mehmood, Ayaz, Muhammad Tayyab Zamir, Muhammad Asif Ayub, Nasir Ahmad, and Kashif Ahmad. 2024. “A Named Entity Recognition and Topic Modeling-based Solution for Locating and Better Assessment of Natural Disasters in Social Media.” arXiv:2405.00903. https://doi.org/10.48550/arxiv.2405.00903

Miseres, Vanesa. 2017. Mujeres en tránsito: viaje, identidad y escritura en Sudamérica (1830-1910). Chapel Hill: University of North Carolina Press.

Montes, Tony, Laura Manrique-Gómez, and Rubén Manrique. 2024. “Historical Ink: Semantic Shift Detection for 19th Century Spanish.” In Proceedings of the 5th Workshop on Computational Approaches to Historical Language Change, 29-41. August, Bangkok, Thailand. https://doi.org/10.18653/v1/2024.lchange-1.4

Montoya Upegui, Laura. 2023. “Balance sobre teorías feministas: un cuestionamiento a la noción de ‘mujer’ como sujeto de análisis.” Comprehensive doctoral essay, Universidad de los Andes, Colombia.

Porto-Dapena, José-Álvaro. 1975. “En torno a las entradas del ‘diccionario’ de Rufino José Cuervo.” Boletín del Instituto Caro y Cuervo 30 (1): 113-152. https://thesaurus.caroycuervo.gov.co/index.php/rth/article/view/1597

Resnik, Philip. 2024. “Large Language Models are Biased Because They Are Large Language Models.” arXiv:2406.13138. https://doi.org/10.48550/arxiv.2406.13138

Rodríguez-Arenas, Flor María. 2011. “Manuela. Novela Bogotana (1859) de Eugenio Díaz Castro: La ideología y el realismo de medio siglo.” Preface to Manuela: novela bogotana. Edited by Flor María Rodríguez-Arenas. Miami: Stockero.

Schlechtweg, Dominik, Nina Tahmasebi, Simon Hengchen, Haim Dubossarsky, and Barbara McGillivray. 2021. “DWUG: A large Resource of Diachronic Word Usage Graphs in Four Languages.” In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 7079-7091. November, online and Punta Cana, Dominican Republic. https://aclanthology.org/2021.emnlp-main.567

Schopf, Tim, Simon Klimek, and Florian Matthes. 2022. “PatternRank: Leveraging Pretrained Language Models and Part of Speech for Unsupervised Keyphrase Extraction.” In Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management–KDIR, 243-248. October 24-26, Valletta, Malta. https://doi.org/10.48550/arxiv.2210.05245

Scott, Joan Wallach. 1999. Gender and the Politics of History. New York: Columbia University Press.

Skinner, Lee. 2016. Gender and the Rhetoric of Modernity in Spanish America: 1850-1910. Gainesville: University Press of Florida.

Taylor, Barbara. 2024. “History, Feminism and the Feeling Woman.” History Workshop Journal 98: 125-134. https://doi.org/10.1093/hwj/dbae027

Van der Maaten, Laurens, and Geoffrey Hinton. 2008. “Visualizing Data using t-SNE.” Journal of Machine Learning Research 9 (86): 2579-2605. http://jmlr.org/papers/v9/vandermaaten08a.html

Williams, Claudette. 2008. “Cuban Anti-Slavery Narrative through Postcolonial Eyes: Gertrudis Gómez de Avellaneda’s Sab.” Bulletin of Latin American Research 27 (2): 155-175. https://doi.org/10.1111/j.1470-9856.2008.00261.x

License

Copyright (c) 2025 Laura Manrique-Gómez, Tony Montes, Rubén Manrique

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.