Research Outputs

  • Publication
    Teach me to play, gamer! Imitative learning in computer games via linguistic description of complex phenomena and decision trees
    (Soft Computing, Springer Link, 2023)
    Clemente Rubio-Manzano
    ;
    Lermanda, Tomás
    ;
    Christian Vidal & Alejandra Segura
    In this article, we present a new machine learning model by imitation based on the linguistic description of complex phenomena. The idea consists of, first, capturing the behaviour of human players by creating a computational perception network based on the execution traces of the games and, second, representing it using fuzzy logic (linguistic variables and if-then rules). From this knowledge, a set of data (dataset) is automatically created to generate a learning model based on decision trees. This model will be used later to automatically control the movements of a bot. The result is an artificial agent that mimics the human player. We have implemented, tested and evaluated this technology from two different points of view: performance by using classical metrics (accuracy, ROC area and PRC area) and believability by using a Turing test for trained bots. The results obtained are interesting and promising, showing that this method can be a good alternative to design and implement the behaviour of intelligent agents in video game development.
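The step from execution traces to a labelled dataset can be pictured with a minimal sketch. It assumes a single hypothetical linguistic variable ("distance to enemy") with triangular fuzzy sets; the term names, breakpoints, and trace format are illustrative assumptions, not the authors' perception network, and the decision-tree training that would consume the rows is left out.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical linguistic variable: distance to the nearest enemy, in game units.
DISTANCE_TERMS = {
    "near": (-50, 0, 50),
    "medium": (25, 75, 125),
    "far": (100, 150, 200),
}

def describe(distance):
    """Return the linguistic term with the highest membership degree."""
    degrees = {term: triangular(distance, *abc) for term, abc in DISTANCE_TERMS.items()}
    return max(degrees, key=degrees.get)

def to_dataset(trace):
    """Turn (distance, action) trace entries into rows a tree learner could use."""
    return [{"distance": describe(d), "action": action} for d, action in trace]

# Hypothetical execution trace of a human player.
rows = to_dataset([(10, "attack"), (80, "approach"), (140, "explore")])
print(rows)
```

Each row pairs a linguistic description of the game state with the action the human chose, which is the kind of table a decision-tree learner expects.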
  • Publication
    Detecting aggressiveness in tweets: A hybrid model for detecting cyberbullying in the Spanish language
    (MDPI, 2021)
    Lepe-Faúndez, Manuel
    ;
    Segura-Navarrete, Alejandra
    ;
    Vidal-Castro, Christian
    ;
    Rubio-Manzano, Clemente
    In recent years, the use of social networks has increased exponentially, which has led to a significant increase in cyberbullying. In Computer Science, research has been conducted on how to detect aggressiveness in texts, which is a prelude to detecting cyberbullying. Most of this work targets English language texts, mainly using Machine Learning (ML) approaches, Lexicon approaches to a lesser extent, and hybrid approaches in very few cases. Hybrid approaches combine Lexicons and ML algorithms, for example by counting the number of bad words in a sentence using a Lexicon of bad words and using that count as an input feature for classification algorithms. This research aims to contribute to detecting aggressiveness in Spanish language texts by creating different models that combine the Lexicon and ML approaches. Twenty-two models that combine techniques and algorithms from both approaches are proposed, and for their application, certain hyperparameters are adjusted on the training datasets of the corpora to obtain the best results on the test datasets. Three Spanish language corpora are used in the evaluation: Chilean, Mexican, and Chilean-Mexican corpora. The results indicate that hybrid models obtain the best results in the three corpora, outperforming the implemented models that do not use Lexicons. This shows that mixing approaches improves aggressiveness detection. Finally, a web application is developed that makes each model usable for classifying tweets, allowing the performance of the models to be evaluated with an external corpus and collecting feedback on each prediction for future research. In addition, an API is available that can be integrated into technological tools for parental control, online plugins for writing analysis in social networks, and educational tools, among others.
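The hybrid idea described in this abstract, a lexicon count feeding a classifier, can be sketched as follows. This is a minimal illustration only: the three-word lexicon, the feature names, and the tokenizer are assumptions, and the downstream ML classifier is omitted.

```python
import re

# Hypothetical mini-lexicon; a real system would load a curated Spanish lexicon.
AGGRESSIVE_LEXICON = {"idiota", "estúpido", "imbécil"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def lexicon_features(text, lexicon=AGGRESSIVE_LEXICON):
    """Count lexicon hits so they can be appended to a classifier's feature vector."""
    tokens = tokenize(text)
    hits = sum(1 for token in tokens if token in lexicon)
    ratio = hits / len(tokens) if tokens else 0.0
    return {"lexicon_hits": hits, "lexicon_ratio": ratio, "n_tokens": len(tokens)}

features = lexicon_features("Eres un idiota, un completo IDIOTA")
print(features)
```

In a hybrid model, these values would sit alongside the usual text features (for example bag-of-words counts) in the input to the classification algorithm.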
  • Publication
    Explainable Hopfield Neural Networks using an automatic video-generation system
    (MDPI, 2021)
    Rubio Manzano, Clemente
    ;
    Segura Navarrete, Alejandra
    ;
    Vidal Castro, Christian
    Hopfield Neural Networks (HNNs) are recurrent neural networks used to implement associative memory. They can be applied to pattern recognition, optimization, or image segmentation. However, it is sometimes difficult to give users good explanations of the results obtained with them, mainly because of the large number of changes in the state of neurons (and their weights) produced while solving a machine learning problem. There are currently limited techniques to visualize, verbalize, or abstract HNNs. This paper outlines how automatic video-generation systems can be constructed to explain their execution. This work constitutes a novel approach to obtaining explainable artificial intelligence systems in general, and explainable HNNs in particular, building on the theory of data-to-text systems and software visualization approaches. We present a complete methodology to build these kinds of systems. A software architecture is also designed, implemented, and tested, and the technical details of the implementation are explained. We apply our approach to create a complete explainer video about the execution of HNNs on a small recognition problem. Finally, several aspects of the generated videos are evaluated (quality, content, motivation and design/presentation).
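To make the "changes in the state of neurons" concrete, here is a minimal sketch of a Hopfield network with Hebbian weights and asynchronous updates that logs every state change. The event log stands in for the kind of raw data an explanation system could narrate; its format and the function names are assumptions, not the paper's architecture.

```python
def train_hopfield(patterns):
    """Hebbian rule: sum of outer products of the stored patterns, zero diagonal."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights

def recall(weights, state, max_sweeps=10):
    """Update neurons one at a time until a full sweep changes nothing."""
    state = list(state)
    trace = []  # (sweep, neuron index, new value) for every state change
    for sweep in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(weights[i][j] * state[j] for j in range(len(state)))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                trace.append((sweep, i, new_value))
                changed = True
        if not changed:
            break
    return state, trace

stored = [
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, 1, 1, -1, -1, -1, -1],
]
weights = train_hopfield(stored)
noisy = [1, -1, 1, -1, 1, -1, 1, 1]  # first stored pattern with its last bit flipped
final_state, trace = recall(weights, noisy)
print(final_state, trace)
```

Even on this eight-neuron toy problem the trace is the only record of what happened during recall; on realistic networks it grows quickly, which is the explanation problem the abstract describes.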
  • Publication
    Predicting engineering undergraduates dropout: A case study in Chile
    (TEMPUS Publications, 2023)
    Bizama-Varas, Michelle
    The main objective of this article is to present and validate a statistical model (N = 3,152) to predict the dropout of students from the School of Engineering of the Universidad Católica de la Santísima Concepción (UCSC) in Chile. Student dropout in engineering is a generalized and multifactorial phenomenon, even more so when the student can use his or her university access score for a period of two years. In the UCSC, a distinction is made between formal and nonformal dropout. The information collection methodology in this study included the survey administered by the Department of Evaluation, Measurement and Educational Registry of Chile (DEMRE) and input from the Directorate of Admission and Academic Registration of the UCSC. The analysis groups comprised students who formally resigned, who were analyzed according to the reasons they gave for leaving, and students who did not formalize their abandonment (deserters). Subsequently, a logistic regression analysis was applied to determine which variables would best explain the phenomenon of dropout. Among the main factors are gender (GENDER), program (AU), cumulative average score (PPA_SCORE), mathematics score of the university selection test (PSU_MATH_SCORE), mother's education level (EDU_MOM), progression rate of the student in the engineering program (PROGRESSION_RATE) and socioeconomic quintile of the student (QUINTILE). The prediction model achieves an accuracy of 88.53% and a precision of 88.69%, which is a very encouraging result in relation to the performance of the studies reviewed in the literature.
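For readers unfamiliar with the two reported metrics, the sketch below shows how a logistic regression turns predictor values into a dropout probability and how accuracy and precision are computed from predictions. The coefficient values, the intercept, and the toy labels are purely illustrative assumptions; they are not taken from the study.

```python
import math

def dropout_probability(features, coefficients, intercept):
    """Logistic model: sigmoid of the intercept plus the weighted sum of predictors."""
    z = intercept + sum(coefficients[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def accuracy_and_precision(y_true, y_pred):
    """Accuracy = (TP + TN) / N; precision = TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision

# Illustrative coefficients only (assumed values, not the fitted model).
coefficients = {"PROGRESSION_RATE": -3.0, "PPA_SCORE": -0.5}
student = {"PROGRESSION_RATE": 0.4, "PPA_SCORE": 4.2}
probability = dropout_probability(student, coefficients, intercept=3.0)

# Toy labels: 1 = dropped out, 0 = stayed.
accuracy, precision = accuracy_and_precision([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(round(probability, 3), accuracy, round(precision, 3))
```

Accuracy counts all correct predictions, while precision only asks how many of the students flagged as dropouts actually dropped out, so the two can diverge on imbalanced data.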
  • Publication
    How useful TutorBot+ is for teaching and learning in programming courses: A preliminary study
    (IEEE, 2023)
    Gómez-Meneses, Pedro
    ;
    Maldonado-Montiel, Diego
    ;
    Segura-Navarrete, Alejandra
    ;
    Vidal-Castro, Christian
    Objective: The objective of this paper is to present preliminary work on the development of an EduChatBot tool and the measurement of the effects of its use aimed at providing effective feedback to programming course students. This bot, hereinafter referred to as tutorBot+, was constructed based on chatGPT3.5 and is tasked with assisting and providing timely positive feedback to students in computer science programming courses at UCSC. Methods/Analysis: The proposed method consists of four stages: (1) Immersion in the feedback and Large Language Models (LLMs) topic; (2) Development of tutorBot+ prototypes in both non-conversational and conversational versions; (3) Experiment design; and (4) Intervention and evaluation. The first stage involves a literature review on feedback and learning, the use of intelligent tutors in the educational context, as well as the topics of LLMs and chatGPT. The second and third stages detail the development of tutorBot+ in its two versions, and the final stage lays the foundation for a quasi-experimental study involving students in the curriculum activities of Programming Workshop and Database Workshop, focusing on learning outcomes related to the development of computational thinking skills, and facilitating the use and measurement of the tool’s effects. Findings: The preliminary results of this work are promising, as two functional prototypes of tutorBot+ have been developed for both the non-conversational and conversational versions. Additionally, there is ongoing exploration into the possibility of creating a domain-specific model based on pretrained models for programming, integrating tutorBot+ with other platforms, and designing an experiment to measure student performance, motivation, and the tool’s effectiveness.
  • Publication
    A novel approach to the creation of a labelling lexicon for improving emotion analysis in text
    (Emerald Publishing, 2021)
    Segura Navarrete, Alejandra
    ;
    Vidal Castro, Christian
    ;
    Rubio Manzano, Clemente
    Purpose – This paper aims to describe the process used to create an emotion lexicon enriched with the emotional intensity of words and focuses on improving the emotion analysis process in texts. Design/methodology/approach – The process includes setting, preparation and labelling stages. In the first stage, a lexicon is selected. It must include a translation to the target language and labelling according to Plutchik’s eight emotions. The second stage starts with the validation of the translations. Then, the lexicon is expanded with the synonyms of the emotion synsets of each word. In the labelling stage, the similarity of words is calculated and displayed using WordNet similarity. Findings – The authors’ approach performs better at identifying the predominant emotion for the selected corpus. Most notably, the emotion analysis results improve in a hybrid approach compared with those obtained in a purist approach. Research limitations/implications – The proposed lexicon can still be enriched by incorporating elements such as emojis, idioms and colloquial expressions. Practical implications – This work is part of a research project that aids in solving problems in a digital society, such as detecting cyberbullying, abusive language and gender violence in texts or exercising parental control. Detection of depressive states in young people and children is also covered. Originality/value – This semi-automatic process can be applied to any language to generate an emotion lexicon. This resource will be available in a software tool that implements a crowdsourcing strategy allowing the intensity to be re-labelled and new words to be automatically incorporated into the lexicon.
  • Publication
    Guide for the application of the data augmentation approach on sets of texts in Spanish for sentiment and emotion analysis
    (PLOS, 2024)
    Gutiérrez-Benítez, Rodrigo
    ;
    Segura-Navarrete, Alejandra
    ;
    Vidal-Castro, Christian
    Over the last ten years, social media has become a crucial data source for businesses and researchers, providing a space where people can express their opinions and emotions. To analyze this data and classify emotions and their polarity in texts, natural language processing (NLP) techniques such as emotion analysis (EA) and sentiment analysis (SA) are employed. However, the effectiveness of these tasks using machine learning (ML) and deep learning (DL) methods depends on large labeled datasets, which are scarce in languages like Spanish. To address this challenge, researchers use data augmentation (DA) techniques to artificially expand small datasets. This study aims to investigate whether DA techniques can improve classification results using ML and DL algorithms for sentiment and emotion analysis of Spanish texts. Various text manipulation techniques were applied, including transformations, paraphrasing (back-translation), and text generation using generative adversarial networks, to small datasets such as song lyrics, social media comments, headlines from national newspapers in Chile, and survey responses from higher education students. The findings show that the Convolutional Neural Network (CNN) classifier achieved the most significant improvement, with an 18% increase using the Generative Adversarial Networks for Sentiment Text (SentiGan) on the Aggressiveness (Seriousness) dataset. Additionally, the same classifier model showed an 11% improvement using the Easy Data Augmentation (EDA) on the Gender-Based Violence dataset. The performance of the Bidirectional Encoder Representations from Transformers (BETO) also improved by 10% on the back-translation augmented version of the October 18 dataset, and by 4% on the EDA augmented version of the Teaching survey dataset. These results suggest that data augmentation techniques enhance performance by transforming text and adapting it to the specific characteristics of the dataset. 
    Through experimentation with various augmentation techniques, this research provides valuable insights into the analysis of subjectivity in Spanish texts and offers guidance for selecting algorithms and techniques based on dataset features.
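As a rough illustration of the text-transformation family of augmentation techniques, the sketch below implements two of the four Easy Data Augmentation operations, random swap and random deletion, which need no thesaurus (the other two are synonym replacement and random insertion). The function names, the deletion probability, and the alternation scheme are assumptions made for this example, not the study's configuration.

```python
import random

def random_swap(tokens, n_swaps, rng):
    """Swap the positions of two randomly chosen tokens, n_swaps times."""
    tokens = list(tokens)
    if len(tokens) < 2:
        return tokens
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens

def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one token."""
    if not tokens:
        return []
    kept = [token for token in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]

def augment(sentence, n_copies=4, seed=0):
    """Create n_copies variants, alternating between swap and deletion."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    tokens = sentence.split()
    variants = []
    for k in range(n_copies):
        if k % 2 == 0:
            new_tokens = random_swap(tokens, 1, rng)
        else:
            new_tokens = random_deletion(tokens, 0.1, rng)
        variants.append(" ".join(new_tokens))
    return variants

original = "la clase de hoy fue muy interesante"
variants = augment(original)
for variant in variants:
    print(variant)
```

Each variant keeps the label of the original sentence, so a small labelled dataset grows without new annotation effort; whether the label really survives a given transformation is exactly what has to be checked per dataset.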