Research Outputs

Publication

Is news really pessimistic? Sentiment Analysis of Chilean online newspaper headlines

2018, Mg. Martinez-Araneda, Claudia, Segura, Alejandra, Vidal-Castro, Christian, Elgueta, Jorge

Objectives: This paper explores the popular belief that all news is bad news. Many people claim not to read newspapers to avoid learning about the worst of our society. We want to tear down this myth by applying a Sentiment Analysis (SA) approach. Method/Analysis: This work applies sentiment analysis techniques to study the headline bias of online newspapers for the period between March 2014 and April 2015. We analyzed 2953 headlines gathered from five of the most popular Chilean newspapers that are available online and offer RSS feeds. Findings: Our results show a roughly equivalent percentage of positive bias (38%) and negative bias (37%) instances, with 25% of headlines exhibiting a neutral bias. Automatic classification results are promising, with decent classifier performance and sensitivity, although there is plenty of room for improvement. Novelty/Improvement: A domain-specific tagged corpus in Spanish was also generated as a result of this work, which is a valuable resource for future studies.

Publication

Guide for the application of the data augmentation approach on sets of texts in Spanish for sentiment and emotion analysis

2024, Mg. Martinez-Araneda, Claudia, Gutiérrez-Benítez, Rodrigo, Segura-Navarrete, Alejandra, Vidal-Castro, Christian

Over the last ten years, social media has become a crucial data source for businesses and researchers, providing a space where people can express their opinions and emotions. To analyze this data and classify emotions and their polarity in texts, natural language processing (NLP) techniques such as emotion analysis (EA) and sentiment analysis (SA) are employed. However, the effectiveness of these tasks using machine learning (ML) and deep learning (DL) methods depends on large labeled datasets, which are scarce in languages like Spanish. To address this challenge, researchers use data augmentation (DA) techniques to artificially expand small datasets. This study investigates whether DA techniques can improve classification results using ML and DL algorithms for sentiment and emotion analysis of Spanish texts. Various text manipulation techniques were applied, including transformations, paraphrasing (back-translation), and text generation using generative adversarial networks, to small datasets such as song lyrics, social media comments, headlines from national newspapers in Chile, and survey responses from higher education students. The findings show that the Convolutional Neural Network (CNN) classifier achieved the most significant improvement, with an 18% increase using Generative Adversarial Networks for Sentiment Text (SentiGAN) on the Aggressiveness (Seriousness) dataset. Additionally, the same classifier model showed an 11% improvement using Easy Data Augmentation (EDA) on the Gender-Based Violence dataset. The performance of BETO, a Spanish-language Bidirectional Encoder Representations from Transformers (BERT) model, also improved by 10% on the back-translation augmented version of the October 18 dataset, and by 4% on the EDA augmented version of the Teaching survey dataset. These results suggest that data augmentation techniques enhance performance by transforming text and adapting it to the specific characteristics of the dataset.
Through experimentation with various augmentation techniques, this research provides valuable insights into the analysis of subjectivity in Spanish texts and offers guidance for selecting algorithms and techniques based on dataset features.
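As an illustration of the transformation-based augmentation mentioned in this abstract, the following is a minimal sketch of two of the four EDA operations (random swap and random deletion). It is not the authors' implementation: the function names, the parameter values, and the sample sentence are assumptions made for this example, and the synonym-based EDA operations are omitted because they would require a Spanish thesaurus.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with two random positions swapped, n_swaps times."""
    tokens = list(tokens)
    if len(tokens) < 2:
        return tokens
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one token."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(text, label, n_copies=2, p_delete=0.1, seed=0):
    """Return the original (text, label) pair plus n_copies perturbed variants.

    Every variant keeps the original label, which is the core assumption of
    label-preserving augmentation. All defaults here are illustrative.
    """
    rng = random.Random(seed)
    tokens = text.split()
    examples = [(text, label)]
    for _ in range(n_copies):
        if rng.choice(("swap", "delete")) == "swap":
            new_tokens = random_swap(tokens, 1, rng)
        else:
            new_tokens = random_deletion(tokens, p_delete, rng)
        examples.append((" ".join(new_tokens), label))
    return examples


# Hypothetical sample sentence, not taken from any dataset in the study.
augmented = augment("el servicio fue muy bueno", "positive", n_copies=3)
```

The augmented pairs would then be appended to the small training set before fitting a classifier. Whether this helps depends on the dataset, which is the question the study examines.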

Publication

Detecting aggressiveness in tweets: A hybrid model for detecting cyberbullying in the Spanish language

2021, Mg. Martinez-Araneda, Claudia, Lepe-Faúndez, Manuel, Segura-Navarrete, Alejandra, Vidal-Castro, Christian, Rubio-Manzano, Clemente

In recent years, the use of social networks has increased exponentially, which has led to a significant increase in cyberbullying. In the field of Computer Science, research has been conducted on how to detect aggressiveness in texts, which is a prelude to detecting cyberbullying. Most of this work has addressed English language texts, mainly using Machine Learning (ML) approaches, Lexicon approaches to a lesser extent, and hybrid approaches in very few cases. Hybrid approaches use Lexicons and ML algorithms together, for example by counting the number of bad words in a sentence with a Lexicon of bad words and using that count as an input feature for classification algorithms. This research aims to contribute to detecting aggressiveness in Spanish language texts by creating different models that combine the Lexicon and ML approaches. Twenty-two models that combine techniques and algorithms from both approaches are proposed. For their application, certain hyperparameters are adjusted on the training datasets of the corpora to obtain the best results on the test datasets. Three Spanish language corpora are used in the evaluation: Chilean, Mexican, and Chilean-Mexican corpora. The results indicate that hybrid models obtain the best results in the three corpora, outperforming the implemented models that do not use Lexicons. This shows that mixing approaches improves aggressiveness detection. Finally, a web application is developed that gives applicability to each model by classifying tweets, making it possible to evaluate the performance of the models on external corpora and to receive feedback on the prediction of each one for future research. In addition, an API is available that can be integrated into technological tools for parental control, online plugins for writing analysis in social networks, and educational tools, among others.
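The hybrid idea described in this abstract, a Lexicon count used as an input feature for a classifier, can be sketched as follows. This is an illustrative toy, not one of the twenty-two models from the paper: the four-word lexicon, the training sentences, the feature set, and the choice of a perceptron are all assumptions made for this example.

```python
import re
import unicodedata

# Hypothetical mini-lexicon (accents stripped); a real lexicon would be far larger.
AGGRESSIVE_LEXICON = {"idiota", "estupido", "tonto", "inutil"}


def tokenize(text):
    """Lowercase, strip accents, and split into alphabetic tokens."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.findall(r"[a-z]+", plain)


def features(text):
    """Feature vector: [bias term, lexicon hit count, token count]."""
    tokens = tokenize(text)
    hits = sum(1 for t in tokens if t in AGGRESSIVE_LEXICON)
    return [1.0, float(hits), float(len(tokens))]


def predict(weights, text):
    """Return 1 (aggressive) if the linear score is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features(text)))
    return 1 if score > 0 else 0


def train_perceptron(samples, epochs=20):
    """Classic perceptron rule: adjust weights only on misclassified samples."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for text, label in samples:
            error = label - predict(weights, text)
            if error != 0:
                x = features(text)
                weights = [w + error * xi for w, xi in zip(weights, x)]
    return weights


# Toy training data invented for this sketch (1 = aggressive, 0 = not aggressive).
TRAIN = [
    ("eres un idiota", 1),
    ("que tengas un buen día", 0),
    ("tonto e inútil", 1),
    ("gracias por tu ayuda", 0),
]

weights = train_perceptron(TRAIN)
```

In a fuller hybrid model the lexicon count would typically sit alongside other text features, such as bag-of-words or n-gram counts, rather than replace them.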

Publication

Explainable Hopfield Neural Networks using an automatic video-generation system

2021, Rubio Manzano, Clemente, Segura Navarrete, Alejandra, Martinez-Araneda, Claudia, Vidal Castro, Christian

Hopfield Neural Networks (HNNs) are recurrent neural networks used to implement associative memory. They can be applied to pattern recognition, optimization, or image segmentation. However, it is sometimes difficult to give users good explanations of the results obtained with them, mainly because of the large number of changes in the state of neurons (and their weights) produced while solving a machine learning problem. There are currently limited techniques to visualize, verbalize, or abstract HNNs. This paper outlines how automatic video-generation systems can be constructed to explain their execution. This work constitutes a novel approach to obtaining explainable artificial intelligence systems in general, and HNNs in particular, building on the theory of data-to-text systems and software visualization approaches. We present a complete methodology for building these kinds of systems. A software architecture is also designed, implemented, and tested, and technical details of the implementation are explained. We apply our approach to create a complete explainer video about the execution of HNNs on a small recognition problem. Finally, several aspects of the generated videos are evaluated (quality, content, motivation, and design/presentation).
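To make the notion of neuron state changes concrete, here is a minimal sketch of a Hopfield network with Hebbian storage and asynchronous recall that records every state flip in a trace. It is not the system described in the paper: the function names, the trace format, and the two eight-neuron patterns are assumptions made for this example. A trace of this kind is the sort of raw event data that a verbalization or visualization layer could consume.

```python
def train_hebbian(patterns):
    """Build the weight matrix from bipolar (+1/-1) patterns using the Hebbian rule."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def recall(weights, state, max_sweeps=10):
    """Update neurons one at a time until a full sweep changes nothing.

    Returns the final state and a trace of (sweep, neuron, old, new) tuples,
    one per state change.
    """
    state = list(state)
    trace = []
    for sweep in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(weights[i][j] * state[j] for j in range(len(state)))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                trace.append((sweep, i, state[i], new_value))
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state, trace


# Two orthogonal patterns invented for this sketch.
PATTERN_A = [1, 1, 1, 1, -1, -1, -1, -1]
PATTERN_B = [1, -1, 1, -1, 1, -1, 1, -1]

weights = train_hebbian([PATTERN_A, PATTERN_B])

# Corrupt the first neuron of PATTERN_A and let the network repair it.
noisy = [-1] + PATTERN_A[1:]
final_state, trace = recall(weights, noisy)
```

Each trace entry could be turned into a sentence or an animation frame, for example "in sweep 0, neuron 0 switched from -1 to +1".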