Sentiment analysis is an important task in natural language processing: it classifies text according to its emotional tendency based on extracted text features. Existing results show that models based on RNNs and CNNs perform well on this task. To improve the performance of text sentiment analysis, we reformulate the classification task as a comparison problem and propose Comparison Enhanced Bi-LSTM with Multi-Head Attention (CE-B-MHA). Classifying through a comparison mechanism is more efficient than performing complex calculations. In this model, a bidirectional LSTM performs initial feature extraction, and Multi-Head Attention extracts valuable information from different dimensions and representation subspaces. The comparison mechanism scores the feature vectors by comparing them with the labeled vectors. Experimental results show that CE-B-MHA outperforms many existing models on three sentiment analysis datasets. INDEX TERMS Sentiment analysis, machine learning, neural networks.
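A minimal sketch of the CE-B-MHA pipeline as described in this abstract, written in PyTorch. The hyperparameters, the mean-pooling step, and the dot-product comparison scoring are assumptions for illustration, not the authors' published configuration.

```python
# Sketch of CE-B-MHA: Bi-LSTM -> Multi-Head Attention -> comparison scoring.
# Dimensions and the exact comparison step are assumed, not from the paper.
import torch
import torch.nn as nn

class CEBMHA(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128,
                 num_heads=8, num_labels=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Bidirectional LSTM for initial feature extraction.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # Multi-Head Attention over the Bi-LSTM outputs (different
        # representation subspaces via multiple heads).
        self.mha = nn.MultiheadAttention(2 * hidden_dim, num_heads,
                                         batch_first=True)
        # One learnable "labeled vector" per class for the comparison step.
        self.label_vectors = nn.Parameter(torch.randn(num_labels, 2 * hidden_dim))

    def forward(self, token_ids):
        x = self.embed(token_ids)               # (B, T, E)
        h, _ = self.bilstm(x)                   # (B, T, 2H)
        a, _ = self.mha(h, h, h)                # (B, T, 2H)
        feat = a.mean(dim=1)                    # (B, 2H) pooled text feature
        # Comparison mechanism: score each class by the similarity between
        # the text feature and that class's labeled vector.
        logits = feat @ self.label_vectors.t()  # (B, num_labels)
        return logits
```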
Emotion cause extraction is a challenging task in fine-grained emotion analysis. Although a few studies have addressed the task with clause-level classification methods, most of them have partly ignored emotion-level context information. To leverage this information more fully, we propose a novel method based on learning to rank that identifies emotion causes from an information retrieval perspective. Our method ranks candidate clauses with respect to a given provoked emotion, in analogy with query-level document ranking in information retrieval. To learn effective clause ranking models, we represent candidate clauses as feature vectors comprising both emotion-independent and emotion-dependent features. Emotion-independent features capture how likely a clause is to provoke an emotion, and emotion-dependent features capture the relevance between candidate cause clauses and their corresponding emotions. We investigate three learning-to-rank approaches for emotion cause extraction in our method and evaluate its performance on an existing dataset for emotion cause extraction. The experimental results show that our method is effective, significantly outperforming state-of-the-art baseline methods in precision, recall, and F-measure. INDEX TERMS Emotion analysis, emotion cause extraction, natural language processing, sentiment analysis, learning to rank.
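The sketch below illustrates pairwise learning to rank for clause ranking in the spirit of this method, assuming a simple linear scoring model with RankNet-style updates; the paper's actual feature set (emotion-independent and emotion-dependent features) and its three learning-to-rank approaches are not reproduced here.

```python
# Pairwise learning-to-rank sketch for ranking candidate cause clauses.
# Feature extraction is stubbed out; X holds clause feature vectors and
# y marks true cause clauses (1) vs. non-cause clauses (0) for one emotion.
import numpy as np

def pairwise_rank_train(X, y, lr=0.1, epochs=100):
    """Learn a linear scorer w so cause clauses outscore non-cause clauses."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.01, size=X.shape[1])
    pos, neg = X[y == 1], X[y == 0]
    for _ in range(epochs):
        for xp in pos:
            for xn in neg:
                margin = (xp - xn) @ w
                # Gradient of the logistic pairwise (RankNet-style) loss.
                grad = -(xp - xn) / (1.0 + np.exp(margin))
                w -= lr * grad
    return w

def rank_clauses(X, w):
    """Return clause indices sorted from most to least likely cause."""
    return np.argsort(-(X @ w))
```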
Background: The baseline incidence of the adverse events of statin therapy varies between countries. Notably, Chinese patients seem more susceptible to myopathy induced by simvastatin.
Objectives: This research studies the adverse drug reactions (ADRs) of statin therapy in China by analysing trial-based data from the Anti-hyperlipidaemic Drug Database built by the China National Medical Products Administration Information Centre.
Methods: All clinical trials involving statin therapy (including simvastatin, atorvastatin, fluvastatin, lovastatin, pravastatin and rosuvastatin) in China from 1989 to 2019 were screened. In total, 569 clinical studies with 37 828 patients were selected from 2650 clinical trials in the database.
Results: Among the reported cases with ADRs (2822/37 828; 7.460%), gastrointestinal symptoms were the most common (1491/37 828; 3.942%), followed by liver disease (486/37 828; 1.285%), muscle symptoms (444/37 828; 1.174%) and neurological symptoms (247/37 828; 0.653%). Gastrointestinal side effects were most frequent with pravastatin (231/1988; 11.620%), followed by fluvastatin (333/3094; 10.763%), and least frequent with rosuvastatin (82/1846; 4.442%).
Conclusion: In Chinese clinical trials, gastrointestinal symptoms were the most common ADR of statin use for hyperlipidaemia and other cardiovascular diseases.
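The reported incidence percentages follow directly from the stated counts and denominators; the short check below recomputes them.

```python
# Recompute the ADR incidence percentages from the counts reported above.
rates = {
    "any ADR":           (2822, 37828),
    "gastrointestinal":  (1491, 37828),
    "liver disease":     (486, 37828),
    "muscle symptoms":   (444, 37828),
    "neurological":      (247, 37828),
    "pravastatin (GI)":  (231, 1988),
    "fluvastatin (GI)":  (333, 3094),
    "rosuvastatin (GI)": (82, 1846),
}
for name, (n, d) in rates.items():
    print(f"{name}: {100 * n / d:.3f}%")  # matches the figures in the abstract
```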
Pseudo relevance feedback, an effective query expansion method, can significantly improve information retrieval performance. However, it may hurt retrieval performance when irrelevant terms are included in the expanded query, so the expansion terms need to be refined. Learning to rank methods have proven effective in information retrieval at placing the most relevant documents at the top of the returned list, but few attempts have been made to employ them for term refinement in pseudo relevance feedback. This article proposes a novel framework that explores the feasibility of using learning to rank to optimize pseudo relevance feedback by reranking the candidate expansion terms. We investigate several learning approaches for choosing the candidate terms and introduce state-of-the-art learning to rank methods to refine the expansion terms. In addition, we propose two term labeling strategies and examine the usefulness of various term features for optimizing the framework. Experimental results on three TREC collections show that our framework can effectively improve retrieval performance.
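A skeletal sketch of this framework's flow: collect candidate expansion terms from pseudo-relevant feedback documents, rerank them with a trained term-ranking model, and expand the query. The function names are illustrative, and the term features and pretrained ranker are assumed inputs; the article's candidate selection, labeling strategies, and feature engineering are richer than shown.

```python
# Pseudo relevance feedback with learning-to-rank term refinement (sketch).
from collections import Counter

def candidate_terms(feedback_docs, top_m=50):
    """Collect candidate expansion terms from the top-ranked (pseudo-relevant) docs."""
    counts = Counter(tok for doc in feedback_docs for tok in doc.split())
    return [t for t, _ in counts.most_common(top_m)]

def refine_terms(terms, term_features, ranker, top_n=10):
    """Rerank candidate terms with a trained term-ranking model; keep the best."""
    scored = sorted(terms, key=lambda t: ranker(term_features[t]), reverse=True)
    return scored[:top_n]

def expand_query(query, refined_terms):
    """Enrich the original query with the refined expansion terms."""
    return query + " " + " ".join(refined_terms)
```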
With the rapid development of biomedicine, the number of biomedical articles has increased accordingly, which makes it hard for biologists to keep up with the latest research. Information retrieval technologies address this challenge by searching a large collection of articles for a given query and returning the most relevant ones. Query expansion is an effective information retrieval technique, but it falls short of the desired performance when applied directly to biomedical information retrieval, because both users' queries and related articles contain many domain-specific terms. To solve this problem, we propose a biomedical query expansion framework based on learning-to-rank methods, in which we refine the candidate expansion terms by training term-ranking models to select the most relevant terms for enriching the original query. To train the term-ranking models, we first propose a pseudo-relevance feedback method based on MeSH to select candidate expansion terms and then represent the candidate terms as feature vectors built from both corpus-based and resource-based term features. Experimental results on TREC Genomics datasets show that our method captures more relevant terms for expanding the original query and effectively improves biomedical information retrieval performance.
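The sketch below shows one plausible way to combine corpus-based and resource-based signals into a term feature vector, assuming feedback documents, collection document frequencies, and a MeSH vocabulary are available; the specific features are illustrative, not the paper's exact set.

```python
# Hypothetical feature builder for a candidate expansion term, mirroring the
# corpus-based / resource-based split described above.
import math

def term_feature_vector(term, feedback_docs, collection_df, num_docs, mesh_terms):
    tf_fb = sum(doc.split().count(term) for doc in feedback_docs)
    df = collection_df.get(term, 0)
    idf = math.log((num_docs + 1) / (df + 1))
    return [
        tf_fb,                      # corpus-based: frequency in feedback docs
        idf,                        # corpus-based: inverse document frequency
        float(term in mesh_terms),  # resource-based: is it a MeSH heading?
    ]
```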
Background: The number of biomedical research articles has increased exponentially with the advancement of biomedicine in recent years, making it difficult for researchers to obtain the information they need. Information retrieval technologies seek to tackle this problem, but information needs cannot be fully satisfied by directly applying existing information retrieval techniques. Biomedical information retrieval therefore focuses not only on the relevance of search results but also on their completeness, which is referred to as diversity-oriented retrieval.
Results: We address the diversity-oriented biomedical retrieval task with a supervised term ranking model, learned through a supervised query expansion process for term refinement. Based on the model, the most relevant and diversified terms are selected to enrich the original query, and the expanded query is fed into a second retrieval pass to improve the relevance and diversity of the search results. To this end, we propose three diversity-oriented optimization strategies in our model: a diversified term labeling strategy, biomedical resource-based term features, and a diversity-oriented group sampling learning method. Experimental results on TREC Genomics collections demonstrate the effectiveness of the proposed model in improving both the relevance and the diversity of search results.
Conclusions: The three proposed strategies jointly contribute to the improvement of biomedical retrieval performance. Our model yields more relevant and diversified results than state-of-the-art baseline models. Moreover, our method provides a general framework for improving biomedical retrieval performance and can serve as a basis for future work.
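As an illustration of the relevance/diversity trade-off in expansion-term selection, the sketch below uses a greedy MMR-style heuristic; the paper's actual strategies (diversified term labeling, resource-based features, group sampling) differ and are not reproduced here. `relevance` and `sim` are assumed inputs: a term-relevance map and a term-similarity function.

```python
# Greedy MMR-style selection of relevant yet diverse expansion terms (sketch).
def select_diverse_terms(terms, relevance, sim, k=10, lam=0.7):
    selected = []
    pool = list(terms)
    while pool and len(selected) < k:
        def mmr(t):
            # Penalize terms too similar to those already selected.
            redundancy = max((sim(t, s) for s in selected), default=0.0)
            return lam * relevance[t] - (1 - lam) * redundancy
        best = max(pool, key=mmr)
        selected.append(best)
        pool.remove(best)
    return selected
```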