
Collaborating Authors

University of Pittsburgh


Not All Samples are Equal: Class Dependent Hierarchical Multi-Task Learning for Patient Diagnosis Classification

AAAI Conferences

An interesting machine learning problem is to learn predictive models that can automatically assign diagnoses or diagnostic categories to patient cases. However, we often do not have enough positive samples for many diagnoses, either because they are rare or because available datasets are limited in size. This motivates the use of multi-task learning methods, which tend to improve model performance by imposing similarities between the models of related tasks. In this work, we tackle this important problem by exploring the benefits of existing expert-defined diagnostic hierarchies. We argue that related tasks (models) organized in expert-defined hierarchies do not have the same level of similarity for different classes of samples. We discuss how task similarities differ for positive and negative samples and between parent and child diagnoses. We propose a new asymmetric version of the Adaptive Hierarchical Multi-task Learning (AHMTL) method that allows models to learn separate relatedness coefficients for tasks in the hierarchy based on their class values. Finally, we show that our model outperforms individually trained SVM models and the symmetric AHMTL baseline.
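The hierarchical tie between parent and child diagnosis models can be sketched as a regularizer over the diagnostic tree. The snippet below shows only the standard symmetric penalty with one relatedness coefficient per task; the paper's asymmetric variant would learn separate, class-dependent coefficients, a detail omitted here. All names are illustrative.

```python
import numpy as np

def hierarchy_penalty(W, parent, rho):
    """Symmetric hierarchical multi-task penalty (illustrative sketch).

    W: (n_tasks, d) array of per-task weight vectors.
    parent: parent[t] is the index of task t's parent in the diagnostic
        hierarchy, or -1 if t is a root.
    rho: rho[t] is the relatedness coefficient tying task t to its parent.
    """
    total = 0.0
    for t, p in enumerate(parent):
        if p >= 0:
            diff = W[t] - W[p]            # pull child toward parent
            total += rho[t] * float(diff @ diff)
    return total
```

This penalty is added to the sum of per-task losses (e.g., hinge losses for SVM-style models); a larger rho[t] forces the child model closer to its parent.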


Detecting Trait versus Performance Student Behavioral Patterns Using Discriminative Non-Negative Matrix Factorization

AAAI Conferences

Recent studies have shown that students follow stable behavioral patterns while learning in online educational systems. These behavioral patterns can further be used to group the students into different clusters. However, as these clusters include both high- and low-performance students, the relation between behavioral patterns and student performance is yet to be clarified. In this work, we study the relation between students' learning behaviors and their performance in a self-organized online learning system that allows them to freely practice with various problems and worked examples. We represent each student's behavior as a vector of high-support sequential micro-patterns. Assuming that some behavioral patterns are shared across high- and low-performance students, and some are specific to each group, we group the students according to their performance. Under this assumption, we discover both the prevalent behavioral patterns within each group and the patterns shared across groups using discriminative non-negative matrix factorization. Our experiments show that there are such common and specific patterns in students' behavior that discriminate among students with different performance levels.
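As a rough illustration of the factorization machinery (plain NMF, not the paper's discriminative variant), multiplicative-update NMF decomposes a nonnegative student-by-pattern matrix into a small set of latent pattern groups:

```python
import numpy as np

def nmf(V, k, iters=300, eps=1e-9, seed=0):
    """Plain multiplicative-update NMF: V (n x m) is approximated by W @ H
    with W (n x k) and H (k x m) nonnegative. For the behavioral-pattern
    setting, rows of V would be students and columns micro-patterns."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + 0.1
    H = rng.random((k, m)) + 0.1
    for _ in range(iters):
        # Lee-Seung multiplicative updates for squared Frobenius loss
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

The discriminative version in the paper additionally constrains which factors are shared across performance groups and which are group-specific; that structure is not modeled in this sketch.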


Inexact Proximal Gradient Methods for Non-Convex and Non-Smooth Optimization

AAAI Conferences

In machine learning research, proximal gradient methods are popular for solving various optimization problems with non-smooth regularization. Inexact proximal gradient methods are extremely important when exactly solving the proximal operator is time-consuming, or when the proximal operator does not have an analytic solution. However, existing inexact proximal gradient methods only consider convex problems; knowledge of inexact proximal gradient methods in the non-convex setting is very limited. To address this challenge, in this paper we first propose three inexact proximal gradient algorithms, including the basic version and Nesterov's accelerated version. We then provide a theoretical analysis of the basic and accelerated versions. The theoretical results show that our inexact proximal gradient algorithms can achieve the same convergence rates as their exact counterparts in the non-convex setting. Finally, we apply our inexact proximal gradient algorithms to three representative non-convex learning problems. Empirical results confirm the superiority of the new inexact proximal gradient algorithms.
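For intuition, here is a minimal proximal gradient loop for the lasso, where the proximal operator (soft-thresholding) happens to have a closed form and is exact; the paper's setting concerns regularizers whose prox subproblem must instead be solved approximately:

```python
import numpy as np

def soft_threshold(v, t):
    """Closed-form prox of t*||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_gradient_lasso(A, b, lam, step, iters=500):
    """Proximal gradient for  0.5*||Ax - b||^2 + lam*||x||_1.

    Each iteration takes a gradient step on the smooth part, then applies
    the prox of the non-smooth part. When the prox lacks an analytic
    solution, it would be computed approximately -- the inexact case the
    paper analyzes."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - step * grad, step * lam)
    return x
```

With `A = I` the minimizer is simply `soft_threshold(b, lam)`, which makes the loop easy to sanity-check.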


Argument Mining for Improving the Automated Scoring of Persuasive Essays

AAAI Conferences

End-to-end argument mining has enabled the development of new automated essay scoring (AES) systems that use argumentative features (e.g., number of claims, number of support relations) in addition to traditional legacy features (e.g., grammar, discourse structure) when scoring persuasive essays. While prior research has proposed different argumentative features as well as empirically demonstrated their utility for AES, these studies have all had important limitations. In this paper we identify a set of desiderata for evaluating the use of argument mining for AES, introduce an end-to-end argument mining system and associated argumentative feature sets, and present the results of several studies that both satisfy the desiderata and demonstrate the value-added of argument mining for scoring persuasive essays.
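A hypothetical sketch of how an argument miner's output might be turned into scoring features; the annotation schema and feature names below are assumptions for illustration, not the paper's actual feature sets:

```python
def argument_features(components, relations):
    """Count-based argumentative features from a (hypothetical) annotation.

    components: list of dicts with a 'type' key, e.g.
        'MajorClaim' | 'Claim' | 'Premise'.
    relations: list of dicts with a 'type' key, e.g. 'support' | 'attack'.
    """
    return {
        'n_claims': sum(c['type'] == 'Claim' for c in components),
        'n_premises': sum(c['type'] == 'Premise' for c in components),
        'n_support': sum(r['type'] == 'support' for r in relations),
        'n_attack': sum(r['type'] == 'attack' for r in relations),
    }
```

Such counts would then be concatenated with legacy features (grammar, discourse structure) before training the scoring model.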


Asking Friendly Strangers: Non-Semantic Attribute Transfer

AAAI Conferences

We propose an attention-guided transfer network. Briefly, our approach works as follows. First, the network receives training images for attributes in both the source and target domains. Second, it separately learns models for the attributes in each domain, and then measures how related each target-domain classifier is to the classifiers in the source domains. Finally, it uses these measures of similarity (relatedness) to compute a weighted combination of the source classifiers, which then becomes the new classifier for the target attribute. We develop two methods, one where the target and source domains are disjoint, and another where there is some overlap between them. Importantly, we show that when the source attributes come from a diverse set of domains, the gain we obtain from this transfer of knowledge is greater than if we only use attributes from the same domain.

…Nickisch, and Harmeling 2009; Parikh and Grauman 2011; Akata et al. 2013), learn object models expediently by providing information about multiple object classes with each attribute label (Kovashka, Vijayanarasimhan, and Grauman 2011; Parkash and Parikh 2012), interactively recognize fine-grained object categories (Branson et al. 2010; Wah and Belongie 2013), and learn to retrieve images from precise human feedback (Kumar et al. 2011; Kovashka, Parikh, and Grauman 2015). Recent ConvNet approaches have shown how to learn accurate attribute models through multi-task learning (Fouhey, Gupta, and Zisserman 2016; Huang et al. 2015) or by localizing attributes (Xiao and Jae Lee 2015; Singh and Lee 2016). However, deep learning with ConvNets requires a large amount of data to be available for the task of interest, or for a related task (Oquab et…
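The relatedness-weighted combination step can be sketched as follows, under our illustrative assumption that relatedness is measured by cosine similarity between classifier weight vectors (the paper's network learns these measures rather than computing them in closed form):

```python
import numpy as np

def transfer_target_classifier(source_W, target_w):
    """Combine source attribute classifiers into a target classifier.

    source_W: (n_sources, d) array, one linear classifier per source attribute.
    target_w: (d,) initial target-attribute classifier.
    Weights each source by softmax-normalized cosine similarity to the
    target, then returns the weighted combination."""
    sims = source_W @ target_w / (
        np.linalg.norm(source_W, axis=1) * np.linalg.norm(target_w) + 1e-12)
    weights = np.exp(sims) / np.exp(sims).sum()   # relatedness weights
    return weights @ source_W
```

Sources most similar to the target dominate the combination, which is the intuition behind transferring from "friendly strangers" in diverse domains.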


Directional Label Rectification in Adaptive Graph

AAAI Conferences

With the explosive growth of multivariate time-series data, failure (event) analysis has gained widespread application. A primary goal of failure analysis is to identify the fault signature, i.e., the unique feature pattern that distinguishes failure events. However, the complex nature of multivariate time-series data brings challenges to the detection of fault signatures. Given a time series from a failure event, the fault signature and the onset of failure are not necessarily adjacent, and the interval between the signature and the failure is usually unknown. The uncertainty of this interval causes uncertainty in labeling timestamps, making it inapplicable to directly employ standard supervised algorithms for signature detection. To address this problem, we present a novel directional label rectification model that identifies fault-relevant timestamps and features simultaneously. Different from previous graph-based label propagation models that use a fixed graph, we propose to learn an adaptive graph that is optimal for the label rectification process. We conduct extensive experiments on both synthetic and real-world datasets and illustrate the advantage of our model in both effectiveness and efficiency.
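For contrast with the adaptive graph learned in the paper, standard label propagation on a fixed affinity graph looks like this:

```python
import numpy as np

def label_propagation(W, y_init, alpha=0.9, iters=200):
    """Label propagation on a FIXED affinity graph (the paper instead learns
    the graph jointly with label rectification).

    W: (n, n) symmetric nonnegative affinity matrix over timestamps.
    y_init: (n, c) one-hot rows for labeled timestamps, zeros otherwise.
    Iterates F <- alpha * S F + (1 - alpha) * y_init with the symmetrically
    normalized graph S = D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))
    F = y_init.copy()
    for _ in range(iters):
        F = alpha * S @ F + (1 - alpha) * y_init
    return F.argmax(axis=1)
```

On a chain of timestamps with one label at each end, the propagated labels split the chain at its midpoint, which is the behavior a good affinity graph should refine.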


Accelerated Method for Stochastic Composition Optimization With Nonsmooth Regularization

AAAI Conferences

Stochastic composition optimization has drawn much attention recently and has been successful in many emerging applications in machine learning, statistical analysis, and reinforcement learning. In this paper, we focus on the composition problem with a nonsmooth regularization penalty. Previous works either have a slow convergence rate or do not provide a complete convergence analysis for the general problem. We tackle these two issues by proposing a new stochastic composition optimization method for the composition problem with a nonsmooth regularization penalty. In our method, we apply a variance reduction technique to accelerate convergence. To the best of our knowledge, our method admits the fastest convergence rate for stochastic composition optimization: for the strongly convex composition problem, our algorithm is proved to attain linear convergence; for the general composition problem, our algorithm significantly improves the state-of-the-art convergence rate from O(T^{-1/2}) to O((n_1 + n_2)^{2/3} T^{-1}). Finally, we apply the proposed algorithm to portfolio management and to policy evaluation in reinforcement learning. Experimental results verify our theoretical analysis.
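A composition objective F(x) = f(g(x)) with an l1 penalty can be illustrated with one full-gradient proximal step via the chain rule; the paper's contribution is to replace these full inner/outer averages with variance-reduced stochastic estimates, which this sketch deliberately omits:

```python
import numpy as np

def composition_prox_step(x, g_funcs, g_jacs, f_grads, lam, step):
    """One proximal step for
        F(x) = (1/n1) sum_i f_i( (1/n2) sum_j g_j(x) ) + lam * ||x||_1,
    using FULL averages for clarity (no variance reduction).

    g_funcs: inner maps g_j; g_jacs: their Jacobians; f_grads: gradients
    of the outer functions f_i. All are callables of x (or of the inner
    value y, for f_grads)."""
    y = np.mean([g(x) for g in g_funcs], axis=0)              # inner value
    J = np.mean([Jg(x) for Jg in g_jacs], axis=0)             # inner Jacobian
    grad = J.T @ np.mean([df(y) for df in f_grads], axis=0)   # chain rule
    v = x - step * grad
    return np.sign(v) * np.maximum(np.abs(v) - step * lam, 0.0)  # prox of l1
```

With the identity inner map and a quadratic outer function, a single step with unit step size lands exactly on the minimizer, which makes the chain-rule bookkeeping easy to verify.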


Sentiment Analysis via Deep Hybrid Textual-Crowd Learning Model

AAAI Conferences

Crowdsourcing provides an efficient way to employ human skills in sentiment analysis, which is a difficult task for automatic language models due to large variations in context, writing style, viewpoint, and so on. However, standard crowdsourcing aggregation models are incompetent when the number of crowd labels per worker is not sufficient to train their parameters, or when it is not feasible to collect labels for every sample in a large dataset. In this paper, we propose a novel hybrid model that exploits both crowd and text data for sentiment analysis, consisting of a generative crowdsourcing aggregation model and a deep sentiment autoencoder. The combination of these two sub-models is obtained through a probabilistic framework rather than a heuristic one. We introduce a unified objective function that incorporates the objectives of both sub-models, and derive an efficient optimization algorithm to jointly solve the corresponding problem. Experimental results indicate that our model achieves superior results in comparison with state-of-the-art models, especially when crowd labels are scarce.
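As a baseline sketch only (not the paper's hybrid probabilistic model), a simple two-round aggregation weights each worker by agreement with an initial majority vote, a crude stand-in for learned worker reliability:

```python
import numpy as np

def aggregate(votes):
    """Two-round crowd label aggregation for binary sentiment.

    votes: dict sample -> dict worker -> label in {0, 1}.
    Round 1: plain majority vote per sample.
    Round 2: re-vote, weighting each worker by agreement with round 1."""
    maj = {s: int(np.mean(list(wv.values())) >= 0.5) for s, wv in votes.items()}
    workers = {w for wv in votes.values() for w in wv}
    acc = {w: np.mean([maj[s] == wv[w] for s, wv in votes.items() if w in wv])
           for w in workers}
    out = {}
    for s, wv in votes.items():
        # signed, reliability-weighted vote: label 1 counts +acc, label 0 -acc
        score = sum(acc[w] * (1 if label == 1 else -1) for w, label in wv.items())
        out[s] = int(score >= 0)
    return out
```

The paper's generative model goes further by coupling such worker reliability estimates with a text model, so unlabeled samples also inform the aggregation.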


Matrix Variate Gaussian Mixture Distribution Steered Robust Metric Learning

AAAI Conferences

Mahalanobis Metric Learning (MML) has been actively studied in the machine learning community. Most existing MML methods aim to learn a powerful Mahalanobis distance for computing the similarity of two objects. More recently, several methods have used matrix norm regularizers to constrain the learned distance matrix M to improve performance. However, in real applications the structure of the distance matrix M is complicated and cannot be characterized well by a simple matrix norm. In this paper, we propose a novel robust metric learning method that learns the structure of the distance matrix in a new and natural way. We partition M into blocks and consider each block as a random matrix variate, which is fitted by a matrix variate Gaussian mixture distribution. Different from existing methods, our model makes no assumption about the structure of M and automatically learns it from real data, where the distance matrix M is often neither sparse nor low-rank. We design an effective algorithm to optimize the proposed model and establish the corresponding theoretical guarantee. We conduct extensive evaluations on real-world data. Experimental results show our method consistently outperforms related state-of-the-art methods.
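A minimal sketch of the two objects involved: the squared Mahalanobis distance under a learned positive semidefinite matrix M, and the block partition of M that the paper models with a matrix variate Gaussian mixture (the mixture fitting itself is beyond this illustration):

```python
import numpy as np

def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a PSD matrix M.
    With M = I this reduces to the squared Euclidean distance."""
    d = x - y
    return float(d @ M @ d)

def blocks(M, bs):
    """Partition a square matrix M into bs-by-bs blocks in row-major order.
    Each block is the 'matrix variate' the paper fits with a Gaussian
    mixture instead of imposing a sparse or low-rank prior on M."""
    n = M.shape[0]
    return [M[i:i + bs, j:j + bs]
            for i in range(0, n, bs) for j in range(0, n, bs)]
```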


Jointly Parse and Fragment Ungrammatical Sentences

AAAI Conferences

However, the sentences under analysis may not always be grammatically correct. When a dependency parser nonetheless produces fully connected, syntactically well-formed trees for these sentences, the trees may be inappropriate and lead to errors. In fact, researchers have raised valid questions about the merit of annotating dependency trees for ungrammatical sentences (Ragheb and Dickinson 2012; Cahill 2015). On the other hand, previous work has …

… experiments, we find that both joint methods produce tree fragment sets that are more similar to those produced by the oracle method than the previous pipeline method; moreover, the seq2seq method's pruning decision has a significantly higher accuracy. In terms of downstream applications, we show that dependency arc pruning is helpful for two applications: sentential grammaticality judgment and semantic role labeling.
