title,authors,groups,keywords,topics,abstract
Kernelized Bayesian Transfer Learning,Mehmet Gönen and Adam A. Margolin,Novel Machine Learning Algorithms NMLA ,"cross-domain learning
domain adaptation
kernel methods
transfer learning
variational approximation","APP Biomedical / Bioinformatics
NMLA Bayesian Learning
NMLA Kernel Methods
NMLA Transfer Adaptation Multitask Learning
VIS Object Recognition",Transfer learning considers related but distinct tasks defined on heterogeneous domains and tries to transfer knowledge between these tasks to improve generalization performance. It is particularly useful when we do not have a sufficient amount of labeled training data in some tasks which may be very costly laborious or even infeasible to obtain. Instead learning the tasks jointly enables us to effectively increase the amount of labeled training data. In this paper we formulate a kernelized Bayesian transfer learning framework that is a principled combination of kernel-based dimensionality reduction models with task-specific projection matrices to find a shared subspace and a coupled classification model for all of the tasks in this subspace. Our two main contributions are (i) two novel probabilistic models for binary and multiclass classification and (ii) very efficient variational approximation procedures for these models. We illustrate the generalization performance of our algorithms on two different applications. In computer vision experiments our method outperforms the state-of-the-art algorithms on nine out of 12 benchmark supervised domain adaptation experiments defined on two object recognition data sets. In cancer biology experiments we use our algorithm to predict mutation status of important cancer genes from gene expression profiles using two distinct cancer populations namely patient-derived primary tumor data and in-vitro-derived cancer cell line data. We show that we can increase our generalization performance on primary tumors using cell lines as an auxiliary data source.
Source Free Transfer Learning for Text Classification,Zhongqi Lu Yin Zhu Sinno Pan Evan Xiang Yujing Wang and Qiang Yang,"AI and the Web AIW
Novel Machine Learning Algorithms NMLA ","Transfer Learning
Auxiliary Data Retrieval
Text Classification","AIW Knowledge acquisition from the web
AIW Machine learning and the web
NMLA Transfer Adaptation Multitask Learning",Transfer learning uses relevant auxiliary data to help the learning task in a target domain where labeled data are usually insufficient to train an accurate model. Given appropriate auxiliary data researchers have proposed many transfer learning models. How to find such auxiliary data however is of little research in the past. In this paper we focus on this auxiliary data retrieval problem and propose a transfer learning framework that effectively selects helpful auxiliary data from an open knowledge space (e.g. the World Wide Web). Because there is no need of manually selecting auxiliary data for different target domain tasks we call our framework Source Free Transfer Learning (SFTL). For each target domain task SFTL framework iteratively queries for the helpful auxiliary data based on the learned model and then updates the model using the retrieved auxiliary data. We highlight the automatic constructions of queries and the robustness of the SFTL framework. Our experiments on the 20 NewsGroup dataset and the Google search snippets dataset suggest that the new framework is capable to have the comparable performance to those state-of-the-art methods with dedicated selections of auxiliary data.
A Generalization of Probabilistic Serial to Randomized Social Choice,Haris Aziz and Paul Stursberg,Game Theory and Economic Paradigms GTEP ,"social choice theory
voting
fair division
social decision schemes","GTEP Game Theory
GTEP Social Choice / Voting",The probabilistic serial PS rule is one of the most well-established and desirable rules for the random assignment problem. We present the egalitarian simultaneous reservation ESR social decision scheme - an extension of PS to the more general setting of randomized social choice. ESR also generalizes an egalitarian rule from the literature which is defined only for dichotomous preferences. We consider various desirable fairness efficiency and strategic properties of ESR and show that it compares favourably against other social decision schemes. Finally we define a more general class of social decision schemes called Simultaneous Reservation SR that contains ESR as well as the serial dictatorship rules. We show that outcomes of SR characterize efficiency with respect to a natural refinement of stochastic dominance.
Lifetime Lexical Variation in Social Media,Liao Lizi Jing Jiang Ying Ding Heyan Huang and Ee-Peng Lim,NLP and Text Mining NLPTM ,"Generative model
Social Networks
Age Prediction","AIW Web personalization and user modeling
NLPTM Information Extraction
NLPTM Natural Language Processing General/Other ",As the rapid growth of online social media attracts a large number of Internet users the large volume of content generated by these users also provides us with an opportunity to study the lexical variations of people of different age. In this paper we present a latent variable model that jointly models the lexical content of tweets and Twitter users' age. Our model inherently assumes that a topic has not only a word distribution but also an age distribution. We propose a Gibbs-EM algorithm to perform inference on our model. Empirical evaluation shows that our model can generate meaningful age-specific topics such as school for teenagers and health for older people. Our model also performs age prediction better than a number of baseline methods.
Hybrid Singular Value Thresholding for Tensor Completion,Xiaoqin Zhang Zhengyuan Zhou Di Wang and Yi Ma,"Knowledge Representation and Reasoning KRR
Machine Learning Applications MLA
Novel Machine Learning Algorithms NMLA
Vision VIS ","tensor completion
low-rank recovery
hybrid singular value thresholding","KRR Knowledge Representation General/Other
MLA Machine Learning Applications General/other
NMLA Data Mining and Knowledge Discovery
NMLA Dimension Reduction/Feature Selection
VIS Statistical Methods and Learning","In this paper we study the low-rank tensor completion problem where a high-order tensor with missing entries is given and the goal is to complete the tensor. We propose to minimize a new convex objective function based on log sum of exponentials of nuclear norms that promotes the low-rankness of unfolding matrices of the completed tensor. We show for the first time that the proximal operator to this objective function is readily computable through a hybrid singular value thresholding scheme. This leads to a new solution to high-order low-rank tensor completion via convex relaxation. We show that this convex relaxation and the resulting solution are much more effective than existing tensor completion methods (including those also based on minimizing ranks of unfolding matrices). The hybrid singular value thresholding scheme can be applied to any problem where the goal is to minimize the maximum rank of a set of low-rank matrices."
Locality Preserving Hashing,Kang Zhao Hongtao Lu and Jincheng Mei,Vision VIS ,"Similarity Search
Approximate Nearest Neighbor Search
Binary Codes
Locality Preserving Hashing",VIS Image and Video Retrieval,Hashing has recently attracted considerable attention for large scale similarity search. However learning compact codes with good performance is still a challenge. In many cases the real-world data lies on a low-dimensional manifold embedded in high-dimensional ambient space. To capture meaningful neighbors a compact hashing representation should uncover the intrinsic geometric structure of the manifold e.g. the neighborhood relationships between subregions. Most existing hashing methods only consider this issue during mapping data points into certain projected dimensions. When getting the binary codes they either directly quantize the projected values with a threshold or use an orthogonal matrix to refine the initial projection matrix which both consider projection and quantization separately and it will not well preserve the locality structure in the whole learning process. In this paper we propose a novel hashing algorithm called Locality Preserving Hashing to effectively solve the above problems. Specifically we learn a set of locality preserving projections with a joint optimization framework which minimizes the average projection distance and quantization loss simultaneously. Experimental comparisons with other state-of-the-art methods on two large scale databases demonstrate the effectiveness and efficiency of our method.
Discovering Better AAAI Keywords via Clustering with Crowd-sourced Constraints,Kelly Moran Byron Wallace and Carla Brodley,Machine Learning Applications MLA ,"constraint-based clustering
machine learning
crowdsourcing",MLA Applications of Unsupervised Learning,Selecting good conference keywords is important because they often determine the composition of review committees and hence which papers are reviewed by whom. But presently conference keywords are generated in an ad-hoc manner by a small set of conference organizers. This approach is plainly not ideal. There is no guarantee for example that the generated keyword set aligns with what the community is actually working on and submitting to the conference in a given year. This is especially true in fast moving fields such as AI. The problem is exacerbated by the tendency of organizers to draw heavily on preceding years' keyword lists when generating a new set. Rather than a select few ordaining a keyword set that represents AI at large it would be preferable to generate these keywords more directly from the data with input from research community members. To this end we solicited feedback from seven AAAI PC members regarding a previously existing keyword set and used these 'crowd-sourced constraints' to inform a clustering over the abstracts of all submissions to AAAI 2013. We show that the keywords discovered via this data-driven human-in-the-loop method are at least as preferred by AAAI PC members as 2013's manually generated set and that they include categories previously overlooked by organizers. Many of the discovered terms were used for this year's conference.
Online Classification Using a Voted RDA Method,Tianbing Xu Jianfeng Gao Lin Xiao and Amelia Regan,"Machine Learning Applications MLA
NLP and Machine Learning NLPML
Novel Machine Learning Algorithms NMLA ","Online Classification
Voted Dual Averaging Method
Natural Language Processing
Parsing Reranking
Sparse Regularization","MLA Machine Learning Applications General/other
NLPML Natural Language Processing General/Other
NMLA Big Data / Scalability
NMLA Classification
NMLA Online Learning","We propose a voted dual averaging method for online classification problems with explicit regularization. This method employs the update rule of the regularized dual averaging RDA method proposed by Xiao but only on the subsequence of training examples where a classification error is made. We derive a bound on the number of mistakes made by this method on the training set as well as its generalization error rate. We also introduce the concept of relative strength of regularization and show how it affects the mistake bound and generalization performance. We examine the method using $\ell_1$-regularization on a large-scale natural language processing task and obtain state-of-the-art classification performance with fairly sparse models."
Fraudulent Support Telephone Number Identification Based on Co-occurrence Information on the Web,Xin Li Yiqun Liu Min Zhang and Shaoping Ma,AI and the Web AIW ,"Fraudulent Support Telephone Number
Co-occurrence Graph
Propagation Algorithm","AIW Enhancing web search and information retrieval
AIW Recognizing web spam such as link farms and splogs",Fraudulent support phones refers to the misleading telephone numbers placed on Web pages or other media that claim to provide services with which they are not associated. Most fraudulent support phone information is found on search engine result pages SERPs and such information substantially degrades the search engine user experience. In this paper we propose an approach to identify fraudulent support telephone numbers on the Web based on the co-occurrence relations between telephone numbers that appear on SERPs. We start from a small set of seed official support phone numbers and seed fraudulent numbers. Then we construct a co-occurrence graph according to the co-occurrence relationships of the telephone numbers that appear on Web pages. Additionally we take the page layout information into consideration on the assumption that telephone numbers that appear in nearby page blocks should be regarded as more closely related. Finally we develop a propagation algorithm to diffuse the trust scores of seed official support phone numbers and the distrust scores of the seed fraudulent numbers on the co-occurrence graph to detect additional fraudulent numbers. Experimental results based on over 1.5 million SERPs produced by a popular Chinese commercial search engine indicate that our approach outperforms TrustRank Anti-TrustRank and Good-Bad Rank algorithms by achieving an AUC value of over 0.90.
Supervised Hashing for Image Retrieval via Image Representation Learning,Rongkai Xia Yan Pan Hanjiang Lai Cong Liu and Shuicheng Yan,"Novel Machine Learning Algorithms NMLA
Vision VIS ","supervised hashing
approximate near neighbor search
representation learning
convolutional neural networks
coordinate descent","NMLA Neural Networks/Deep Learning
VIS Image and Video Retrieval",Hashing is a popular approximate nearest neighbor search approach in large-scale image retrieval. Supervised hashing which incorporates similarity/dissimilarity information on entity pairs to improve the quality of hashing function learning has recently received increasing attention. However in the existing supervised hashing methods for images an input image is usually encoded by a vector of hand-crafted visual features. Such hand-crafted feature vectors do not necessarily preserve the accurate semantic similarities of image pairs which may often degrade the performance of hashing function learning. In this paper we propose a supervised hashing method for image search in which we automatically learn a good image representation tailored to hashing as well as a set of hash functions. The proposed method has two stages. In the first stage given the pairwise similarity matrix $S$ on pairs of training images we propose a scalable coordinate descent method to decompose $S$ into a product of $HH^T$ where $H$ is a matrix with each of its rows being the approximate hash code associated to a training image. In the second stage we propose to simultaneously learn a good feature representation for the input images as well as a set of hash functions via a deep convolutional network tailored to the learned hash codes in $H$ or the discrete class labels of the images. Extensive empirical evaluations on three benchmark datasets with different kinds of images show that the proposed method has superior performance gains over several state-of-the-art supervised and unsupervised hashing methods.
Tailoring Local Search for Partial MaxSAT,Shaowei Cai Chuan Luo Kaile Su and John Thornton,"Heuristic Search and Optimization HSO
Search and Constraint Satisfaction SCS ","Partial MaxSAT
Local Search
Heuristics","HSO Heuristic Search
HSO Optimization
SCS Constraint Optimization",Partial MaxSAT PMS is a generalization to SAT and MaxSAT. Many real world problems can be encoded into PMS in a more natural and compact way than SAT and MaxSAT. In this paper we propose new ideas for local search for PMS which mainly rely on the distinction between hard and soft clauses. We then use these ideas to develop a local search PMS algorithm called Dist. Experimental results on PMS benchmarks from MaxSAT Evaluation 2013 show that Dist significantly outperforms state-of-the-art PMS algorithms including both local search algorithms and complete ones on random and crafted benchmarks. For the industrial benchmark Dist dramatically outperforms previous local search algorithms and is comparable and complementary to complete algorithms.
R2: An Efficient MCMC Sampler for Probabilistic Programs,Aditya Nori Chung-Kil Hur Sriram Rajamani and Selva Samuel,"Novel Machine Learning Algorithms NMLA
Reasoning under Uncertainty RU ","Probabilistic programming
Program analysis
Sampling","MLA Machine Learning Applications General/other
NMLA Bayesian Learning
RU Bayesian Networks
RU Graphical Models Other
RU Probabilistic Inference",We present a new Markov Chain Monte Carlo MCMC sampling algorithm for probabilistic programs. Our approach and tool called R2 has the unique feature of employing program analysis in order to improve the efficiency of MCMC sampling. Given an input program P R2 propagates observations in P backwards to obtain a semantically equivalent program P' in which every probabilistic assignment is immediately followed by an observe statement. Inference is performed by a suitably modified version of the Metropolis-Hastings algorithm that exploits the structure of the program P'. This has the overall effect of preventing rejections due to program executions that fail to satisfy observations in P. We formalize the semantics of probabilistic programs and rigorously prove the correctness of R2. We also empirically demonstrate the effectiveness of R2 - in particular we show that R2 is able to produce results of similar quality as the Church and Stan probabilistic programming tools with much shorter execution time.
Reconsidering Mutual Information Based Feature Selection: A Statistical Significance View,Vinh Nguyen Jeffrey Chan and James Bailey,Novel Machine Learning Algorithms NMLA ,"feature selection
mutual information
global optimization","NMLA Data Mining and Knowledge Discovery
NMLA Dimension Reduction/Feature Selection",Mutual information MI based approaches are an important feature selection paradigm. Although the stated goal of MI-based feature selection is to identify a subset of features that share the highest mutual information with the class variable most current MI-based techniques are greedy methods that make use of low dimensional MI quantities. The reason for using low dimensional approximation has been mostly attributed to the difficulty associated with estimating the high dimensional MI from limited samples. In this paper we argue a different viewpoint that given a very large amount of data the high dimensional MI objective is still problematic to be employed as a meaningful optimization criterion due to its overfitting nature: the MI almost always increases as more features are added thus leading to a trivial solution which includes all features. We propose a novel approach to the MI-based feature selection problem in which the overfitting phenomenon is controlled rigorously by means of a statistical test. We develop local and global optimization algorithms for this new feature selection model and demonstrate its effectiveness in the applications of explaining variables and objects.
Influence Maximization with Novelty Decay in Social Networks,Shanshan Feng Xuefeng Chen Gao Cong Yifeng Zeng Yeow Meng Chee and Yanping Xiang,AI and the Web AIW ,"social networks
influence maximization
novelty decay",AIW Social networking and community identification,"Influence maximization problem is to find a set of seed nodes in a social network such that their influence spread is maximized under certain propagation models. A few algorithms have been proposed for this problem. However they have not considered the impact of novelty decay in influence propagation i.e. repeated exposures will have diminishing influence on users. In this paper we consider the problem of influence maximization with novelty decay (IMND). We investigate the effect of novelty decay on influence propagation in real-life datasets and formulate the IMND problem. We further analyze its relevant properties and propose an influence estimation technique. We demonstrate performance of our algorithms over four social networks."
Solving Uncertain MDPs by Reusing State Information and Plans,Ping Hou William Yeoh and Tran Cao Son,Planning and Scheduling PS ,"Markov Decision Processes MDPs
Replanning
Incremental Search
Uncertain MDPs","PS Probabilistic Planning
PS Replanning and Plan Repair
PS Planning General/Other ",While Markov decision processes MDPs are powerful tools for modeling sequential decision making problems under uncertainty they are sensitive to the accuracy of their parameters. MDPs with uncertainty in their parameters are called Uncertain MDPs. In this paper we introduce a general framework that allows off-the-shelf MDP algorithms to solve Uncertain MDPs by planning based on currently available information and replanning if and when the problem changes. We demonstrate the generality of this approach by showing that it can use the VI TVI ILAO* LRTDP and UCT algorithms to solve Uncertain MDPs. We experimentally show that our approach is typically faster than replanning from scratch and we also provide a way to estimate the amount of speedup based on the amount of information that is reused.
Identifying Differences in Physician Communication Styles with a Log-Linear Transition Component Model,Byron Wallace Issa Dahabreh Michael Barton Laws Ira Wilson Thomas Trikalinos and Eugene Charniak,"Applications APP
Machine Learning Applications MLA
NLP and Machine Learning NLPML ","Conversation modeling
Patient-doctor communication
Sequential component model","APP Biomedical / Bioinformatics
MLA Bio/Medicine
MLA Applications of Supervised Learning
NLPML Discourse and Dialogue",We consider the task of grouping doctors with respect to communication patterns exhibited in outpatient visits. We propose a novel approach toward this end in which we model speech act transitions in conversations via a log-linear model incorporating physician specific components. We train this model over transcripts of outpatient visits annotated with speech act codes and then cluster physicians in a transformation of this parameter space. We find significant correlations between the induced groupings and patient survey response data comprising ratings of physician communication. Furthermore the novel sequential component model we leverage to induce this clustering allows us to explore differences across these groups. This work demonstrates how statistical AI might be used to better understand and ultimately improve physician communication.
Multi-Organ Exchange: The Whole is Greater than the Sum of its Parts,John Dickerson and Tuomas Sandholm,"Applications APP
Game Theory and Economic Paradigms GTEP
Multiagent Systems MAS ","Kidney exchange
Sparse random graphs
Computational economics","APP Biomedical / Bioinformatics
GTEP Auctions and Market-Based Systems
MAS Mechanism Design",Kidney exchange where candidates with organ failure trade incompatible but willing donors is a life-saving alternative to the deceased donor waitlist which has inadequate supply to meet demand. While fielded kidney exchanges see huge benefit from altruistic kidney donors who give an organ without a paired needy candidate a significantly higher medical risk to the donor deters similar altruism with livers. In this paper we begin by proposing the idea of liver exchange and show on demographically accurate data that vetted kidney exchange algorithms can be adapted to clear such an exchange at the nationwide level. We then explore cross-organ donation where kidneys and livers can be bartered for each other. We show theoretically that this multi-organ exchange provides linearly more transplants than running separate kidney and liver exchanges; this linear gain is a product of altruistic kidney donors creating chains that thread through the liver pool. We support this result experimentally on demographically accurate multi-organ exchanges. We conclude with thoughts regarding the fielding of a nationwide liver or joint liver-kidney exchange from a legal and computational point of view.
A Latent Variable Model for Discovering Bird Species Commonly Misidentified by Citizen Scientists,Jun Yu Rebecca Hutchinson and Weng-Keen Wong,Computational Sustainability and AI CSAI ,"Probabilistic Graphical Model
Crowdsourcing
Citizen Science
Ecology",CSAI Modeling and prediction of dynamic and spatiotemporal phenomena and systems,Data quality is a common source of concern for large-scale citizen science projects like eBird. In the case of eBird a major cause of poor quality data is the misidentification of bird species by inexperienced contributors. A proactive approach for improving data quality is to identify commonly misidentified bird species and to teach inexperienced birders the differences between these species. In this paper we develop a latent variable graphical model that can identify groups of bird species that are often confused for each other by eBird participants. Our model is a multi-species extension of the classic occupancy-detection model in the ecology literature. This multi-species extension is non-trivial requiring a structure learning step as well as a computationally expensive parameter learning stage which we make efficient through a variational approximation. We show that by including these species misidentifications in the model we can not only discover these misidentifications but predictions of both species occupancy and detection are also more accurate.
Cross-View Feature Learning for Scalable Social Image Analysis,Wenxuan Xie Yuxin Peng and Jianguo Xiao,AI and the Web AIW ,"Cross-View Learning
Feature Learning
Random Projection","AIW AI for multimedia and multimodal web applications
AIW Enhancing web search and information retrieval
AIW Machine learning and the web",Nowadays images on social networking websites e.g. Flickr are mostly accompanied with user-contributed tags which help cast a new light on the conventional content-based image analysis tasks such as image classification and retrieval. In order to establish a scalable social image analysis system two issues need to be considered: (1) Supervised learning is a futile task in modeling the enormous number of concepts in the world whereas unsupervised approaches overcome this hurdle; (2) Algorithms are required to be both spatially and temporally efficient to handle large-scale datasets. In this paper we propose a cross-view feature learning CVFL framework to handle the problem of social image analysis effectively and efficiently. Through explicitly modeling the relevance between image content and tags which is empirically shown to be visually and semantically meaningful CVFL yields more promising results than existing methods in the experiments. More importantly being general and descriptive CVFL and its variants can be readily applied to other large-scale multi-view tasks in unsupervised setting.
Semantic Graph Construction for Weakly-Supervised Image Parsing,Wenxuan Xie Yuxin Peng and Jianguo Xiao,Vision VIS ,"Weakly-Supervised Learning
Image Parsing
Graph Construction","VIS Categorization
VIS Object Recognition",We investigate weakly-supervised image parsing i.e. assigning class labels to image regions by using image-level labels only. Existing studies pay main attention to the formulation of the weakly-supervised learning problem i.e. how to propagate class labels from images to regions given an affinity graph of regions. Notably however the affinity graph of regions which is generally constructed in relatively simpler settings in existing methods is of crucial importance to the parsing performance due to the fact that the weakly-supervised parsing problem cannot be solved within a single image and that the affinity graph enables label propagation among multiple images. In order to embed more semantics into the affinity graph we propose novel criteria by exploiting the weak supervision information carefully and develop two graphs: L1 semantic graph and k-NN semantic graph. Experimental results demonstrate that the proposed semantic graphs not only capture more semantic relevance but also perform significantly better than conventional graphs in image parsing.
The Importance of Cognition and Affect for Artificially Intelligent Decision Makers,Celso de Melo Jonathan Gratch and Peter Carnevale,"Cognitive Modeling CM
Cognitive Systems CS
Humans and AI HAI ","Mind perception
cognition
affect
Decision making
cooperation","CM Simulating Humans
CS Problem solving and decision making
HAI Human-Computer Interaction
HAI Understanding People Theories Concepts and Methods",Agency the capacity to plan and act and experience the capacity to sense and feel are two critical aspects that determine whether people will perceive non-human entities such as autonomous agents to have a mind. There is evidence that the absence of either can reduce cooperation. We present an experiment that tests the necessity of both for cooperation with agents. In this experiment we manipulated people's perceptions about the cognitive and affective abilities of agents when engaging in the ultimatum game. The results indicated that people offered more money to agents that were perceived to make decisions according to their intentions rather than randomly. Additionally the results showed that people offered more money to agents that expressed emotion when compared to agents that did not. We discuss the implications of this agency-experience theoretical framework for the design of artificially intelligent decision makers.
The Complexity of Reasoning with FODD and GFODD,Benjamin Hescott and Roni Khardon,Knowledge Representation and Reasoning KRR ,"Decision Diagrams
Computational Complexity
First Order Logic","KRR Automated Reasoning and Theorem Proving
KRR Computational Complexity of Reasoning
KRR Knowledge Representation Languages",Recent work introduced Generalized First Order Decision Diagrams GFODD as a knowledge representation that is useful in mechanizing decision theoretic planning in relational domains. GFODDs generalize function-free first order logic and include numerical values and numerical generalizations of existential and universal quantification. Previous work presented heuristic inference algorithms for GFODDs. In this paper we study the complexity of the evaluation problem the satisfiability problem and the equivalence problem for GFODDs under the assumption that the size of the intended model is given with the problem a restriction that guarantees decidability. Our results provide a complete characterization. The same characterization applies to the corresponding restriction of problems in first order logic giving an interesting new avenue for efficient inference when the number of objects is bounded. Our results show that for $\Sigma_k$ formulas and for corresponding GFODDs evaluation and satisfiability are $\Sigma_k^p$ complete and equivalence is $\Pi_{k+1}^p$ complete. For $\Pi_k$ formulas evaluation is $\Pi_k^p$ complete satisfiability is one level higher and is $\Sigma_{k+1}^p$ complete and equivalence is $\Pi_{k+1}^p$ complete.
164Scalable Complex Contract Negotiation With Structured Search and Agenda Management,Xiaoqin Zhang Mark Klein and Ivan Marsa Maestre,Multiagent Systems MAS ,"Large-scale Negotiation
165Interdependent Issues
166Complex Contracts
167agenda management","MAS Distributed Problem Solving
168MAS Mechanism Design
169MAS Multiagent Systems General/other
170SCS Distributed CSP/Optimization",A large number of interdependent issues in complex contracts poses a challenge for current negotiation approaches which becomes even more apparent when negotiation problems scale up. To address this challenge we present a structured anytime search process with an agenda management mechanism using a hierarchical negotiation model where agents search at various levels during the negotiation with guidance from a mediator agent. This structured negotiation process increases computational efficiency making negotiations scalable for a large number of interdependent issues. To validate the contributions of our approach 1 we developed an anytime tree search negotiation process with an agenda management mechanism using a hierarchical problem structure and constraint-based preference model for real-world applications; 2 we defined a scenario matrix to capture various characteristics of negotiation scenarios and developed a scenario generator that produces test cases according to this matrix; and 3 we performed an extensive set of experiments to study the performance of this structured negotiation protocol and the influence of different scenario parameters and investigated the Pareto efficiency and social welfare optimality of the negotiation outcomes.
171Manifold Learning for Jointly Modeling Topic and Visualization,Tuan Le and Hady Lauw,Novel Machine Learning Algorithms NMLA ,"document visualization
172dimensionality reduction
173topic model
174manifold learning","NMLA Data Mining and Knowledge Discovery
175NMLA Dimension Reduction/Feature Selection
176NMLA Graphical Model Learning
177NMLA Unsupervised Learning Other ",Classical approaches to visualization directly reduce a document's high-dimensional representation into visualizable two or three dimensions using techniques such as multidimensional scaling. More recent approaches consider an intermediate representation in topic space between word space and visualization space which preserves the semantics by topic modeling. We call the latter semantic visualization problem as it seeks to jointly model topic and visualization. While previous approaches aim to preserve the global consistency they do not consider the local consistency in terms of the intrinsic geometric structure of the document manifold. We therefore propose an unsupervised probabilistic model called Semafore which aims to preserve the manifold in the lower-dimensional spaces. Comprehensive experiments on several real-life text datasets of news articles and web pages show that Semafore significantly outperforms the state-of-the-art baselines on objective evaluation metrics.
178Constructing Symbolic Representations for High-Level Planning,George Konidaris Leslie Kaelbling and Tomas Lozano-Perez,Novel Machine Learning Algorithms NMLA ,"Reinforcement learning
179Planning
180Representation","NMLA Reinforcement Learning
181PS Learning Models for Planning and Diagnosis
182ROB Cognitive Robotics",We consider the problem of constructing a symbolic description of a continuous low-level environment for use in planning. We show that symbols that can represent the preconditions and effects of an agent's actions are both necessary and sufficient for high-level planning. This enables reinforcement learning agents to acquire their own symbolic representations autonomously and eliminates the symbol design problem when a representation must be constructed in advance. The resulting representation can be converted into PDDL a canonical planning representation that enables very fast planning.
183Towards Understanding Unscripted Gesture and Language Input for Human-Robot Interactions,Cynthia Matuszek Liefeng Bo Luke Zettlemoyer and Dieter Fox,"Humans and AI HAI
184NLP and Machine Learning NLPML
185Robotics ROB ","Human-Robot Interaction
186Robotics
187Natural Language Processing
188ML classifier features","APP Other Applications
189HAI Language Acquisition
190MLA Machine Learning Applications General/other
191NLPML Natural Language Processing General/Other
192NMLA Time-Series/Data Streams
193ROB Human-Robot Interaction",As robots become more ubiquitous it is increasingly important for untrained users to be able to interact with them intuitively. In this work we investigate how people refer to objects in the world during relatively unstructured communication with robots. We collect a corpus of interactions from users describing objects which we use to train language and gesture models that allow our robot to determine what objects are being indicated. We introduce a temporal extension to state-of-the-art hierarchical matching pursuit features to support gesture understanding and demonstrate that combining multiple communication modalities more effectively captures user intent than relying on a single type of input. Finally we present initial interactions with a robot that uses the learned models to follow commands while continuing to learn from user input.
194Intelligent System for Urban Emergency Management During Large-scale Disaster,Xuan Song Quanshi Zhang and Ryosuke Shibasaki,Computational Sustainability and AI CSAI ,"Emergency Management
195Disaster Informatics
196Human Mobility",CSAI Modeling and prediction of dynamic and spatiotemporal phenomena and systems,The frequency and intensity of natural disasters have significantly increased over the past decades and this trend is predicted to continue. Facing these unexpected disasters urban emergency management has become an especially important issue for governments around the world. In this paper we present a novel intelligent system for urban emergency management during large-scale disasters. The proposed system stores and manages the global positioning system GPS records from mobile devices used by approximately 1.6 million people throughout Japan from 1 August 2010 to 31 July 2011. By mining and analyzing population movements after the Great East Japan Earthquake our system is able to automatically learn a probabilistic model to better understand and simulate human mobility during emergency situations. Based on the learned model population mobility in various urban areas impacted by the earthquake throughout Japan can be automatically simulated or predicted. With such a system it is easy to find new features or population mobility patterns after the recent and unprecedented composite disasters which are likely to provide valuable experience and play a vital role in future disaster management worldwide.
197Cached Iterative Weakening for Optimal Multi-Way Number Partitioning,Ethan Schreiber and Richard Korf,"Heuristic Search and Optimization HSO
198Planning and Scheduling PS
199Search and Constraint Satisfaction SCS ","Heuristic Search
200Optimization
201Search
202Scheduling
203Constraint Optimization","HSO Heuristic Search
204HSO Optimization
205HSO Search General/Other
206PS Scheduling
207SCS Constraint Satisfaction","The NP-hard number-partitioning problem is to separate a multiset
208 S of n positive integers into k subsets such that the largest
209 sum of the integers assigned to any subset is minimized. The classic
210 application is scheduling a set of n jobs with different run times
211 onto k identical machines such that the makespan the time to
212 complete the schedule is minimized. We present a new algorithm
213 cached iterative weakening CIW for solving this problem
214 optimally. It incorporates three ideas distinct from the previous
215 state of the art it explores the search space using iterative
216 weakening instead of branch and bound; generates feasible subsets
217 once and caches them instead of at each node of the search tree; and
218 explores subsets in cardinality order instead of an arbitrary
219 order. The previous state of the art is represented by three different
220 algorithms depending on the values of n and k. We provide one
221 algorithm which outperforms all previous algorithms for k >=
222 4. Our run times are up to two orders of magnitude faster."
223Online Social Spammer Detection,Xia Hu Jiliang Tang and Huan Liu,"AI and the Web AIW
224Machine Learning Applications MLA ","Social Media
225Social Spammer
226Online Learning","AIW Machine learning and the web
227AIW Recognizing web spam such as link farms and splogs
228AIW Web personalization and user modeling
229MLA Machine Learning Applications General/other
230NMLA Data Mining and Knowledge Discovery
231NMLA Online Learning",The explosive use of social media also makes it a popular platform for malicious users known as social spammers to overwhelm normal users with unwanted content. One effective way for social spammer detection is to build a classifier based on content and social network information. However social spammers are sophisticated and adaptable to game the system with fast evolving content and network patterns. First social spammers continually change their spamming content patterns to avoid being detected. Second reflexive reciprocity makes it easier for social spammers to establish social influence and pretend to be normal users by quickly accumulating a large number of ``human'' friends. It is challenging for existing anti-spamming systems based on batch-mode learning to quickly respond to newly emerging patterns for effective social spammer detection. In this paper we present a general optimization framework to collectively use content and network information for social spammer detection and provide the solution for efficient online processing. Experimental results on Twitter datasets confirm the effectiveness and efficiency of the proposed framework.
232Modeling and Predicting Popularity Dynamics via Reinforced Poisson Process,Huawei Shen Dashun Wang Chaoming Song and Albert Laszlo Barabasi,Applications APP ,"Social Dynamics
233Poisson Process
234Popularity Prediction","APP Computational Social Science
235APP Social Networks",An ability to predict the popularity dynamics of individual items within a complex evolving system has important implications in an array of areas. Here we propose a generative probabilistic framework using a reinforced Poisson process to model explicitly the process through which individual items gain their popularity. This model distinguishes itself from existing models via its capability of modeling the arrival process of popularity and its remarkable power at predicting the popularity of individual items. It possesses the flexibility of applying Bayesian treatment to further improve the predictive power using a conjugate prior. Extensive experiments on a longitudinal citation dataset demonstrate that this model consistently outperforms existing popularity prediction methods.
236The Computational Rise and Fall of Fairness,John Dickerson Jonathan Goldman Jeremy Karp Ariel Procaccia and Tuomas Sandholm,Game Theory and Economic Paradigms GTEP ,"Fair division
237Computational social choice
238Envy-free allocation
239Phase transition",GTEP Social Choice / Voting,The fair division of indivisible goods has long been an important topic in economics and more recently computer science. We investigate the existence of envy-free allocations of indivisible goods that is allocations where each player values her own allocated set of goods at least as highly as any other player's allocated set of goods. Under additive valuations we show that even when the number of goods is larger than the number of agents by a linear fraction envy-free allocations are unlikely to exist. We then show that when the number of goods is larger by a logarithmic factor such allocations exist with high probability. We support these results experimentally and show that the asymptotic behavior of the theory holds even when the number of goods and agents is quite small. We demonstrate that there is a sharp phase transition from nonexistence to existence of envy-free allocations and that on average the computational problem is hardest at that transition.
240Type-based Exploration for Satisficing Planning with Multiple Search Queues,Fan Xie Martin Mueller and Robert Holte,"Heuristic Search and Optimization HSO
241Planning and Scheduling PS ","Satisficing Planning
242Heuristic Search
243Greedy Best First Search","HSO Heuristic Search
244PS Deterministic Planning","Utilizing multiple queues in Greedy Best-First Search GBFS has been proven to be a very effective approach to satisficing planning. Successful applications include extra queues based on Helpful Actions or Preferred Operators as well as using Multiple Heuristics. One weakness of all standard GBFS algorithms is their lack of exploration. All queues used in these methods work as priority queues sorted by heuristic values. Therefore misleading heuristics especially early in the search process cause the search to become ineffective.
245
246Type systems as introduced for heuristic search by Lelis et al. are a recent development of ideas for exploration related to the classic stratified sampling approach. The current work introduces a search algorithm that utilizes type systems in a new way for exploration within a GBFS multiqueue framework in satisficing planning.
247
248A careful case study shows the benefits of such exploration for overcoming deficiencies of the heuristic. The proposed new baseline algorithm Type-GBFS solves almost 200 more problems than baseline GBFS over all International Planning Competition problems. Type-LAMA a new planner which integrates Type-GBFS into LAMA-2011 substantially improves upon LAMA in terms of both coverage and speed."
249Exact Subspace Clustering in Linear Time,Shusen Wang Bojun Tu Congfu Xu and Zhihua Zhang,Novel Machine Learning Algorithms NMLA ,"subspace clustering
250data selection
251scalable algorithm
252robust principal component analysis",NMLA Clustering,Subspace clustering is an important unsupervised learning problem with wide applications in computer vision and data analysis. However the state-of-the-art methods for this problem suffer from high time complexity---quadratic or cubic in $n$ the number of data instances. In this paper we exploit a data selection algorithm to speed up computation and robust principal component analysis to strengthen robustness. Accordingly we devise a scalable and robust subspace clustering method which costs time only linear in $n$. We prove theoretically that under certain mild assumptions our method solves the subspace clustering problem exactly even for grossly corrupted data. Our algorithm is based on very simple ideas yet it is the only linear time algorithm with noiseless or noisy recovery guarantee. Finally empirical results verify our theoretical analysis.
253Robust Bayesian Inverse Reinforcement Learning with Sparse Behavior Noise,Jiangchuan Zheng Siyuan Liu and Lionel Ni,Novel Machine Learning Algorithms NMLA ,"Inverse reinforcement learning
254Robust model
255Sparse behavior noise
256Variational inference",NMLA Reinforcement Learning,Inverse reinforcement learning IRL aims to recover the reward function underlying a Markov Decision Process from behaviors of experts in support of decision-making. Most recent work on IRL assumes the same level of trustworthiness of all expert behaviors and frames IRL as a process of seeking a reward function that makes those behaviors appear near-optimal. However it is common in reality that noisy expert behaviors disobeying the optimal policy exist which may degrade the IRL performance significantly. To address this issue in this paper we develop a robust IRL framework that can accurately estimate the reward function in the presence of behavior noise. In particular we focus on a special type of behavior noise referred to as sparse noise due to its wide popularity in real-world behavior data. To model such noise we introduce a novel latent variable characterizing the reliability of each expert action and use a Laplace distribution as its prior. We then devise an EM algorithm with a novel variational inference procedure in the E-step which can automatically identify and remove behavior noise in reward learning. Experiments on both synthetic data and real vehicle routing data with noticeable behavior noise show significant improvement of our method over previous approaches in learning accuracy and also demonstrate its power in de-noising behavior data.
257Lazy Defenders Are Almost Optimal Against Diligent Attackers,Avrim Blum Nika Haghtalab and Ariel Procaccia,Game Theory and Economic Paradigms GTEP ,"Security games
258Approximation
259Sampling","GTEP Game Theory
260GTEP Imperfect Information",Most work building on the Stackelberg security games model assumes that the attacker can perfectly observe the defender's randomized assignment of resources to targets. This assumption has been challenged by recent papers which designed tailor-made algorithms that compute optimal defender strategies for security games with limited surveillance. We analytically demonstrate that in zero-sum security games lazy defenders who simply keep optimizing against perfectly informed attackers are almost optimal against diligent attackers who go to the effort of gathering a reasonable number of observations. This result implies that in some realistic situations limited surveillance may not need to be explicitly addressed.
261PREGO An Action Language for Belief-Based Cognitive Robotics in Continuous Domains,Vaishak Belle and Hector Levesque,Knowledge Representation and Reasoning KRR ,"knowledge representation
262situation calculus
263cognitive robotics
264reasoning about beliefs
265action and change
266action languages","KRR Action Change and Causality
267KRR Knowledge Representation Languages
268KRR Reasoning with Beliefs
269KRR Knowledge Representation General/Other ","Cognitive robotics is often subject to the criticism that the proposals
270investigated in the literature are far removed from the kind of continuous
271uncertainty and noise seen in actual real-world robotics. This paper
272proposes a new language and an implemented system called PREGO based on
273the situation calculus that is able to reason effectively about degrees of
274belief against noisy sensors and effectors in continuous domains. It
275embodies the representational richness of conventional logic-based action
276languages such as context-sensitive successor state axioms but is still
277shown to be efficient using a number of empirical evaluations. We believe
278that PREGO is a simple yet powerful dialect to explore real-time reactivity
279and an interesting bridge between logic and probability for cognitive
280robotics applications."
281Dropout Training for Support Vector Machines,Ning Chen Jun Zhu Jianfei Chen and Bo Zhang,"Machine Learning Applications MLA
282Novel Machine Learning Algorithms NMLA ","Dropout training
283Support Vector Machines
284Data Augmentation
285Feature Noising","MLA Applications of Supervised Learning
286NMLA Classification
287NMLA Supervised Learning Other
288NMLA Machine Learning General/other ",Dropout and other feature noising schemes have shown promising results in controlling over-fitting by artificially corrupting the training data. Though extensive theoretical and empirical studies have been performed for generalized linear models little work has been done for support vector machines SVMs one of the most successful approaches for supervised learning. This paper presents dropout training for linear SVMs. To deal with the intractable expectation of the non-smooth hinge loss under corrupting distributions we develop an iteratively re-weighted least squares IRLS algorithm by exploring data augmentation techniques. Our algorithm iteratively minimizes the expectation of a re-weighted least squares problem where the re-weights have closed-form solutions. Similar ideas are applied to develop a new IRLS algorithm for the expected logistic loss under corrupting distributions. Our algorithms offer insights on the connection and difference between the hinge loss and logistic loss in dropout training. Empirical results on several real datasets demonstrate the effectiveness of dropout training in significantly boosting the classification accuracy of linear SVMs.
289Game-theoretic Resource Allocation for Protecting Large Public Events,Yue Yin Bo An and Manish Jain,"Applications APP
290Game Theory and Economic Paradigms GTEP
291Multiagent Systems MAS ","Security
292Game Theory
293Stackelberg Games","APP Security and Privacy
294GTEP Game Theory
295MAS Multiagent Systems General/other ","High profile large scale public events are attractive targets for terrorist attacks. The recent Boston Marathon bombings on April 15 2013 have further emphasized the importance of protecting public events. The security challenge is exacerbated by the dynamic nature of such events e.g. the impact of an attack at different locations changes over time as the Boston marathon participants and spectators move along the race track. In addition the defender can relocate security resources among potential attack targets at any time and the attacker may act at any time during the event.
296
297This paper focuses on developing efficient patrolling algorithms for such dynamic domains with continuous strategy spaces for both the defender and the attacker. We aim at computing optimal pure defender strategies since an attacker does not have an opportunity to learn and respond to mixed strategies due to the relative infrequency of such events. We propose SCOUT-A which makes assumptions on relocation cost exploits payoff representation and computes optimal solutions efficiently. We also propose SCOUT-C to compute the exact optimal defender strategy for general cases despite the continuous strategy spaces. SCOUT-C computes the optimal defender strategy by constructing an equivalent game with discrete defender strategy space then solving the constructed game. Experimental results show that both SCOUT-A and SCOUT-C significantly outperform other existing strategies."
298Quality-based Learning for Web Data Classification,Ou Wu,"AI and the Web AIW
299Machine Learning Applications MLA ","Information quality
300Multi-task learning
301Web data classification","AIW Machine learning and the web
302MLA Applications of Supervised Learning",The types of web data vary in terms of information quantity and quality. For example some pages contain numerous texts whereas some others contain few texts; some web videos are in high resolution whereas some other web videos are in low resolution. As a consequence the quality of extracted features from different web data may also vary greatly. Existing learning algorithms on web data classification usually ignore the variations of information quality or quantity. In this paper the information quantity and quality of web data are described by quality-related factors such as text length and image quantity and a new learning method is proposed to train classifiers based on quality-related factors. The method divides training data into subsets according to the clustering results of quality-related factors and then trains classifiers by using a multi-task learning strategy for each subset. Experimental results indicate that the quality-related factors are useful in web data classification and the proposed method outperforms conventional algorithms that do not consider information quantity and quality.
303A Strategy-Proof Online Auction with Time Discounting Values,Fan Wu Junming Liu Zhenzhe Zheng and Guihai Chen,Game Theory and Economic Paradigms GTEP ,"Online Auction
304Mechanism Design
305Game Theory",GTEP Auctions and Market-Based Systems,Online mechanism design has been widely applied to various practical applications. However designing a strategy-proof online mechanism is much more challenging than in a static scenario due to the lack of knowledge of future information. In this paper we investigate online auctions with time discounting values in contrast to the flat values studied in most existing work. We present a strategy-proof 2-competitive online auction mechanism despite time discounting values. We also implement our design and compare it with the off-line optimal solution. Our numerical results show that our design achieves good performance in terms of social welfare revenue average winning delay and average valuation loss.
306ReLISH Reliable Label Inference via Smoothness Hypothesis,Chen Gong Dacheng Tao Keren Fu and Jie Yang,Novel Machine Learning Algorithms NMLA ,"Semi-supervised learning
307Local smoothness
308Regularization","NMLA Classification
309NMLA Semisupervised Learning",The smoothness hypothesis is critical for graph-based semi-supervised learning. This paper defines local smoothness based on which a new algorithm Reliable Label Inference via Smoothness Hypothesis ReLISH is proposed. ReLISH has produced smoother labels than some existing methods for both labeled and unlabeled examples. Theoretical analyses demonstrate good stability and generalizability of ReLISH. Using real-world datasets our empirical analyses reveal that ReLISH is promising for both transductive and inductive tasks when compared with representative algorithms including Harmonic Functions Local and Global Consistency Constraint Metric Learning Linear Neighborhood Propagation and Manifold Regularization.
310Parallel Materialisation of Datalog Programs in Centralised Main-Memory RDF Systems,Boris Motik Yavor Nenov Robert Piro Ian Horrocks and Dan Olteanu,"AI and the Web AIW
311Knowledge Representation and Reasoning KRR ","datalog
312materialization
313fixpoint computation
314parallelism
315big data","AIW Question answering on the web
316AIW Representing reasoning and using provenance trust privacy and security on the web
317KRR Ontologies
318KRR Automated Reasoning and Theorem Proving
319KRR Logic Programming",We present a novel approach to parallel materialisation i.e. fixpoint computation of datalog programs in centralised main-memory multi-core RDF systems. The approach comprises an algorithm that evenly distributes the workload to cores and an RDF indexing data structure that supports efficient 'mostly' lock-free parallel updates. Our empirical evaluation shows that our approach parallelises computation very well so with 16 physical cores materialisation can be up to 13.9 times faster than with just one core.
320Non-linear Label Ranking for Large-scale Prediction of Long-Term User Interests,Nemanja Djuric Mihajlo Grbovic Vladan Radosavljevic Narayan Bhamidipati and Slobodan Vucetic,"Machine Learning Applications MLA
321Novel Machine Learning Algorithms NMLA ","Computational advertising
322Label ranking
323Online learning
324Large-scale learning
325Big data","MLA Applications of Supervised Learning
326NMLA Big Data / Scalability
327NMLA Preferences/Ranking Learning",We consider the problem of personalization of online services from the viewpoint of display ad targeting where we seek to find the best ad categories to be shown to each user resulting in improved user experience and increased advertiser's revenue. We propose to address this problem as a task of ranking the ad categories by each user's preferences and introduce a novel label ranking approach capable of efficiently learning non-linear highly accurate models in large-scale settings. Experiments on real-world advertising data set with more than 3.2 million users show that the proposed algorithm outperforms the existing solutions in terms of both rank loss and top-K retrieval performance strongly suggesting the benefit of using the proposed model on large-scale ranking problems.
328Efficient Object Detection via Adaptive Online Selection of Sensor-Array Elements,Matthai Philipose,"Novel Machine Learning Algorithms NMLA
329Planning and Scheduling PS
330Reasoning under Uncertainty RU
331Vision VIS ","object detection
332low power
333value of information
334adaptive submodular optimization
335online optimization","MLA Machine Learning Applications General/other
336NMLA Active Learning
337NMLA Bayesian Learning
338PS Probabilistic Planning
339PS Temporal Planning
340RU Bayesian Networks
341RU Decision/Utility Theory
342RU Probabilistic Inference
343RU Sequential Decision Making
344VIS Categorization
345VIS Object Detection
346VIS Statistical Methods and Learning
347VIS Videos","We examine how to use emerging far-infrared imager ensembles to detect certain
348objects of interest e.g. faces hands people and animals in synchronized
349RGB video streams at very low power. We formulate the problem as one of selecting
350subsets of sensing elements among many thousand possibilities from the
351ensembles for tests. The subset selection problem is naturally adaptive and online: testing certain elements early can obviate the need for testing many others later and selection policies must be updated at inference time. We pose the ensemble sensor selection problem as a structured extension of test-cost-sensitive classification propose a principled suite of techniques to exploit ensemble structure to speed up processing and show how to re-estimate policies fast. We estimate reductions in power consumption of roughly 50x relative to even highly optimized implementations of face detection a canonical object-detection problem. We also illustrate the benefits of adaptivity and online estimation."
352Simultaneous Cake Cutting,Eric Balkanski Simina Brânzei David Kurokawa and Ariel Procaccia,Game Theory and Economic Paradigms GTEP ,"Cake cutting
353Fair division
354Computational social choice",GTEP Social Choice / Voting,We introduce the simultaneous model for cake cutting the fair allocation of a divisible good in which agents simultaneously send messages containing a sketch of their preferences over the cake. We show that this model enables the computation of divisions that satisfy proportionality - a popular fairness notion - using a protocol that circumvents a standard lower bound via parallel information elicitation. Cake divisions satisfying another prominent fairness notion envy-freeness are impossible to compute in the simultaneous model but such allocations admit arbitrarily good approximations.
355Recovering from Selection Bias in Causal and Statistical Inference,Elias Bareinboim Jin Tian and Judea Pearl,"Knowledge Representation and Reasoning KRR
356Reasoning under Uncertainty RU ","Causal Inference
357Causal Reasoning
358Do-calculus
359Causality
360Selection Bias
361Sampling Selection Bias
362Case-control studies
363Bias Removal
364Backdoor criterion","KRR Action Change and Causality
365RU Bayesian Networks
366RU Uncertainty in AI General/Other ",Selection bias is caused by preferential exclusion of units from the samples and represents a major obstacle to valid causal and statistical inferences; it cannot be removed by randomized experiments and can rarely be detected in either experimental or observational studies. Extending the results of Bareinboim and Pearl 2012 we provide complete graphical and algorithmic conditions for recovering conditional probabilities from selection biased data. We further provide graphical conditions for recoverability when unbiased data is available over a subset of the variables. Finally we provide a graphical condition that generalizes the backdoor criterion and serves to recover causal effects when the data is collected under preferential selection.
367Semi-supervised Target Alignment via Label-Aware Base Kernels,Qiaojun Wang Kai Zhang and Ivan Marsic,Novel Machine Learning Algorithms NMLA ,"Semi-supervised Kernel Learning
368Eigenfunction Extrapolation
369Kernel Target Alignment
370Ideal Kernel","NMLA Kernel Methods
371NMLA Semisupervised Learning",Currently a large family of kernel methods for semi-supervised learning SSL problems builds the kernel by weighted average of predefined base kernels i.e. those spanned by kernel eigenvectors. Optimization of the base kernel weights has been studied extensively in the literature. However little attention was devoted to designing high-quality base kernels. Note that the eigenvectors of the kernel matrix which are computed irrespective of class labels may not always reveal useful structures of the target. As a result the generalization performance can be poor however hard the base kernel weighting is tuned. On the other hand there are many SSL algorithms whose focus is not on kernel design but instead the estimation of the class labels directly. Motivated by the label propagation approach in this paper we propose to construct novel kernel eigenvectors by injecting the class label information under the framework of eigenfunction extrapolation. A set of ``label-aware'' base kernels can be obtained with greatly improved quality which leads to higher target alignment and hence better performance. Our approach is computationally efficient and demonstrates encouraging performance in semi-supervised classification and regression tasks.
372Active Learning with Model Selection via Nested Cross-Validation,Alnur Ali Rich Caruana and Ashish Kapoor,Humans and AI HAI ,"active learning
373model selection
374machine learning",NMLA Active Learning,Most work on active learning avoids the issue of model selection by training models of only one type SVMs boosted trees etc. using one pre-defined set of model hyperparameters. We propose an algorithm that actively samples data to simultaneously train a set of candidate models different model types and/or different hyperparameters and also to select the best model from this set of candidates. The algorithm actively samples points for training that are most likely to improve the accuracy of the more promising candidate models and also samples points to use for model selection---all samples count against the same fixed labeling budget. This exposes a natural trade-off between the focused active sampling that is most effective for training models and the unbiased uniform sampling that is better for model selection. We empirically demonstrate on six test problems that this algorithm is nearly as effective as an active learning oracle that knows the optimal model in advance.
375Sketch Recognition with Natural Correction and Editing,Jie Wu Changhu Wang Liqing Zhang and Yong Rui,"AI and the Web AIW
376Humans and AI HAI
377Machine Learning Applications MLA
378Vision VIS ","Sketch Recognition
379Symbol Recognition
380User Interface
381Correction and Editing
382Shape Knowledge","AIW Intelligent user interfaces for web systems
383HAI Human-Computer Interaction
384HAI Interaction Techniques and Devices
385MLA Applications of Supervised Learning
386NMLA Data Mining and Knowledge Discovery
387VIS Object Recognition",In this paper we address the problem of sketch recognition. We systematically study how to incorporate users' natural correction and editing into isolated and full sketch recognition. This is a natural and necessary interaction in real systems such as Visio where extremely similar shapes exist. First a novel algorithm is proposed to mine the prior shape knowledge for three editing modes. Second to differentiate visually similar shapes a novel symbol recognition algorithm is introduced by leveraging the learnt shape knowledge. Then a novel correction/editing detection algorithm is proposed to facilitate symbol recognition. Furthermore both the symbol recognizer and the correction/editing detector are systematically incorporated into the full sketch recognition. Finally based on the proposed algorithms a real-time sketch recognition system is built to recognize hand-drawn flowcharts/diagrams with flexible interactions. Extensive experiments on benchmark datasets show the effectiveness of the proposed algorithms.
388Generalized Label Reduction for Merge-and-Shrink Heuristics,Silvan Sievers Martin Wehrle and Malte Helmert,"Heuristic Search and Optimization HSO
389Planning and Scheduling PS ","classical planning
390heuristic search
391merge-and-shrink abstractions
392label reduction","HSO Heuristic Search
393HSO Optimization
394HSO Evaluation and Analysis Search and Optimization
395PS Deterministic Planning",Label reduction is a technique for simplifying families of labeled transition systems by dropping distinctions between certain transition labels. While label reduction is critical to the efficient computation of merge-and-shrink heuristics current theory only permits reducing labels in a limited number of cases. We generalize this theory so that labels can be reduced in every intermediate abstraction of a merge-and-shrink tree. This is particularly important for efficiently computing merge-and-shrink abstractions based on non-linear merge strategies. As a case study we implement a non-linear merge strategy based on the original work on merge-and-shrink heuristics in model checking by Dräger et al.
396Predicting Emotions in User-Generated Videos,Yu-Gang Jiang and Baohan Xu,AI and the Web AIW ,"Emotion
397User-generated videos
398Multimodal features",AIW AI for multimedia and multimodal web applications,User-generated video collections have been expanding rapidly in recent years and systems for automatic analysis of these collections are in high demand. While extensive research efforts have been devoted to recognizing semantics like birthday party and skiing few attempts have been made to understand the emotions carried by the videos e.g. joy and sadness. In this paper we propose a comprehensive computational framework for predicting emotions in user-generated videos. We first introduce a rigorously designed dataset collected from popular video-sharing websites with manual annotations which can serve as a valuable benchmark for future research. A large set of features is extracted from this dataset ranging from popular low-level visual descriptors audio features to high-level semantic attributes. Results of a comprehensive set of experiments indicate that combining multiple types of features---such as the joint use of the audio and visual clues---is important and attribute features such as those containing sentiment-level semantics are very effective.
399Emotion Classification in Microblog Texts Using Class Sequential Rules,Shiyang Wen and Xiaojun Wan,AI and the Web AIW ,"Emotion Classification
400Chinese Microblogs
401Class Sequential Rules","AIW Knowledge acquisition from the web
402AIW Web-based opinion extraction and trend spotting",This paper studies the problem of emotion classification in microblog texts. Given a microblog text which consists of several sentences we classify its emotion as anger disgust fear happiness like sadness or surprise if possible. Existing methods can be categorized as lexicon based methods or machine learning based methods. However due to some intrinsic characteristics of the microblog texts previous studies using these methods always get unsatisfactory results. This paper introduces a novel approach based on class sequential rules for emotion classification of microblog texts. The approach first obtains two potential emotion labels for each sentence in a microblog text by using an emotion lexicon and a machine learning approach respectively and regards each microblog text as a data sequence. It then mines class sequential rules from the sequence set and finally derives new features from the mined rules for emotion classification of microblog texts. Experimental results on a Chinese benchmark dataset show the superior performance of the proposed approach.
403k-CoRating Filling up Data to Obtain Privacy and Utility,Feng Zhang Victor E Lee and Ruoming Jin,"AI and the Web AIW
404Applications APP
405Novel Machine Learning Algorithms NMLA ","Privacy-preserving Collaborative Filtering Recommender Systems
406Data Privacy
407Parallel Computing","AIW Web-based recommendation systems
408APP Security and Privacy
409NMLA Recommender Systems",For datasets in Collaborative Filtering CF recommendations even if the identifier is deleted and some trivial perturbation operations are applied to ratings before they are released there are research results claiming that the adversary could discriminate the individual's identity with a little bit of information. In this paper we propose $k$-coRating a novel privacy-preserving model to retain data privacy by replacing some null ratings with significantly predicted scores. They do not only mask the original ratings such that a $k$-anonymity-like data privacy is preserved but also enhance the data utility measured by prediction accuracy in this paper which shows that the traditional assumption that accuracy and privacy are two goals in conflict is not necessarily correct. We show that the optimal $k$-coRated mapping is an NP-hard problem and design a naive but efficient algorithm to achieve $k$-coRating. All claims are verified by experimental results.
410Adaptive Multi-Compositionality for Recursive Neural Models with Applications to Sentiment Analysis,Li Dong Furu Wei Ming Zhou and Ke Xu,NLP and Machine Learning NLPML ,"recursive neural network
411sentiment analysis
412semantic composition
413recursive neural model
414neural network
415deep learning",NLPML Natural Language Processing General/Other ,Recursive neural models have achieved promising results in many natural language processing tasks. The main difference among these models lies in the composition function i.e. how to obtain the vector representation for a phrase or sentence using the representations of words it contains. This paper introduces a novel Adaptive Multi-Compositionality AdaMC layer to recursive neural models. The basic idea is to use more than one composition function and adaptively select among them depending on the input vectors. We present a general framework to model each semantic composition as a distribution over these composition functions. The composition functions and parameters used for adaptive selection are learned jointly from data. We integrate AdaMC into existing recursive neural models and conduct extensive experiments on the Stanford Sentiment Treebank. The results illustrate that AdaMC significantly outperforms state-of-the-art sentiment classification methods. It helps push the best accuracy of sentence-level negative/positive classification from 85.4% up to 88.5%.
416Predicting the Hardness of Learning Bayesian Networks,Brandon Malone Kustaa Kangas Matti Järvisalo Mikko Koivisto and Petri Myllymäki,"Heuristic Search and Optimization HSO
417Novel Machine Learning Algorithms NMLA ","Bayesian networks
418structure learning
419algorithm portfolios
420empirical hardness models","HSO Metareasoning and Metaheuristics
421HSO Evaluation and Analysis Search and Optimization
422NMLA Graphical Model Learning
423NMLA Evaluation and Analysis Machine Learning ","There are various algorithms for finding a Bayesian network structure
424 BNS that is optimal with respect to a given scoring function. No
425single algorithm dominates the others in speed and given a problem
426instance it is a priori unclear which algorithm will perform
427best and how fast it will solve the problem. Estimating the running
428times directly is extremely difficult as they are complicated functions
429of the instance. The main contribution of this paper is characterization
430of the empirical hardness of an instance for a given algorithm based on
431a novel collection of non-trivial yet efficiently computable features.
432Our empirical results based on the largest evaluation of
433state-of-the-art BNS learning algorithms to date demonstrate that we
434can predict the runtimes to a reasonable degree of accuracy and
435effectively select algorithms that perform well on a particular
436instance. Moreover we also show how the results can be utilized in
437building a portfolio algorithm that combines several individual
438algorithms in an almost optimal manner."
439Stochastic Privacy Model Methods and Experiments,Adish Singla Ece Kamar Ryen White and Eric Horvitz,AI and the Web AIW ,"privacy tradeoff
440value of information
441online services
442web search personalization
443submodular optimization","AIW Representing reasoning and using provenance trust privacy and security on the web
444AIW Web personalization and user modeling
445APP Security and Privacy
446RU Decision/Utility Theory",Online services such as web search and e-commerce applications typically rely on the collection of data about users including details of their activities on the web. Such personal data is used to enhance the quality of service via personalization of content and to maximize revenues via better targeting of advertisements and deeper engagement of users on sites. To date service providers have largely followed the approach of either requiring or requesting consent for opting-in to share their data. Users may be willing to share private information in return for better quality of service or for incentives or in return for assurances about the nature and extent of the logging of data. We introduce \emph{stochastic privacy} a new approach to privacy centering on a simple concept A guarantee is provided to users about the upper-bound on the probability that their personal data will be used. Such a probability which we refer to as \emph{privacy risk} can be assessed by users as a preference or communicated as a policy by a service provider. Service providers can work to personalize and to optimize revenues in accordance with preferences about privacy risk. We present procedures proofs and an overall system for maximizing the quality of services while respecting bounds on allowable or communicated privacy risk. We demonstrate the methodology with a case study and evaluation of the procedures applied to web search personalization. We show how we can achieve near-optimal utility of accessing information with provable guarantees on the probability of sharing data.
447A Parameterized Complexity Analysis of Generalized CP-Nets,Martin Kronegger Martin Lackner Andreas Pfandler and Reinhard Pichler,"Game Playing and Interactive Entertainment GPIE
448Knowledge Representation and Reasoning KRR ","Computational social choice
449CP-nets
450Fixed-parameter tractable algorithms
451 Parameterized complexity","GTEP Social Choice / Voting
452KRR Computational Complexity of Reasoning
453KRR Preferences",Generalized CP-nets GCP-nets allow a succinct representation of preferences over multi-attribute domains. As a consequence of their succinct representation many GCP-net related tasks are computationally hard. Even finding the more preferable of two outcomes is PSPACE-complete. In this work we employ the framework of parameterized complexity to achieve two goals First we want to gain a deeper understanding of the complexity of GCP-nets. Second we search for efficient fixed-parameter tractable algorithms.
454Solving Imperfect Information Games Using Decomposition,Neil Burch Michael Johanson and Michael Bowling,Game Theory and Economic Paradigms GTEP ,"game theory
455equilibrium theory
456extensive-form games
457imperfect information games","GTEP Game Theory
458GTEP Equilibrium
459GTEP Imperfect Information",Decomposition i.e. independently analyzing possible subgames has proven to be an essential principle for effective decision-making in perfect information games. However in imperfect information games decomposition has proven to be problematic. To date all proposed techniques for decomposition in imperfect information games have abandoned theoretical guarantees. This work presents the first technique for decomposing an imperfect information game into subgames that can be solved independently while retaining optimality guarantees on the full-game solution. We can use this technique to construct theoretically justified algorithms that make better use of information available at run-time overcome memory or disk limitations at run-time or make a time/space trade-off to overcome memory or disk limitations while solving a game. In particular we present an algorithm for subgame solving which guarantees performance in the whole game in contrast to existing methods which may have unbounded error. In addition we present an offline game solving algorithm CFR-D which can produce a Nash equilibrium for a game that is larger than available storage.
460Robust Visual Robot Localization Across Seasons using Network Flows,Tayyab Naseer Luciano Spinello Wolfram Burgard and Cyrill Stachniss,Robotics ROB ,"robotics
461visual localization
462seasons",ROB Localization Mapping and Navigation,Image-based localization is an important problem in robotics and an integral part of visual mapping and navigation systems. An approach to robustly matching images to previously recorded ones must be able to cope with seasonal changes especially when it is supposed to work reliably over long periods of time. In this paper we present a novel approach to visual localization of mobile robots in outdoor environments which is able to deal with substantial seasonal changes. We formulate image matching as a minimum cost flow problem in a data association graph to effectively exploit sequence information. This allows us to deal with non-matching image sequences that result from temporal occlusions or from visiting new places. We present extensive experimental evaluations under substantial seasonal changes. They suggest that our approach allows for an accurate matching across seasons and outperforms existing state-of-the-art methods such as FABMAP2 and SeqSLAM in such a context.
463Small-variance Asymptotics for Dirichlet Process Mixtures of SVMs,Yining Wang and Jun Zhu,Novel Machine Learning Algorithms NMLA ,"small-variance asymptotics
464Bayesian nonparametric modeling
465infinite SVM
466collapsed Gibbs sampling","NMLA Bayesian Learning
467NMLA Big Data / Scalability
468NMLA Classification
469NMLA Clustering","Infinite SVMs iSVM is a Dirichlet process DP mixture of large-margin classifiers. Though flexible in learning nonlinear classifiers and discovering latent clustering structures iSVM has a difficult inference task and existing methods could hinder its applicability to large-scale problems. This paper presents a small-variance asymptotic analysis to derive a simple and efficient algorithm which monotonically optimizes a max-margin DP-means M2 DPM problem an extension of DP-means for both predictive learning and descriptive clustering. Our analysis is built on Gibbs infinite SVMs an alternative DP mixture of large-margin machines
470which admits a partially collapsed Gibbs sampler without truncation by exploring data augmentation techniques. Experimental results show that M2 DPM runs much faster than similar algorithms without sacrificing prediction accuracies."
471Online Budgeted Social Choice,Joel Oren and Brendan Lucier,Game Theory and Economic Paradigms GTEP ,"Online algorithms
472online learning
473approximation algorithms
474computational social choice","GTEP Game Theory
475GTEP Social Choice / Voting
476GTEP Adversarial Learning
477GTEP Imperfect Information
478MAS E-Commerce","We consider a classic social choice problem in an online setting.
479In each round a decision maker observes a single agent's preferences over
480a set of $m$ candidates and must choose whether to irrevocably add a candidate
481to a selection set of limited cardinality $k$. Each agent's positional score
482depends on the candidates in the set when he arrives and the decision-maker's
483goal is to maximize average over all agents score.
484
485We prove that no algorithm even randomized can achieve an approximation
486factor better than $O(\log\log m / \log m)$. In contrast if the agents
487arrive in random order we present a $(1 - 1/e - o(1))$-approximate
488algorithm matching a lower bound for the offline problem.
489We show that improved performance is possible for natural input distributions
490or scoring rules.
491
492Finally if the algorithm is permitted to revoke decisions at a fixed
493cost we apply regret-minimization techniques to achieve approximation
494$1 - 1/e - o(1)$ even for arbitrary inputs."
495Using The Matrix Ridge Approximation to Speedup Determinantal Point Processes Sampling Algorithms,Shusen Wang Chao Zhang Hui Qian and Zhihua Zhang,Novel Machine Learning Algorithms NMLA ,"kernel approximation
496determinantal point process DPP
497matrix ridge approximation MRA
498the Nystrom method","NMLA Big Data / Scalability
499NMLA Kernel Methods",Determinantal point process DPP is an important probabilistic model that has extensive applications in artificial intelligence. The exact sampling algorithm of DPP requires the full eigenvalue decomposition of the kernel matrix which has high time and space complexities. This prohibits the application of DPP to large-scale datasets. Previous work has applied the Nystrom method to speed up the sampling algorithm of DPP and error bounds have been established for the approximation. In this paper we employ the matrix ridge approximation MRA to speed up the sampling algorithm of DPP and we show that our approach MRA-DPP has a stronger error bound than the Nystrom-DPP. In certain circumstances our MRA-DPP is provably exact whereas the Nystrom-DPP is far from the ground truth. Finally experiments on several real-world datasets show that our MRA-DPP is much more accurate than the other approximation approaches.
500TopicMF Simultaneously Exploiting Ratings and Reviews for Recommendation,Yang Bao Hui Fang and Jie Zhang,"AI and the Web AIW
501Knowledge Representation and Reasoning KRR ","recommender system
502ratings and free-form reviews
503Non-negative matrix factorization","AIW Web-based recommendation systems
504KRR Preferences
505NMLA Recommender Systems",Although users' preference is semantically reflected in the free-form review texts this wealth of information was not fully exploited for learning recommender models. Specifically almost all existing recommendation algorithms only exploit rating scores in order to find users' preference but ignore the review texts accompanied with rating information. In this paper we propose a novel matrix factorization model called TopicMF which simultaneously considers the ratings and accompanied review texts. Experimental results on 20 real-world datasets show the superiority of our model over the state-of-the-art models demonstrating its effectiveness for recommendation task.
506OurAgent'13 A Champion Adaptive Power Trading Agent,Daniel Urieli and Peter Stone,"Applications APP
507Computational Sustainability and AI CSAI
508Game Theory and Economic Paradigms GTEP
509Machine Learning Applications MLA
510Multiagent Systems MAS
511Novel Machine Learning Algorithms NMLA
512Planning and Scheduling PS
513Reasoning under Uncertainty RU ","Autonomous Electricity Trading Agents
514Machine Learning
515Reinforcement Learning
516Online Learning
517Smart Grid
518Trading agents competition
519Sustainable Energy","APP Other Applications
520CSAI Control and optimization of dynamic and spatiotemporal systems
521CSAI Modeling and control of complex high-dimensional systems
522CSAI Modeling the interactions of agents with different and often conflicting interests
523GTEP Auctions and Market-Based Systems
524MLA Environmental
525MLA Applications of Supervised Learning
526MLA Applications of Reinforcement Learning
527MAS Multiagent Systems General/other
528NMLA Reinforcement Learning
529PS Markov Models of Environments
530RU Sequential Decision Making","Sustainable energy systems of the future will no longer be able to
531rely on the current paradigm that energy supply follows demand. Many
532of the renewable energy resources do not necessarily produce the
533energy when it is needed and therefore there is a need for new market
534structures that motivate sustainable behaviors by participants. The
535Power Trading Agent Competition Power TAC is a new annual
536competition that focuses on the design and operation of future retail
537power markets specifically in smart grid environments with renewable
538energy production smart metering and autonomous agents acting on
539behalf of customers and retailers. It uses a rich open-source
540simulation platform that is based on real-world data and
541state-of-the-art customer models. Its purpose is to help researchers
542understand the dynamics of customer and retailer decision-making as
543well as the robustness of proposed market designs. This paper
544introduces OurAgent'13 the champion agent from the inaugural
545competition in 2013. OurAgent is an adaptive agent that learns and
546reacts to the environment in which it operates by heavily relying on
547reinforcement-learning and prediction methods. This paper describes the
548constituent components of our agent and examines the success of the
549complete agent through analysis of competition results and subsequent
550controlled experiments."
551Fast and Accurate Influence Maximization on Large Networks with Pruned Monte-Carlo Simulations,Naoto Ohsaka Takuya Akiba Yuichi Yoshida and Ken-Ichi Kawarabayashi,AI and the Web AIW ,"influence maximization
552viral marketing
553independent cascade model
554social networks",AIW Social networking and community identification,Influence maximization is a problem to find small sets of highly influential individuals in a social network to maximize the spread of influence under stochastic cascade models of propagation. Although the problem has been well-studied it is still highly challenging to find solutions of high quality in large-scale networks of the day. While Monte-Carlo-simulation-based methods produce nearly optimal solutions with a theoretical guarantee they are prohibitively slow for large graphs. As a result many heuristic methods without any theoretical guarantee have been developed but all of them substantially compromise solution quality. To address this issue we propose a new method for the influence maximization problem. Unlike other recent heuristic methods the proposed method is a Monte-Carlo-simulation-based method and thus it consistently produces solutions of high quality with the theoretical guarantee. On the other hand unlike other previous Monte-Carlo-simulation-based methods it runs as fast as other state-of-the-art methods and can be applied to large networks of the day. Through our extensive experiments we demonstrate the scalability and the solution quality of the proposed method.
555Fixing a Balanced Knockout Tournament,Haris Aziz Serge Gaspers Simon Mackenzie Nicholas Mattei Paul Stursberg and Toby Walsh,Game Theory and Economic Paradigms GTEP ,"knockout tournaments
556tournament fixing problem
557manipulation","GTEP Game Theory
558GTEP Social Choice / Voting",Balanced knockout tournaments are one of the most common formats for sports competitions and are also used in elections and decision-making. We consider the computational problem of finding the optimal draw for a particular player in such a tournament. The problem has generated considerable research within AI in recent years. We prove that checking whether there exists a draw in which a player wins is NP-complete thereby settling an outstanding open problem. Our main result has a number of interesting implications on related counting and approximation problems. We present a memoization-based algorithm for the problem that is faster than previous approaches. Moreover we highlight two natural cases that can be solved in polynomial time. All of our results also hold for the more general problem of counting the number of draws in which a given player is the winner.
559Querying Inconsistent Description Logic Knowledge Bases under Preferred Repair Semantics,Camille Bourgaux Meghyn Bienvenu and François Goasdoué,Knowledge Representation and Reasoning KRR ,"inconsistency-tolerant query answering
560complexity of query answering
561DL-Lite
562conjunctive queries","KRR Ontologies
563KRR Computational Complexity of Reasoning
564KRR Description Logics
565KRR Preferences","Recently several inconsistency-tolerant semantics have been introduced for querying
566inconsistent description logic knowledge bases. Most of these semantics rely on the notion of a repair defined as an inclusion-maximal subset of the facts ABox which is consistent with the ontology TBox . In this paper we investigate variants of two popular inconsistency-tolerant semantics obtained by replacing the classical notion of repair by different types of preferred repairs. For each of the resulting semantics we analyze the complexity of conjunctive query answering over knowledge bases expressed in the lightweight logic DL-Lite. Unsurprisingly query answering is intractable in all cases but we nonetheless identify one notion of preferred repair based upon assigning facts to priority levels whose data complexity is ``only'' coNP-complete. This leads us to propose an approach combining incomplete tractable methods with calls to a SAT solver. An experimental evaluation of the approach shows good scalability on realistic cases."
567Incomplete Preferences in Single-Peaked Electorates,Martin Lackner,Game Theory and Economic Paradigms GTEP ,"Computational social choice
568Preferences
569Incomplete information
570Structure
571Single-peaked
572Algorithms","GTEP Social Choice / Voting
573GTEP Imperfect Information","Incomplete preferences are likely to arise in real-world preference aggregation and voting systems.
574This paper deals with determining whether an incomplete preference profile is single-peaked.
575This is essential information since many intractable voting problems become tractable for single-peaked profiles.
576We prove that for incomplete profiles the problem of determining single-peakedness is NP-complete.
577Despite this computational hardness result we find four polynomial-time algorithms for reasonably restricted settings."
578Forecasting Potential Diabetes Complications,Yang Yang Walter Luyten Lu Liu Marie-Francine Moens Juanzi Li and Jie Tang,Applications APP ,"forecast diabetes complications
579feature sparseness
580sparse factor graph",APP Biomedical / Bioinformatics,Diabetes complications often afflict diabetes patients seriously over 68% of diabetes-related mortality is caused by diabetes complications. In this paper we study the problem of automatically diagnosing diabetes complications from patients' lab test results. The objective problem has two main challenges 1 feature sparseness a patient only takes 1.26% lab tests on average and 65.5% types of lab tests are taken by less than 10 patients; 2 knowledge skewness it lacks comprehensive detailed domain knowledge of association between diabetes complications and lab tests. To address these challenges we propose a novel probabilistic model called Sparse Factor Graph Model SparseFGM . SparseFGM projects sparse features onto a lower-dimensional latent space which alleviates the problem of sparseness. SparseFGM is also able to capture the associations between complications and lab tests which help handle the knowledge skewness. We evaluate the proposed model on a large collection of real medical records. SparseFGM outperforms +20% by F1 baselines significantly and gives detailed associations between diabetes complications and lab tests.
581Knowledge Graph Embedding by Translating on Hyperplanes,Zhen Wang Jianwen Zhang Jianlin Feng and Zheng Chen,"Knowledge Representation and Reasoning KRR
582Machine Learning Applications MLA
583NLP and Knowledge Representation NLPKR
584Novel Machine Learning Algorithms NMLA
585Reasoning under Uncertainty RU ","Knowledge Embedding
586Knowledge Graph
587Knowledge Reasoning
588Knowledge Completion
589Fact Extraction
590Representation Learning","KRR Knowledge Representation General/Other
591MLA Machine Learning Applications General/other
592NLPKR Semantics and Summarization
593NLPTM Information Extraction
594NMLA Relational/Graph-Based Learning
595RU Uncertainty Representations
596RU Uncertainty in AI General/Other ",We deal with embedding a large scale knowledge graph composed of entities and relations into a continuous vector space. TransE is a promising method proposed recently which is very efficient while achieving the state-of-the-art predictive performance. We discuss some mapping properties of relations which should be considered in embedding such as symmetric one-to-many many-to-one and many-to-many. We point out that TransE does not do well in dealing with these properties. Some complex models are capable of preserving these mapping properties but sacrifice efficiency. To make a good trade-off between model capacity and efficiency in this paper we propose a method where a relation is modeled as a hyperplane together with a translation operation on it. In this way we can well preserve the above mapping properties of relations with almost the same model complexity as TransE. Meanwhile as a practical knowledge graph is often far from complete how to construct negative examples to reduce false negative labels in training is very important. Utilizing the one-to-many/many-to-one mapping property of a relation we propose a simple trick to reduce the possibility of false negative labeling. We conduct extensive experiments on the tasks of link prediction triplet classification and fact extraction on benchmark data sets from WordNet and Freebase. They show impressive improvements in predictive accuracy and also the capability to scale up.
597A Control Dichotomy for Pure Scoring Rules,Edith Hemaspaandra Lane A. Hemaspaandra and Henning Schnoor,Game Theory and Economic Paradigms GTEP ,"voting systems
598computational social choice
599complexity
600scoring rules
601control of elections
602dichotomy theorems",GTEP Social Choice / Voting,"Scoring systems are an extremely important class of election systems. A length-m so-called scoring vector applies only to m-candidate elections. To handle general elections one must use a family of vectors one per length.
603
604The most elegant approach to making sure such families are family-like is the recently introduced notion of polynomial-time uniform pure scoring rules Betzler and Dorn 2010 where each scoring vector is obtained from its precursor by adding one new coefficient.
605
606We obtain the first dichotomy theorem for pure scoring rules for a control problem. In particular for constructive control by adding voters CCAV which is arguably the most important control type we show that CCAV is solvable in polynomial time for k-approval with k<=3 k-veto with k<=2 every pure scoring rule in which only the two top-rated candidates gain nonzero scores and a particular rule that is a hybrid of 1-approval and 1-veto. For all other pure scoring rules CCAV is NP-complete.
607
608We also investigate the descriptive richness of different models for defining pure scoring rules proving how more rule-generation time gives more rules proving that rationals give more rules than do the natural numbers and proving that some restrictions previously thought to be w.l.o.g. in fact do lose generality."
609Fast consistency checking of very large real-world RCC-8 constraint networks using graph partitioning,Charalampos Nikolaou and Manolis Koubarakis,"Knowledge Representation and Reasoning KRR
610Search and Constraint Satisfaction SCS ","qualitative spatial reasoning
611consistency checking
612graph partitioning","KRR Geometric Spatial and Temporal Reasoning
613KRR Qualitative Reasoning
614SCS Constraint Satisfaction","We present a new reasoner for RCC-8 constraint networks called gp-rcc8 that is based on the patchwork property of path-consistent tractable RCC-8 networks and graph partitioning. We compare gp-rcc8 with state of the art reasoners that are based on constraint propagation and backtracking search as well as one that is based on graph partitioning and SAT solving. Our evaluation considers very large real-world RCC-8 networks and medium-sized synthetic ones and shows that gp-rcc8
615outperforms the other reasoners for these networks while it is less efficient for smaller networks."
616Encoding Tree Sparsity in Multi-Task Learning A Probabilistic Framework,Lei Han Yu Zhang Guojie Song and Kunqing Xie,Novel Machine Learning Algorithms NMLA ,"Multi-Task Learning
617Sparsity
618Probabilistic Modeling",NMLA Transfer Adaptation Multitask Learning,Multi-task learning seeks to improve the generalization performance by sharing common information among multiple related tasks. A key assumption in most MTL algorithms is that all tasks are related which however may not hold in many real-world applications. Existing techniques which attempt to address this issue aim to identify groups of related tasks using group sparsity. In this paper we propose a probabilistic tree sparsity PTS model to utilize the tree structure to obtain the sparse solution instead of the group structure. Specifically each model coefficient in the learning model is decomposed into a product of multiple component coefficients each of which corresponds to a node in the tree. Based on the decomposition Gaussian and Cauchy distributions are placed on the component coefficients as priors to restrict the model complexity. We devise an efficient expectation maximization algorithm to learn the model parameters. Experiments conducted on both synthetic and real-world problems show the effectiveness of our model compared with state-of-the-art baselines.
619Efficient Generalized Fused Lasso with Application to the Diagnosis of Alzheimer's Disease,Bo Xin Yoshinubo Kawahara Yizhou Wang and Wen Gao,"Machine Learning Applications MLA
620Novel Machine Learning Algorithms NMLA ","Generalized Fused Lasso
621Alzheimer's Disease
622Parametric graph cut","MLA Bio/Medicine
623NMLA Dimension Reduction/Feature Selection
624NMLA Relational/Graph-Based Learning
625NMLA Structured Prediction",Generalized Fused Lasso GFL penalizes variables with L1 norms both on variables and their pairwise differences. GFL is useful when applied to data where prior information expressed with a graph is available in the domain. However the existing algorithms for GFL incur high computational cost and do not scale to large problems. In this paper we propose a fast and scalable algorithm for GFL. Based on the fact that the fusion penalty is the Lovász extension of a cut function we show that the key building block is equivalent to recursively solving graph cut problems. We then solve GFL efficiently via a parametric flow algorithm. Runtime comparisons demonstrate a significant speed-up over existing algorithms for GFL. Benefiting from the scalability of our algorithm we propose to formulate the diagnosis of Alzheimer's Disease as GFL. Experiments demonstrate that GFL seems to be a natural way to formulate such a problem. Not only is the diagnosis performance promising but the selected critical voxels are well structured consistent across tasks and in accordance with clinical prior knowledge.
626How Do Your Friends on Social Media Disclose Your Emotions?,Yang Yang Jia Jia Shumei Zhang Boya Wu Jie Tang and Juanzi Li,Applications APP ,"emotion inference
627images and comments
628generative model",APP Social Networks,Mining emotions hidden in images has attracted significant interest in particular with the rapid development of social networks. The emotional impact is very important for understanding the intrinsic meanings of images. Although many studies have been done most existing methods focus on image content but ignore the emotions of the user who has published the image. To understand the emotional impact from images one interesting question is How does social effect correlate with the emotion expressed in an image? Specifically can we leverage friends' interactions e.g. discussions related to an image to help discover the emotions? In this paper we formalize the problem and propose a novel emotion learning method by jointly modeling images posted by social users and comments added by friends. One advantage of the model is that it can distinguish those comments that are closely related to the emotion expression for an image from other irrelevant ones. Experiments on an open Flickr dataset show that the proposed model can significantly improve (+37.4% by F1) the accuracy for inferring emotions from images. More interestingly we found that half of the improvements are due to interactions between 1% of the closest friends.
629Contextually Supervised Source Separation with Application to Energy Disaggregation,Matt Wytock and Zico Kolter,"Computational Sustainability and AI CSAI
630Machine Learning Applications MLA ","energy disaggregation
631convex optimization
632source separation","CSAI Modeling and prediction of dynamic and spatiotemporal phenomena and systems
633MLA Environmental
634MLA Applications of Unsupervised Learning","We propose a new framework for single-channel source separation that lies
635between the fully supervised and unsupervised setting. Instead of supervision
636we provide input features for each source signal and use convex methods to
637estimate the correlations between these features and the unobserved signal
638decomposition. Contextually supervised source separation is a natural fit for
639domains with large amounts of data but no explicit supervision; our motivating
640application is energy disaggregation of hourly smart meter data (the separation
641of whole-home power signals into different energy uses). Here contextual
642supervision allows us to provide itemized energy usage for thousands of homes a task
643previously impossible due to the need for specialized data collection hardware.
644On smaller datasets which include labels we demonstrate that contextual
645supervision improves significantly over a reasonable baseline and existing
646unsupervised methods for source separation. Finally we analyze the case of
647$\ell_2$ loss theoretically and show that recovery of the signal components
648depends only on cross-correlation between features for different signals not on
649correlations between features for the same signal."
650The Computational Complexity of Structure-Based Causality,Hana Chockler Gadi Aleksandrowicz Joseph Y. Halpern and Alexander Ivrii,"Knowledge Representation and Reasoning KRR
651Reasoning under Uncertainty RU ","Knowledge Representation and Reasoning
652Causal Models
653Structural Causality
654Complexity
655Strong Cause
656Actual Cause
657Polynomial Hierarchy","KRR Action Change and Causality
658KRR Computational Complexity of Reasoning
659KRR Qualitative Reasoning
660KRR Reasoning with Beliefs
661RU Uncertainty Representations","Halpern and Pearl 2001 introduced a definition of
662actual causality; Eiter and Lukasiewicz 2001 showed that
663computing whether X=x is a cause of Y=y is NP-complete in binary
664models where all variables can take on only two values and
665Sigma_2^P-complete in general models. In the final version of their
666paper Halpern and Pearl 2005 slightly modified the definition of
667actual cause in order to deal with problems pointed out by Hopkins and
668Pearl in 2003. As we show this modification has a
669nontrivial impact on the complexity of computing actual cause.
670To characterize the complexity a new family D_k (k = 1, 2, 3, ...)
671is introduced which generalizes the class D^P introduced by
672Papadimitriou and Yannakakis 1984 (D^P is just D_1).
673We show that the complexity of computing causality is D_2-complete
674under the new definition. Chockler and Halpern 2004 extended the
675definition of causality by introducing notions of responsibility
676and blame. They characterized the complexity of determining the
677degree of responsibility and blame using the original definition of
678causality. Again we show that changing the definition of causality
679affects the complexity and completely characterize the complexity of
680determining the degree of responsibility and blame with the new definition."
681Biased Games,Ioannis Caragiannis David Kurokawa and Ariel Procaccia,Game Theory and Economic Paradigms GTEP ,"Equilibrium existence
682Equilibrium computation
683Solution concepts","GTEP Game Theory
684GTEP Equilibrium",We present a novel extension of normal form games that we call biased games. In these games a player's utility is influenced by the distance between his mixed strategy and a given base strategy. We argue that biased games capture important aspects of the interaction between software agents. Our main result is that biased games satisfying certain mild conditions always admit an equilibrium. We also tackle the computation of equilibria in biased games.
685Grounding Acoustic Echoes In Single View Geometry Estimation,Muhammad Wajahat Hussain Javier Civera and Luis Montano,"Knowledge Representation and Reasoning KRR
686Machine Learning Applications MLA
687Vision VIS ","Scene layout
688Scene understanding
689Acoustic echoes
690Room geometry","KRR Geometric Spatial and Temporal Reasoning
691MLA Applications of Supervised Learning
692MLA Machine Learning Applications General/other
693VIS Perception",Extracting the 3D geometry plays an important part in scene understanding. Recently structured prediction-based robust visual descriptors have been proposed for extracting the indoor scene layout from a passive agent's perspective i.e. a single image. This robustness is mainly due to modeling the physical interaction of the underlying room geometry with the objects and the humans present in the room. In this work we add the physical constraints coming from acoustic echoes generated by an audio source to this visual model. Our audio-visual 3D geometry descriptor improves over the state of the art in passive perception models as shown by experiments.
694Preference Elicitation and Interview Minimization in Stable Matchings,Joanna Drummond and Craig Boutilier,"Game Theory and Economic Paradigms GTEP
695Multiagent Systems MAS ","Stable Matching
696Preference Elicitation
697Computational Social Choice","APP Computational Social Science
698GTEP Social Choice / Voting
699KRR Preferences
700RU Decision/Utility Theory",While stable matching problems are widely studied little work has investigated schemes for effectively eliciting agent preferences using either preference (e.g. comparison) queries or interviews (to form such comparisons); and no work has addressed how to combine both. We develop a new model for representing and assessing agent preferences that accommodates both forms of information and heuristically minimizes the number of queries and interviews required to determine a stable matching. Our Refine-then-Interview RtI scheme uses coarse preference queries to refine knowledge of agent preferences and relies on interviews only to assess comparisons of relatively close options. Empirical results show RtI to compare favorably to a recent pure interview minimization algorithm and that the number of interviews is generally independent of the size of the market.
701Placement of Loading Stations for Electric Vehicles No Detours Necessary!,Stefan Funke André Nusser and Sabine Storandt,Computational Sustainability and AI CSAI ,"energy efficiency
702electric vehicles
703facility location","CSAI Control and optimization of dynamic and spatiotemporal systems
704CSAI Network modeling prediction and optimization.",Compared to conventional cars electric vehicles still suffer from a considerably shorter cruising range. Combined with the sparsity of battery loading stations the complete transition to E-mobility still seems a long way off. In this paper we consider the problem of placing as few loading stations as possible such that on any shortest path there are enough to guarantee sufficient energy supply. This means that EV owners no longer have to plan their trips ahead incorporating loading station positions and are no longer forced to accept long detours to reach their destinations. We show how to model this problem and introduce heuristics which provide close-to-optimal solutions even in large road networks.
705Extracting Keyphrases from Research Papers using Citation Networks,Sujatha Das Gollapalli and Cornelia Caragea,NLP and Text Mining NLPTM ,"CiteTextRank
706Citation Network
707PageRank","NLPTM Information Extraction
708NLPTM Natural Language Processing General/Other ","Keyphrases for a document concisely describe the document using a small set of
709phrases. Keyphrases have been previously
710shown to improve several document processing and retrieval tasks. In this work we study keyphrase extraction from research papers by leveraging citation networks.
711We propose CiteTextRank for extracting keyphrases
712a graph-based algorithm that incorporates evidence from
713both a document's content and the contexts
714in which the document is referenced within the citation network. Our model obtains significant
715improvements over the state-of-the-art models for this task. Specifically on several datasets of research papers
716CiteTextRank improves precision at rank 1 by as much as 16-60%."
717Learning Parametric Models for Social Infectivity in Multi-dimensional Hawkes Processes,Liangda Li and Hongyuan Zha,"AI and the Web AIW
718Applications APP
719Machine Learning Applications MLA ","Social infectivity
720diffusion network
721Hawkes process
722time-varying feature","AIW Machine learning and the web
723AIW Social networking and community identification
724APP Social Networks
725MLA Networks
726NMLA Time-Series/Data Streams","Efficient and effective learning of social infectivity is a critical challenge in modeling diffusion phenomena in social networks and other applications.
727Existing methods require a substantial amount of event cascades to guarantee the learning accuracy while only time-invariant infectivity is considered.
728Our paper overcomes those two drawbacks by constructing a more compact model and parameterizing the infectivity using time-varying features thus dramatically reducing the data requirement and enabling the learning of time-varying infectivity which also takes into account the underlying network topology.
729We replace the pairwise infectivity in the multidimensional Hawkes processes with linear combinations of those time-varying features and optimize the associated coefficients with lasso regularization on coefficients. To efficiently solve the resulting optimization problem we employ the technique of alternating direction method of multipliers and under that framework update each coefficient independently by optimizing a surrogate function which upper-bounds the original objective function. On both synthetic and real world data the proposed method performs better than alternatives in terms of both recovering the hidden diffusion network and predicting the occurrence time of social events."
730Sub-Selective Quantization for Large-Scale Image Search,Yeqing Li Chen Chen Wei Liu and Junzhou Huang,"Machine Learning Applications MLA
731Vision VIS ","subselection
732image retrieval
733binary embedding
734image hashing
735similarity search",VIS Image and Video Retrieval,Recently with the explosive growth of visual content on the Internet large-scale image search has attracted intensive attention. It has been shown that mapping high-dimensional image descriptors to compact binary codes can lead to considerable efficiency gains in both storage and similarity computation of images. However most existing methods still suffer from expensive training devoted to large-scale binary code learning. To address this issue we propose a sub-selection based matrix manipulation algorithm which can significantly reduce the computational cost of code learning. As case studies we apply the sub-selection algorithm to two popular quantization techniques PCA Quantization (PCAQ) and Iterative Quantization (ITQ). Crucially we can justify the resulting sub-selective quantization by proving its theoretical properties. Extensive experiments are carried out on three image benchmarks with up to one million samples corroborating the efficacy of the sub-selective quantization method in terms of image retrieval.
736Accurate Integration of Aerosol Predictions by Smoothing on a Manifold,Shuai Zheng and James Kwok,Machine Learning Applications MLA ,"aerosol optical depth
737manifold
738Gaussian random field",MLA Environmental,Accurately measuring the aerosol optical depth AOD is essential for our understanding of the climate. Currently AOD can be measured by i satellite instruments which operate on a global scale but have limited accuracies; and ii ground-based instruments which are more accurate but not widely available. Recent approaches focus on integrating measurements from these two sources to complement each other. In this paper we further improve the prediction accuracy by using the observation that the AOD varies slowly in the spatial domain. Using a probabilistic approach we impose this smoothness constraint by a Gaussian random field on the Earth's surface which can be considered as a two-dimensional manifold. The proposed integration approach is computationally simple and experimental results on both synthetic and real-world data sets show that it significantly outperforms the state-of-the-art.
739Leveraging Decomposed Trust in Probabilistic Matrix Factorization for Effective Recommendation,Hui Fang Yang Bao and Jie Zhang,AI and the Web AIW ,"multi-facet trust
740trust theory
741probabilistic matrix factorization
742rating prediction","AIW Representing reasoning and using provenance trust privacy and security on the web
743AIW Web-based recommendation systems",Trust has been used to replace or complement rating-based similarity in recommender systems to improve the accuracy of rating prediction. However people trusting each other may not always share similar preferences. In this paper we try to fill in this gap by decomposing the original single-aspect trust information into four general trust aspects i.e. benevolence integrity competence and predictability and further employing the support vector regression technique to incorporate them into the probabilistic matrix factorization model for rating prediction in recommender systems. Experimental results on four datasets demonstrate the superiority of our method over the state-of-the-art approaches.
744Multi-Instance Learning with Distribution Change,Wei-Jia Zhang and Zhi-Hua Zhou,Novel Machine Learning Algorithms NMLA ,"multi-instance learning
745covariate shift
746distribution change
747importance sampling",NMLA Classification,Multi-instance learning deals with tasks where each example is a bag of instances and the bag labels of training data are known whereas instance labels are unknown. Most previous studies on multi-instance learning assumed that the training and testing data are from the same distribution; however this assumption is often violated in real tasks. In this paper we present possibly the first study on multi-instance learning with distribution change. We propose the MICS approach by considering both bag-level distribution change and instance-level distribution change. Experiments show that MICS is almost always significantly better than many state-of-the-art multi-instance learning approaches when distribution change occurs and even when there is no distribution change it is still highly competitive.
748Minimising Undesired Task Costs in Multi-robot Task Allocation Problems with In-Schedule Dependencies,Bradford Heap and Maurice Pagnucco,"Multiagent Systems MAS
749Robotics ROB ","Multi-Robot Task Allocation
750Distributed Auctions
751Multi-Agent Systems
752Multi-Robot Systems
753Autonomous Systems
754Autonomous Robotics
755Market-Based Systems","GTEP Auctions and Market-Based Systems
756MAS Coordination and Collaboration
757MAS Distributed Problem Solving
758ROB Multi-Robot Systems","In multi-robot task allocation problems with in-schedule dependencies tasks with high costs have a large influence on the overall time for a robot team to complete all tasks.
759We reduce this influence by calculating a novel task cost dispersion value which measures robots' collective preference for each task.
760By modifying the winner determination phase of sequential single-item auctions our approach inspects the bids for every task to identify tasks which robots collectively consider to be high cost and we ensure these tasks are allocated before other tasks.
761We show empirically this method reduces the overall time taken to complete all tasks."
762A Machine Learning Approach to Musically Meaningful Homogeneous Style Classification,William Herlands Ricky Der Yoel Greenberg and Simon Levin,Machine Learning Applications MLA ,"supervised machine learning
763music
764information retrieval
765style classification
766classical music","MLA Humanities
767MLA Applications of Supervised Learning
768MLA Machine Learning Applications General/other ",Recent literature has demonstrated the difficulty of classifying between composers who write in extremely similar styles (homogeneous style). Additionally machine learning studies in this field have been exclusively of technical import with little musicological interpretability or significance. We present a supervised machine learning system which addresses the difficulty of differentiating between stylistically homogeneous composers using foundational elements of music their complexity and interaction. Our work expands on previous style classification studies by developing more complex features as well as introducing a new class of musical features which focus on local irregularities within musical scores. We demonstrate the discriminative power of the system as applied to Haydn and Mozart's string quartets. Our results yield interpretable musicological conclusions about Haydn's and Mozart's stylistic differences while distinguishing between the composers with higher accuracy than previous studies in this domain.
769Learning Deep Representations for Graph Clustering,Fei Tian Bin Gao Enhong Chen and Tie-Yan Liu,"Machine Learning Applications MLA
770Novel Machine Learning Algorithms NMLA ","deep representations
771clustering on graph
772neural networks","MLA Applications of Unsupervised Learning
773NMLA Clustering
774NMLA Neural Networks/Deep Learning
775NMLA Relational/Graph-Based Learning",Recently deep learning has been successfully adopted in many applications such as speech recognition image classification and natural language processing. In this work we explore the possibility of employing deep learning in graph clustering. In particular we propose a simple method which first learns a nonlinear embedding of the original graph by stacked autoencoder and then runs a k-means algorithm on the embedding to obtain the clustering result. We show that this simple method has a solid theoretical foundation due to the equivalence between autoencoder and spectral clustering in terms of what they actually optimize. Then we demonstrate that the proposed method is more efficient and flexible than spectral clustering. First the computational complexity of autoencoder is much lower than spectral clustering: the former can be linear to the number of nodes in a sparse graph while the latter is super quadratic due to its dependency on an eigenvalue decomposition. Second when additional constraints like sparsity are imposed we can simply employ the sparse autoencoder developed in the literature of deep learning; however it is not straightforward to implement a sparse spectral method. We have conducted comprehensive experiments to test the performance of the proposed method. The results on various graph datasets show that it can significantly outperform the conventional spectral clustering method. This clearly indicates the effectiveness of deep learning in graph clustering and enriches our understanding of the power of deep learning in general.
776Fast Multi-Instance Multi-Label Learning,Sheng-Jun Huang Wei Gao and Zhi-Hua Zhou,Novel Machine Learning Algorithms NMLA ,"Multi-instance multi-label learning
777Key instance
778Fast",NMLA Classification,In multi-instance multi-label learning MIML one object is represented by multiple instances and simultaneously associated with multiple labels. Existing MIML approaches have been found useful in many applications; however most of them can only handle moderate-sized data. To efficiently handle large data sets we propose the MIMLfast approach which first constructs a low-dimensional subspace shared by all labels and then trains label specific linear models to optimize approximated ranking loss via stochastic gradient descent. Although the MIML problem is complicated MIMLfast is able to achieve excellent performance by exploiting label relations with shared space and discovering sub-concepts for complicated labels. Experiments show that the performance of MIMLfast is highly competitive with state-of-the-art techniques whereas its time cost is much less; particularly on a data set with 30K bags and 270K instances where none of the existing approaches can return results in 24 hours MIMLfast takes only 12 minutes. Moreover our approach is able to identify the most representative instance for each label thus providing a chance to understand the relation between input patterns and output semantics.
779Modeling and mining spatiotemporal patterns of infection risk from heterogeneous data for active surveillance planning,Bo Yang Hua Guo and Yi Yang,"Computational Sustainability and AI CSAI
780Novel Machine Learning Algorithms NMLA ","active surveillance planning
781spatiotemporal data mining
782heterogeneous data mining","CSAI Modeling and prediction of dynamic and spatiotemporal phenomena and systems
783CSAI Control and optimization of dynamic and spatiotemporal systems
784NMLA Data Mining and Knowledge Discovery","Active surveillance is a desirable way to prevent the
785spread of infectious diseases in that it aims to discover
786individual incidences in a timely manner through active searching
787for patients. However in practice active surveillance
788is difficult to implement especially when the monitoring
789space is large but the available resources are limited.
790Therefore it is extremely important for public health
791authorities to know how to distribute their very sparse
792resources to high-priority regions so as to maximize the
793outcomes of active surveillance. In this paper we raise
794the problem of active surveillance planning and provide
795an effective method to address it via modeling and mining
796spatiotemporal patterns of infection risk from heterogeneous
797data sources. Taking malaria as an example
798we perform an empirical study on real-world data
799to validate our method and provide our new findings."
800Symbolic Domain Predictive Control,Johannes Löhr Martin Wehrle Maria Fox and Bernhard Nebel,Planning and Scheduling PS ,"planning
801control
802hybrid domains
803switched linear dynamic systems
804domain predictive control","PS Deterministic Planning
805PS Mixed Discrete/Continuous Planning
806PS Planning General/Other ",Planning-based methods to guide switched hybrid systems from an initial state into a desired goal region open an interesting field for control. The idea of the Domain Predictive Control DPC approach is to generate input signals affecting both the numerical states and the modes of the system by stringing together atomic actions to a logically consistent plan. However the existing DPC approach is restricted in the sense that a discrete and pre-defined input signal is required for each action. In this paper we extend the approach to deal with symbolic states. This allows for the propagation of reachable regions of the state space emerging from actions with inputs that can be arbitrarily chosen within specified input bounds. The symbolic extension enables the applicability of DPC to systems with bounded input sets and increases its robustness due to the implicitly reduced search space. Moreover precise numeric goal states instead of goal regions become reachable.
807Can Agent Development Affect Developer's Strategy?,Avshalom Elmalech David Sarne and Noa Agmon,"Cognitive Modeling CM
808Humans and AI HAI ","Decision-making
809Peer Designed Agents
810The Doors Game
811Simulating Humans","CM Simulating Humans
812HAI Understanding People Theories Concepts and Methods","Peer Designed Agents PDAs computer agents developed by non-experts is an emerging technology widely advocated in recent literature for the purpose of replacing people in simulations and investigating human behavior. Its main premise is that strategies programmed into these agents reliably reflect to some extent the behavior used by the programmer in real life. In this paper we show that PDA development has an important side effect that has not been addressed to date --- the process that merely attempts to capture one's strategy is also likely to affect the developer's strategy. The phenomenon is demonstrated experimentally using several performance measures. This result has many implications concerning the appropriate design of PDA-based simulations and the validity of using PDAs for studying individual decision making.
813Furthermore we find that PDA development actually improved the developer's strategy according to all performance measures. Therefore PDA development can be suggested as a means for improving people's problem solving skills."
814Robust Distance Learning in the Presence of Label Noise,Dong Wang and Xiaoyang Tan,"Machine Learning Applications MLA
815Novel Machine Learning Algorithms NMLA ","label noise
816robust neighbourhood components analysis
817distance learning","NMLA Feature Construction/Reformulation
818NMLA Supervised Learning Other
819NMLA Machine Learning General/other ",Many distance learning algorithms have been developed in recent years. However few of them consider the problem when the class labels of training data are noisy and this may lead to serious performance deterioration. In this paper we present a robust distance learning method in the presence of label noise by extending a previous non-parametric discriminative distance learning algorithm i.e. Neighbourhood Components Analysis NCA . Particularly we model the conditional probability of each point's true label over all the possible class labels and use it for a more robust estimation of intra-class scatter matrix. The model is then optimized within the EM framework. In addition considering that the model tends to be complex under the situation of label noise we propose to regularize its objective function to prevent overfitting. Our experiments on several UCI datasets and a real dataset with unknown noise patterns show that the proposed RNCA is more tolerant to class label noise compared to the original NCA method.
820Exponential Deepening A* for Real-Time Agent-Centered Search,Guni Sharon Ariel Felner and Nathan Sturtevant,Heuristic Search and Optimization HSO ,"Heuristic Search
821Real-Time Search
822Agent-Centered Search","HSO Heuristic Search
823HSO Evaluation and Analysis Search and Optimization
824HSO Search General/Other ","In the Real-Time Agent-Centered Search RTACS problem
825an agent has to arrive at a goal location while acting and
826reasoning in the physical state space. Traditionally RTACS
827problems are solved by propagating and updating heuristic
828values of states visited by the agent. In existing RTACS
829algorithms e.g. the LRTA* family the agent may revisit
830each state many times causing the entire procedure to be
831quadratic in the state space. In this paper we study the Iterative
832Deepening ID approach for solving RTACS. We then
833present Exponential Deepening A* EDA* a RTACS algorithm
834where the threshold between successive Depth-First
835calls is increased exponentially. We prove that EDA* results
836in a linear worst case bound and support this experimentally
837by demonstrating up to 10x reduction over existing RTACS
838solvers w.r.t. states expanded and CPU runtime."
839Learning Low-Rank Representations with Classwise Block-Diagonal Structure for Robust Face Recognition,Yong Li Jing Liu Zechao Li Yangmuzi Zhang Hanqing Lu and Songde Ma,Vision VIS ,"Low-Rank Representations
840Classwise Block-Diagonal Structure
841Robust Face Recognition",CS Structural learning and knowledge capture,Face recognition has been widely studied due to its importance in various applications. However the case that both training images and testing images are corrupted is not well addressed. Motivated by the success of low-rank matrix recovery we propose a novel semi-supervised low-rank matrix recovery algorithm for robust face recognition. The proposed method can learn robust discriminative representations for both training images and testing images simultaneously by exploiting the classwise block-diagonal structure. Specifically low-rank matrix approximation can handle the possible contamination of data and the sparse noises. Moreover the classwise block-diagonal structure is exploited to promote discrimination between different classes. The above issues are formulated into a unified objective function and we design an efficient optimization procedure based on augmented Lagrange multiplier method to solve it. Extensive experiments on three public databases are performed to validate the effectiveness of our approach. The strong identification capability of representations with block-diagonal structure is verified.
842Congestion Games for V2G-Enabled EV Charging,Benny Lutati Vadim Levit Tal Grinshpoun and Amnon Meisels,"Computational Sustainability and AI CSAI
843Game Theory and Economic Paradigms GTEP
844Multiagent Systems MAS ","Congestion games
845Potential games
846EV charging
847V2G","CSAI Modeling the interactions of agents with different and often conflicting interests
848GTEP Coordination and Collaboration
849MAS Distributed Problem Solving",A model of the problem of charging and discharging electrical vehicles as a congestion game is presented. A generalization of congestion games -- feedback congestion games FCG -- is introduced. The charging of grid-integrated vehicles which can also discharge energy back to the grid is a natural FCG application. FCGs are proven to be exact potential games and therefore converge to a pure-strategy Nash equilibrium by an iterated better-response process. A compact representation and an algorithm that enable efficient best-response search are presented. A detailed empirical evaluation assesses the performance of the iterated best-response process. The evaluation considers the quality of the resulting solutions and the rate of convergence to a stable state. The effect of allowing to also discharge batteries using FCG is compared to scenarios that only include charging and is found to dramatically improve the predictability of the achieved solutions as well as the balancing of load.
850Accurate Household Occupant Behavior Modeling Based on Data Mining Techniques,Márcia Baptista Anjie Fang Helmut Prendinger Rui Prada and Yohei Yamaguchi,"Machine Learning Applications MLA
851Multiagent Systems MAS ","Household occupant behavior modeling
852Markov chain models
853Nearest Neighbor algorithm","MLA Applications of Supervised Learning
854MLA Machine Learning Applications General/other
855MAS Agent-based Simulation and Emergent Behavior",An important requirement of household energy simulation models is their accuracy in estimating energy demand and its fluctuations. Occupant behavior has a major impact upon energy demand. However Markov chains the traditional approach to model occupant behavior 1 has limitations in accurately capturing the coordinated behavior of occupants and 2 is prone to over-fitting. To address these issues we propose a novel approach that relies on a combination of data mining techniques. The core idea of our model is to determine the behavior of occupants based on nearest neighbor comparison over a database of sample data. Importantly the model takes into account features related to the coordination of occupants' activities. We use a customized distance function suited for mixed categorical and numerical data. Further association rule learning allows us to capture the coordination between occupants. Using real data from four households in Japan we are able to show that our model outperforms the traditional Markov chain model with respect to occupant coordination and generalization of behavior patterns.
856Generating Content for Scenario-Based Serious-Games using CrowdSourcing,Sigal Sina Avi Rosenfeld and Sarit Kraus,Game Playing and Interactive Entertainment GPIE ,"Scenario-Based Serious-Games
857Generated Content
858Crowd-Sourcing",GPIE Procedural Content Generation,"Scenario-based serious-games have become a main tool for
859learning new skills and capabilities. An important factor in
860the development of such systems is reducing the time and cost
861overhead in manually creating content for these scenarios. To do so we present ScenarioGen an automatic method for generating content about everyday activities through combining computer science techniques with the crowd. ScenarioGen uses the crowd in three different ways to capture a database of scenarios of everyday activities to generate a database of likely replacements for specific events within that scenario and to evaluate the resulting scenarios. We evaluated ScenarioGen in 6 different content domains and found that it was consistently rated to be as coherent and consistent as the originally captured content. We also compared ScenarioGen's content to that created by traditional planning techniques. We found that both methods were equally effective in generating coherent and consistent scenarios yet ScenarioGen's content was found to be more varied and easier to create."
862A Tractable Approach to ABox Abduction over Description Logic Ontologies,Jianfeng Du Kewen Wang and Yi-Dong Shen,Knowledge Representation and Reasoning KRR ,"ABox abduction
863abductive reasoning
864query abduction problem
865description logics
866first-order rewritable","KRR Description Logics
867KRR Diagnosis and Abductive Reasoning",ABox abduction is an important reasoning mechanism for description logic ontologies. It computes all minimal explanations sets of ABox assertions whose appending to a consistent ontology enforces the entailment of an observation while keeping the ontology consistent. We focus on practical computation for a general problem of ABox abduction called the query abduction problem where an observation is a Boolean conjunctive query and the explanations may contain fresh individuals neither in the ontology nor in the observation. However in this problem there can be infinitely many minimal explanations. Hence we first identify a class of TBoxes called first-order rewritable TBoxes. It guarantees the existence of finitely many minimal explanations and is sufficient for many ontology applications. To reduce the number of explanations that need to be computed we introduce a special kind of minimal explanations called representative explanations from which all minimal explanations can be retrieved. We develop a tractable method in data complexity for computing all representative explanations in a consistent ontology whose TBox is first-order rewritable. Experimental results demonstrate that the method is efficient and scalable for ontologies with large ABoxes.
868User Group Oriented Dynamics Exploration,Zhiting Hu Junjie Yao and Bin Cui,"AI and the Web AIW
869Machine Learning Applications MLA
870NLP and Knowledge Representation NLPKR ","Social media
871temporal dynamics
872topic modeling
873user group","AIW Human language technologies for web systems including text summarization and machine translation
874AIW Languages tools and methodologies for representing managing and visualizing semantic web data
875AIW Machine learning and the web
876AIW Web-based opinion extraction and trend spotting
877MLA Networks
878NLPKR Semantics and Summarization
879NLPKR Natural Language Processing General/Other ","Dynamic online content becomes the zeitgeist. Vibrant user groups are essential participants and promoters behind it. Studying temporal dynamics is a valuable means to uncover group characters. Since different user groups usually tend to have highly diverse interests and show distinct behavior patterns they will exhibit varying temporal dynamics. However most current work only uses global trends of temporal topics and fails to distinguish such fine-grained patterns across groups.
880
881In this paper we propose GrosToT Group Specific Topics-over-Time a unified probabilistic model which infers latent user groups and temporal topics. It can model group-specific temporal topic variation from social media streams. By leveraging the comprehensive group-specific temporal patterns GrosToT significantly outperforms state-of-the-art dynamics modeling methods. Our proposed approach shows advantages not only in temporal dynamics modeling but also in group content exploration. Based on GrosToT we uncover the interplay between group interest and temporal dynamics. Specifically we find that a group's attention to its medium-interest topics is event-driven showing rich bursts; while its engagement in its dominating topics is interest-driven remaining stable over time. The dynamics over different groups vary and reflect the groups' intention."
882Automatic Construction and Natural-Language Description of Nonparametric Regression Models,James Lloyd David Duvenaud Roger Grosse Josh Tenenbaum and Zoubin Ghahramani,"Machine Learning Applications MLA
883Novel Machine Learning Algorithms NMLA ","Automatic statistician
884gaussian processes
885regression
886bayesian
887summarization
888model description","MLA Applications of Supervised Learning
889NMLA Bayesian Learning
890NMLA Kernel Methods
891NMLA Time-Series/Data Streams","This paper presents the beginnings of an automatic statistician focusing on regression problems. Our system explores an open-ended space of possible statistical models to discover a good explanation of the data and then produces a detailed report with figures and natural-language text.
892
893Our approach treats unknown functions nonparametrically using Gaussian processes which has two important consequences. First Gaussian processes model functions in terms of high-level properties e.g. smoothness trends periodicity changepoints .
894Taken together with the compositional structure of our language of models this allows us to automatically describe functions through a decomposition into additive parts. Second the use of flexible nonparametric models and a rich language for composing them in an open-ended manner also results in state-of-the-art extrapolation performance evaluated over 13 real time series data sets from various domains."
895Capturing Difficulty Expressions in Student Online Q&A Discussions,Jaebong Yoo and Jihie Kim,"AI and the Web AIW
896Applications APP
897NLP and Machine Learning NLPML
898NLP and Text Mining NLPTM ","Student online discussions
899dialog analysis emotional or information roles
900student performance prediction
901computational linguistic features
902educational data mining","AIW Question answering on the web
903APP Computer-Aided Education
904MLA Machine Learning Applications General/other
905NLPML Discourse and Dialogue
906NLPML Text Classification",We introduce a new application of online dialogue analysis supporting pedagogical assessment of online Q&A discussions. Extending the existing speech act framework we capture common emotional expressions that often appear in student discussions such as frustration and degree of certainty and present a viable approach for the classification. We demonstrate how such dialogue information can be used in analyzing student discussions and identifying difficulties. In particular the difficulty indicators are aligned to discussion patterns and student performance. We found that frustration occurs more frequently in longer discussions. The students who frequently express frustration tend to get lower grades than others. On the other hand frequency of high certainty expressions is positively correlated with the performance. We believe emotional and informational dialogue roles are important factors in explaining discussion development and student performance. We expect such online dialogue analyses can become a powerful assessment tool for instructors and education researchers.
907Lifting Relational MAP-LP Relaxations using Permutation Constraint Graphs,Udi Apsel Kristian Kersting and Martin Mladenov,Reasoning under Uncertainty RU ,"Lifted Probabilistic Inference
908Statistical Relational Learning
909MAP Estimation
910Linear Programming
911Sherali-Adams Hierarchy","RU Probabilistic Inference
912RU Relational Probabilistic Models",Inference in large scale graphical models is an important task in many domains and in particular probabilistic relational models e.g. Markov logic networks . Such models often exhibit considerable symmetry and it is a challenge to devise algorithms that exploit this symmetry to speed up inference. Recently the automorphism group has been proposed to formalize mathematically what exploiting symmetry means. However obtaining symmetry derived from automorphism is GI-hard and consequently only a small fraction of the symmetry is easily available for effective employment. In this paper we improve upon efficiency in two ways. First we introduce the Permutation Constraint Graph PCG a platform on which greater portions of the symmetries can be revealed and exploited. PCGs classify clusters of variables by projecting relations between cluster members onto a graph allowing for the efficient pruning of symmetrical clusters even before their generation. Second we introduce a novel framework based on PCGs for the Sherali-Adams hierarchy of linear program LP relaxations dedicated to exploiting this symmetry for the benefit of tight Maximum A Posteriori MAP approximations. Combined with the pruning power of PCG the framework quickly generates compact formulations for otherwise intractable LPs as demonstrated by several empirical results.
913Boosting SBDS for Partial Symmetry Breaking in Constraint Programming,Zichen Zhu and Jimmy Lee,Search and Constraint Satisfaction SCS ,"constraint satisfaction problems
914symmetry breaking
915SBDS","SCS Constraint Satisfaction
916SCS SAT and CSP Modeling/Formulations
917SCS Constraint Satisfaction General/other ","The paper proposes a dynamic method Recursive SBDS
918 ReSBDS for efficient partial symmetry breaking. We
919first demonstrate how partial SBDS misses important
920pruning opportunities when given only a subset of symmetries
921to break. The investigation pinpoints the culprit
922and in turn suggests rectification. The main idea is to
923add extra conditional constraints during search recursively
924to prune also symmetric nodes of some pruned
925subtrees. Thus ReSBDS can break extra symmetry
926compositions but is carefully designed to break only the
927ones that are easy to identify and inexpensive to break.
928We present theorems to guarantee the soundness and
929termination of our approach and compare our method
930with popular static and dynamic methods. When the
931variable value heuristic is static ReSBDS is also complete
932in eliminating all interchangeable variables values
933given only the generator symmetries. Extensive
934experiments confirm the efficiency of ReSBDS
935when compared against state of the art methods."
936Evolutionary dynamics of learning algorithms over the sequence form,Fabio Panozzo Nicola Gatti and Marcello Restelli,"Game Theory and Economic Paradigms GTEP
937Multiagent Systems MAS ","Game Theory
938Multiagent learning
939Evolutionary game theory","GTEP Adversarial Learning
940GTEP Equilibrium
941MAS Multiagent Learning",Multi-agent learning is a challenging open task in artificial intelligence. An interesting connection is known between multi-agent learning algorithms and evolutionary game theory showing that the learning dynamics of some algorithms can be modeled as replicator dynamics with a mutation term. Inspired by the recent sequence-form replicator dynamics we develop a new version of the Q-learning algorithm working on the sequence form of an extensive-form game thus allowing an exponential reduction of the dynamics length w.r.t. those of the normal form. The dynamics of the proposed algorithm can be modeled by using the sequence-form replicator dynamics with a mutation term. We show that although sequence-form and normal-form replicator dynamics are realization equivalent the Q-learning algorithm applied to the two forms has non-realization equivalent dynamics. Originally from the previous works on evolutionary game theory models for multi-agent learning we produce an experimental evaluation to show the accuracy of the model.
942Structured Possibilistic Planning using Decision Diagrams,Nicolas Drougard Florent Teichteil-Königsbuch and Jean-Loup Farges,"Knowledge Representation and Reasoning KRR
943Planning and Scheduling PS
944Reasoning under Uncertainty RU ","Qualitative Planning under Uncertainty
945Symbolic Planning with Decision Diagrams
946Possibilistic PO MDPs
947Mixed Observability","KRR Qualitative Reasoning
948KRR Reasoning with Beliefs
949PS Markov Models of Environments
950PS Probabilistic Planning
951PS Planning General/Other
952RU Graphical Models Other
953RU Decision/Utility Theory
954RU Sequential Decision Making
955RU Uncertainty Representations
956RU Uncertainty in AI General/Other ","Qualitative Possibilistic Mixed-Observable MDPs $\pi$-MOMDPs generalizing
957$\pi$-MDPs and $\pi$-POMDPs are well-suited models to planning under
958uncertainty with mixed-observability when transition observation and reward
959functions are not precisely known. Functions defining the model as well as
960intermediate calculations are valued in a finite possibilistic scale
961$\mathcal{L}$ which induces a \emph{finite} belief state space under partial
962observability contrary to its probabilistic counterpart. In this paper we
963propose the first study of factored $\pi$-MOMDP models in order to solve large
964structured planning problems under \emph{imprecise} uncertainty or considered
965as qualitative approximations of probabilistic problems. Building upon the SPUDD
966algorithm for solving factored probabilistic MDPs we conceived a symbolic
967algorithm named PPUDD for solving factored $\pi$-MOMDPs. Whereas SPUDD's
968decision diagrams' leaves may be as large as the state space since their values
969are real numbers aggregated through additions and multiplications PPUDD's ones
970always remain in the finite scale $\mathcal{L}$ via $\min$ and $\max$ operations only.
971Our experiments show that PPUDD's computation time is much lower than SPUDD
972Symbolic-HSVI and APPL for possibilistic and probabilistic versions of the same
973benchmarks under either total or mixed observability while providing
974high-quality policies."
975Labeling Complicated Objects Multi-View Multi-Instance Multi-Label Learning,Cam-Tu Nguyen Xiaoliang Wang and Zhi-Hua Zhou,Machine Learning Applications MLA ,"multi-view learning.
976multi-instance multi-label learning.
977partial examples.","AIW AI for multimedia and multimodal web applications
978MLA Applications of Supervised Learning
979MLA Machine Learning Applications General/other ",Multi-Instance Multi-Label MIML is a learning framework where an example is associated with multiple labels and represented by a set of feature vectors multiple instances . In the original formalization of MIML learning instances come from a single source single view . To leverage multiple information sources multi-view we develop a multi-view MIML framework based on a hierarchical Bayesian network and present an effective learning algorithm based on variational inference. The model can naturally deal with the examples in which some views could be absent partial examples . On multi-view datasets it is shown that our method is better than other multi-view and single-view approaches particularly in the presence of partial examples. On single-view benchmarks extensive evaluation shows that our approach is highly comparable to or better than other MIML approaches on labeling examples and instances. Moreover our method can effectively handle datasets with a large number of labels.
980Reasoning on LTL on Finite Traces Insensitivity to Infiniteness,Giuseppe De Giacomo Riccardo De Masellis and Marco Montali,Knowledge Representation and Reasoning KRR ,"Linear Temporal Logic
981Finite Traces
982Reasoning about Actions
983Trajectory Constraints in Planning
984Logic-Based Process and Service Modelling","AIW AI for web services semantic descriptions planning matching and coordination
985KRR Action Change and Causality
986PS Temporal Planning",In this paper we study when an LTL formula on finite traces LTLf formula is insensitive to infiniteness that is it can be correctly handled as a formula on infinite traces under the assumption that at a certain point the infinite trace starts repeating an end event forever trivializing all other propositions to false. This intuition has been put forward and wrongly assumed to hold in general in the literature. We define a necessary and sufficient condition to characterize whether an LTLf formula is insensitive to infiniteness which can be automatically checked by any LTL reasoner. Then we show that typical LTLf specification patterns used in process and service modeling in CS as well as trajectory constraints in Planning and transition-based LTLf specifications of action domains in KR are indeed very often insensitive to infiniteness. This may help to explain why the assumption of interpreting LTL on finite and on infinite traces has been wrongly blurred. Possibly because of this blurring virtually all literature detours to Buechi automata for constructing the NFA that accepts the traces satisfying an LTLf formula. As a further contribution we give a simple direct algorithm for computing such NFA.
987A Game-theoretic Analysis of Catalog Optimization,Joel Oren Nina Narodytska and Craig Boutilier,"Game Theory and Economic Paradigms GTEP
988Knowledge Representation and Reasoning KRR ","Competitive assortment optimization
989Nash equilibrium
990Price of Stability
991Computational social choice","GTEP Auctions and Market-Based Systems
992GTEP Game Theory
993GTEP Social Choice / Voting
994GTEP Equilibrium
995GTEP Imperfect Information
996KRR Preferences
997KRR Reasoning with Beliefs
998MAS E-Commerce","Vendors of all types face the problem of selecting a slate of product offerings---their assortment or catalog---that will maximize their profits. The profitability of a catalog is determined by both customer preferences and the offerings of their competitors.
999
1000We develop a game-theoretic model for analyzing the vendor \emph{catalog optimization} problem in the face of competing vendors. We show that computing a best response is intractable in general but can be solved by dynamic programming given certain informational or structural assumptions about consumer preferences.
1001We also analyze conditions under which pure Nash equilibria exist. We study the price of anarchy and stability where applicable."
1002Linear-Time Filtering Algorithms for the Disjunctive Constraint,Hamed Fahimi and Claude-Guy Quimper,"Knowledge Representation and Reasoning KRR
1003Planning and Scheduling PS ","Constraint programming
1004Scheduling
1005Global constraints
1006Filtering algorithms
1007Disjunctive constraint","PS Scheduling
1008SCS Constraint Satisfaction
1009SCS Global Constraints
1010SCS Constraint Satisfaction General/other ","We present three new filtering algorithms for the disjunctive constraint that all have a linear running time
1011complexity in the number of tasks. The first algorithm filters the tasks according to the rules of the time tabling. The second algorithm performs an overload check that could also be used for the cumulative constraint. The third algorithm enforces the rules of detectable precedences. The last two algorithms use a new data structure that we introduce and that we call the time line. This data structure provides many constant time operations that were previously implemented in logarithmic time by the Theta-tree data structure. Experiments show that these new algorithms are competitive even for a small number of tasks and outperform existing algorithms as the number of tasks increases."
1012Low-Rank Tensor Completion with Discriminant Analysis for Action Classification,Chengcheng Jia Guoqiang Zhong and Yun Fu,"Machine Learning Applications MLA
1013Vision VIS ","low-rank
1014tensor
1015action recognition
1016discriminant analysis","MLA Applications of Supervised Learning
1017VIS Face and Gesture Recognition
1018VIS Videos",Tensor completion is an important topic in the area of image processing and computer vision research which is generally built on extraction of the intrinsic structure of tensor data. However tensor completion techniques are rarely used for action classification which heavily relies on the extracted features of high-dimensional tensors as well. In this paper we propose a method for video based action classification via low-rank tensor completion. Since there could be distortion and corruption in the tensor we project the tensor into the subspace which contains the invariant structure of the tensor and can be used as the input of the classifier. The key point is that we aim to calculate the optimal projection matrices which have the low-rank structure and are used to obtain the subspace. In order to integrate the useful supervisory information of data we adopt the discriminant analysis criterion to learn the projection matrices and the augmented Lagrange multiplier ALM algorithm to solve the multi-variate optimization problem. We explain the properties of the projection matrices which indicate the different meanings in the row space column space and frame space respectively. Experiments demonstrate that our method obtains better accuracy compared with some other state-of-the-art low-rank tensor methods on MSR Hand Gesture 3D database and MSR Action 3D database.
1019Programming by Example using Least General Generalizations,Mohammad Raza Sumit Gulwani and Natasa Milic-Frayling,"Applications APP
1020Heuristic Search and Optimization HSO
1021Knowledge Representation and Reasoning KRR ","Programming by example
1022Inductive inference
1023XML transformations
1024Program synthesis
1025Document editing interfaces","APP Intelligent User Interfaces
1026APP Other Applications
1027HSO Search General/Other
1028KRR Knowledge Representation Languages
1029KRR Knowledge Representation General/Other ",Programming-by-example PBE has recently seen important advances in the domain of text editing but existing technology is restricted to transformations on relatively small unstructured text strings. In this paper we address structural/formatting transformations in richly formatted documents using an approach based on the idea of least general generalizations from inductive inference which avoids the scalability issues faced by state-of-the-art PBE methods. We describe a novel domain specific language DSL that can succinctly describe expressive transformations over XML structures used for describing richly formatted content and is equipped with a natural partial ordering between programs based on a subsumption relation. We then describe a synthesis algorithm that given a set of input-output examples of XML structures identifies a minimal DSL program that is consistent with the examples. We present experimental results over a benchmark of formatting tasks that we collected from online help forums which show an average of 4.1 examples required for task completion in a few seconds.
1030Give a Hard Problem to a Diverse Team Exploring Large Action Spaces,Leandro Soriano Marcolino Haifeng Xu Albert Xin Jiang Milind Tambe and Emma Bowring,Multiagent Systems MAS ,"Team formation
1031Coordination & Collaboration
1032Distributed AI",MAS Coordination and Collaboration,Recent work has shown that diverse teams can outperform a uniform team made of copies of the best agent. However there are fundamental questions that were not asked before. When should we use diverse or uniform teams? How does the performance change as the action space or the teams get larger? Hence we present a new model of diversity for teams that is more general than previous models. We prove that the performance of a diverse team improves as the size of the action space gets larger. Concerning the size of the diverse team we show that the performance converges exponentially fast to the optimal one as we increase the number of agents. We present synthetic experiments that allow us to gain further insights even though a diverse team outperforms a uniform team when the size of the action space increases the uniform team will eventually again play better than the diverse team for a large enough action space. We verify our predictions in a system of Go playing agents where we show a diverse team that improves in performance as the board size increases and eventually overcomes a uniform team.
1033A Simple Polynomial-Time Randomized Distributed Algorithm for Connected Row Convex Constraints,Nguyen Duc Thien T. K. Satish Kumar William Yeoh and Sven Koenig,"Multiagent Systems MAS
1034Search and Constraint Satisfaction SCS ","Connected Row Convex Constraints
1035Distributed CSP
1036Randomized Algorithms","MAS Distributed Problem Solving
1037SCS Distributed CSP/Optimization",In this paper we describe a simple randomized algorithm that runs in polynomial time and solves connected row convex CRC constraints in a distributed setting. CRC constraints generalize many known tractable classes of constraints like 2-SAT and implicational constraints. They can model problems in many domains including temporal reasoning and geometric reasoning; and generally speaking play the role of ``Gaussians'' in the logical world. Our simple randomized algorithm for solving them in distributed settings therefore has a number of important applications. We support our claims through empirical results. We also generalize our algorithm to tractable classes of tree convex constraints.
1038SUIT A Supervised User-Item based Topic model for Sentiment Analysis,Fangtao Li Sheng Wang Shenghua Liu and Ming Zhang,NLP and Text Mining NLPTM ,"topic model
1039sentiment analysis
1040review rating","AIW Web-based opinion extraction and trend spotting
1041NLPTM Information Extraction",Topic models have been widely used for sentiment analysis. Previous studies show that supervised topic models can better model the sentiment expressions. However most existing topic methods only model the sentiment text information but do not consider the user who expressed the sentiment and the item which the sentiment is expressed on. Since different users may use different sentiment expressions for different items we argue that it is better to incorporate the user and item information into the topic model for sentiment analysis. In this paper we propose a novel Supervised User-Item based Topic model called SUIT model for sentiment analysis. It can simultaneously utilize the textual topic and latent user-item factors. Our proposed method uses the tensor outer product of text topic proportion vector user latent factor and item latent factor to model the sentiment label generalization. Extensive experiments are conducted on two datasets review dataset and microblog dataset. The results demonstrate the advantages of our model. It shows significant improvement compared with supervised topic models and collaborative filtering methods.
1042Automatic Game Design via Mechanic Generation,Alexander Zook and Mark Riedl,Game Playing and Interactive Entertainment GPIE ,"procedural content generation
1043game generation
1044game mechanics","GPIE AI in Game Design
1045GPIE Procedural Content Generation",Game designs often center on the game mechanics - rules governing the logical evolution of the game. We seek to develop an intelligent system that generates computer games. As first steps towards this goal we present a composable and cross-domain representation for game mechanics that draws from AI planning action representations. We use a constraint solver to generate mechanics subject to design requirements on the form of those mechanics - what they do in the game. A planner takes a set of generated mechanics and tests whether those mechanics meet playability requirements - controlling how mechanics function in a game to affect player behavior. We demonstrate our system by modeling and generating mechanics in a role-playing game platformer game and combined role-playing-platformer game.
1046Learning the Structure of Probabilistic Graphical Models with an Extended Cascading Indian Buffet Process,Patrick Dallaire Philippe Giguère and Brahim Chaib-Draa,Novel Machine Learning Algorithms NMLA ,"Structure learning
1047Bayesian Learning
1048Bayesian nonparametrics
1049Graphical models
1050Deep belief networks
1051Infinite directed acyclic graphs
1052MCMC inference","NMLA Bayesian Learning
1053NMLA Data Mining and Knowledge Discovery
1054NMLA Graphical Model Learning
1055RU Bayesian Networks
1056RU Graphical Models Other ",In this paper we present an extension of the cascading Indian buffet process CIBP intended to learn arbitrary directed acyclic graph structures as opposed to the CIBP which is limited to purely layered structures. The extended cascading Indian buffet process eCIBP essentially consists of adding an extra sampling step to the CIBP to generate connections between non-consecutive layers. In the context of graphical model structure learning the proposed approach allows learning structures having an unbounded number of hidden random variables and automatically selecting the model complexity. We evaluated the extended process on multivariate density estimation and structure identification tasks by measuring the structure complexity and predictive performance. The results suggest the extension leads to extracting simpler graphs without sacrificing predictive precision.
1057Learning Compositional Sparse Models of Bimodal Percepts,Suren Kumar Vikas Dhiman and Jason J. Corso,"Cognitive Systems CS
1058Novel Machine Learning Algorithms NMLA
1059Vision VIS ","compositional model
1060bimodal sparse representation
1061vision and audio","CS Structural learning and knowledge capture
1062NMLA Dimension Reduction/Feature Selection
1063VIS Language and Vision",Various perceptual domains have underlying compositional semantics that are rarely captured in current models. We suspect this is because directly learning the compositional structure has evaded these models. Yet the compositional structure of a given domain can be grounded in a separate domain thereby simplifying its learning. To that end we propose a new approach to modeling bimodal percepts that explicitly relates distinct projections across each modality and then jointly learns a bimodal sparse representation. The resulting model enables compositionality across these distinct projections and hence can generalize to unobserved percepts spanned by this compositional basis. For example our model can be trained on red triangles and blue squares; yet implicitly will also have learned red squares and blue triangles. The structure of the projections and hence the compositional basis is learned automatically for a given language model. To test our model we have acquired a new bimodal dataset comprising images and spoken utterances of colored shapes in a tabletop setup. Our experiments demonstrate the benefits of explicitly leveraging compositionality in both quantitative and human evaluation studies.
1064Dramatis A Computational Model of Suspense,Brian O'Neill and Mark Riedl,"Applications APP
1065Game Playing and Interactive Entertainment GPIE ","computational creativity
1066psychological models
1067computational aesthetics","APP Art and Music
1068GPIE AI Storytelling
1069GPIE Computational Creativity and Generative Art",We introduce Dramatis a computational model of suspense based on a reformulation of a psychological definition of the suspense phenomenon. In this reformulation suspense is correlated with the audience's ability to generate a plan for the protagonist to avoid an impending negative outcome. Dramatis measures the suspense level by generating such a plan and determining its perceived likelihood of success. We report on three evaluations of Dramatis including a comparison of Dramatis output to the suspense reported by human readers as well as ablative tests of Dramatis components. In these studies we found that Dramatis output corresponded to the suspense ratings given by human readers for stories in three separate domains.
1070LASS A Simple Assignment Model with Laplacian Smoothing,Miguel Carreira-Perpinan and Weiran Wang,Novel Machine Learning Algorithms NMLA ,"semi-supervised learning
1071convex optimization
1072ADMM
1073soft assignment
1074Laplacian smoothing","HSO Optimization
1075NMLA Semisupervised Learning
1076RU Uncertainty in AI General/Other ",We consider the problem of learning soft assignments of N items to K categories given two sources of information an item-category similarity matrix which encourages items to be assigned to categories they are similar to and to not be assigned to categories they are dissimilar to and an item-item similarity matrix which encourages similar items to have similar assignments. We propose a simple quadratic programming model that captures this intuition. We give necessary conditions for its solution to be unique define an out-of-sample mapping and derive a simple effective training algorithm based on the alternating direction method of multipliers. The model predicts reasonable assignments from even a few similarity values and can be seen as a generalization of semisupervised learning. It is particularly useful when items naturally belong to multiple categories as for example when annotating documents with keywords or pictures with tags with partially tagged items or when the categories have complex interrelations e.g. hierarchical that are unknown.
1077Learning Goal-Oriented Hierarchical Tasks from Situated Interactive Instruction,Shiwali Mohan and John Laird,Cognitive Systems CS ,"task learning
1078interactive learning
1079explanation-based learning
1080situated interactive instruction
1081learning for robots
1082learning from dialog
1083Soar
1084cognitive architecture","CM Agent Architectures
1085CM Symbolic AI
1086CS Social cognition and interaction
1087CS Structural learning and knowledge capture",Our research aims at building interactive robots and agents that can expand their knowledge by interacting with human users. In this paper we focus on learning goal-oriented tasks from situated interactive instructions. Learning novel tasks is a challenging computational problem requiring the agent to acquire a variety of knowledge including goal definitions and hierarchical control information. We frame acquisition of hierarchical tasks as an explanation-based learning EBL problem and propose an interactive learning variant of EBL for a robotic agent. We show that our approach can exploit information in situated instructions along with the domain knowledge to demonstrate fast generalization on several tasks. The knowledge acquired transfers across structurally similar tasks. Finally we show that our approach seamlessly combines agent-driven exploration with instructions for mixed-initiative learning.
1088Chance-constrained Probabilistic Simple Temporal Problems,Cheng Fang Peng Yu and Brian Williams,"Planning and Scheduling PS
1089Reasoning under Uncertainty RU
1090Search and Constraint Satisfaction SCS ","scheduling
1091probabilistic
1092STNU","PS Probabilistic Planning
1093PS Scheduling
1094PS Temporal Planning
1095SCS Constraint Optimization","Robust scheduling is essential to many autonomous systems and logistics tasks. Probabilistic formalisms quantify the risk of schedule failure which is essential for mission critical applications. Probabilistic methods for solving temporal problems exist that attempt to minimize the probability of schedule failure. These methods are overly conservative resulting in a loss in schedule utility. Chance-constrained formalisms address this problem by imposing bounds on risk while maximizing utility subject to these risk bounds.
1096
1097In this paper we present the probabilistic Simple Temporal Network pSTN a probabilistic formalism for representing temporal problems with bounded risk and a utility over event timing. We introduce a constrained optimisation algorithm for pSTNs that achieves compactness and efficiency through a problem encoding in terms of a parameterised STNU and its reformulation as a parameterised STN. We demonstrate through a car sharing application that our chance-constrained approach runs in the same time as the previous probabilistic approach and yields solutions with utility improvements of at least 5% over prior art while guaranteeing operation within the specified risk bound."
1098False-Name Bidding and Economic Efficiency in Combinatorial Auctions,Adrian Vetta and Colleen Alkalay-Houlihan,"Game Theory and Economic Paradigms GTEP
1099Multiagent Systems MAS
1100Reasoning under Uncertainty RU ",#NAME?,"GTEP Auctions and Market-Based Systems
1101GTEP Game Theory
1102GTEP Equilibrium
1103GTEP Imperfect Information
1104MAS E-Commerce
1105MAS Evaluation and Analysis Multiagent Systems
1106RU Decision/Utility Theory","Combinatorial auctions are multiple-item auctions in which bidders may place bids on any package subset of goods. This additional expressibility produces benefits that have led to combinatorial auctions becoming extremely important both in practice and in theory. In the computer science community research has focused primarily on computation and incentive compatibility. The latter concerns a specific form of bidder misrepresentation. However with modern forms of bid submission such as electronic bidding new types of cheating become feasible. For combinatorial auctions prominent amongst them is false-name bidding; that is bidding under pseudonyms. The ubiquitous Vickrey-Clarke-Groves VCG mechanism is incentive compatible and produces optimal allocations but it is not false-name-proof; bidders can increase their utility by submitting bids under multiple identifiers. Consequently there has recently been much interest in the design and analysis of false-name-proof auction mechanisms.
1107
1108Such false-name-proof mechanisms however can produce allocations with very low economic efficiency/social welfare. In contrast we show that provided the degree to which different goods are complementary is bounded as is the case in many important practical auctions the VCG mechanism gives a constant efficiency guarantee. Such efficiency guarantees hold even at equilibria where the agents bid in a manner that is not individually rational. Thus while an individual bidder may
1109personally benefit greatly from making false-name bids this will have only a small detrimental effect on the objective of the auctioneer maximizing economic efficiency. Thus from the auctioneer's viewpoint the VCG mechanism remains preferable to false-name-proof mechanisms."
1110Exploiting Competition Relationship for Robust Visual Recognition,Liang Du and Haibin Ling,"Applications APP
1111Machine Learning Applications MLA
1112Vision VIS ","Competition relationships
1113jointly learning
1114visual recognition","APP Other Applications
1115MLA Applications of Supervised Learning
1116VIS Categorization","Joint learning of similar tasks has been a popular trend in visual recognition and has proven to be beneficial. The between-task similarity typically provides useful cues such as feature sharing for learning visual classifiers. By contrast the competition relationship between visual recognition tasks e.g. content independent writer identification and handwriting recognition remains largely under-explored.
1117Intuitively the between-task competition can be used to guide the feature selection process that plays a key role when learning a visual classifier.
1118Motivated by this intuition we propose a novel algorithm to exploit competition relationship for improving visual recognition tasks. More specifically given a target task and its competing tasks we jointly model them by a generalized additive regression model with competition constraints. This constraint effectively discourages choosing of irrelevant features weak learners that are good for the
1119 tasks with competition relationships. The proposed algorithm is named CompBoost since it can be viewed as an extension of the RealAdaboost algorithm. We apply CompBoost to two visual recognition applications 1 content-independent writer identification from handwriting scripts by exploiting competing tasks of handwriting recognition and 2 actor-independent facial expression recognition by exploiting competing tasks of face recognition. In both experiments our approach demonstrates promising performance gains by exploiting the between-task competition relationships."
1120Power Iterated Color Refinement,Kristian Kersting Martin Mladenov Roman Garnet and Martin Grohe,"Novel Machine Learning Algorithms NMLA
1121Reasoning under Uncertainty RU ","Lifted Belief Propagation
1122Color Refinement
1123Fractional Automorphism
1124Conditional Gradient
1125Power Iteration
1126Continuous Optimization","NMLA Relational/Graph-Based Learning
1127RU Relational Probabilistic Models","Color refinement is a basic algorithmic routine for graph isomorphism
1128testing and has recently been used for computing graph kernels as well as for lifting belief propagation and linear programming.
1129So far color refinement has been treated as a combinatorial
1130problem. Instead we treat it as a nonlinear continuous optimization problem and prove that
1131it implements a conditional gradient optimizer that can be turned into
1132graph clustering approaches using hashing and truncated power iterations. This shows that color refinement
1133is easy to understand in terms of local random walks easy to implement matrix-vector multiplications and is readily parallelizable.
1134We support our theoretical results with experimental evidence on real-world graphs with millions of edges."
1135A Framework for Task Planning in Heterogeneous Multi Robot Systems Based on Robot Capabilities,Jennifer Buehler and Maurice Pagnucco,"Planning and Scheduling PS
1136Robotics ROB ","Robot Task Planning
1137Temporal Planning
1138Heterogeneous Multi Robot Systems
1139Robot capabilities","PS Temporal Planning
1140PS Planning General/Other
1141ROB Multi-Robot Systems
1142ROB Robotics General/Other ","In heterogeneous multi-robot teams robustness and flexibility are increased by the diversity of the robots each contributing different capabilities. Yet platform-
1143independence is desirable when planning actions for the various robots. This work develops a framework for task planning based on a platform-independent model
1144of robots capabilities building on a temporal planner. Generating new data objects during planning is essential to reflect data flow between actions in a robotic system. This requires online action instantiation for which we present a novel approach. Required concurrency of actions is an essential part of robotic systems and therefore is incorporated in the framework. We evaluate the planner on benchmark domains and present results on an example object transportation task in simulation."
1145Active Learning Using Knowledge Transfer from Unlabeled Data,Meng Fang Jie Yin and Dacheng Tao,Novel Machine Learning Algorithms NMLA ,"Active learning
1146Multiple labelers
1147Knowledge transfer",NMLA Active Learning,This paper studies the problems associated with active learning in which multiple users with varying levels of expertise are available for labeling data. Annotations collected from different users may be noisy and unreliable and the quality of labeled data needs to be maintained for data mining tasks. Previous solutions have included estimating individual user reliability based on existing knowledge in each task but for this to be effective each task requires a large quantity of labeled data to provide accurate estimates. In practice annotation budgets for a given target task are limited so each example can be presented to only a few users each of whom can only label a few examples. To overcome data scarcity we propose a new probabilistic model that transfers knowledge from abundant unlabeled data in auxiliary domains to help estimate labelers' expertise. Based on this model we present a novel active learning algorithm that a simultaneously selects the most informative example and b queries its label from the labeler with the best expertise. Experiments on both text and image datasets demonstrate that our proposed method outperforms other state-of-the-art active learning methods.
1148Using Model-Based Diagnosis to Improve Software Testing,Tom Zamir Roni Stern and Meir Kalech,"Applications APP
1149Knowledge Representation and Reasoning KRR
1150Planning and Scheduling PS ","Model based diagnosis
1151Software engineering
1152Planning
1153Testing","APP Other Applications
1154KRR Automated Reasoning and Theorem Proving
1155KRR Diagnosis and Abductive Reasoning
1156PS Model-Based Reasoning",We propose a combination of AI techniques to improve software testing. When a test fails a model-based diagnosis MBD algorithm is used to propose a set of possible explanations. We call these explanations diagnoses. Then a planning algorithm is used to suggest further tests to identify the correct diagnosis. A tester performs these tests and reports their outcome back to the MBD algorithm which uses this information to prune incorrect diagnoses. This iterative process continues until the correct diagnosis is returned. We call this testing paradigm Test Diagnose and Plan TDP. Several test planning algorithms are proposed to minimize the number of TDP iterations and consequently the number of tests required until the correct diagnosis is found. Experimental results show the benefits of using an MDP-based planning algorithm over greedy test planning.
1157Wormhole Hamiltonian Monte Carlo,Shiwei Lan Jeffrey Streets and Babak Shahbaba,Novel Machine Learning Algorithms NMLA ,"Bayesian Inference
1158Computational Statistics
1159Markov Chain Monte Carlo
1160Multimodal Distributions
1161Geometric Methods","NMLA Bayesian Learning
1162NMLA Machine Learning General/other ",In machine learning and statistics probabilistic inference involving multimodal distributions is quite difficult. This is especially true in high dimensional problems where most existing algorithms cannot easily move from one mode to another. To address this issue we propose a novel Bayesian inference approach based on Markov Chain Monte Carlo. Our method can effectively sample from multimodal distributions especially when the dimension is high and the modes are isolated. To this end it exploits and modifies the Riemannian geometric properties of the target distribution to create wormholes connecting modes in order to facilitate moving between them. Further our proposed method uses the regeneration technique in order to adapt the algorithm by identifying new modes and updating the network of wormholes without affecting the stationary distribution. To find new modes as opposed to rediscovering those previously identified we employ a novel mode searching algorithm that explores a residual energy function obtained by subtracting an approximate Gaussian mixture density based on previously discovered modes from the target density function.
1163Pay-as-you-go OWL Query Answering Using a Triple Store,Yujiao Zhou Yavor Nenov Bernardo Cuenca Grau and Ian Horrocks,"AI and the Web AIW
1164Knowledge Representation and Reasoning KRR ","OWL
1165ontologies
1166triple store","AIW Searching querying visualizing and interpreting the semantic web
1167KRR Ontologies
1168KRR Automated Reasoning and Theorem Proving
1169KRR Description Logics",We present an enhanced hybrid approach to OWL query answering that combines an RDF triple-store with a fully-fledged OWL reasoner in order to provide scalable ``pay as you go'' performance. The enhancements presented here include an extension to deal with arbitrary OWL ontologies and several optimisations that significantly improve scalability. We have implemented these techniques in a prototype system a preliminary evaluation of which has produced very encouraging results.
1170Imitation Learning with Demonstrations and Shaping Rewards,Kshitij Judah Alan Fern Prasad Tadepalli and Robby Goetschalckx,Reasoning under Uncertainty RU ,"Imitation Learning
1171Reinforcement Learning
1172Reward Shaping",RU Sequential Decision Making,Imitation Learning IL is a popular approach for teaching behavior policies to agents by demonstrating the desired target policy. While the approach has led to many successes IL often requires a large set of demonstrations to achieve robust learning which can be expensive for the teacher. In this paper we consider a novel approach to improve the learning efficiency of IL by providing a shaping reward function in addition to the usual demonstrations. Shaping rewards are numeric functions of states and possibly actions that are generally easily specified and capture general principles of desired behavior without necessarily completely specifying the behavior. Shaping rewards have been used extensively in reinforcement learning but have been seldom considered for IL though they are often easy to specify. Our main contribution is to propose an IL approach that learns from both shaping rewards and demonstrations. We demonstrate the effectiveness of the approach across several IL problems even when the shaping reward is not fully consistent with the demonstrations.
1173Spectral Thompson Sampling,Tomáš Kocák Michal Valko Remi Munos and Shipra Agrawal,"Novel Machine Learning Algorithms NMLA
1174Reasoning under Uncertainty RU ","Spectral bandits
1175Thompson Sampling
1176Smooth functions on graphs","NMLA Online Learning
1177NMLA Recommender Systems
1178RU Sequential Decision Making","Thompson Sampling TS has attracted a lot of interest due to its good
1179empirical
1180performance in particular in computational advertising. Though successful
1181the tools for its performance analysis appeared only recently. In this paper
1182we describe and analyze SpectralTS for a bandit problem where
1183the payoffs of the choices are smooth given an underlying graph. In
1184this
1185setting each choice is a node of a graph and the expected payoffs of the
1186neighboring nodes are assumed to be similar. Although the setting has
1187application both in recommender systems and advertising the traditional
1188algorithms would scale poorly with the number of choices. For that purpose we
1189consider an effective dimension $d$ which is small in real-world graphs.
1190Building on prior work we deliver the analysis showing that the regret of
1191SpectralTS scales with $d\sqrt{T \ln N}$ where $T$ is the time
1192horizon and $N$ is the number of choices. Since a $d\sqrt{T \ln N}$ regret is
1193comparable to the known results SpectralTS offers a computationally more
1194efficient alternative. We also show that our algorithm is competitive on both
1195synthetic and real-world data."
1196Mechanism Design for Scheduling with Uncertain Execution Time.,Vincent Conitzer and Angelina Vidali,"Game Theory and Economic Paradigms GTEP
1197Multiagent Systems MAS ","mechanism design
1198scheduling mechanisms
1199game theory
1200uncertainty","GTEP Auctions and Market-Based Systems
1201GTEP Game Theory
1202MAS Mechanism Design","We study the problem where a task or multiple unrelated tasks must be executed there are multiple
1203machines/agents that can potentially perform the task and our
1204objective is to minimize the expected sum of the agents' processing
1205times. Each agent does not know exactly how long it will take him to
1206finish the task; he only knows the distribution from which this time
1207is drawn. These times are independent across agents and the
1208distributions fulfill the monotone hazard rate condition. Agents are
1209selfish and will lie about their distributions if this increases their
1210expected utility.
1211
1212We study different variations of the Vickrey mechanism that take as input the agents' reported distributions and the players' realized running times and that output a schedule that minimizes the expected sum of processing times as well as payments that make it an ex-post equilibrium for the agents to both truthfully report their
1213distributions and exert full effort to complete the task. We devise the ChPE mechanism which is uniquely tailored to our problem and has many desirable properties including not rewarding agents that fail to finish the task and having non-negative payments."
1214Robust Winners and Winner Determination Policies under Candidate Uncertainty,Craig Boutilier Jérôme Lang Joel Oren and Hector Palacios,"Game Theory and Economic Paradigms GTEP
1215Multiagent Systems MAS ","voting
1216candidate availability
1217query policies
1218robust winner determination","APP Computational Social Science
1219GTEP Social Choice / Voting",We consider voting situations in which some candidates may turn out to be unavailable. When determining availability is costly e.g. in terms of money time or computation voting prior to determining candidate availability and testing the winner's availability *after* the vote may be beneficial. However since few voting rules are robust to candidate deletion winner determination requires a number of such availability tests. We outline a model for analyzing such problems defining *robust winners* relative to potential candidate unavailability. We assess the complexity of computing robust winners for several voting rules. Assuming a distribution over availability and costs for availability tests/queries we describe algorithms for *optimal query policies* which minimize the expected cost of determining true winners.
1220Learning Latent Engagement Patterns of Students in Online Courses,Arti Ramesh Dan Goldwasser Bert Huang Hal Daume Iii and Lise Getoor,"Applications APP
1221Machine Learning Applications MLA ","probabilistic modeling
1222structured prediction
1223data-driven methods in education
1224MOOC
1225online education","APP Computer-Aided Education
1226HAI Understanding People Theories Concepts and Methods
1227MLA Machine Learning Applications General/other ",Maintaining and cultivating student engagement is a critical component of education. In various teaching settings communication via online forums electronic quizzes and interaction with multimedia can include valuable information for assessing and understanding student engagement. Massive open online courses MOOCs measure large-scale data of this nature and provide the opportunity for data-driven study. Characterizing student engagement as a course progresses helps identify student learning patterns and can aid in minimizing dropout rates initiating instructor intervention. In this paper we construct a probabilistic model connecting student behavior and class completion formulating student engagement types as latent variables. We show that our model accurately identifies course success indicators which can be used by instructors to initiate interventions and assist students.
1228Semantic Segmentation Using Multiple Graphs with Block-Diagonal Constraints,Ke Zhang Wei Zhang Sheng Zeng and Xiangyang Xue,"Machine Learning Applications MLA
1229Novel Machine Learning Algorithms NMLA
1230Vision VIS ","Image Semantic Segmentation
1231Block-Diagonal Constraints
1232MultiView Affinity Graph","MLA Applications of Supervised Learning
1233NMLA Classification
1234NMLA Relational/Graph-Based Learning
1235VIS Categorization
1236VIS Object Detection
1237VIS Object Recognition
1238VIS Statistical Methods and Learning","In this paper we propose a novel method for image semantic segmentation
1239using multiple graphs. The
1240multi-view affinity graph is constructed by leveraging the consistency between semantic space and multiple visual spaces.
1241With
1242block-diagonal constraints we enforce the affinity matrix to be sparse such that the pairwise
1243potential for dissimilar superpixels is close to zero. By a divide-and-conquer strategy the optimization for learning affinity matrix is decomposed into several subproblems that can be solved in parallel. Using the neighborhood relationship between superpixels and the consistency between
1244affinity matrix and label-confidence matrix we infer the semantic label for each superpixel of unlabeled images by minimizing an objective whose closed form solution can be easily obtained.
1245 Experimental results on two real-world
1246image datasets demonstrate the effectiveness of our method."
1247Parametrized Families of Hard Planning Problems from Phase Transitions,Eleanor Rieffel Davide Venturelli Minh Do Itay Hen and Jeremy Frank,Planning and Scheduling PS ,"parametrized families
1248scaling analysis
1249phase transition
1250benchmark planning problems","PS Scheduling
1251PS Planning General/Other ",There are two complementary ways to evaluate planning algorithms performance on benchmark problems derived from real applications and analysis of performance on parametrized families of problems with known properties. Prior to this work few means of generating parametrized families of hard planning problems were known. We generate hard planning problems from the solvable/unsolvable phase transition region of well-studied NP-complete problems that map naturally to navigation and scheduling aspects common to many planning domains. Our results confirm exponential scaling of hardness with problem size even at very small problem sizes. We observe significant differences between state-of-the-art planners on these problem families enabling us to gain insight into the relative strengths and weaknesses of these planners. These families provide complementary test sets exhibiting properties not found in existing benchmarks.
1252On Dataless Hierarchical Text Classification,Yangqiu Song and Dan Roth,NLP and Machine Learning NLPML ,"Hierarchical Text Classification
1253Dataless Text Classification
1254Semantic Representation",NLPML Text Classification,In this paper we systematically study the problem of dataless hierarchical text classification. Unlike standard text classification schemes that rely on supervised training dataless classification depends on understanding the labels of the sought after categories and requires no labeled data. Given a collection of text documents and a set of labels we show that understanding the labels can be used to categorize the documents to the corresponding categories. This is done by embedding both labels and documents in a semantic space that allows one to compute meaningful semantic similarity between a document and a potential label. We show that this scheme can be used to support accurate multiclass classification without any supervision. We study several semantic representations and show how to improve the classification using bootstrapping methods. Our results show that bootstrapped dataless classification is competitive with supervised classification with thousands of labeled examples.
1255Confident Reasoning on Raven's Progressive Matrices Tests,Keith McGreggor and Ashok Goel,Knowledge Representation and Reasoning KRR ,"Visual Representations
1256Reasoning
1257Analogy","KRR Geometric Spatial and Temporal Reasoning
1258KRR Knowledge Representation General/Other ",We report a novel approach to addressing the Raven's Progressive Matrices RPM tests one based upon purely visual representations. Our technique introduces the calculation of confidence in an answer and the automatic adjustment of level of resolution if that confidence is insufficient. We first describe the nature of the visual analogies found on the RPM. We then exhibit our algorithm and work through a detailed example. Finally we present the performance of our algorithm on the four major variants of the RPM tests illustrating the impact of confidence. This is the first such account of any computational model against the entirety of the Raven's.
1259The Role of Dimensionality Reduction in Linear Classification,Weiran Wang and Miguel Carreira-Perpinan,Novel Machine Learning Algorithms NMLA ,"dimensionality reduction
1260nonlinear classification
1261optimization","HSO Optimization
1262NMLA Classification
1263NMLA Dimension Reduction/Feature Selection",Dimensionality reduction DR is often used as a preprocessing step in classification but usually one first fixes the DR mapping possibly using label information and then learns a classifier a filter approach . Best performance would be obtained by optimizing the classification error jointly over DR mapping and classifier a wrapper approach but this is a difficult nonconvex problem particularly with nonlinear DR. Using the method of auxiliary coordinates we give a simple efficient algorithm to train a combination of nonlinear DR and a classifier and apply it to a RBF mapping with a linear SVM. This alternates steps where we train the RBF mapping and a linear SVM as usual regression and classification respectively with a closed-form step that coordinates both. The resulting nonlinear low-dimensional classifier achieves classification errors competitive with the state-of-the-art but is fast at training and testing and allows the user to trade off runtime for classification accuracy easily. We then study the role of nonlinear DR in linear classification and the interplay between the DR mapping the number of latent dimensions and the number of classes. When trained jointly the DR mapping takes an extreme role in eliminating variation it tends to collapse classes in latent space erasing all manifold structure and lay out class centroids so they are linearly separable with maximum margin.
1264Generalizing Policy Advice with Gaussian Process Bandits for Dynamic Skill Improvement,Jared Glover and Charlotte Zhu,"Heuristic Search and Optimization HSO
1265Humans and AI HAI
1266Machine Learning Applications MLA
1267Reasoning under Uncertainty RU
1268Robotics ROB ","robot table tennis
1269gaussian process bandits
1270human advice
1271coaching robots","HSO Heuristic Search
1272HAI Human-Computer Interaction
1273MLA Machine Learning Applications General/other
1274RU Decision/Utility Theory
1275RU Sequential Decision Making
1276ROB Robotics General/Other ","We present a ping-pong-playing robot that learns to improve its swings with human advice. Our method learns a reward function over the joint space of task and policy parameters. This allows the robot to explore policy space more intelligently by leveraging active learning techniques to explore the reward surface
1277in a way that trades off exploration vs. exploitation to maximize the total cumulative reward over time. Multimodal stochastic policies can also easily be learned with this approach when the reward function is multimodal in the policy parameters. We extend the recently-developed Gaussian Process Bandit
1278Optimization framework to include advice from human domain experts."
1279Deep Modeling of Group Preferences for Group-based Recommendation,Liang Hu Jian Cao Guandong Xu Longbing Cao Zhiping Gu and Wei Cao,"AI and the Web AIW
1280Novel Machine Learning Algorithms NMLA ","Group Recommender System
1281Deep Learning
1282Feature Learning
1283Deep Belief Network
1284Restricted Boltzmann Machine","AIW Web-based recommendation systems
1285NMLA Preferences/Ranking Learning
1286NMLA Recommender Systems",Nowadays most recommender systems RSs mainly aim to suggest appropriate items for individuals. Due to the social nature of human beings group activities have become an integral part of our daily life thus motivating the study on group RS GRS . However most existing methods used by GRS make recommendations through aggregating individual ratings or individual predictive results rather than considering the collective features that govern user choices made within a group. As a result such methods are heavily sensitive to data hence they often fail to learn group preferences when the data are slightly inconsistent with predefined aggregation assumptions. To this end we devise a novel GRS approach which accommodates both individual choices and group decisions in a joint model. More specifically we propose a deep-architecture model built with a collective deep belief network and dual-wing restricted Boltzmann machine. With such a deep model we can use high-level features which are induced from lower-level features to represent group preference so as to relieve the vulnerability of data. Finally the experiments conducted on a real-world dataset prove the superiority of our deep model over other state-of-the-art methods.
1287Cross-Domain Metric Learning Based on Information Theory,Wei Wang,Novel Machine Learning Algorithms NMLA ,"Mahalanobis distance
1288metric learning
1289transfer learning
1290relative entropy",NMLA Transfer Adaptation Multitask Learning,Supervised metric learning plays a substantial role in statistical classification. Conventional metric learning algorithms have limited utility when the training data and the testing data are drawn from related but different domains i.e. source domain and target domain . Although this issue has seen some progress in feature-based transfer learning most of the work in this area suffers from non-trivial optimization and pays little attention to preserving the discriminating information. In this paper we propose a novel metric learning algorithm to transfer knowledge from the source domain to the target domain in an information-theoretic setting where a shared Mahalanobis distance across two domains is learnt by combining three goals together 1 reducing the distribution difference between different domains; 2 preserving the geometry of target domain data; 3 aligning the geometry of source domain data with its label information. Based on this combination the learnt Mahalanobis distance effectively transfers the discriminating power and propagates standard classifiers across two domains. More importantly our proposed method has a closed-form solution and can be efficiently optimized. Experiments on two real-world applications i.e. face recognition and text classification demonstrate the effectiveness and efficiency of our proposed method.
1291Spatial Scan for Disease Mapping on a Mobile Population,Liang Lan Vuk Malbasa and Slobodan Vucetic,"Applications APP
1292Computational Sustainability and AI CSAI
1293Machine Learning Applications MLA ","disease mapping
1294spatial scan
1295mobile data","APP AI and Natural Sciences
1296CSAI Modeling and prediction of dynamic and spatiotemporal phenomena and systems
1297CSAI Control and optimization of dynamic and spatiotemporal systems
1298MLA Machine Learning Applications General/other
1299NMLA Machine Learning General/other ",Spatial scan statistics is used for discovery of spatial regions with significantly higher scores according to some density measure. In disease surveillance spatial scan is a standard tool to detect spatial regions whose population has significantly higher disease risk than the overall population. In this important application called the disease mapping current residence is typically used to define the location of individuals from the population. Considering the mobility of humans at various temporal and spatial scales using only information about the current residence can be insufficient because it ignores a multitude of exposures that occur away from home or which had occurred at previous residences. In this paper we propose a novel spatial scan statistic that allows disease mapping in a mobile population. We also propose a computationally efficient disease mapping algorithm that uses the proposed statistic to find the significant high-risk spatial regions. The experimental results demonstrate that the proposed algorithm is superior to the traditional disease mapping algorithms in discovering high-risk regions in mobile populations. Moreover the algorithm is applicable on large populations and over dense spatial grids.
1300Identifying Hierarchies for Fast Optimal Search,Tansel Uras and Sven Koenig,Heuristic Search and Optimization HSO ,"Hierarchical search
1301Path planning
1302Subgoals",HSO Heuristic Search,"Search with Subgoal Graphs Uras Koenig and Hernandez 2013 was a non-dominated optimal path-planning algorithm in the Grid-Based Path Planning Competitions 2012 and 2013. During a preprocessing phase it computes a Simple Subgoal Graph which is analogous to a visibility graph for continuous terrain and then partitions the subgoals into global and local subgoals to obtain a Two-Level Subgoal Graph. During the path-planning phase it performs an A* search that ignores local subgoals that are not relevant to the search which significantly reduces the size of the graph being searched.
1303
1304In this paper we generalize this partitioning process to any undirected graph and show that it can be recursively applied to generate more than two levels which reduces the size of the graph being searched even further. We distinguish between basic partitioning which only partitions the vertices into different levels and advanced partitioning which can also add new edges. We show that the construction of Simple Subgoal Graphs from grids and the construction of Two-Level Subgoal Graphs from Simple Subgoal Graphs are instances of generalized partitioning. We then report on experiments on Subgoal Graphs that demonstrate the effects of different types and levels of partitioning. We also report on experiments that demonstrate that our new N-Level Subgoal Graphs with several additional improvements achieve a better performance compared to Two-Level Subgoal graphs from Uras Koenig and Hernandez 2013 ."
1305On Boosting Sparse Parities,Lev Reyzin,Novel Machine Learning Algorithms NMLA ,"choosing weak learners
1306boosting
1307parity functions","NMLA Ensemble Methods
1308NMLA Machine Learning General/other ","While the ensemble method of boosting has been extensively studied considerably less attention has been devoted to the task of designing good weak learning algorithms. In this paper we consider the problem of designing weak learners that are especially adept to the boosting procedure and specifically the AdaBoost algorithm.
1309
1310First we describe conditions desirable for a weak learning algorithm. We then propose using sparse parity functions as weak learners which have many of our desired properties as weak learners in boosting. Our experimental tests show the proposed weak learners to be competitive with the most widely used ones decision stumps and pruned decision trees."
1311Decentralized Stochastic Planning with Anonymity in Interactions,Pradeep Varakantham Yossiri Adulyasak and Patrick Jaillet,"Multiagent Systems MAS
1312Planning and Scheduling PS
1313Reasoning under Uncertainty RU ","Planning under uncertainty
1314Multiagent Systems
1315DEC-MDP
1316Optimization","PS Markov Models of Environments
1317RU Decision/Utility Theory
1318RU Uncertainty in AI General/Other ","In this paper we solve cooperative decentralized stochastic planning problems where the interactions between agents specified using transition and reward functions are dependent on the number of agents and not on the identity of the individual agents involved in the interaction. A collision of robots in a narrow corridor defender teams coordinating patrol activities to secure a target etc. are examples of such anonymous interactions. Formally we consider problems that are a subset of the well known Decentralized MDP DEC-MDP model where the anonymity in interactions is specified within the joint reward and transition functions. In this paper we make the following key contributions:\\
1319 a A generic model Decentralized Stochastic Planning with Anonymous InteracTions D-SPAIT to represent stochastic planning problems in cooperative domains with anonymity in interactions.\\
1320 b An optimization based formulation along with theoretical results that establish scalability properties for general D-SPAIT problems. \\
1321 c Optimization formulations whose scalability has little or no dependence on the number of agents for solving certain classes of D-SPAIT problems optimally. \\
1322 d Finally we demonstrate the performance of our optimization approaches on randomly generated benchmark problems from the literature."
1323Regret Transfer and Parameter Optimization,Noam Brown and Tuomas Sandholm,Game Theory and Economic Paradigms GTEP ,"Large Incomplete-Information Games
1324Poker
1325Game Solving
1326No-Regret Learning
1327Counterfactual Regret Minimization
1328Regret Matching
1329Regret Minimization","GTEP Game Theory
1330GTEP Equilibrium
1331GTEP Imperfect Information","Regret matching is a widely-used algorithm for learning how to act.
1332
1333We begin by proving that regrets on actions in one setting game can be transferred to warm start the regrets for solving a different setting with same structure but different payoffs that can be written as a function of parameters. We prove how this can be done by carefully discounting the prior regrets. This provides to our knowledge the first warm-starting method for no-regret learning. It also extends to warm-starting the widely-adopted counterfactual regret minimization CFR algorithm for large incomplete-information games; we show this experimentally as well.
1334
1335We then study optimizing a parameter vector for a player in a two-player zero-sum game e.g. optimizing bet sizes to use in poker . We propose a custom gradient descent algorithm that provably finds a locally optimal parameter vector while leveraging our warm-start theory to significantly save regret-matching iterations at each step. It optimizes the parameter vector while simultaneously finding an equilibrium. We present experiments in no-limit Leduc Hold'em and no-limit Texas Hold'em to optimize bet sizing. This amounts to the first action abstraction algorithm algorithm for selecting a small number of discrete actions to use from a continuum of actions---a key preprocessing step for solving large games using current equilibrium-finding algorithms with convergence guarantees for extensive-form games."
1336Learning with Augmented Class by Exploiting Unlabeled Data,Qing Da Yang Yu and Zhi-Hua Zhou,Novel Machine Learning Algorithms NMLA ,"open set classification
1337unlabeled data
1338support vector machines",NMLA Classification,In many real-world applications of learning the environment is open and changes gradually which requires the system to have the ability of detecting and adapting to the changes. Class-incremental learning C-IL is an important and practical problem where data from unseen augmented classes are fed but has not been studied well in the past. In C-IL the system should beware of predicting instances from augmented classes as a seen class and thus faces the challenge that no such instances were shown in training. In this paper we investigate tackling the challenge by using unlabeled data which can be cheaply collected in many real-world applications. We propose the LACU framework as well as the LACU-SVM approach to learn the concept of seen classes while incorporating the structure presented in the unlabeled data so that the misclassification risks among the seen classes as well as between the augmented and the seen classes are minimized simultaneously. Experiments on diverse datasets show the effectiveness of the proposed approach.
1339Locality-constrained Low-rank Coding for Image Classification,Ziheng Jiang Ping Guo and Lihong Peng,Vision VIS ,"Bag-of-words Model
1340Mid-level Representations
1341Locality Coding
1342Low-rank Coding
1343Inexact Augmented Lagrange Multiplier","VIS Categorization
1344VIS Face and Gesture Recognition
1345VIS Object Recognition",Low-rank coding LRC originated from matrix decomposition is recently introduced into image classification. Following the standard bag-of-words BOW pipeline coding the data matrix in the sense of low-rankness incorporates contextual information into the traditional BOW model and it can capture the dependency relationship among neighbor patches. This differs from the traditional sparse coding paradigms which encode patches independently. Current LRC-based methods use l1 norm to increase the discrimination and sparsity of the learned codes. However such methods fail to consider the local manifold structure between data space and dictionary space. To solve this problem we propose a locality-constrained low-rank coding LCLR algorithm for image representations. By using the geometric structure information as a regularization term the obtained representations are more discriminative. In addition we present a fast and stable online algorithm to solve the optimization problem. In the experiments we evaluate LCLR on four benchmarks including one face recognition dataset extended Yale B one handwritten digit recognition dataset USPS and two image datasets Scene13 for scene recognition and Caltech101 for object recognition . Experimental results show that our approach outperforms many state-of-the-art algorithms even with a linear classifier.
1346Qualitative Planning with Quantitative Constraints for Online Learning of Robotic Behaviours,Timothy Wiley Claude Sammut and Ivan Bratko,"Cognitive Systems CS
1347Robotics ROB ","Robotics
1348Online Machine Learning
1349Multi-Strategy Architecture
1350Qualitative Reasoning
1351Qualitative Planning","CS Problem solving and decision making
1352ROB Behavior and Control
1353ROB Motion and Path Planning
1354ROB Robotics General/Other ","This paper resolves previous problems in the Multi-Strategy architecture for online learning of robotic behaviours.
1355The hybrid method includes a symbolic qualitative planner that constructs an approximate solution to a control problem.
1356The approximate solution provides constraints for a numerical optimisation algorithm which is used to refine the qualitative plan into an operational policy.
1357Introducing quantitative constraints into the planner gives previously unachievable domain independent reasoning.
1358The method is demonstrated on a multi-tracked robot intended for urban search and rescue."
1359Regret-Based Multi-Agent Coordination with Uncertain Task Rewards,Feng Wu,Multiagent Systems MAS ,"Multi-Agent Coordination
1360Uncertain Task Rewards
1361DCOP","MAS Coordination and Collaboration
1362MAS Distributed Problem Solving","Many multi-agent coordination problems can be represented
1363as DCOPs. Motivated by task allocation in disaster
1364response we extend standard DCOP models to consider
1365uncertain task rewards where the outcome of completing
1366a task depends on its current state which is randomly
1367drawn from unknown distributions. The goal of
1368solving this problem is to find a solution for all agents
1369that minimizes the overall worst-case loss. This is a
1370challenging problem for centralized algorithms because
1371the search space grows exponentially with the number
1372of agents and is nontrivial for existing algorithms for
1373standard DCOPs. To address this we propose a novel
1374decentralized algorithm that incorporates Max-Sum
1375with iterative constraint generation to solve the problem
1376by passing messages among agents. By so doing our
1377approach scales well and can solve instances of the task
1378allocation problem with hundreds of agents and tasks."
1379Convex Co-embedding,Farzaneh Mirzazadeh Yuhong Guo and Dale Schuurmans,Novel Machine Learning Algorithms NMLA ,"convex relaxation
1380matrix norm regularization
1381relation learning
1382representation learning","NMLA Dimension Reduction/Feature Selection
1383NMLA Relational/Graph-Based Learning
1384NMLA Supervised Learning Other ",We present a general framework for association learning where entities are embedded in a common latent semantic space to allow relatedness to be expressed by geometry---an approach that underlies the state of the art for link prediction relation learning multi-label tagging relevance retrieval and ranking. Although current approaches rely on local training algorithms applied to non-convex formulations we demonstrate how general convex relaxations can be easily achieved for entity embedding both for the standard multi-linear and prototype-distance response models. We propose an incremental optimization strategy that exploits decomposition to allow scaling. An experimental evaluation reveals the advantages of tractable and repeatable global training in different case studies.
1385Optimal Decoupling in Linear Constraint Systems,Cees Witteveen Michel Wilson and Tomas Klos,"Heuristic Search and Optimization HSO
1386Multiagent Systems MAS
1387Planning and Scheduling PS
1388Search and Constraint Satisfaction SCS ","temporal decoupling
1389constraint solving
1390linear programming
1391flexibility","HSO Optimization
1392MAS Distributed Problem Solving
1393MAS Multiagent Planning
1394PS Scheduling
1395PS Temporal Planning
1396PS Planning General/Other
1397SCS Constraint Satisfaction
1398SCS Constraint Optimization
1399SCS Global Constraints
1400SCS Constraint Satisfaction General/other ",Decomposition can be defined as a technique to obtain complete solutions by easy composition of partial solutions. Typically these partial solutions are obtained by distributed and concurrent local problem solving without communication between the individual problem solvers. Constraint decomposition plays an important role in distributed databases distributed scheduling and violation detection. Here it enables conflict-free local decision making while avoiding communication overloading. One of the main issues in decomposition is the loss of flexibility due to the composition technique used. Here flexibility roughly refers to the freedom in choosing suitable values for the variables in order to satisfy the constraints. In this paper we concentrate on linear constraint systems and efficient decomposition techniques for these systems. Using a generalization of a flexibility metric developed for STNs we show how an efficient decomposition technique for linear constraints can be derived that minimizes the loss of flexibility due to decomposition. As a by-product of our decomposition technique we show that an intuitively attractive flexibility metric for linear constraint systems can be developed where decomposition does not incur any loss of flexibility.
1401Who also likes it? Generating the most Persuasive Social Explanations in Recommender Systems,Beidou Wang and Martin Ester,AI and the Web AIW ,"Recommendation Explanation
1402Social Explanation
1403Social Network","AIW Social networking and community identification
1404AIW Web-based recommendation systems",Social explanation the statement with the form of “A and B also like the item” is widely used in almost all the major recommender systems in the web and effectively improves the persuasiveness of the recommendation results by convincing more users to try. This paper presents the first algorithm to generate the most persuasive social explanation by recommending the optimal set of users to be put in the explanation. New challenges like modeling persuasiveness of multiple users different types of users in social network sparsity of likes are discussed in depth and solved in our algorithm. The extensive evaluation demonstrates the advantage of our proposed algorithm compared with traditional methods.
1405Computing Contingent Plans via Fully Observable Non-Deterministic Planning,Christian Muise Vaishak Belle and Sheila Mcilraith,Planning and Scheduling PS ,"contingent planning
1406conditional planning
1407partial observability
1408planning and sensing
1409offline planning
1410FOND","PS Deterministic Planning
1411PS Planning General/Other ",Planning with sensing actions under partial observability is a computationally challenging problem that is fundamental to the realization of AI tasks in areas as diverse as robotics game playing and diagnostic problem solving. In this paper we explore a particular class of planning problems where the initial state specification includes a set of state constraints or so-called state invariants and where uncertainty about the state monotonically decreases. Recent work on generating plans for partially observable domains has advocated for online planning claiming that offline plans are often too large to generate. Unfortunately planning online can lead to avoidable deadends and the generated plan only addresses the particular sequence of observations realized during the execution. Here we push the envelope on this challenging problem proposing a technique for generating conditional aka contingent plans offline. The conditional plans we produce will eventually achieve the goal for all consistent sequences of observations for which a solution exists. The key to our planner's success is the reliance on state-of-the-art techniques for fully observable non-deterministic FOND planning. In particular we use an existing compilation for converting a planning problem under partial observability and sensing to a FOND planning problem. With a modified FOND planner in hand we are able to scale beyond previous techniques for contingent planning and compute solutions that are orders of magnitude smaller than previously possible in some domains.
1412Instance-based Domain Adaptation in NLP via In-target-domain Logistic Approximation,Rui Xia Jianfei Yu Feng Xu and Shumei Wang,"NLP and Machine Learning NLPML
1413Novel Machine Learning Algorithms NMLA ","domain adaptation
1414instance adaptation
1415instance-based adaptation
1416density-ratio estimation
1417text categorization
1418sentiment classification","NLPML Text Classification
1419NLPML Natural Language Processing General/Other
1420NMLA Transfer Adaptation Multitask Learning",In the field of NLP most of the existing domain adaptation studies belong to the feature-based adaptation while the research of instance-based adaptation is very scarce. In this work we propose a new instance-based adaptation model called in-target-domain logistic approximation ILA . In ILA we adapt the source-domain data to the target domain by a logistic approximation. The normalized in-target-domain probability is assigned as an instance weight to each of the source-domain training data. An instance-weighted classification model is trained finally for the cross-domain classification problem. Compared to the previous techniques ILA conducts instance adaptation in a dimensionality-reduced linear feature space to ensure efficiency in high-dimensional NLP tasks. The instance weights in ILA are learnt by leveraging the criteria of both maximum likelihood and minimum statistical distance. The empirical results on two NLP tasks including text categorization and sentiment classification show that our ILA model beats the state-of-the-art instance adaptation methods significantly in cross-domain classification accuracy parameter stability and computational efficiency.
1421Ordering Effects and Belief Adjustment in the Use of Comparison Shopping Agents,Chen Hajaj Noam Hazon and David Sarne,Humans and AI HAI ,"comparison shopping agents
1422belief-adjustment
1423ordering
1424experimentation
1425eCommerce",HAI Human-Computer Interaction,"The popularity of online shopping has contributed to the development of comparison shopping agents CSAs aiming to facilitate buyers' ability to compare prices of online stores for any desired product. Furthermore the plethora of CSAs in today's markets enables buyers to query more than a single CSA when shopping thus expanding even further the list of sellers whose prices they obtain. This potentially decreases the chance of a purchase based on the prices outputted as a result of any single query and consequently decreases each CSA's expected revenue per-query. Obviously a CSA can improve its competence in such settings by acquiring more sellers' prices potentially resulting in a more attractive ``best price''. In this paper we suggest a complementary approach that improves the attractiveness of a CSA by presenting the prices to the user in a specific intelligent manner which is based on known cognitive-biases.
1426The advantage of this approach is its ability to affect the buyer's tendency to terminate her search for a better price hence avoid querying further CSAs without having the CSA spend any of its resources on finding better prices to present.
1427The effectiveness of our method is demonstrated using real data collected from four CSAs for five products. Our experiments with people confirm that the suggested method effectively influences people in a way that is highly advantageous to the CSA."
1428Scalable sparse covariance estimation via self-concordance,Anastasios Kyrillidis Rabeeh Karimi Mahabadi Quoc Tran-Dinh and Volkan Cevher,"Machine Learning Applications MLA
1429Novel Machine Learning Algorithms NMLA ","Inexact proximal Newton methods
1430Sparse covariance estimation
1431Self-concordance property","MLA Machine Learning Applications General/other
1432NMLA Big Data / Scalability
1433NMLA Data Mining and Knowledge Discovery
1434NMLA Graphical Model Learning",We consider the class of convex minimization problems composed of a self-concordant function such as the logdet metric a convex data fidelity term h and a regularizing -- possibly non-smooth -- function g accompanied with an easily computable proximity operator. These types of problems have recently attracted a great deal of interest mainly due to their omnipresence in top-notch applications. Under this locally Lipschitz continuous gradient setting we analyze the convergence behavior of proximal Newton schemes with the added twist of a probable presence of inexact evaluations; a scenario that has not been considered yet to the best of our knowledge. By using standard convex tools combined with self-concordance machinery we provide a concise convergence theory with attractive convergence rate guarantees and enhance state-of-the-art optimization schemes to accommodate such developments. Experimental results on sparse covariance estimation show the merits of our algorithm both in terms of recovery efficiency and complexity rendering the proposed framework a suitable choice for such problems.
1435Trading Multiple Indivisible Goods with Indifferences Beyond Sönmez's Result,Akihisa Sonoda Etsushi Fujita Taiki Todo and Makoto Yokoo,Game Theory and Economic Paradigms GTEP ,"Mechanism design
1436Exchange
1437Indifference
1438Pareto efficiency
1439Strategy-proofness
1440Individual rationality","GTEP Game Theory
1441GTEP Social Choice / Voting
1442MAS E-Commerce
1443MAS Mechanism Design","Designing mechanisms that satisfy individual rationality Pareto efficiency and strategyproofness is one of the most important problems in mechanism design. In this paper we investigate mechanism design for exchange models where each agent is initially endowed with a set of goods each agent may have indifferences on distinct bundles of goods and monetary transfers are not allowed. Sönmez (1999) showed that in such models those three properties are not compatible in general. The impossibility however only holds under an assumption on preference domains.
1444The purpose of this paper is to discuss the compatibility of those three properties when the assumption does not hold. We first establish a preference domain called top-only preferences which violates the assumption and develop a class of exchange mechanisms satisfying all those properties. Each mechanism in the class utilizes one instance of mechanisms introduced by Saban and Sethuraman (2013). We also find a class of preference domains called m-chotomous preferences where the assumption fails and those properties are incompatible."
1445SenticNet 3 A Common and Common-Sense Knowledge Base for Cognition-Driven Sentiment Analysis,Erik Cambria,"Cognitive Systems CS
1446Knowledge Representation and Reasoning KRR
1447NLP and Knowledge Representation NLPKR ","concept-level sentiment analysis
1448natural language processing
1449common-sense reasoning","CS Conceptual inference and reasoning
1450KRR Common-Sense Reasoning
1451KRR Knowledge Representation General/Other
1452NLPKR Natural Language Processing General/Other ",SenticNet is a publicly available semantic and affective resource for concept-level opinion mining and sentiment analysis. Rather than using graph-mining and dimensionality-reduction techniques SenticNet 3 makes use of `energy flows' to connect various parts of extended common and common-sense knowledge representations to one another. SenticNet 3 models nuanced semantics and sentics that is the conceptual and affective information associated with multi-word natural language expressions representing information with a symbolic opacity intermediate between that of neural networks and of typical symbolic systems.
1453Combining Multiple Correlated Reward and Shaping Signals by Measuring Confidence,Tim Brys Ann Nowé Daniel Kudenko and Matthew E. Taylor,Novel Machine Learning Algorithms NMLA ,"Reinforcement Learning
1454Reward Shaping
1455Multi-Objective Optimization
1456Traffic Light Control
1457Pursuit Domain",NMLA Reinforcement Learning,Multi-objective problems with correlated objectives are a class of problems that deserve specific attention. In contrast to typical multi-objective problems they do not require the identification of trade-offs between the objectives as near-optimal solutions for any objective are near-optimal for every objective. Intelligently combining the feedback from these objectives instead of only looking at a single one can improve optimization. This class of problems is very relevant in reinforcement learning as any single-objective reinforcement learning problem can be framed as such a multi-objective problem using multiple reward shaping functions. After discussing this problem class we propose a solution technique for such reinforcement learning problems called adaptive objective selection. This technique makes a temporal difference learner estimate the Q-function for each objective in parallel and introduces a way to measure confidence in these estimates. This confidence metric is then used to choose which objective's estimates to use for action selection. We show significant improvements in performance over other plausible techniques on two problem domains. Finally we provide an intuitive analysis of the technique's decisions yielding insights into the nature of the problems being solved.
1458Avoiding Plagiarism in Markov Sequence Generation,Alexandre Papadopoulos Pierre Roy and François Pachet,"Machine Learning Applications MLA
1459Search and Constraint Satisfaction SCS ","markov chains
1460plagiarism
1461constraint satisfaction
1462global constraints","APP Art and Music
1463MLA Machine Learning Applications General/other
1464SCS Constraint Satisfaction
1465SCS Global Constraints
1466SCS Constraint Satisfaction General/other ",Markov processes are widely used to generate sequences that imitate a given style using random walk. Random walk generates sequences by iteratively concatenating states to prefixes of length equal to or less than the given Markov order. However at higher orders Markov chains tend to replicate chunks of the corpus with a size possibly higher than the order which is a primary form of plagiarism. In fact the Markov order defines a maximum length for training but not for generation. In the framework of constraint satisfaction (CSP) we introduce MaxOrder. This global constraint ensures that generated sequences do not include chunks larger than a given maximum order. We exhibit an automaton that recognises the solution set with a size linear in the size of the corpus. We propose a linear-time procedure to generate this automaton from a corpus and a given max order. We then use this automaton to achieve generalised arc consistency for the MaxOrder constraint holding on a sequence of size n in O(n·T) time where T is the size of the automaton. We illustrate our approach by generating text sequences from text corpora with a maximum order guarantee effectively controlling plagiarism.
1467A Region-Based Model for Estimating Urban Air Pollution,Arnaud Jutzeler Jason Jingshi Li and Boi Faltings,"Computational Sustainability and AI CSAI
1468Machine Learning Applications MLA ","Spatial Reasoning
1469Computational Sustainability
1470Gaussian Process
1471Urban Air Quality","CSAI Modeling and prediction of dynamic and spatiotemporal phenomena and systems
1472MLA Environmental",Air pollution has a direct impact on human health and data-driven air quality models are useful for evaluating population exposure to air pollutants. In this paper we propose a novel region-based Gaussian Process model for estimating urban air pollution dispersion and apply it to a large dataset of ultrafine particle measurements collected from a network of trams monitoring levels of ultrafine particle dispersion in the city of Zurich. We show that compared to existing grid-based models the region-based model produces better predictions across all aggregate time scales. The new model is appropriate for many useful user applications such as anomaly detection exposure assessment and sensor optimization.
1473Item Bidding for Combinatorial Public Projects,Evangelos Markakis and Orestis Telelis,"Game Theory and Economic Paradigms GTEP
1474Multiagent Systems MAS ","Public Project
1475Mechanisms
1476Valuation Function
1477Social Welfare
1478Nash Equilibrium
1479Strong Equilibrium
1480Price of Anarchy","GTEP Game Theory
1481GTEP Coordination and Collaboration
1482GTEP Equilibrium
1483MAS Coordination and Collaboration
1484MAS Mechanism Design",We present and analyze a mechanism for the Combinatorial Public Project Problem (CPPP). The problem asks to select k out of m available items so as to maximize the social welfare for autonomous agents with combinatorial preferences (valuation functions) over subsets of items. The CPPP constitutes an abstract model for decision making by autonomous agents and has been shown to present severe computational hardness in the design of truthful approximation mechanisms. We study a non-truthful mechanism that is however practically relevant to multi-agent environments by virtue of its natural simplicity. It employs an Item Bidding interface wherein every agent issues a separate bid for the inclusion of each distinct item in the outcome; the k items with the highest sums of bids are chosen and agents are charged according to a VCG-based payment rule. For fairly expressive classes of the agents' valuation functions we establish existence of socially optimal pure Nash and strong equilibria that are resilient to coordinated deviations of subsets of agents. Subsequently we derive tight worst-case bounds on the approximation of the optimum social welfare achieved in equilibrium. We show that the mechanism's performance improves with the number of agents that can coordinate and reaches half of the optimum welfare at strong equilibrium.
1485How Long Will It Take? Accurate Prediction of Ontology Reasoning Performance,Yong-Bin Kang Jeff Z. Pan Shonali Krishnaswamy Wudhichart Sawangphol and Yuan-Fang Li,AI and the Web AIW ,"Ontology
1486Reasoning performance
1487Semantic Web
1488Prediction
1489Regression
1490Performance hotspot detection",AIW Ontologies and the web creation extraction evolution mapping merging and alignment; tags and folksonomies,For expressive ontology languages such as OWL 2 DL classification is a computationally expensive task (2NExpTime-complete in the worst case). Hence it is highly desirable to be able to accurately estimate classification time especially for large and complex ontologies. Recently machine learning techniques have been successfully applied to predicting the reasoning hardness category for a given ontology reasoner pair. In this paper we further develop predictive models to estimate actual classification time using regression techniques with ontology metrics as features. Our large-scale experiments on 6 state-of-the-art OWL 2 DL reasoners and more than 450 significantly diverse ontologies demonstrate that the prediction models achieve high accuracy good generalizability and statistical significance. Such prediction models have a wide range of applications. We demonstrate how they can be used to efficiently and accurately identify performance hotspots in a large and complex ontology which is an otherwise very time-consuming and resource-intensive task.
1491Backdoors to Planning,Martin Kronegger Sebastian Ordyniak and Andreas Pfandler,"Knowledge Representation and Reasoning KRR
1492Planning and Scheduling PS ","Planning
1493Backdoors
1494Fixed-parameter tractable algorithms
1495 Parameterized complexity","KRR Computational Complexity of Reasoning
1496PS Deterministic Planning","Backdoors measure the distance to tractable fragments
1497and have become an important tool to find fixed-parameter
1498tractable (fpt) algorithms. Despite their success backdoors
1499have not been used for planning which is a central problem
1500in AI that has a high computational complexity. In this
1501work we introduce two notions of backdoors building
1502upon the causal graph. We analyze the complexity of
1503finding a small backdoor (detection) and using the backdoor
1504to solve the problem (evaluation) in the light of
1505planning with (un)bounded domain/plan length. For each
1506setting we present either an fpt-result or rule out the existence
1507thereof by showing parameterized intractability.
1508In three cases we achieve the most desirable outcome:
1509detection and evaluation are fpt."
1510Dynamic Multi-Agent Task Allocation with Spatial and Temporal Constraints,Sofia Amador Steven Okamoto and Roie Zivan,Multiagent Systems MAS ,"Task Allocation
1511Dynamic Problem
1512Cooperative Agents","MAS Coordination and Collaboration
1513MAS Distributed Problem Solving","Realistic multi-agent team applications often feature dynamic environments with soft deadlines that penalize late execution of tasks. This puts a premium on quickly allocating tasks to agents but finding the optimal allocation is NP-hard due to temporal and spatial constraints that require tasks to be executed sequentially by agents.
1514
1515We propose FMC_TA, a novel task allocation algorithm that allows tasks to be easily sequenced to yield high-quality solutions. FMC_TA first finds allocations that are fair (envy-free), balancing the load and sharing important tasks between agents, and efficient (Pareto optimal) in a simplified version of the problem. It computes such allocations in polynomial or pseudo-polynomial time (centrally or distributedly respectively) using a Fisher market with agents as buyers and tasks as goods. It then heuristically schedules the allocations taking into account inter-agent constraints on shared tasks.
1516
1517We empirically compare our algorithm to state-of-the-art incomplete methods both centralized and distributed on law enforcement problems inspired by real police logs. The results show a clear advantage for FMC_TA both in total utility and in other measures commonly used by law enforcement authorities."
1518Datalog Rewritability of Disjunctive Datalog Programs and its Applications to Ontology Reasoning,Mark Kaminski Yavor Nenov and Bernardo Cuenca Grau,Knowledge Representation and Reasoning KRR ,"disjunctive datalog
1519tractable reasoning
1520ontology-based query answering
1521OWL 2","KRR Ontologies
1522KRR Automated Reasoning and Theorem Proving
1523KRR Computational Complexity of Reasoning
1524KRR Description Logics
1525KRR Knowledge Representation Languages
1526KRR Logic Programming",We study the problem of rewriting a disjunctive datalog program into plain datalog. We show that a disjunctive program is rewritable if and only if it is equivalent to a linear disjunctive program thus providing a novel characterisation of datalog rewritability. Motivated by this result we propose weakly linear disjunctive datalog---a novel rule-based KR language that extends both datalog and linear disjunctive datalog and for which reasoning is tractable in data complexity. We then explore applications of weakly linear programs to ontology reasoning and propose a tractable extension of OWL 2 RL with disjunctive axioms. Our empirical results suggest that many non-Horn ontologies can be reduced to weakly linear programs and that query answering over such ontologies using a datalog engine is feasible in practice.
1527Increasing VCG revenue by decreasing the quality of items,Mingyu Guo Argyrios Deligkas and Rahul Savani,"Game Theory and Economic Paradigms GTEP
1528Multiagent Systems MAS ","auctions
1529mechanism design
1530VCG
1531revenue maximization","GTEP Auctions and Market-Based Systems
1532MAS Mechanism Design","The VCG mechanism is the standard method to incentivize bidders in combinatorial auctions to bid truthfully. Under the VCG mechanism the auctioneer can sometimes increase revenue by 'burning' items. We study this phenomenon in a setting where items are described by a number of attributes. The value of an attribute corresponds to a quality level and bidders' valuations are non-decreasing in the quality levels. In addition to burning items we allow the auctioneer to present
1533some of the attributes as lower quality than they actually are. We study the following two revenue maximization problems under VCG: finding an optimal way to mark down items by reducing their quality levels and finding an optimal set of items to burn. We study the effect of the following parameters on the computational complexity of these two problems: the number of attributes, the number of quality levels per attribute, and the complexity of the bidders' valuation functions. Bidders have unit demand so VCG's outcome can be computed in polynomial time and the valuation functions we consider are step functions that are non-decreasing with the quality levels. We prove that both problems are NP-hard even in the following three simple settings: (a) four attributes, arbitrarily many quality levels per attribute, and single-step valuation functions; (b) arbitrarily many attributes, two quality levels per attribute, and single-step valuation functions; and (c) one attribute, arbitrarily many quality levels, and multi-step valuation functions. For the case where items have only one attribute and every bidder has a single-step valuation that is zero below some quality threshold we show that both problems can be solved in polynomial time using a dynamic programming approach. For this case we also quantify how much better marking down is than item burning and provide examples where the improvement is best possible. Finally we compare the revenue of both approaches with computational experiments."
1534Grandpa Hates Robots - Interaction Constraints for Planning in Inhabited Environments,Uwe Köckemann Federico Pecora and Lars Karlsson,"Knowledge Representation and Reasoning KRR
1535Planning and Scheduling PS
1536Search and Constraint Satisfaction SCS ","Constraint-based planning
1537Planning in inhabited environments
1538Human-aware planning","KRR Preferences
1539PS Scheduling
1540PS Temporal Planning
1541PS Planning General/Other
1542SCS Constraint Satisfaction General/other ",Consider a family whose home is equipped with several service robots. The actions planned for the robots (e.g. doing chores or playing with the children) must adhere to interaction constraints relating them to human activities and preferences. These constraints must be sufficiently expressive to model both temporal and logical dependencies among robot actions and human behavior and must accommodate incomplete information regarding human activities. In this paper we introduce an approach for automatically generating plans that are conformant with respect to given interaction constraints and partially specified human activities. The approach allows causal reasoning about actions to be separated from reasoning about interaction constraints and we illustrate the computational advantage this brings with experiments on a large-scale semi-realistic household domain with hundreds of human activities and several robots.
1543The Most Uncreative Examinee A First Step toward Wide Coverage Natural Language Math Problem Solving,Takuya Matsuzaki Hidenao Iwane Hirokazu Anai and Noriko Arai,Knowledge Representation and Reasoning KRR ,"natural language semantics
1544mathematical problem solving
1545automated reasoning
1546computer algebra",KRR Automated Reasoning and Theorem Proving,"We report on a project aiming at developing a system that solves a
1547wide range of math problems written in natural language. In the
1548system formal analysis of natural language semantics is coupled with
1549automated reasoning technologies including computer algebra using
1550logic as their common language. We have developed a prototype system
1551that accepts as its input a linguistically annotated problem text.
1552Using the prototype system as a reference point we analyzed real
1553university entrance examination problems from the viewpoint of
1554end-to-end automated reasoning. Further evaluation on entrance exam
1555mock tests revealed that an optimistic estimate of the system's
1556performance already matches human averages on a few test sets."
1557Acquiring Commonsense Knowledge for Sentiment Analysis through Human Computation,Marina Boia Claudiu Cristian Musat and Boi Faltings,"Human-Computation and Crowd Sourcing HCC
1558Knowledge Representation and Reasoning KRR
1559NLP and Machine Learning NLPML ","human computation
1560games with a purpose
1561crowdsourcing
1562commonsense knowledge
1563sentiment analysis
1564context","HCC Domain-specific implementation challenges in human computation games
1565KRR Knowledge Acquisition
1566NLPML Text Classification",Many Artificial Intelligence tasks need large amounts of commonsense knowledge. Because obtaining this knowledge through machine learning would require a huge amount of data a better alternative is to elicit it from people through human computation. We consider the sentiment classification task where knowledge about the contexts that impact word polarities is crucial but hard to acquire from data. We show a novel task design that allows us to crowdsource this knowledge through Amazon Mechanical Turk with high quality. We show that the commonsense knowledge acquired in this way dramatically improves the performance of established sentiment classification methods.
1567Optimistic Adaptive Submodularity at Scale,Victor Gabillon Branislav Kveton Brian Eriksson S. Muthukrishnan and Zheng Wen,"Machine Learning Applications MLA
1568Novel Machine Learning Algorithms NMLA
1569Planning and Scheduling PS
1570Reasoning under Uncertainty RU ","Submodularity
1571Adaptive submodularity
1572Linear bandits
1573Online learning","APP Other Applications
1574MLA Machine Learning Applications General/other
1575NMLA Active Learning
1576NMLA Online Learning
1577NMLA Recommender Systems
1578PS Planning General/Other
1579RU Sequential Decision Making",Maximization of submodular functions has wide applications in artificial intelligence and machine learning. In this work we study the problem of learning how to maximize an adaptive submodular function. The function is initially unknown and we learn it by interacting repeatedly with the environment. A major problem in applying existing solutions to this problem is that their regret bounds scale linearly with the size of the problem. Therefore these solutions are impractical even for moderately large problems. In this work we use the structure of real-world problems to make learning practical. We make three main contributions. First we propose a practical algorithm for learning how to maximize an adaptive submodular function where the distribution of the states of each item is conditioned on its features. Second we analyze this algorithm and show that its expected cumulative regret is polylogarithmic in time. Finally we evaluate our algorithm on two real-world problems movie recommendation and face detection and show that high-quality policies can be learned in just several hundred interactions.
1580Sequential Click Prediction for Sponsored Search with Recurrent Neural Networks,Yuyu Zhang Hanjun Dai Chang Xu Jun Feng Taifeng Wang Jiang Bian Bin Wang and Tie-Yan Liu,Machine Learning Applications MLA ,"Sponsored Search
1581Recurrent Neural Network
1582Sequential Click Prediction","AIW Enhancing web search and information retrieval
1583AIW Machine learning and the web
1584MLA Applications of Supervised Learning
1585MLA Machine Learning Applications General/other ",Click prediction is one of the fundamental problems in sponsored search. Most existing studies use machine learning approaches to predict ad clicks for each ad view event independently. While these studies aim to provide a stationary interpretation of ad clicks they lack the capability to understand user clicks in a dynamic way. As observed in real-world sponsored search systems a user's behavior on an ad depends strongly on how the user previously behaved over time especially in terms of the queries the user submitted click / non-click on ads dwell time on landing pages etc. Inspired by these observations we introduce a novel framework based on Recurrent Neural Networks (RNN) to model the user's sequential behaviors in the click prediction process. Compared to traditional methods this framework aims at effective click prediction by leveraging not only the user's stationary historical behaviors but also the rich information and patterns implied by the user's dynamic sequential behaviors. Large scale evaluations on the click-through logs from a commercial search engine demonstrate that our approach can significantly improve the click prediction accuracy compared to other time-independent approaches.
1586Backdoors into Heterogeneous Classes of SAT and CSP,Serge Gaspers Neeldhara Misra Sebastian Ordyniak Stefan Szeider and Stanislav Živný,Search and Constraint Satisfaction SCS ,"theoretical analysis
1587Constraint Satisfaction Problem CSP
1588Satisfiability SAT
1589polymorphism
1590backdoor set
1591parameterized complexity","KRR Computational Complexity of Reasoning
1592SCS Constraint Satisfaction
1593SCS Satisfiability General/Other
1594SCS Constraint Satisfaction General/other ","Backdoor sets represent clever reasoning shortcuts through the search space for SAT and CSP. By instantiating the backdoor variables one reduces the given instance to several easy instances that belong to a tractable class. The overall time needed to solve the instance is exponential in the size of the backdoor set hence it is a challenging problem to find a small backdoor set if one exists; in recent years this problem has been the subject of intensive research.
1595
1596In this paper we extend the classical notion of a strong backdoor set by allowing that different instantiations of the backdoor variables result in instances that belong to different base classes; the union of the base classes forms a heterogeneous base class. Backdoor sets to heterogeneous base classes can be much smaller than backdoor sets to homogeneous ones hence they are much more desirable but possibly harder to find.
1597
1598We draw a detailed complexity landscape for the problem of detecting strong backdoor sets into heterogeneous base classes for SAT and CSP. We provide algorithms that establish fixed-parameter tractability under natural parameterizations and we contrast the tractability results with hardness results that pinpoint the theoretical limits."
1599Theory of Cooperation in Complex Social Networks,Bijan Ranjbar-Sahraei Haitham Bou Ammar Daan Bloembergen Karl Tuyls and Gerhard Weiss,"Game Theory and Economic Paradigms GTEP
1600Multiagent Systems MAS ","coevolutionary networks
1601evolution of cooperation
1602influencing social networks","GTEP Game Theory
1603GTEP Coordination and Collaboration
1604GTEP Equilibrium
1605MAS Agent-based Simulation and Emergent Behavior
1606MAS Coordination and Collaboration
1607MAS Evaluation and Analysis Multiagent Systems ",This paper presents a theoretical as well as empirical study on the evolution of cooperation on complex social networks following the continuous action iterated prisoner's dilemma (CAIPD) model. In particular convergence to network-wide agreement is proven both for evolutionary networks with fixed interaction dynamics and for coevolutionary networks where these dynamics change over time. Moreover an extension to the CAIPD model is proposed that allows modeling active influence on the evolution of cooperation in social networks. As such this work contributes to a better understanding of behavioral change on social networks and provides a first step towards their active control.
1608Explanation-Based Approximate Weighted Model Counting for Probabilistic Logics,Joris Renkens Angelika Kimmig Guy Van den Broeck and Luc De Raedt,"Knowledge Representation and Reasoning KRR
1609Reasoning under Uncertainty RU ","Probabilistic Logic Programming
1610Bounded Approximate Inference
1611Weighted Model Counting","KRR Logic Programming
1612RU Probabilistic Inference
1613RU Relational Probabilistic Models",Probabilistic inference in statistical relational learning and probabilistic programming can be realised using weighted model counting. Despite a lot of progress computing weighted model counts exactly is still infeasible for most problems of interest and one typically has to resort to approximation methods. We contribute a new bounded approximation method for weighted model counting based on probabilistic logic programming principles. Our bounded approximation algorithm is an anytime algorithm that provides lower and upper bounds on the weighted model count. An empirical evaluation on probabilistic logic programs shows that our approach is effective in many cases that are currently beyond the reach of exact methods.
1614A Knowledge Compilation Map for Ordered Real-Valued Decision Diagrams,Helene Fargier Pierre Marquis Alexandre Niveau and Nicolas Schmidt,Knowledge Representation and Reasoning KRR ,"Decision Diagrams ADD - AADD - OBDD - SLDD
1615Knowledge Compilation
1616Complexity","KRR Computational Complexity of Reasoning
1617KRR Knowledge Representation Languages
1618KRR Preferences
1619SCS Constraint Optimization",Valued decision diagrams VDDs are languages that represent functions mapping variable-value assignments to non-negative real numbers. They prove useful to compile cost functions utility functions or probability distributions. While the complexity of some queries notably optimization and transformations notably conditioning on VDD languages has been known for some time there remain many significant queries and transformations such as the various kinds of cuts marginalizations and combinations the complexity of which has not been identified so far. This paper contributes to filling this gap and completing previous results about the time and space efficiency of VDD languages thus leading to a knowledge compilation map for real-valued functions. Our results show that many tasks that are hard on valued CSPs are actually tractable on VDDs.
1620Prices Matter for the Parameterized Complexity of Shift Bribery,Robert Bredereck Jiehua Chen Piotr Faliszewski André Nichterlein and Rolf Niedermeier,"Game Theory and Economic Paradigms GTEP
1621Multiagent Systems MAS ","preference-based voting
1622campaign management
1623computational (in)tractability
1624parameterized complexity analysis
1625approximation","GTEP Game Theory
1626GTEP Social Choice / Voting
1627MAS E-Commerce",In the Shift Bribery problem we are given an election based on preference orders a preferred candidate p and a budget. The goal is to ensure that p wins by shifting p higher in some voters' preference orders. However each such shift request comes at a price depending on the voter and on the extent of the shift and we must not exceed the given budget. We study the parameterized computational complexity of Shift Bribery with respect to a number of parameters pertaining to the nature of the solution sought and the size of the election and several classes of price functions. When we parameterize Shift Bribery by the number of affected voters then for each of our voting rules Borda Maximin Copeland the problem is W[2]-hard. If instead we parameterize by the number of positions by which p is shifted in total then the problem is fixed-parameter tractable for Borda and Maximin and is W[1]-hard for Copeland. If we parameterize by the budget for the cost of shifting then the results depend on the price function class. We also show that Shift Bribery tends to be tractable when parameterized by the number of voters but that the results for the number of candidates are more enigmatic.
1628Decomposing Activities of Daily Living to Discover Routine Clusters,Onur Yuruten Jiyong Zhang and Pearl Pu,Machine Learning Applications MLA ,"activity recognition
1629activities of daily living
1630time series clustering
1631low rank and sparse matrix decomposition","MLA Applications of Unsupervised Learning
1632MLA Machine Learning Applications General/other ","An activity recognition system tries to analyze measurements of activities of daily living (ADLs) and automatically recognize whether someone is sitting walking or running. Most of the existing approaches either rely on a model trained on a preselected and manually labeled set of activities or perform micro-pattern analysis which requires manual selection of the lengths and the number of micro-patterns. Because real-life ADL datasets are massive the cost associated with these manual efforts is too high. As a result these approaches limit the scalable discovery of ADL patterns from real-life datasets.
1633
1634We propose a novel approach to extract meaningful patterns found in time-series ADL data. We use a matrix decomposition method to isolate routines and deviations to obtain two different sets of clusters. We obtain the final memberships via the cross product of these sets. We validate our approach using two real-life ADL datasets and a well-known artificial dataset. Based on average silhouette width scores our approach can capture strong structures in the underlying data. Furthermore results show that our approach improves on the accuracy of the baseline algorithms by 12% with a statistical significance p<0.05 using the Wilcoxon signed-rank comparison test."
1635On Hair Recognition in the Wild by Machine,Joseph Roth and Xiaoming Liu,Vision VIS ,"Vision
1636Biometrics
1637Face Recognition","VIS Face and Gesture Recognition
1638VIS Statistical Methods and Learning",We present an algorithm for identity inference using only the information from the hair. Face recognition in the wild i.e. unconstrained settings is highly useful in a variety of applications but performance suffers due to many factors e.g. obscured face lighting variation extreme pose angle and expression. It is well known that humans use hair information to guide identity decisions under many of these scenarios due to either the consistent hair appearance of the same subject or obvious hair discrepancy of different subjects but little work exists to replicate this intelligence artificially. We propose a learned hair matcher using shape color and texture features derived from localized patches through an AdaBoost technique with abstaining weak classifiers when features are not present in the given location. The proposed hair matcher achieves 71.53% accuracy on the LFW View 2 dataset. Hair also reduces the error of a COTS face matcher through simple score-level fusion by 5.7%.
1639Capturing Relational Schemas and Functional Dependencies in RDFS,Diego Calvanese Wolfgang Fischl Reinhard Pichler Emanuel Sallinger and Mantas Simkus,"AI and the Web AIW
1640Knowledge Representation and Reasoning KRR ","identification constraints
1641functional dependencies
1642normal forms","AIW Ontologies and the web creation extraction evolution mapping merging and alignment; tags and folksonomies
1643KRR Description Logics
1644KRR Knowledge Representation General/Other ","Mapping relational data to RDF is an important task for the development of the
1645Semantic Web. To this end the W3C has recently released a Recommendation for
1646the so-called direct mapping of relational data to RDF. In this work we
1647propose an enrichment of the direct mapping to make it more faithful by
1648transferring also semantic information present in the relational schema from
1649the relational world to the RDF world. We thus introduce expressive
1650identification constraints to capture functional dependencies and define an
1651RDF Normal Form which precisely captures the classical Boyce-Codd Normal Form
1652of relational schemas."
1653Maximum Satisfiability using core-guided MaxSAT Resolution,Nina Narodytska and Fahiem Bacchus,Search and Constraint Satisfaction SCS ,"maximum satisfiability
1654maxsat resolution
1655iterative SAT solving
1656weighted partial MaxSAT","SCS Constraint Optimization
1657SCS SAT and CSP Evaluation and Analysis
1658SCS SAT and CSP Solvers and Tools
1659SCS Satisfiability General/Other ",Core-guided approaches to solving MaxSat have proved to be effective on industrial problems containing hard clauses and weighted soft clauses weighted partial MaxSat or WPM . These approaches solve WPM problems by building a sequence of new WPM formulas where in each formula a greater weight of soft clauses can be relaxed. Relaxation of the soft clauses is achieved via the addition of blocking variables to the soft clauses along with constraints on these blocking variables. In this work we propose an alternative approach. Our approach also builds a sequence of new WPM formulas. However these formulas are constructed using MaxSat resolution a sound rule of inference for MaxSat. MaxSat resolution can in the worst case cause a quadratic blowup in the formula so we propose a new compressed version of MaxSat resolution. Using compressed MaxSat resolution our new core-guided solver improves the state-of-the-art solving significantly more problems than other state-of-the-art solvers on the industrial benchmarks used in the 2013 MaxSat Solver Evaluation.
1660Mind the Gap Machine Translation by Minimizing the Semantic Gap in Embedding Space,Jiajun Zhang,"NLP and Knowledge Representation NLPKR
1661NLP and Machine Learning NLPML ","Statistical Machine Translation
1662Semantic Phrase Representation
1663Recursive Neural Networks
1664Semantic Gap Minimization","NLPKR Natural Language Processing General/Other
1665NLPML Natural Language Processing General/Other ",The conventional statistical machine translation SMT methods perform the decoding process by compositing a set of the translation rules which have the highest probability. However the probabilities of the translation rules are calculated only according to the cooccurrence statistics in the bilingual corpus rather than the semantic meaning similarity. In this paper we propose a Recursive Neural Network RNN based model that converts each translation rule into a compact real-valued vector in the semantic embedding space and performs the decoding process by minimizing the semantic gap between the source language string and its translation candidates at each state in a bottom-up structure. The RNN-based translation model is trained using a max-margin objective function. Extensive experiments on Chinese-to-English translation show that our RNN-based model can significantly improve the translation quality by up to 1.68 BLEU score.
1666Rounded Dynamic Programming for Tree-Structured Stochastic Network Design,Xiaojian Wu Daniel Sheldon and Shlomo Zilberstein,"Computational Sustainability and AI CSAI
1667Heuristic Search and Optimization HSO
1668Planning and Scheduling PS ","stochastic network design
1669dynamic programming
1670barrier removal
1671river networks
1672influence maximization
1673stochastic optimization","CSAI Modeling and prediction of dynamic and spatiotemporal phenomena and systems
1674CSAI Control and optimization of dynamic and spatiotemporal systems
1675CSAI Network modeling prediction and optimization.
1676HSO Optimization
1677PS Probabilistic Planning",We develop a fast approximation algorithm called rounded dynamic programming RDP for stochastic network design problems on directed trees. The underlying model describes phenomena that spread away from the root of a tree for example the spread of influence in a hierarchical organization or fish in a river network. Actions can be taken to intervene in the network for some cost to increase the probability of propagation along an edge. Our algorithm selects a set of actions to maximize the overall spread in the network under a limited budget. We prove that the algorithm is a fully polynomial-time approximation scheme FPTAS that is it finds (1 − ε)-optimal solutions in time polynomial in the input size and 1/ε. We apply the algorithm to an important motivating problem in Computational Sustainability that of efficiently allocating funds to remove barriers in a river network so fish can reach greater portions of their native range. Our experiments show that our algorithm is able to produce near-optimal solutions much faster than an existing technique.
1678Incentives for Truthful Information Elicitation of Continuous Signals,Goran Radanovic and Boi Faltings,"Game Theory and Economic Paradigms GTEP
1679Human-Computation and Crowd Sourcing HCC
1680Multiagent Systems MAS ","Mechanism Design
1681Information elicitation
1682Peer prediction","GTEP Game Theory
1683GTEP Equilibrium
1684GTEP Imperfect Information
1685HCC Game-theoretic mechanism design of incentives for motivation and honest reporting
1686MAS E-Commerce
1687MAS Mechanism Design
1688MAS Multiagent Systems General/other ",Information elicitation mechanisms represent an important component of many information aggregation techniques such as product reviews community sensing or opinion polls. We propose a novel mechanism that elicits both private signals and beliefs. The mechanism extends the previous versions of the Bayesian Truth Serums the original BTS the RBTS and the multi-valued BTS by allowing small populations and non-binary private signals while not requiring additional assumptions on the belief updating process. For priors that are sufficiently smooth such as Gaussians the mechanism allows signals to be continuous.
1689Equilibria in Epidemic Containment Games,Sudip Saha Abhijin Adiga and Anil Kumar S. Vullikanti,"Applications APP
1690Computational Sustainability and AI CSAI
1691Game Theory and Economic Paradigms GTEP
1692Multiagent Systems MAS ","network security game
1693nash equilibria
1694malware propagation
1695epidemic control
1696security
1697protection
1698network infection
1699immunization
1700graph theory
1701game theory
1702spectral radius","APP Security and Privacy
1703CSAI Modeling the interactions of agents with different and often conflicting interests
1704GTEP Game Theory
1705GTEP Equilibrium
1706MAS Evaluation and Analysis Multiagent Systems
1707MAS Multiagent Systems General/other ","The spread of epidemics and malware is commonly modeled by diffusion processes
1708on networks. Protective interventions such as vaccinations or installing anti-virus software are used to contain their spread. Typically each node in the network has to decide its own strategy of securing itself and its benefit depends on which other nodes are secure making this a natural game-theoretic setting. There has been a lot of work on network security game models but most of the focus has been either on simplified epidemic models or homogeneous network structure.
1709
1710We develop a new formulation for an epidemic containment game which relies on the
1711characterization of the SIS model in terms of the spectral radius of the network.
1712We show that in this model pure Nash equilibria NE always exist and can be found by a best response strategy. We analyze the complexity of finding NE and derive rigorous bounds on their costs and the Price of Anarchy or PoA the ratio of the costs of the worst NE to the best NE in general graphs as well as in random graph models. In particular for arbitrary power-law graphs with exponent $\beta>2$ we show that the PoA is bounded by $O(T^{2(\beta-1)})$ where $T=\gamma/\alpha$ is the ratio of the recovery rate to the transmission rate in the SIS model.
1713For the Chung-Lu random power-law graph model we prove this bound is tight for the PoA. We study the characteristics of Nash equilibria empirically in different real communication and infrastructure networks and find that our analytical results can help explain some of the empirical observations."
1714Beat the Cheater Computing Game-Theoretic Strategies for When to Kick a Gambler out of a Casino,Troels Bjerre Sørensen Melissa Dalis Joshua Letchford Dmytro Korzhyk and Vincent Conitzer,"Game Theory and Economic Paradigms GTEP
1715Multiagent Systems MAS ","Security
1716Stackelberg
1717Gambling
1718Game Theory","GTEP Game Theory
1719GTEP Equilibrium
1720GTEP Imperfect Information","Gambles in casinos are usually set up so that the casino makes a profit in expectation---as long as gamblers play honestly. However some gamblers are able to cheat reducing the casino's profit. How should the casino address this? A common strategy is to selectively kick gamblers out possibly even without being sure that they were cheating. In this paper we address the following question. Based solely on a gambler's track record when is it optimal for the casino to kick the gambler out? Because cheaters will adapt to the casino's policy this is a game-theoretic question. Specifically we model the problem as a Bayesian game in which the casino is a Stackelberg leader that can commit to a possibly randomized policy for when to kick gamblers out and provide efficient algorithms for computing the optimal policy.
1721Besides being potentially useful to casinos we imagine that similar techniques could be useful for addressing related problems---for example illegal trades in financial markets."
1722A Characterization of the Single-Peaked Single-Crossing Domain,Edith Elkind Piotr Faliszewski and Piotr Skowron,"Game Theory and Economic Paradigms GTEP
1723Multiagent Systems MAS ","elections
1724voting
1725single-peaked
1726single-crossing
17271-Euclidean
1728proportional representation
1729Monroe
1730algorithms","GTEP Game Theory
1731GTEP Social Choice / Voting
1732MAS E-Commerce","We investigate elections that are simultaneously single-peaked and single-crossing
1733 SPSC . We show that the domain of 1-dimensional Euclidean elections where voters and candidates are points on the real line and each voter prefers the candidates that are close to her to the ones that are further away is a proper subdomain of the SPSC domain by constructing an election that is single-peaked and single-crossing but not 1-Euclidean. We then establish a connection between narcissistic elections where each candidate is ranked first by at least one voter single-peaked elections and single-crossing elections by showing that an election is SPSC if and only if it can be obtained from a narcissistic single-crossing election by deleting voters. We use this characterization to show that the SPSC domain admits an efficient algorithm for a problem in fully proportional representation."
1734A Support-Based Algorithm for the Bi-Objective Pareto Constraint,Renaud Hartert and Pierre Schaus,Search and Constraint Satisfaction SCS ,"Constraint Programming
1735Bi-Objective Combinatorial Optimization
1736Global Constraint
1737Pareto Constraint","SCS Constraint Satisfaction
1738SCS Constraint Optimization
1739SCS Global Constraints",Bi-objective combinatorial optimization problems are ubiquitous in real-world applications and designing approaches to solve them efficiently is an important research area of Artificial Intelligence. In Constraint Programming the recently introduced bi-objective Pareto constraint allows one to solve bi-objective combinatorial optimization problems exactly. Using this constraint every non-dominated solution is collected in a single tree-search while pruning sub-trees that cannot lead to a non-dominated solution. This paper introduces a simpler and more efficient filtering algorithm for the bi-objective Pareto constraint. The efficiency of our algorithm is experimentally confirmed on classical bi-objective benchmarks.
1740HC-Search for Multi-Label Prediction An Empirical Study,Janardhan Rao Doppa Jun Yu Chao Ma Alan Fern and Prasad Tadepalli,Novel Machine Learning Algorithms NMLA ,"Supervised Learning
1741Multi-Label Classification
1742Structured Prediction",NMLA Supervised Learning Other ,Multi-label learning concerns learning multiple overlapping and correlated classes. In this paper we adapt a recent structured prediction framework called HC-Search for multi-label prediction problems. One of the main advantages of this framework is that its training is sensitive to the loss function unlike the other multi-label approaches that either assume a specific loss function or require a manual adaptation to each loss function. We empirically evaluate our instantiation of the HC-Search framework along with many existing multi-label learning algorithms on a variety of benchmarks by employing diverse task loss functions. Our results demonstrate that the performance of existing algorithms tends to be very similar in most cases and that the HC-Search approach is comparable and often better than all other algorithms across different loss functions.
1743Semi-supervised Matrix Completion for Cross-Lingual Text Classification,Min Xiao and Yuhong Guo,"Machine Learning Applications MLA
1744Novel Machine Learning Algorithms NMLA ","cross lingual classification
1745semi-supervised learning
1746matrix completion","MLA Machine Learning Applications General/other
1747NMLA Classification
1748NMLA Semisupervised Learning",Cross-lingual text classification is the task of assigning labels to a given document in a label-scarce target language by using a prediction model trained with labeled documents from a label-rich source language which is popularly studied in the natural language processing area as it can largely decrease the expensive manual annotation effort in the target language. In this work we proposed a novel semi-supervised representation learning approach to address this challenging task which discovers interlingual features by simultaneously performing semi-supervised matrix completion. To evaluate the proposed learning technique we conducted extensive experiments on eighteen cross language sentiment classification tasks with four different languages. The empirical results demonstrated the efficacy of our approach which outperformed the other comparison methods.
1749Approximate Lifting Techniques for Belief Propagation,Parag Singla Aniruddh Nath and Pedro Domingos,Reasoning under Uncertainty RU ,"Lifted Inference
1750Belief Propagation
1751Graphical Models","RU Graphical Models Other
1752RU Probabilistic Inference
1753RU Relational Probabilistic Models",Many AI applications need to explicitly represent the relational structure as well as handle uncertainty. First order probabilistic models combine the power of logic and probability to deal with such domains. A naive approach to inference in these models is to propositionalize the whole theory and carry out the inference on the ground network. Lifted inference techniques such as Lifted Belief Propagation; Singla & Domingos 2008 provide a scalable approach to inference by combining together groups of objects which behave identically. In many cases constructing the lifted network can itself be quite costly. In addition the exact lifted network is often very close in size to the fully propositionalized model. To overcome these problems we present approximate lifted inference which groups together similar but distinguishable objects and treats them as if they were identical. Early stopping terminates the execution of the lifted network construction at an early stage resulting in a coarser network. Noise tolerant hypercubes allow for marginal errors in the representation of the lifted network itself. Both of our algorithms can significantly speed-up the process of lifted network construction as well as result in much smaller models. The coarseness of the approximation can be adjusted depending on the accuracy required and we can bound the resulting error. Extensive evaluation on six domains demonstrates great efficiency gains with only minor or no loss in accuracy.
1754Cost-Based Query Optimization via AI Planning,Nathan Robinson Sheila Mcilraith and David Toman,"Knowledge Representation and Reasoning KRR
1755Planning and Scheduling PS ","relational query optimization
1756delete-free planning
1757cost-optimal planning
1758heuristic search
1759applications of planning","KRR Knowledge Representation General/Other
1760PS Deterministic Planning
1761PS Planning General/Other ",The generation of high quality query plans is at the heart of query processing in traditional database management systems as well as in heterogeneous distributed data sources on corporate intranets and in the cloud. A diversity of techniques are employed for query plan generation and optimization many of them proprietary. In this paper we revisit the problem of generating a query plan using AI automated planning. Characterizing query planning as AI planning enables us to leverage state-of-the-art planning techniques -- techniques which have proven to be highly effective for a diversity of dynamical reasoning tasks. While our long-term view is broad here our efforts focus on the specific problem of cost-based join-order optimization a central component of production-quality query optimizers. We characterize the general query planning problem as a delete-free planning problem and query plan optimization as a context-sensitive cost-optimal planning problem. We propose algorithms that generate high quality query plans guaranteeing optimality under certain conditions. Our approach is general supporting the use of a broad suite of domain-independent and domain-specific optimization criteria. Experimental results demonstrate the effectiveness of AI planning techniques for query plan generation and optimization.
1762Efficient buyer groups for prediction-of-use electricity tariffs,Valentin Robu Meritxell Vinyals Alex Rogers and Nick Jennings,"Computational Sustainability and AI CSAI
1763Game Theory and Economic Paradigms GTEP
1764Multiagent Systems MAS ","electricity tariff
1765group buying
1766smart grid","CSAI Modeling the interactions of agents with different and often conflicting interests
1767CSAI Support for public engagement and decision making by the public
1768GTEP Coordination and Collaboration
1769MAS Coordination and Collaboration
1770MAS Evaluation and Analysis Multiagent Systems ",Current electricity tariffs do not reflect the real cost that customers incur to suppliers as units are charged at the same rate regardless of how predictable each customer's consumption is. A recent proposal to address this problem is prediction-of-use tariffs. In such tariffs a customer is asked in advance to predict her future consumption and is charged based both on her actual consumption and the deviation from her prediction. Prior work studied the cost game induced by a single such tariff and showed consumers would have an incentive to minimize their risk by joining together when buying electricity as a grand coalition. In this work we study the efficient i.e. cost-minimizing structure of buying groups for the more realistic setting when multiple competing prediction-of-use tariffs are available. We propose a polynomial time algorithm to compute efficient buyer groups and validate our approach experimentally using a large-scale data set of domestic electricity consumers in the UK.
1771Distribution-Aware Sampling and Weighted Model Counting for SAT,Supratik Chakraborty Daniel J. Fremont Kuldeep S. Meel Sanjit A. Seshia and Moshe Vardi,Search and Constraint Satisfaction SCS ,"Weighted Model Counting
1772Weight Generation
1773SAT","SCS SAT and CSP Evaluation and Analysis
1774SCS SAT and CSP Solvers and Tools
1775SCS Satisfiability General/Other ","Given a CNF formula and a weight for each assignment of values to
1776variables two natural problems are weighted model counting and
1777distribution-aware sampling of satisfying assignments. Both problems
1778have a wide variety of important applications. Due to the inherent
1779complexity of the exact versions of the problems interest has focused
1780on solving them approximately. Prior work in this area scaled only to
1781small problems in practice or failed to provide strong theoretical
1782guarantees or employed a computationally-expensive maximum a posteriori
1783probability MAP oracle that assumes prior knowledge of a
1784factored representation of the weight distribution. We present a
1785novel approach that works with a black-box oracle for weights of
1786assignments and requires only an NP-oracle to solve both the
1787counting and sampling problems. Our approach works
1788under mild assumptions on the distribution of weights of satisfying
1789assignments provides strong theoretical guarantees and scales to
1790problems involving several thousand variables. We also show that the
1791assumptions can be significantly relaxed if a factored representation
1792of the weights is known."
1793Online and Stochastic Learning with a Human Cognitive Bias,Hidekazu Oiwa and Hiroshi Nakagawa,"Human-Computation and Crowd Sourcing HCC
1794Novel Machine Learning Algorithms NMLA ","Machine Learning
1795Human Cognitive Bias
1796Online Learning
1797Stochastic Learning
1798Endowment effect","HCC Optimality in the context of human computation
1799NMLA Classification
1800NMLA Online Learning",Sequential learning for classification tasks is an effective tool in the machine learning community. In sequential learning settings algorithms sometimes make incorrect predictions on data that were correctly classified in the past. This paper explicitly deals with such inconsistent prediction behavior. Our main contributions are 1 to experimentally show its effect for user utilities as a human cognitive bias 2 to formalize a new framework by internalizing this bias into the optimization problem 3 to develop new algorithms without memorization of the past prediction history and 4 to show some theoretical guarantees of our derived algorithm for both online and stochastic learning settings. Our experimental results show the superiority of the derived algorithm for problems involving human cognition.
1801Adaptive Singleton-based Consistencies,Amine Balafrej Christian Bessiere Gilles Trombettoni and El Houssine Bouyakhf,Search and Constraint Satisfaction SCS ,"CSP
1802Singleton-based Consistencies
1803Adaptive Consistencies",SCS Constraint Satisfaction,"Singleton-based consistencies have been shown to dramatically
1804 improve the performance of constraint solvers on some difficult
1805 instances. However they are in general too expensive to be applied
1806 exhaustively during the whole search. In this paper we focus on
1807 partition-one-AC a singleton-based consistency which as opposed
1808 to singleton arc consistency is able to prune values on all
1809 variables at each singleton test.
1810 We propose adaptive variants of partition-one-AC that do not
1811 necessarily run until having proved the fixpoint. The pruning
1812 can be weaker than the full version but the computational effort
1813 can be significantly reduced. Our experiments
1814 show that adaptive partition-one-AC can obtain significant speedups over arc
1815 consistency and over the full version of partition-one-AC."
1816Scheduling for Transfers in Pickup and Delivery Problems with Very Large Neighborhood Search,Brian Coltin and Manuela Veloso,Planning and Scheduling PS ,"scheduling
1817transfers
1818PDP","HSO Heuristic Search
1819HSO Metareasoning and Metaheuristics
1820PS Scheduling",In pickup and delivery problems PDPs vehicles pick up and deliver a set of items under various constraints. We extend the well-studied PDP by allowing vehicles to transfer items to and from one another. By scheduling transfers the fleet of vehicles can deliver the items faster and at lower cost. We introduce the Very Large Neighborhood Search with Transfers VLNS-T algorithm to form schedules for PDPs with transfers. We show that the VLNS-T algorithm makes use of transfers to improve upon the best known solutions for selected benchmark problems and demonstrate its effectiveness on real world taxi data in New York City.
1821Unsupervised Alignment of Natural Language Instructions with Video Segments,Iftekhar Naim Young Song Qiguang Liu Henry Kautz Jiebo Luo and Daniel Gildea,"Machine Learning Applications MLA
1822NLP and Machine Learning NLPML ","Unsupervised Video Alignment
1823Grounded Language Acquisition
1824HMM
1825IBM Model 1
1826Language and Vision","MLA Applications of Unsupervised Learning
1827NLPML Natural Language Processing General/Other
1828VIS Language and Vision",We propose an unsupervised learning algorithm for automatically inferring the mappings between English nouns and corresponding video objects. Given a sequence of natural language instructions and an unaligned video recording we simultaneously align each instruction to its corresponding video segment and also align nouns in each instruction to their corresponding objects in video. While existing grounded language acquisition algorithms rely on pre-aligned supervised data each sentence paired with corresponding image frame or video segment our algorithm aims to automatically infer the alignment from the temporal structure of the video and parallel text instructions. We propose two generative models that are closely related to the HMM and IBM 1 word alignment models used in statistical machine translation. We evaluate our algorithm on videos of biological experiments performed in wetlabs and demonstrate its capability of aligning video segments to text instructions and matching video objects to nouns in the absence of any direct supervision.
1829Detecting information-dense texts in multiple news domains,Yinfei Yang and Ani Nenkova,"AI and the Web AIW
1830NLP and Knowledge Representation NLPKR
1831NLP and Machine Learning NLPML
1832NLP and Text Mining NLPTM ","writing style
1833information-dense text
1834summarization","AIW Human language technologies for web systems including text summarization and machine translation
1835NLPKR Semantics and Summarization
1836NLPML Text Classification
1837NLPTM Natural Language Processing General/Other ",In this paper we introduce the task of identifying information-dense texts which report important factual information in direct succinct manner. We describe a procedure that allows us to label automatically a large training corpus of New York Times texts. We train a classifier based on lexical discourse and unlexicalized syntactic features and test its performance on a set of manually annotated articles from international relations U.S. politics sports and science domains. Our results indicate that the task is feasible and that both syntactic and lexical features are highly predictive for the distinction. We observe considerable variation of prediction accuracy across domains and find that domain-specific models are more accurate.
1838Smarter Than You Think Acquiring Comparative Commonsense from the Web,Niket Tandon Gerard de Melo and Gerhard Weikum,"AI and the Web AIW
1839NLP and Text Mining NLPTM ","commonsense knowledge
1840information extraction
1841word sense disambiguation","AIW Knowledge acquisition from the web
1842NLPTM Information Extraction",This paper presents a method for automatically constructing a large comparative commonsense knowledge base from Big Data. The resulting knowledge base is semantically refined and organized. Our method is based on linear optimization methods to clean and consolidate the noisy input knowledge while also inferring new information. Our method achieves a high precision while maintaining good coverage.
1843A reasoner for the RCC-5 and RCC-8 calculi extended with constants,Stella Giannakopoulou Charalampos Nikolaou and Manolis Koubarakis,"Knowledge Representation and Reasoning KRR
1844Reasoning under Uncertainty RU
1845Search and Constraint Satisfaction SCS ","Qualitative spatial reasoning
1846Constraint Satisfaction Problems
1847Landmarks","KRR Computational Complexity of Reasoning
1848KRR Geometric Spatial and Temporal Reasoning
1849KRR Qualitative Reasoning
1850RU Uncertainty in AI General/Other
1851SCS Constraint Satisfaction","The problem of checking the consistency in qualitative calculi that contain both unknown and known entities constants i.e. real geometries has recently appeared and has applications in many areas. Until now all the approaches are theoretical and no implementation has been proposed. In this paper we present the first reasoner that takes as input RCC-5 or RCC-8 networks that involve entities with specific geometries and decides their consistency. We investigate the performance of the
1852reasoner and contrary to lots of other works in this area we consider real datasets in our experimental analysis."
1853Schedule-based Robotic Search for Multiple Residents in a Retirement Home Environment,Markus Schwenk Tiago Vaquero and Goldie Nejat,"Planning and Scheduling PS
1854Reasoning under Uncertainty RU
1855Robotics ROB ","Uncertainty in AI
1856Probabilistic Planning
1857Temporal Planning
1858Robotics","PS Probabilistic Planning
1859PS Temporal Planning
1860RU Uncertainty in AI General/Other
1861ROB Robotics General/Other ",In this paper we address the planning problem of a robot searching for multiple residents in a retirement home in order to remind them of an upcoming multi-person recreational activity before a given deadline. We introduce a novel Multi-User Schedule Based M-USB Search approach which generates a high-level-plan to maximize the number of residents that are found within the given time frame. From the schedules of the residents the layout of the retirement home environment as well as direct observations by the robot we obtain spatio-temporal likelihood functions for the individual residents. The main contribution of our work is the development of a novel approach to compute a reward to find a search plan for the robot using 1 the likelihood functions 2 the availabilities of the residents and 3 the order in which the residents should be found. Simulations were conducted on a floor of a real retirement home to compare our proposed M-USB Search approach to a Weighted Informed Walk and a Random Walk. Our results show that the proposed M-USB Search finds residents in a shorter amount of time by visiting fewer rooms when compared to the other approaches.
1862Diagram Understanding in Geometry Problems,Min Joon Seo Hannaneh Hajishirzi Ali Farhadi and Oren Etzioni,Applications APP ,"Diagram Understanding
1863Submodular Optimization
1864Language and Vision","APP Other Applications
1865VIS Language and Vision","Automatically solving geometry questions is a long-standing AI
1866problem. A geometry question typically includes a textual description
1867accompanied by a diagram. The first step in solving geometry
1868questions is diagram understanding which consists of identifying visual
1869elements in the diagram their location their geometric properties
1870and aligning them to corresponding textual descriptions. In this
1871paper we present a method for diagram understanding that identifies
1872visual elements in a diagram while maximizing agreement between
1873textual and visual data. We show that the method's objective function
1874is submodular; thus we are able to introduce an efficient method for
1875diagram understanding that is close to optimal. To empirically
1876evaluate our method we compile a new dataset of geometry questions
1877 textual descriptions and diagrams and compare with baselines that
1878utilize standard vision techniques. Our experimental evaluation shows
1879an F1 boost of more than 17% in identifying visual elements and 25% in
1880aligning visual elements with their textual descriptions."
1881Latent Domains Modeling for Domain Adaptation,Caiming Xiong Scott McCloskey and Jason Corso,"Machine Learning Applications MLA
1882Novel Machine Learning Algorithms NMLA
1883Vision VIS ","latent model
1884local linear subspace
1885domain adaptation","MLA Applications of Unsupervised Learning
1886MLA Machine Learning Applications General/other
1887NMLA Clustering
1888NMLA Feature Construction/Reformulation
1889NMLA Transfer Adaptation Multitask Learning
1890NMLA Semisupervised Learning
1891VIS Categorization
1892VIS Object Recognition
1893VIS Statistical Methods and Learning","To improve robustness to significant mismatches between
1894source domain and target domain - arising from changes such
1895as illumination pose and image quality - domain adaptation
1896is increasingly popular in computer vision. But most methods
1897assume that the source data is from a single domain or that
1898multi-domain datasets provide the domain label for training
1899instances. In practice most datasets are mixtures of multiple
1900latent domains and it is difficult to manually provide the domain
1901label of each data point. In this paper we propose a model
1902that automatically discovers latent domains in visual datasets.
1903We first assume the visual images are sampled from multiple
1904manifolds each of which represents a different domain
1905and which are represented by different subspaces. Using the
1906neighborhood structure estimated from images belonging to
1907the same category we approximate the local linear invariant
1908subspace for each image based on its local structure eliminating
1909the category-specific elements of the feature. Based
1910on the effectiveness of this representation we then propose a
1911squared-loss mutual information based clustering model with
1912category distribution prior in each domain to infer the domain
1913assignment for images. In experiments we test our approach
1914on two common image datasets; the results show that
1915our method outperforms the existing state-of-the-art methods
1916and also show the superiority of multiple latent domain discovery"
1917Improving Domain-independent Cloud-based Speech Recognition with Domain-dependent Phonetic Post-processing,Johannes Twiefel Timo Baumann Stefan Heinrich and Stefan Wermter,"AI and the Web AIW
1918Applications APP
1919NLP and Knowledge Representation NLPKR
1920Robotics ROB ","speech recognition
1921phonetics
1922domain-dependent knowledge","AIW Human language technologies for web systems including text summarization and machine translation
1923APP Intelligent User Interfaces
1924NLPKR Natural Language Processing General/Other
1925ROB Human-Robot Interaction",Automated speech recognition ASR technology has been developed to such a level that off-the-shelf distributed speech recognition services are available free of cost that allow researchers to integrate speech into their applications with little development effort or expert knowledge leading to better results compared with previously used open-source tools. Often however such services do not accept language models or grammars but process free speech from any domain. While results are very good given the enormous size of the search space results frequently contain out-of-domain words or constructs that cannot be understood by subsequent domain-dependent natural language understanding NLU components. In this paper we present a versatile post-processing technique based on phonetic distance that integrates domain knowledge with open-domain ASR results leading to improved ASR performance. Notably our technique is able to make use of domain restrictions using various degrees of domain knowledge ranging from pure vocabulary restrictions via grammars or N-grams to restrictions of the acceptable utterances. We present results for a variety of corpora mainly from human-robot interaction where our combined approach significantly outperforms Google ASR as well as a plain open-source ASR solution.
1926A Strategy-Aware Technique for Learning Behaviors from Discrete Human Feedback,Robert Loftin James MacGlashan Bei Peng Michael Littman Matthew E. Taylor Jeff Huang and David Roberts,"Humans and AI HAI
1927Novel Machine Learning Algorithms NMLA ","Learning from Feedback
1928Human Computer Interaction
1929Reinforcement Learning
1930Canine Learning","APP Philosophical and Ethical Issues
1931CM Bayesian Learning
1932HCC Active learning from imperfect human labelers
1933HAI Human-Computer Interaction
1934NMLA Bayesian Learning
1935NMLA Reinforcement Learning
1936ROB Human-Robot Interaction",This paper introduces two novel algorithms SABL and I-SABL for learning behaviors from human-provided rewards. The primary novelty of these algorithms is that instead of treating the feedback as a numeric reward signal they interpret feedback as a form of discrete communication that depends on both the behavior the trainer is trying to teach and the teaching strategy used by the trainer. For example some humans use a strategy where the lack of feedback may indicate whether the action was correct or incorrect and interpreting this lack of feedback accurately can significantly improve learning speed. Results from user studies show that 1 humans use a variety of training strategies in practice and 2 both algorithms can successfully learn a contextual bandit task faster than approaches that treat the feedback as numeric. Additionally simulated trainers are employed to evaluate the algorithms in both contextual bandit and sequential decision-making domains with similar results.
1937Learning Instance Concepts from Multiple-Instance Data with Bags as Distributions,Gary Doran and Soumya Ray,,"multiple-instance learning
1938supervised learning
1939classification","NMLA Evaluation and Analysis Machine Learning
1940NMLA Supervised Learning Other
1941NMLA Machine Learning General/other ",We analyze and evaluate a generative process for multiple-instance learning MIL in which bags are distributions over instances. We show that our generative process contains as special cases generative models explored in prior work while excluding scenarios known to be hard for MIL. Further under the mild assumption that every negative instance is observed with nonzero probability in some negative bag we show that it is possible to learn concepts that accurately label instances from MI data in this setting. Finally we show that standard supervised approaches can learn concepts with low area-under-ROC error from MI data in this setting. We validate this surprising result with experiments using several real-world MI datasets that have been annotated with instance labels.
1942Tractability through Exchangeability A New Perspective on Efficient Probabilistic Inference,Mathias Niepert and Guy Van den Broeck,Reasoning under Uncertainty RU ,"efficient inference
1943lifted inference
1944probabilistic inference
1945exchangeability
1946statistical relational learning","RU Probabilistic Inference
1947RU Relational Probabilistic Models",Exchangeability is a central notion in statistics and probability theory. The assumption that an infinite sequence of data points is exchangeable is at the core of Bayesian statistics. However finite exchangeability as a statistical property that renders probabilistic inference tractable is less well-understood. We develop a theory of finite exchangeability and its relation to tractable probabilistic inference. The theory is complementary to that of independence and conditional independence. We show that tractable inference in probabilistic models with high treewidth and millions of variables can be explained with the notion of finite partial exchangeability. We also show that existing lifted inference algorithms implicitly utilize a combination of conditional independence and partial exchangeability.
1948Collaborative Models for Referring Expression Generation in Situated Dialogue,Rui Fang Malcolm Doering and Joyce Chai,NLP and Machine Learning NLPML ,"referring expression generation
1949collaborative models
1950situated dialogue","NLPML Discourse and Dialogue
1951NLPML Natural Language Processing General/Other ",In situated dialogue with artificial agents e.g. robots although a human and an agent are co-present the agent's representation and the human's representation of the shared environment are significantly mismatched. Because of this misalignment previous work has shown that when the agent applies traditional approaches to generate referring expressions to describe target objects the intended objects often cannot be correctly identified by the human. To address this problem motivated by collaborative behaviors in human referential communication we have developed two collaborative models - an episodic model and an installment model - for referring expression generation. In both models instead of generating a single referring expression to describe a target object as in the previous work it generates multiple small expressions that lead to the target object with a goal to minimize the collaborative effort. In particular our installment model incorporates human feedback in a reinforcement learning framework to learn the optimal generation strategies. Our empirical results have shown that the episodic model and the installment model outperform previous non-collaborative models with an absolute gain of 6% and 21% respectively.
1952Efficient Optimization for Autonomous Manipulation of Natural Objects,Abdeslam Boularias J. Andrew Bagnell and Anthony Stentz,"Machine Learning Applications MLA
1953Robotics ROB ","Robotic grasping
1954Planning
1955Bayesian optimization
1956Gaussian Processes
1957Anytime optimization","MLA Machine Learning Applications General/other
1958ROB Behavior and Control
1959ROB Robotics General/Other ",Manipulating irregular natural objects such as rocks is an essential capability of robots operating in outdoor environments. Previous studies have shown that stable grasps for known man-made objects can usually be planned by using physics-based simulators. However planning is an expensive process that requires simulating hand and object trajectories in different configurations and evaluating the outcome of each trajectory. This problem is particularly concerning when the objects are irregular and cluttered because the space of stable grasps is significantly smaller and more configurations need to be evaluated before finding a good one. We present a learning approach based on template matching for fast detection of a small initial set of potentially stable grasps in a cluttered scene using depth features. The predicted best grasps are further optimized by fine-tuning the configuration of the hand in simulation. To reduce the computational cost of this last operation we model the predicted outcomes of the grasps as a Gaussian Process and use an entropy-search method in order to focus the optimization on regions where the best grasp configuration is most likely to be. This approach is tested on the challenging task of clearing piles of real unknown rock debris using an autonomous robot. Empirical results show a clear advantage of this approach.
1960Effective Management of Electric Vehicle Storage using Smart Charging,Konstantina Valogianni Wolfgang Ketter John Collins and Dmitry Zhdanov,Computational Sustainability and AI CSAI ,"Electric Vehicles
1961Smart Grid
1962Optimization
1963Reinforcement Learning",CSAI Control and optimization of dynamic and spatiotemporal systems,The growing popularity of Electric Vehicles (EVs) among commuters creates new challenges for the smart grid. The most important of them is the uncoordinated EV charging that substantially increases the energy demand peaks putting the smart grid under constant strain. In order to cope with these peaks the grid needs extra infrastructure a costly solution. We propose an Adaptive Management of EV Storage (AMEVS) algorithm implemented through a learning agent that acts on behalf of individual EV owners and schedules EV charging over a weekly horizon. It accounts for individual preferences so that mobility service is not violated but also individual benefit is maximized. We observe that it reshapes the energy demand making it less volatile so that fewer resources are needed to cover peaks. It assumes Vehicle-to-Grid discharging when the customer has excess capacity. Our agent uses Reinforcement Learning trained on real world data to learn individual household consumption behavior and to schedule EV charging. Unlike previous work AMEVS is a fully distributed approach. We show that AMEVS achieves significant reshaping of the energy demand curve and peak reduction which is correlated with customer preferences regarding perceived utility of energy availability. Additionally we show that the average and peak energy prices are reduced as a result of smarter energy use.
1964Social Planning Achieving Goals by Altering Others' Mental States,Chris Pearce Ben Meadows Pat Langley and Mike Barley,"Cognitive Systems CS
1965Planning and Scheduling PS ","Cognitive Systems
1966Social Planning
1967Deception","CS Conceptual inference and reasoning
1968CS Social cognition and interaction
1969CS Problem solving and decision making
1970KRR Reasoning with Beliefs
1971PS Planning General/Other ","In this paper we discuss a computational approach to the cognitive
1972task of social planning. First we specify a class of planning
1973problems that involve an agent who attempts to achieve its goals
1974by altering other agents' mental states. Next we describe SFPS
1975a flexible problem solver that generates social plans of this sort
1976including ones that include deception and reasoning about other
1977agents' beliefs. We report the results for experiments on social
1978scenarios that involve different levels of sophistication and that
1979demonstrate both SFPS' capabilities and the sources of its power.
1980Finally we discuss how our approach to social planning has been
1981informed by earlier work in the area and propose directions for
1982additional research on the topic."
1983Feature-Cost Sensitive Learning with Submodular Trees of Classifiers,Matt Kusner Wenlin Chen Quan Zhou Eddie Xu and Kilian Weinberger,Novel Machine Learning Algorithms NMLA ,"submodular optimization
1984feature-cost sensitive learning
1985tree-based learning","NMLA Classification
1986NMLA Supervised Learning Other ",During the past decade machine learning algorithms have become commonplace in large-scale real-world industrial applications. In these settings the computation time to train and test machine learning algorithms is a key consideration. At training-time the algorithms must scale to very large data set sizes. At testing-time the cost of feature extraction can dominate the CPU runtime. Recently a promising method was proposed to account for the feature extraction cost at testing time called Cost-sensitive Tree of Classifiers (CSTC). Although the CSTC problem is NP-hard the authors suggest an approximation through a mixed-norm relaxation across many classifiers. This relaxation is slow to train and requires involved optimization hyperparameter tuning. We propose a different relaxation using approximate submodularity called Approximately Submodular Tree of Classifiers (ASTC). ASTC is much simpler to implement yields equivalent results but requires no optimization hyperparameter tuning and is up to two orders of magnitude faster to train.
1987Multiagent Metareasoning Through Organizational Design,Jason Sleight and Ed Durfee,"Multiagent Systems MAS
1988Planning and Scheduling PS
1989Reasoning under Uncertainty RU ","organizational design
1990Dec-MDP
1991multiagent metareasoning","MAS Coordination and Collaboration
1992MAS Multiagent Planning
1993MAS Multiagent Systems General/other
1994PS Markov Models of Environments
1995PS Model-Based Reasoning
1996PS Probabilistic Planning
1997RU Decision/Utility Theory
1998RU Sequential Decision Making",We formulate an approach to multiagent metareasoning that uses organizational design to focus each agent's reasoning on the aspects of their respective local problems to which they can make the most worthwhile contributions to joint behavior. By employing the decentralized Markov decision process framework we characterize an organizational design problem that explicitly considers the quantitative impact that a design has on both the quality of the agents' behaviors and their reasoning costs. We describe an automated organizational design process that can approximately solve our organizational design problem via incremental search and present techniques that efficiently estimate the incremental impact of a candidate organizational influence. Our empirical evaluation confirms that our process generates organizational designs that impart a desired metareasoning regime upon the agents.
1999Experiments on Visual Information Extraction with the Faces of Wikipedia,Md. Kamrul Hasan and Christopher Pal,"AI and the Web AIW
2000Vision VIS ","Web mining
2001Information extraction
2002Text processing
2003Face verification
2004Identity resolution
2005Face recognition","AIW Knowledge acquisition from the web
2006VIS Face and Gesture Recognition
2007VIS Image and Video Retrieval
2008VIS Language and Vision","We present a series of visual information extraction experiments
2009using the Faces of Wikipedia database - a new resource
2010that we release into the public domain for both recognition
2011and extraction research containing over 50000 identities and
201260000 disambiguated images of faces. We compare different
2013techniques for automatically extracting the faces corresponding
2014to the subject of a Wikipedia biography within the
2015images appearing on the page. Our top performing approach
2016is based on probabilistic graphical models and uses the text
2017of Wikipedia pages similarities of faces as well as various
2018other features of the document meta-data and image files.
2019Our method resolves the problem jointly for all detected faces
2020on a page. While our experiments focus on extracting faces
2021from Wikipedia biographies our approach is easily adapted
2022to other types of documents and multiple documents. We focus
2023on Wikipedia because the content is a Creative Commons
2024resource and we provide our database to the community including
2025registered faces hand labeled and automated disambiguations
2026processed captions meta data and evaluation protocols.
2027Our best probabilistic extraction pipeline yields an expected
2028average accuracy of 77% compared to image only and
2029text only baselines which yield 66% and 63% respectively."
2030Signals in the Silence Models of Implicit Feedback in a Recommender System for Crowdsourcing,Christopher Lin Ece Kamar and Eric Horvitz,"AI and the Web AIW
2031Applications APP
2032Human-Computation and Crowd Sourcing HCC
2033Machine Learning Applications MLA
2034Novel Machine Learning Algorithms NMLA ","Crowdsourcing
2035Recommendation Systems
2036Implicit Feedback
2037Matrix Factorization","AIW Crowdsourcing techniques and methodologies
2038AIW Web-based recommendation systems
2039APP Other Applications
2040HCC Programming languages tools and platforms to support human computation
2041MLA Machine Learning Applications General/other
2042NMLA Recommender Systems",We study the opportunity to exploit the absence of signals as informative observations in the context of providing task recommendations in crowdsourcing. Workers on crowdsourcing platforms do not provide explicit ratings about tasks. We present methods that enable a system to leverage implicit signals about task preferences. These signals include types of tasks that have been available and have been displayed and the number of tasks workers select and complete. In contrast to previous work we present a general model that can represent both positive and negative implicit signals. We introduce algorithms that can learn these models without exceeding the computational complexity of existing approaches. Finally using data from a large-scale high throughput crowdsourcing platform we show that reasoning about both positive and negative implicit feedback can improve the quality of task recommendations provided to workers.
2043A Convex Formulation for Semi-supervised Multi-Label Feature Selection,Xiaojun Chang Feiping Nie Yi Yang and Heng Huang,Machine Learning Applications MLA ,"Semi-supervised Learning
2044Multi-Label Feature Selection
2045Convex Algorithm","NMLA Classification
2046NMLA Dimension Reduction/Feature Selection","The explosive growth of multimedia data has brought the challenge of how to efficiently browse retrieve and organize these data. Under these circumstances different approaches have been proposed to facilitate multimedia analysis. Several semi-supervised feature selection algorithms have been proposed to exploit both labeled and unlabeled data. However they are implemented based on graphs such that they cannot handle large-scale datasets. How to conduct semi-supervised feature selection on large-scale datasets has become a challenging research problem. Moreover existing multi-label feature selection algorithms rely on eigen-decomposition with a heavy computational burden which further prevents current feature selection algorithms from being applied to big data. In this paper we propose a novel semi-supervised multi-label feature selection algorithm for large-scale
2047multimedia analysis. We evaluate the performance of the proposed algorithm over five benchmark datasets and compare the results with state-of-the-art supervised and
2048semi-supervised feature selection algorithms as well as a baseline using all features. The experimental results demonstrate that our proposed algorithm consistently achieves superior performance."
2049A Spatially Sensitive Kernel to Predict Cognitive Performance from Short-Term Changes in Neural Structure,Hidayath Ansari Michael Coen Barbara Bendlin Mark Sager and Sterling Johnson,Machine Learning Applications MLA ,"Machine Learning
2050Neuroimaging
2051Kernel Methods
2052Wide Data
2053Alzheimer's Disease","APP Biomedical / Bioinformatics
2054MLA Bio/Medicine
2055MLA Applications of Supervised Learning","This paper introduces a novel framework for performing machine learning on longitudinal neuroimaging datasets. These datasets are characterized by their size particularly their width (millions of features per input).
2056
2057Specifically we address the problem of detecting subtle short-term changes in neural structure that are indicative of cognitive decline and correlate with risk factors for Alzheimer's disease. We introduce a new spatially-sensitive kernel that allows us to reason about individuals as opposed to populations.
2058
2059In doing so this paper presents the first evidence demonstrating that very small changes in white matter structure over a two year period can predict change in cognitive function in healthy adults."
2060GP-Localize Persistent Mobile Robot Localization using Online Sparse Gaussian Process Observation Model,Nuo Xu Bryan Kian Hsiang Low Jie Chen Keng Kiat Lim and Etkin Ozgul,Robotics ROB ,"Robot localization
2061Gaussian process
2062Online learning","MLA Applications of Supervised Learning
2063NMLA Online Learning
2064ROB Localization Mapping and Navigation
2065ROB State Estimation","Central to robot exploration and mapping is the task of persistent localization in environmental fields characterized by spatially correlated measurements.
2066This paper presents a novel Gaussian process localization GP-Localize algorithm that in contrast to existing works can exploit the spatially correlated field measurements taken during a robot's exploration instead of relying on prior training data for efficiently and scalably learning the GP observation model online.
2067As a result GP-Localize is capable of achieving constant time and memory in the size of the data per filtering step which demonstrates the practical feasibility of using GPs for persistent robot localization.
2068Empirical evaluation via simulated experiments with real-world datasets and a real robot experiment shows that GP-Localize outperforms existing GP localization algorithms."
2069On Detecting Nearly Structured Preference Profiles,Martin Lackner and Edith Elkind,"Game Theory and Economic Paradigms GTEP
2070Multiagent Systems MAS ","single-peaked preferences
2071single-crossing preferences
2072approximation algorithms
2073forbidden configurations",GTEP Social Choice / Voting,Structured preference domains such as the domains of single-peaked and single-crossing preferences are known to admit efficient algorithms for many problems in computational social choice. Some of these algorithms extend to preferences that are close to having the respective structural property i.e. can be made to enjoy this property by making minor changes to voters' preferences such as deleting a small number of voters or candidates. However it has recently been shown that finding the optimal number of voters or candidates to delete in order to achieve the desired structural property is NP-hard for many such domains. In this paper we show that these problems admit efficient approximation algorithms. Our results apply to all domains that can be characterized in terms of forbidden configurations; this includes in particular single-peaked and single-crossing elections. For a large range of scenarios our approximation results are optimal under a plausible complexity-theoretic assumption. We also provide parameterized complexity results for this class of problems.
2074Relational One-Class Classification A Non-Parametric Approach,Tushar Khot Sriraam Natarajan and Jude Shavlik,"Novel Machine Learning Algorithms NMLA
2075Reasoning under Uncertainty RU ","One-class classification
2076Statistical Relational Learning
2077Ensemble learning","NMLA Ensemble Methods
2078NMLA Relational/Graph-Based Learning
2079RU Relational Probabilistic Models",One-class classification approaches have been proposed in the literature to learn classifiers from examples of only one class. But these approaches are not directly applicable to relational domains due to their reliance on feature vectors or distance measures. We propose a non-parametric relational one-class classification approach based on first-order trees. We learn a tree-based distance measure that iteratively introduces new relational features to differentiate relational examples. We update the distance measure so as to maximize the one-class classification performance of our model. We also relate our model definition to existing work on combination functions and density estimation. We also experimentally show that our approach can discover relevant features for this task and outperform three baseline approaches.
2080Using Narrative Function to Extract Qualitative Information from Natural Language Texts,Clifton McFate Kenneth Forbus and Thomas Hinrichs,Cognitive Systems CS ,"Cognitive Systems
2081Qualitative Representation
2082Natural Language Understanding",CS Natural language understanding and dialogue,Understanding natural language about the continuous world is an important problem for cognitive systems. The naturalness of qualitative reasoning suggests that qualitative representations might be an important component of the semantics of natural language. Prior work showed that frame-based representations of qualitative process theory constructs could indeed be extracted from natural language texts. That technique relied on the parser recognizing specific syntactic constructions which had limited coverage. This paper describes a new approach using narrative function to represent the higher-order relationships between the constituents of a sentence and between sentences in a discourse. We outline how narrative function combined with query-driven abduction enables the same kinds of information to be extracted from natural language texts. Moreover we also show how the same technique can be used to extract type-level qualitative representations from text and used to improve performance in playing a strategy game.
2083A spatio-temporal pattern mining algorithm to identify objects in a continuous field A global oceanography perspective,James Faghmous Hung Nguyen Matthew Le and Vipin Kumar,Computational Sustainability and AI CSAI ,"spatio-temporal data mining
2084oceanography
2085pattern mining","CSAI Modeling and prediction of dynamic and spatiotemporal phenomena and systems
2086CSAI Control and optimization of dynamic and spatiotemporal systems
2087CSAI Modeling and control of complex high-dimensional systems
2088CSAI Sensor networks for monitoring environments
2089NMLA Data Mining and Knowledge Discovery
2090NMLA Time-Series/Data Streams
2091NMLA Unsupervised Learning Other ",Mesoscale ocean eddies are a critical component of the Earth System as they dominate the ocean's kinetic energy and impact the global distribution of oceanic heat salinity momentum and nutrients. Thus accurately representing these dynamic features is critical for our planet's sustainability. The majority of methods that identify eddies from satellite observations analyze the data on a frame-by-frame basis despite the fact that eddies are dynamic objects that propagate across space and time. We introduce the notion of spatio-temporal consistency to identify eddies in a continuous spatio-temporal field to simultaneously ensure that the features detected are both spatially consistent and temporally persistent. Our spatio-temporal consistency approach allows us to remove most of the expert criteria used in traditional methods and enables us to better render eddy dynamics by identifying smaller and longer lived eddies than existing methods.
2092Efficient codes for inverse dynamics during walking,Leif Johnson and Dana Ballard,Cognitive Modeling CM ,"machine learning
2093inverse dynamics
2094movement",CM Simulating Humans,Efficient codes have been used effectively in both computer science and neuroscience to better understand the information processing in visual and auditory encoding and discrimination tasks. In this paper we explore the use of efficient codes for representing information relevant to human movements during locomotion. Specifically we apply motion capture data to a physical model of the human skeleton to compute joint angles (inverse kinematics) and joint torques (inverse dynamics); then by treating the resulting data as a regression problem we investigate the effect of sparsity in mapping from angles to torques. The results of our investigation suggest that sparse codes can indeed represent salient features of both the kinematic and dynamic views of locomotion movements in humans. However sparsity appears to be only one parameter in building a model of inverse dynamics; we also show that the encoding process benefits significantly by integrating with the regression process for this task. Finally we use our results to argue that representations of movement are critical to modeling and understanding these movements.
2095Anytime Active Learning,Maria E. Ramirez-Loaiza Aron Culotta and Mustafa Bilgic,"Human-Computation and Crowd Sourcing HCC
2096Machine Learning Applications MLA ","active learning
2097non-uniform labeling costs
2098document classification","HCC Cost reliability and skill of labelers
2099NLPML Text Classification
2100NMLA Active Learning",A common bottleneck in deploying supervised learning systems is collecting human-annotated examples. In many domains annotators form an opinion about the label of an example incrementally --- e.g. each additional word read from a document or each additional minute spent inspecting a video helps inform the annotation. In this paper we investigate whether we can train learning systems more efficiently by requesting an annotation before inspection is fully complete --- e.g. after reading only 25 words of a document. While doing so may reduce the overall annotation time it also introduces the risk that the annotator might not be able to provide a label if interrupted too early. We propose an anytime active learning approach that optimizes the annotation time and response rate simultaneously. We conduct user studies on subsets of two document classification datasets and develop simulated annotators that mimic the users. Our simulated experiments show that anytime active learning outperforms several baselines on these two datasets. For example with an annotation budget of one hour training a classifier by annotating the first 25 words of each document reduces classification error by 17% over annotating the first 100 words of each document.
2101Elimination Ordering in Lifted First-Order Probabilistic Inference,Seyed Mehran Kazemi and David Poole,Reasoning under Uncertainty RU ,"Lifted inference
2102Elimination orderings
2103Probabilistic inference
2104Statistical relational AI","RU Probabilistic Inference
2105RU Relational Probabilistic Models",Various representations and inference methods have been proposed for lifted probabilistic inference in relational models. Many of these methods choose an order to eliminate or branch on the parametrized random variables. Similar to such methods for non-relational probabilistic inference the order of elimination has a significant role in the performance of the algorithm. Since finding the best order is NP-complete even for non-relational models heuristics have been proposed to find good orderings in the non-relational models. We show that these heuristics are inefficient for relational models because they fail to consider the population sizes associated with logical variables in the parametrized random variable. In this paper we extend existing heuristics for non-relational models and propose new heuristics for relational models. We evaluate the existing and new heuristics on a range of generated relational graphs.
2106Predicting Postoperative Atrial Fibrillation from Independent ECG Components,Chih-Chun Chia James Blum Zahi Karam Satinder Singh and Zeeshan Syed,"Applications APP
2107Machine Learning Applications MLA ","atrial fibrillation
2108independent components
2109medicine","APP Biomedical / Bioinformatics
2110MLA Bio/Medicine",Postoperative atrial fibrillation PAF occurs in 10% to 65% of the patients undergoing cardiac surgery. It is associated with increased postoperative mortality and morbidity and also results in longer and more expensive hospital stays. Accurately stratifying patients for PAF allows for the selective use of prophylactic therapies e.g. amiodarone to reduce this burden. Our proposed work addresses this need through the development of novel electrocardiographic ECG markers that can be easily deployed in a clinical setting to identify patients at risk of PAF. Specifically we explore a novel eigen-decomposition approach that first partitions ECG signals into atrial and ventricular components by exploiting knowledge of the underlying cardiac cycle. We then quantify cardiac instability manifesting as probabilistic variations in atrial ECG morphology to assess the risk of PAF. When evaluated on a cohort of 385 patients undergoing cardiac surgery our proposed approach based on an analysis of decoupled ECG components demonstrated substantial promise in identifying patients at risk of PAF and improved clinical models both in terms of discrimination and reclassification relative to the use of existing clinical metrics.
2111Online Multi-Task Learning via Sparse Dictionary Optimization,Paul Ruvolo and Eric Eaton,Novel Machine Learning Algorithms NMLA ,"multi-task learning
2112transfer learning
2113lifelong learning
2114sparse coding
2115k-svd","NMLA Online Learning
2116NMLA Transfer Adaptation Multitask Learning",This paper develops an efficient online algorithm for learning multiple consecutive tasks based on the K-SVD algorithm for sparse dictionary optimization. We first derive a batch multi-task learning method that builds upon K-SVD and then extend the batch algorithm to train models online in a lifelong learning setting. The resulting method has lower computational complexity than other current lifelong learning algorithms while maintaining nearly identical performance. Additionally the proposed method offers an alternate formulation for lifelong learning that supports both task and feature similarity matrices.
2117Betting Strategies Market Selection and the Wisdom of Crowds,Willemien Kets David Pennock Rajiv Sethi and Nisarg Shah,Game Theory and Economic Paradigms GTEP ,"Prediction market
2118Market selection
2119Kelly betting
2120CRRA utilities",GTEP Auctions and Market-Based Systems,We investigate the limiting behavior of trader wealth and prices in a simple prediction market with a finite set of participants having heterogeneous beliefs. Traders bet repeatedly on the outcome of a binary event with fixed Bernoulli success probability. A class of strategies including fractional Kelly betting and constant relative risk aversion CRRA is considered. We show that when traders are willing to risk only a small fraction of their wealth in any period belief heterogeneity can persist indefinitely; if bets are large in proportion to wealth then only the most accurate belief type survives. The market price is more accurate in the long run when traders with less accurate beliefs also survive. That is the survival of traders with heterogeneous beliefs some less accurate than others allows the market price to better reflect the objective probability of the event in the long run.
2121Where and Why Users Check In ,Yoon-Sik Cho Greg Ver Steeg and Aram Galstyan,"AI and the Web AIW
2122Applications APP ","Location Based Social Network
2123Point Processes
2124Temporal Clustering
2125Social Network Analysis","AIW Social networking and community identification
2126APP Computational Social Science
2127APP Social Networks",The emergence of location based social network LBSN services makes it possible to study individuals mobility patterns at a fine-grained level and to see how they are impacted by social factors. In this study we analyze the check-in patterns in LBSN and observe significant temporal clustering of check-in activities. We explore how self-reinforcing behaviors social factors and exogenous effects contribute to this clustering and introduce a framework to distinguish these effects at the level of individual check-ins for both users and venues. Using check-in data from three major cities we show not only that our model can improve prediction of future check-ins but also that disentangling of different factors allows us to infer meaningful properties of different venues.
2128Designing Fast Absorbing Markov Chains,Stefano Ermon Carla Gomes Ashish Sabharwal and Bart Selman,"Heuristic Search and Optimization HSO
2129Novel Machine Learning Algorithms NMLA ","MCMC
2130Markov Chain
2131Absorption time","HSO Evaluation and Analysis Search and Optimization
2132HSO Search General/Other
2133NMLA Machine Learning General/other ","Markov Chains are a fundamental tool for the analysis of real world
2134phenomena and randomized algorithms. Given a graph with some specified
2135sink nodes and an initial probability distribution
2136we consider the problem of designing an absorbing Markov
2137Chain that minimizes the time required to reach a sink node by
2138selecting transition probabilities subject to some natural regularity
2139constraints. By exploiting the Markovian structure we obtain closed
2140form expressions for the objective function as well as its gradient
2141which can be thus evaluated efficiently without any simulation of the
2142underlying process and fed to a gradient-based optimization
2143package. For the special case of designing reversible Markov Chains
2144we show that global optimum can be efficiently computed by exploiting
2145convexity. We demonstrate how our method can be used to
2146evaluate and design local search methods tailored for certain
2147domains."
2148Symbolic Model Checking Epistemic Strategy Logic,Xiaowei Huang and Ron van der Meyden,Multiagent Systems MAS ,"Logic of Knowledge
2149Strategic Reasoning
2150Model Checking","MAS Coordination and Collaboration
2151MAS Evaluation and Analysis Multiagent Systems ",This paper presents a symbolic BDD-based model checking algorithm for an epistemic strategy logic with observational semantics. The logic has been shown to be more expressive than several variants of ATEL and therefore the algorithm can also be used for ATEL model checking. We implement the algorithm in a model checker and apply it to several applications. The performance of the algorithm is also reported with a comparison with a partially symbolic approach for ATEL model checking.
2152A Scheduler for Actions with Iterated Durations,James Paterson and Brian Williams,Planning and Scheduling PS ,"Scheduling
2153Loops
2154Preference
2155Optimization",PS Scheduling,"A wide range of robotic missions contain actions that exhibit looping behavior. Examples of these actions include picking fruit in agriculture pick-and-place tasks in manufacturing or even search patterns in robotic search or survey missions. These looping actions often have a range of acceptable values for the number of loops and a preference function over them. For example during robotic survey missions the information gain is expected to increase with the number of loops in a search pattern. Since these looping actions also take time which is typically bounded there is a challenge of maximizing utility while respecting time constraints.
2156
2157In this paper we introduce the Looping Temporal Problem with Preference LTPP as a formalism for encoding scheduling problems that contain looping actions. In addition we introduce a scheduling algorithm for LTPPs which leverages the structure of the problem to find the optimal solution efficiently."
2158Materials Discovery - Synthetic and Real World Datasets,John M. Gregoire Santosh Suram Ronan Le Bras Richard Bernstein Carla Gomes Bart Selman and R. Bruce Van Dover,"Computational Sustainability and AI CSAI
2159Heuristic Search and Optimization HSO
2160Machine Learning Applications MLA
2161Search and Constraint Satisfaction SCS ","Materials discovery
2162Dataset
2163Phase-map identification problem","CSAI Modeling and control of complex high-dimensional systems
2164MLA Applications of Unsupervised Learning
2165SCS Constraint Optimization","Newly-discovered materials have been central to recent technological advances. They have contributed significantly to breakthroughs in electronics renewable energy and green buildings and overall have promoted the advancement of global human welfare. Yet only a fraction of all possible materials have been explored. Accelerating the pace of discovery of materials would foster technological innovations and would potentially address pressing issues in sustainability such as energy production or consumption.
2166
2167The bottleneck of this discovery cycle lies however in the analysis of the materials data. As materials scientists have recently devised techniques to efficiently create thousands of materials and experimentalists have developed new methods and tools to characterize these materials the limiting factor has become the data analysis itself. Hence the goal of this paper is to stimulate the development of new computational techniques for the analysis of materials data by bringing together the complementary expertise of materials scientists and computer scientists.
2168
2169In collaboration with two major research laboratories in materials science we provide the first publicly available dataset for the phase map identification problem. In addition we provide a parameterized synthetic data generator to assess the quality of proposed approaches as well as tools for data visualization and solution evaluation."
2170Automatic Synthesis of Geometry Problems for an Intelligent Tutoring System,Christopher Alvin Sumit Gulwani Rupak Majumdar and Supratik Mukhopadhyay,Applications APP ,"Problem Synthesis
2171Automated Reasoning
2172Computer-Aided Education
2173Intelligent Tutor",APP Computer-Aided Education,This paper presents an intelligent tutoring system GeoTutor for Euclidean Geometry that is automatically able to synthesize proof problems and their respective solutions given a geometric figure together with a set of properties true of it. GeoTutor can provide personalized practice problems that address student deficiencies in the subject matter.
2174Modeling Subjective Experience-based Learning under Uncertainty and Frames,Hyung-Il Ahn and Rosalind Picard,"Cognitive Modeling CM
2175Novel Machine Learning Algorithms NMLA
2176Reasoning under Uncertainty RU ","subjective experience-based learning
2177subjective value function
2178prospect theory
2179subjective discriminability
2180experienced utility
2181decision utility
2182gain frame
2183loss frame","CM Adaptive Behavior
2184CM Simulating Humans
2185NMLA Reinforcement Learning
2186RU Decision/Utility Theory",In this paper we computationally examine how subjective experience may help or harm the decision maker's learning under uncertain outcomes frames and their interactions. To model subjective experience we propose the ``experienced-utility function'' based on a prospect theory PT -based parameterized subjective value function. Our analysis and simulations of two-armed bandit tasks present that the task domain underlying outcome distributions and framing reference point selection influence experienced utilities and in turn the ``subjective discriminability'' of choices under uncertainty. Experiments demonstrate that subjective discriminability improves on objective discriminability by the use of the experienced-utility function with appropriate framing for a domain and that bigger subjective discriminability leads to more optimal decisions in learning under uncertainty.
2187Resolving Pronouns by Leveraging English Resources,Chen Chen and Vincent Ng,NLP and Text Mining NLPTM ,"Pronouns
2188Text Mining
2189Natural Language Processing",NLPML Evaluation and Analysis,Existing approaches to pronoun resolution are monolingual training and testing a pronoun resolver on the data from original language. In contrast we propose a bilingual approach to pronoun resolution aiming to improve the resolution of pronouns by leveraging both the publicly available dictionaries and coreference annotations from a second language. Experiments on the OntoNotes corpus demonstrate that our bilingual approach to pronoun resolution significantly surpasses the performance of state-of-the-art monolingual approaches.
2190Bagging by design on the sub-optimality of bagging ,Cao Zhu Periklis Papakonstantinou and Jia Xu,Novel Machine Learning Algorithms NMLA ,"bagging
2191bootstrapping
2192combinatorial design
2193noise stability
2194correlation
2195dependent sampling","NMLA Classification
2196NMLA Ensemble Methods
2197NMLA Supervised Learning Other
2198NMLA Machine Learning General/other ","Bagging Breiman 1996 and its variants is one of the most popular methods
2199in aggregating classifiers and regressors. Originally its analysis assumed that the bootstraps are built from an unlimited independent source of samples therefore we call this form of bagging ideal-bagging. However in the real world base predictors are trained on data subsampled from a limited number of training samples and thus they behave very differently. We analyze the effect of intersections between bootstraps obtained by subsampling to train different base predictors. Most importantly we provide an alternative subsampling method called design-bagging based on a new construction of combinatorial designs and prove it universally better than bagging. Methodologically we succeed at this level of generality because we compare the bagging and design-bagging on their prediction accuracy each relative to the accuracy of ideal-bagging. This can possibly find applications in more involved bagging-based ensemble methods. Our analytical results are backed up by experiments on classification and regression settings."
2200A Hybrid Grammar-Based Approach for Learning and Recognizing Natural Hand Gestures,Amir Sadeghipour and Stefan Kopp,"Machine Learning Applications MLA
2201Novel Machine Learning Algorithms NMLA ","Iconic Hand Gestures
2202Stochastic Context-Free Grammar
2203Machine learning
2204Classification","MLA Applications of Supervised Learning
2205NMLA Classification
2206NMLA Graphical Model Learning
2207NMLA Supervised Learning Other
2208NMLA Machine Learning General/other
2209VIS Face and Gesture Recognition",In this paper we present an approach to learn structured models of gesture performances that allow for a compressed representation and robust recognition of natural iconic gestures. We analyze a dataset of iconic gestures and show how the proposed hybrid grammar formalism can generalize over both structural and feature-based variations among different gesture performances.
2210Combining Heterogenous Social and Geographical Information for Event Recommendation,Zhi Qiao Peng Zhang Yanan Cao Chuan Zhou and Li Guo,"AI and the Web AIW
2211Novel Machine Learning Algorithms NMLA ","heterogeneous social networks
2212event recommendation
2213geographical features","AIW Web-based recommendation systems
2214NMLA Recommender Systems",With the rapid growth of event-based social services EBSSs like Meetup the demand for event recommendation becomes increasingly urgent. In EBSSs event recommendation plays a central role in recommending the most relevant events to users who are likely to participate. Different from traditional recommendation problems event recommendation encounters three new types of information i.e. heterogeneous online+offline social relationships geographical information of events and implicit feedback data from users. Yet combining the three types of data for event recommendation has not been considered. Therefore we present a Bayesian probability model that can unify these data for event recommendation. Experimental results on real-world data sets show the performance of our method.
2215Leveraging Fee-Based Imperfect Advisors in Human-Agent Games of Trust,Cody Buntain Sarit Kraus and Amos Azaria,Humans and AI HAI ,"advisors
2216trust games
2217investment game
2218bribery",HAI Human-Computer Interaction,"This paper explores whether the addition of costly imperfect and exploitable advisors to Berg's investment game enhances or detracts from investor performance in both one-shot and multi-round interactions.
2219We then leverage our findings to develop an automated investor agent that performs as well as or better than humans in these games.
2220To gather this data we extended Berg's game and conducted a series of experiments using Amazon's Mechanical Turk to determine how humans behave in these potentially adversarial conditions.
2221Our results indicate that in games of short duration advisors do not stimulate positive behavior and are not useful in providing actionable advice.
2222In long-term interactions however advisors do stimulate positive behavior with significantly increased investments and returns.
2223By modeling human behavior across several hundred participants we were then able to develop agent strategies that maximized return on investment and performed as well as or significantly better than humans.
2224In one-shot games we identified an ideal investment value that on average resulted in positive returns as long as advisor exploitation was not allowed.
2225For the multi-round games our agents relied on the corrective presence of advisors to stimulate positive returns on maximum investment."
2226GenEth A General Ethical Dilemma Analyzer,Michael Anderson and Susan Leigh Anderson,"Applications APP
2227Humans and AI HAI
2228Machine Learning Applications MLA ","machine ethics
2229concept learning
2230application","APP Philosophical and Ethical Issues
2231HAI Human-Computer Interaction
2232HAI Understanding People Theories Concepts and Methods
2233KRR Logic Programming
2234MLA Applications of Supervised Learning
2235MLA Machine Learning Applications General/other ",We contend that ethically significant behavior of autonomous systems should be guided by explicit ethical principles determined through a consensus of ethicists. As it is likely that in many particular cases of ethical dilemmas ethicists agree on the ethically relevant features and the right course of action generalization of such cases can be used to help discover principles needed for ethical guidance of the behavior of autonomous systems. Such principles help ensure the ethical behavior of complex and dynamic systems and further serve as a basis for justification of their actions as well as a control abstraction for managing unanticipated behavior. To provide assistance in developing ethical principles we have developed GENETH a general ethical dilemma analyzer that through a dialog with ethicists codifies ethical principles in any given domain. GENETH has been used to codify principles in a number of domains pertinent to the behavior of autonomous systems and these principles have been verified using an Ethical Turing Test.
2236Solving the Traveling Tournament Problem by Packing Three-Vertex Paths,Richard Hoshino Ken-Ichi Kawarabayashi Marc Goerigk and Stephan Westphal,"Heuristic Search and Optimization HSO
2237Planning and Scheduling PS ","traveling tournament problem
2238sports scheduling
2239scheduling optimization
2240graph theory","HSO Heuristic Search
2241HSO Optimization
2242PS Scheduling",The Traveling Tournament Problem TTP is a complex problem in sports scheduling whose solution is a schedule of home and away games meeting specific feasibility requirements while minimizing the total distance traveled by all the teams. A recently-developed hybrid algorithm combining local search and integer programming has resulted in best-known solutions for many TTP instances. In this paper we tackle the TTP from a graph-theoretic perspective by generating a new canonical schedule in which each team's three-game road trips match up with the underlying graph's minimum-weight P_3-packing. By using this new schedule as the initial input for the hybrid algorithm we develop tournament schedules for five benchmark TTP instances that beat all previously-known solutions.
2243Gradient Descent Method with Proximal Average for Nonconvex and Composite Regularization,Wenliang Zhong and James Kwok,Novel Machine Learning Algorithms NMLA ,"non-convex optimization
2244composite regularization
2245proximal average","NMLA Big Data / Scalability
2246NMLA Dimension Reduction/Feature Selection",Sparse modeling has been highly successful on high-dimensional data. While much interest has focused on convex regularization recent studies show that nonconvex regularizers can outperform their convex counterparts in many situations. However the resulting nonconvex optimization problems are often challenging especially for composite regularizers such as the nonconvex overlapping group lasso. In this paper by using a recent tool known as the proximal average we propose a novel proximal gradient descent method for a wide class of composite and nonconvex problems. Instead of directly solving the proximal step with a composite regularizer we average the solutions from the proximal problems of the individual regularizers. This simple strategy has a similar convergence guarantee to existing nonconvex optimization approaches but its per-iteration complexity is much lower. Experimental results on synthetic and real-world data sets demonstrate the effectiveness and efficiency of the proposed optimization algorithm and also the improved prediction performance resulting from the nonconvex regularizers.
2247Multilabel Classification with Label Correlations and Missing Labels,Wei Bi and James Kwok,Novel Machine Learning Algorithms NMLA ,"multilabel classification
2248label correlation
2249missing label",NMLA Classification,Many real-world applications involve multilabel classification in which the labels can have strong inter-dependencies and some of them may even be missing. Existing multilabel algorithms are unable to deal with both issues simultaneously. In this paper we propose a probabilistic model that can automatically learn and exploit multilabel correlations. By integrating out the missing information it also provides a disciplined approach to the handling of missing labels. The inference procedure is simple and the optimization subproblems are convex. Experiments on a number of real-world data sets with both complete and missing labels demonstrate that the proposed algorithm can consistently outperform the state-of-the-art multilabel classification algorithms.
2250Double Configuration Checking in Stochastic Local Search for Satisfiability,Chuan Luo Shaowei Cai Wei Wu and Kaile Su,"Heuristic Search and Optimization HSO
2251Search and Constraint Satisfaction SCS ","Double Configuration Checking
2252Stochastic Local Search
2253Satisfiability
2254Phase Transition","HSO Heuristic Search
2255SCS Constraint Satisfaction
2256SCS SAT and CSP Solvers and Tools",Stochastic local search SLS algorithms have shown effectiveness on satisfiable instances of the Boolean satisfiability SAT problem. However their performance is still unsatisfactory on random k-SAT at the phase transition which is of significance and is one of the empirically hardest distributions of SAT instances. In this paper we propose a new heuristic called DCCA which combines two configuration checking CC strategies with different definitions of configuration in a novel way. We use the DCCA heuristic to design an efficient SLS solver for SAT dubbed DCCASat. The experiments show that the DCCASat solver significantly outperforms a number of state-of-the-art solvers on extensive random k-SAT benchmarks at the phase transition. Moreover further empirical analyses on structured benchmarks indicate the robustness of DCCASat.
2257Direct Semantic Analysis for Social Image Classification,Zhiwu Lu Liwei Wang and Ji-Rong Wen,"Machine Learning Applications MLA
2258Vision VIS ","social image classification
2259latent semantic analysis
2260graph-based learning
2261bag-of-words","MLA Machine Learning Applications General/other
2262VIS Image and Video Retrieval
2263VIS Statistical Methods and Learning",This paper presents a direct semantic analysis method for learning the correlation matrix between visual and textual words from socially tagged images. In the literature to improve the traditional visual bag-of-words BOW representation latent semantic analysis has been studied extensively for learning a compact visual representation where each visual word may be related to multiple latent topics. However these latent topics do not convey any true semantic information which can be understood by humans. In fact it remains a challenging problem how to recover the relationships between visual and textual words. Motivated by the recent advances in dealing with socially tagged images we develop a direct semantic analysis method which can explicitly learn the correlation matrix between visual and textual words for social image classification. To this end we formulate our direct semantic analysis from a graph-based learning viewpoint. Once the correlation matrix is learnt we can readily first obtain a semantically refined visual BOW representation and then apply it to social image classification. Experimental results on two benchmark image datasets show the promising performance of the proposed method.
2264On the Structure of Synergies in Cooperative Games,Ariel Procaccia Nisarg Shah and Max Tucker,Game Theory and Economic Paradigms GTEP ,"Cooperative game theory
2265Shapley value
2266Weighted voting games",GTEP Game Theory,We investigate synergy or lack thereof between agents in cooperative games building on the popular notion of Shapley value. We think of a pair of agents as synergistic resp. antagonistic if the Shapley value of one agent when the other agent participates in a joint effort is higher resp. lower than when the other agent does not participate. Our main theoretical result is that any graph specifying synergistic and antagonistic pairs can arise even from a restricted class of cooperative games. We also study the computational complexity of determining whether a given pair of agents is synergistic. Finally we use the concepts developed in the paper to uncover the structure of synergies in two real-world organizations the European Union and the International Monetary Fund.
2267Similarity-preserving binary signature for linear subspace,Jianqiu Ji Jianmin Li Shuicheng Yan Qi Tian and Bo Zhang,Novel Machine Learning Algorithms NMLA ,"linear subspace
2268angular distance
2269binary signature
2270Hamming distance",NMLA Data Mining and Knowledge Discovery,Linear subspace is an important representation for many kinds of real-world data in computer vision and pattern recognition e.g. faces motion videos speech. In this paper first we define pairwise angular similarity and angular distance for linear subspaces. The angular distance satisfies non-negativity identity of indiscernibles symmetry and triangle inequality and thus it is a metric. Then we propose a method to compress linear subspaces into compact similarity-preserving binary signatures between which the normalized Hamming distance is an unbiased estimator of the angular distance. We provide a lower bound on the length of binary signatures which suffices to guarantee a uniform distance-preservation within a set of subspaces. Experiments on face recognition demonstrate the effectiveness of this binary signature in terms of recognition accuracy speed and storage requirement. The results show that compared with the exact method the approximation with binary signatures achieves an order of magnitude speed-up while requiring significantly smaller amount of storage space yet it still accurately preserves the similarity and achieves high recognition accuracy comparable to the exact method in face recognition.
2271Sample-Adaptive Multiple Kernel Learning,Xinwang Liu Lei Wang Jian Zhang and Jianping Yin,Novel Machine Learning Algorithms NMLA ,"Latent Support Vector Machine
2272Multiple Kernel Learning
2273Inference","NMLA Classification
2274NMLA Ensemble Methods
2275NMLA Kernel Methods
2276NMLA Supervised Learning Other ",Existing multiple kernel learning MKL algorithms indiscriminately apply the same set of kernel combination weights to all samples. However the utility of base kernels could vary across samples and a base kernel useful for one sample could become noisy for another. In this case rigidly applying the same set of kernel combination weights could adversely affect the learning performance. To improve this situation we propose a sample-adaptive MKL algorithm in which base kernels are allowed to be adaptively switched on/off with respect to each sample. We achieve this goal by assigning a latent binary variable to each base kernel when it is applied to a sample. The kernel combination weights and the latent variables are jointly optimized via the margin maximization principle. As demonstrated on five benchmark data sets the proposed algorithm consistently outperforms the comparable ones in the literature.
2277Learning Temporal Dynamics of Behavior Propagation in Social Networks,Jun Zhang Chaokun Wang and Jianmin Wang,"AI and the Web AIW
2278Applications APP ","Temporal Dynamics
2279Social Influence
2280Behavior Propagation
2281Behavior Prediction
2282Social Networks","AIW Exploiting Linked Open Data
2283AIW Social networking and community identification
2284AIW Web personalization and user modeling
2285AIW Web-based recommendation systems
2286APP Computational Social Science
2287APP Social Networks",Social influence has been widely accepted to explain people's cascade behaviors and further utilized to benefit many applications. However few existing works have studied the direct microscopic and temporal impact of social influence on people's behaviors in detail. In this paper we engage in the investigation of behavior propagation based on social influence and its temporal dynamics over continuous time. We formalize the static behavior models including BP and IBP and the discrete DBP and DIBP models. We introduce continuous-temporal functions CTFs to model the fully-continuous dynamic variance of social influence over time. Upon that we propose the continuous-temporal interest-aware behavior propagation model called CIBP and present an effective inference algorithm. Experimental studies on real-world datasets evaluated the family of behavior propagation models BPMs and demonstrated the effectiveness of our proposed models.
2288Delivering Guaranteed Display Ads under Reach and Frequency Requirements,Ali Hojjat John Turner Suleyman Cetintas and Jian Yang,"AI and the Web AIW
2289Applications APP
2290Heuristic Search and Optimization HSO
2291Planning and Scheduling PS ","Online Advertising
2292Math programming
2293Optimization
2294Column Generation
2295Guaranteed Targeted Display Advertising
2296Reach
2297Frequency
2298Uniform Delivery","AIW AI for web services semantic descriptions planning matching and coordination
2299APP Other Applications
2300HSO Optimization
2301PS Deterministic Planning
2302PS Planning General/Other ","We propose a new idea in the allocation and serving of online advertising. We show that by using predetermined fixed-length streams of ads which we call patterns to
2303serve advertising we can incorporate a variety of interesting features into the ad allocation optimization problem. In particular our formulation optimizes for representativeness as well as user-level diversity and pacing of ads under
2304reach and frequency requirements. We show how the problem can be solved efficiently using a column generation scheme in which only a small set of best patterns are kept in the optimization problem. Our numerical tests show that with parallelization of the pattern generation process the algorithm has a promising run time and memory usage."
2305Video Recovery via Low-Rank Tensor Completion with Spatio-Temporal Consistency,Hua Wang Feiping Nie and Heng Huang,Vision VIS ,"Video completion
2306Tensor completion
2307Augmented Lagrange Multiplier Method",VIS Videos,"Video completion is a computer vision technique to recover the missing values in video sequences by filling the unknown regions with the known information. In recent research tensor completion a generalization of matrix completion for higher order data emerges as a new solution to estimate the missing information in video with the assumption that the video frames are homogeneous and correlated. However each video clip often stores heterogeneous episodes and the correlations among all video frames are not high. Thus the regular tensor completion methods are not suitable to recover the video missing values in practical applications.
2308
2309To solve this problem we propose a novel spatially-temporally consistent tensor completion method for recovering the video missing data. Instead of minimizing the average of the trace norms of all matrices unfolded along each mode in a tensor data we introduce a new smoothness regularization along video time direction to utilize the temporal information between consecutive video frames. Meanwhile we also minimize the trace norm of each individual video frame to employ the spatial correlations among pixels. Different to previous tensor completion approaches our new method can keep the spatio-temporal consistency in video and does not assume the global correlation in video frames. Thus the proposed method can be applied to the general and practical video completion applications. Our method shows promising results in all evaluations on 3D biomedical image sequence and video benchmark data sets."
2310Robust Multi-View Spectral Clustering via Low-Rank and Sparse Decomposition,Rongkai Xia Yan Pan Lei Du and Jian Yin,Novel Machine Learning Algorithms NMLA ,"multi-view clustering
2311spectral clustering
2312low-rank matrices
2313Markov chains
2314Augmented Lagrangian Multiplier method",NMLA Clustering,"Multi-view clustering which seeks a partition of the data in
2315multiple views that often provide complementary information to each
2316other has received considerable attention in recent years. In real
2317life clustering problems the data in each view may have
2318considerable noise. However existing clustering methods blindly
2319combine the information from multi-view data with possibly
2320considerable noise which often degrades their performance. In this
2321paper we propose a novel Markov chain method for Robust
2322Multi-view Spectral Clustering RMSC. Our method has a flavor of
2323low-rank and sparse decomposition where we firstly construct a
2324transition probability matrix from each single view and then use
2325these matrices to recover a shared low-rank transition probability
2326matrix as a crucial input to the standard Markov chain method
2327for clustering. The optimization problem of RMSC has a low-rank
2328constraint on the transition probability matrix and simultaneously
2329a probabilistic simplex constraint on each of its rows. To solve
2330this challenging optimization problem we propose an optimization procedure
2331based on the Augmented Lagrangian Multiplier scheme. Experimental
2332results on various real world datasets show that the
2333proposed method has superior performance over several
2334state-of-the-art methods for multi-view clustering."
2335Ranking Tweets by Labeled and Collaboratively Selected Pairs with Transitive Closure,Shenghua Liu Xueqi Cheng and Fangtao Li,"Heuristic Search and Optimization HSO
2336Machine Learning Applications MLA
2337NLP and Machine Learning NLPML
2338Novel Machine Learning Algorithms NMLA ","Microblog search
2339semi-supervised learning
2340transitive closure
2341learning to rank
2342SVM","APP Social Networks
2343NMLA Semisupervised Learning",Tweets ranking is important for information acquisition in Microblog. Due to the content sparsity and lack of labeled data it is better to employ semi-supervised learning methods to utilize the unlabeled data. However most of previous semi-supervised learning methods do not consider the pair conflict problem which means that the new selected unlabeled data may conflict with the labeled and previously selected data. It will hurt the learning performance a lot if the training data contains many conflict pairs. In this paper we propose a new collaborative semi-supervised SVM ranking model CSR-TC with consideration of the order conflict. The unlabeled data is selected based on a dynamically maintained transitive closure graph to avoid pair conflict. We also investigate the two views of features intrinsic and content-relevant features for the proposed model. Extensive experiments are conducted on TREC Microblogging corpus. The results demonstrate that our proposed method achieves significant improvement compared to several state-of-the-art models.
2344Globally and Locally Consistent Unsupervised Projection,Hua Wang Feiping Nie and Heng Huang,Novel Machine Learning Algorithms NMLA ,"Unsupervised learning
2345Dimension reduction
2346L1-norm minimization and maximization",NMLA Unsupervised Learning Other ,In this paper we propose an unsupervised projection method for feature extraction to preserve both global and local consistencies of the input data in the projected space. Traditional unsupervised feature extraction methods such as principal component analysis and locality preserving projections can only explore either the global or local geometric structures of the input data but not both. In our new method we introduce a new measurement using the neighborhood data variances to assess the data locality by which we propose to learn an optimal projection by rewarding both the global and local structures of the input data. Moreover to improve the robustness of the proposed learning model against outlier data samples and outlier features which is of particular importance in unsupervised learning we propose a new objective that simultaneously minimizes and maximizes minmax the L1-norm distances instead of the traditional squared L2-norm distances. Solving the formulated optimization problem is very challenging because it minimizes and maximizes a number of non-smooth L1-norm terms at the same time. In this paper as an important theoretical contribution we propose a simple yet effective optimization method to solve the L1-norm minmax problem and theoretically prove its convergence and correctness. To the best of our knowledge our paper makes the first attempt to solve the general L1-norm minmax problem with orthogonal constraints. Extensive experiments have been performed on six benchmark data sets where the promising results validate the proposed method.
2347On the Incompatibility of Efficiency and Strategyproofness in Randomized Social Choice,Haris Aziz Florian Brandl and Felix Brandt,Game Theory and Economic Paradigms GTEP ,"social decision schemes
2348lotteries
2349efficiency
2350strategyproofness
2351randomized social choice","GTEP Game Theory
2352GTEP Social Choice / Voting",Efficiency--no agent can be made better off without making another one worse off--and strategyproofness--no agent can obtain a more preferred outcome by misrepresenting his preferences--are two cornerstones of economics and ubiquitous in important areas such as voting auctions or matching markets. Within the context of randomized social choice Bogomolnaia and Moulin have shown that two particular notions of efficiency and strategyproofness based on stochastic dominance are incompatible. However there are various other possibilities of lifting preferences over alternatives to preferences over lotteries apart from stochastic dominance. In this paper we give an overview of common preference extensions propose two new ones and show that the above-mentioned incompatibility can be extended to various other notions of strategyproofness and efficiency.
2353Context-aware Collaborative Topic Regression with Social Matrix Factorization for Recommender Systems,Chaochao Chen Xiaolin Zheng Yan Wang Fuxing Hong and Zhen Lin,AI and the Web AIW ,"Context-awareness
2354Topic modeling
2355Matrix Factorization
2356Social Networks",AIW Web-based recommendation systems,Online social networking sites have become popular platforms on which users can link with each other and share information not only basic rating information but also information such as contexts social relationships and item contents. However as far as we know no existing works systematically combine diverse types of information to build more accurate recommender systems. In this paper we propose a novel context-aware hierarchical Bayesian method. First we propose the use of spectral clustering for user-item subgrouping so that users and items in similar contexts are grouped. We then propose a novel hierarchical Bayesian model that can make predictions for each user-item subgroup. Our model incorporates not only topic modeling to mine item content but also social matrix factorization to handle ratings and social relationships. Experiments on an Epinions dataset show that our method significantly improves recommendation performance compared with six categories of state-of-the-art recommendation methods in terms of both prediction accuracy and recall. We have also conducted experiments to study the extent to which ratings contexts social relationships and item contents contribute to recommendation performance in terms of prediction accuracy and recall.
2357Deep Salience Visual Salience Modeling via Deep Belief Propagation,Richard Jiang,"Robotics ROB
2358Vision VIS ","Visual Salience
2359Gaze Modelling
2360Deep Belief Propagation
2361Random Field","CM Cognitive Architectures
2362ROB Cognitive Robotics
2363VIS Object Detection
2364VIS Perception",Visual salience has been considered as an intriguing phenomenon observed in biologic neural systems. Numerous efforts have been made on mathematical modeling of visual salience using various feature contrasts either locally or at global range. However these algorithmic models treat this biologic phenomenon more like a mathematic problem and somehow ignore its biological instinct that visual salience arises from the deep propagation of visual stimuli along the visual cortex. In this paper we present a Deep Salience model that emulates this bio-inspired task where a multi-layer successive Markov random field sMRF is proposed to analyze the input image successively through its deep belief propagation. As its outcome the foreground object can be automatically separated from the background in a fully unsupervised way. Experimental evaluation on benchmark datasets validated that our model can consistently outperform state-of-the-art salience models yielding the highest recall rates precision and F-measure scores in object detection. With this experimental validation it is shown that the proposed bio-plausible deep belief network as an emulation of successive visual signal propagation along human visual cortex can functionally work well on solving real-world computational problems.
2365A Local Non-negative Pursuit Method for Intrinsic Manifold Structure Preservation,Dongdong Chen Jian Cheng Lv and Yi Zhang,Novel Machine Learning Algorithms NMLA ,"neighborhood selection
2366non-negative representation learning
2367intrinsic manifold structure preservation","NMLA Clustering
2368NMLA Dimension Reduction/Feature Selection
2369NMLA Unsupervised Learning Other ",The local neighborhood selection plays a crucial role for most representation based manifold learning algorithms. This paper reveals that an improper selection of neighborhood for learning representation will introduce negative components in the learnt representations. Importantly the representations with negative components will affect the intrinsic manifold structure preservation. In this paper a local non-negative pursuit LNP method is proposed for neighborhood selection and non-negative representations are learnt. Moreover it is proved that the learnt representations are sparse and convex. Theoretical analysis and experimental results show that the proposed method achieves or outperforms the state-of-the-art results on various manifold learning problems.
2370Regret-based Optimization and Preference Elicitation for Stackelberg Security Games with Uncertainty,Thanh Nguyen Amulya Yadav Bo An Milind Tambe and Craig Boutilier,Game Theory and Economic Paradigms GTEP ,"security game
2371robust optimization
2372minimax regret
2373preference elicitation
2374payoff uncertainty",GTEP Game Theory,"Stackelberg security games SSGs have been deployed in a number of real-world security domains. One key challenge in these applications is the assessment of attacker payoffs which may not be perfectly known. Previous work has studied SSGs with uncertain payoffs modeled by interval uncertainty and maximin-based robust optimization. In contrast this paper is the first to propose the use of the less conservative minimax regret as a decision criterion for payoff-uncertain SSGs
2375and to present several algorithms for computing minimax regret for such games. This paper also for the first time addresses the challenge of preference elicitation in SSGs providing novel regret-based solution strategies. Experimental results validate the runtime performance and solution quality of our approaches."
2376Adding Local Exploration to Greedy Best-First Search for Satisficing Planning,Fan Xie Martin Mueller and Robert Holte,"Heuristic Search and Optimization HSO
2377Planning and Scheduling PS ","Heuristic Search
2378Satisficing Planning
2379Greedy Best First Search","HSO Heuristic Search
2380HSO Search General/Other
2381PS Deterministic Planning","Greedy Best-First Search GBFS is a powerful algorithm at the heart of many state of the art satisficing planners. One major weakness of GBFS is its behavior in so-called uninformative heuristic regions UHR - parts of the search space in which no heuristic provides guidance towards states with improved heuristic values. In such regions GBFS degenerates into an inefficient breadth-first type search.
2382This work analyzes the problem of UHR in planning in detail and proposes a two level search framework as a solution. In Greedy Best-First Search with Local Exploration GBFS-LE local exploration is started from within a global GBFS whenever the search seems stuck in UHRs.
2383
2384Two different local exploration strategies are developed and evaluated experimentally Local GBFS LS and Local Random Walk Search LRW. The two new planners LAMA-LS and LAMA-LRW integrate these strategies into the GBFS component of LAMA-2011. Both are shown to yield clear improvements in terms of both coverage and search time on standard International Planning Competition benchmarks especially for domains that are proven to have large or unbounded UHRs."
2385Pre-trained Multi-view Word Embedding using Two-side Neural Network,Yong Luo Jian Tang Jun Yan Chao Xu and Zheng Chen,"AI and the Web AIW
2386Novel Machine Learning Algorithms NMLA ","word embedding
2387neural network
2388pre-train
2389multiple data sources","AIW Machine learning and the web
2390NMLA Neural Networks/Deep Learning",Word embedding aims to learn a continuous representation for each word. It attracts increasing attention due to its effectiveness in various tasks such as named entity recognition and language modeling. Most existing word embedding results are generally trained on one individual data source such as news pages or Wikipedia articles. However when we apply them to other tasks such as web search the performance suffers. To obtain a robust word embedding for different applications multiple data sources could be leveraged. In this paper we propose a two-side multimodal neural network to learn a robust word embedding from multiple data sources including free text user search queries and search click-through data. This framework takes the word embeddings learned from different data sources as pre-train and then uses a two-side neural network to unify these embeddings. The pre-trained embeddings are obtained by adapting the recently proposed CBOW algorithm. Since the proposed neural network does not need to re-train word embeddings for a new task it is highly scalable in real world problem solving. Besides the network allows weighting different sources differently when applied to different application tasks. Experiments on two real-world applications including web search ranking and word similarity measuring show that our neural network with multiple sources outperforms state-of-the-art word embedding algorithms with each individual source. It also outperforms other competitive baselines using multiple sources.
2391Trust Prediction with Propagation and Similarity Regularization,Xiaoming Zheng Yan Wang Mehmet Orgun Youliang Zhong and Guanfeng Liu,"AI and the Web AIW
2392Applications APP ","Trust prediction
2393Social network
2394Trust propagation
2395Trust tendency","AIW Representing reasoning and using provenance trust privacy and security on the web
2396APP Social Networks",Online social networks have been used for a variety of rich activities in recent years such as investigating potential employees or seeking recommendations of high quality services and service providers. In such activities trust is one of the most vital factors for users' decision-making. In the literature the state-of-the-art trust prediction approaches either focus on dispositional bias and propagated trust value of the pair-wise trust relationship along a path or on the similarity of trust rating values. However another factor the distribution of trust ratings also affects the trust between users. In addition bias propagated trust and similarity are of different types but were treated the same. Therefore how to utilize the factors needs further improvement. In this paper we propose a new trust prediction model based on trust decomposition and matrix factorization considering all the above essential factors to predict the trust between two users who are not directly connected. In this model we firstly decompose trust into biased trust and bias-reduced trust. Then based on bias-reduced trust ratings matrix factorization with a similarity regularization term which takes advantage of both users' rating habits and propagated trust is proposed to predict missing trust values. In the end the missing trust is recomposed with predicted trust values and bias. Experiments conducted on a real-world dataset illustrate significantly improved prediction accuracy over the state-of-the-art approaches.
2397Recommendation by Mining Multiple User Behaviors with Group Sparsity,Ting Yuan Jian Cheng Xi Zhang Shuang Qiu and Hanqing Lu,AI and the Web AIW ,"Recommender System
2398Collaborative Filtering
2399Matrix Factorization",AIW Web-based recommendation systems,Recently some recommendation methods try to improve the prediction results by integrating information from users' multiple types of behaviors. How to model the dependence and independence between different behaviors is critical for them. In this paper we propose a novel recommendation model the Group-Sparse Matrix Factorization GSMF which factorizes the rating matrices for multiple behaviors into the user and item latent factor space with group sparsity regularization. It can 1 select the different subsets of latent factors for different behaviors addressing that users' decisions on different behaviors are determined by different sets of factors; 2 model the dependence and independence between behaviors by learning the shared and private factors for multiple behaviors automatically; 3 allow the shared factors between different behaviors to be different instead of all the behaviors sharing the same set of factors. Experiments on a real-world dataset demonstrate that our model can integrate users' multiple types of behaviors into recommendation better compared with other state-of-the-art methods.
2400Modal Ranking A Uniquely Robust Voting Rule,Ioannis Caragiannis Ariel Procaccia and Nisarg Shah,Game Theory and Economic Paradigms GTEP ,"Computational social choice
2401Crowdsourcing
2402Noise models",GTEP Social Choice / Voting,Motivated by applications to crowdsourcing we study voting rules that output a correct ranking of alternatives by quality from a large collection of noisy input rankings. We seek voting rules that are supremely robust to noise in the sense of being correct in the face of any reasonable type of noise. We show that there is such a voting rule which we call the modal ranking rule. Moreover we establish that the modal ranking rule is the unique rule with the preceding robustness property within a large family of voting rules which includes a slew of well-studied rules.
2403Proximal Iteratively Reweighted Algorithm with Multiple Splitting for Nonconvex Sparsity Optimization,Canyi Lu Yunchao Wei Zhouchen Lin and Shuicheng Yan,"Machine Learning Applications MLA
2404Novel Machine Learning Algorithms NMLA ","Nonconvex Sparsity Optimization
2405General iterative solver
2406multiple splitting for multi-variable problem",NMLA Machine Learning General/other ,This paper proposes the Proximal Iteratively REweighted PIRE algorithm for solving a general problem which involves a large body of nonconvex sparse and structured sparse related problems. Compared with previous iterative solvers for nonconvex sparse problems PIRE is much more general and efficient. The computational cost of PIRE in each iteration is usually as low as the state-of-the-art convex solvers. We further propose the PIRE algorithm with Parallel Splitting PIRE-PS and PIRE algorithm with Alternative Updating PIRE-AU to handle the multi-variable problems. In theory we prove that our proposed methods converge and any limit solution is a stationary point. Extensive experiments on both synthetic and real data sets demonstrate that our methods achieve comparable learning performance but are much more efficient than previous nonconvex solvers.
2407Extending Tournament Solutions,Felix Brandt Markus Brill and Paul Harrenstein,Game Theory and Economic Paradigms GTEP ,"Social Choice Theory
2408Tournament Solutions
2409Possible Winners",GTEP Social Choice / Voting,An important subclass of social choice functions so-called majoritarian or C1 functions only take into account the pairwise majority relation between alternatives. In the absence of majority ties--e.g. when there is an odd number of agents with linear preferences--the majority relation is antisymmetric and complete and can thus conveniently be represented by a tournament. Tournaments have a rich mathematical theory and many formal results for majoritarian functions assume that the majority relation constitutes a tournament. Moreover most majoritarian functions have only been defined for tournaments and allow for a variety of generalizations to unrestricted preference profiles none of which can be seen as the unequivocal extension of the original function. In this paper we argue that restricting attention to tournaments is justified by the existence of a conservative extension which inherits most of the commonly considered properties from its underlying tournament solution.
2410Optimal Neighborhood Preserving Visualization by Maximum Satisfiability,Kerstin Bunte Matti Järvisalo Jeremias Berg Petri Myllymäki Jaakko Peltonen and Samuel Kaski,"Novel Machine Learning Algorithms NMLA
2411Search and Constraint Satisfaction SCS ","visualization
2412boolean optimization
2413maximum satisfiability
2414nonlinear dimensionality reduction
2415neighbor embedding","NMLA Dimension Reduction/Feature Selection
2416NMLA Unsupervised Learning Other
2417SCS Constraint Optimization
2418SCS SAT and CSP Modeling/Formulations",We present a novel approach to low-dimensional neighbor embedding for visualization we formulate an information retrieval based neighborhood preservation cost function as Maximum satisfiability on a discretized output display and maximize the number of clauses preserved. The method has a rigorous interpretation as optimal visualization for neighbor retrieval. Unlike previous low-dimensional neighbor embedding methods our satisfiability formulation is guaranteed to yield a global optimum and does so reasonably fast. Unlike previous manifold learning methods yielding global optima of their cost functions our cost function and method are designed for low-dimensional visualization where evaluation and minimization of visualization errors are crucial. Our method performs well in experiments yielding clean embeddings of data sets where a state-of-the-art comparison method yields poor arrangements. In a real-world case study for semi-supervised WLAN positioning in buildings we outperform state-of-the-art methods especially when having few measurements.
2419Dynamic Bayesian Probabilistic Matrix Factorization,Sotirios Chatzis,"Novel Machine Learning Algorithms NMLA
2420Reasoning under Uncertainty RU ","Probabilistic Matrix Factorization
2421Dynamic Hierarchical Dirichlet Process
2422Bayesian Nonparametrics
2423Collaborative Filtering","NMLA Bayesian Learning
2424NMLA Preferences/Ranking Learning
2425NMLA Recommender Systems
2426RU Probabilistic Inference
2427RU Relational Probabilistic Models",Collaborative filtering algorithms generally rely on the assumption that user preference patterns remain stationary. However real-world relational data are seldom stationary. User preference patterns may change over time giving rise to the requirement of designing collaborative filtering systems capable of detecting and adapting to preference pattern shifts. Motivated by this observation in this paper we propose a dynamic Bayesian probabilistic matrix factorization model designed for modeling time-varying distributions. Formulation of our model is based on imposition of a dynamic hierarchical Dirichlet process dHDP prior over the space of probabilistic matrix factorization models to capture the time-evolving statistical properties of modeled sequential relational datasets. We develop a simple Markov Chain Monte Carlo sampler to perform inference. We present experimental results to demonstrate the superiority of our temporal model.
2428Echo-State Conditional Restricted Boltzmann Machines,Sotirios Chatzis,Novel Machine Learning Algorithms NMLA ,"Conditional Restricted Boltzmann Machine
2429Echo-State Network
2430Contrastive Divergence","NMLA Neural Networks/Deep Learning
2431NMLA Time-Series/Data Streams
2432NMLA Structured Prediction",Restricted Boltzmann machines RBMs are a powerful generative modeling technique based on a complex graphical model of hidden latent variables. Conditional RBMs CRBMs are an extension of RBMs tailored to modeling temporal data. A drawback of CRBMs is their consideration of linear temporal dependencies which limits their capability to capture complex temporal structure. They also require many variables to model long temporal dependencies a fact that might provoke overfitting proneness. To resolve these issues in this paper we propose the echo-state CRBM ES-CRBM our model uses an echo-state network reservoir in the context of CRBMs to efficiently capture long and complex temporal dynamics with much fewer trainable parameters compared to conventional CRBMs. In addition we introduce an implicit mixture of ES-CRBM experts im-ES-CRBM to enhance even further the capabilities of our ES-CRBM model. The introduced im-ES-CRBM allows for better modeling temporal observations which might comprise a number of latent or observable subpatterns that alternate in a dynamic fashion. It also allows for performing sequence segmentation using our framework. We apply our methods to sequential data modeling and classification experiments using public datasets. As we show our approach outperforms both existing RBM-based approaches as well as related state-of-the-art methods such as conditional random fields.
2433A Joint Optimization Model for Image Summarization Based on Image Content and Tags,Hongliang Yu Zhi-Hong Deng Yunlun Yang and Tao Xiong,"AI and the Web AIW
2434Machine Learning Applications MLA ","image summarization
2435image tags
2436optimization model
2437similarity-inducing regularizer","AIW AI for multimedia and multimodal web applications
2438MLA Machine Learning Applications General/other ",As an effective technology for navigating a large number of images image summarization is becoming a promising task with the rapid development of image sharing sites and social networks. Most existing summarization approaches use the visual-based features for image representation without considering tag information. In this paper we propose a novel framework named JOINT which employs both image content and tag information to summarize images. Our model generates the summary images which can best reconstruct the original collection. Based on the assumption that an image with representative content should also have typical tags we introduce a similarity-inducing regularizer to our model. Furthermore we impose the lasso penalty on the objective function to yield a concise summary set. Extensive experiments demonstrate our model outperforms the state-of-the-art approaches.
2439A Computational Method for MSSCoMSS Partitioning,Jean Marie Lagniez Eric Gregoire and Bertrand Mazure,"Heuristic Search and Optimization HSO
2440Search and Constraint Satisfaction SCS ","SAT
2441CoMSS
2442MSS
2443Maximal Satisfiable Subset","HSO Search General/Other
2444SCS Constraint Satisfaction
2445SCS SAT and CSP Solvers and Tools",MSS Maximal Satisfiable Subset and CoMSS also called Minimal Correction Subset concepts play a key role in many A.I. approaches and techniques. In this paper a novel algorithm for partitioning a Boolean CNF into one MSS and its corresponding CoMSS is introduced. Extensive empirical evaluation shows that it is more robust and more efficient on most instances than currently available techniques.
2446Feature Selection at the Discrete Limit,Miao Zhang Chris Ding and Ya Zhang,"Machine Learning Applications MLA
2447Novel Machine Learning Algorithms NMLA ","feature selection
2448sparse
2449L2p norm",NMLA Dimension Reduction/Feature Selection,"Feature selection plays an important role in many machine
2450learning and data mining applications. In this paper
2451we propose to use L2p norm for feature selection
2452with emphasis on small p. As p approaches 0 feature selection
2453becomes a discrete feature selection problem. We provide
2454two algorithms proximal gradient algorithm and rank one
2455update algorithm which is more efficient at large
2456regularization. Experiments on real life data sets show
2457that features selected at small p consistently outperform
2458features selected at p = 1 the standard L21 approach
2459and other popular feature selection methods."
2460Computing General First-order Parallel and Prioritized Circumscription,Hai Wan Zhanhao Xiao Yuan Zhenfeng Heng Zhang and Yan Zhang,Knowledge Representation and Reasoning KRR ,"first-order parallel and prioritized circumscription
2461first-order stable model semantics
2462translation
2463optimization","KRR Common-Sense Reasoning
2464KRR Nonmonotonic Reasoning",This paper focuses on computing general first-order parallel and prioritized circumscription with varied constants. We propose polynomial translations from general first-order circumscription to first-order stable model semantics over arbitrary structures including $Tr_v$ for parallel circumscription and $Tr^s_v$ for several parallel circumscriptions further for prioritized circumscription . To improve the efficiency we give an optimization called $\Gamma_{\exists}$ to reduce auxiliary predicates in number and logic programs in size when eliminating existential quantifiers during the translations. Based on these results a general first-order circumscription solver named cfo2lp is developed by calling answer set programming solvers. Using circuit diagnosis problem and extended stable marriage problem as benchmarks we compare cfo2lp with a propositional circumscription solver circ2dlp on efficiency. Experimental results demonstrate that for problems represented by general first-order circumscription naturally and intuitively cfo2lp can compute all solutions over finite structures. We also apply our approach to description logics with circumscription and repairs in inconsistent databases which can be handled effectively.
2465Learning Word Representation Considering Proximity and Ambiguity,Lin Qiu Yong Cao Zaiqing Nie and Yong Rui,"AI and the Web AIW
2466Machine Learning Applications MLA
2467NLP and Knowledge Representation NLPKR
2468NLP and Machine Learning NLPML ","word proximity
2469word ambiguity
2470neural networks","AIW Knowledge acquisition from the web
2471AIW Machine learning and the web
2472MLA Applications of Unsupervised Learning
2473NLPKR Natural Language Processing General/Other
2474NLPML Natural Language Processing General/Other
2475NMLA Neural Networks/Deep Learning",Distributed representations of words aka word embedding have been proven helpful in solving NLP tasks. Training distributed representations of words with neural networks has received much attention of late. Especially the most recent work on word embedding the Continuous Bag-of-Words CBOW model and the Continuous Skip-gram Skip-gram model proposed by Google shows very impressive results by significantly speeding up the training process to enable word representation learning from very large-scale data. However both CBOW and Skip-gram do not pay enough attention to the word proximity in terms of model or the word ambiguity in terms of linguistics. In this paper we propose Proximity-Ambiguity Sensitive PAS models i.e. PAS CBOW and PAS Skip-gram for producing high quality distributed representations of words considering both word proximity and ambiguity. From the model perspective we introduce proximity weights as parameters to be learned in PAS CBOW and used in PAS Skip-gram. By better modeling word proximity we reveal the real strength of the pooling-structured neural networks in word representation learning. The proximity-sensitive pooling layer can also be applied to other neural network applications that employ pooling layers. From the linguistics perspective we train multiple representation vectors per word. Each representation vector corresponds to a particular sense of the word. By using PAS models we achieved a maximum accuracy increase of 16.9% over the state-of-the-art models on the word representation test set.
2476Planning as Model Checking in Hybrid Domains,Sergiy Bogomolov Daniele Magazzeni Andreas Podelski and Martin Wehrle,Planning and Scheduling PS ,"Planning in Mixed Discrete-Continuous Domains
2477PDDL+
2478Model Checking
2479Hybrid Automata
2480Planning as Model Checking","PS Mixed Discrete/Continuous Planning
2481PS Temporal Planning","Planning in hybrid domains is an important and challenging task and
2482various planning algorithms have been proposed in the last years.
2483From an abstract point of view hybrid planning domains are based on
2484hybrid automata which have been studied intensively in the model
2485checking community. In particular powerful model checking
2486algorithms and tools have emerged for this formalism. However
2487despite the quest for more scalable planning approaches model
2488checking algorithms have not been applied to planning in hybrid
2489domains so far.
2490
2491In this paper we make a first step in bridging the gap between
2492these two worlds. We provide a formal translation scheme from PDDL+
2493to the standard formalism of hybrid automata as a solid basis for
2494using hybrid system model-checking tools for dealing with hybrid
2495planning domains. As a case study we use the SpaceEx model checker
2496showing how we can address PDDL+ domains that are out of the scope
2497of state-of-the-art planners."
2498Novel Density-based Clustering Algorithms for Uncertain Data,Xianchao Zhang Han Liu Xiaotong Zhang and Xinyue Liu,Novel Machine Learning Algorithms NMLA ,"Uncertain data
2499Clustering
2500Density-based algorithm","NMLA Clustering
2501RU Uncertainty in AI General/Other ","Density-based techniques seem promising for handling data
2502uncertainty in uncertain data clustering. Nevertheless some
2503issues have not been addressed well in existing algorithms.
2504In this paper we firstly propose a novel density-based uncertain
2505data clustering algorithm which improves upon existing
2506algorithms from the following two aspects 1 it employs
2507an exact method to compute the probability that the distance
2508between two uncertain objects is less than or equal to
2509a boundary value instead of the sampling-based method in
2510previous work; 2 it introduces new definitions of core object
2511probability and direct reachability probability thus reducing
2512the complexity and avoiding sampling. We then further improve
2513the algorithm by using a novel assignment strategy to
2514ensure that every object will be assigned to the most appropriate
2515cluster. Experimental results show the superiority of
2516our proposed algorithms over existing ones."
2517Parallel Restarted Search,Andre Cire Serdar Kadioglu and Meinolf Sellmann,Heuristic Search and Optimization HSO ,"Restarts
2518Parallel Search
2519Deterministic Parallelization",HSO Search General/Other ,We consider the problem of parallelizing restarted backtrack search. With few notable exceptions most commercial and academic constraint programming solvers do not learn no-goods during search. Depending on the branching heuristics used this means that there are little to no side-effects between restarts making them an excellent target for parallelization. We develop a simple technique for parallelizing restarted search deterministically and demonstrate experimentally that we can achieve near-linear speed-ups in practice.
2520MaxSAT Portfolio,Carlos Ansotegui-Gil Yuri Malitsky and Meinolf Sellmann,Search and Constraint Satisfaction SCS ,"MaxSAT
2521Algorithm Portfolios
2522Algorithm Tuning",SCS Satisfiability General/Other ,Our objective is to boost the state-of-the-art performance in MaxSAT solving. To this end we employ the instance-specific algorithm configurator ISAC and improve it by combining it with the latest in portfolio technology. Experimental results on SAT show that this combination marks a significant step forward in our ability to tune algorithms instance-specifically. We then apply the new methodology to a number of MaxSAT problem domains and show that the resulting solvers consistently outperform the best existing solvers on the respective problem families. In fact the solvers presented here were independently evaluated at the 2013 MaxSAT Evaluation where they won six out of eleven categories.
2523Large-Scale Supervised Multimodal Hashing with Semantic Correlation Maximization,Dongqing Zhang and Wu-Jun Li,"Novel Machine Learning Algorithms NMLA
2524Vision VIS ","Multimodal Hashing
2525Cross-view Similarity Search
2526Image Retrieval
2527Scalability","NMLA Big Data / Scalability
2528NMLA Supervised Learning Other
2529VIS Image and Video Retrieval","Due to its low storage cost and fast query speed hashing has been widely adopted for similarity search in multimedia data. In particular more and more attention has been paid to multimodal hashing for search in multimedia data with multiple modalities such as images with tags. Typically supervised information of semantic
2530labels is also available for the data points in many real applications. Hence many supervised multimodal hashing SMH methods have been proposed to utilize such semantic labels to further improve the search accuracy. However the training time complexity of most existing SMH methods is too high which makes them unscalable to large-scale datasets. In this paper a novel SMH method called semantic
2531correlation maximization SCM is proposed to seamlessly integrate semantic labels into the hashing learning procedure for large-scale data modeling. Experimental results on two real-world datasets show
2532that SCM can significantly outperform the state-of-the-art SMH methods in terms of both accuracy and scalability."
2533Role-aware Conformity Modeling and Analysis in Social Networks,Jing Zhang Jie Tang Honglei Zhuang Cane Leung and Juanzi Li,"Applications APP
2534Novel Machine Learning Algorithms NMLA ","social conformity
2535role base conformity
2536probabilistic model","APP Computational Social Science
2537NMLA Data Mining and Knowledge Discovery",Conformity is the inclination of a person to be influenced by others. In this paper we study how the conformity tendency of a person changes with her {\em role} as defined by her structural properties in a social network. We first formalize conformity using a utility function based on the conformity theory from social psychology and validate the proposed utility function by proving the existence of Nash Equilibria when all users in a network behave according to it. We then extend and incorporate the utility function into a probabilistic topic model called the Role-Conformity Model RCM for modeling user behaviors under the effect of conformity. We apply the proposed RCM to several academic research networks and discover that people with higher degree and lower clustering coefficient are more likely to conform to others. We also evaluate RCM through the task of word usage prediction in academic publications and show significant improvements over the performance of baselines.
2538Data Quality in Ontology-based Data Access The Case of Consistency,Marco Console and Maurizio Lenzerini,Knowledge Representation and Reasoning KRR ,"Ontology-based data access
2539Description Logics
2540Data Quality
2541Data Management","APP Other Applications
2542KRR Ontologies
2543KRR Description Logics
2544KRR Knowledge Representation General/Other ","Ontology-based data access OBDA is a new paradigm aiming at
2545 accessing and managing data by means of an ontology i.e. a
2546 conceptual representation of the domain of interest in the
2547 underlying information system. In the last years this new paradigm
2548 has been used for providing users with abstract independent from
2549 technological and system-oriented aspects effective and
2550 reasoning-intensive mechanisms for querying the data residing at the
2551 information system sources. In this paper we argue that OBDA
2552 besides querying data provides the right principles for devising a
2553 formal approach to data quality. In particular we concentrate on
2554 one of the most important dimensions considered both in the
2555 literature and in the practice of data quality namely
2556 consistency. We define a general framework for data consistency in
2557 OBDA and present algorithms and complexity analysis for several
2558 relevant tasks related to the problem of checking data quality under
2559 this dimension both at the extensional level content of the data
2560 sources and at the intensional level schema of the data sources ."
2561Sparse Compositional Metric Learning,Yuan Shi Aurélien Bellet and Fei Sha,Novel Machine Learning Algorithms NMLA ,"Metric Learning
2562Sparse Methods
2563Local Metric Learning
2564Multi-task Learning","NMLA Classification
2565NMLA Dimension Reduction/Feature Selection
2566NMLA Transfer Adaptation Multitask Learning",We propose a new approach for metric learning by framing it as learning a sparse combination of locally discriminative metrics that are inexpensive to generate from the training data. This flexible framework allows us to naturally derive formulations for global multi-task and local metric learning. These new algorithms have several advantages over existing methods in the literature a much smaller number of parameters to be estimated and a principled way to generalize learned metrics to new testing data points. To analyze the approach theoretically we derive a generalization bound that justifies the sparse combination. Empirically we evaluate our algorithms on several datasets against state-of-the-art metric learning methods. The results are consistent with our theoretical findings and demonstrate the superiority of our approach in terms of classification performance and scalability.
2567Improving Context and Category Matching for Entity Search,Yueguo Chen Lexi Gao Shuming Shi Xiaoyong Du and Ji-Rong Wen,AI and the Web AIW ,"entity search
2568language model
2569category matching
2570context matching
2571result re-ranking","AIW Enhancing web search and information retrieval
2572AIW Question answering on the web",Entity search is to retrieve a ranked list of named entities of target types to a given query. In this paper we propose an approach of entity search by formalizing both context matching and category matching. In addition we propose a result re-ranking strategy that can be easily adapted to achieve a hybrid of two context matching strategies. Experiments on the INEX 2009 entity ranking task show that the proposed approach achieves a significant improvement of the entity search performance xinfAP from 0.27 to 0.39 over the existing solutions.
2573ARIA Asymmetry Resistant Instance Alignment,Sanghoon Lee and Seung-Won Hwang,AI and the Web AIW ,"instance alignment
2574entity matching
2575resolution
2576deduplication
2577linkage
2578linked data",AIW AI for web services semantic descriptions planning matching and coordination,This paper studies the problem of instance alignment between knowledge bases KBs . Existing approaches exploiting the symmetry of structure and information across KBs suffer in the presence of asymmetry which is frequent as KBs are independently built. Specifically we observe three types of asymmetry -- concept feature and structural asymmetry. The goal of this paper is to identify key techniques for overcoming each type of asymmetry then build them into a framework that robustly aligns matches over asymmetry. In particular we propose an Asymmetry-Resistant Instance Alignment framework ARIA implementing two-phased blocking methods considering concept and feature asymmetry with a novel similarity measure overcoming structural asymmetry. Our evaluation results validate that this framework outperforms state-of-the-arts in terms of both response time and accuracy by increasing 18% in precision and 2% in recall in matching large-scale real-life KBs.
2579Diagnosing Analogue Linear Systems Using Dynamic Topological Reconfiguration,Alexander Feldman and Gregory Provan,Knowledge Representation and Reasoning KRR ,"model-based diagnosis
2580analogue systems
2581modeling and analysis",KRR Diagnosis and Abductive Reasoning,"Fault diagnosis of analogue linear systems is a challenging task and no fully
2582automated solution exists. Two challenges in this diagnosis task are the size of the search space that must be explored and the possibility of simulation instabilities introduced by particular fault classes. We study a novel algorithm that addresses both problems. This algorithm dynamically modifies the simulation model during diagnosis by pruning parametrized components that cause discontinuity in the model. We provide a theoretical framework for predicting the speedups which depends on the topology of the model. We empirically validate the theoretical predictions through extensive experimentation on a benchmark of circuits."
2583Simpler Bounded Suboptimal Search,Matthew Hatem and Wheeler Ruml,Heuristic Search and Optimization HSO ,"bounded suboptimal search
2584distance estimates
2585additional heuristics
2586iterative deepening
2587implementation","HSO Heuristic Search
2588HSO Evaluation and Analysis Search and Optimization ",It is commonly appreciated that solving search problems optimally can take too long. Bounded suboptimal search algorithms trade increased solution cost for reduced solving time. Explicit Estimation Search EES is a recent state-of-the-art algorithm specifically designed for bounded suboptimal search. Although it tends to expand fewer nodes than alternative algorithms such as weighted A* WA* its per-node expansion overhead is much higher causing it to sometimes take longer. In this paper we present simplified variants of EES SEES and an earlier algorithm A*epsilon SA*epsilon that use different implementations of the same motivating ideas to significantly reduce search overhead and implementation complexity. In an empirical evaluation we find that SEES like EES outperforms classic bounded suboptimal search algorithms such as WA* on domains tested where distance-to-go estimates enable better search guidance. We also confirm that while SEES and SA*epsilon expand roughly the same number of nodes as their progenitors they solve problems significantly faster and are much easier to implement. This work widens the applicability of state-of-the-art bounded suboptimal search by making it easier to deploy.
2589PAC Rank Elicitation through Adaptive Sampling of Stochastic Pairwise Preferences,Róbert Busa-Fekete Balazs Szorenyi and Eyke Huellermeier,Novel Machine Learning Algorithms NMLA ,"Preference learning
2590Online learning
2591Ranking models
2592Rank elicitation
2593Sample complexity","NMLA Online Learning
2594NMLA Preferences/Ranking Learning",We introduce the problem of PAC rank elicitation which consists of sorting a given set of options based on adaptive sampling of stochastic pairwise preferences. More specifically the goal is to predict a ranking that is sufficiently close to a target order with high probability. We instantiate this setting with combinations of two different distance measures and ranking procedures for determining the target order. For these instantiations we devise efficient sampling strategies and analyze the corresponding sample complexity. We also present first experiments to illustrate the practical performance of our methods.
2595Elementary Loops Revisited,Jianmin Ji Hai Wan Peng Xiao Ziwei Huo and Zhanhao Xiao,Knowledge Representation and Reasoning KRR ,"elementary loops
2596proper loops
2597positive body-head dependency graph",KRR Logic Programming,The notions of loops and loop formulas play an important role in answer set computation. However there would be an exponential number of loops in the worst case. Recently Gebser and Schaub characterized a subclass elementary loops and showed that they are sufficient for selecting answer sets from models of a logic program. In this paper we propose an alternative definition of elementary loops. Based on the new perspective we identify a subclass of elementary loops called proper loops and show that by applying a special form of their loop formulas they are also sufficient for the SAT-based answer set computation. We also provide a polynomial algorithm to recognize a proper loop and show that for certain logic programs identifying all proper loops of a program is more efficient than identifying all elementary loops. Furthermore we prove that by considering the structure of the positive body-head dependency graph of a program a large number of loops could be ignored for identifying proper loops. Based on the observation we provide another algorithm for identifying all proper loops of a program. The experiments show that for certain programs whose dependency graphs consist of sets of components that are densely connected inside and sparsely connected outside the new algorithm is more efficient.
2598On Computing Optimal Strategies in Open List Proportional Representation the Two Parties Case,Ning Ding and Fangzhen Lin,"Game Theory and Economic Paradigms GTEP
2599Multiagent Systems MAS ","pure Nash equilibrium
2600open list proportional representation
2601computational social choice","GTEP Game Theory
2602GTEP Social Choice / Voting
2603GTEP Equilibrium
2604MAS Evaluation and Analysis Multiagent Systems ","Open list proportional representation is an election mechanism used in many
2605elections including the 2012 Hong Kong Legislative Council
2606Geographical Constituencies election. In this paper assuming that there
2607are just two parties in the election and that the number of votes that a
2608list would get is the sum of the numbers of votes
2609that the candidates in the list would get if each of them would go alone in the election
2610we formulate the election as a mostly zero-sum game and show that while the
2611game always has a pure Nash equilibrium it is NP-hard to compute it."
2612Joint Morphological Generation and Syntactic Linearization,Linfeng Song Yue Zhang Kai Song and Qun Liu,NLP and Knowledge Representation NLPKR ,"joint method
2613natural language generation
2614meaning text theory
2615syntactic linearization
2616morphological generation",NLPKR Natural Language Processing General/Other ,"There has been a growing interest in stochastic methods
2617to natural language generation NLG. While most NLG
2618pipelines separate morphological generation and syntactic
2619linearization the two tasks are closely related to
2620each other. In this paper we study joint morphological
2621generation and linearization making use of word order
2622and inflections information for both tasks and reducing
2623error propagation. Our experiments show that the joint
2624method significantly outperforms a strong pipelined
2625baseline by 1.0 BLEU points. It also achieves the
2626best reported result on the Generation Challenge 2011
2627shared task."
2628Machine Translation with Real-time Web Search,Lei Cui Ming Zhou Qiming Chen Dongdong Zhang and Mu Li,AI and the Web AIW ,"machine translation
2629real-time web search
2630web-based machine translation
2631phrase-level translation
2632sentence-level translation
2633search snippets",AIW Human language technologies for web systems including text summarization and machine translation,Contemporary machine translation systems usually rely on offline data retrieved from the web for individual model training such as translation models and language models. Distinct from existing methods we propose a novel approach that treats machine translation as a web search task and utilizes the web on the fly to acquire translation knowledge. This end-to-end approach takes advantage of fresh web search results that are capable of leveraging tremendous web knowledge to obtain phrase-level candidates on demand and then compose sentence-level translations. Experimental results show that our web-based machine translation method demonstrates very promising performance in leveraging fresh translation knowledge and making translation decisions. Furthermore when combined with offline models it significantly outperforms a state-of-the-art phrase-based statistical machine translation system.
2634Latent Low-Rank Bi-Directional Transfer Subspace Learning,Zhengming Ding Ming Shao and Yun Fu,Novel Machine Learning Algorithms NMLA ,"transfer learning
2635latent low-rank
2636subspace learning","NMLA Classification
2637NMLA Dimension Reduction/Feature Selection
2638NMLA Transfer Adaptation Multitask Learning","We consider an interesting problem in this paper: using transfer learning in two directions to compensate for missing knowledge from the target domain. Transfer learning is usually exploited as a powerful tool to mitigate the discrepancy between different databases for knowledge transfer; it can also be used for knowledge transfer between different modalities within one database. However in either case transfer learning will fail if the target data are missing. To overcome this we consider knowledge transfer between different databases and modalities simultaneously in a single framework where missing target data from one database are recovered to facilitate the recognition task. We call this framework Latent Low-rank Bi-Directional Transfer Subspace Learning method L2BTSL. First we propose to use low-rank constraint as well as dictionary learning in a learned subspace to guide the knowledge transfer between and within different databases. Second a latent factor is introduced to uncover the underlying structure of the missing target data. Third bi-directional transfer learning is proposed to integrate auxiliary
2639database for transfer learning with missing target data. Experimental results of multi-modalities knowledge transfer with missing target data demonstrate that our method can successfully inherit knowledge from the auxiliary database to complete the target domain and therefore enhance the performance when recognizing data from the modality without any training data."
2640Adaptive Knowledge Transfer for Multiple Instance Learning in Image Classification,Qifan Wang Lingyun Ruan and Luo Si,Novel Machine Learning Algorithms NMLA ,"Multiple Instance Learning
2641Image Classification
2642Transfer Learning","NMLA Classification
2643NMLA Transfer Adaptation Multitask Learning
2644VIS Categorization","Multiple Instance Learning MIL is a popular learning technique in various
2645vision tasks including image classification.
2646However most existing MIL methods do not consider the problem of insufficient examples in the given target category. In this case it is difficult for traditional MIL methods to build an accurate classifier due to the lack of training examples. Motivated by the empirical success of transfer learning this paper proposes a novel approach of Adaptive Knowledge Transfer for Multiple Instance Learning AKT-MIL in image classification. The new method transfers cross-category knowledge from source categories under multiple instance setting for boosting the learning process. A unified learning framework with a data-dependent mixture model is designed to adaptively combine the transferred knowledge from sources with a weak classifier built in the target domain.
2647Based on this framework an iterative coordinate descent method with Constraint
2648Concave-Convex Programming CCCP is proposed as the optimization procedure. An extensive set of experimental results demonstrates that the proposed AKT-MIL approach substantially outperforms several state-of-the-art algorithms on two benchmark datasets especially in the scenario when very few training examples are available in the target domain."
2649Finding the k-best Equivalence Classes of Bayesian Network Structures for Model Averaging,Yetian Chen and Jin Tian,"Novel Machine Learning Algorithms NMLA
2650Reasoning under Uncertainty RU ","Bayesian network
2651Equivalence class
2652Model averaging
2653Dynamic programming","NMLA Bayesian Learning
2654NMLA Graphical Model Learning
2655RU Bayesian Networks
2656RU Graphical Models Other
2657RU Probabilistic Inference
2658RU Uncertainty Representations",In this paper we develop an algorithm to find the k-best equivalence classes of Bayesian networks. Our algorithm is capable of finding many more best DAGs than the previous algorithm that directly finds the k-best DAGs Tian He and Ram 2010. We demonstrate our algorithm in the task of Bayesian model averaging. Empirical results show that our algorithm significantly outperforms the k-best DAG algorithm in both time and space to achieve the same quality of approximation. Our algorithm goes beyond the maximum-a-posteriori MAP model by listing the most likely network structures and their relative likelihood and therefore has important applications in causal structure discovery.
2659On the Challenges of Physical Implementations of RBMs,Vincent Dumoulin Ian J. Goodfellow Aaron Courville and Yoshua Bengio,Machine Learning Applications MLA ,"Restricted Boltzmann Machine
2660RBM
2661Hardware
2662Neuromorphic
2663Empirical
2664Deep Learning",MLA Applications of Unsupervised Learning,"Restricted Boltzmann machines RBMs are powerful machine learning models but
2665learning and some kinds of inference in the model require sampling-based
2666approximations which in classical digital computers are implemented using
2667expensive MCMC. Physical computation offers the opportunity to reduce the cost
2668of sampling by building physical systems whose natural dynamics correspond to
2669drawing samples from the desired RBM distribution. Such a system avoids the
2670burn-in and mixing cost of a Markov chain. However hardware implementations of
2671this variety usually entail limitations such as low-precision and limited range
2672of the parameters and restrictions on the size and topology of the RBM. We
2673conduct software simulations to determine how harmful each of these restrictions
2674is. Our simulations are designed to reproduce aspects of the D-Wave Two
2675computer but the issues we investigate arise in most forms of physical
2676computation.
2677Our findings suggest that designers of new physical computing hardware and
2678algorithms for physical computers should concentrate their efforts on overcoming
2679the limitations imposed by the topology restrictions of currently existing
2680physical computers."
2681SLE Signed Laplacian Embedding for Supervised Dimension Reduction,Chen Gong Dacheng Tao Jie Yang and Keren Fu,Novel Machine Learning Algorithms NMLA ,"Dimension reduction
2682Manifold learning
2683Signed graph Laplacian",NMLA Dimension Reduction/Feature Selection,Manifold learning is a powerful tool for solving nonlinear dimension reduction problems. By assuming that the high-dimensional data usually lie on a low-dimensional manifold many algorithms have been proposed. However most algorithms simply adopt the traditional graph Laplacian to encode the data locality so the discriminative ability is limited and the embedding results are not always suitable for the subsequent classification. Instead this paper deploys the signed graph Laplacian and proposes Signed Laplacian Embedding SLE for supervised dimension reduction. By exploring the label information SLE comprehensively transfers the discrimination carried by the original data to the embedded low-dimensional space. Without perturbing the discrimination structure SLE also retains the locality. Theoretically we prove the immersion property by computing the rank of projection and relate SLE to existing algorithms in the frame of patch alignment. Thorough empirical studies on synthetic and real datasets demonstrate the effectiveness of SLE.
2684Envy-Free Division of Sellable Goods,Jeremy Karp Aleksandr Kazachkov and Ariel Procaccia,Game Theory and Economic Paradigms GTEP ,"Computational social choice
2685Fair division
2686Envy-free allocation","GTEP Auctions and Market-Based Systems
2687GTEP Social Choice / Voting",We study the envy-free allocation of indivisible goods between two players. Our novel setting includes an option to sell each good for a fraction of the minimum value any player has for the good. To rigorously quantify the efficiency gain from selling we reason about the price of envy-freeness of allocations of sellable goods: the ratio between the maximum social welfare and the social welfare of the best envy-free allocation. We show that envy-free allocations of sellable goods are significantly more efficient than their unsellable counterparts.
2688Potential-Aware Imperfect-Recall Abstraction with Earth Mover's Distance in Imperfect-Information Games,Sam Ganzfried and Tuomas Sandholm,Game Theory and Economic Paradigms GTEP ,"Game Theory
2689Multiagent Systems
2690Game Solving
2691Game Abstraction
2692Imperfect Information
2693Poker","GTEP Game Theory
2694GTEP Imperfect Information",There is often a large disparity between the size of a game we wish to solve and the size of the largest instances solvable by the best algorithms; for example a popular variant of poker has about $10^{165}$ nodes in its game tree while the currently best approximate equilibrium-finding algorithms scale to games with around $10^{12}$ nodes. In order to approximate equilibrium strategies in these games the leading approach is to create a sufficiently small strategic approximation of the full game called an abstraction and to solve that smaller game instead. The leading abstraction algorithm for imperfect-information games generates abstractions that have imperfect recall and are distribution aware using $k$-means with the earth mover's distance metric to cluster similar states together. A distribution-aware abstraction groups states together at a given round if their full distributions over future strength are similar (as opposed to for example just the expectation of their strength). The leading algorithm considers distributions over future strength at the final round of the game. However one might benefit by considering the distribution over strength in all future rounds not just the final round. An abstraction algorithm that takes all future rounds into account is called potential aware. We present the first algorithm for computing potential-aware imperfect-recall abstractions using earth mover's distance. Experiments on no-limit Texas Hold'em show that our algorithm improves performance over the previously best approach.
2695Partial Multi-View Clustering,Shao-Yuan Li Yuan Jiang and Zhi-Hua Zhou,"Machine Learning Applications MLA
2696Novel Machine Learning Algorithms NMLA ","machine learning
2697unsupervised learning
2698multi-view clustering
2699partial view","MLA Applications of Unsupervised Learning
2700MLA Machine Learning Applications General/other
2701NMLA Clustering
2702NMLA Unsupervised Learning Other
2703NMLA Machine Learning General/other ",Real data often have multiple modalities or come from multiple channels and multi-view clustering provides a natural formulation for generating clusters from such data. Previous studies assumed that each example appears in all views or that at least one view contains all examples. In real tasks however it is often the case that every view is missing some data which results in many partial examples i.e. examples with some views missing. In this paper we present possibly the first study on partial multi-view clustering. Our proposed approach PVC works by establishing a latent subspace where the instances corresponding to the same example in different views are close to each other and the instances belonging to different examples in the same view are gathered smoothly. Experiments demonstrate the advantages of our proposed approach.
2704User Intent Identification from Online Discussions using a Joint Aspect-Action Topic Model,Ghasem Heyrani Nobari and Chua Tat Seng,"AI and the Web AIW
2705NLP and Machine Learning NLPML
2706NLP and Text Mining NLPTM
2707Novel Machine Learning Algorithms NMLA ","Web
2708Online Discussions
2709Joint Modeling
2710Topic Modeling
2711Information Extraction
2712AI","AIW Enhancing web search and information retrieval
2713AIW Knowledge acquisition from the web
2714AIW Machine learning and the web
2715NLPTM Information Extraction
2716NMLA Classification
2717NMLA Clustering
2718NMLA Data Mining and Knowledge Discovery
2719NMLA Unsupervised Learning Other ",Online discussions are growing as a popular effective and reliable source of information for users because of their liveliness flexibility and up-to-date information. Online discussions are usually developed and advanced by groups of users with various backgrounds and intents. However because of the diversity of topics and issues discussed by the users supervised methods are not able to accurately model such dynamic conditions. In this paper we propose a novel unsupervised generative model to derive aspect-action pairs from online discussions. The proposed method simultaneously captures and models these two features with their relationships that exist in each thread. We assume that each user post is generated by a mixture of aspect and action topics. Therefore we design a model that captures the latent factors that incorporate the aspect types and intended actions which describe how users develop a topic in a discussion. In order to demonstrate the effectiveness of our approach we empirically compare our model against the state-of-the-art methods on a large-scale discussion dataset crawled from Apple discussions with over 3.3 million user posts from 340k discussion threads.
2720Voting with Rank Dependent Scoring Rules,Judy Goldsmith Jerome Lang Nicholas Mattei and Patrice Perny,Game Theory and Economic Paradigms GTEP ,"Computational Social Choice
2721Voting
2722Order Weighted Averages
2723Multi-agent Systems","GTEP Auctions and Market-Based Systems
2724GTEP Social Choice / Voting","Positional scoring rules in voting compute the score of an alternative by summing the
2725scores for the alternative induced by every vote. This summation principle ensures that all
2726votes contribute equally to the score of an alternative.
2727We relax this assumption and instead aggregate scores by taking into account
2728the rank of a score in the ordered list of scores obtained from the votes.
2729This defines a new family of voting rules rank-dependent scoring
2730rules RDSRs based on ordered weighted average OWA operators which include
2731scoring rules plurality k-approval and Olympic
2732averages. We study some properties of these rules and show
2733empirically that certain RDSRs are less manipulable than Borda voting
2734across a variety of statistical cultures."
2735Implementing GOLOG in Answer Set Programming,Malcolm Ryan,"Knowledge Representation and Reasoning KRR
2736Planning and Scheduling PS ","GOLOG
2737Answer Set Programming
2738Planning
2739Search Control","KRR Action Change and Causality
2740KRR Logic Programming
2741PS Planning General/Other ",In this paper we investigate four different approaches to encoding domain-dependent control knowledge for Answer-Set Planning. Starting with a standard implementation of the answer-set planning language B we add control knowledge expressed in the GOLOG logic programming language. A naive encoding following the original definitions of Reiter et al. is shown to scale poorly. We investigate three alternative codings based on the finite-state machine semantics of ConGOLOG. These perform better although there is no clear winner. We discuss the pros and cons of each approach.
2742Agent Behavior Prediction and Its Generalization Analysis,Fei Tian Haifang Li Wei Chen Tao Qin Enhong Chen and Tie-Yan Liu,"Machine Learning Applications MLA
2743Novel Machine Learning Algorithms NMLA ","generalization analysis
2744agent behavior model prediction
2745Markov Chain in Random Environments
2746empirical risk minimization algorithm","MLA Environmental
2747MLA Humanities
2748MLA Applications of Supervised Learning
2749NMLA Learning Theory",Machine learning algorithms have been applied to predict agent behaviors in real-world dynamic systems such as advertiser behaviors in sponsored search and worker behaviors in crowdsourcing and the prediction models have been used for the optimization of these systems. Note that the behavior data in these systems are generated by live agents: once the systems change due to the adoption of the prediction models learnt from the behavior data the agents will observe these changes directly or indirectly and respond by changing their own behaviors accordingly. As a result the behavior data will evolve and will not be identically and independently distributed which poses great challenges to the theoretical analysis of the machine learning algorithms for behavior prediction. To tackle this challenge in this paper we propose to use Markov Chain in Random Environments MCRE to describe the behavior data and perform generalization analysis of the machine learning algorithms on its basis. However since the one-step transition probability matrix of MCRE depends on both previous states and the random environment conventional techniques for generalization analysis cannot be directly applied. To address this issue we propose a novel technique that transforms the original MCRE into a higher-dimensional time-homogeneous Markov chain. The new Markov chain involves more variables but is more regular and thus easier to deal with. We prove the convergence of the new Markov chain when time approaches infinity. Then we prove a generalization bound for the machine learning algorithms on the behavior data generated by the new Markov chain which depends on both the Markovian parameters and the covering number of the function class compounded by the loss function for behavior prediction and the behavior prediction model. To the best of our knowledge this is the first work that performs the generalization analysis on data generated by complex processes in real-world dynamic systems.
2750Sparse Learning for Stochastic Composite Optimization,Weizhong Zhang Lijun Zhang Yao Hu Rong Jin Deng Cai and Xiaofei He,Heuristic Search and Optimization HSO ,"Stochastic Optimization
2751Online Learning
2752Composite Gradient Mapping
2753Stochastic Gradient Descent",HSO Optimization,In this paper we focus on sparse learning for Stochastic Composite Optimization SCO. Many algorithms have been proposed for SCO and they have recently reached the optimal convergence rate $\mathcal{O}(1/T)$. However the sparsity of the solutions obtained by the existing methods is unsatisfactory mainly due to two reasons: (1) taking the average of the intermediate solutions as the final solution and (2) the reduction of the magnitude of the sparse regularizer over the iterations. In order to improve the sparse pattern of the solutions we propose a simple but effective stochastic optimization scheme by adding a novel sparse online-to-batch conversion to the traditional algorithms for SCO. Two specific approaches are discussed in this paper to reveal the power of our scheme. The theoretical analysis shows that our scheme can find a solution with a better sparse pattern without affecting the current optimal convergence rate. Experimental results on both synthetic and real-world data sets show that our proposed methods have clearly superior sparse recovery ability and a convergence rate comparable to that of the state-of-the-art algorithms for SCO.
2754Cross-lingual Knowledge Validation Based Taxonomy Derivation from Heterogeneous Online Wikis,Zhigang Wang Juanzi Li Shuangjie Li Mingyang Li and Jie Tang,"AI and the Web AIW
2755Knowledge Representation and Reasoning KRR
2756NLP and Knowledge Representation NLPKR ","Taxonomy Derivation
2757Knowledge Validation
2758Cross Lingual
2759Online Wikis","AIW Knowledge acquisition from the web
2760AIW Languages tools and methodologies for representing managing and visualizing semantic web data
2761AIW Ontologies and the web creation extraction evolution mapping merging and alignment; tags and folksonomies
2762KRR Knowledge Acquisition
2763KRR Ontologies
2764NLPKR Ontology Induction","Creating knowledge bases based on the crowd-sourced wikis like Wikipedia has attracted significant research interest in the field of intelligent Web. However the derived taxonomies usually contain many mistakenly imported taxonomic relations due to the difference between the user-generated subsumption relations and the semantic taxonomic relations. Current approaches to solving the problem still suffer from the following issues: (i) the heuristic-based methods strongly rely on specific language-dependent rules and (ii) the corpus-based methods depend on a large-scale high-quality corpus which is often unavailable.
2765
2766In this paper we formulate the cross-lingual taxonomy derivation problem as the problem of cross-lingual taxonomic relation prediction. We investigate different linguistic heuristics and language-independent features and propose a cross-lingual knowledge validation based dynamic adaptive boosting model to iteratively reinforce the performance of taxonomic relation prediction. The proposed approach successfully overcomes the above issues and experiments show that our approach significantly outperforms the designed state-of-the-art comparison methods."
2767A Relevance-Based Compilation Method for Conformant Probabilistic Planning,Ran Taig and Ronen I. Brafman,Planning and Scheduling PS ,"Planning under uncertainty.
2768Conformant probabilistic planning.
2769Relevance based method.
2770compilation based approach.","PS Probabilistic Planning
2771PS Planning General/Other ","Conformant probabilistic planning CPP differs from conformant planning CP in two key elements: the initial belief state is probabilistic
2772and the conformant plan must achieve the goal with probability $\geq\theta$ for some $0<\theta\leq 1$.
2773Taig and Brafman observed that one can reduce CPP to CP by finding a set of initial states whose probability $\geq\theta$ for which
2774a conformant plan exists. Previous solvers based on this idea used the underlying planner to select this set of states and to plan for them simultaneously.
2775Here we suggest an alternative approach: our planner starts with a separate preprocessing relevance analysis phase that determines a promising set of initial states on which to focus and then calls an off-the-shelf conformant planner to solve the resulting problem.
2776This approach has three major advantages. First we can introduce specific and efficient relevance reasoning techniques for introducing the set of initial states rather than depend on
2777the heuristic function used by the planner. Second we can benefit from various optimizations used by existing conformant planners which are unsound if applied to the original
2778CPP. Finally we have the freedom to select among different existing CP solvers. Consequently the new planner dominates previous solvers on almost all domains and scales to instances that were not solved before."
2779Incentivizing High-quality Content from Heterogeneous Users On the Existence of Nash Equilibrium,Yingce Xia Tao Qin and Tie-Yan Liu,"Game Theory and Economic Paradigms GTEP
2780Multiagent Systems MAS ","User generated content
2781Heterogeneous users
2782Equilibrium analysis","GTEP Game Theory
2783GTEP Equilibrium
2784GTEP Imperfect Information
2785MAS Mechanism Design",In this paper we study the existence of pure Nash equilibrium PNE for the mechanisms used in Internet services e.g. online reviews and question-answer websites to incentivize users to generate high-quality content. Most existing work assumes that users are homogeneous and have the same ability. However real-world users are heterogeneous and their abilities can be very different from each other due to their diverse background culture and profession. In this work we consider heterogeneous users with the following framework: (1) the users are heterogeneous and each of them has a private type indicating the best quality of the content she can generate; (2) there is a fixed amount of reward to allocate to the participating users. Under this framework we study the existence of pure Nash equilibrium of several mechanisms composed of different allocation rules action spaces and information settings. We prove the existence of PNE for some mechanisms and the non-existence of PNE for others. We also discuss how to find a PNE for those mechanisms with PNE either constructively or through a search algorithm.
2786DJAO A Communication-Constrained DCOP algorithm that combines features of ADOPT and Action-GDL,Yoonheui Kim and Victor Lesser,"Heuristic Search and Optimization HSO
2787Search and Constraint Satisfaction SCS ","Distributed Problem Solving
2788DCOP
2789AND/OR search","HSO Heuristic Search
2790HSO Optimization
2791HSO Distributed Search
2792MAS Coordination and Collaboration
2793MAS Distributed Problem Solving
2794SCS Constraint Optimization
2795SCS Distributed CSP/Optimization","In this paper we propose a novel DCOP algorithm called DJAO that is able to
2796efficiently find a solution with low communication overhead; this algorithm can be used for optimal and bounded approximate solutions by appropriately setting the error bounds. Our approach builds
2797 on distributed junction trees used in Action-GDL to represent independence relations
2798among variables. We construct an AND/OR search space based on these junction trees.
2799This new type of search space results in higher degrees for each OR node consequently yielding a more efficient search graph in the distributed settings. DJAO uses a branch-and-bound search algorithm to find solutions within this search graph in a distributed manner. We introduce a heuristic to compute the upper and lower bound estimates that the search starts with which is integral to our approach for reducing communication overhead. We empirically evaluate our approach in various settings."
2800Oversubscription Planning Complexity and Compilability,Meysam Aghighi and Peter Jonsson,Planning and Scheduling PS ,"oversubscription planning
2801computational complexity
2802compilability","PS Deterministic Planning
2803PS Planning General/Other ","Many real-world planning problems are oversubscription problems where all
2804goals are not simultaneously achievable and the planner needs to find a
2805feasible subset. We present complexity results for the so-called partial satisfaction
2806and net benefit problems under various restrictions; this extends previous work
2807by van den Briel et al. Our results reveal strong connections between these
2808problems and with classical planning. We also present a method for efficiently
2809compiling oversubscription problems into the ordinary plan existence problem;
2810this generalizes previous work by Keyder & Geffner."
2811Monte Carlo Filtering using Kernel Embedding of Distributions,Motonobu Kanagawa Yu Nishiyama Arthur Gretton and Kenji Fukumizu,Novel Machine Learning Algorithms NMLA ,"kernel method
2812kernel embedding of distributions
2813Monte Carlo filtering
2814state-space model",NMLA Kernel Methods,Recent advances of kernel methods have yielded the framework for representing probabilities using a reproducing kernel Hilbert space called kernel embedding of distributions. In this paper we propose a Monte Carlo filtering algorithm based on kernel embeddings. The proposed method is applied to state-space models where sampling from the transition model is possible while the observation model is to be learned from training samples without assuming a parametric model. We derive convergence rates for the sampling method introduced to the kernel embedding approach. Experimental results on synthetic models and a real vision-based robot localization problem confirm the effectiveness of the proposed approach.
2815New Models for Competitive Contagion,Moez Draief Hoda Heidari and Michael Kearns,Game Theory and Economic Paradigms GTEP ,"Competitive Contagion
2816Networks
2817Connectivity
2818Endogenous budgets","GTEP Game Theory
2819GTEP Equilibrium","In this paper we introduce and examine two new and natural models for competitive contagion in networks a game-theoretic generalization of the viral marketing problem. In our setting firms compete to maximize their market share in a network of consumers whose adoption decisions are stochastically determined by the choices of their neighbors.
2820
2821Building on the switching-selecting framework introduced by Goyal and Kearns 2012 we first introduce a new model in which the payoff to firms comprises not only the number of vertices who adopt their competing technologies but also the network connectivity among those nodes. For a general class of stochastic dynamics driving the local adoption process we derive upper bounds for 1 the pure strategy Price of Anarchy PoA which measures the inefficiency of resource use at equilibrium and 2 the Budget Multiplier which captures the extent to which the network amplifies the imbalances in the firms' initial budgets. These bounds depend on the firm budgets and the maximum degree of the network but no other structural properties. In addition we give general conditions under which the PoA and the Budget Multiplier can be unbounded.
2822
2823We also introduce a model in which budgeting decisions are endogenous rather than externally given as is typical in the viral marketing problem. In this setting the firms are allowed to choose the number of seeds to initially infect at a fixed cost per seed as well as which nodes to select as seeds. In sharp contrast to the results of Goyal and Kearns 2012 we show that for almost any local adoption dynamics there exists a family of graphs for which the PoA and Budget Multiplier are unbounded."
2824Hybrid Heterogeneous Transfer Learning through Deep Learning,Joey Tianyi Zhou Sinno Jialin Pan Ivor W. Tsang and Yan Yan,Novel Machine Learning Algorithms NMLA ,"transfer learning
2825domain adaptation
2826deep learning",NMLA Transfer Adaptation Multitask Learning,Most previous heterogeneous transfer learning methods learn a cross-domain feature mapping between heterogeneous feature spaces based on a few cross-domain instance-correspondences and these corresponding instances are assumed to be representatives in the source and target domains respectively. However in many real-world scenarios this assumption may not hold. As a result the feature mapping may not be constructed precisely or the mapped data from the source or target domain may suffer from a data or feature shift issue in the target or source domain. In this case a classifier trained on the labeled transformed-source-domain data may not be useful for the target domain. In this paper we present a new transfer learning framework called Hybrid Heterogeneous Transfer Learning HHTL which allows choosing the corresponding instances to be biased in either the source or target domain. Moreover we propose a deep learning approach to learn better feature representations of each domain data to reduce the data shift issue and a better feature mapping between cross-domain heterogeneous features for the accurate transfer simultaneously. Extensive experiments on several multilingual sentiment classification tasks verify the effectiveness of our proposed approach compared with the baseline methods.
2827Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs,Nguyen Duc Thien William Yeoh Hoong Chuin Lau Shlomo Zilberstein and Chongjie Zhang,"Multiagent Systems MAS
2828Search and Constraint Satisfaction SCS ","Distributed Constraint Optimization Problems
2829DCOPs
2830Reinforcement Learning","MAS Coordination and Collaboration
2831MAS Distributed Problem Solving
2832MAS Multiagent Learning
2833SCS Distributed CSP/Optimization",Researchers have introduced the Dynamic Distributed Constraint Optimization Problem Dynamic DCOP formulation to model dynamically changing multi-agent coordination problems where a dynamic DCOP is a sequence of static canonical DCOPs each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps which might not hold in some applications. Therefore in this paper we make the following contributions i We introduce a new model called Markovian Dynamic DCOPs MD-DCOPs where the DCOP in the next time step is a function of the value assignments in the current time step; ii We introduce two distributed reinforcement learning algorithms the Distributed RVI Q-learning algorithm and the Distributed R-learning algorithm that balance exploration and exploitation to solve MD-DCOPs in an online manner; and iii We empirically evaluate them against an existing multi-arm bandit DCOP algorithm on dynamic DCOPs.
2834Approximate Equilibrium and Incentivizing Social Coordination,Elliot Anshelevich and Shreyas Sekar,Game Theory and Economic Paradigms GTEP ,"Approximate Nash Equilibrium
2835Coordination Games
2836Price of Stability
2837Price of Anarchy
2838Network Games","GTEP Game Theory
2839GTEP Coordination and Collaboration
2840GTEP Equilibrium",We study techniques to incentivize self-interested agents to form socially desirable solutions in scenarios where they benefit from mutual coordination. Towards this end we consider coordination games where agents have different intrinsic preferences but they stand to gain if others choose the same strategy as them. For non-trivial versions of our game stable solutions like Nash Equilibrium may not exist or may be socially inefficient even when they do exist. This motivates us to focus on designing efficient algorithms to compute almost stable solutions like Approximate Equilibrium that can be realized if agents are provided some additional incentives. Alternatively approximate stability corresponds to the addition of a switching cost that agents have to pay in order to deviate. Our results apply in many settings like adoption of new products project selection and group formation where a central authority can direct agents towards a strategy but agents may defect if they have better alternatives. We show that for any given instance we can either compute a high quality approximate equilibrium or a near-optimal solution that can be stabilized by providing a small fraction of the social welfare to all players. Our results imply that little influence is necessary in order to ensure that selfish players coordinate and form socially efficient solutions.
2841Qualitative Reasoning with Modelica Models,Matthew Klenk Daniel Bobrow Johan De Kleer and Bill Janssen,"Applications APP
2842Knowledge Representation and Reasoning KRR ","qualitative reasoning
2843design
2844Modelica
2845model-based reasoning","APP Other Applications
2846KRR Knowledge Representation Languages
2847KRR Qualitative Reasoning",Applications of qualitative reasoning to engineering design face a knowledge acquisition challenge. Designers are not fluent in qualitative modeling languages and techniques. To overcome this barrier we perform qualitative simulation using models solely written in Modelica a popular language among designers for modeling hybrid systems. This paper makes two contributions: (1) a formalization of the relationship between the results of the Modelica and qualitative simulations for the same model along with a novel algorithm for computing the consequences of events in qualitative simulation and (2) three classes of additional constraints that reduce the number of unrealizable trajectories when performing qualitative simulation with Modelica models. We support these contributions with examples and a case study that shows a reduction by a factor of six in the size of the qualitative simulation.
2848Evaluating Trauma Patients Addressing Missing Covariates Using Joint Optimization,Alex Van Esbroeck Satinder Singh Ilan Rubinfeld and Zeeshan Syed,Machine Learning Applications MLA ,"missing values
2849classification
2850trauma","MLA Bio/Medicine
2851MLA Applications of Supervised Learning",Missing values are a common problem when applying classification algorithms to real-world medical data. This is especially true for trauma patients where clinical variables frequently go uncollected due to the severity and urgency of their condition. Standard approaches to handling missingness first learn a model to estimate missing data values and subsequently train and evaluate a classifier using data imputed with this model. Recently several works have demonstrated the benefits of jointly estimating the imputation model and classifier parameters. However existing approaches make assumptions that limit their utility with many real-world medical datasets particularly that data elements are missing at random which is often invalid. We present a novel approach to jointly learning the imputation model and classifier. Unlike existing methods the proposed approach makes no assumptions about the missingness of the data can be used with arbitrary probabilistic data models and classification loss functions and can be used when both training and testing data have missing values. We investigate the approach's utility in the prediction of several patient outcomes in a large national registry of trauma patients and find that it significantly outperforms standard sequential methods.
2852Tightening Bounds for Bayesian Network Structure Learning,Xiannian Fan Changhe Yuan and Brandon Malone,Reasoning under Uncertainty RU ,"Bayesian network
2853structure learning
2854lower and upper bounds","NMLA Graphical Model Learning
2855RU Bayesian Networks",A recent breadth-first branch and bound algorithm BFBnB for learning Bayesian network structures Malone et al. 2011 uses two bounds to prune the search space for better efficiency; one is a lower bound calculated from pattern database heuristics and the other is an upper bound obtained by a hill climbing search. Whenever the lower bound of a search path exceeds its upper bound the path is guaranteed to lead to suboptimal solutions and is discarded immediately. This paper introduces methods for tightening the bounds. The lower bound is tightened by using more informed variable groupings in creating the pattern databases and the upper bound is tightened using an anytime learning algorithm. Empirical results show that these bounds improve the efficiency of Bayesian network learning by two to three orders of magnitude.
2856Pathway Specification and Comparative Queries A High Level Language with Petri Net Semantics,Saadat Anwar and Chitta Baral,Knowledge Representation and Reasoning KRR ,"Biological pathways
2857Comparative queries
2858Question answering","APP Biomedical / Bioinformatics
2859KRR Knowledge Representation Languages",Understanding biological pathways is an important activity in the biological domain for drug development. Due to the parallelism and complexity inherent in pathways computer models that can answer queries about pathways are needed. A researcher may ask 'what-if' questions comparing alternate scenarios that require deeper understanding of the underlying model. In this paper we present an overview of such a system that we developed and an English-like high level language to express pathways and queries. Our language is inspired by high level action and query languages and it uses Petri Net execution semantics.
2860Solving the Inferential Frame Problem in the General Game Description Language,Javier Romero Abdallah Saffidine and Michael Thielscher,"Game Playing and Interactive Entertainment GPIE
2861Knowledge Representation and Reasoning KRR ","General game playing
2862Inferential frame problem
2863Game description language","GPIE General Game Playing
2864KRR Action Change and Causality",The Game Description Language GDL is the standard input language for general game-playing systems. While players can gain a lot of traction by an efficient inference algorithm for GDL state-of-the-art reasoners suffer from a variant of a classical KR problem the inferential frame problem. We present a method by which general game players can transform any given game description into a representation that solves this problem. Our experimental results demonstrate that with the help of automatically generated domain knowledge a significant speedup can thus be obtained for the majority of the game descriptions from the AAAI competition.
2865Testable Implications of Linear Structural Equation Models,Bryant Chen Judea Pearl and Jin Tian,"Knowledge Representation and Reasoning KRR
2866Reasoning under Uncertainty RU ","structural equation models
2867causality
2868causal models
2869linear
2870testable implications
2871constraints
2872verma constraints
2873overidentifying constraints
2874overidentifying restrictions
2875graphical models","KRR Action Change and Causality
2876RU Graphical Models Other ",In causal inference all methods of model learning rely on testable implications namely properties of the joint distribution that are dictated by the model structure. These constraints if not satisfied in the data allow us to reject or modify the model. Most common methods of testing a linear structural equation model SEM rely on the likelihood ratio or chi-square test which simultaneously tests all of the restrictions implied by the model. Local constraints on the other hand offer increased power Bollen and Pearl 2013; McDonald 2002 and in the case of failure provide the modeler with insight for revising the model specification. One strategy of uncovering local constraints in linear SEMs is to search for overidentified path coefficients. While these overidentifying constraints are well known no method has been given for systematically discovering them. In this paper we extend the half-trek criterion of Foygel et al. 2012 to identify a larger set of structural coefficients and use it to systematically discover overidentifying constraints. Still open is the question of whether our algorithm is complete.
2877Manifold Spanning Graphs,Cj Carey and Sridhar Mahadevan,Novel Machine Learning Algorithms NMLA ,"graph construction
2878manifold learning
2879topology","NMLA Dimension Reduction/Feature Selection
2880NMLA Unsupervised Learning Other ",Graph construction is the essential first step for nearly all manifold learning algorithms. While many applications assume that a simple k-nearest or epsilon-close neighbors graph will accurately model the topology of the underlying manifold these methods often require expert tuning and may not produce high quality graphs. In this paper the hyperparameter sensitivity of existing graph construction methods is demonstrated. We then present a new algorithm for unsupervised graph construction based on minimal assumptions about the input data and its manifold structure. Notably this method requires no hyperparameter tuning.
2881Learning Relative Similarity by Stochastic Dual Coordinate Ascent,Pengcheng Wu Yi Ding Peilin Zhao Steven C.H. Hoi and Chunyan Miao,"Machine Learning Applications MLA
2882Novel Machine Learning Algorithms NMLA ","distance metric learning
2883similarity learning
2884online learning
2885retrieval","MLA Applications of Supervised Learning
2886MLA Machine Learning Applications General/other
2887NMLA Online Learning
2888NMLA Supervised Learning Other
2889VIS Image and Video Retrieval
2890VIS Statistical Methods and Learning",Learning relative similarity from pairwise instances is an important problem in machine learning and potentially very useful for many applications such as image and text retrieval. Despite being studied for years some existing methods solved by Stochastic Gradient Descent SGD techniques generally suffer from slow convergence. In this paper we investigate the application of Stochastic Dual Coordinate Ascent SDCA technique to tackle the optimization task of relative similarity learning by extending from vector to matrix parameters. Theoretically we prove the optimal linear convergence rate for the proposed SDCA algorithm beating the well-known sublinear convergence rate by the previous best metric learning algorithms. Empirically we conduct extensive experiments on both standard and large-scale data sets to validate the effectiveness of the proposed algorithm for retrieval tasks.
2891locality preserving projection via multi-objective learning for domain adaptation,Le Shu Tianyang Ma and Longin Latecki,"Knowledge Representation and Reasoning KRR
2892Machine Learning Applications MLA
2893Novel Machine Learning Algorithms NMLA ","locality preserving projection
2894multi-objective learning
2895domain adaptation","KRR Knowledge Representation General/Other
2896NMLA Transfer Adaptation Multitask Learning",In many practical cases we need to generalize a model trained in a source domain to a new target domain. However the distributions of these two domains may differ significantly and some crucial target features may not have support in the source domain. This paper proposes a novel locality preserving projection method for the domain adaptation task which finds a linear mapping that preserves the 'intrinsic structure' of both the source and the target domain. In this work we first construct two graphs encoding the neighborhood information for the source domain and target domain separately. We then try to find linear projection coefficients which have the property of locality preserving for each graph. Instead of combining the two objective functions under a compatibility assumption and requiring the user to decide the importance of each objective function we propose a multi-objective formulation for this problem and solve it simultaneously using Pareto optimization. The Pareto frontier captures all possible good linear projection vectors that are preferred by one or more objectives. The effectiveness of our approach is justified by both theoretical analysis and empirical results on real world data sets. The new feature representation shows better prediction accuracy as our experiments demonstrate.
2897Internally Stable Kidney Exchange,Yicheng Liu Pingzhong Tang and Wenyi Fang,Multiagent Systems MAS ,"Kidney exchange
2898Stable matching
2899Maximum weighted matching","GTEP Auctions and Market-Based Systems
2900MAS Mechanism Design","Stability is a central concept in matching-based mechanism design. It imposes a fundamental requirement that no subset of agents could beneficially deviate from the prescribed outcome. However deployment of stability in the current design of kidney exchange mechanisms presents at least two challenges. First it reduces social welfare of the mechanism and sometimes prevents the mechanism from producing any outcome at all. Second it sometimes incurs computation cost to clear the mechanism.
2901
2902In this paper we propose an alternative notion of stability. Our theoretical and experimental studies demonstrate that the new notion of stability addresses both challenges above and could be deployed in the current kidney exchange design."
2903Intra-view and Inter-view Supervised Correlation Analysis for Multi-view Feature Learning,Xiao-Yuan Jing Rui-Min Hu Yang-Ping Zhu Shan-Shan Wu Chao Liang and Jing-Yu Yang,Novel Machine Learning Algorithms NMLA ,"Canonical correlation analysis CCA
2904Multi-view supervised feature learning
2905Inter-view and intra-view supervised correlation analysis I2SCA
2906Analytical solution
2907Kernelized extension","NMLA Dimension Reduction/Feature Selection
2908NMLA Supervised Learning Other
2909VIS Statistical Methods and Learning",An object can often be observed from multiple views and multi-view feature learning is an attractive research topic with great practical success. Canonical correlation analysis CCA has become an important technique in multi-view learning since it can fully utilize the inter-view correlation. In this paper we mainly study the CCA based multi-view supervised feature learning technique where the labels of training samples are known. Several supervised CCA based multi-view methods have been presented which focus on investigating the supervised correlation across different views. However they take no account of the intra-view correlation between samples. Researchers have also introduced the discriminant analysis technique into multi-view feature learning such as multi-view discriminant analysis MvDA. But they ignore the canonical correlation within each view and between all views. In this paper we propose a novel multi-view feature learning approach based on intra-view and inter-view supervised correlation analysis I2SCA which can explore the useful correlation information of samples within each view and between all views. The objective function of I2SCA is designed to simultaneously extract the discriminatingly correlated features from both inter-view and intra-view. It can obtain an analytical solution without iterative calculation. We also provide a kernelized extension of I2SCA to tackle the linearly inseparable problem in the original feature space. Three widely-used datasets are employed as test data. Experimental results demonstrate that our proposed approaches outperform several representative multi-view supervised feature learning methods.
2910Finding Median Point-Set Using Earth Mover's Distance,Hu Ding and Jinhui Xu,"Heuristic Search and Optimization HSO
2911Machine Learning Applications MLA ","Prototype learning
2912Earth mover's distance
2913Optimization","HSO Heuristic Search
2914HSO Optimization
2915HSO Evaluation and Analysis Search and Optimization
2916MLA Applications of Unsupervised Learning
2917MLA Machine Learning Applications General/other
2918NMLA Evaluation and Analysis Machine Learning
2919NMLA Unsupervised Learning Other
2920NMLA Machine Learning General/other ",In this paper we study a prototype learning problem called Median Point-Set whose objective is to construct a prototype for a set of given point-sets so as to minimize the total earth mover's distances EMD between the prototype and the point-sets where EMD between two point-sets is measured under affine transformation. For this problem we present the first purely geometric approach. Compared to existing graph-based approaches e.g. median graph shock graph our approach has several unique advantages 1 No encoding and decoding procedures are needed to map between objects and graphs which avoids errors caused by information loss during the mappings; 2 Staying only in the geometric domain makes our approach computationally more efficient and robust to noise. As a key ingredient of our approach we present the first quality guaranteed algorithm for minimizing EMD between two point-sets under affine transformation. We evaluate the performance of our technique for prototype reconstruction on a random dataset and a benchmark dataset of handwritten Chinese characters. Experiments suggest that our technique considerably outperforms the existing graph-based methods.
2921Preprocessing for Propositional Model Counting,Jean Marie Lagniez and Pierre Marquis,Search and Constraint Satisfaction SCS ,"model counting
2922preprocessing
2923propositional
2924compilation
2925direct model counting","SCS SAT and CSP Solvers and Tools
2926SCS Constraint Satisfaction General/other ",This paper is concerned with preprocessing techniques for propositional model counting. We have implemented a preprocessor which includes many elementary preprocessing techniques including occurrence reduction vivification backbone identification as well as equivalence AND and XOR gate identification and replacement. We performed intensive experiments using a huge number of benchmarks coming from a large number of families. Two approaches to model counting have been considered downstream 'direct' model counting using Cachet and compilation-based model counting based on the C2D compiler. The experimental results we have obtained show that our preprocessor is both efficient and robust.
2927Propagating Regular Counting Constraints,Nicolas Beldiceanu Pierre Flener Justin Pearson and Pascal Van Hentenryck,Search and Constraint Satisfaction SCS ,"constraints over a sequence
2928finite automata
2929counting
2930propagators
2931domain consistency","SCS Constraint Satisfaction
2932SCS Global Constraints",Constraints over finite sequences of variables are ubiquitous in sequencing and timetabling. This led to general modeling techniques and generic propagators often based on deterministic finite automata DFA and their extensions. We consider counter-DFAs cDFA which provide concise models for regular counting constraints that is constraints over the number of times a regular-language pattern occurs in a sequence. We show how to enforce domain consistency in polynomial time for atmost and atleast regular counting constraints based on the frequent case of a cDFA with only accepting states and a single counter that can be increased by transitions. We also show that the satisfaction of exact regular counting constraints is NP-hard and that an incomplete propagator for exact regular counting constraints is faster and provides more pruning than the existing propagator from Beldiceanu Carlsson and Petit 2004. Finally by avoiding the unrolling of the cDFA used by COSTREGULAR the space complexity reduces from O(n|\Sigma||Q|) to O(n(|\Sigma|+|Q|)) where \Sigma is the alphabet and Q the state set of the cDFA.
2933SOML Sparse Online Metric Learning with Application to Image Retrieval,Xingyu Gao Steven C.H. Hoi Yongdong Zhang Ji Wan and Jintao Li,Machine Learning Applications MLA ,"Sparse Online Learning
2934Distance Metric Learning
2935Image Retrieval",MLA Applications of Supervised Learning,"Image similarity search plays a key role in many multimedia
2936applications where multimedia data such as images and videos are
2937usually represented in high-dimensional feature space. In this
2938paper we propose a novel Sparse Online Metric Learning SOML
2939scheme for learning sparse distance functions from large-scale
2940high-dimensional data and explore its application to image
2941retrieval. In contrast to many existing distance metric learning
2942algorithms that are often designed for low-dimensional data the
2943proposed algorithms are able to learn sparse distance metrics from
2944high-dimensional data in an efficient and scalable manner. Our
2945experimental results show that the proposed method achieves better
2946or at least comparable accuracy performance than the
2947state-of-the-art non-sparse distance metric learning approaches but
2948enjoys a significant advantage in computational efficiency and
2949sparsity making it more practical for real-world applications."
2950Strategyproof exchange with multiple private endowments,Taiki Todo Haixin Sun and Makoto Yokoo,Game Theory and Economic Paradigms GTEP ,"Mechanism design
2951Exchange
2952Manipulation
2953Strategyproofness","GTEP Auctions and Market-Based Systems
2954GTEP Game Theory
2955MAS E-Commerce
2956MAS Mechanism Design","We study a mechanism design problem for exchange economies where each agent is initially endowed with a set of indivisible goods and side payments are not allowed. We assume each agent can withhold some endowments as well as misreport her preference. Under this assumption strategyproofness requires that for each agent reporting her true preference while revealing all her endowments is a dominant strategy and thus implies individual rationality.
2957Our objective in this paper is to analyze the effect of such private ownership in exchange economies with multiple endowments. As fundamental results we first show that the revelation principle holds under a natural assumption and that strategyproofness and Pareto efficiency are incompatible even under the lexicographic preference domain. We then propose a class of exchange rules each of which has a corresponding directed graph to prescribe possible trades and provide a necessary and sufficient condition on the graph structure so that the rule satisfies strategy-proofness."
2958Mechanism design for mobile geo-location advertising,Nicola Gatti Marco Rocco Sofia Ceppi and Enrico H. Gerding,"Game Theory and Economic Paradigms GTEP
2959Multiagent Systems MAS ","Mechanism Design
2960Auctions
2961Computational Advertising
2962Mobile Advertising
2963Game Theory cooperative and non cooperative ","GTEP Auctions and Market-Based Systems
2964GTEP Game Theory
2965MAS Mechanism Design",Mobile geo-location advertising where mobile ads are targeted based on a user's location has been identified as a key growth factor for the mobile market. As with online advertising a crucial ingredient for their success is the development of effective economic mechanisms. An important difference is that mobile ads are shown sequentially over time and information about the user can be learned based on their movements. Furthermore ads need to be shown selectively to prevent ad fatigue. To this end we introduce for the first time a user model and suitable economic mechanisms which take these factors into account. Specifically we design two truthful mechanisms which produce an advertisement plan based on the user's movements. One mechanism is allocatively efficient but requires exponential compute time in the worst case. The other requires polynomial time but is not allocatively efficient. Finally we experimentally evaluate the trade-off between compute time and efficiency of our mechanisms.
2966A Propagator Design Framework for Constraints over Sequences,Jean-Noël Monette Pierre Flener and Justin Pearson,Search and Constraint Satisfaction SCS ,"Constraint Programming
2967Global Constraints
2968Stepwise Refinement
2969Tuple Variables","SCS Constraint Satisfaction
2970SCS Global Constraints",Constraints over variable sequences are ubiquitous and many of their propagators have been inspired by dynamic programming DP . We propose a conceptual framework for designing such propagators pruning rules are refined upon the application of transformation operators to a DP-style formulation of a constraint; a representation of the variable domains is picked; and a coordination of the pruning rules is picked.
2971Regression Model and Privacy Preserved Learning by Matrix Completion,Jinfeng Yi Jun Wang and Rong Jin,"Applications APP
2972Machine Learning Applications MLA
2973Novel Machine Learning Algorithms NMLA ","Regression analysis
2974Privacy
2975Matrix completion","APP Biomedical / Bioinformatics
2976APP Computational Social Science
2977APP Security and Privacy
2978MLA Bio/Medicine
2979MLA Applications of Supervised Learning
2980MLA Machine Learning Applications General/other
2981MAS Mechanism Design
2982NMLA Data Mining and Knowledge Discovery
2983NMLA Feature Construction/Reformulation
2984NMLA Machine Learning General/other
2985RU Uncertainty in AI General/Other ",Sensitive data such as medical records and business reports usually contains valuable information that can be used to build prediction models. However designing learning models by directly using sensitive data might result in severe privacy and copyright issues. In this paper we propose a novel matrix completion based framework that is able to handle two challenging issues simultaneously i recovering missing and noisy sensitive data and ii preserving the privacy of the sensitive data during the learning process. In particular the proposed framework is able to mask the sensitive data while ensuring that the transformed data are still usable for training regression models. We show that two key properties namely model preserving and privacy preserving are satisfied by the transformed data obtained from the proposed framework. In model preserving we guarantee that the linear regression model built from the masked data approximates the regression model learned from the original data in a perfect way. In privacy preserving we ensure that the original sensitive data cannot be recovered since the transformation procedure is irreversible. Given these two characteristics the transformed data can be safely released to any learners for designing prediction models without revealing any private content. Our empirical studies with a synthesized dataset and multiple sensitive benchmark datasets verify our theoretical claim as well as the effectiveness of the proposed framework.
2986A Constructive Argumentation Framework,Souhila Kaci and Yakoub Salhi,,"Logic-based Argumentation
2987Constructive Arguments
2988Intuitionistic Logic",KRR Argumentation,Dung's argumentation framework is an abstract framework based on a set of arguments and a binary attack relation defined over the set. One instantiation among many others of Dung's framework consists in constructing the arguments from a set of propositional logic formulas. Thus an argument is seen as a reason for or against the truth of a particular statement. Despite its advantages the argumentation approach for inconsistency handling also has important shortcomings. More precisely in some applications one is interested not only in the conclusions supported by the arguments but also in the precise explanations of such conclusions. We show that the argumentation framework applied to classical logic formulas is not suitable to deal with this problem. On the other hand intuitionistic logic appears to be a natural alternative candidate logic instead of classical logic to instantiate Dung's framework. We develop a constructive argumentation framework. We show that intuitionistic logic offers nice and desirable properties of the arguments. We also provide a characterization of the arguments in this setting in terms of minimal inconsistent subsets when intuitionistic logic is embedded in the modal logic S4.
2989Point-based POMDP solving with factored value function approximation,Tiago Veiga Matthijs Spaan and Pedro Lima,"Planning and Scheduling PS
2990Reasoning under Uncertainty RU ","POMDP
2991Value function approximation
2992Point-based methods","PS Probabilistic Planning
2993PS Planning General/Other
2994RU Sequential Decision Making
2995RU Uncertainty in AI General/Other ",Partially observable Markov decision processes POMDPs provide a principled mathematical framework for modeling autonomous decision-making problems. POMDP solutions are often represented by a value function comprised of a set of vectors. In the case of factored models the size of these vectors grows exponentially with the number of state factors leading to scalability issues. We consider an approximate value function representation based on a linear combination of basis functions. In particular we present a backup operator that can be used in any point-based POMDP solver. Furthermore we show how independence between observation factors can be exploited for large computational gains. We experimentally verify our contributions and show that they can improve point-based methods in policy quality and solution size.
2996A Multiarmed Bandit Incentive Mechanism for Crowdsourcing Demand Response in Smart Grids,Shweta Jain Balakrishnan Narayanaswamy and Yadati Narahari,"Computational Sustainability and AI CSAI
2997Game Theory and Economic Paradigms GTEP
2998Human-Computation and Crowd Sourcing HCC ","CSAI Modeling the interactions of agents with different and often conflicting interests
2999GTEP Auctions and Market-Based Systems
3000HCC Game-theoretic mechanism design of incentives for motivation and honest reporting","CSAI Modeling the interactions of agents with different and often conflicting interests
3001GTEP Auctions and Market-Based Systems
3002HCC Game-theoretic mechanism design of incentives for motivation and honest reporting","Demand response is a critical part of renewable integration and energy cost reduction goals across the world. Motivated by the need to reduce costs arising from electricity shortage and renewable energy fluctuations we propose a novel multiarmed bandit mechanism for demand response MAB-MDR which makes monetary offers to strategic consumers who have unknown response characteristics. Our work is inspired by connection to and intuition from crowdsourcing mechanisms. The proposed mechanism incorporates realistic features such as a time varying and quadratic cost function. The mechanism marries auctions that allow users to report their preferences with online algorithms that allow distribution companies to learn user-specific parameters. We show that MAB-MDR is dominant strategy incentive compatible individually rational and achieves sublinear regret.
3003Such mechanisms can be effectively deployed in smart grids using new information and control architecture innovations
3004and lead to welcome savings in energy costs."
3005Binary Aggregation by Selection of the Most Representative Voter,Ulle Endriss and Umberto Grandi,"Game Theory and Economic Paradigms GTEP
3006Knowledge Representation and Reasoning KRR
3007Multiagent Systems MAS ","Computational Social Choice
3008Approximation
3009Judgment Aggregation","GTEP Social Choice / Voting
3010KRR Preferences
3011MAS Multiagent Systems General/other ","In binary aggregation a group of voters each express yes/no choices
3012regarding a number of possibly correlated issues and we are asked to
3013decide on a collective choice that accurately reflects the views of this
3014group. A good collective choice will minimise the distance to each of
3015the individual choices but using such a distance-based aggregation rule
3016is computationally intractable. Instead we explore a class of
3017low-complexity aggregation rules that select the most representative
3018voter in any given situation and return that voter's choice as the
3019collective outcome."
3020Pairwise-Covariance Linear Discriminant Analysis,Deguang Kong Chris Ding and Qihe Pan,"Applications APP
3021Machine Learning Applications MLA ","linear discriminant analysis
3022gradient
3023pairwise
3024covariance",,"In machine learning linear discriminant analysis LDA is a
3025popular dimension reduction method. In this paper we first provide
3026a new perspective on LDA from information theory.
3027From this new perspective we propose a new formulation
3028of LDA which uses the pairwise averaged class covariance
3029instead of the globally averaged class covariance used in standard
3030LDA. This pairwise averaged covariance describes data
3031distribution more accurately. The new perspective also provides
3032a natural way to properly weight different pairwise distances
3033which emphasizes the pairs of classes with small distances and
3034this leads to the proposed pairwise covariance properly weighted
3035LDA pcLDA. The kernel version of pcLDA is presented to
3036handle nonlinear projections. Efficient algorithms are presented
3037to compute the proposed models."
3038State Aggregation in Monte Carlo Tree Search,Jesse Hostetler Alan Fern and Tom Dietterich,"Planning and Scheduling PS
3039Reasoning under Uncertainty RU ","markov decision process
3040monte carlo tree search
3041state abstraction","PS Probabilistic Planning
3042RU Sequential Decision Making",Monte Carlo tree search MCTS is a popular class of algorithms for online decision making in large Markov decision processes MDPs . The effectiveness of these algorithms however often deteriorates for MDPs with high stochastic branching factors. In this paper we study state aggregation as a way of reducing stochastic branching in tree search. Prior work has studied formal properties of MDP state aggregation in the context of dynamic programming and reinforcement learning but little attention has been paid to state aggregation in MCTS. Our main contribution is to establish basic results about the optimality-preserving properties of state aggregation for search trees. We then apply these results to show that popular MCTS algorithms such as UCT and sparse sampling can employ fairly coarse state aggregation schemes while retaining their theoretical properties. As a proof of concept we experimentally confirm that state aggregation in MCTS improves finite-sample performance.
3043Saturated Path-Constrained MDP Planning under Uncertainty and Deterministic Model-Checking Constraints,Jonathan Sprauel Andrey Kolobov and Florent Teichteil-Königsbuch,"Planning and Scheduling PS
3044Reasoning under Uncertainty RU ","Safe and Optimal Controller Synthesis
3045Uncertainty and Stochasticity
3046Planning under Uncertainty
3047Model-Checking PCTL Constraints
3048Path-Constrained Markov Decision Processes","PS Markov Models of Environments
3049PS Probabilistic Planning
3050PS Planning General/Other
3051RU Sequential Decision Making
3052SCS Constraint Satisfaction General/other ",In many probabilistic planning scenarios a system's behavior needs to not only maximize the expected utility but also obey certain restrictions. This paper presents Saturated Path-Constrained Markov Decision Processes SPC MDPs a new MDP type for planning under uncertainty with deterministic model-checking constraints e.g. state s must be visited before s' the system must end up in s or the system must never enter s . We present a mathematical analysis of SPC MDPs showing that although SPC MDPs generally have no optimal policies every instance of this class has an epsilon-optimal randomized policy for any epsilon > 0. We propose a dynamic programming-based algorithm for finding such policies and empirically demonstrate this algorithm to be orders of magnitude faster than its next-best alternative.
3053Bounding the Support Size in Extensive Form Games with Imperfect Information,Martin Schmid Matej Moravcik and Milan Hladik,Game Theory and Economic Paradigms GTEP ,"Game theory
3054Nash equilibrium
3055Support
3056Extensive form games
3057Bayesian extensive games
3058Poker
3059Equilibrium preserving transformation","GTEP Game Theory
3060GTEP Equilibrium
3061GTEP Imperfect Information",It's a well known fact that in extensive form games with perfect information there is a Nash equilibrium with support of size one. This doesn't hold for games with imperfect information where the size of minimal support can be larger. We present a dependency between the level of uncertainty and the minimum support size. For many games there is a big disproportion between the game uncertainty and the number of actions available. In Bayesian extensive games with perfect information the only uncertainty is about the type of players. In card games the uncertainty comes from dealing the deck. In these games we can significantly reduce the support size. Our result applies to general-sum extensive form games with any finite number of players.
3062Exploiting Support Sets for Answer Set Programs with External Evaluations,Thomas Eiter Michael Fink Christoph Redl and Daria Stepanova,Knowledge Representation and Reasoning KRR ,"Answer Set Programming
3063External Sources
3064Description Logic Programs","KRR Ontologies
3065KRR Description Logics
3066KRR Knowledge Representation Languages
3067KRR Logic Programming
3068KRR Nonmonotonic Reasoning
3069KRR Knowledge Representation General/Other ",Answer set programs ASP with external evaluations are a declarative means to capture advanced applications. However their evaluation can be expensive due to external source accesses. In this paper we consider hex-programs that provide external atoms as a bidirectional interface to external sources and present a novel evaluation method based on support sets which informally are portions of the input to an external atom that will determine its output for any completion of the partial input. Support sets allow one to shortcut the external source access which can be completely eliminated. This is particularly attractive if a compact representation of suitable support sets is efficiently constructible. We discuss some applications with this property among them description logic programs over DL-Lite ontologies and present experimental results showing that support sets can significantly improve efficiency.
3070Towards Topological-transformation Robust Shape Comparison A Sparse Representation Based Manifold Embedding Approach,Longwen Gao and Shuigeng Zhou,Vision VIS ,"Shape Comparison
3071Manifold Embedding
3072Sparse Representation",VIS Object Recognition,Non-rigid shape comparison based on manifold embedding using Generalized Multidimensional Scaling GMDS has attracted a lot of attention for its high accuracy. However this method requires that the shape surface is not elastic. In other words it is sensitive to topological transformations such as stretching and compressing. To tackle this problem we propose a new approach that constructs a high-dimensional space to embed the manifolds of shapes which can completely withstand rigid transformations and considerably tolerate topological transformations. Experiments on TOSCA shapes validate the proposed approach.
3073Uncorrelated Multi-view Fisher Discrimination Dictionary Learning For Recognition,Xiao-Yuan Jing Rui-Min Hu Fei Wu Xi-Lin Chen Qian Liu and Yong-Fang Yao,Vision VIS ,"Fisher discrimination Dictionary learning FDDL
3074Multi-view FDDL MFDDL
3075Uncorrelated MFDDL UMFDDL
3076Uncorrelated constraint","NMLA Supervised Learning Other
3077VIS Categorization
3078VIS Object Recognition",Dictionary learning DL has now become an important feature learning technique that achieves state-of-the-art recognition performance. Due to the sparse characteristics of data in real-world applications dictionary learning uses a set of learned dictionary bases to represent the linear decomposition of a data point. Fisher discrimination DL FDDL is a representative supervised DL method which constructs a structured dictionary whose atoms correspond to the class labels. Recent years have witnessed a growing interest in multi-view more than two views feature learning techniques. Although some multi-view or multi-modal DL methods have been presented there still exists much room for improvement. How to enhance the total discriminability of dictionaries and reduce their redundancy is a crucial research topic. To boost the performance of multi-view dictionary learning techniques we propose an uncorrelated multi-view Fisher discrimination DL UMFDDL approach for recognition. By making dictionary atoms correspond to the class labels such that the obtained reconstruction error is discriminative UMFDDL aims to jointly learn multiple dictionaries with totally favorable discriminative power. Furthermore we design the uncorrelated constraint for multi-view DL so as to reduce the redundancy among dictionaries learned from different views. Experiments on several public datasets demonstrate the effectiveness of the proposed approach.
3079Semantic Data Representation for Improving Tensor Factorization,Makoto Nakatsuji Yasuhiro Fujiwara Hiroyuki Toda Hiroshi Sawada Jin Zheng and James Hendler,"AI and the Web AIW
3080Novel Machine Learning Algorithms NMLA ","Collaborative Filtering
3081Recommender System
3082Tensor Factorization
3083Rating prediction
3084Bayesian probabilistic tensor factorization
3085Linked Open Data
3086Taxonomy
3087Semantics on the web","AIW Exploiting Linked Open Data
3088AIW Ontologies and the web creation extraction evolution mapping merging and alignment; tags and folksonomies
3089AIW Web-based recommendation systems
3090NMLA Data Mining and Knowledge Discovery
3091NMLA Recommender Systems","Predicting human activities is important for improving recommendation
3092 systems or analyzing social relationships among users. Those human
3093 activities are usually represented as multi-object relationships
3094 e.g. user's tagging activity for items or user's tweeting activity at
3095 some location . Since multi-object relationships are naturally
3096 represented as a tensor tensor factorization is becoming more
3097 important for predicting users' possible activities. However the
3098 prediction accuracy of tensor factorization is weak for ambiguous
3099 and/or sparsely observed objects. Our solution Semantic data
3100 Representation for Tensor Factorization SRTF tackles these problems
3101 by incorporating semantics into tensor factorization based on the
3102 following ideas 1 it links objects to vocabularies/taxonomies and
3103 resolves the ambiguity caused by objects that can be used for multiple
3104 purposes. 2 it links objects to composite classes that merge classes
3105 in different kinds of vocabularies/taxonomies e.g. classes in
3106 vocabularies for movie genres and those for directors to avoid low
3107 prediction accuracy caused by rough-grained semantic space. 3 it
3108 lifts sparsely observed objects into their classes to solve the
3109 sparsity problem for rarely observed objects.
3110 Experiments show that SRTF achieves 10% higher accuracy than current
3111 best methods."
3112Local-To-Global Consistency Implies Tractability of Abduction,Michał Wrona,Knowledge Representation and Reasoning KRR ,"Diagnosis and Abductive Reasoning
3113Spatial and Temporal Reasoning
3114Local Consistency
3115Computational Complexity","KRR Computational Complexity of Reasoning
3116KRR Diagnosis and Abductive Reasoning
3117KRR Geometric Spatial and Temporal Reasoning
3118KRR Nonmonotonic Reasoning
3119KRR Qualitative Reasoning","Abduction is a form of nonmonotonic reasoning that looks for an explanation
3120built from a given set of hypotheses
3121for an observed manifestation according to some knowledge base.
3122Following the concept behind the Schaefer's parametrization
3123CSP Gamma of the Constraint Satisfaction Problem CSP
3124we study here the complexity of the abduction problem
3125Abduction Gamma Hyp M parametrized by certain omega-categorical infinite
3126relational structures Gamma Hyp and M
3127from which a knowledge base hypotheses and a manifestation are built respectively.
3128
3129We say that Gamma has local-to-global consistency if
3130there is k such that establishing strong k-consistency on an instance of CSP Gamma yields a globally consistent
3131 whose every solution may be obtained straightforwardly from partial solutions set of constraints.
3132In this case CSP Gamma is solvable in polynomial time.
3133Our main contribution is an algorithm that under some natural conditions
3134decides Abduction Gamma Hyp M in P when Gamma
3135has local-to-global consistency.
3136
3137As we show in a number of examples our approach offers
3138an opportunity to consider abduction
3139in the context of spatial and temporal reasoning qualitative calculi such as Allen's
3140interval algebra or RCC-5 and that our procedure solves some related abduction problems in polynomial time."
3141Generalized Higher-Order Tensor Decomposition via Parallel ADMM,Fanhua Shang Yuanyuan Liu and James Cheng,Novel Machine Learning Algorithms NMLA ,"Tensor decomposition
3142Higher-order orthogonal iteration
3143Parallel Optimization","NMLA Dimension Reduction/Feature Selection
3144NMLA Feature Construction/Reformulation
3145NMLA Unsupervised Learning Other ",Higher-order tensors are becoming prevalent in many scientific areas such as computer vision social network analysis data mining and neuroscience. Traditional tensor decomposition approaches face three major challenges model selection gross corruptions and computational efficiency. To address these problems we first propose a parallel trace norm regularized tensor decomposition method and formulate it as a convex optimization problem. This method does not require the rank of each mode to be specified beforehand and can automatically determine the number of factors in each mode through our optimization scheme. By considering the low-rank structure of the observed tensor we analyze the equivalent relationship of the trace norm between a low-rank tensor and its core tensor. Then we cast a non-convex tensor decomposition model into a weighted combination of multiple much smaller-scale matrix trace norm minimization problems. Finally we develop two parallel alternating direction methods of multipliers ADMM to solve the proposed problems. Experimental results verify that our regularized formulation is reasonable and our methods are very robust to noise or outliers.
3146Abduction Framework for Repairing Incomplete EL Ontologies Complexity Results and Algorithms,Fang Wei-Kleiner Zlatan Dragisic and Patrick Lambrix,"AI and the Web AIW
3147Knowledge Representation and Reasoning KRR ","ontology debugging
3148ontology engineering
3149description logics","AIW Ontologies and the web creation extraction evolution mapping merging and alignment; tags and folksonomies
3150KRR Ontologies
3151KRR Description Logics
3152KRR Diagnosis and Abductive Reasoning","In this paper we consider the problem of repairing missing is-a relations in ontologies. We formalize the problem as a generalized TBox abduction problem GTAP .
3153Based on this abduction framework we present complexity results for the existence relevance and necessity decision problems for the GTAP with and without some
3154specific preference relations for ontologies that can be represented using a member of the EL family of description logics. Further we present an algorithm for finding solutions a system as well as experiments."
3155Mixing-time Regularized Policy Gradient,Tetsuro Morimura Takayuki Osogami and Tomoyuki Shirai,Novel Machine Learning Algorithms NMLA ,"Reinforcement learning
3156Policy gradient
3157Markov chain mixing time","CS Problem solving and decision making
3158NMLA Reinforcement Learning
3159RU Sequential Decision Making",Policy gradient reinforcement learning PGRL methods have received substantial attention as a means for seeking stochastic policies that maximize a cumulative reward. However PGRL methods can often take a huge number of learning steps before they find a reasonable stochastic policy. This learning speed depends on the mixing time of the Markov chains that are given by the policies that PGRL explores. In this paper we give a new PGRL approach that regularizes the rule of updating the policy with the hitting time that bounds the mixing time. In particular hitting-time regressions based on temporal-difference learning are proposed. The proposed approach will keep the Markov chain compact and can improve the learning efficiency. Numerical experiments show that the proposed method outperforms conventional policy gradient methods.
3160Exploring the Boundaries of Decidable Verification of Non-Terminating Golog Programs,Jens Classen Martin Liebenberg Gerhard Lakemeyer and Benjamin Zarrieß,Knowledge Representation and Reasoning KRR ,"Situation Calculus
3161Golog
3162Verification","KRR Action Change and Causality
3163KRR Geometric Spatial and Temporal Reasoning",The action programming language Golog has been found useful for the control of autonomous agents such as mobile robots. In scenarios like these tasks are often open-ended so that the respective control programs are non-terminating. Before deploying such programs on a robot it is often desirable to verify that they meet certain requirements. For this purpose Claßen and Lakemeyer recently introduced algorithms for the verification of temporal properties of Golog programs. However given the expressiveness of Golog their verification procedures are not guaranteed to terminate. In this paper we show how decidability can be obtained by suitably restricting the underlying base logic the effect axioms for primitive actions and the use of actions within Golog programs. Moreover we show that dropping any of these restrictions immediately leads to undecidability of the verification problem.
3164Using Timed Game Automata to Synthesize Execution Strategies for Simple Temporal Networks with Uncertainty,Alessandro Cimatti Luke Hunsberger Andrea Micheli and Marco Roveri,"Planning and Scheduling PS
3165Reasoning under Uncertainty RU ","Dynamic Controllability
3166Strategy Synthesis
3167Simple Temporal Networks with Uncertainty
3168Timed Game Automata","PS Plan Execution and Monitoring
3169PS Scheduling
3170RU Uncertainty in AI General/Other ","A Simple Temporal Network with Uncertainty STNU is a structure for
3171representing and reasoning about temporal constraints in domains where
3172some temporal durations are not controlled by the executor. The most
3173important property of an STNU is whether it is dynamically
3174controllable DC ; that is whether there exists a strategy for
3175executing the controllable time-points that guarantees that all
3176constraints will be satisfied no matter how the uncontrollable
3177durations turn out.
3178
3179This paper provides a novel mapping from STNUs to Timed Game Automata
3180 TGAs that 1 explicates the deep theoretical relationships between
3181STNUs and TGAs; and 2 enables the memoryless strategies generated
3182from the TGA to be transformed into equivalent STNU execution
3183strategies that reduce the real-time computational burden for the
3184executor. The paper formally proves that the STNU-to-TGA encoding
3185properly captures the execution semantics of STNUs. It also provides
3186experimental evidence of the proposed approaches generating offline
3187execution strategies for dynamically controllable STNUs encoded as
3188TGAs."
3189Adaptation Guided Case Base Maintenance,Vahid Jalali and David Leake,Novel Machine Learning Algorithms NMLA ,"Adaptation-Guided Case-Based Maintenance
3190Case-Based Maintenance
3191Case-Based Reasoning",NMLA Case-Based Reasoning,In case-based reasoning CBR problems are solved by retrieving prior cases and adapting their solutions to fit new problems. Controlling the growth of the case base in CBR is a fundamental problem. Much research on case-base maintenance has developed methods aimed at compacting case bases while maintaining system competence by deleting cases whose absence is considered least likely to degrade the system's problem-solving given static case adaptation knowledge. This paper proposes adaptation-guided case-base maintenance AGCBM a case-base maintenance approach exploiting the ability to dynamically generate new adaptation knowledge from cases. In AGCBM case retention decisions are based both on their value as base cases for solving problems and on their value for generating new adaptation rules in turn increasing the problem-solving value of other cases in the case base. The paper tests the method for numerical prediction tasks case-based regression in which rules are generated automatically using the case difference heuristic. Tests on four sample domains compare accuracy with a set of five candidate case-based maintenance methods for varying case-base densities. AGCBM outperformed the alternatives in all domains with the benefit most substantial for the greatest amounts of compression.
3192The Fisher Market Game Equilibrium and Welfare,Simina Brânzei Yiling Chen Xiaotie Deng Aris Filos-Ratsikas Søren Kristoffer Stiil Frederiksen and Jie Zhang,Game Theory and Economic Paradigms GTEP ,"Fisher markets
3193Fisher market game
3194Equilibrium analysis
3195Price of anarchy","GTEP Auctions and Market-Based Systems
3196GTEP Game Theory","The Fisher market model is one of the most fundamental resource allocation models in economics. In a Fisher market the prices and allocations of goods are determined according to the preferences and budgets of buyers to clear the market.
3197
3198In a Fisher market game however buyers are strategic and report their preferences over goods; the market-clearing prices and allocations are then determined based on their reported preferences rather than their real preferences.
3199We show that the Fisher market game always has a pure Nash equilibrium for buyers with linear Leontief and Cobb-Douglas utility functions which are three representative classes of utility functions in the important Constant Elasticity of Substitution CES family. Furthermore to quantify the social efficiency we prove Price of Anarchy bounds for the game when the utility functions of buyers fall into these three classes respectively."
3200Huffman Coding for Storing Non-uniformly Distributed Messages in Networks of Neural Cliques,Bartosz Boguslawski Vincent Gripon Fabrice Seguin and Frédéric Heitzmann,"Applications APP
3201Machine Learning Applications MLA
3202Novel Machine Learning Algorithms NMLA ","neural clique
3203sparsity
3204associative memory
3205non-uniform distribution
3206compression code","APP Security and Privacy
3207APP Other Applications
3208MLA Machine Learning Applications General/other
3209NMLA Neural Networks/Deep Learning",Associative memories are data structures that allow retrieval of previously stored messages given part of their content. They thus behave similarly to human brain's memory that is capable for instance of retrieving the end of a song given its beginning. Among different families of associative memories sparse ones are known to provide the best efficiency ratio of the number of bits stored to that of bits used . Nevertheless it is well known that non-uniformity of the stored messages can lead to dramatic decrease in performance. Recently a new family of sparse associative memories achieving almost-optimal efficiency has been proposed. Their structure induces a direct mapping between input messages and stored patterns. In this work we show the impact of non-uniformity on the performance of this recent model and we exploit the structure of the model to introduce several strategies to allow for efficient storage of non-uniform messages. We show that a technique based on Huffman coding is the most efficient.
3210An Agent-Based Model Studying the Acquisition of a Language System of Logical Constructions,Josefina Sierra-Santibanez,Cognitive Modeling CM ,"Cognitive Modeling
3211Symbolic AI
3212Simulating Humans
3213Adaptive Behavior","CM Adaptive Behavior
3214CM Simulating Humans
3215CM Symbolic AI","This paper presents an agent-based model that studies the emergence
3216and evolution of a language system of logical constructions i.e. a
3217vocabulary and a set of grammatical constructions that allow
3218expressing logical combinations of categories. The model assumes
3219the agents have a common vocabulary for basic categories the ability
3220to construct logical combinations of categories using Boolean
3221functions and some general purpose cognitive capacities for
3222invention adoption induction and adaptation. But it does not assume the agents
3223have a vocabulary for Boolean functions nor grammatical constructions
3224for expressing such logical combinations of categories through
3225language. The results of the experiments we have performed show that
3226a language system of logical constructions emerges as a result of a
3227process of self-organisation of the individual agents'
3228interactions when these agents adapt their preferences for vocabulary
3229and grammatical constructions to those they observe are used more
3230often by the rest of the population and that such language system
3231is transmitted from one generation to the next."
3232Tree-Based On-line Reinforcement Learning,Andre Barreto,Reasoning under Uncertainty RU ,"Reinforcement Learning
3233Markov Decision Processes
3234Fitted Q-Iteration",RU Sequential Decision Making,Fitted Q-iteration FQI stands out among reinforcement-learning algorithms for its flexibility and ease of use. FQI can be combined with any regression method and this choice determines the algorithm's theoretical and computational properties. The combination of FQI with an ensemble of regression trees gives rise to an algorithm FQIT that is computationally efficient scalable to high dimensional spaces and robust to irrelevant variables outliers and noise. Despite its nice properties and good performance in practice FQIT also has some limitations the fact that an ensemble of trees must be constructed or updated at each iteration confines the algorithm to the batch scenario. This paper aims to address this specific issue. Based on a strategy recently proposed in the literature called the stochastic-factorization trick we propose a modification of FQIT that makes it fully incremental and thus suitable for on-line learning. We call the resulting method tree-based stochastic factorization TBSF . We derive an upper bound for the difference between the value functions computed by FQIT and TBSF and also show in which circumstances the approximations coincide. A series of computational experiments is presented to illustrate the properties of TBSF and to show its usefulness in practice.
3235Doubly Regularized Portfolio with Risk Minimization,Weiwei Shen Jun Wang and Shiqian Ma,"Applications APP
3236Machine Learning Applications MLA ","Portfolio Management
3237Risk Minimization
3238Doubly Regularized Portfolio","APP Other Applications
3239MLA Machine Learning Applications General/other ",Due to recent empirical success machine learning algorithms have drawn sufficient attention and are becoming important analysis tools in the financial industry. In particular as the core engine of many financial services such as private wealth and pension fund management portfolio management calls for the application of those novel algorithms. Most portfolio allocation strategies do not account for costs from market frictions such as transaction costs and capital gain taxes as the complexity of sensible cost models often makes the induced problem intractable. In this paper we propose a doubly regularized sparse and consistent portfolio that provides a modest but effective solution to the above difficulty. Specifically as all kinds of trading costs are primarily rooted in large transaction volumes to reduce volumes we synergistically combine two penalty terms with classic risk minimization models to ensure 1 only a small set of assets are selected to invest in each period; 2 portfolios in subsequent trading periods are similar. To assess the new portfolio we apply standard evaluation criteria and conduct extensive experiments on well-known benchmarks and market datasets. Compared to various state-of-the-art portfolios the proposed portfolio demonstrates a superior performance of having both higher risk-adjusted returns and dramatically decreased transaction volumes.
3240Non-convex feature learning via $\ell_{p\infty}$ operator,Deguang Kong Chris Ding and Qihe Pan,"Machine Learning Applications MLA
3241Novel Machine Learning Algorithms NMLA ","feature
3242sparse
3243learning",,"We present a feature selection method for solving a sparse regularization problem which has a composite regularization of $\ell_p$ norm and $\ell_{\infty}$ norm.
3244We use a proximal gradient method to solve this \L1inf operator problem where a simple but efficient algorithm is designed to minimize a relatively simple objective function which contains a vector of $\ell_2$ norm and $\ell_\infty$ norm. The proposed method brings some insight for solving sparsity-favoring norm and
3245extensive experiments are conducted to characterize the effect of varying $p$ and to compare with other approaches on real world multi-class and multi-label datasets."
3246Robust Non-negative Dictionary Learning,Deguang Kong Chris Ding and Qihe Pan,"Machine Learning Applications MLA
3247Novel Machine Learning Algorithms NMLA ","dictionary learning
3248non-negative
3249clustering
3250multiplicative",,"Dictionary learning plays an important role in machine learning where data vectors are modeled as sparse linear combinations of basis factors i.e. dictionary . However how to conduct dictionary learning in a noisy environment has not been well studied. Moreover in practice the dictionary i.e. the lower rank approximation of the data matrix and the sparse representations are required to be nonnegative such as in applications for image annotation document summarization microarray analysis. In this paper we propose a new formulation for non-negative dictionary learning in a noisy environment where structure sparsity is enforced on the sparse representation. The proposed new formulation is also robust to data with noise and outliers due to the robust loss function used. We derive an efficient multiplicative updating algorithm to solve the optimization problem where the dictionary and sparse representation are updated iteratively. We prove the convergence and correctness of the proposed algorithm rigorously.
3251We show the differences of the dictionary at different levels of the sparsity constraint.
3252The proposed algorithm can be adapted for data clustering and semi-supervised learning purposes. Promising results in extensive experiments validate the effectiveness of the proposed approach."
3253On the Axiomatic Characterization of Runoff Voting Rules,Rupert Freeman Markus Brill and Vincent Conitzer,Game Theory and Economic Paradigms GTEP ,"computational social choice
3254runoff scoring rules
3255independence of clones",GTEP Social Choice / Voting,Runoff voting rules such as single transferable vote STV and Baldwin's rule are of particular interest in computational social choice due to their recursive nature and hardness of manipulation as well as in human practice because they are relatively easy to understand. However they are not known for their compliance with desirable axiomatic properties which we attempt to rectify here. We characterize runoff rules that are based on scoring rules using two axioms a weakening of local independence of irrelevant alternatives and a variant of population-consistency. We then show as our main technical result that STV is the only runoff scoring rule satisfying an independence-of-clones property. Furthermore we provide axiomatizations of Baldwin's rule and Coombs' rule.
3256Supervised Scoring with Monotone Multidimensional Splines,Abraham Othman,Computational Sustainability and AI CSAI ,"Sustainability
3257Green Buildings
3258Interpolation
3259Scientific Computing","CSAI Modeling and control of complex high-dimensional systems
3260CSAI Support for public engagement and decision making by the public
3261HCC Optimality in the context of human computation",Scoring involves the compression of a number of quantitative attributes into a single meaningful value. We consider the problem of how to generate scores in a setting where they should be weakly monotone either non-increasing or non-decreasing in their dimensions. Our approach allows an expert to score an arbitrary set of points to produce meaningful continuous monotone scores over the entire domain while exactly interpolating through those inputs. In contrast existing monotone interpolating methods only work in two dimensions and typically require exhaustive grid input. Our technique significantly lowers the bar to score creation allowing domain experts to develop mathematically coherent scores. The method is used in practice to create the LEED Performance energy and water scores that gauge building sustainability.
3262CoreCluster A Degeneracy Based Graph Clustering Framework,Christos Giatsidis Fragkiskos Malliaros Dimitrios Thilikos and Michalis Vazirgiannis,"AI and the Web AIW
3263Applications APP ","Community detection
3264Graph clustering
3265Graph degeneracy
3266Graph mining","AIW Social networking and community identification
3267APP Social Networks","Graph clustering or community detection constitutes an important
3268task for investigating the internal structure of graphs with a
3269plethora of applications in several diverse domains. Traditional
3270tools for graph clustering such as spectral methods typically suffer
3271from high time and space complexity. In this article we present
3272CoreCluster an efficient graph clustering framework based on
3273the concept of graph degeneracy that can be used along with any
3274known graph clustering algorithm. Our approach capitalizes on
3275processing the graph in a hierarchical manner provided by its core
3276expansion sequence an ordered partition of the graph into different
3277levels according to the k-core decomposition. Such a partition
3278provides a way to process the graph in an incremental manner that
3279preserves its clustering structure while making the execution of the
3280chosen clustering algorithm much faster due to the smaller size of
3281the graph's partitions onto which the algorithm operates."
3282A Generalized Genetic Algorithm-Based Solver for Very Large Jigsaw Puzzles of Complex Types,Dror Sholomon Omid E. David and Nathan S. Netanyahu,"Novel Machine Learning Algorithms NMLA
3283Vision VIS ","genetic algorithms
3284jigsaw puzzle
3285computer vision
3286recombination operators","NMLA Evolutionary Computation
3287VIS Perception",In this paper we introduce new types of square-piece jigsaw puzzles where in addition to the unknown location and orientation of each piece a piece might also need to be flipped. These puzzles which are associated with a number of real world problems are considerably harder from a computational standpoint. Specifically we present a novel generalized genetic algorithm GA -based solver that can handle puzzle pieces of unknown location and orientation Type 2 puzzles and puzzle pieces of unknown location orientation and attitude Type 4 puzzles . To the best of our knowledge our solver provides a new state-of-the-art solving previously attempted puzzles faster and far more accurately solving puzzle sizes that have never been attempted before and solving the newly introduced double-sided puzzle types automatically and effectively. This paper also presents among other results the most extensive set of experimental results compiled as of yet on Type 2 puzzles.
3288Solving Zero-Sum Security Games in Discretized Spatio-Temporal Domains,Haifeng Xu Fei Fang Albert Jiang Vincent Conitzer Shaddin Dughmi and Milind Tambe,"Game Theory and Economic Paradigms GTEP
3289Multiagent Systems MAS ","Security Games
3290Zero-Sum Games
3291Minimax Equilibrium
3292Oracle
3293Equilibria Computation","GTEP Game Theory
3294GTEP Equilibrium
3295MAS Multiagent Systems General/other ","Security games model the problem of allocating multiple resources to defend multiple targets. In such settings the defender's action space is usually exponential in the input size. Therefore known general-purpose linear or mixed integer program formulations scale up exponentially. One method that has been widely deployed to address this computational issue is to instead compute the marginal probabilities of which there are only polynomially many.
3296
3297In this paper we address a class of problems that cannot be handled by previous approaches based on marginal probabilities. We consider security games in discretized spatio-temporal domains in which the schedule set is so large that even the marginal probability formulation has exponential size. We develop novel algorithms under an oracle-based algorithmic framework and show that
3298this framework allows us to efficiently compute Stackelberg mixed strategy
3299when the problem allows a polynomial-time oracle. For the cases in which efficient oracles are difficult to find we propose new direct algorithms or prove hardness results. All our algorithms are examined in experiments with realistic and artificial data."
3300Q-intersection Algorithms for Constraint-Based Robust Parameter Estimation,Clement Carbonnel Gilles Trombettoni Philippe Vismara and Gilles Chabert,Search and Constraint Satisfaction SCS ,"intersection graph
3301computational complexity
3302parameter estimation
3303soft numerical constraints
3304q-intersection",SCS Constraint Satisfaction,"Given a set of axis-parallel n-dimensional boxes the q-intersection is defined as the smallest box encompassing all the points that belong to at least q boxes. Computing the q-intersection is a combinatorial problem that allows us to handle robust parameter estimation with a numerical constraint programming approach.
3305The q-intersection can be viewed as a filtering operator for soft constraints that model measurements subject to outliers. This paper highlights the equivalence of this operator with the search of q-cliques in a graph whose boxicity is bounded by the number of variables in the constraint network. We present a computational study of the q-intersection. We also propose a fast incomplete algorithm and a sophisticated exact q-intersection algorithm. First experimental results show that our exact algorithm outperforms the existing one while our heuristic performs an efficient filtering on hard problems."
3306Fused Feature Representation Discovery for High-dimensional and Sparse Data,Jun Suzuki and Masaaki Nagata,"NLP and Machine Learning NLPML
3307Novel Machine Learning Algorithms NMLA ","feature representation discovery
3308feature selection
3309dimensionality reduction
3310semi-supervised learning
3311feature grouping","NLPML Natural Language Processing General/Other
3312NMLA Feature Construction/Reformulation
3313NMLA Semisupervised Learning",The automatic discovery of a significant low-dimensional feature representation from a given data set is a fundamental problem in machine learning. This paper specifically focuses on developing feature representation discovery methods appropriate for high-dimensional and sparse data which remain a frontier but are now becoming a highly important tool. We formulate our feature representation discovery problem as a variant of the semi-supervised learning problem namely an optimization problem over unsupervised data whose objective is to evaluate the impact of each feature with respect to modeling a target task according to the initial model constructed by using supervised data. The most notable characteristic of our method is that it offers feasible processing speed even if the numbers of data and features both exceed the billions and successfully provides significantly small feature sets i.e. less than 10 that can also offer improved performance compared with those obtained using the original feature sets. We demonstrate the effectiveness of our method in experiments on two well-known natural language processing tasks.
3314Joule Counting Correction for Electric Vehicles using Artificial Neural Nets,Michael Taylor,"Applications APP
3315Machine Learning Applications MLA
3316Novel Machine Learning Algorithms NMLA
3317Reasoning under Uncertainty RU
3318Robotics ROB ","Electric Vehicles
3319Battery Estimation
3320State of Charge
3321Artificial Neural Nets
3322Lithium Iron Phosphate LiFePo ","APP Other Applications
3323MLA Applications of Unsupervised Learning
3324MLA Machine Learning Applications General/other
3325NMLA Active Learning
3326NMLA Time-Series/Data Streams
3327NMLA Unsupervised Learning Other
3328NMLA Machine Learning General/other
3329RU Uncertainty Representations
3330ROB Human-Robot Interaction
3331ROB State Estimation",Estimating the remaining energy in high-capacity electric vehicle batteries is essential to safe and efficient operation. Accurate estimation remains a major challenge however because battery state cannot be observed directly. In this paper we demonstrate a method for estimating battery remaining energy using real data collected from the Charge Car electric vehicle. This new method relies on energy integration as an initial estimation step which is then corrected using a neural net that learns how error accumulates from recent charge/discharge cycles. In this way the algorithm is able to adapt to nonlinearities and variations that are difficult to model or characterize. On the collected dataset this method is demonstrated to be accurate to within 2.5% to 5% of battery remaining energy which equates to approximately 1 to 2 miles of residual range for the Charge Car given its 10kWh battery pack.
3332Managing Change in Graph-structured Data Using Description Logics,Shqiponja Ahmetaj Diego Calvanese Magdalena Ortiz and Mantas Simkus,Knowledge Representation and Reasoning KRR ,"Graph structured data
3333Description Logics
3334Static analysis of transactions
3335Planning","KRR Computational Complexity of Reasoning
3336KRR Description Logics",In this paper we consider the setting of graph-structured data that evolves as a result of operations carried out by users or applications. We study different reasoning problems which range from ensuring the satisfaction of a given set of integrity constraints after a given sequence of updates to deciding the non-existence of a sequence of actions that would take the data to an undesirable state starting either from a specific data instance or from an incomplete description of it. We consider a simple action language in which actions are finite sequences of insertions and deletions of nodes and labels and use Description Logics for describing integrity constraints and partial states of the data. We then formalize the data management problems mentioned above as a static verification problem and several planning problems. We provide algorithms and tight complexity bounds for the formalized problems both for an expressive DL and for a variant of DL-Lite.
3337BCI Based Control Without Explicit Calibration,Jonathan Grizou Iñaki Iturrate Luis Montesano and Manuel Lopes,"Cognitive Modeling CM
3338Humans and AI HAI ","Brain-Computer Interfaces
3339Calibration
3340Human-Robot Interaction","CM Adaptive Behavior
3341HAI Brain-Sensing and Analysis
3342HAI Communication Protocols
3343HAI Human-Computer Interaction
3344HAI User Experience and Usability","Recent work has shown that it is possible to extract feedback information from
3345EEG measurements of brain activity such as error potentials and use it to solve
3346sequential tasks. As in most Brain-Computer Interfaces a calibration phase is required
3347to build a decoder that translates raw EEG signals to understandable feedback
3348signals. This paper proposes a method to solve sequential tasks based on
3349feedback extracted from the brain without any calibration. The proposed method
3350uses optimal policies to hallucinate the meaning of the EEG signals and select the
3351target with the lowest expected error. Also we use the task and symbol uncertainty
3352as an exploration bonus for an active strategy to speed up the learning. We
3353report online experiments where four users directly controlled an agent on a 2D
3354grid world to reach a target without any previous calibration process."
3355Towards Scalable Exploration of Diagnoses in an Ontology Stream,Freddy Lecue,AI and the Web AIW ,"Semantic Web
3356Ontology Stream
3357Semantic Reasoning
3358Knowledge Evolution","AIW Languages tools and methodologies for representing managing and visualizing semantic web data
3359AIW Ontologies and the web creation extraction evolution mapping merging and alignment; tags and folksonomies",Diagnosis or the process of identifying the nature and cause of an anomaly in an ontology has been largely studied by the Semantic Web community. In the context of ontology stream diagnosis results are not captured by a unique fixed ontology but numerous time-evolving ontologies. Thus any anomaly can be diagnosed by a large number of different explanations depending on the version and evolution of the ontology. We address the problems of identifying representing exploiting and exploring the evolution of diagnoses representations. Our approach consists in a graph-based representation which aims at i efficiently organizing and linking time-evolving diagnoses and ii being used for scalable exploration. The experiments have shown scalable diagnoses exploration in the context of real and live data from Dublin City.
3360Flexible and Scalable Partially Observable Planning with Linear Translations,Blai Bonet and Hector Geffner,"Knowledge Representation and Reasoning KRR
3361Planning and Scheduling PS
3362Reasoning under Uncertainty RU ","Planning with sensing and partial information
3363Planning with beliefs
3364On-line planning
3365Replanning","KRR Reasoning with Beliefs
3366PS Deterministic Planning
3367PS Replanning and Plan Repair
3368PS Planning General/Other
3369RU Sequential Decision Making","The problem of on-line planning in partially observable settings involves two problems keeping track of beliefs about the environment and selecting actions for achieving goals. While the two problems are computationally intractable in the worst case significant progress has been achieved in recent years through the use of suitable reductions. In particular the state-of-the-art CLG planner is based on a translation that maps deterministic partially observable problems into fully observable nondeterministic ones. The translation which is quadratic in the number of problem fluents and gets rid of the belief tracking problem is adequate for most benchmarks; it is in fact complete for problems that have width 1. The more recent K-replanner uses two translations that are linear one for keeping track of beliefs and the other for selecting actions using off-the-shelf classical planners.
3370As a result the K-replanner scales up better but is not as general as CLG. In this work we combine the benefits of these two approaches the scope of the CLG planner and the efficiency of the K-replanner by introducing a new planner called LW1 that is based on a translation that is linear but which is complete for width-1 problems. The scope and performance of the new planner are evaluated by considering the existing benchmarks and new problems."
3371Online Portfolio Selection with Group Sparsity,Puja Das Nicholas Johnson and Arindam Banerjee,"Applications APP
3372Game Theory and Economic Paradigms GTEP
3373Machine Learning Applications MLA
3374Novel Machine Learning Algorithms NMLA ","online learning
3375portfolio selection
3376group lasso
3377non-smooth convex optimization
3378alternating direction method of multipliers","APP Other Applications
3379GTEP Adversarial Learning
3380MLA Machine Learning Applications General/other
3381NMLA Big Data / Scalability
3382NMLA Data Mining and Knowledge Discovery
3383NMLA Online Learning
3384NMLA Time-Series/Data Streams
3385NMLA Machine Learning General/other ","In portfolio selection it often might be preferable to focus on a few top
3386performing industries/sectors to beat the market. These top performing sectors
3387however might change over time. In this paper we propose an online portfolio
3388selection algorithm that can take advantage of sector information through the use of a group sparsity inducing regularizer while making lazy updates to the portfolio. The lazy updates prevent changing one's portfolio too often which otherwise might incur huge transaction costs.
3389The proposed formulation is not straightforward to solve due to the presence of
3390non-smooth functions along with the constraint that the portfolios have to lie
3391within a probability simplex. We propose an efficient primal-dual based alternating direction method of multipliers algorithm and demonstrate its effectiveness for the problem of online portfolio selection
3392with sector information. We show that our algorithm O-LUGS has sub-linear
3393regret w.r.t. the best fixed and best shifting solution in
3394hindsight. We successfully establish the robustness and scalability of O-LUGS
3395by performing extensive experiments on two real-world datasets."
3396Prediction of Helpful Reviews using Emotions Extraction,Lionel Martin and Pearl Pu,"Machine Learning Applications MLA
3397NLP and Machine Learning NLPML ","helpfulness prediction
3398product review analysis
3399emotions extraction","MLA Applications of Supervised Learning
3400NLPML Evaluation and Analysis","Reviews keep playing an increasingly important role in the decision process of buying products and booking hotels. However the large amount of available information can be confusing to users. A more succinct interface gathering only the most helpful reviews can reduce information processing time and save effort. To create such an interface in real time we need reliable prediction algorithms to classify and predict new reviews which have not been voted on but are potentially helpful. So far such helpfulness prediction algorithms have benefited from structural aspects such as the length and readability score. Since emotional words are at the heart of our written communication and are powerful in triggering listeners' attention we believe that emotional words can serve as important parameters for predicting helpfulness of review text.
3401
3402Using GALC a general lexicon of emotional words associated with a model representing 20 different categories we extracted the emotionality from the review text and applied a supervised classification method to derive the emotion-based helpful review prediction. As the second contribution we propose an evaluation framework comparing three different real-world datasets extracted from the most well-known product review websites. This framework shows that emotion-based methods outperform the structure-based approach by up to 9%."
3403An Adversarial Interpretation of Information-Theoretic Bounded Rationality,Pedro A. Ortega and Daniel D. Lee,"Game Theory and Economic Paradigms GTEP
3404Planning and Scheduling PS
3405Reasoning under Uncertainty RU ","bounded rationality
3406free energy
3407game theory
3408legendre transform","PS Probabilistic Planning
3409PS Planning General/Other
3410RU Decision/Utility Theory","Recently there has been a growing interest in modelling planning with information constraints. Accordingly an agent maximizes a regularized expected utility known as the free energy where the regularizer is given by the information divergence from a prior to a posterior policy. While this approach can be justified in various ways most importantly from statistical mechanics and information theory it is still unclear how it relates to game theory. This connection has been suggested previously in work relating the free energy to risk-sensitive control and to extensive form games. In this work we present an adversarial interpretation that is equivalent to the free energy optimization problem. The adversary can by paying an exponential
3411penalty generate costs that diminish the decision maker's payoffs. It turns out
3412that the optimal strategy of the adversary consists in choosing costs so as to
3413render the decision maker indifferent among its choices which is a defining
3414property of a Nash equilibrium thus tightening the connection between free
3415energy optimization and game theory."
3416Learning to Recognize Novel Objects in One Shot through Human-Robot Interactions in Natural Language Dialogues,Evan Krause Michael Zillich Thomas Williams and Matthias Scheutz,"Cognitive Systems CS
3417Robotics ROB
3418Vision VIS ","one-shot learning
3419object recognition
3420natural language dialogues","CS Natural language understanding and dialogue
3421ROB Human-Robot Interaction
3422VIS Language and Vision
3423VIS Object Recognition",Being able to quickly and naturally teach robots new knowledge is critical for many future open-world human-robot interaction scenarios. In this paper we present a novel approach to using natural language context for one-shot learning of visual objects where the robot is immediately able to recognize the described object. We describe the architectural components and demonstrate the proposed approach on a robotic platform in a proof-of-concept evaluation.
3424Coactive Learning for Locally Optimal Problem Solving,Robby Goetschalckx Alan Fern and Prasad Tadepalli,"Humans and AI HAI
3425Knowledge Representation and Reasoning KRR ","Coactive Learning
3426Local Optimization
3427Preference Learning","HCC Active learning from imperfect human labelers
3428HAI Interaction Techniques and Devices
3429KRR Preferences
3430NMLA Preferences/Ranking Learning","Coactive learning is an online problem solving setting where the solutions
3431provided by a solver are interactively improved by a domain expert which
3432in turn drives learning.
3433In this paper we extend the
3434study of coactive learning to problems where obtaining a
3435global or near-optimal solution may be intractable or where an expert
3436can only be expected to make small local improvements to a candidate solution.
3437The goal of learning in this new setting is to minimize the cost
3438as measured by the expert effort
3439over time. We first establish theoretical bounds
3440on the average cost of the existing coactive Perceptron
3441algorithm. In addition we consider new online algorithms that use
3442cost-sensitive and Passive-Aggressive PA updates showing similar
3443or improved theoretical bounds. We provide an empirical evaluation
3444of the learners in 5 domains which shows that the Perceptron based
3445algorithms are quite effective and that unlike the case for online
3446classification the PA algorithms do not yield significant performance
3447gains."
3448Large Scale Analogical Reasoning,Vinay Chaudhri Stijn Heymans Adam Overholtzer Aaron Spaulding and Michael Wessel,"Cognitive Systems CS
3449Knowledge Representation and Reasoning KRR ","analogical reasoning
3450case-based reasoning
3451question answering
3452knowledge base systems","CS Conceptual inference and reasoning
3453CS Structural learning and knowledge capture
3454KRR Qualitative Reasoning","It has been argued that one can use cognitive simulation of analogical
3455processing to answer comparison questions. In the context of a
3456knowledge base KB system a comparison question takes the form What
3457are the similarities and/or differences between A and B? where
3458A and B are concepts in the KB. Previous attempts
3459to use a general purpose analogical reasoner to answer this question
3460revealed three major problems a the system presented too much
3461information in the answer and the salient similarity or difference was
3462not highlighted b analogical inference found some incorrect
3463differences c some expected similarities were not found. The primary
3464cause of these problems was the lack of availability of a well-curated
3465KB and secondarily there were also some algorithmic deficiencies.
3466In this paper we present an of comparison questions that is inspired
3467by the general model of analogical reasoning but is specific to the
3468questions at hand. We also rely on a well-curated biology KB. We
3469present numerous examples of answers produced by the system and
3470empirical data on the quality of the answers to claim that we have
3471addressed many of the problems faced in the previous system."
3472Learning Models of Unknown Events,Matthew Molineaux and David Aha,Cognitive Systems CS ,"learning environment models
3473explanation generation
3474execution monitoring
3475goal-driven autonomy","CM Symbolic AI
3476CS Problem solving and decision making
3477CS Introspection and meta-cognition
3478CS Structural learning and knowledge capture
3479PS Learning Models for Planning and Diagnosis
3480PS Plan Execution and Monitoring",Agents with incomplete models of their environment are likely to be surprised. For agents in immense environments that defy complete modeling this represents an opportunity to learn. We investigate approaches for situated agents to detect surprises discriminate among different forms of surprise and hypothesize new models for the unknown events that surprised them. We instantiate these approaches in a new goal reasoning agent named FOOLMETWICE investigate its performance in simulation studies and show that it produces plans with significantly reduced execution cost when compared to not learning models for surprising events.
3481Learning Concept Embeddings for Query Expansion by Quantum Entropy Minimization,Alessandro Sordoni Yoshua Bengio and Jian-Yun Nie,"NLP and Knowledge Representation NLPKR
3482NLP and Machine Learning NLPML ","Embedding
3483Density Matrix
3484Query Expansion","NLPKR Natural Language Processing General/Other
3485NLPML Natural Language Processing General/Other
3486NMLA Neural Networks/Deep Learning",In web search users' queries are formulated using only a few terms and term-matching retrieval functions could fail at retrieving relevant documents. Given a user query the technique of query expansion QE consists in selecting related terms that could enhance the likelihood of retrieving relevant documents. Selecting such expansion terms is challenging and requires a computational framework capable of encoding complex semantic relationships. In this paper we propose a novel method for learning in a supervised way semantic representations for words and phrases. By embedding queries and documents in special matrices our model has increased representational power with respect to existing approaches adopting a vector representation. We show that our model produces high-quality query expansion terms. Our expansions increase IR measures beyond expansion from current word-embedding models and well-established traditional QE methods.
3487Non-Restarting SAT Solvers With Simple Preprocessing Efficiently Simulate Resolution,Paul Beame and Ashish Sabharwal,Search and Constraint Satisfaction SCS ,"clause learning
3488satisfiability
3489proof complexity
3490p-simulation
34911-UIP clauses
3492asserting clauses
3493resolution","SCS Constraint Satisfaction
3494SCS Constraint Learning and Acquisition
3495SCS SAT and CSP Evaluation and Analysis
3496SCS Satisfiability General/Other ","Propositional satisfiability SAT solvers based on conflict directed
3497clause learning CDCL implicitly produce resolution refutations of
3498unsatisfiable formulas. The precise class of formulas for which they
3499can produce polynomial size refutations has been the subject of
3500several studies with special focus on the clause learning aspect of
3501these solvers. The results however either assume the use of
3502non-standard and non-asserting learning schemes such as FirstNewCut
3503or rely on polynomially many restarts for simulating individual steps
3504of a resolution refutation or work with a theoretical model that
3505significantly deviates from certain key aspects of all modern CDCL
3506solvers such as learning only one asserting clause from each conflict
3507and other techniques such as conflict guided backjumping and clause
3508minimization. We study non-restarting CDCL solvers that learn only one
3509asserting clause per conflict and show that with simple preprocessing
3510that depends only on the number of variables of the input formula
3511such solvers can polynomially simulate resolution."
3512Worst-Case Solution Quality Analysis When Not Re-Expanding Nodes in Best-First Search,Richard Valenzano Nathan Sturtevant and Jonathan Schaeffer,Heuristic Search and Optimization HSO ,"best-first search
3513re-expansions
3514heuristics
3515inconsistency
3516inadmissibility
3517solution quality
3518suboptimality
3519suboptimal heuristic search
3520worst-case analysis","HSO Heuristic Search
3521HSO Evaluation and Analysis Search and Optimization ",The use of inconsistent heuristics with A* can result in increased runtime due to the need to re-expand nodes. Poor performance can also be seen with Weighted A* if nodes are re-expanded. While the negative impact of re-expansions can often be minimized by setting these algorithms to never expand nodes more than once the result can be a lower solution quality. In this paper we formally show that the loss in solution quality can be bounded based on the amount of inconsistency along optimal solution paths. This bound holds regardless of whether the heuristic is admissible or inadmissible though if the heuristic is admissible the bound can be used to show that not re-expanding nodes can have at most a quadratic impact on the quality of solutions found when using A*. We then show that the bound is tight by describing a process for the construction of graphs for which a best-first search that does not re-expand nodes will find solutions whose quality is arbitrarily close to that given by the bound. Finally we will use the bound to extend a known result regarding the solution quality of WA* when weighting a consistent heuristic so that it applies to other types of heuristic weighting.
3522Natural Temporal Difference Learning,William Dabney and Philip Thomas,Novel Machine Learning Algorithms NMLA ,"natural gradient
3523temporal difference learning
3524reinforcement learning",,In this paper we investigate the application of natural gradient descent to Bellman error based reinforcement learning algorithms. This combination is interesting because natural gradient descent is invariant to the parameterization of the value function. This invariance property means that natural gradient descent adapts its update directions to correct for poorly conditioned representations. We present and analyze quadratic and linear time natural temporal difference learning algorithms and prove that they are covariant. We conclude with experiments which suggest that the natural algorithms can match or outperform their non-natural counterparts using linear function approximation and drastically improve upon their non-natural counterparts when using non-linear function approximation.
3525Relaxation Search a Simple Way of Managing Optional Clauses,Maria Tsimpoukelli Fahiem Bacchus Jessica Davies and George Katsirelos,Search and Constraint Satisfaction SCS ,"Constraint Optimization
3526Satisfiability
3527Maximum Satisfiability
3528Minimal Correction Sets","SCS Constraint Optimization
3529SCS SAT and CSP Solvers and Tools
3530SCS Satisfiability General/Other ","A number of problems involve managing a set of optional clauses. For example the soft clauses in a MaxSat formula are optional---they can be falsified for a cost. Similarly when computing a Minimum Correction Set for an unsatisfiable formula all clauses are optional---some can be falsified in order to make the
3531 remaining satisfiable. In both of these cases the task is to find a subset of the optional clauses that achieves some optimization criteria and is satisfiable. Relaxation search is a simple method of using a standard SAT solver to solve this task. Relaxation search is very easy to implement sometimes requiring only a simple modification of the variable selection heuristic in the SAT solver. Furthermore considerable flexibility and control can be achieved over the order in which subsets of optional clauses are examined. We demonstrate how relaxation search can be used to solve MaxSat and to compute Minimum Correction Sets. In both cases relaxation search is able to achieve state-of-the-art performance and solve some instances other solvers are not able to solve."
3532Using Response Functions to Measure Strategy Strength,Trevor Davis Neil Burch and Michael Bowling,Game Theory and Economic Paradigms GTEP ,"strategy evaluation
3533extensive-form games
3534adaptive opponents","GTEP Game Theory
3535GTEP Equilibrium
3536GTEP Imperfect Information",Extensive-form games are a powerful tool for representing complex multi-agent interactions. Nash equilibrium strategies are commonly used as a solution concept for extensive-form games but many games are too large for the computation of Nash equilibria to be tractable. In these large games exploitability has traditionally been used to measure deviation from Nash equilibrium and thus strategies are aimed to achieve minimal exploitability. However while exploitability measures a strategy's worst-case performance it fails to capture how likely that worst-case is to be observed in practice. In fact empirical evidence has shown that a less exploitable strategy can perform worse than a more exploitable strategy in one-on-one play against a variety of opponents. In this work we propose a class of response functions that can be used to measure the strength of a strategy. We prove that standard no-regret algorithms can be used to learn optimal strategies for a scenario where the opponent uses one of these response functions. We demonstrate the effectiveness of this technique in Leduc poker against opponents that use the UCT Monte Carlo tree search algorithm.
3537Optimal and Efficient Stochastic Motion Planning in Partially-Known Environments,Ryan Luna Morteza Lahijanian Mark Moll and Lydia Kavraki,"Reasoning under Uncertainty RU
3538Robotics ROB ","Planning under uncertainty
3539Motion planning with action and environment uncertainty
3540Optimal stochastic motion planning
3541Computing policies under uncertainty","PS Mixed Discrete/Continuous Planning
3542RU Uncertainty in AI General/Other
3543ROB Motion and Path Planning
3544ROB Robotics General/Other ",A framework capable of computing optimal control policies for a continuous system in the presence of both action and environment uncertainty is presented in this work. The framework decomposes the planning problem into two stages an offline phase that reasons only over action uncertainty and an online phase that quickly reacts to the uncertain environment. Offline a bounded-parameter Markov decision process BMDP is employed to model the evolution of the stochastic system over a discretization of the environment. Online an optimal control policy over the BMDP is computed. Upon the discovery of an unknown environment feature during policy execution the BMDP is updated and the optimal control policy is efficiently recomputed. Depending on the desired quality of the control policy a suite of methods is presented to incorporate new information into the BMDP with varying degrees of detail online. Experiments confirm that the framework recomputes high-quality policies in seconds and is orders of magnitude faster than existing methods.
3545Learning Scripts as Hidden Markov Models,Walker Orr Prasad Tadepalli Thomas Dietterich Xiaoli Fern and Janardhan Rao Doppa,"NLP and Machine Learning NLPML
3546Novel Machine Learning Algorithms NMLA ","HMM
3547Scripts
3548NLP
3549Structural EM
3550Structure Learning","NLPML Natural Language Processing General/Other
3551NMLA Graphical Model Learning",Scripts have been proposed to model the stereotypical event sequences found in narratives. They can be applied to make a variety of inferences including filling gaps in the narratives and resolving ambiguous references. This paper proposes the first formal framework for scripts based on Hidden Markov Models HMMs. Our framework supports robust inference and learning algorithms which are lacking in previous clustering models. We develop an algorithm for structure and parameter learning based on Expectation Maximization and evaluate it on a number of natural and synthetic datasets. The results show that our algorithm is superior to several informed baselines for predicting future events given some past history.
3552Mapping Users Across Networks by Manifold Alignment on Hypergraph,Shulong Tan Ziyu Guan Deng Cai Xuzhen Qin Jiajun Bu and Chun Chen,AI and the Web AIW ,"Social Networks
3553Manifold Alignment
3554Hypergraph
3555De-anonymization
3556User Mapping","AIW Machine learning and the web
3557AIW Ontologies and the web creation extraction evolution mapping merging and alignment; tags and folksonomies
3558AIW Social networking and community identification",Nowadays many people are members of multiple online social networks simultaneously such as Facebook Twitter and some other instant messaging circles. But these networks are usually isolated from each other. Mapping common users across these social networks will be beneficial for cross network recommendation or expanding one's social circle. Methods based on username comparison perform well for some users however they cannot work in the following situations a users choose completely different usernames in different networks; b a unique username corresponds to different individuals. In this paper we propose to utilize social structures to improve the mapping performance. Specifically a novel subspace learning algorithm Manifold Alignment on Hypergraph MAH is proposed. Different from traditional semi-supervised manifold alignment methods we use a hypergraph to model high-order relations here. For a target user in one network the proposed algorithm ranks all users in the other network by their probabilities of being the corresponding user. Moreover methods based on username comparison can be incorporated with our algorithm easily to further boost the mapping accuracy. In experiments we use both simulation data and real world data to test the proposed method. Experimental results have demonstrated the effectiveness of our proposed algorithm in mapping users across networks.
Compact Aspect Embedding For Diversified Query Expansion,Xiaohua Liu Arbi Bouchoucha Jian-Yun Nie and Alessandro Sordoni,AI and the Web AIW ,"query expansion
search result diversification
Trace Norm Regularization",AIW Enhancing web search and information retrieval,Diversified query expansion DQE based approaches aim to select a set of expansion terms with less redundancy among them while covering as many query aspects as possible. Recently they have experimentally demonstrated their effectiveness for the task of search result diversification. One challenge faced by existing DQE approaches is how to ensure the aspect coverage. In this paper we propose a novel method for DQE called compact aspect embedding which exploits trace norm regularization to learn a low rank vector space for the query with each eigenvector of the learnt vector space representing an aspect and the absolute value of its corresponding eigenvalue representing the association strength of that aspect to the query. Meanwhile each expansion term is mapped into the vector space as well. Based on this novel representation of the query aspects and expansion terms we design a greedy selection strategy to choose a set of expansion terms to explicitly cover all possible aspects of the query. We test our method on several TREC diversification data sets and show that our method significantly outperforms the state-of-the-art approaches.
Contraction and Revision over DL-Lite TBoxes,Zhiqiang Zhuang Zhe Wang Kewen Wang and Guilin Qi,Knowledge Representation and Reasoning KRR ,"Belief Change
Description Logic
Non-monotonic reasoning","KRR Belief Change
KRR Description Logics
KRR Nonmonotonic Reasoning","An essential task in managing DL ontologies is to deal with changes over the ontologies.
In particular outdated axioms have to be removed from the ontology
and newly formed axioms have to be incorporated into the ontology.
Such changes are formalised as the operations of contraction and revision in the literature.
The operations can be defined in various ways.
To investigate properties of a defined operation it is best to identify some postulates that completely
characterise the operation such that on the one hand the operation satisfies the postulates
and on the other hand it is the only operation that satisfies all the postulates.
Such characterisation results have never been shown for contractions under DLs.
In this paper we define model-based contraction and revision for DL-Lite$_{core}$ TBoxes
and provide characterisation results for both operations.
As a first step for applying the operations in practice
we also provide tractable algorithms for both operations.
Since DL semantics yields an infinite number of models for DL-Lite TBoxes
it is not feasible to develop algorithms involving DL models.
The key to our operations and algorithms is the development of an alternative semantics called type semantics. Type semantics closely resembles the semantics underlying propositional logic
thus it is more succinct than DL semantics. Most importantly given a finite signature any DL-Lite$_{core}$ TBox has a finite number of type models."
Zero Pronoun Resolution as Ranking,Chen Chen and Vincent Ng,NLP and Text Mining NLPTM ,"Zero Pronouns
Text Mining
Natural Language Processing",NLPTM Evaluation and Analysis,Compared to overt pronoun resolution there is less work on the more challenging task of zero pronoun resolution. State-of-the-art approaches to zero pronoun resolution are supervised requiring the availability of documents containing manually resolved zero pronouns. In contrast we propose in this paper an unsupervised approach to this task. Underlying our approach is the novel idea of employing a model trained on manually resolved overt pronouns to resolve zero pronouns. Experimental results on the OntoNotes corpus are encouraging: our unsupervised model rivals its supervised counterparts in performance.
Supervised Transfer Sparse Coding,Maruan Al-Shedivat Jim Jing-Yan Wang Majed Alzahrani Jianhua Z. Huang and Xin Gao,Novel Machine Learning Algorithms NMLA ,"Sparse coding
Transfer learning
Supervised learning
Classification
Support Vector Machine","NMLA Classification
NMLA Transfer Adaptation Multitask Learning
NMLA Supervised Learning Other ",A combination of sparse coding and transfer learning techniques has been shown to be accurate and robust in classification tasks where training and testing objects have a shared feature space but are sampled from different underlying distributions i.e. belong to different domains. The key assumption in such a case is that in spite of the domain disparity samples from different domains share some common hidden factors. Previous methods often assumed that all the objects in the target domain are unlabeled and thus the training set solely comprised objects from the source domain. However in real world applications the target domain often has some labeled objects or one can always manually label a small number of them. In this paper we explore this possibility and show how a small amount of labeled data in the target domain can significantly improve the classification accuracy of the state-of-the-art transfer sparse coding methods. We further propose a unified framework named Supervised Transfer Sparse Coding STSC which simultaneously optimizes sparse representation domain transfer and supervised classification. Experimental results on three applications demonstrate that a little manual labeling followed by learning the model in a supervised fashion can significantly improve classification accuracy.