DSAA 2017 Accepted Papers

Research Track

Disentangled Link Prediction for Signed Social Networks via Disentangled Representation Learning
Linchuan Xu, Xiaokai Wei, Jiannong Cao and Philip Yu
BJR-Tree: Fast Skyline Computation Algorithm for Serendipitous Searching Problems
Kenichi Koizumi, Peter Eades, Kei Hiraki and Mary Inaba
A Dynamic Factor Machine Learning Method for Multi-variate and Multi-Step-Ahead Forecasting
Gianluca Bontempi, Yann-Aël Le Borgne and Jacopo De Stefani
Multiple Social Role Embedding
Linchuan Xu, Xiaokai Wei, Jiannong Cao and Philip Yu
Latent Dimensionality Estimation for Probabilistic Canonical Correlation Analysis Using Normalized Maximum Likelihood Code-Length
Tomohiko Nakamura, Tomoharu Iwata and Kenji Yamanishi
RadiusSketch: Massively Distributed Indexing of Time Series
Djamel Edine Yagoubi, Reza Akbarinia, Florent Masseglia and Dennis Shasha
M3A: Model, MetaModel, and Anomaly Detection for Inter-Arrivals of Web Searches and Postings
Da-Cheng Juan, Neil Shah, Zhiliang Qian, Mingyu Tang, Diana Marculescu and Christos Faloutsos
Dynamic and Heterogeneous Ensembles for Time Series Forecasting
Vitor Cerqueira, Luis Torgo, Mariana Oliveira and Bernhard Pfahringer
The k-Nearest Representatives Classifier: A Distance-Based Classifier with Strong Generalization Bounds
Cyrus Cousins and Eliezer Upfal
On the Jeffreys-Lindley Paradox and the Looming Reproducibility Crisis in Machine Learning
Daniel Berrar and Werner Dubitzky
Locally Private Machine Learning over a Network of Data Holders
Bennett Cyphers and Kalyan Veeramachaneni
What Makes a Video Memorable?
Akankshya Kar, Prashasthi Mavin, Yogesh Ghaturle and Vani M.
Exploiting Digital DNA for the Analysis of Similarities in Twitter Behaviours
Stefano Cresci, Roberto Di Pietro, Marinella Petrocchi, Angelo Spognardi and Maurizio Tesconi
On Spectral Analysis of Directed Signed Graphs
Yuemeng Li, Xintao Wu and Aidong Lu
A Directional Change Based Trading Strategy with Dynamic Thresholds
Nora Alkhamees and Maria Fasli
Combining Instance and Feature Neighbors for Efficient Multi-label Classification
Len Feremans, Boris Cule, Celine Vens and Bart Goethals
A Consistency-Based Multimodal Graph Embedding Method for Dimensionality Reduction
Ilias Kalamaras, Anastasios Drosou, Eleftheria Polychronidou and Dimitrios Tzovaras
A Novel Approach for Estimating Multiple Sparse Precision Matrices Using $\ell_{0,0}$ Regularization
Duy Nhat Phan and Hoai An Le Thi
FeatureHub: Towards Collaborative Data Science
Micah Smith and Kalyan Veeramachaneni
CSAR: The Cross-Sectional Autoregression Model
Claudio Hartmann, Martin Hahmann and Wolfgang Lehner
Masked Conditional Neural Networks for Automatic Sound Events Recognition
Fady Medhat, David Chesmore and John Robinson
Identifying Anomalous Nodes in Multidimensional Networks
Amani Chouchane and Mohamed Bouguessa
Customizing Travel Packages with Interactive Composite Items
Manish Singh, Ria Mae Borromeo, Anas Hosami, Sihem Amer-Yahia and Shady Elbassuoni
KrowDD: Estimating the Usefulness of a Feature before Obtaining Data for It
Patrick de Boer, Marcel Bühler and Abraham Bernstein
Maximizing Network Performance Based on Group Centrality by Creating Most Effective k-Links
Kouzou Ohara, Kazumi Saito, Masahiro Kimura and Hiroshi Motoda
Learning Low-Rank Document Embeddings with Weighted Nuclear Norm Regularization
Lukas Pfahler, Katharina Morik, Frederik Elwert, Samira Tabti and Volkhard Krech
Sample, Estimate, Tune: Scaling Bayesian Auto-Tuning of Data Science Pipelines
Alec Anderson, Sebastien Dubois and Kalyan Veeramachaneni
Where are you going? Next Place Prediction from Twitter
Carmela Comito
Cyclic Classifier Chain for Cost-Sensitive Multilabel Classification
Yi-An Lin and Hsuan-Tien Lin
Convolutional Neural Networks Based Multi-Task Deep Learning for Movie Review Classification
Xuanyi Li, Weimin Wu and Hongye Su
A Spatial-Cue-Based Probabilistic Model for Bird Song Scene Analysis
Ryosuke Kojima, Osamu Sugiyama, Kotaro Hoshiba, Reiji Suzuki and Kazuhiro Nakadai
Learning Through Utility Optimization in Regression Tasks
Paula Branco, Luis Torgo, Rita P. Ribeiro, Eibe Frank, Bernhard Pfahringer and Markus Michael Rau
Subsequence Search Considering Duration and Relations of Events in Time Interval-Based Events Sequences
Cheng-Wei Yang, Bijay Prasad Jaysawal and Jen-Wei Huang
Multi-label Learning with Label-Specific Features via Clustering Ensemble
Wang Zhan and Min-Ling Zhang
Discrete-State Sequential Modelling of Point Pattern Data
Nhan Dam, Dinh Phung, Ba-Ngu Vo and Viet Huynh
Copula-Based High Dimensional Cross-Market Dependence Modeling
Jia Xu, Wei Wei and Longbing Cao
Multi-Task Network Embedding
Linchuan Xu, Xiaokai Wei, Jiannong Cao and Philip Yu
A Study of Stochastic Mixed Membership Models for Link Prediction in Social Networks
Adrien Dulac, Eric Gaussier and Christine Largeron
Causal Patterns: Extraction of Multiple Causal Relationships by Mixture of Probabilistic Partial Canonical Correlation Analysis
Hiroki Mori, Keisuke Kawano and Hiroki Yokoyama
Discovering Community Structure in Multilayer Networks
Soumajit Pramanik, Raphael Tackx, Anchit Navelkar, Jean-Loup Guillaume and Bivas Mitra

Application Track

Website Navigation Behavior Analysis for Bot Detection
Rabih Haidar and Shady Elbassuoni
Full-Text or Abstract? Examining Topic Coherence Scores Using Latent Dirichlet Allocation
Shaheen Syed and Marco Spruit
A Collaborative Filtering-Based Two Stage Model with Item Dependency for Course Recommendation
Eric L. Lee, Tsung-Ting Kuo and Shou-De Lin
The Data and Science Behind GrabShare Carpooling
Muchen Tang, Serene Ow, Wenqing Chen, Yang Cao and Kong-Wei Lye
An Assessment of Streaming Active Learning Strategies for Real-Life Credit Card Fraud Detection
Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen and Gianluca Bontempi
Learning to Compress Unstructured Simulation Data
Chandrika Kamath
Animal Recognition and Identification with Deep Convolutional Neural Networks for Automated Wildlife Monitoring
Hung Nguyen, Sarah Maclagan, Tu Dinh Nguyen, Thin Nguyen, Paul Flemons, Kylie Andrews, Euan Ritchie and Dinh Phung
Identification of Signal and Noise Components in Spacecraft Neutral Particle Data using a Bi-Level Mixture Model
Shin’ya Nakano and Yoshifumi Futaana
Materials Science Literature-Patent Relevance Search: A Heterogeneous Network Analysis Approach
Pingjie Tang, Jed Pitera, Dmitry Zubarev and Nitesh Chawla
Leveraging on Predictive Analytics to Manage Clinic No-Show and Improve Accessibility of Care
Sijia Wang, Guan Hua Lee, Fransiscus Dipuro, Jue Hou, Priyanka Grover, Lian Leng Low, Nan Liu and Chui Yee Loke
Hi-SPEED: A System for Mining Performance Appraisal Data and Text
Girish Keshav Palshikar, Manoj Apte, Sachin Pawar and Nitin Ramrakhiyani
Fine Grained Classification of UAV Imagery for Damage Assessment
Nazia Attari, Ferda Ofli, Mohammad Ebrahim A. Hasan, Ji Lucas and Sanjay Chawla
Ensemble-Based Location Tracking Using Passive RFID
Hao-Ying Liang, Yun-Tung Shieh, Shou-De Lin, Shao-Wen Yang and Addicam Sanjay
NDlib: Studying Network Diffusion Dynamics
Giulio Rossetti, Letizia Milli, Salvatore Rinzivillo, Alina Sirbu, Dino Pedreschi and Fosca Giannotti
Incremental Author Name Disambiguation for Scientific Citation Data
Zhengqiao Zhao, Jason Rollins, Linge Bai and Gail Rosen
Inform Product Change through Experimentation with Data-Driven Behavioral Segmentation
Zhenyu Zhao, Yan He and Miao Chen
A Probabilistic Mechanism-Independent Outlier Detection Method for Online Experimentation
Yan He and Miao Chen
Regression Based Model for Autosteering of a Car with Delayed Steering Response
Vsevolod Nikulin, Albert Podusenko, Ivan Tanev and Katsunori Shimohara
Enriching Course-Specific Regression Models with Content Features for Grade Prediction
Qian Hu, Agoritsa Polyzou, George Karypis and Huzefa Rangwala

Special Session on Game Data Science (GDS 2017)

Online k-Maxoids Clustering
Rafet Sifa and Christian Bauckhage

Special Session on Environmental and Geo-spatial Data Analytics (EnGeoData 2017)

There’s a Path For Everyone: A Data-Driven Personal Model Reproducing Mobility Agendas
Riccardo Guidotti, Roberto Trasarti, Mirco Nanni, Fosca Giannotti and Dino Pedreschi
Heterogeneous Information Integration for Mountain Augmented Reality Mobile Apps
Darian Frajberg, Piero Fraternali and Rocio Nahime Torres
Predictive Classification of Water Consumption Time Series Using Non-homogeneous Markov Models
Milad Leyli Abadi, Allou Samé, Latifa Oukhellou, Nicolas Cheifetz, Pierre Mandel, Cédric Féliers and Olivier Chesneau
DP-POIRS: A Diversified and Personalized Point-of-Interest Recommendation System
Xiangfu Meng, Yanhuan Tang and Xiaoyan Zhang
A Shape-Based Approach to Spatio-Temporal Data Analysis of Satellite Imagery
Darpan Baheti and Krishnan Rajan
Mobility Genome – A Framework for Mobility Intelligence from Large-Scale Spatio-Temporal Data
The Anh Dang, Jayakumaran Deepak, Jingxuan Wang, Shixin Luo, Yunye Jin, Yibin Ng, Aloysius Lim and Ying Li
A Peak Detection Method to Uncover Events from Social Media
Carmela Comito, Deborah Falcone and Domenico Talia
Semantic Trajectory Modeling for Dynamic Built Environments
Christophe Cruz

Special Session on Data and Information Quality (DIQ)

SECODA: Segmentation and Combination Based Detection of Anomalies
Ralph Foorthuis
Extended Methods Handling Classification Biases
Emma Beauxis-Aussalet and Lynda Hardman
Toward Optimal Streaming Feature Selection
Noura Alnuaimi and Mohammad Mehedy Masud

Special Session on Data Science in Societal Debates (DSSD)

News Consumption during the Italian Referendum: A Cross-platform Analysis on Facebook and Twitter
Michela Del Vicario, Sabrina Gaito, Walter Quattrociocchi, Matteo Zignani and Fabiana Zollo
Sentiment Spreading: An Epidemic Model for Lexicon-Based Sentiment Analysis on Twitter
Laura Pollacci, Alina Sirbu, Fosca Giannotti, Dino Pedreschi, Claudio Lucchese and Cristina Ioana Muntean
Feature Analysis for Fake Review Detection through Supervised Classification
Julien Fontanarava, Gabriella Pasi and Marco Viviani

Special Session on Evolving Networks (EvoNets)

Scalable RFM-Enriched Representation Learning for Churn Prediction
Sandra Mitrovic, Gaurav Singh, Bart Baesens, Wilfried Lemahieu and Jochen De Weerdt
A Comparative Study of Different Approaches for Tracking Communities in Evolving Social Networks
Ziwei He, Etienne Gael Tajeuna, Shengrui Wang and Mohamed Bouguessa
The Initialization and Parameter Setting Problem in Tensor Decomposition-Based Link Prediction
Sofia Fernandes, Hadi Tork and João Gama

Special Session on Beyond IID: Non-IID Learning (NonIIDLearning)

Steganalysis Feature Subspace Selection Based on Fisher Criterion
Chunfang Yang, Yi Zhang, Ping Wang, Xiangyang Luo, Fenlin Liu and Jicang Lu
Coupled Bayesian Matrix Factorization in Recommender Systems
Xueci Zhao, Chengzhang Zhu and Lizhi Cheng
A Comparative Study of Performance Estimation Methods for Time Series Forecasting
Vitor Cerqueira, Luis Torgo, Jasmina Smailović and Igor Mozetič

Special Session on Big Data and Disaster Management (BDDM)

Supercharging Crowd Dynamics Estimation in Disasters via Spatio-Temporal Deep Neural Network
Fang-Zhou Jiang, Lei Zhong, Kanchana Thilakarathna, Aruna Seneviratne, Kiyoshi Takano, Shigeki Yamada and Yusheng Ji
Geo-spatial Multimedia Sentiment Analysis in Disasters
Abdullah Alfarrarjeh, Sumeet Agrawal, Seon Ho Kim and Cyrus Shahabi
Situational Awareness from Social Media Photographs Using Automated Image Captioning
João Monteiro, Asanobu Kitamoto and Bruno Martins

Special Session on Advanced Informatic Measurement using Statistics, Machine Learning and Pattern Recognition (AimSMLPR)

Machine Learning Independent of Population Distributions for Measurement
Takashi Washio, Gaku Imamura and Genki Yoshikawa