Welcome to the
Data Mining Lab



About

We conduct innovative research on all aspects of knowledge discovery and data mining, ranging from theoretical foundations to novel models and algorithms for data mining problems in science, business, medicine, and engineering.

Lab Mandate
  • Conduct basic research and development in the area of knowledge discovery and data mining.
  • Advance the "science" of knowledge discovery and data mining by supporting training programs and computer science courses.
  • Equip students with both theoretical knowledge and practical experience in the area of knowledge discovery and data mining.
  • Provide an environment for students that fosters the exchange of ideas and collaborations with industry and academic partners, so they can grow as scientists and researchers.
Research Areas
  • data mining
  • machine learning
  • graph mining
  • natural language processing
  • big data analytics
  • knowledge discovery
  • data visualization
  • data mining applications

People

The Data Mining Lab's achievements are the direct result of the talent and dedication of our people.

Faculty Members

Aijun An

Professor

Nick Cercone

(Late) Professor

Manos Papagelis

Associate Professor

Current and Former Postdoctoral Fellows (PDF)
Doctoral Students/Candidates (PhD)
Master's Students (MSc/MASc/MScAI)
  • Saeed Abbasi (natural language processing, machine learning)
  • Gian Alix (graph mining, trajectory data mining, machine learning)
  • Mahmoud Alsaeed (graph mining, trajectory data mining, machine learning)
  • Seyed Nima Tayarani Bathaie (natural language processing, computer vision, machine learning)
  • Nicholas Dehnen (natural language processing, machine learning)
  • Ali Faraji (graph mining, trajectory data mining, machine learning)
  • Andrew Jaramillo (databases, query optimization, machine learning)
  • Jing Li (graph mining, trajectory data mining, machine learning)
  • Xingjiang Mao (clustering, reinforcement learning, data compression)
  • Amirhossein Nadiri (graph mining, trajectory data mining, machine learning)
  • Yiming Shao (deep reinforcement learning, GPU job clustering, machine learning)
  • Wei Yuan (natural language processing, machine learning)
  • Chenxing Zheng (natural language processing, machine learning)
Undergraduate research assistants/interns
  • Jingpeng Pan (optimizing data migration using online clustering, F22, W23)
  • Yunfei Peng (optimizing data compression using online clustering, S22, F22, W23)
  • Gian Alix (spatiotemporal epidemic modeling, SU21)
  • Nina Yanin (spatiotemporal epidemic modeling, SU21, F21, W22, SU22, F22, W23)
  • Jing Li (spatiotemporal epidemic modeling, SU20, F20, W21, SU21)
  • Mahmoud Alsaeed (scalable analytics of object intersection problems, SU19)
  • Zhiyuan Cao (dynamic network representation learning, SU19)
  • Kenneth Tjhia (trajectory and network representation learning, SU19, SU20)
Staff / Research Assistants
  • Alireza Naeiji (question answering, machine learning)
  • Bon Ryu (distributed deep learning, machine learning)
Visitors
  • Chalermrat Nontapa, International trainee, PhD candidate, Math, Thammasat University (Bayesian Network, Time Series, Game Theory, Optimization)
Alumni (in reverse chronological order)
  • Shima Khoshraftar (PhD, 2023, Learning Effective Embeddings for Dynamic Graphs and Quantifying Graph Embedding Interpretability)
  • Ali Nemati (MSc, 2022, Evaluating and Forecasting the Operational Performance of Road Intersections)
  • Alireza Naeiji (MSc, 2022, Question Generation Using Sequence-to-Sequence Model with Semantic Role Labels)
  • Mahta Shafieesabet (MSc, 2022, Efficient Mining of Active Components in a Network of Time Series)
  • Fazel Arasteh (MASc, 2022, Network-aware Multi-agent Reinforcement Learning for Adaptive Navigation of Vehicles in a Dynamic Road Network)
  • Hoorieh Marefat (MSc, 2021, Fast Similarity Graph Construction via Data Sketching Techniques)
  • Farnaz Beidokhtinezhad (MSc, 2021, Malicious User Aware Client Selection for Federated Learning)
  • Nastaran Babanejad (PhD, 2020, Enriching Word Representation Learning for Affect Detection and Affect-aware Recommendations)
  • Marjan Delpisheh (MSc, 2020, Neural Question Generation With Transfer Learning And Utilization Of External Knowledge)
  • Saim Mehmood (MSc, 2020, Learning Semantic Relationships of Geographical Areas based on Trajectories)
  • Niloy Eric Costa (MSc, 2020, Effective Density Visualization of Multiple Overlapping Axis-aligned Objects)
  • Xing (Shane) Zhao (MSc, 2020, Elastic Synchronization for Efficient and Effective Distributed Deep Learning)
  • Zana Rashidi (MSc, 2019, Adaptive Momentum for Neural Network Optimization)
  • Tilemachos Pechlivanoglou (MSc, 2019, Sweep-line Extensions to the Multiple Object Intersection Problem: Methods and Applications in Graph Mining)
  • Farzaneh Heidari (MSc, 2019, Evolving Network Representation Learning Based on Random Walks)
  • Heidar Davoudi (PhD, 2018, User Acquisition and Engagement in Digital News Media)
  • Nima Shahbazi (PhD, 2018, Discovery and Effective Use of Frequent Item-sets and Association Rules in Datasets (co-supervised with Jarek Gryz))
  • Emad Gohari (MSc, 2018, Interactive Question Answering Using Frame-based Knowledge Representation)
  • Forouq Khonsari (MSc, 2018, Mining Large-Scale News Articles for Predicting Forced Migration via Machine Learning Techniques)
  • Ameeta Agrawal (PhD, 2018, Enriching Affect Analysis through Emotion and Sarcasm Detection)
  • Morteza Zihayat (PhD, 2016, Mining High Utility Patterns over Data Streams)
  • Yan (Jason) Chen (MSc, 2015, Approximate Parallel High Utility Itemset Mining)
  • Elnaz Delpisheh (PhD, 2015, Extending Topic Models with Syntax and Semantics Relationships)
  • Mehdi Kargar (PhD, 2013, Keyword Search in Graphs, Relational Databases and Social Networks)
  • Martin Dimkovski (MSc, 2012, A Novel Computational Model of Neocortical Columns with Glia as Learning Agent)
  • Ameeta Agrawal (MSc, 2011, Unsupervised Emotion Detection from Text Using Semantic and Syntactic Relations)
  • Bahareh Sarrafzadeh (MSc, 2011, Cross-lingual Word Sense Disambiguation for Languages with Scarce Resources)
  • Hashmat Rohian (MSc, 2011, Discovering Temporal Associations among Significant Changes (co-supervised with Jimmy Huang))
  • Damon Sotoudeh-Hosseini (MSc, 2010, Detecting Partial Drifts Using a Rule Induction Framework)
  • Qian Wan (PhD, 2009, Contrast and Compact Data Mining: Discovering Novel and Useful Patterns from Large Databases)
  • Miro Kuc (MSc, 2009, Cluster Validation Indices: Sensitivity to Distance between Clusters and Affinity to Concurrency)
  • Yang Liu (PhD, 2009, Review Mining from Online Media (co-supervised with Jimmy Huang))
  • Vlad Gerchikov (MSc, 2008, AV-Space with Paging and Performance Comparison)
  • Qinsong Yao (PhD, 2006, Discovering and Using Database User Access Patterns)
  • Bill Andreopoulos (PhD, 2006, Clustering Algorithms for Categorical Data (co-supervised with Steven Wang))
  • Linyan Wang (MSc, 2006, AV-Space for Efficiently Learning Classification Rules from Large Data Sets)
  • Yu Li (MSc, 2005, Integrating XML Data for Virtual OLAP using XML Schemas and UML)
  • Yang Liu (MSc, 2004, Markov Model-based Methods for Web User Clustering and Surfing Recommendation (co-supervised with Jimmy Huang))
  • Ying Zou (MSc, 2004, A Comparison and Selection of Methods for Handling Missing Data in Data Mining)
  • Zhirong Tao (MSc, 2004, Scalable-CLUES: A Scalable Non-parametric Clustering Method Based on Local Shrinking)
  • Qian Wan (MSc, 2003, Efficient Mining of Indirect Associations Using HI-mine)
  • Leah Spo (MSc)
  • Xiangdong An (PhD)
  • Serene Wong (PhD)
  • Kayvan Tirdad (PhD Candidate)
Join Us!

We are looking for bright and hard-working domestic or international students at all levels (Postdoc, PhD, MSc, Senior undergrad). If you are interested in conducting research in the area of data mining and machine learning, we would love to hear from you.

Research

Our research builds upon a foundation of academic and industry collaborations that aim to transfer knowledge and have a global impact on academia and industry.

Active

Reinforcement Learning

Active

Deep Learning

Active

Text Mining

Active

Streaming Graph Mining

Active

Trajectory Data Mining

Active

Network Representation Learning

Publications (2010 - present)

Our research is published in peer-reviewed conferences and journals, ensuring that our work reaches the data mining community at large ([J]ournal, [C]onference, [W]orkshop).

2022
  • [J] Epidemic Spreading in Trajectory Networks. T. Pechlivanoglou, J. Li, J. Sun, F. Heidari, M. Papagelis. Big Data Research (BDR, Vol. 27, 100275, pp 1-15, 2022).
  • [C] Evaluating and Forecasting the Operational Performance of Road Intersections. A. Nematichari, T. Pechlivanoglou, M. Papagelis. Proceedings of the 30th International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2022).
  • [C] Network-aware Multi-agent Reinforcement Learning for the Vehicle Navigation Problem. F. Arasteh, S. SheikhGarGar, M. Papagelis. Proceedings of the 30th International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2022).
  • [C] A Mobility-based Recommendation System for Mitigating the Risk of Infection during Epidemics. G. Alix*, N. Yanin*, T. Pechlivanoglou, J. Li, F. Heidari, M. Papagelis. Proceedings of the 23rd IEEE International Conference on Mobile Data Management (IEEE MDM 2022).
  • [C] Temporal Graph Representation Learning via Maximal Cliques. S. Khoshraftar, A. An, N. Babanejad. Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2022).
  • [W] Unsupervised Knowledge Graph Generation Using Semantic Similarity Matching. L. Liu, A. Omidvar, Z. Ma, A. Agrawal, A. An. Proceedings of the Third Workshop on Deep Learning for Low-Resource Natural Language Processing (NAACL/DeepLo 2022)
  • [W] Microscopic Modeling of Spatiotemporal Epidemic Dynamics. T. Pechlivanoglou, G. Alix, N. Yanin, J. Li, F. Heidari, M. Papagelis. Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Spatial Computing for Epidemiology (ACM SIGSPATIAL/SpatialEpi 2022).
2021
  • [J] ZipLine: An Optimized Algorithm for the Elastic Bulk Synchronous Parallel Model. X. Zhao, M. Papagelis, A. An, B.X. Chen, J. Liu, Y. Hu. Machine Learning (MACH, Vol. 110, pp 2867–2903, 2021)
  • [J] Extending Isolation Forest for Anomaly Detection in Big Data via K-Means. MD T. Laskar, J. Huang, V. Smetana, C. Stewart, K. Pow, A. An, S. Chan, L. Liu. ACM Transactions on Cyber-Physical Systems (ACM TCPS, 5(4), 1-26, 2021).
  • [J] Paywall Policy Learning in Digital News Media. H. Davoudi, Z. Rashidi, A. An, M. Zihayat and G. Edall. IEEE Transactions on Knowledge and Data Engineering (TKDE), Vol.33, No.10, 2021, pp.3394-3409.
  • [J] OL-HeatMap: Effective Density Visualization of Multiple Overlapping Rectangles. N. E. Costa, T. Pechlivanoglou, M. Papagelis. Big Data Research (BDR, Vol. 25, 100235, pp 1-12, 2021)
  • [C] (Extended Abstract) ZipLine: An Optimized Algorithm for the Elastic Bulk Synchronous Parallel Model. X. Zhao, M. Papagelis, A. An, B. X. Chen, J. Liu, Y. Hu. Proceedings of the 8th IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA 2021).
  • [C] Centrality-based Interpretability Measures for Graph Embeddings. S. Khoshraftar, S. Mahdavi and A. An. Proceedings of the 8th IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA 2021).
2020
  • [J] Evolving Network Representation Learning based on Random Walks. F. Heidari, M. Papagelis. Applied Network Science (APNS, Vol. 5, No. 18, pp 1-38, 2020)
  • [C] Affective and Contextual Embedding for Sarcasm Detection. N. Babanejad, A. Agrawal, H. Davoudi, A. An, M. Papagelis. Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020).
  • [C] MRSweep: Distributed In-Memory Sweep-line for Scalable Object Intersection Problems. T. Pechlivanoglou, M. Alsaeed, M. Papagelis. Proceedings of the 7th IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA 2020).
  • [C] A Comprehensive Analysis of Preprocessing for Word Representation Learning in Affective Tasks. N. Babanejad, A. Agrawal, A. An, M. Papagelis. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020).
  • [C] Leveraging Transitions of Emotions for Sarcasm Detection. A. Agrawal, A. An, M. Papagelis. Proceedings of the 43rd ACM International SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR 2020).
  • [C] Learning Semantic Relations of Geographic Areas based on Trajectories. S. Mehmood, M. Papagelis. Proceedings of the 21st IEEE International Conference on Mobile Data Management (IEEE MDM 2020).
  • [C] Towards topology aware pre-emptive job scheduling with deep reinforcement learning. B. Ryu, A. An, Z. Rashidi, J. Liu, Y. Hu. Proceedings of the 30th Annual International Conference on Computer Science and Software Engineering (CASCON 2020).
  • [C] Multiple pedestrian tracking based on modified mask R-CNN and enhanced particle filter using an adaptive information driven motion model. M. Al-Shatnawi, A. Asif, V. Movahedi, A. An, Y. Hu, J. Liu. Proceedings of the 30th Annual International Conference on Computer Science and Software Engineering (CASCON 2020).
  • [C] Adaptive Momentum Coefficient for Neural Network Optimization. Z. Rashidi, K. Ahmadi, A. An and X. Wang. Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD 2020).
  • [C] Question-Worthy Sentence Selection for Question Generation. S. Mahdavi, A. An, H. Davoudi, M. Delpisheh, E. Gohari. Canadian Conference on Artificial Intelligence (CANAI 2020).
  • [C] Learning to Determine the Quality of News Headlines. A. Omidvar, H. Pourmodheji, A. An, G. Edall. Proceedings of the 12th International Conference on Agents and Artificial Intelligence (ICAART 2020).
2019
  • [J] A Versatile Computational Framework for Group Pattern Mining of Pedestrian Trajectories. A. Sawas, A. Abuolaim, M. Afifi, M. Papagelis. GeoInformatica (Vol. 23, No. 4, pp 501-531, 2019).
  • [J] A Utility-based News Recommendation System. M. Zihayat, A. Ayanso, X. Zhao, H. Davoudi and A. An. Decision Support Systems, Vol. 117, February 2019, pp.14-27. (DSS).
  • [C] Elastic Bulk Synchronous Parallel for Distributed Deep Learning. Xing Zhao, Manos Papagelis, Aijun An, Bao Xin Chen, Junfeng Liu, and Yonggang Hu. Proceedings of the 19th IEEE International Conference on Data Mining (IEEE ICDM 2019).
  • [C] Efficient Mining and Exploration of Multiple Axis-aligned Intersecting Objects. T. Pechlivanoglou, V. Chu, M. Papagelis. Proceedings of the 19th IEEE International Conference on Data Mining (IEEE ICDM 2019).
  • [C] Dynamic Graph Embedding via LSTM History Tracking. S. Khoshraftar, S. Mahdavi, A. An, Y. Hu and J. Liu. Proceedings of the 6th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2019).
  • [C] Dynamic Stale Synchronous Parallel Distributed Training for Deep Learning. X. Zhao, A. An, J. Liu and B. X. Chen. Proceedings of the 39th IEEE International Conference on Distributed Computing Systems (IEEE ICDCS 2019).
  • [C] Content-based Dwell Time Engagement Prediction Model for News Articles. H. Davoudi, A. An and G. Edall. Proceedings of the 17th Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019).
  • [C] Practical Key Recovery Model for Self-Sovereign Identity Based Digital Wallets. R. Soltani, U. T. Nguyen, A. An. Proceedings of Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech 2019).
  • [W] Leveraging Emotion Features in News Recommendations. N. Babanejad, A. Agrawal, H. Davoudi, A. An, M. Papagelis. Proceedings of the 13th ACM Conference on Recommender Systems 2019 - News Recommendation and Analytics Workshop (ACM RecSys INRA 2019 Workshop).
  • [W] Dynamic Joint Variational Graph Autoencoders. S. Mahdavi, S. Khoshraftar, A. An. Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (PKDD/ECML Workshops 2019).
2018
  • [J] Machine learning and BIM visualization for maintenance issue classification and enhanced data collection. J. J. McArthur, N. Shahbazi, R. Fok, C. Raghubar, B. Bortoluzzi, A. An. Advanced Engineering Informatics (Vol. 38: pp 101-112, 2018)
  • [C] Adaptive Paywall Mechanism for Digital News Media. H. Davoudi, A. An, M. Zihayat and G. Edall. Proceedings of the 2018 ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2018).
  • [C] Decoupling the Layers in Residual Networks. R. Fok, A. An, Z. Rashidi, X. Wang. Proceedings of the 6th International Conference on Learning Representations (ICLR 2018).
  • [C] Fast and Accurate Mining of Node Importance in Trajectory Networks. T. Pechlivanoglou, M. Papagelis. Proceedings of the 6th IEEE International Conference on Big Data (IEEE Big Data 2018).
  • [C] dynnode2vec: Scalable Dynamic Network Embedding. S. Mahdavi, S. Khoshraftar and A. An. Proceedings of the 2018 IEEE International Conference on Big Data (IEEE BigData 2018).
  • [C] Affective Representations for Sarcasm Detection. A. Agrawal and A. An. Proceedings of the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018).
  • [C] Learning Emotion-enriched Word Representations. A. Agrawal, A. An, M. Papagelis. Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018).
  • [C] Tensor Methods for Group Pattern Discovery of Pedestrian Trajectories. A. Sawas, A. Abuolaim, M. Afifi, M. Papagelis. Proceedings of the 19th IEEE International Conference on Mobile Data Management (IEEE MDM 2018, best paper award).
  • [C] Trajectolizer: Interactive Analysis and Exploration of Trajectory Group Dynamics. A. Sawas, A. Abuolaim, M. Afifi, M. Papagelis. Proceedings of the 19th IEEE International Conference on Mobile Data Management (IEEE MDM 2018, demo).
  • [C] EvoNRL: Evolving Network Representation Learning based on Random Walks. F. Heidari, M. Papagelis. Proceedings of the 7th International Conference on Complex Networks and Their Applications (Complex Networks 2018).
  • [C] Using Neural Network for Identifying Clickbaits in Online News Media. A. Omidvar, H. Jiang, A. An. Proceedings of 5th International Conference on Information Management and Big Data (SIMBig 2018)
  • [C] A New Approach to Client Onboarding Using Self-Sovereign Identity and Distributed Ledger. R. Soltani, U. T. Nguyen, A. An. iThings/GreenCom/CPSCom/SmartData 2018: 1129-1136
  • [W] Effective Team Formation in Expert Networks. M. Zihayat, A. An, L. Golab, M. Kargar, J. Szlichta. Proceedings of the 12th Alberto Mendelzon International Workshop on Foundations of Data Management (AMW 2018)
  • [W] Improving Real-time Pedestrian Detection using Adaptive Confidence Thresholding and Inter-Frame Correlation. M. Al-Shatnawi, V. Movahedi, A. Asif, A. An. Proceedings of the 20th IEEE International Workshop on Multimedia Signal Processing (IEEE MMSP 2018).
  • [W] Scene Classification in Indoor Environments for Robots using Word Embeddings. B. X. Chen, R. Sahdev, D. Wu, X. Zhao, M. Papagelis, and J. K. Tsotsos. Proceedings of the International Conference on Robotics and Automation 2018 Workshop on Multimodal Robot Perception (ICRA MRP 2018).
2017
  • [J] Memory-Adaptive High Utility Sequential Pattern Mining over Data Streams. M. Zihayat, Y. Chen and A. An. Machine Learning, 106(6), 799-836, 2017.
  • [J] Efficiently Mining High Utility Sequential Patterns in Static and Streaming Data. M. Zihayat, C-W. Wu, A. An and V. S. Tseng. Intelligent Data Analysis, Vol.21, No.S1, pp.S103-S135, 2017.
  • [J] Geodesic and Contour Optimization Using Conformal Mapping. R. Fok, A. An and X. Wang. Journal of Global Optimization, 69(1): 23-44 (2017).
  • [J] Mining significant high utility gene regulation sequential patterns. Morteza Zihayat, Heidar Davoudi, Aijun An. BMC Systems Biology 11(6): 109:1-109:14 (2017).
  • [J] Mining Evolving Data Streams with Particle Filters. R. Fok, A. An and X. Wang. Computational Intelligence, 33(2): 147-180 (2017).
  • [J] BIM-based Collaborative Design and Socio-technical Analytics of Green Buildings. T. El-Diraby, T. Krijnen, M. Papagelis. Automation in Construction (AiC, Vol. 82, NO. 10, 2017)
  • [C] Contrast Pattern based Collaborative Behavior Recommendation System for Life Improvement. Y. Chen, M. L. Yann, H. Davoudi, J. Choi, A. An and Z. Mei. Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Jeju, South Korea, May 23-26, 2017.
  • [C] Time-Aware Subscription Prediction Model for User Acquisition in Digital News Media. Heidar Davoudi, Morteza Zihayat and Aijun An. Proceedings of the 2017 SIAM International Conference on Data Mining (SDM'17), Houston, Texas, USA, April 27-29, 2017.
  • [C] Authority-based Team Discovery in Social Networks. Morteza Zihayat, Aijun An, Lukasz Golab, Mehdi Kargar and Jaroslaw Szlichta. Proceedings of the 20th International Conference on Extending Database Technology (EDBT'17), Venice, Italy, March 21-24, 2017.
2016
  • [J] Approximate Parallel High Utility Itemset Mining. Y. Chen and A. An. Big Data Research, Vol. 6: 26-42 (2016). (Source code for PHUI-Miner).
  • [J] Time aware topic based recommender system. E. Delpisheh, A. An, H. Davoudi and E. Gohari. Big Data and Information Analytics (BDIA), Vol. 1, No. 2/3, 261-274, 2016.
  • [C] Top-k Utility-based Gene Regulation Sequential Pattern. M. Zihayat, H. Davoudi, and A. An. Proceedings of the 2016 IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM 2016), in Shenzhen, China, Dec 15-18, 2016.
  • [C] Distributed and Parallel High Utility Sequential Pattern Mining. M. Zihayat, Z. Z. Hu, A. An, and Y. Hu. Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), Washington D.C., USA, December 5-8, 2016.
  • [C] Deep Parallelization of Parallel FP-Growth Using Parent-Child MapReduce. A. Makanju, Z. Farzanyar, A. An, N. Cercone, Z. Hu, and Y. Hu. Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), Washington D.C., USA, December 5-8, 2016.
  • [C] Selective Co-occurrences for Word-Emotion Association. A. Agrawal and A. An. Proceedings of the 26th International Conference on Computational Linguistics (COLING'16), Osaka, Japan, December 11-16, 2016.
  • [C] Detecting the Magnitude of Events from News Articles. A. Agrawal, R. Sahdev, H. Davoudi, F. Khonsari, A. An and S. McGrath. Proceedings of the 2016 IEEE/WIC/ACM International Conference on Web Intelligence, Omaha, Nebraska, USA, October 13-16, 2016.
  • [C] Computational Role of Astrocytes in Bayesian Inference and Probability Distribution Encoding. M. Dimkovski and A. An. Proceedings of the 2016 International Conference on Brain Informatics & Health (BIH'16), Omaha, Nebraska, USA, October 13-16, 2016.
  • [C] Ranking Documents through Stochastic Sampling on Bayesian Network-based Models: A Pilot Study. X. Tan, J. X. Huang and A. An. Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'16), Pisa, Italy, July 17-21, 2016. 961-964.
  • [C] Building FP-Tree on the Fly: Single-Pass Frequent Itemset Mining. N. Shahbazi, R. Soltani, J. Gryz, A. An. (2016). Proceedings of the 12th International Conference on Machine Learning and Data Mining in Pattern Recognition (MLDM 2016), New York, United States, July 16-21, 2016. 387-400.
  • [C] Green2.0: Enabling Complex Interactions Between Buildings and People. M. Papagelis, T. F. Krijnen, M. Elshenawy, T. Konomi, R. Fang, T. El-Diraby. Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2016)
2015
  • [J] Refining Social Graph Connectivity via Shortcut Edge Addition. M. Papagelis. ACM Transactions on Knowledge Discovery from Data (ACM TKDD, Vol. 10, NO. 2, 2015)
  • [J] A Bayesian Model for Canonical Circuits in the Neocortex for Parallelized and Incremental Learning of Symbol Representations. M. Dimkovski and A. An. Neurocomputing, 149: 1270-1279, 2015.
  • [J] Finding Top-k r-cliques for Keyword Search from Graphs in Polynomial Delay. M. Kargar and A. An. Knowledge and Information Systems (KAIS), 43(2): 249-280, 2015.
  • [C] Mining high utility sequential patterns from evolving data streams. M. Zihayat, C.-W. Wu, A. An, and V. S. Tseng. Proceedings of the Fifth ASE International Conference on Big Data (BigData 2015), Kaohsiung, Taiwan, pages 52:1-52:6, 2015.
  • [C] Ontology-Based Topic Labeling and Quality Prediction. H. Davoudi and A. An. Proceedings of International Symposium on Methodologies for Intelligent Systems (ISMIS 2015), Lyon, France, October 2015.
  • [C] Meaningful Keyword Search in Relational Databases with Large and Complex Schema. M. Kargar, A. An, N. Cercone, P. Godfrey, J. Szlichta and X. Yu. Proceedings of the 31st IEEE International Conference on Data Engineering (ICDE'15), Seoul, Korea, April 13-17, 2015. 411-422.
2014
  • [J] Mining Top-k High Utility Patterns Over Data Streams. M. Zihayat and A. An. Information Sciences, 285, 2014 (IS).
  • [J] Efficient Duplication Free and Minimal Keyword Search in Graphs. M. Kargar, A. An and X. Yu. IEEE Transactions on Knowledge and Data Engineering (TKDE), 26(7): 1657-1669, 2014.
  • [C] Topic Modeling using Collapsed Typed Dependency Relations. E. Delpisheh, A. An. Proceedings of the 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2014).
  • [C] MeanKS: Meaningful Keyword Search in Relational Databases with Complex Schema. M. Kargar, A. An, N. Cercone, P. Godfrey, J. Szlichta and X. Yu. Proceedings of the 2014 ACM International Conference on Management of Data (ACM SIGMOD 2014, demo).
  • [C] Two-Phase Pareto Set Discovery for Team Formation in Social Networks. M. Zihayat, M. Kargar and A. An. Proceedings of the 2014 IEEE/WIC/ACM International Conference on Web Intelligence (WI'14), Warsaw, Poland, August 11-14, 2014. 304-311.
2013
  • [J] Sampling Online Social Networks. M. Papagelis, G. Das, N. Koudas. IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE, Vol. 25, NO. 3, Mar 2013)
  • [J] Detection of Malicious and Non-malicious Website Visitors Using Unsupervised Neural Network Learning. D. Stevanovic, N. Vlajic and A. An. Applied Soft Computing, Elsevier, 13(1): 698-708, 2013.
  • [J] Riding the Tide of Sentiment Change: Sentiment Analysis with Evolving Online Reviews. Y. Liu, X. Yu, X. Huang and A. An. World Wide Web Journal, 16(4), 477-496, 2013.
  • [C] Signal detection in genome sequences using complexity based features. M. Kargar, A. An, N. Cercone, K. Tirdad, M. Zihayat. Proceedings of the 12th International Workshop on Data Mining in Bioinformatics (BioKDD 2013), Chicago, IL, USA, August 2013. 25-33.
  • [C] Finding Affordable and Collaborative Teams from a Network of Experts. M. Kargar, M. Zihayat and A. An. Proceedings of the 2013 SIAM International Conference on Data Mining (SDM'13), Austin, Texas, USA, May, 2013. 587-595.
2012
  • [J] Feature Evaluation for Web Crawler Detection with Data Mining Techniques. D. Stevanovic, A. An and N. Vlajic. Expert Systems with Applications, 39(10): 8707-8717, 2012.
  • [J] Mining Online Reviews for Predicting Sales Performance: A Case Study in the Movie Domain. Yu, X., Liu, Y., Huang, X. and An, A. IEEE Transactions on Knowledge and Data Engineering (TKDE), 24(4): 720-734, 2012.
  • [C] Unsupervised Emotion Detection from Text using Semantic and Syntactic Relations. A. Agrawal and A. An. Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence (WI'12), Macau, China, December 4-7, 2012. 346-353.
  • [C] Efficient Bi-objective Team Formation in Social Networks. M. Kargar, A. An and M. Zihayat. Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD'12), Bristol, U.K., September 2012. 483-498.
  • [C] Efficient Top-k Keyword Search in Graphs with Polynomial Delay. M. Kargar and A. An. Proceedings of the 28th IEEE International Conference on Data Engineering (ICDE'12 demos), Washington D.C, April 1-5, 2012. 1269-1272.
2011
  • [J] Keyword Search in Graphs: Finding r-cliques. M. Kargar and A. An. Proceedings of the VLDB Endowment, Vol.4, No.10, 2011. pp.681-692.
  • [J] Finding Best Evidence for Evidence-based Best Practice Recommendations in Health Care: the Initial Decision Support System Design. N. Cercone, X. An, J. Li, Z. Gu, and A. An. Knowledge and Information Systems: an International Journal (KAIS), Vol.29, No.1, 159-201, 2011.
  • [J] Combining Integrated Sampling with SVM Ensembles for Learning from Imbalanced Datasets. Y. Liu, X. Yu, X. Huang and A. An. Information Processing & Management (IPM), Vol.47, No.4, 617-631, 2011.
  • [C] TeamExp: Top-k Team Formation in Social Networks. M. Kargar and A. An. Proceedings of the Workshops for the 2011 IEEE International Conference on Data Mining (ICDM'11 demos), Vancouver, Canada, in December 2011. 1231-1234.
  • [C] Discovering Top-k Teams of Experts with/without a Leader in Social Networks. M. Kargar and A. An. Proceedings of the 20th ACM International Conference on Information and Knowledge Management (CIKM'11), Glasgow, U.K., October 24-28, 2011. 985-994.
  • [C] Keyword Search in Graphs: Finding r-cliques. M. Kargar and A. An. Proceedings of the VLDB Endowment, Vol.4, No.10, 2011. pp.681-692. Full paper for the 37th International Conference on Very Large Data Bases (VLDB'11), Seattle, WA, 2011.
  • [C] Unsupervised Clustering of Web Sessions to Detect Malicious and Non-malicious Website Users. D. Stevanovic, N. Vlajic, A. An. Proceedings of the 2nd International Conference on Ambient Systems, Networks and Technologies, Niagara Falls, Canada, September 2011. 123-131.
  • [C] Detecting Web Crawlers from Web Server Access Logs with Data Mining Classifiers. D. Stevanovic, A. An, and N. Vlajic. Proceedings of the 19th International Symposium on Methodologies for Intelligent Systems (ISMIS'11), Warsaw, Poland, June 28-30, 2011.
  • [C] Towards Automatic Acquisition of a Fully Sense Tagged Corpus for Persian. B. Sarrafzadeh, N. Yakovets, N. Cercone and A. An. Proceedings of the 19th International Symposium on Methodologies for Intelligent Systems (ISMIS'11), Warsaw, Poland, June 28-30, 2011. 449-455.
  • [C] Cross Lingual Word Sense Disambiguation for Languages with Scarce Resources. B. Sarrafzadeh, N. Yakovets, N. Cercone and A. An. Proceedings of the 24th Canadian Conference on Artificial Intelligence (AI'11), St. John's, Newfoundland and Labrador, Canada, May 25-27, 2011. 347-358.
  • [C] Suggesting Ghost Edges for a Smaller World. M. Papagelis, F. Bonchi, A. Gionis. Proceedings of the 20th ACM Conference on Information and Knowledge Management (ACM CIKM 2011).
  • [C] Individual Behavior and Social Influence in Online Social Systems. M. Papagelis, V. Murdock, R. van Zwol. Proceedings of the 22nd ACM Conference on Hypertext and Hypermedia (ACM Hypertext 2011).
2010
  • [C] Active Media Technology, Proceedings of AMT 2010, Lecture Notes in Computer Science 6335. A. An, P. Lingras, S. Petty and R. Huang. Springer, 2010.
  • [C] Partial Drift Detection Using a Rule Induction Framework. D. Sotoudeh and A. An. Proceedings of the 19th ACM International Conference on Information and Knowledge Management (CIKM'10), Toronto, Canada, October 26-30, 2010.
  • [C] Evaluation of Different Complexity Measures for Signal Detection in Genome Sequences. M. Kargar and A. An. Proceedings of the 2010 ACM International Conference On Bioinformatics and Computational Biology (ACM-BCB'10), Niagara Falls, NY, August 2-4, 2010.
  • [C] The Effect of Sequence Complexity on the Construction of Protein-Protein Interaction Networks. M. Kargar and A. An. Proceedings of the 2010 International Conference on Brain Informatics (BI'10), Toronto, Canada, August 28-30, 2010.
  • [C] S-PLSA+: Adaptive Sentiment Analysis with Application to Sales Performance Prediction. X. Yu, Y. Liu, X. Huang, A. An. Proceedings of the 33rd Annual International ACM SIGIR Conference on Research & Development on Information Retrieval (SIGIR'10), Geneva, Switzerland, on 19-23 July 2010. 873-874.
  • [C] A Quality-Aware Model for Sales Prediction Using Reviews. X. Yu, Y. Liu, X. Huang, A. An. Proceedings of the 19th International World Wide Web Conference (WWW 2010), Raleigh, North Carolina, April 26-30, 2010. 1217-1218.

Courses

Members of the Data Mining Lab are teaching the following courses related to data mining, graph mining, big data analytics and data visualization.

Contact Us

The Data Mining Lab is hosted at the Electrical Engineering and Computer Science (EECS) department of the Lassonde School of Engineering of York University.

Visit Us

Rooms 2057 & 3057
Lassonde Building
York University
Toronto, ON, Canada.

Interactive Map & Directions