Dan Yang
Prof. Dan YANG
Innovation and Information Management
Associate Professor
Associate Director, Institute of Digital Economy and Innovation

3917 0015

KK 816

Academic Appointments
  • Associate Director of Institute of Digital Economy and Innovation, HKU, 2022-
  • Associate Professor of Innovation and Information Management (IIM), HKU, 2018-
  • Assistant Professor of Statistics, Rutgers University, 2013-2019
  • Postdoc, Statistical and Applied Mathematical Sciences Institute, 2012-2013
Education
  • Ph.D. in Statistics, University of Pennsylvania, 2007-2012
  • B.S. in Economics, Peking University, China, 2004-2007
  • B.S. in Statistics, Peking University, China, 2003-2007
Biography

Dan Yang is an Associate Professor in Innovation and Information Management at the Business School and Associate Director of the Institute of Digital Economy and Innovation at the University of Hong Kong. She received her doctoral degree in Statistics from the Wharton School of the University of Pennsylvania and her bachelor’s degrees in Statistics and Economics from Peking University. Prior to joining HKU, she was an Assistant Professor in the Department of Statistics at Rutgers University.

Professor Yang’s research interests include tensor data, high-dimensional statistical inference, time series, dimension reduction, network data, functional data, and business applications in economics, finance and healthcare. Her work has been published in journals including the Journal of the American Statistical Association (with discussion), the Annals of Statistics, the Journal of the Royal Statistical Society Series B, the Journal of Econometrics, and the Journal of Machine Learning Research. She also served as an associate editor for Statistica Sinica.

Research Interests
  • Tensor data analysis
  • High-dimensional statistical inference
  • Time series analysis
  • Dimension reduction
  • Network analysis
  • Functional data analysis
  • Portfolio management
  • Analysis of observational studies and causal inference
  • Business applications in Finance, Economics, and Healthcare
Honours and Awards
  • Faculty Knowledge Exchange Award of the Business School at HKU, 2023
  • Duckworth Fellowship of the Wharton School at the University of Pennsylvania, 2011
  • Student Travel Award in Health Policy Statistics Section of ASA at the Joint Statistical Meetings, 2011
  • Eastern North American Region (ENAR) Student Paper Award, 2011
  • J. Parker Bursk Memorial Prize for excellence in research at Wharton School, 2010
  • National Third Prize of Chinese Mathematical Olympiad, 2003
Grants and Funding
  • Theme-Based Research Fund at HKU Shenzhen Research Institutes, Smart Healthcare – Accessibility, Quality and Affordability, RMB5,000,000, Role: Co-PI, 2023-2028
  • HK Research Grant Council, CRF, Hong Kong RECAP: A Systematic Response Strategy for Novel Infectious Disease Pandemic, Project No. C7162-20G, HKD$3,952,247, Role: Co-PI, 2021-2024
  • HK Research Grant Council, GRF, Statistical Learning with Big Data of Dynamic Tensor Structure, Project No. 17301620, HKD$599,861, Role: PI, 2021-2024
  • US NSF BIGDATA, Statistical Learning with Large Dynamic Tensor Data, IIS-1741390, USD$1,000,000 (with extra USD$170,000 for cloud computing), Role: Co-PI, 2017-2021
Editorial Board
  • Associate Editor, Statistica Sinica, 2020-2023
    JIF: 1.5; journal ranking: 74.4%
Selected Publications
  • Yuefeng Han, Rong Chen, Dan Yang, and Cun-Hui Zhang (2024+). Tensor factor model estimation by iterative projection. Annals of Statistics, in press.
  • Yuefeng Han, Dan Yang, Cun-Hui Zhang, and Rong Chen (2024+). CP factor model for dynamic tensors. Journal of the Royal Statistical Society. Series B: Statistical Methodology, in press.
  • Xin Chen, Dan Yang, Yan Xu, Yin Xia, Dong Wang, and Haipeng Shen (2023). Testing and Support Recovery of Correlation Structures for Matrix-Valued Observations with an Application to Stock Market Data. Journal of Econometrics, 232(2):544-564.
  • Rong Chen, Dan Yang, and Cun-Hui Zhang (2022). Rejoinder: Factor Models for High-Dimensional Tensor Time Series. Journal of the American Statistical Association, 117(537):128-132.
  • Rong Chen, Dan Yang, and Cun-Hui Zhang (2022). Factor Models for High-Dimensional Tensor Time Series. Journal of the American Statistical Association, 117(537):94-116.
  • Rong Chen, Han Xiao, and Dan Yang (2021). Autoregressive Models for Matrix-valued Time Series. Journal of Econometrics, 222(1):539-560.
  • Gen Li, Dan Yang, Andrew B. Nobel, and Haipeng Shen (2016). Supervised Singular Value Decomposition and Its Asymptotic Properties. Journal of Multivariate Analysis, 146:7-17.
  • Dan Yang, Zongming Ma, and Andreas Buja (2016). Rate Optimal Denoising of Simultaneously Sparse and Low Rank Matrices. Journal of Machine Learning Research, 17:1-27.
  • Dan Yang, Zongming Ma, and Andreas Buja (2014). A Sparse Singular Value Decomposition Method for High-Dimensional Data. Journal of Computational and Graphical Statistics, 23(4):923-942.
  • Dan Yang, Dylan S. Small, Jeffrey H. Silber, and Paul R. Rosenbaum (2012). Optimal Matching with Minimal Deviation from Fine Balance in a Study of Obesity and Surgical Outcomes. Biometrics, 68(2):628-636.
  • Dan Yang and Dylan Small (2012). An R Package and a Study of Methods for Computing Empirical Likelihood. Journal of Statistical Computation and Simulation, 83(7):1363-1372.
Teaching
  • PMBA 6093 Analytics for Managers
  • MSBA 7013 Forecasting and Predictive Analytics
  • MSBA 7028 Deep Learning
  • MSBA 7011 Managing and Mining Big Data
  • China Resources (CR) Data Scientist Program: Big Data
  • CR Data Scientist Program: Forecasting Analytics
  • China Construction Bank (CCB) Data Analyst Program: Big Data
  • CCB Data Analyst Program: Forecasting Analytics
Recent Publications
Testing and Support Recovery of Correlation Structures for Matrix-valued Observations With an Application to Stock Market Data

Estimation of the covariance matrix of asset returns is crucial to portfolio construction. As suggested by economic theories, the correlation structure among assets differs between emerging markets and developed countries. It is therefore imperative to make rigorous statistical inference on correlation matrix equality between the two groups of countries. However, if the traditional vector-valued approach is undertaken, such inference is either infeasible due to the limited number of countries compared with the relatively abundant assets, or invalid due to violations of the temporal independence assumption. This highlights the necessity of treating the observations as matrix-valued rather than vector-valued. With matrix-valued observations, our problem of interest can be formulated as statistical inference on covariance structures under sub-Gaussian distributions, i.e., testing non-correlation and correlation equality, as well as the corresponding support estimations. We develop procedures that are asymptotically optimal under some regularity conditions. Simulation results demonstrate the computational and statistical advantages of our procedures over certain existing state-of-the-art methods for both normal and non-normal distributions. Application of our procedures to stock market data reveals interesting patterns and validates several economic propositions via rigorous statistical testing.

Factor Models for High-Dimensional Tensor Time Series

Large tensor (multi-dimensional array) data routinely appear nowadays in a wide range of applications, due to modern data collection capabilities. Often such observations are taken over time, forming tensor time series. In this paper we present a factor model approach to the analysis of high-dimensional dynamic tensor time series and multi-category dynamic transport networks. Two estimation procedures are presented along with their theoretical properties and simulation results. Two applications are used to illustrate the model and its interpretations.

Autoregressive Models for Matrix-valued Time Series

In finance, economics and many other fields, observations in a matrix form are often generated over time. For example, a set of key economic indicators are regularly reported in different countries every quarter. The observations at each quarter neatly form a matrix and are observed over consecutive quarters. Dynamic transport networks with observations generated on the edges can be formed as a matrix observed over time. Although it is natural to turn the matrix observations into long vectors, then use the standard vector time series models for analysis, it is often the case that the columns and rows of the matrix represent different types of structures that are closely interplayed. In this paper we follow the autoregression for modeling time series and propose a novel matrix autoregressive model in a bilinear form that maintains and utilizes the matrix structure to achieve a substantial dimensional reduction, as well as more interpretability. Probabilistic properties of the models are investigated. Estimation procedures with their theoretical properties are presented and demonstrated with simulated and real examples.
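
To make the dimension-reduction claim concrete, the sketch below assumes the first-order bilinear form X_t = A X_{t-1} B^T + E_t, which is one common way to write such a model; the abstract does not spell out the equation. The function names (mar1_step, parameter_counts, and the small matrix helpers) are hypothetical and written only for illustration, and this is not the authors' estimation procedure. The sketch shows that the bilinear step matches a vector autoregression on the stacked observation whose coefficient matrix is the Kronecker product of B and A, while needing far fewer free parameters.

```python
import random


def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def vec(M):
    """Stack the columns of M into one long vector (column-major order)."""
    return [M[i][j] for j in range(len(M[0])) for i in range(len(M))]


def kron(B, A):
    """Kronecker product of B and A: block (i, j) equals B[i][j] * A."""
    return [[B[i][j] * A[k][l] for j in range(len(B[0])) for l in range(len(A[0]))]
            for i in range(len(B)) for k in range(len(A))]


def matvec(M, v):
    """Multiply a matrix by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def mar1_step(A, X_prev, B, noise_sd=0.0, rng=None):
    """One step of the assumed bilinear form X_t = A X_{t-1} B^T + E_t.

    A acts on the rows and B on the columns of the previous observation.
    With noise_sd == 0 the step is deterministic.
    """
    signal = matmul(matmul(A, X_prev), transpose(B))
    if noise_sd == 0.0:
        return signal
    rng = rng or random.Random()
    return [[value + rng.gauss(0.0, noise_sd) for value in row] for row in signal]


def parameter_counts(m, n):
    """Coefficient counts for m-by-n observations: bilinear form vs. VAR(1) on vec(X)."""
    return {"bilinear": m * m + n * n, "vectorized_var": (m * n) ** 2}


if __name__ == "__main__":
    A = [[1, 2], [0, 1]]                    # 2 x 2 row-side coefficients
    B = [[1, 0, 1], [2, 1, 0], [0, 1, 1]]   # 3 x 3 column-side coefficients
    X = [[1, 2, 3], [4, 5, 6]]              # one 2 x 3 observation

    X_next = mar1_step(A, X, B)             # noise-free step
    # The bilinear step agrees with a VAR(1) on vec(X) with coefficient kron(B, A).
    print(vec(X_next) == matvec(kron(B, A), vec(X)))
    print(parameter_counts(2, 3))
    print(parameter_counts(10, 20))
```

For 10-by-20 observations, the bilinear form has 100 + 400 = 500 coefficients, against 40,000 for an unrestricted vector autoregression on the stacked data. Note that A and B are identified only up to a scaling factor, since multiplying A by a nonzero constant c and dividing B by c leaves the product unchanged.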