publications

* denotes equal contribution and joint lead authorship.

KDD

CAFO: Feature-Centric Explanation on Time Series Classification

Jaeho Kim, Seok-Ju Hahn, Yoontae Hwang, Junghye Lee, and Seulki Lee,

ACM KDD 2024

Acceptance Rate 20%, Best Poster Award @ UNIST AI Tech Workshop 2024

Abstract

In multivariate time series (MTS) classification, finding the important features (e.g., sensors) for model performance is crucial yet challenging due to the complex, high-dimensional nature of MTS data, intricate temporal dynamics, and the necessity for domain-specific interpretations. Current explanation methods for MTS mostly focus on time-centric explanations, apt for pinpointing important time periods but less effective in identifying key features. This limitation underscores the pressing need for a feature-centric approach, a vital yet often overlooked perspective that complements time-centric analysis. To bridge this gap, our study introduces a novel feature-centric explanation and evaluation framework for MTS, named CAFO (Channel Attention and Feature Orthogonalization). CAFO employs a convolution-based approach with channel attention mechanisms, incorporating a depth-wise separable channel attention module (DepCA) and a QR decomposition-based loss for promoting feature-wise orthogonality. We demonstrate that this orthogonalization enhances the separability of attention distributions, thereby refining and stabilizing the ranking of feature importance. This improvement in feature-wise ranking enhances our understanding of feature explainability in MTS. Furthermore, we develop metrics to evaluate global and class-specific feature importance. Our framework's efficacy is validated through extensive empirical analyses on two major public benchmarks and real-world datasets, both synthetic and self-collected, specifically designed to highlight class-wise discriminative features. The results confirm CAFO's robustness and informative capacity in assessing feature importance in MTS classification tasks. This study not only advances the understanding of feature-centric explanations in MTS but also sets a foundation for future explorations in feature-centric explanations.
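
The abstract mentions a QR decomposition-based loss for feature-wise orthogonality but does not give its form. As a rough illustration only, and not the loss defined in the paper, the sketch below measures how far a set of feature vectors is from being orthogonal by summing the squared off-diagonal entries of R in A = QR, computed with Gram-Schmidt. The function name `qr_offdiag_penalty` and the choice of penalty are assumptions made for this example.

```python
import math

def qr_offdiag_penalty(columns):
    """Squared off-diagonal mass of R in A = QR (hypothetical helper).

    `columns` is a list of feature vectors, i.e. the columns of A.
    The value is zero when the columns are mutually orthogonal.
    """
    q_basis = []   # orthonormal directions built so far
    penalty = 0.0
    for col in columns:
        residual = list(col)
        for q in q_basis:
            # r is the component of this column along an earlier direction
            r = sum(qi * ri for qi, ri in zip(q, residual))
            penalty += r * r
            residual = [ri - r * qi for qi, ri in zip(q, residual)]
        norm = math.sqrt(sum(ri * ri for ri in residual))
        if norm > 1e-12:  # skip columns that add no new direction
            q_basis.append([ri / norm for ri in residual])
    return penalty

orthogonal = qr_offdiag_penalty([[1.0, 0.0], [0.0, 1.0]])
overlapping = qr_offdiag_penalty([[1.0, 0.0], [1.0, 1.0]])
```

Under this assumed penalty, orthogonal feature vectors score zero and overlapping ones score higher, which is the kind of quantity an orthogonality-promoting loss would push toward zero.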

FRL

Heterogeneous Trading Behaviors of Individual Investors.

Yoontae Hwang, Junpyo Park, Jangho Kim, Yongjae Lee, and Frank J. Fabozzi,

Finance Research Letters 2023

Acceptance Rate 28%

AOR

Identifying household finance heterogeneity via deep clustering

Yoontae Hwang, Yongjae Lee, and Frank J. Fabozzi,

Annals of Operations Research 2023

Acceptance Rate 33.3%

Abstract

Households are becoming increasingly heterogeneous. While previous studies have revealed many important insights (e.g., wealth effect, income effect), they could only incorporate two or three variables at a time. However, in order to have a more detailed understanding of complex household heterogeneity, more variables should be considered simultaneously. In this study, we argue that advanced clustering techniques can be useful for investigating high-dimensional household heterogeneity. A deep learning-based clustering method is used to effectively handle the high-dimensional balance sheet data of approximately 50,000 households. The employment of appropriate dimension-reduction techniques is the key to incorporating the full joint distribution of high-dimensional data in the clustering step. Our study suggests that various variables should be used together to explain household heterogeneity. Asset variables are found to be crucial for understanding heterogeneity within wealthy households, while debt variables are more important for those households that are not wealthy. In addition, relationships with sociodemographic variables (e.g., age, education, and family size) were further analyzed. Although clusters are found only based on financial variables, they are shown to be closely related to most sociodemographic variables.

QF

Household Financial Health: A Machine Learning Approach for Data-Driven Diagnosis and Prescription

Kyungbin Kim*, Yoontae Hwang*, Dongcheol Lim, Suhyeon Kim, Junghye Lee, and Yongjae Lee,

Quantitative Finance 2023

Acceptance Rate 23.3%

Abstract

Household finances are being threatened by unprecedented social and economic upheavals, including an aging society and slow economic growth. Numerous researchers and practitioners have provided guidelines for improving the financial status of households; however, the challenge of handling heterogeneous households remains nontrivial. In this study, we propose a new data-driven framework for the financial health of households to address the needs for diagnosing and improving financial health. This research extends the concept of healthcare to household finance. We develop a novel deep learning-based diagnostic model for estimating household financial health risk scores from real-world household balance sheet data. The proposed model can successfully manage the heterogeneity of households by extracting useful latent representations of household balance sheet data while incorporating the risk information of each variable. That is, we guide the model to generate higher latent values for households with risky balance sheets. We also show that the gradient of the model can be utilized for prescribing recommendations for improving household financial health. The robustness and validity of the new framework are demonstrated using empirical analyses.

ICAIF

SimStock: Representation Model for Stock Similarities

Yoontae Hwang, Junhyeong Lee, Daham Kim, Seunghwan Noh, Joohwan Hong, and Yongjae Lee,

ICAIF 2023

Acceptance Rate 21% (Oral-Accept)

Abstract

In this study, we introduce SimStock, a novel framework leveraging self-supervised learning and temporal domain generalization techniques to represent similarities of stock data. Our model is designed to address two critical challenges: 1) temporal distribution shift (caused by the non-stationarity of financial markets), and 2) ambiguity in conventional regional and sector classifications (due to rapid globalization and digitalization). SimStock exhibits outstanding performance in identifying similar stocks across four real-world benchmarks, encompassing thousands of stocks. The quantitative and qualitative evaluation of the proposed model compared to various baseline models indicates its potential for practical applications in stock market analysis and investment decision-making.

FRL

Stop-loss adjusted labels for machine learning-based trading of risky assets

Yoontae Hwang, Junpyo Park, Dong-Young Lim, and Yongjae Lee,

Finance Research Letters 2023

Acceptance Rate 28%

Abstract

Since the rise of ML/AI, many researchers and practitioners have been trying to predict future stock price movements. In practice, however, stop-loss rules, which sell an asset if its price falls below a predetermined level, are widely adopted to manage risk. Hence, some buy signals from prediction models could be wasted if stop-loss is triggered. In this study, we propose a stop-loss adjusted labeling scheme to reduce the discrepancy between prediction and decision making. It can be easily incorporated into any ML/AI prediction model. Experimental results on U.S. futures and cryptocurrencies show that this simple tweak significantly reduces risk.
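
The labeling idea in this abstract can be sketched in a few lines. The version below is a minimal illustration under assumed rules, not the scheme specified in the paper: a long entry at time t is labeled 1 only if the price ends the horizon above the entry price and never falls below the stop-loss level on the way. The function name and the `horizon` and `stop_loss` parameters are hypothetical.

```python
def stop_loss_adjusted_labels(prices, horizon, stop_loss):
    """Binary labels for long entries, adjusted for a stop-loss rule.

    A label is 1 if the position is profitable at the end of `horizon`
    steps AND the stop-loss was never triggered; otherwise 0.
    `stop_loss` is a fraction of the entry price (e.g. 0.05 for 5%).
    """
    labels = []
    for t in range(len(prices) - horizon):
        entry = prices[t]
        stop_level = entry * (1.0 - stop_loss)
        window = prices[t + 1 : t + horizon + 1]
        # The position would have been sold as soon as the price went below the stop level.
        stopped_out = any(p < stop_level for p in window)
        profitable = window[-1] > entry
        labels.append(1 if profitable and not stopped_out else 0)
    return labels

# An end-of-horizon label would mark t=0 as a buy (100 -> 105),
# but the dip to 94 triggers a 5% stop-loss first.
example = stop_loss_adjusted_labels([100, 94, 105, 110], horizon=2, stop_loss=0.05)
```

In this toy path the first entry is relabeled from a buy to a non-buy, which is the discrepancy between prediction and decision making that the abstract describes.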

KMFA

A Study on the Estimation of Apartment Price Index: Focused on the Machine Learning Algorithm

Yoontae Hwang

Journal of Money & Finance 2019

Acceptance Rate 45.71%, Domestic journal (South Korea)

working paper

* denotes equal contribution and joint lead authorship.

Submit

Geodesic Flow Kernels for Semi-Supervised Learning on Mixed-Variable Tabular Dataset

Yoontae Hwang, and Yongjae Lee,

Top AI conferences (coming soon) 2025

Acceptance Rate 20%

Submit

Temporal Representation Learning for Stock Similarities and Its Applications in Investment Management

Yoontae Hwang, Stefan Zohren, and Yongjae Lee,

Finance Journal (coming soon) 2025

Acceptance Rate 20%, Best Paper Award @ the Korean Academic Society of Business Administration 2024

© Copyright 2022 Yoontae Hwang.
