Curiousity Hub

Zenan Huang (黄泽楠)

Hangzhou, China

My research focuses on Large Language Models (LLMs) and reasoning, investigating how to build intelligent systems that can perform complex reasoning across diverse domains. I work on advancing the reasoning capabilities of LLMs through novel training paradigms that combine reinforcement learning with structured feedback mechanisms, and develop rigorous benchmarks to evaluate machine intelligence in scientific reasoning tasks. This builds upon my foundational work in causal discovery and inference methods for open-world observational data, including causal discovery algorithms, transfer learning, and the application of these techniques to neuro-behavioral data analysis and medical causal effect inference.

latest posts

Feb 25, 2024	Extend LLMs Context Window
Dec 14, 2023	Introduction to LLMs
Feb 24, 2023	MST-Analysis

selected publications

IEEE TIP
Discriminative Radial Domain Adaptation

Zenan Huang, Jun Wen, Siheng Chen, Linchao Zhu, and Nenggan Zheng

IEEE Transactions on Image Processing, 2023

Domain adaptation methods typically reduce domain shift by learning domain-invariant features. Most existing methods are built on distribution matching, e.g., adversarial domain adaptation, which tends to corrupt feature discriminability. In this paper, we propose Discriminative Radial Domain Adaptation (DRDA), which bridges source and target domains via a shared radial structure. It is motivated by the observation that as the model is trained to be progressively discriminative, features of different categories expand outwards in different directions, forming a radial structure. We show that transferring such an inherently discriminative structure enhances feature transferability and discriminability simultaneously. Specifically, we represent each domain with a global anchor and each category with a local anchor to form a radial structure, and reduce domain shift via structure matching. The matching consists of two parts: an isometric transformation to align the structure globally and a local refinement to match each category. To enhance the discriminability of the structure, we further encourage samples to cluster close to the corresponding local anchors based on optimal-transport assignment. In extensive experiments on multiple benchmarks, our method consistently outperforms state-of-the-art approaches on varied tasks, including typical unsupervised domain adaptation, multi-source domain adaptation, domain-agnostic learning, and domain generalization.
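The radial structure and its global alignment can be illustrated with a small NumPy sketch. This is not the DRDA implementation: it assumes the global anchor is the mean of all features in a domain, each local anchor is a class mean, and the isometric transformation is obtained by orthogonal Procrustes. The function names `radial_structure` and `isometric_alignment` are hypothetical.

```python
import numpy as np


def radial_structure(features, labels, num_classes):
    """Return the global anchor and the per-class spokes of one domain.

    Assumption: anchors are plain means, and every class has at least one sample.
    """
    global_anchor = features.mean(axis=0)
    local_anchors = np.stack(
        [features[labels == c].mean(axis=0) for c in range(num_classes)]
    )
    # A spoke is the direction from the global anchor to a local anchor.
    return global_anchor, local_anchors - global_anchor


def isometric_alignment(source_spokes, target_spokes):
    """Orthogonal matrix R minimising ||source_spokes @ R - target_spokes||_F.

    This is the classical orthogonal Procrustes solution via SVD.
    """
    u, _, vt = np.linalg.svd(source_spokes.T @ target_spokes)
    return u @ vt
```

Given spokes from a source and a target domain, `source_spokes @ isometric_alignment(source_spokes, target_spokes)` rotates the source structure onto the target one; the per-category local refinement mentioned in the abstract is not covered here.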
@article{huangDiscriminativeRadialDomain2023,
  ids = {huangDiscriminativeRadialDomain2023a},
  title = {Discriminative {{Radial Domain Adaptation}}},
  author = {Huang, Zenan and Wen, Jun and Chen, Siheng and Zhu, Linchao and Zheng, Nenggan},
  year = {2023},
  journal = {IEEE Transactions on Image Processing},
  pages = {1--1},
  issn = {1941-0042},
  doi = {10.1109/TIP.2023.3235583},
  copyright = {All rights reserved},
}

ICCV
iDAG: Invariant DAG Searching for Domain Generalization

Zenan Huang, Haobo Wang, Junbo Zhao, and Nenggan Zheng

In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Existing machine learning (ML) models are often fragile in open environments because the data distribution frequently shifts. To address this problem, domain generalization (DG) aims to explore underlying invariant patterns for stable prediction across domains. In this work, we first show that the failure of conventional ML models in DG is attributable to inadequate identification of causal structures. We further propose a novel invariant Directed Acyclic Graph (dubbed iDAG) searching framework that attains an invariant graphical relation as a proxy for the causal structure of the intrinsic data-generating process. To enable tractable computation, iDAG solves a constrained optimization objective built on a set of representative class-conditional prototypes. Additionally, we integrate a hierarchical contrastive learning module, which has a strong clustering effect, to enhance the prototypes and stabilize prediction. Extensive experiments on synthetic and real-world benchmarks demonstrate that iDAG outperforms state-of-the-art approaches, verifying the benefit of causal structure identification for DG. The code of iDAG is available at https://github.com/lccurious/iDAG.
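The abstract does not say which constraint keeps the learned graph acyclic. As a rough illustration of how a DAG search can be posed as constrained optimization, the sketch below uses one common differentiable penalty from the continuous structure-learning literature, h(W) = tr((I + W∘W/d)^d) − d, which is zero exactly when the weighted adjacency matrix W has no directed cycle. Treating this as the constraint is an assumption, and `acyclicity_penalty` is a hypothetical name; see the linked repository for what iDAG actually does.

```python
import numpy as np


def acyclicity_penalty(weights):
    """Polynomial acyclicity penalty for a d x d weighted adjacency matrix.

    Returns 0 for a DAG and a positive value if the graph has a cycle.
    Assumption: this is a generic penalty, not necessarily the one used by iDAG.
    """
    d = weights.shape[0]
    squared = weights * weights  # element-wise square keeps entries non-negative
    powered = np.linalg.matrix_power(np.eye(d) + squared / d, d)
    return float(np.trace(powered) - d)
```

An optimizer would minimise a data-fitting loss over `weights` while driving this penalty towards zero, for example with an augmented Lagrangian.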
@inproceedings{huangIDAGInvariantDAG2023,
  title = {{{iDAG}}: {{Invariant DAG Searching}} for {{Domain Generalization}}},
  shorttitle = {{{iDAG}}},
  booktitle = {Proceedings of the {{IEEE}}/{{CVF International Conference}} on {{Computer Vision}}},
  author = {Huang, Zenan and Wang, Haobo and Zhao, Junbo and Zheng, Nenggan},
  year = {2023},
  pages = {19169--19179},
  urldate = {2023-11-28},
  copyright = {All rights reserved},
  langid = {english},
}

IJCAI
Latent Processes Identification From Multi-View Time Series

Zenan Huang, Haobo Wang, Junbo Zhao, and Nenggan Zheng

In Thirty-Second International Joint Conference on Artificial Intelligence, Aug 2023

Understanding the dynamics of time series data typically requires identifying the unique latent factors for data generation, a.k.a. latent processes identification. Driven by the independence assumption, existing works have made great progress in handling single-view data. However, extending them to multi-view time series data is nontrivial because of two main challenges: (i) complex data structure, such as temporal dependency, can violate the independence assumption; (ii) the factors from different views generally overlap and are hard to aggregate into a complete set. In this work, we propose a novel framework, MuLTI, that employs contrastive learning to invert the data-generating process for enhanced identifiability. Additionally, MuLTI integrates a permutation mechanism that merges corresponding overlapping variables by solving an optimal transport problem. Extensive experimental results on synthetic and real-world datasets demonstrate the superiority of our method in recovering identifiable latent variables on multi-view time series. The code is available at https://github.com/lccurious/MuLTI.
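To give a feel for how optimal transport can pair up overlapping latent variables from two views, here is a minimal NumPy sketch. It assumes both views expose the same number of latent variables, uses 1 − |correlation| as the transport cost, and solves an entropy-regularised problem with Sinkhorn iterations. These choices and the names `sinkhorn_plan` and `match_latents` are illustrative assumptions, not the MuLTI implementation.

```python
import numpy as np


def sinkhorn_plan(cost, eps=0.05, n_iters=200):
    """Entropy-regularised transport plan between two uniform marginals."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    kernel = np.exp(-cost / eps)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]


def match_latents(z_a, z_b):
    """For each latent column of view A, return the best-matching column of view B.

    z_a and z_b have shape (time_steps, num_latents).
    """
    d = z_a.shape[1]
    # Cross-correlation block between the columns of z_a and the columns of z_b.
    corr = np.corrcoef(z_a, z_b, rowvar=False)[:d, d:]
    plan = sinkhorn_plan(1.0 - np.abs(corr))
    return plan.argmax(axis=1)
```

When the two views share their latent variables up to a permutation and small noise, the transport plan is close to a scaled permutation matrix, so the row-wise argmax recovers the correspondence.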
@inproceedings{huangLatentProcessesIdentification2023,
  title = {Latent {{Processes Identification From Multi-View Time Series}}},
  booktitle = {Thirty-{{Second International Joint Conference}} on {{Artificial Intelligence}}},
  author = {Huang, Zenan and Wang, Haobo and Zhao, Junbo and Zheng, Nenggan},
  year = {2023},
  month = aug,
  volume = {4},
  pages = {3848--3856},
  issn = {1045-0823},
  doi = {10.24963/ijcai.2023/428},
  urldate = {2023-08-27},
  copyright = {All rights reserved},
  langid = {english},
}