LuFan Cao

Tsinghua University · Qiuzhen College

LuFan Cao 曹露凡

Undergraduate Student in Mathematics and Applied Mathematics

I am an undergraduate student at Tsinghua University. My interests lie in embodied intelligence, reinforcement learning, mathematical foundations of machine learning, and large language models.

About

Mathematics · Machine Learning · Reinforcement Learning

I am currently pursuing a B.S. degree in Mathematics and Applied Mathematics at Tsinghua University, with expected graduation in June 2027.

My academic background combines rigorous mathematical training with projects in machine learning, deep reinforcement learning, generative models, and scientific computing.

Institution: Tsinghua University
College: Qiuzhen College
Expected Graduation: June 2027

Research Interests

EI

Embodied Intelligence

Embodied agents, robotic decision-making, interactive environments, and general-purpose intelligent systems.

RL

Reinforcement Learning

Exploration, policy optimization, self-play, offline-to-online reinforcement learning, and sequential decision making.

LLM

Large Language Models

Reasoning, planning, world models, multimodal learning, and LLM-based agents.

Selected Projects

Research and engineering experience

RL / Exploration

L1-Coverage Exploration

Reproduced and extended CODEX.W and PSDP for tightrope grid-world exploration, studying reward design and coverage behavior.

  • Implemented tabular and neural-network-based exploration policies.
  • Compared final-step reward with reward-at-every-step variants.
  • Studied state-based versus transition-based reward modeling.
CODEX.W · PSDP · GridWorld · Exploration

Game AI

Deep Reinforcement Learning for Hex Grid Game

Built a Gymnasium-compatible Hex environment with configurable reward structures for two-agent self-play training.

  • Implemented PPO training with PyTorch DDP on 4 A800 GPUs.
  • Developed Actor-Critic and MCTS-based training pipelines.
  • Designed a ResNet-style policy network with residual connections.
  • Used GAE and action masking to improve training stability.
PPO · MCTS · ResNet Policy · Self-Play
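
For readers unfamiliar with the two stabilization techniques named above, the following is a minimal, framework-free sketch of generalized advantage estimation (GAE) and action masking. It is an illustration under stated assumptions, not the project's actual code: the function names `compute_gae` and `masked_softmax`, the default `gamma` and `lam` values, and the single-agent view of a trajectory are all hypothetical choices made for this example.

```python
import math


def compute_gae(rewards, values, dones, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one trajectory.

    Assumption: `values` holds len(rewards) + 1 entries, the last one being
    the bootstrap value of the state reached after the final step.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    # Walk backwards so each step can reuse the accumulated future advantage.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error; the bootstrap term is dropped at episode end.
        delta = rewards[t] + gamma * values[t + 1] * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
    return advantages


def masked_softmax(logits, legal):
    """Softmax that assigns zero probability to illegal actions.

    Assumption: `legal` is a list of booleans, one per action, e.g. True
    for every empty cell on a Hex board.
    """
    if not any(legal):
        raise ValueError("at least one action must be legal")
    # Illegal actions get a logit of -inf, so exp() maps them to exactly 0.
    masked = [x if ok else -math.inf for x, ok in zip(logits, legal)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical usage: a three-step episode that ends with a win (+1).
advantages = compute_gae(
    rewards=[0.0, 0.0, 1.0],
    values=[0.0, 0.0, 0.0, 0.0],
    dones=[False, False, True],
    gamma=1.0,
    lam=1.0,
)
probs = masked_softmax([1.0, 2.0, 3.0], legal=[True, False, True])
```

Masking at the logit level keeps the policy a valid distribution over legal moves only, so no probability mass or gradient signal is spent on cells that are already occupied. In a two-agent self-play setting the sign convention for rewards and values across alternating turns needs separate care, which this sketch does not address.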

Education

Tsinghua University

2023 – Present

B.S. in Mathematics and Applied Mathematics, Qiuzhen College. Expected graduation: June 2027.

Contact