Siyuan Li (黎思源)

🤖 Artificial Intelligence Student

📞 Tel: (+86) 13815397394

✉️ Email: 1793706453@qq.com & lyq@shu.edu.cn

💬 WeChat: abcdefghi314159

I am Siyuan Li, an incoming Ph.D. student in Computer Science at the University of Georgia. I earned my Bachelor's degree in Artificial Intelligence from Shanghai University, graduating first in my major. I prefer to explore, learn, and build things from the ground up rather than simply use solutions made by others. For instance, I built this webpage myself (with an Easter egg 🐣), and it has received stars and forks on GitHub. As an intern at the AI for Scientific Simulation and Discovery Lab at Westlake University, I led an exploratory project on AI-driven physical law discovery under the mentorship of Prof. Tailin Wu. At Shanghai University, I participated in research projects under Prof. Hang Yu and in the Brain-like Computing Center Lab led by Prof. Huiran Zhang, and wrote a paper as first author. All of my projects can be viewed on my GitHub.

My research interests center on innovative modeling across several domains, including multimodal learning and Natural Language Processing (NLP). I am also strongly interested in applying AI to science, particularly in developing AI models tailored to specific scientific questions. In my view, scientific data can be treated as a modality in its own right, to which methods from multimodal learning, Computer Vision (CV), and NLP can be adapted. In short, my fundamental research interest is: Innovating and Applying Artificial Intelligence Models (Designing Algorithms).

🎓 Education

University of Georgia (UGA)

Ph.D. Student in Computer Science (School of Computing)

08/2025 (deferred to 01/2026) - 06/2030

Shanghai University (SHU)

B.E. in Artificial Intelligence (School of Computer Engineering and Science)

09/2021 - 06/2025

GPA: 3.81/4.0 (93.60/100)

Ranking: 1/52 (1/31 in class)

Key courses: Calculus (94), Linear Algebra (100), Object-Oriented Programming (94), Probability and Statistics (95), Data Structures (97), Pattern Recognition (90), Computer Vision (91), Operations and Optimization (88), Data Mining and Knowledge Processing (94), Mathematical Logic (95), Principles and Techniques of LLMs (95), Principles and Algorithms of AI (93)

🔬 Research & Internship

  • I am proud to have reported a bug to vLLM (issue #11978); my fix was merged into the main branch (PR #11979), which makes me a small contributor to vLLM. 😎
  • Worked on migrating vLLM to the Ascend NPU platform (vllm-ascend), responsible for unit testing and for adapting some operators.
  • Worked on adapting speculative decoding to vllm-ascend.
  • Proposed a novel ERP composite formula for analyzing human preferences.
  • Achieved effective classification of preferences using AI methods combined with the developed formula.
  • Authored a paper as the first author, available on arXiv.
  • Developed a transformer-based model and wrote the code to run experiments on symbolic regression tasks.
  • Extended symbolic regression from mathematical expressions to the video domain by building a multimodal model.
  • Explored the discovery of physical system patterns from videos to empower scientific discovery tasks.
  • Proposed a novel encoder-decoder video frame interpolation model leveraging PVT v2 as the encoder and a UNet-like decoder with deconvolution and residual concatenation.
  • Achieved an SSIM of 0.9879 on the Vimeo90K dataset, surpassing state-of-the-art methods.

💻 Projects (Selected by Learning Path)

Fine-Tuning of Multimodal Medical Large Models Integrating the RAG Mechanism

Graduation Project

05/2025

  • Designed and implemented a medical content generation system combining Retrieval-Augmented Generation (RAG) and multimodal large language model (MLLM) fine-tuning.
  • Developed a multimodal RAG framework supporting joint image-text input, featuring multiple retrieval paradigms such as joint embedding, label-guided retrieval, and image-text pair binding.
  • Fine-tuned the Qwen2.5-VL model in two stages using Chinese medical QA and image-text datasets, yielding the Qwen2.5-VL-Med model with domain-specific reasoning capabilities.
  • Built a modular web-based interactive system supporting local/cloud API deployment, multimodal input, streaming response, and history tracking.

Cross-Modal Pretrained Model Alignment

Project website

07/2024

  • Proposed and implemented a method to quickly align pre-trained models from different modalities.
  • Designed a twin neural network similarity module to align pretrained models with varying embedding dimensions.
  • Achieved rapid model alignment between text and image modalities with minimal training on a standard image classification dataset, rather than requiring a large "image-description" dataset typical for models like CLIP.
  • Experimentally demonstrated the project's ability to align quickly with minimal GPU requirements and satisfactory performance.

Reproduction of and Experiments with the TextCNN Paper

Project website

03/2023

  • Reproduced and experimented with the TextCNN model.
  • Tokenized and encoded the sentences, then padded or truncated them to a fixed length.
  • Implemented word embedding, feature extraction with multiple convolutional kernels of varying sizes, pooling, and final classification through fully connected layers.

Force Video Classification Based on CNN-LSTM

Project website

02/2023

  • Developed a network model based on CNN for video frame feature extraction and LSTM for sequential frame feature computation.
  • Compared the classification performance of KNN and ANN after freezing the feature extraction model parameters.
  • Achieved 92% accuracy on a public dataset, comparable to results from another study using a non-public dataset.

Handwriting Recognition System Based on Siamese Neural Networks

Project website

11/2022

  • Independently designed and coded a system utilizing VGG16 for signature feature extraction.
  • Achieved 100% accuracy on the CEDAR dataset using Siamese neural networks for classification.
  • Built the frontend and backend so that the trained model can be used through a web interface.

🏆 Awards

📚 Papers

The Study of Human Preference Based on Integrated Analysis of N1 and LPP Components.

Paper link. arXiv:2505.19879.

  • Authors: Siyuan Li, Xiangze Meng, Yijian Yang, Yiwen Xu, Yunfei Wang, Chenghu Qiu, Hanyi Jiang, Pin Wu, Shengbo Chen, Xiao Wei, Hao Wang, Lan Ni, Huiran Zhang.

Research advanced in offline handwritten signature verification.

Paper link. Applied and Computational Engineering, 6(1), 1244-1252. DOI: 10.54254/2755-2721/6/20230653.

  • Co-authors: Yuhang Guo, Siyuan Li (Co-first author), Jinxuan Wu

🛠️ Skills

  • Programming languages: Python (Advanced), C++ (Proficient), HTML (Proficient), MATLAB (Familiar), CSS (Familiar), JavaScript (Familiar)
  • Tools: Git, Microsoft Word, LaTeX (Overleaf), Markdown, VS Code Remote - SSH
  • AI-related skills: PyTorch (Advanced), Transformers (Proficient), LLM (General), vLLM (General)

🤝 Extracurricular & Volunteer Activities

New Media Center, School of Computer Engineering and Science

Chairman

01/2022 - 01/2023

  • Managed content publication on the School's official WeChat account and coordinated daily tasks.
  • Organized and managed recruitment presentations, student representative meetings, and other related affairs.

ByteDance

Campus Ambassador

03/2022 - 06/2022

  • Assisted in promoting spring recruitment and summer internships.
  • Distributed on-campus promotion materials and internal referral codes.

Volunteer time: 100h+ ⏳