Li Miqi
Education
Beijing University of Posts and Telecommunications (BUPT)
2018.9 - 2022.6
• School of Computer Science | Bachelor of Engineering | Major in Data Science and Big Data Technology
The University of Hong Kong (HKU)
2022.9 - 2024.1
• Faculty of Science, Department of Statistics and Actuarial Science | Master of Science | Major in Data Science

Work Experience
AS Watson Group, DataLab
Hong Kong
Full-Time | LLM Developer
2025.8 - Present
Huawei Hong Kong Research Center, Design Automation Lab
Hong Kong
Full-Time | R&D Engineer
2024.6 - 2025.8
  • Developed and deployed AI-driven automation and data science solutions for chip design and manufacturing, focusing on scalable technologies for schematic review, yield improvement and process optimization.
  • Retrieval-Augmented Generation (RAG) System for Schematic Review
    • Designed and implemented a multi-modal data processing pipeline for RAG, supporting various document formats (PDF, DOCX, PPTX, PNG, JPG).
    • Integrated vision-language models (VLMs) for image-based information extraction alongside text extraction systems.
    • Built a vector database storing both dense and sparse embeddings to support semantic and keyword-based search; applied reciprocal rank fusion (RRF) and a reranking model to refine and present the most relevant results.
    • Developed a Gradio-based frontend, allowing users to query PCB chip-related issues with high accuracy. The system now serves enterprise customers and meets key requirements.
  • Automated Code Generation for Schematic Reviews
    • Developed AI-powered solutions to automate schematic review processes, significantly reducing manual scripting effort. Established a reliable code generation framework, optimizing both accuracy and response time through extensive architecture testing.
    • Designed a workflow where user prompts specify only the core logic of the code, which is then expanded by an LLM-based code writer using pseudo-APIs. A secondary LLM replaces pseudo-APIs with real, validated APIs.
    • Implemented a recursive process to iteratively search the database for missing APIs and integrate them into the final code.
    • Validated the generated code on real-world schematic files, achieving results consistent with manually written scripts.
  • Deep-semi-sight LLM Project
    • Developed an end-to-end framework for fine-tuning, deploying, and evaluating LLMs, leveraging vLLM, vllm-ascend, and LLaMA-Factory. Supported advanced fine-tuning techniques (Full, LoRA, QLoRA, DPO) and concurrent deployment on non-NVIDIA platforms (e.g., Huawei Ascend hardware).
    • Built an evaluation pipeline for both open-source general datasets and custom domain-specific data. Successfully fine-tuned Qwen3-32B and Qwen2.5-32B-Instruct, achieving an average improvement of 9.64% and a maximum of 12.69% in general semiconductor capability while maintaining overall model performance.
    • Delivered robust, customer-facing LLM solutions and comprehensive evaluation results for enterprise deployment.
  • Yield Loss Data Analysis
    • Designed two high-efficiency statistical algorithms for single-step yield loss identification, reducing users’ analysis time from half a day to 5 minutes in real-world cases and, compared to previous methods, significantly lowering the probability of misjudgment caused by excessive highlighting.
    • Validated the algorithms against expert results in a two-week production test, leading to seamless integration into the latest product release.
    • Developed a multi-step yield loss identification algorithm and graphical interface, completing two rounds of feature iteration. This automation replaced manual, experience-based analysis, cutting root-cause identification time from weeks to hours.
    • Demonstrated expert-level accuracy and efficiency in a three-month real-world validation, substantially reducing manual investigation time. Delivered a user-friendly executable application, deployed for customer grey testing and now fully adopted in production environments.
BASF East Asia Regional Headquarters, Global Digitalization Unit
Hong Kong
Intern | Data and AI Engineer
2024.1 - 2024.6
  • Contributed to BASF's AI and digitalization initiatives through two key projects: NewsBot Marketing Intelligence and Abnormal Result Inference.
  • NewsBot Marketing Intelligence Project
    • Built an AI assistant to gather and analyze daily competitor and customer news from online sources, tailored to BASF’s industry-specific terminology and internal knowledge base.
    • Designed an end-to-end news collection pipeline using Azure Databricks, enabling automatic labeling of topics, industries, companies, and locations during preprocessing.
    • Integrated Azure Cognitive Search to enable efficient retrieval of relevant news using filters, and applied the LangChain framework for prompt engineering to answer user queries in the marketing intelligence domain.
    • Developed prompts to teach the model to summarize internal knowledge and provide accurate, context-aware responses.
  • Abnormal Result Inference Project
    • Developed a solution to identify and diagnose root causes for product batches failing to meet quality standards, leveraging BASF’s extensive sensor network.
    • Designed a data integration pipeline in Azure Databricks to consolidate sensor data from multiple sources.
    • Conducted statistical analyses and correlation comparisons between normal and abnormal production days to identify critical sensor anomalies.
    • Built a linear regression model to pinpoint sensors most associated with production issues and deployed insights via an interactive Power BI dashboard, enabling actionable decision-making.
Alibaba Group, Taobao and Tmall Group, Alimama
Beijing
Intern | Algorithm Engineer
2023.5 - 2023.9
  • Contributed to the development of Alibaba's LLM technologies through two key projects: Natural Language to Crowd Package Generation and the AI Hackathon Competition.
  • Natural Language to Crowd Package Generation Project
    • Automated the creation of crowd packages (user groups defined by set operations such as intersection, union, and difference on multiple user labels) using ChatGPT and the LangChain framework.
    • Designed and implemented prompt engineering for tasks such as user intent recognition, crowd package naming, and generating recommendation rationales, fully utilizing LLM capabilities.
    • For complex tasks like label selection and operational logic determination, developed a hybrid approach that combined LLMs with traditional machine learning algorithms.
    • The project successfully passed internal evaluations at Alibaba and entered the grey testing phase, demonstrating its value in automating previously manual processes.
  • AI Hackathon Competition
    • Leveraged large language models to simulate user feedback on brand advertising, addressing the challenge of evaluating brand advertising effectiveness.
    • Designed a memory-and-decision-making framework that analyzes user shopping histories, summarizes multidimensional features, and retrieves contextually relevant information based on timeliness, relevance, and importance; integrated chain-of-thought reasoning and consumer behavior simulations to mimic real-world decision-making.
    • Secured 4th place out of 28 teams in the competition, showcasing innovative use of LLMs for business and advertising optimization.

Ability