Li Miqi - CV

AS Watson Group, DataLab - AI Lab

Hong Kong

Full-Time | LLM Developer

2025.8 - Present

Designed and implemented an agentic AI framework and a suite of production-grade applications for commercial analysis and daily business workflows, leveraging Watson enterprise data and the LangGraph / Langfuse toolchains.
Agentic AI framework
- Built a reusable, production-ready code template supporting multi-agent orchestration, dynamic planning, and runtime tracing.
Category management system (agent-driven analytics application)
- Architected and built a multi-agent application that ingests and reasons over real enterprise data to deliver KPI monitoring, root-cause analysis, and actionable recommendations.
- Capabilities demonstrated:
- Multi-agent workflow:
  - A supervisor agent that decomposes user questions into sub-tasks, assigns them to specialist agents, stitches intermediate outputs, and iteratively updates the plan until a coherent, evidence-backed conclusion is produced.
  - Visualization tools that prepare charts (e.g., sales tree plots, trend dashboards) and integrate them into the front-end for interactive exploration.
- Outcome and production readiness:
  - Delivered automated investigative narratives and dashboards that reduced manual analyst time on high-priority incidents and accelerated root-cause detection.
  - Deployed with monitoring and retraining hooks (via Langfuse telemetry and LangGraph orchestration) to continuously improve prompt templates, agent policies, and model selection.

Huawei Hong Kong Research Center, Design Automation Lab

Hong Kong

Full-Time | R&D Engineer

2024.6 - 2025.8

Developed and deployed AI-driven automation and data science solutions for chip design and manufacturing, focusing on scalable technologies for schematic review, yield improvement and process optimization.
Retrieval-Augmented Generation (RAG) System for Schematic Review
- Designed and implemented a multi-modal data processing pipeline for RAG, supporting various document formats (PDF, DOCX, PPTX, PNG, JPG).
- Integrated vision LLMs for image-based information extraction alongside text extraction systems.
- Built a vector database storing both dense and sparse embeddings, enabling combined semantic and keyword-based search; applied the reciprocal rank fusion (RRF) algorithm and a reranking model to refine the merged results and surface the most relevant ones.
- Developed a Gradio-based frontend, allowing users to query PCB chip-related issues with high accuracy. The system now serves enterprise customers and meets key requirements.
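For illustration only, the fusion step mentioned above can be sketched as follows. This is a minimal sketch of generic reciprocal rank fusion, not the production implementation: the function name, the document IDs, and the constant k=60 (a commonly used default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a dense (semantic) and a sparse (keyword) search.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Documents that rank well in both lists rise to the top; a reranking model would then rescore only the first few fused results.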
Automated Code Generation for Schematic Reviews
- Developed AI-powered solutions to automate schematic review processes, significantly reducing manual scripting efforts. Established a reliable code generation framework, optimizing both accuracy and response time through extensive architecture testing.
- Designed a workflow where user prompts specify only the core logic of the code, which is then expanded by an LLM-based code writer using pseudo-APIs. A secondary LLM replaces pseudo-APIs with real, validated APIs.
- Implemented a recursive process to iteratively search the database for missing APIs and integrate them into the final code.
- Validated the generated code on real-world schematic files, achieving results consistent with manually written scripts.
Deep-semi-sight LLM Project
- Developed an end-to-end framework for fine-tuning, deploying, and evaluating LLMs, leveraging vLLM, vllm-ascend, and LLaMA-Factory. Supported multiple training methods (full-parameter fine-tuning, LoRA, QLoRA, DPO) and concurrent deployment on non-NVIDIA platforms (e.g., Huawei Ascend hardware).
- Built an evaluation pipeline for both open-source general datasets and custom domain-specific data. Successfully fine-tuned Qwen3-32B and Qwen2.5-32B-Instruct, achieving an average improvement of 9.64% and a maximum of 12.69% in general semiconductor capability while maintaining overall model performance.
- Delivered robust, customer-facing LLM solutions and comprehensive evaluation results for enterprise deployment.
Yield Loss Data Analysis
- Designed two high-efficiency statistical algorithms for single-step yield loss identification, reducing users’ analysis time from 0.5 days to 5 minutes in real-world cases, and significantly lowering the probability of misjudgment caused by excessive highlighting compared to previous methods.
- Validated the algorithms against expert results in a two-week production test, leading to seamless integration into the latest product release.
- Developed a multi-step yield loss identification algorithm and graphical interface, completing two rounds of feature iteration. This automation replaced manual experience-based analysis, cutting root-cause identification time from weeks to hours.
- Demonstrated expert-level accuracy and efficiency in a three-month real-world validation, substantially reducing manual investigation time. Delivered a user-friendly executable application, deployed for customer grey testing and now fully adopted in production environments.

BASF East Asia Regional Headquarters, Global Digitalization Unit

Hong Kong

Intern | Data and AI Engineer

2024.1 - 2024.6

Contributed to BASF's AI and digitalization initiatives through two key projects: NewsBot Marketing Intelligence and Abnormal Result Inference.
NewsBot Marketing Intelligence Project:
- Built an AI assistant to gather and analyze daily competitor and customer news from online sources, tailored to BASF’s industry-specific terminology and internal knowledge base.
- Designed an end-to-end news collection pipeline using Azure Databricks, enabling automatic labeling of topics, industries, companies, and locations during preprocessing.
- Integrated Azure Cognitive Search to enable efficient retrieval of relevant news using filters, and applied the LangChain framework for prompt engineering to answer user queries in the marketing intelligence domain.
- Developed prompts to teach the model to summarize internal knowledge and provide accurate, context-aware responses.
Abnormal Result Inference Project:
- Developed a solution to identify and diagnose root causes for product batches failing to meet quality standards, leveraging BASF’s extensive sensor network.
- Designed a data integration pipeline in Azure Databricks to consolidate sensor data from multiple sources.
- Conducted statistical analyses and correlation comparisons between normal and abnormal production days to identify critical sensor anomalies.
- Built a linear regression model to pinpoint sensors most associated with production issues and deployed insights via an interactive Power BI dashboard, enabling actionable decision-making.

Alibaba Group, TaoBao and TMall Group, Alimama

Beijing

Intern | Algorithm Engineer

2023.5 - 2023.9

Contributed to the development of Alibaba's LLM technologies through two key projects: Natural Language to Crowd Package Generation and participation in the AI Hackathon Competition.
Natural Language to Crowd Package Generation Project:
- Automated the creation of crowd packages (user groups defined by set operations such as intersection, union, and difference on multiple user labels) using ChatGPT and the LangChain framework.
- Designed and implemented prompts for tasks such as user intent recognition, crowd package naming, and generating recommendation rationales.
- For complex tasks like label selection and operational logic determination, developed a hybrid approach that combined LLMs with traditional machine learning algorithms.
- The project successfully passed internal evaluations at Alibaba and entered the grey testing phase, demonstrating its value in automating previously manual processes.
AI Hackathon Competition:
- Leveraged large language models to simulate user feedback on brand advertising, addressing the challenge of evaluating brand advertising effectiveness.
- Designed a memory-and-decision-making framework that analyzes user shopping histories, summarizes multidimensional features, and retrieves contextually relevant information based on timeliness, relevance, and importance; integrated chain-of-thought reasoning and consumer behavior simulations to mimic real-world decision-making.
- Secured 4th place out of 28 teams in the competition, showcasing innovative use of LLMs for business and advertising optimization.