SemiKong: An Open Source Foundation Model for Semiconductor Manufacturing Process


Semiconductors are essential in powering various electronic devices and driving development across telecommunications, automotive, healthcare, renewable energy, and IoT industries. In semiconductor manufacturing and design, the two main phases, FEOL and BEOL, present unique challenges. LLMs are trained on vast amounts of text data using self-supervised learning techniques that can capture rich domain knowledge.LLMs can also help in tasks like design rule checking, layout generation, and space exploration in Integrated Circuit (IC) design. LLMs allow the generation of new designs that adhere to the specified constraints and optimize for desired performance metrics, learning from large IC layouts and design rule datasets. However, most models are general and do not possess specific knowledge within the semiconductor industry. This reflects unique problems, such as complex physics and chemistry for semiconductor devices and processes.

Currently, LLMs are general-purpose models that, despite their power, need more specialized knowledge for tasks specific to the semiconductor industry. Artificial Intelligence (AI) improved semiconductor manufacturing by improving mask optimization and hotspot detection through machine learning, deep reinforcement learning, and datasets like LithoBench. In the semiconductor industry, domain-specific large language models (LLMs) such as ChipGPT and ChatEDA outperformed general models in tasks like code generation, debugging, and chatbot assistance. LLMs also evaluated natural language generation tasks, using expert feedback to improve benchmarks and address challenges in complex domain-specific evaluations. 

To integrate the power of LLMs in the semiconductor industry, researchers from Aitomatic Inc., FPT Software AI Center, and Tokyo Electron Ltd conducted detailed research and proposed SemiKong, the first industry-specific LLM for the semiconductor domain that provides a foundation for developing customized proprietary models. SemiKong 1.0 focuses on building a foundational model with an expert-level understanding of etching problems. This approach involves training models with comprehensive domain-specific data. The training process was divided into two stages: pretraining and fine-tuning.

There are very few high-quality datasets for the semiconductor domain. To address this, a large-scale text-based dataset focused on semiconductor concepts and etching problems emerged, including pretraining data from technical books, papers, and patents, along with instruction data featuring 50,000 questions. Tools like GPT-4o-mini handled formatting, while GPT-4o generated and answered some questions. The SemiKong model was trained in three steps. First, it was pre-trained using Llama3 checkpoints to learn about the semiconductor industry. Then, it went through supervised fine-tuning to improve its ability to handle tasks like answering questions and reasoning. Finally, the model was fine-tuned with quantization to make it ready for real-world use, gaining deeper knowledge about semiconductor manufacturing along the way. The researchers used 8 NVIDIA A100 80GB GPUs for training for better performance and training speed.

The evaluation of the SemiKong model involved comparing its performance across several criteria, including Clarity and Directness (C&D), Practicality and Immediate Usability (PIU), Efficiency and Brevity (E&B), Logical Flow and Coherence (LFC), Expert-to-Expert Communication (EEC), and Use of Examples and Specificity (UES). Experiments showed that fine-tuning alone did not significantly improve performance, as domain-specific knowledge was crucial. When pretraining was combined with fine-tuning, performance improved. Larger models with 70B parameters outperformed smaller ones, with the SemiKong 70B model excelling in all criteria. 

In summary, the proposed method provided a robust solution for integrating LLM technology with the semiconductor industry and achieved great performance. It performed better than the open-source foundation model. However, SemiKong is in its initial phase, and significant work remains. This work of integrating the latest LLM technology in manufacturing can act as a baseline for future research in the domain of semiconductors and change it forever!


Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter and join our Telegram Channel and LinkedIn Group. If you like our work, you will love our newsletter.. Don’t Forget to join our 55k+ ML SubReddit.

[FREE AI VIRTUAL CONFERENCE] SmallCon: Free Virtual GenAI Conference ft. Meta, Mistral, Salesforce, Harvey AI & more. Join us on Dec 11th for this free virtual event to learn what it takes to build big with small models from AI trailblazers like Meta, Mistral AI, Salesforce, Harvey AI, Upstage, Nubank, Nvidia, Hugging Face, and more.


Divyesh is a consulting intern at Marktechpost. He is pursuing a BTech in Agricultural and Food Engineering from the Indian Institute of Technology, Kharagpur. He is a Data Science and Machine learning enthusiast who wants to integrate these leading technologies into the agricultural domain and solve challenges.



Leave a Comment