Big Data & STDF

STDF/CSV/JSON Support and Big Data Analysis Platform

Designed to handle billions to trillions of semiconductor test records. The platform is built on lot-level data structures, a MySQL database sharded 256 ways, a Python-based analysis engine, and parallel processing.

Platform architecture spanning data collection, parsing, the analysis engine, the database, alarm notifications, and client integration
Test data integration between testers, MES, analysis server, and analysis client

Data Structure

LotName
Lot-level management key
Process
Process information
Wafer Number (Wno)
Wafer-level identification
Chip Coordinates (Xadr, Yadr)
Site / chip position
Test Item
Measurement item
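
The keys above can be sketched as a single record type. This is a minimal illustration in Python, not the platform's actual schema: the class name `TestRecord`, the field types, the `value` field, and the sample values are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestRecord:
    """One measurement identified by the keys listed above (hypothetical layout)."""
    lot_name: str    # LotName: lot-level management key
    process: str     # Process: process information
    wno: int         # Wafer Number: wafer-level identification
    xadr: int        # Chip X coordinate
    yadr: int        # Chip Y coordinate
    test_item: str   # Test Item: measurement item
    value: float     # measured value (assumed to be numeric)

    def chip_key(self):
        """Return the tuple that identifies one chip within a lot."""
        return (self.lot_name, self.wno, self.xadr, self.yadr)


# Made-up sample record for illustration only.
rec = TestRecord("LOT001", "WAT", 3, 12, 7, "Vth", 0.452)
print(rec.chip_key())  # ('LOT001', 3, 12, 7)
```

Grouping records by `chip_key()` would collect all test items measured on the same chip.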

Technical Specifications

Database
MySQL with 256-sharding configuration (lot-level distribution)
Data Formats
STDF, CSV, JSON, and other customized formats
Processing Engine
Python-based analysis engine with parallel processing support
Client
Windows GUI application

Performance Indicators (Reference Values)

Single Lot Analysis
A few seconds to tens of seconds for several million to tens of millions of data points
Correlation Analysis
Within a few minutes for 1,000 items
Parallel Processing
Multi-threading support

* Actual performance depends on data volume and environment.

Scalability

Key Points for Large-Scale Data Support

Sharding

Accelerates access to large volumes of data through lot-level distribution.
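
Lot-level distribution across 256 shards could look like the sketch below. The hash function (CRC32), the function names, and the `testdata_NNN` database naming scheme are assumptions for illustration; the platform's actual routing rule is not described here.

```python
import zlib

NUM_SHARDS = 256  # matches the 256-sharding configuration


def shard_for_lot(lot_name: str) -> int:
    """Map a lot name to a shard index in the range 0..255.

    CRC32 is used instead of the built-in hash() so that the mapping
    stays the same across processes and restarts (assumed hash choice).
    """
    return zlib.crc32(lot_name.encode("utf-8")) % NUM_SHARDS


def shard_database(lot_name: str) -> str:
    """Hypothetical naming scheme: testdata_000 through testdata_255."""
    return f"testdata_{shard_for_lot(lot_name):03d}"


# All records of one lot resolve to the same shard.
print(shard_database("LOT001") == shard_database("LOT001"))  # True
```

Because the shard is derived from the lot name alone, a single-lot analysis only needs to query one shard.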

Parallel Processing

Supports multi-threaded processing in the analysis engine.
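
As a rough sketch of multi-threaded analysis, the example below summarizes each test item of a lot in a thread pool using only the Python standard library. The function names, the statistics chosen, and the sample values are assumptions, not the engine's actual API.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, pstdev


def summarize_item(values):
    """Compute basic statistics for one test item."""
    return {"n": len(values), "mean": mean(values), "std": pstdev(values)}


def summarize_lot(items, max_workers=4):
    """Summarize every test item of a lot using a thread pool.

    `items` maps a test item name to its list of measured values.
    """
    names = list(items)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = pool.map(summarize_item, (items[n] for n in names))
        return dict(zip(names, results))


# Made-up measurements for illustration only.
lot = {"Vth": [0.45, 0.47, 0.46], "Idsat": [1.0, 1.2, 1.1]}
summary = summarize_lot(lot)
print(summary["Vth"]["n"])  # 3
```

In CPython, threads mainly help when the work is I/O-bound (for example, reading from the database) or when numeric libraries release the global interpreter lock during computation.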

NVMe I/O Optimization

Designed for reading and writing large volumes of test data.

Variable Parameters

Absorbs lot-by-lot parameter differences using JSON measurement data.
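
One way to picture this: fixed key columns identify the chip, while the measurements are kept as a JSON document whose keys may differ between lots. The row layout, field names, and values below are assumptions for illustration.

```python
import json

# Two lots whose measurement parameter sets differ (made-up values).
row_a = {"lot_name": "LOT001", "wno": 1, "xadr": 5, "yadr": 9,
         "measurements": json.dumps({"Vth": 0.45, "Idsat": 1.1})}
row_b = {"lot_name": "LOT002", "wno": 1, "xadr": 5, "yadr": 9,
         "measurements": json.dumps({"Vth": 0.47, "Ron": 12.5, "Leak": 3e-9})}


def get_measurement(row, item, default=None):
    """Return one test item's value, or `default` if the lot lacks that item."""
    return json.loads(row["measurements"]).get(item, default)


print(get_measurement(row_a, "Idsat"))  # 1.1
print(get_measurement(row_b, "Idsat"))  # None
```

With this layout, adding a new test item to a lot does not require changing the fixed columns.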