Manual Model Implementation

This page outlines the hand-crafted model development flow for BOS NPU deployment. It describes how teams move from model analysis and PyTorch baselines to TTNN implementation. The process emphasizes functional validation at each stage to preserve model correctness. It also highlights when custom operations should be introduced for unsupported behavior. The final stage focuses on performance optimization using profiling and hardware-aware tuning.



| No. | Step | Description |
| --- | --- | --- |
| 1 | Model analysis | Evaluate the model architecture, parameters, and baseline metrics (e.g., FLOPs, dataset characteristics) to establish performance expectations. A solid understanding of the design rationale and techniques behind the architecture is also important. |
| 2 | PyTorch implementation | Develop and validate the original model in PyTorch, confirming correct functionality and establishing reference outputs and behavior. |
| 3 | Torch to TTNN conversion | Convert the PyTorch model to TTNN by mapping parameters, fusing operations (e.g., conv-bn), and applying similar transformations. This step ensures a one-to-one correspondence where applicable. |
| 4 | TTNN implementation | Build the TTNN model by configuring each layer with device- and shape-specific settings, replicating the original architecture within the TTNN framework. |
| 5 | Functional validation | Compare the outputs of the PyTorch and TTNN models by shape and by value (using comparison metrics such as PCC), and run module-wise tests to verify that each module behaves correctly. |
| 6 | Operation implementation | Implement custom TTNN operations for any behavior the model requires that TTNN does not already support. |
| 7 | Performance optimization | Optimize individual modules using the built-in methods offered by TT-Metal or profiling tools, applying techniques such as operation fusion, pruning, and quantization as needed. |
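
The conv-bn fusion mentioned in step 3 can be illustrated with a small, framework-independent sketch. It assumes inference-mode batch normalization with fixed per-channel statistics, and it uses plain Python lists and a 1x1 convolution at a single spatial position to keep the arithmetic visible. The helper names (`fold_bn_into_conv`, `conv1x1`, `batchnorm`) and the example values are hypothetical and are not part of TTNN or TT-Metal; a real conversion would apply the same per-channel scaling to the PyTorch weight tensors before mapping them to TTNN.

```python
import math


def fold_bn_into_conv(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BatchNorm parameters into conv weights and bias.

    weight: one flattened kernel (list of floats) per output channel.
    bias, gamma, beta, mean, var: one float per output channel.
    """
    fused_weight, fused_bias = [], []
    for w_c, b_c, g, bt, m, v in zip(weight, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)            # per-channel BN scale
        fused_weight.append([w * scale for w in w_c])
        fused_bias.append((b_c - m) * scale + bt)
    return fused_weight, fused_bias


def conv1x1(x, weight, bias):
    """1x1 convolution at one spatial position (a dot product per channel)."""
    return [sum(w * xi for w, xi in zip(w_c, x)) + b_c
            for w_c, b_c in zip(weight, bias)]


def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch normalization with fixed statistics."""
    return [g * (yi - m) / math.sqrt(v + eps) + bt
            for yi, g, bt, m, v in zip(y, gamma, beta, mean, var)]


# Hypothetical values: 3 input channels, 2 output channels.
x = [1.0, -2.0, 0.5]
weight = [[0.2, -0.1, 0.4], [0.3, 0.5, -0.6]]
bias = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.05, -0.1]
mean, var = [0.3, -0.4], [0.9, 1.2]

# Reference path: convolution followed by batch normalization.
reference = batchnorm(conv1x1(x, weight, bias), gamma, beta, mean, var)

# Fused path: a single convolution with folded parameters.
fused_weight, fused_bias = fold_bn_into_conv(weight, bias, gamma, beta, mean, var)
fused = conv1x1(x, fused_weight, fused_bias)

max_abs_diff = max(abs(a - b) for a, b in zip(reference, fused))
print("max abs difference:", max_abs_diff)
```

Because the fused layer replaces two operations with one, the two paths should agree up to floating-point rounding, which is the kind of equivalence the functional validation step is meant to confirm.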
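
The value comparison in step 5 can be sketched as follows, assuming PCC refers to the Pearson correlation coefficient and that both model outputs have already been flattened to equal-length sequences of floats. The function names and the 0.99 pass threshold are illustrative assumptions, not values prescribed by this flow; in practice the two sequences would come from the PyTorch reference and the TTNN model.

```python
import math


def pcc(reference, candidate):
    """Pearson correlation coefficient between two flattened outputs."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")
    n = len(reference)
    mean_r = sum(reference) / n
    mean_c = sum(candidate) / n
    cov = sum((r - mean_r) * (c - mean_c) for r, c in zip(reference, candidate))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_c = sum((c - mean_c) ** 2 for c in candidate)
    if var_r == 0.0 or var_c == 0.0:
        # PCC is undefined for constant data; fall back to exact equality.
        return 1.0 if list(reference) == list(candidate) else 0.0
    return cov / math.sqrt(var_r * var_c)


def check_outputs(reference, candidate, threshold=0.99):
    """Return (passed, pcc_value); the threshold is an assumed example value."""
    value = pcc(reference, candidate)
    return value >= threshold, value


# Hypothetical data: a reference output and a slightly perturbed copy,
# standing in for the small numerical drift of a lower-precision device run.
golden = [0.12, -1.5, 3.2, 0.0, 2.25]
device = [0.121, -1.49, 3.19, 0.002, 2.26]
passed, value = check_outputs(golden, device)
print("passed:", passed)

# A sign-flipped output is perfectly anti-correlated and should fail.
negated = [-g for g in golden]
failed_check, negated_value = check_outputs(golden, negated)
print("negated passed:", failed_check)
```

PCC is insensitive to a uniform scale or offset between the two outputs, so it is usually paired with the shape check and, where needed, an absolute or relative error check.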