A quantization method based on hardware of in-memory computing and a system thereof. The quantization method includes a quantization parameter providing step, a parameter splitting step, a multiply-accumulate step, a convolution quantization step and a convolution merging step. The quantization parameter providing step is performed to provide a quantization parameter, and the quantization parameter includes a quantized input activation, a quantized weight and a splitting value. The parameter splitting step is performed to split the quantized weight and the quantized input activation into a plurality of grouped quantized weights and a plurality of grouped activations, respectively, according to the splitting value. The multiply-accumulate step is performed to execute a multiply-accumulate operation with one of the grouped quantized weights and one of the grouped activations, and then generate a convolution output. The convolution quantization step is performed to quantize the convolution output into a quantized convolution output according to a convolution target bit. The convolution merging step is performed to execute a partial-sum operation with the quantized convolution output according to the splitting value, and then generate an output activation. Therefore, the quantization method of the present disclosure considers the hardware limitations of nonvolatile in-memory computing (nvIMC) to implement compact convolutional neural networks (CNNs). The nvIMC is simulated for parallel computation of multilevel matrix-vector multiplications (MVMs) under the constraints of an analog-to-digital converter (ADC). A concrete-distribution-based quantization method is introduced to mitigate the small-read-margin problem caused by variations in nvIMC, so as to obtain better updated weights.
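To make the flow of the five steps concrete, the following minimal NumPy sketch emulates one grouped matrix-vector multiplication under the described scheme. It is an illustration only: the function names, tensor shapes, bit widths, and the uniform quantizer (standing in for the disclosure's concrete-distribution-based method and the ADC model) are all assumptions, not the claimed implementation.

```python
import numpy as np

def uniform_quantize(x, bits):
    """Hypothetical uniform quantizer; a stand-in for the disclosure's
    concrete-distribution-based quantization."""
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return x
    levels = 2 ** (bits - 1) - 1
    scale = max_abs / levels
    return np.round(x / scale) * scale

def quantized_conv(q_weight, q_activation, splitting_value, conv_target_bit):
    """Sketch of the steps after the quantization parameter providing step:
    q_weight and q_activation are assumed to be the already-quantized
    weight and input activation."""
    # Parameter splitting step: split along the input dimension according
    # to the splitting value (e.g. the number of word lines an nvIMC macro
    # can activate in parallel).
    w_groups = np.array_split(q_weight, splitting_value, axis=0)
    a_groups = np.array_split(q_activation, splitting_value, axis=0)

    output_activation = 0.0
    for w_g, a_g in zip(w_groups, a_groups):
        # Multiply-accumulate step: grouped MVM, one group per simulated
        # nvIMC array.
        conv_output = a_g @ w_g
        # Convolution quantization step: requantize the group output to the
        # convolution target bit, mimicking the ADC's limited resolution.
        q_conv_output = uniform_quantize(conv_output, conv_target_bit)
        # Convolution merging step: partial-sum the quantized group outputs.
        output_activation = output_activation + q_conv_output
    return output_activation

# Usage with illustrative shapes and bit widths (all values hypothetical).
rng = np.random.default_rng(0)
q_w = uniform_quantize(rng.standard_normal((64, 16)), bits=4)  # quantized weight
q_a = uniform_quantize(rng.standard_normal(64), bits=4)        # quantized input activation
out = quantized_conv(q_w, q_a, splitting_value=8, conv_target_bit=8)
print(out.shape)  # (16,)
```

Requantizing each group's output before the partial-sum reflects the point in the flow where the ADC sits in real nvIMC hardware; merging unquantized group outputs would hide the read-margin effect the method is designed to address.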