A block-based inference method for memory-efficient convolutional neural network (CNN) implementation is proposed. The block-based inference method includes a parameter setting step, a dividing step, a block-based inference step and a temporary storing step. The parameter setting step includes setting an inference parameter group, which includes a depth, a block width, a block height and a kernel size. The dividing step includes driving a processing unit to divide an input image into a plurality of input block data according to the depth, the block width and the block height. Each of the input block data has an input block size.

The block-based inference step includes driving the processing unit to perform a multi-layer convolution operation on each of the input block data to generate an output block data. The multi-layer convolution operation includes a first direction data selecting step, a second direction data selecting step and a convolution operation step. The first direction data selecting step includes selecting a plurality of ith layer recomputing features according to a position of the output block data along a first direction, and then selecting an ith layer recomputing input feature block data according to the position of the output block data and the ith layer recomputing features, where i is a positive integer ranging from 1 to the depth. The second direction data selecting step includes selecting a plurality of ith layer reusing features along a second direction according to the ith layer recomputing input feature block data, and then combining the ith layer recomputing input feature block data with the ith layer reusing features to generate an ith layer reusing input feature block data. The convolution operation step includes selecting a plurality of ith layer sub-block input feature groups from the ith layer reusing input feature block data according to an ith layer kernel size, performing a convolution operation on each of the ith layer sub-block input feature groups to generate an ith layer sub-block output feature, and then combining the ith layer sub-block output features corresponding to the ith layer sub-block input feature groups to form an ith layer output feature block data.

The temporary storing step includes driving a block buffer bank to store the ith layer output feature block data and the ith layer reusing features.

Therefore, the present disclosure reuses the features along the block scanning direction to reduce recomputing overhead and recomputes the features between different scan lines to eliminate the global line buffer, so that the inference flow can provide great flexibility and a good tradeoff between computing and memory overheads for high-performance, memory-efficient CNN inference.
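As an illustration of the recompute-and-reuse flow described above, the following minimal Python sketch processes a single-channel image through a stack of valid convolutions block by block. It is a simplification, not the disclosure's exact implementation: it assumes one 2-D kernel per layer, stride 1, and output dimensions exactly divisible by the block size, and the names (block_inference, conv2d_valid) are illustrative. Along the first direction (block rows), the vertical halo rows are refetched from the image and recomputed for every scan line, so no global line buffer is kept; along the second direction (within a scan line), each layer's rightmost kernel-width-minus-one input columns are held in a small per-layer buffer bank and prepended to the next block, which corresponds to the reusing features.

```python
import numpy as np

def conv2d_valid(x, w):
    # Plain single-channel "valid" convolution (no padding, stride 1).
    k = w.shape[0]
    H, W = x.shape
    out = np.empty((H - k + 1, W - k + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = (x[r:r + k, c:c + k] * w).sum()
    return out

def block_inference(image, kernels, Bh, Bw):
    # Total halo of the layer stack: each k x k layer consumes k - 1 pixels.
    T = sum(w.shape[0] - 1 for w in kernels)
    H_out, W_out = image.shape[0] - T, image.shape[1] - T
    assert H_out % Bh == 0 and W_out % Bw == 0, "sketch assumes exact tiling"
    out = np.empty((H_out, W_out))
    for br in range(H_out // Bh):          # first direction: block rows
        # Recompute between scan lines: the per-layer buffers are reset, so
        # the vertical halo is refetched from the image for every line and
        # no global line buffer is needed.
        buffers = [None] * len(kernels)    # per-layer block buffer bank
        row0 = br * Bh
        for bc in range(W_out // Bw):      # second direction: scan line
            if bc == 0:
                # First block of the line carries the full horizontal halo.
                feat = image[row0:row0 + Bh + T, 0:Bw + T]
            else:
                # Later blocks fetch only Bw new image columns.
                col0 = bc * Bw + T
                feat = image[row0:row0 + Bh + T, col0:col0 + Bw]
            for i, w in enumerate(kernels):
                k = w.shape[0]
                if bc > 0:
                    # Reuse along the scan line: prepend the (k - 1) columns
                    # buffered from the previous block at this layer.
                    feat = np.concatenate([buffers[i], feat], axis=1)
                # Keep the rightmost (k - 1) input columns for the next block.
                buffers[i] = feat[:, feat.shape[1] - (k - 1):].copy()
                feat = conv2d_valid(feat, w)   # i-th layer output block
            out[row0:row0 + Bh, bc * Bw:(bc + 1) * Bw] = feat
    return out
```

The sketch can be checked against a monolithic reference that convolves the whole image at once; the blocked result matches it exactly, since the reused columns supply the same halo that the monolithic pass sees:

```python
rng = np.random.default_rng(0)
kernels = [rng.standard_normal((3, 3)) for _ in range(3)]  # depth 3, halo T = 6
image = rng.standard_normal((14, 14))                      # yields an 8 x 8 output

reference = image
for w in kernels:
    reference = conv2d_valid(reference, w)

assert np.allclose(block_inference(image, kernels, Bh=4, Bw=4), reference)
```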