Measuring Memory Usage with MKL
Overview
- Get the amount of memory allocated while running MKL-based code.
- Uses "mkl_peak_mem_usage" and "mkl_mem_stat".
Environment
- MKL
Intel(R) Math Kernel Library 11.1 Update 2 for Linux*
Functions
- mkl_peak_mem_usage
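  - Returns the peak memory usage, in bytes.
  - The argument is a constant that selects the operation (start measuring, stop, and so on).
    - MKL_PEAK_MEM_ENABLE : start measuring
    - MKL_PEAK_MEM : return the peak memory usage recorded since measuring started
    - MKL_PEAK_MEM_DISABLE : stop measuring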
    MKL_INT64 allocated_bytes;

    mkl_peak_mem_usage( MKL_PEAK_MEM_ENABLE );
    /* mkl_malloc calls, etc. */
    allocated_bytes = mkl_peak_mem_usage( MKL_PEAK_MEM );
    /* MKL_INT64 is a 64-bit integer, so print it with %lld rather than %d. */
    printf( "Uses %lld bytes.\n", (long long)allocated_bytes );
    /* mkl_free calls, etc. */
    mkl_peak_mem_usage( MKL_PEAK_MEM_DISABLE );
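The byte count returned here is a 64-bit integer (MKL_INT64), so handing it to printf with %d is a type mismatch. The following is a minimal sketch of formatting such a value safely in plain C. The names bytes_t, format_peak, and check_format_peak are hypothetical, and long long is used as a stand-in for MKL_INT64 so the example builds without MKL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for MKL_INT64 (assumption: a 64-bit signed integer). */
typedef long long bytes_t;

/* Writes "Uses <n> bytes." into buf and returns what snprintf returns. */
int format_peak(char *buf, size_t size, bytes_t peak_bytes)
{
    /* %lld matches long long; the cast keeps the call correct even if
       bytes_t is later changed to another 64-bit integer type. */
    return snprintf(buf, size, "Uses %lld bytes.", (long long)peak_bytes);
}

/* Self-check: a value larger than a 32-bit int survives formatting intact. */
void check_format_peak(void)
{
    char buf[64];

    format_peak(buf, sizeof buf, 5000000000LL);
    assert(strcmp(buf, "Uses 5000000000 bytes.") == 0);
}
```

Calling check_format_peak() from a test program exercises the helper with a value that would not fit in an int.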
- mkl_mem_stat
  - Returns the current state of MKL memory allocation: the number of bytes allocated and the number of buffers holding them.
    MKL_INT64 allocated_bytes;
    int allocated_buffers;

    /* initialization, etc. */
    /* mkl_mem_stat writes the buffer count through a pointer. */
    allocated_bytes = mkl_mem_stat( &allocated_buffers );
    printf( "Uses %lld bytes in %d buffers.\n", (long long)allocated_bytes, allocated_buffers );
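The two functions answer different questions: one reports what is allocated right now, the other the highest total seen so far. The toy model below sketches that difference in plain C. It is a hypothetical bookkeeping structure (mem_tracker and its helpers are invented names) and makes no claim about how MKL tracks memory internally. The sizes are illustrative.

```c
#include <assert.h>

/* Hypothetical bookkeeping model; not MKL's implementation. */
typedef struct {
    long long current_bytes; /* what a "current state" query would report */
    int       buffers;       /* number of live buffers */
    long long peak_bytes;    /* what a "peak usage" query would report */
} mem_tracker;

/* Records an allocation and raises the peak if the total grew past it. */
void tracker_alloc(mem_tracker *t, long long bytes)
{
    t->current_bytes += bytes;
    t->buffers += 1;
    if (t->current_bytes > t->peak_bytes)
        t->peak_bytes = t->current_bytes;
}

/* Records a release; the peak is deliberately left untouched. */
void tracker_free(mem_tracker *t, long long bytes)
{
    t->current_bytes -= bytes;
    t->buffers -= 1;
}

/* Allocates three buffers, frees them, and checks both views. */
void tracker_demo(void)
{
    mem_tracker t = {0, 0, 0};

    tracker_alloc(&t, 128);
    tracker_alloc(&t, 152);
    tracker_alloc(&t, 128);
    assert(t.current_bytes == 408 && t.buffers == 3);

    tracker_free(&t, 128);
    tracker_free(&t, 152);
    tracker_free(&t, 128);
    assert(t.current_bytes == 0 && t.buffers == 0);
    assert(t.peak_bytes == 408); /* the peak outlives the frees */
}
```

In this model the current total drops back to zero after the frees while the peak stays at 408, which is why a peak query is useful after the buffers are already gone.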
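Example run
- Two matrices are created and multiplied with cblas_dgemm.
- printf calls using the functions above are placed at several points to observe the memory state.
- The output comes first: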
    Memory malloc...
     Uses 128 bytes in 1 buffers.
     Uses 280 bytes in 2 buffers.
     Uses 408 bytes in 3 buffers.
    Set value (A)...
    Set value (B)...
    Calc ...
     Uses 408 bytes in 3 buffers.
    Check multiply matrix...
    Complete multiply matrix check!!
     Uses 408 bytes in 3 buffers.
    Memory free...
     Uses 0 bytes in 0 buffers.
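The reported totals are larger than the sizes the program requests (2x3, 3x3, and 2x3 doubles). The sketch below works through that arithmetic, assuming 8-byte doubles; per_buffer_bytes and overhead_demo are hypothetical helper names. The source does not explain the gap. Alignment padding and allocator bookkeeping are a plausible guess, not something confirmed here.

```c
#include <assert.h>
#include <stddef.h>

/* Cumulative totals printed after each of the three mkl_malloc calls. */
static const long long reported_total[3] = {128, 280, 408};

/* Fills per_buffer[i] with how much the reported total grew at step i. */
void per_buffer_bytes(const long long *totals, size_t n, long long *per_buffer)
{
    long long prev = 0;
    size_t i;

    for (i = 0; i < n; ++i) {
        per_buffer[i] = totals[i] - prev;
        prev = totals[i];
    }
}

/* Compares the reported growth with the requested sizes. */
void overhead_demo(void)
{
    /* Requested bytes for A (2x3), B (3x3), C (2x3); 8-byte doubles assumed. */
    const long long requested[3] = {2 * 3 * 8, 3 * 3 * 8, 2 * 3 * 8};
    long long per_buffer[3];
    size_t i;

    per_buffer_bytes(reported_total, 3, per_buffer);
    for (i = 0; i < 3; ++i) {
        /* 128 - 48, 152 - 72, 128 - 48: the difference is 80 each time. */
        assert(per_buffer[i] - requested[i] == 80);
    }
}
```

In this particular run every buffer is reported as 80 bytes larger than the request, so the numbers from mkl_mem_stat should be read as allocator-level usage rather than the sum of the requested sizes.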
- The source code follows:
    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include "mkl.h"

    /*
     * Memory usage reporting helpers
     */
    void startPrintAllocatedBytes()
    {
        mkl_peak_mem_usage( MKL_PEAK_MEM_ENABLE );
    }

    void printAllocatedBytes()
    {
        MKL_INT64 allocated_bytes;
        int allocated_buffers;

        allocated_bytes = mkl_mem_stat( &allocated_buffers );
        /* MKL_INT64 is a 64-bit integer, so print it with %lld rather than %d. */
        printf( " Uses %lld bytes in %d buffers.\n", (long long)allocated_bytes, allocated_buffers );
    }

    void stopPrintAllocatedBytes()
    {
        mkl_peak_mem_usage( MKL_PEAK_MEM_DISABLE );
    }

    /*
     * Stores the product of the two input matrices in the output matrix.
     * src1 * src2 = dest
     *
     * @param src1 : input matrix 1, size = m * p
     * @param src2 : input matrix 2, size = p * n
     * @param dest : output matrix,  size = m * n
     * @param m
     * @param p
     * @param n
     */
    int multiplyMatrix( double *src1, double *src2, double *dest,
                        unsigned int m, unsigned int p, unsigned int n )
    {
        /* beta = 0.0 so that the uninitialized contents of dest are not
           added to the product (dgemm computes alpha*A*B + beta*C). */
        cblas_dgemm( CblasRowMajor, CblasNoTrans, CblasNoTrans,
                     m, n, p, 1.0, src1, p, src2, n, 0.0, dest, n );
        return 1;
    }

    /*
     * test
     */
    int main()
    {
        double *A, *B, *C;
        unsigned int m = 2;
        unsigned int p = 3;
        unsigned int n = 3;

        startPrintAllocatedBytes();

        printf( "Memory malloc...\n" );
        A = (double*)mkl_malloc( m * p * sizeof( double ), 64 );
        printAllocatedBytes();
        B = (double*)mkl_malloc( p * n * sizeof( double ), 64 );
        printAllocatedBytes();
        C = (double*)mkl_malloc( m * n * sizeof( double ), 64 );
        printAllocatedBytes();

        printf( "Set value (A)...\n" );
        // A = [ ( 1, 2, 3 ),
        //       ( 1, 0, 1 ) ]
        A[0] = 1; A[1] = 2; A[2] = 3;
        A[3] = 1; A[4] = 0; A[5] = 1;

        printf( "Set value (B)...\n" );
        // B = [ ( 2, 1, 0 ),
        //       ( 1, 3, 1 ),
        //       ( 0, 1, 0 ) ]
        B[0] = 2; B[1] = 1; B[2] = 0;
        B[3] = 1; B[4] = 3; B[5] = 1;
        B[6] = 0; B[7] = 1; B[8] = 0;

        printf( "Calc ...\n" );
        multiplyMatrix( A, B, C, m, p, n );
        printAllocatedBytes();

        printf( "Check multiply matrix...\n" );
        // C = [ ( 4, 10, 2 ),
        //       ( 2,  2, 0 ) ]
        assert( C[0] == 4.0 );
        assert( C[1] == 10.0 );
        assert( C[2] == 2.0 );
        assert( C[3] == 2.0 );
        assert( C[4] == 2.0 );
        assert( C[5] == 0.0 );
        printf( "Complete multiply matrix check!!\n" );
        printAllocatedBytes();

        printf( "Memory free...\n" );
        mkl_free_buffers();
        mkl_free( A );
        mkl_free( B );
        mkl_free( C );
        printAllocatedBytes();

        stopPrintAllocatedBytes();
        return 0;  /* 0 reports success to the shell */
    }
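cblas_dgemm computes C = alpha * A * B + beta * C, so whenever beta is nonzero the previous contents of C contribute to the result. As a cross-check of the expected values, here is a naive row-major reference in plain C. The names ref_dgemm and ref_dgemm_demo are hypothetical; this is a sketch for verifying small cases, not a substitute for the MKL routine.

```c
#include <assert.h>

/* Hypothetical reference: C = alpha * A * B + beta * C, all row-major.
 * A is m x p, B is p x n, C is m x n. */
void ref_dgemm(int m, int n, int p, double alpha,
               const double *A, const double *B, double beta, double *C)
{
    int i, j, k;

    for (i = 0; i < m; ++i) {
        for (j = 0; j < n; ++j) {
            double sum = 0.0;

            for (k = 0; k < p; ++k)
                sum += A[i * p + k] * B[k * n + j];

            /* When beta is zero the old C is ignored entirely. */
            if (beta == 0.0)
                C[i * n + j] = alpha * sum;
            else
                C[i * n + j] = alpha * sum + beta * C[i * n + j];
        }
    }
}

/* Uses the same A and B as the example program. */
void ref_dgemm_demo(void)
{
    const double A[6] = {1, 2, 3, 1, 0, 1};
    const double B[9] = {2, 1, 0, 1, 3, 1, 0, 1, 0};
    double C[6] = {9, 9, 9, 9, 9, 9}; /* junk stands in for uninitialized memory */

    /* beta = 0: the junk in C does not affect the product. */
    ref_dgemm(2, 3, 3, 1.0, A, B, 0.0, C);
    assert(C[0] == 4.0 && C[1] == 10.0 && C[2] == 2.0);
    assert(C[3] == 2.0 && C[4] == 2.0 && C[5] == 0.0);

    /* beta = 1: the previous C is added on top, so each entry doubles. */
    ref_dgemm(2, 3, 3, 1.0, A, B, 1.0, C);
    assert(C[1] == 20.0);
}
```

The second call shows why a freshly allocated output buffer should be paired with beta = 0, or zeroed first: with beta = 1 whatever was in C beforehand ends up in the result.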