With the rapid growth of deep learning models and higher expectations for their accuracy and throughput in real-world applications, the demand for profiling and characterizing model inference on different hardware/software stacks has increased significantly. Since model inference characterization on GPUs has already been studied extensively, it is worth exploring how performance-enhancing libraries such as Intel MKL-DNN help boost performance on Intel CPUs. We develop a profiling mechanism to capture MKL-DNN operation calls and formulate the tracing timeline with spans on the server. Through profiling and characterization that give insights into Intel MKL-DNN, we evaluate and demonstrate that its optimization techniques, including blocked ...
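As a rough illustration of what span-based tracing of MKL-DNN calls can look like (a minimal sketch, not the paper's own tooling): MKL-DNN/oneDNN can print one log line per executed primitive when its verbose mode is enabled, and those per-primitive timings can be folded into spans. The environment-variable name and the exact field layout vary across library versions, and `infer.py` is a hypothetical workload script.

```python
import os
import subprocess
import sys

def run_with_mkldnn_verbose(cmd):
    # Enable MKL-DNN verbose logging for the child process. Depending on the
    # library version the variable is MKLDNN_VERBOSE, DNNL_VERBOSE, or
    # ONEDNN_VERBOSE; setting all three is harmless.
    env = dict(os.environ, MKLDNN_VERBOSE="1", DNNL_VERBOSE="1", ONEDNN_VERBOSE="1")
    result = subprocess.run(cmd, env=env, capture_output=True, text=True)
    return result.stdout.splitlines()

def lines_to_spans(lines):
    # Each verbose line is comma-separated and, in the versions I have seen,
    # ends with the primitive's execution time in milliseconds. The index of
    # the primitive-kind field shifts between versions, so treat the label as
    # best-effort. Start times are reconstructed by accumulating durations,
    # since the log carries no wall-clock timestamps.
    spans, cursor = [], 0.0
    for line in lines:
        if not line.startswith(("mkldnn_verbose", "dnnl_verbose", "onednn_verbose")):
            continue
        fields = line.split(",")
        try:
            duration_ms = float(fields[-1])
        except ValueError:
            continue  # version banner or malformed line
        name = fields[2] if len(fields) > 2 else "unknown"
        spans.append({"name": name, "start_ms": cursor, "duration_ms": duration_ms})
        cursor += duration_ms
    return spans

if __name__ == "__main__":
    # Hypothetical usage: trace the MKL-DNN primitives executed by infer.py.
    spans = lines_to_spans(run_with_mkldnn_verbose([sys.executable, "infer.py"]))
    for s in spans[:10]:
        print(f"{s['name']:<20} start={s['start_ms']:9.3f} ms  dur={s['duration_ms']:8.3f} ms")
```

The resulting spans can be fed into any timeline viewer; this only approximates the server-side tracing described in the abstract, which is not detailed enough here to reproduce exactly.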
The spread of deep learning on embedded devices has prompted the development of numerous methods to ...
The reconstruction of charged particle trajectories is an essential component of high energy physics...
Machine Learning (ML) frameworks are tools that facilitate the development and deployment of ML mode...
Thesis (Master's)--University of Washington, 2018. Embedded platforms with integrated graphics process...
Deep learning is widely used in many problem areas, namely computer vision, natural language process...
In this paper, we analyze heterogeneous performance exhibited by some popular deep learning software...
We devise a performance model for GPU training of Deep Learning Recommendation Models (DLRM), whose ...
While providing the same functionality, the various Deep Learning software frameworks available thes...
The deep learning community focuses on training networks for better accuracy on GPU servers. Howev...
Machine learning has been widely used in various application domains such as recommendation, compute...
Machine Learning involves analysing large sets of training data to make predictions and decisions to...
In recent years, machine learning (ML) and, more noticeably, deep learning (DL), have become incre...
A recent effort to explore a neural network inference in FPGAs focusing on low-latency applications ...
GPU is a powerful, pervasive, and indispensable platform for running deep learning (DL) workloads in ...
When executing a deep neural network (DNN), its model parameters are loaded into GPU memory before e...