AIM 2017 - September 10th - Portland, Oregon, USA
Organizers
- Dr. Meenakshi Arunachalam, Principal Engineer, Intel Corp, USA
- Dr. Mahmut Kandemir, Professor, Penn State University, USA
Program
| Time | Session / Title | Speakers | Affiliation | Slides |
|---|---|---|---|---|
| 8:50am - 9:00am | Welcome and introduction to AIM workshop | | | |
| 9:00am - 10:00am | Keynote 1: Accelerating Persistent Neural Networks at Datacenter Scale | Jeremy Fowers | Microsoft Research | |
| 10:00am - 10:30am | Break | | | |
| 10:30am - 11:00am | Convolutional Neural Networks for Text Classification using Intel Nervana Neon | Kripa Sankaranarayanan, Yinyin Liu | Intel Corp | pdf, pptx |
| 11:00am - 11:30am | Layer-wise Performance Bottleneck Analysis of Deep Neural Networks | Hengyu Zhao, Colin Weinshenker*, Mohamed Ibrahim*, Adwait Jog*, Jishen Zhao | University of California Santa Cruz; *The College of William and Mary | pdf, pptx |
| 11:30am - 12:00pm | Optimizing neon Deep Learning Framework for Intel Architectures | Wei Wang, Peng Zhang, Jayaram Bobba, Dawn Stone, Menglin Wu, Xiaohui Zhao, Mingfei Ma, Wenting Jiang, Jason Ye, Huma Abidi, Jennifer Myers, Hanlin Tang, Evren Tumer | Intel Corp | pdf, pptx |
| 12:00pm - 1:30pm | Lunch break | | | |
| 1:30pm - 2:30pm | Keynote 2: AI: A Platform Perspective | Mohan J Kumar | Data Center Group, Intel Corp | |
| 2:30pm - 2:45pm | Break | | | |
| 2:45pm - 3:15pm | Flexible On-chip Memory Architecture for DCNN Accelerators | Arash Azizimazreah, Lizhong Chen | Oregon State University | pdf, pptx |
| 3:15pm - 3:45pm | Accelerating TensorFlow on Modern Intel Architectures | Elmoustapha Ould-Ahmed-Vall, Mahmoud Abuzaina, Md Faijul Amin, Jayaram Bobba, Roman S Dubtsov, Evarist M Fomenko, Mukesh Gangadhar, Niranjan Hasabnis, Jing Huang, Deepthi Karkada, Young Jin Kim, Srihari Makineni, Dmitri Mishura, Karthik Raman, AG Ramesh, Vivek V Rane, Michael Riera, Dmitry Sergeev, Vamsi Sripathi, Bhavani Subramanian, Lakshay Tokas, Antonio C Valles | Intel Corp | pdf, pptx |
| 3:45pm - 4:15pm | Understanding Large-Scale I/O Workload Characteristics via Deep Neural Networks | Jinyoung Moon, Myoungsoo Jung | Yonsei University, Korea | pdf, pptx |
Abstract
With the explosion of data created and uploaded across the Internet of Things, hand-held devices and PCs, and cloud and enterprise systems, there is a significant opportunity to apply machine learning and deep learning techniques to these terabytes of data and deliver breakthroughs in many domains. Deep learning in computer vision, speech recognition, and video processing has sped up advances in applications ranging from manufacturing, robotics, and business intelligence to autonomous driving, precision medicine, and digital surveillance, to name a few. Traditional machine learning algorithms such as Support Vector Machines, Principal Component Analysis, Alternating Least Squares, K-Means, and Decision Trees remain ever present in product recommendations for online users, fraud detection, and financial services.

There is a race to design parallel architectures that innovate across end-to-end workflows: low time-to-train while hitting state-of-the-art or better accuracy without overfitting, low-latency inference, and good TCO, performance per watt, and compute and memory efficiency. These neural network and mathematical algorithms urgently need architectural innovations in CPUs, GPUs, FPGAs, ASICs, memories, and on-chip interconnects to attain their latency and accuracy requirements. Mixed and/or low-precision arithmetic, high-bandwidth stacked DRAMs, systolic array processing, vector extensions in many-core and multi-core processors, special neural network instructions, and sparse and dense data structures are some of the ways in which GEMM operations, Winograd convolutions, ReLUs, fully connected layers, and the like are run optimally to achieve the expected accuracy, training, and inference requirements.
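The low/mixed-precision arithmetic mentioned above can be sketched in a few lines: quantize floating-point vectors to 8-bit integers with a symmetric scale, accumulate the dot product in wider integer precision, and dequantize the result at the end. This is an illustrative pure-Python sketch, not any vendor's implementation; the function names and the simple per-vector scaling scheme are assumptions.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers using a single symmetric scale.

    The scale is chosen so the largest-magnitude value maps to the
    largest representable integer (127 for int8).
    """
    qmax = 2 ** (num_bits - 1) - 1
    scale = max(abs(v) for v in values) / qmax or 1.0  # avoid scale of 0
    return [round(v / scale) for v in values], scale

def quantized_dot(a, b):
    """Dot product computed in integer arithmetic, dequantized at the end."""
    qa, sa = quantize(a)
    qb, sb = quantize(b)
    # Accumulate in full integer precision (int32-style on real hardware)
    acc = sum(x * y for x, y in zip(qa, qb))
    return acc * sa * sb

a = [0.5, -1.25, 2.0]
b = [1.0, 0.75, -0.5]
exact = sum(x * y for x, y in zip(a, b))
approx = quantized_dot(a, b)
```

On real hardware the same idea scales to whole GEMMs: int8 multiplies feed int32 accumulators, roughly quadrupling arithmetic throughput per vector lane at the cost of a small, usually tolerable, accuracy loss.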
This workshop aims to bring together computer architecture, compiler, and AI/machine learning/deep learning researchers, as well as domain experts, to produce research that targets the confluence of these disciplines. It will be a venue for discussion and brainstorming on topics related to these areas. The topics of interest include, but are not limited to:
- Novel CPU, GPU, FPGA and ASIC architectures for AI and deep learning;
- Compiler and runtime system design for AI and data science;
- Evolution of demands of ML frameworks and workloads;
- Low precision and/or mixed precision arithmetic;
- Memory and storage technologies for AI (3DXpoint, HBM, Stacked DRAM, and NVRAM);
- Neural network instructions and special purpose units;
- AI workload design and development for accelerators;
- Hardware/software codesign with intelligent systems;
- End-to-end system flow optimizations.
Submission Guidelines
All manuscripts will be reviewed by at least three members of the program committee. Submissions should be complete manuscripts in PDF format, not exceeding 6 pages of single-spaced text in the ACM format, including figures and tables. Templates for paper preparation are available from ACM. Please follow this link to submit your paper.
Important Dates:
- Submission deadline: ~~July 17th, 2017~~ Aug 7th, 2017
- Notification: ~~August 4th, 2017~~ Aug 25th, 2017
- Camera-ready: ~~August 11th, 2017~~ Sep 1st, 2017
Program Committee:
- Mohammad Arjomand, Georgia Tech, USA
- Meena Arunachalam, Intel, USA
- Nachiappan Chidambaram N., Apple Inc., USA
- Myoungsoo Jung, Yonsei University, Korea
- Mahmut Taylan Kandemir, Penn State, USA
- Rahul Khanna, Intel, USA
- Hassnaa Moustafa, Intel, USA
- Ozcan Ozturk, Bilkent University, Turkey
- Gabriel Rodriguez, Universidade da Coruna, Spain
- Vikram Saletore, Intel, USA
Web/Publicity Chair:
- Jagadish Kotra, Penn State, USA
Submission Chair:
- Xulong Tang, Penn State, USA