1st Reproducible Tournament on Pareto-efficient Image Classification

Results of the 1st reproducible ACM ReQuEST-ASPLOS'18 tournament:

Our long-term goal is to develop a common methodology and framework for reproducible co-design of the efficient software/hardware stack for emerging algorithms requested by our advisory board (inference, object detection, training, etc.) in terms of speed, accuracy, energy, size, complexity, costs and other metrics. Open ReQuEST competitions bring together AI, ML and systems researchers to share complete algorithm implementations (code and data) as portable, customizable and reusable Collective Knowledge workflows. This helps other researchers and end-users to quickly validate such results, reuse workflows and optimize/autotune algorithms across different platforms, models, data sets, libraries, compilers and tools. We will also use our practical experience reproducing experimental results from ReQuEST submissions to help set up artifact evaluation at the upcoming SysML 2019, and to suggest new algorithms for inclusion in the MLPerf benchmark.

The associated ACM ReQuEST workshop is co-located with ASPLOS 2018 and takes place on March 24th, 2018 (afternoon) in Williamsburg, VA, USA.

A ReQuEST introduction and long-term goals: cKnowledge.org/request website and ArXiv paper.

Steering committee (A-Z)

Advisory/industrial board (A-Z)

Contact us if you are interested in joining the board!

Workshop program

Time slot
Presentation
Reusable artifacts

1:30pm—1:40pm

Workshop introduction

ReQuEST tournaments bring together multidisciplinary researchers (AI, ML, systems) to find the most efficient solutions for realistic problems requested by the advisory board in terms of speed, accuracy, energy, complexity, costs and other metrics across the whole application/software/hardware stack in a fair and reproducible way. All the winning solutions (code, data, workflow) on a Pareto frontier are then available to the community as portable and customizable "plug&play" AI/ML components with a common API and meta information. The ultimate goal is to accelerate research and reduce costs by reusing the most accurate and efficient AI/ML blocks continuously optimized, autotuned and crowd-tuned across diverse models, data sets and platforms from cloud to edge.

1:40pm—2:30pm

Keynote "The Retrospect and Prospect of Low-Power Image Recognition Challenge (LPIRC)"

Prof. Yiran Chen, Duke University, USA

Slides in PDF

Abstract: Reducing power consumption has been one of the most important goals since the creation of electronic systems. Energy efficiency is increasingly important as battery-powered systems (such as smartphones, drones, and body cameras) are widely used. It is desirable to use the on-board computers to recognize objects in the images captured by these cameras. The Low-Power Image Recognition Challenge (LPIRC) is an annual competition started in 2015, aiming to discover the best technology in both image recognition and energy conservation. In this talk, we will explain the rules of the competition and their rationale, summarize the teams' scores, and describe the lessons learned in the past years. We will also discuss possible improvements for future challenges and collaboration opportunities with other events and competitions like ReQuEST.

Short bio: Yiran Chen received his B.S. and M.S. from Tsinghua University and his Ph.D. from Purdue University in 2005. After five years in industry, he joined the University of Pittsburgh in 2010 as Assistant Professor and was then promoted to Associate Professor with tenure in 2014, holding the title of Bicentennial Alumni Faculty Fellow. He is now a tenured Associate Professor in the Department of Electrical and Computer Engineering at Duke University and serves as co-director of the Duke Center for Evolutionary Intelligence (CEI), focusing on research into new memory and storage systems, machine learning and neuromorphic computing, and mobile computing systems. Dr. Chen has published one book and more than 300 technical publications and has been granted 93 US patents. He is an associate editor of IEEE TNNLS, IEEE D&T, IEEE ESL, ACM JETC, and ACM TCPS, and has served on the technical and organization committees of more than 40 international conferences. He has received 6 best paper awards and 12 best paper nominations from international conferences. He is a recipient of the NSF CAREER award and the ACM SIGDA outstanding new faculty award. He is a Fellow of the IEEE.

See LPIRC tournaments.

2:30pm—2:50pm

"Real-Time Image Recognition Using Collaborative IoT Devices"

Ramyad Hadidi, Jiashen Cao, Matthew Woodward, Michael S. Ryoo, Hyesoon Kim

Georgia Institute of Technology, USA
Nvidia Jetson TX2, ARM, Raspberry Pi, AlexNet, VGG16, TensorFlow, Keras, Avro

2:50pm—3:10pm

"Highly Efficient 8-bit Low Precision Inference of Convolutional Neural Networks with IntelCaffe"

Jiong Gong, Haihao Shen, Guoming Zhang, Xiaoli Liu, Shane Li, Ge Jin, Niharika Maheshwari

Intel Corporation
Xeon Platinum 8124M, AWS, Intel C++ Compiler 17.0.5 20170817, ResNet-50, Inception-V3, SSD, 32-bit, 8-bit, Caffe

3:10pm—3:30pm

"VTA: Open Hardware/Software Stack for Vertical Deep Learning System Optimization"

Thierry Moreau, Tianqi Chen, Luis Ceze

University of Washington, USA
Xilinx FPGA (Pynq board), ResNet-*, MXNet, NNVM/TVM

3:30pm—4:00pm

Break

4:00pm—4:20pm

"Optimizing Deep Learning Workloads on ARM GPU with TVM"

Lianmin Zheng (1), Tianqi Chen (2)

(1) Shanghai Jiao Tong University, China
(2) University of Washington, USA
Firefly-RK3399, GCC, LLVM, VGG16, MobileNet, ResNet-18, OpenBLAS vs ArmCL, MXNet, NNVM/TVM

4:20pm—4:50pm

"Introducing open ReQuEST platform, scoreboard and long-term vision"

Grigori Fursin and the ReQuEST organizers

"Exploring performance and accuracy of the MobileNets family using the Arm Compute Library"

Nikolay Chunosov, Flavio Vella, Anton Lokhmotov, Grigori Fursin

dividiti, UK
cTuning foundation, France
HiKey 960 (GPU), GCC, MobileNets exploration, ArmCL (18.01, 18.02, dividiti optimizations), OpenCL

5:00pm

"Tackling complexity, reproducibility and tech transfer challenges in a rapidly evolving AI/ML/systems research"

Moderators: Grigori Fursin and Thierry Moreau.

We plan to center discussion around the following questions:

  • How do we facilitate tech transfer between academia and industry in a quickly evolving research landscape?
  • How do we incentivize companies and academic researchers to release more artifacts and open source projects as portable, customizable and reusable components which can be collaboratively optimized by the community across diverse models, data sets and platforms from the cloud to edge?
  • How do we ensure reproducible evaluation and fair comparison of diverse AI/ML frameworks, libraries, techniques and tools?
  • What other workloads (AI, ML, quantum) and exciting research challenges should ReQuEST attempt to solve in its future iterations with the help of the multi-disciplinary community: reducing training time and costs, comparing specialized hardware (TPU/FPGA/DSP), distributing learning across edge devices, ...

Participants:

Hillery Hunter, IBM

Hillery Hunter is an IBM Fellow and Director of the Accelerated Cognitive Infrastructure group at IBM's T.J. Watson Research Center in Yorktown Heights, NY. She is interested in cross-disciplinary technology topics, spanning silicon to system architecture to achieve new solutions to traditional problems. Her team pursues hardware-software co-optimization to take the wait time out of machine and deep learning problems. Her prior work was in the areas of DRAM main memory systems and embedded DRAM, and she gained development experience serving as IBM's server and mainframe DDR3-generation end-to-end memory power lead. In 2010, she was selected by the National Academy of Engineering for its Frontiers in Engineering Symposium, a recognition as one of the top young engineers in America. Dr. Hunter received the Ph.D. degree in Electrical Engineering from the University of Illinois, Urbana-Champaign and is a member of the IBM Academy of Technology. Hillery was appointed as an IBM Fellow in 2017.


Yiran Chen, Duke University

Yiran Chen received his B.S. and M.S. from Tsinghua University and his Ph.D. from Purdue University in 2005. After five years in industry, he joined the University of Pittsburgh in 2010 as Assistant Professor and was then promoted to Associate Professor with tenure in 2014, holding the title of Bicentennial Alumni Faculty Fellow. He is now a tenured Associate Professor in the Department of Electrical and Computer Engineering at Duke University and serves as co-director of the Duke Center for Evolutionary Intelligence (CEI), focusing on research into new memory and storage systems, machine learning and neuromorphic computing, and mobile computing systems. Dr. Chen has published one book and more than 300 technical publications and has been granted 93 US patents. He is an associate editor of IEEE TNNLS, IEEE D&T, IEEE ESL, ACM JETC, and ACM TCPS, and has served on the technical and organization committees of more than 40 international conferences. He has received 6 best paper awards and 12 best paper nominations from international conferences. He is a recipient of the NSF CAREER award and the ACM SIGDA outstanding new faculty award. He is a Fellow of the IEEE.


Charles Qi, Cadence

Charles Qi is a system solutions architect in Cadence's IPG System and Software team, responsible for providing vision system solutions based on the Cadence(R) Tensilica Vision DSP technology and a broad portfolio of interface IP. At the system level, his primary focus is image sensing, computer vision and deep learning hardware and software for high-performance automotive vision ADAS SoCs. He is currently also an active member of the internal architecture team for high-performance neural network acceleration hardware IP. Prior to joining Cadence, Charles held various technical positions at Intel, Broadcom and several high-tech startups.

Important dates

Call for submissions

The 1st ReQuEST tournament is co-located with ACM ASPLOS'18 and will focus on optimizing the whole model/software/hardware stack for image classification based on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Unlike the classical ILSVRC, where submissions are ranked according to their classification accuracy, ReQuEST submissions will be evaluated according to multiple metrics and trade-offs selected by the authors (e.g. accuracy, speed, throughput, energy consumption, hardware cost, usage cost, etc.) in a unified, reproducible and objective way using the Collective Knowledge framework (CK). Restricting the competition to a single application domain will allow us to test our open-source ReQuEST tournament infrastructure, validate it across multiple platforms and environments, and prepare a dedicated live scoreboard with results similar to this public CK scoreboard.
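
Because submissions are compared on several metrics at once rather than ranked by accuracy alone, one natural way to read the results is as a Pareto frontier: a submission stays on the frontier if no other submission is at least as good on every metric and strictly better on at least one. The following Python sketch illustrates that idea only. It is not the ReQuEST or CK scoreboard code; the metric names, the `Submission` type and the example numbers are all hypothetical and chosen purely for illustration.

```python
from typing import List, NamedTuple


class Submission(NamedTuple):
    """Hypothetical record of one tournament entry (illustrative fields only)."""
    name: str
    top1_accuracy: float  # higher is better
    latency_ms: float     # lower is better
    energy_j: float       # lower is better


def dominates(a: Submission, b: Submission) -> bool:
    """Return True if `a` is no worse than `b` on every metric and better on one."""
    no_worse = (
        a.top1_accuracy >= b.top1_accuracy
        and a.latency_ms <= b.latency_ms
        and a.energy_j <= b.energy_j
    )
    strictly_better = (
        a.top1_accuracy > b.top1_accuracy
        or a.latency_ms < b.latency_ms
        or a.energy_j < b.energy_j
    )
    return no_worse and strictly_better


def pareto_frontier(submissions: List[Submission]) -> List[Submission]:
    """Keep every submission that is not dominated by any other submission."""
    return [
        s for s in submissions
        if not any(dominates(other, s) for other in submissions)
    ]


# Made-up numbers, not tournament results.
EXAMPLE_SUBMISSIONS = [
    Submission("A", top1_accuracy=0.70, latency_ms=50.0, energy_j=2.0),
    Submission("B", top1_accuracy=0.75, latency_ms=80.0, energy_j=3.0),
    Submission("C", top1_accuracy=0.68, latency_ms=60.0, energy_j=2.5),
    Submission("D", top1_accuracy=0.75, latency_ms=90.0, energy_j=3.5),
]

if __name__ == "__main__":
    for s in pareto_frontier(EXAMPLE_SUBMISSIONS):
        print(s.name, s.top1_accuracy, s.latency_ms, s.energy_j)
```

In this made-up data set, A and B trade accuracy against latency and energy, so neither dominates the other, while C and D are each beaten on every metric by one of them. The quadratic pairwise comparison is adequate for a scoreboard with a handful of entries; a larger set of results would call for a sort-based method.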

We encourage participants to target accessible, off-the-shelf hardware to allow our evaluation committee to conveniently reproduce their results. Example systems include:

If a submission relies on an exotic hardware platform, the participants can either provide the artifact evaluation committee with restricted access to their evaluation platform, or notify the organizers of their choice in advance (please try to give us at least 3 weeks' notice) so that a similar platform can be acquired in time (assuming the cost is not prohibitive).

Example optimizations include:

We strongly encourage artifact submissions for already published optimization techniques since one of the ReQuEST goals is to prepare a reference (baseline) set of implementations of various algorithms shared as portable, customizable and reusable CK components with a common API. In fact, the ReQuEST submissions will be directly fed into pilot CK integrations with the ACM Digital Library.

Submission

We follow standard procedures for submitting and evaluating experimental workflows, as established at leading systems conferences including CGO, PPoPP, PACT and SuperComputing ("artifact evaluation"):

Evaluation

ReQuEST is backed by the ACM Task Force on Data, Software, and Reproducibility in Publication and uses the standard artifact evaluation methodology. Artifact evaluation is single blind (see PPoPP, CGO, PACT, RTSS and SuperComputing). Reviews will be performed by the organizers and volunteers ("reviewers"), and can be made public upon the authors' request (see ADAPT). Quality and efficiency metrics will be collected for each submission, and displayed on a live ReQuEST scoreboard similar to this open CK repository.

Presentation

Feel free to contact us if you have questions or suggestions!

Advisory/industrial board

After the workshop, we will prepare a public report for the ReQuEST Advisory/Industrial Board. The board members will provide their feedback on the results, collaborate on a common methodology for reproducible evaluation and optimization, suggest realistic workloads, help provide access to rare hardware platforms to the Artifact Evaluation Committee for future tournaments, and provide prizes for distinguished entries. We will use our practical experience reproducing experimental results from ReQuEST submissions to help set up artifact evaluation at the upcoming SysML 2019, and to suggest new algorithms for inclusion in the MLPerf benchmark.