Results of the 1st reproducible ACM ReQuEST-ASPLOS'18 tournament:
[ Report ] [ ACM proceedings with reproducibility badges ] [ ReQuEST live dashboard ] [ CK framework ] [ portable, customizable and reusable CK workflows ] [ shared CK repositories ] [ shared CK packages ] [ discussion group ]
Our long-term goal is to develop a common methodology and framework for reproducible co-design of an efficient software/hardware stack for emerging algorithms requested by our advisory board (inference, object detection, training, etc.) in terms of speed, accuracy, energy, size, complexity, costs and other metrics. Open ReQuEST competitions bring together AI, ML and systems researchers to share complete algorithm implementations (code and data) as portable, customizable and reusable Collective Knowledge workflows. This helps other researchers and end-users to quickly validate such results, reuse workflows and optimize/autotune algorithms across different platforms, models, data sets, libraries, compilers and tools. We will also use our practical experience reproducing experimental results from ReQuEST submissions to help set up artifact evaluation at the upcoming SysML 2019, and to suggest new algorithms for inclusion in the MLPerf benchmark.
The associated ACM ReQuEST workshop is co-located with ASPLOS 2018
March 24th, 2018 (afternoon), Williamsburg, VA, USA.
|Time slot||Presentation||Reusable artifacts|
ReQuEST tournaments bring together multidisciplinary researchers (AI, ML, systems) to find the most efficient solutions for realistic problems requested by the advisory board in terms of speed, accuracy, energy, complexity, costs and other metrics across the whole application/software/hardware stack in a fair and reproducible way. All the winning solutions (code, data, workflow) on a Pareto frontier are then made available to the community as portable and customizable "plug&play" AI/ML components with a common API and meta information. The ultimate goal is to accelerate research and reduce costs by reusing the most accurate and efficient AI/ML blocks, continuously optimized, autotuned and crowd-tuned across diverse models, data sets and platforms from cloud to edge.
Keynote "The Retrospect and Prospect of Low-Power Image Recognition Challenge (LPIRC)"
Prof. Yiran Chen, Duke University, USA
Abstract: Reducing power consumption has been one of the most important goals since the creation of electronic systems. Energy efficiency is increasingly important as battery-powered systems (such as smartphones, drones, and body cameras) are widely used. It is desirable to use the on-board computers to recognize objects in the images captured by these cameras. The Low-Power Image Recognition Challenge (LPIRC) is an annual competition started in 2015, aiming to discover the best technology in both image recognition and energy conservation. In this talk, we will explain the rules of the competition and their rationale, summarize the teams' scores, and describe the lessons learned in the past years. We will also discuss possible improvements for future challenges and collaboration opportunities with other events and competitions like ReQuEST.
Short bio: Yiran Chen received his B.S. and M.S. from Tsinghua University and his Ph.D. from Purdue University in 2005. After five years in industry, he joined the University of Pittsburgh in 2010 as an Assistant Professor, was promoted to Associate Professor with tenure in 2014, and held the title of Bicentennial Alumni Faculty Fellow. He is now a tenured Associate Professor in the Department of Electrical and Computer Engineering at Duke University and serves as the co-director of the Duke Center for Evolutionary Intelligence (CEI), focusing on research into new memory and storage systems, machine learning and neuromorphic computing, and mobile computing systems. Dr. Chen has published one book and more than 300 technical publications and has been granted 93 US patents. He is an associate editor of IEEE TNNLS, IEEE D&T, IEEE ESL, ACM JETC, and ACM TCPS, and has served on the technical and organization committees of more than 40 international conferences. He has received 6 best paper awards and 12 best paper nominations from international conferences. He is a recipient of the NSF CAREER award and the ACM SIGDA outstanding new faculty award. He is a Fellow of the IEEE.
See LPIRC tournaments.
"Real-Time Image Recognition Using Collaborative IoT Devices"
Ramyad Hadidi, Jiashen Cao, Matthew Woodward, Michael S. Ryoo, Hyesoon Kim
Georgia Institute of Technology, USA
Nvidia Jetson TX2, ARM, Raspberry Pi, AlexNet, VGG16, TensorFlow, Keras, Avro
"Highly Efficient 8-bit Low Precision Inference of Convolutional Neural Networks with IntelCaffe"
Jiong Gong, Haihao Shen, Guoming Zhang, Xiaoli Liu, Shane Li, Ge Jin, Niharika Maheshwari
Xeon Platinum 8124M, AWS, Intel C++ Compiler 17.0.5 20170817, ResNet-50, Inception-V3, SSD, 32-bit, 8-bit, Caffe
"VTA: Open Hardware/Software Stack for Vertical Deep Learning System Optimization"
Thierry Moreau, Tianqi Chen, Luis Ceze
University of Washington, USA
Xilinx FPGA (Pynq board), ResNet-*, MXNet, NNVM/TVM
"Optimizing Deep Learning Workloads on ARM GPU with TVM"
Lianmin Zheng1, Tianqi Chen2
1 Shanghai Jiao Tong University, China
Firefly-RK3399, GCC, LLVM, VGG16, MobileNet, ResNet-18, OpenBLAS vs ArmCL, MXNet, NNVM/TVM
"Introducing open ReQuEST platform, scoreboard and long-term vision"
Grigori Fursin and the ReQuEST organizers
"Exploring performance and accuracy of the MobileNets family using the Arm Compute Library"
Nikolay Chunosov, Flavio Vella, Anton Lokhmotov, Grigori Fursin
HiKey 960 (GPU), GCC, MobileNets exploration, ArmCL (18.01,18.02,dividiti optimizations), OpenCL
Demonstrating live ReQuEST scoreboard with latest validated results
Note that the idea of ReQuEST tournaments is to keep updating this scoreboard with the help of the authors and the community even after the workshop. Please stay tuned!
Live ReQuEST scoreboard
and shared ReQuEST workflow with all artifacts.
Other shared CK artifacts and workflows are available here.
Open panel and discussion: "Tackling complexity, reproducibility and tech transfer challenges in rapidly evolving AI/ML/systems research"
We plan to center discussion around the following questions:
The 1st ReQuEST tournament is co-located with ACM ASPLOS'18 and will focus on optimizing the whole model/software/hardware stack for image classification based on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Unlike the classical ILSVRC, where submissions are ranked according to their classification accuracy, ReQuEST submissions will be evaluated according to multiple metrics and trade-offs selected by the authors (e.g. accuracy, speed, throughput, energy consumption, hardware cost, usage cost, etc.) in a unified, reproducible and objective way using the Collective Knowledge framework (CK). Restricting the competition to a single application domain will allow us to test our open-source ReQuEST tournament infrastructure, validate it across multiple platforms and environments, and prepare a dedicated live scoreboard with results similar to this public CK scoreboard.
We encourage participants to target accessible, off-the-shelf hardware to allow our evaluation committee to conveniently reproduce their results. Example systems include:
If a submission relies on an exotic hardware platform, the participants can either provide the artifact evaluation committee with restricted access to their evaluation platform, or notify the organizers of their choice in advance (please try to give us at least 3 weeks' notice) so that a similar platform can be acquired in time (assuming the cost is not prohibitive).
Example optimizations include:
We strongly encourage artifact submissions for already published optimization techniques since one of the ReQuEST goals is to prepare a reference (baseline) set of implementations of various algorithms shared as portable, customizable and reusable CK components with a common API. In fact, the ReQuEST submissions will be directly fed into pilot CK integrations with the ACM Digital Library.
We follow standard procedures for submitting and evaluating experimental workflows, as established at leading systems conferences including CGO, PPoPP, PACT and SuperComputing ("artifact evaluation"):
Step 1: Share your experimental artifacts and workflows
You should make all artifacts and experimental workflows publicly available via GitHub, GitLab, Bitbucket or similar, or pack them in a zip/tar archive or Docker/VM image. You should also provide instructions and scripts to build and run your workflows on a target platform, measure the characteristics and compare the results against a reference implementation.
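As an illustration of the kind of script requested above, here is a minimal Python sketch that runs a workload, measures one characteristic (wall-clock time) and compares it against a reference implementation. It is not part of the ReQuEST or CK tooling: the function names (`measure`, `compare`, `reference_workload`, `optimized_workload`) and the toy workloads are hypothetical placeholders for a real benchmark.

```python
import statistics
import time


def measure(fn, repetitions=5):
    """Return the median wall-clock time (in seconds) of calling fn()."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def reference_workload():
    # Hypothetical baseline: sum of squares computed with an explicit loop.
    return sum(i * i for i in range(20000))


def optimized_workload():
    # Hypothetical optimized variant: closed-form sum of squares 0..n-1.
    n = 20000
    return (n - 1) * n * (2 * n - 1) // 6


def compare(reference, candidate, repetitions=5):
    """Check that both variants agree, then report timings and speedup."""
    if reference() != candidate():
        raise ValueError("candidate output differs from the reference")
    ref_s = measure(reference, repetitions)
    cand_s = measure(candidate, repetitions)
    # Guard against a zero reading on timers with coarse resolution.
    speedup = ref_s / cand_s if cand_s > 0 else float("inf")
    return {"reference_s": ref_s, "candidate_s": cand_s, "speedup": speedup}


report = compare(reference_workload, optimized_workload)
print(report)
```

Checking that the candidate reproduces the reference output before timing it keeps the comparison meaningful; a real submission would report accuracy, energy or cost in the same structure alongside time.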
If you are already familiar with the open-source Collective Knowledge framework (CK), you are encouraged to convert your experimental workflows to portable CK workflows. Such workflows can automatically set up the environment, detect required software dependencies, install missing packages and run experiments, thus automating artifact evaluation. (See some examples here.)
If you are not familiar with CK, worry not! We will gladly help you convert your submission to CK during the evaluation stage.
Step 2: Submit an extended abstract with Artifact Appendix
You should prepare an extended abstract (max 4 pages) using this ReQuEST LaTeX template in the SIGPLAN conference style. Include your name, affiliation, and a brief description of your work (which can be novel or already presented elsewhere). Please also fill in the Artifact Appendix in the above template, including how to obtain your artifacts and workflows. Provide a detailed specification of your experimental workflow, a list of optimization metrics (speed, accuracy, energy, costs, etc.) and the expected results (which the reviewers will need to independently validate). Please submit your extended abstract as a PDF via the ReQuEST HotCRP website. Please contact the organizers if you encounter any problems.
ReQuEST is backed by the ACM Task Force on Data, Software, and Reproducibility in Publication and uses the standard artifact evaluation methodology. Artifact evaluation is single blind (see PPoPP, CGO, PACT, RTSS and SuperComputing). Reviews will be performed by the organizers and volunteers ("reviewers"), and can be made public upon the authors' request (see ADAPT). Quality and efficiency metrics will be collected for each submission, and displayed on a live ReQuEST scoreboard similar to this open CK repository.
Step 1: Collaborate on converting your workflows to CK
If your submission is not in the CK format, we will help you add a portable CK workflow for your algorithm while reusing available CK packages and modules shared by the community (see CK Getting Started Guides, CK ReQuEST workflow example to explore MobileNets on ARM GPUs, shared CK packages, CK software detection plugins, reusable CK modules (unified scripts and tool wrappers) and CK repositories with AI/ML workflows). You may choose how to communicate with us during this step: either privately via HotCRP, semi-privately via a dedicated Slack channel with all authors and reviewers, or, preferably, publicly via the CK Slack channel or the CK mailing list (thus making the community immediately aware of your artifact).
Step 2: Collaborate on validating your results
We will form a ReQuEST artifact evaluation committee (AEC) from the organizers and volunteers ("reviewers"). The AEC's task is to objectively evaluate submissions on appropriate hardware platforms, reproduce results and aggregate them on a multi-objective public scoreboard. Artifact evaluation (AE) will be a friendly and interactive process between the authors and the reviewers, with the goal of making the artifacts as useful as possible for the community. For example, the reviewers may encounter unexpected problems and ask the authors for help in fixing them.
Again, the authors can communicate with the reviewers privately via HotCRP, semi-privately via Slack, or publicly by opening tickets in shared repositories (see examples 1 and 2) and/or via the CK mailing list. If any of the organizers submit their workflows (mainly to provide reference implementations), their submissions will go through public evaluation.
Step 3: Collaborate on visualizing your results on a public scoreboard
Due to the multi-faceted nature of the competition, submissions will not be ranked according to a single metric (as this often results in over-engineered solutions). Instead, the AEC will assess their Pareto optimality on two or more metrics exposed by the authors. As such, there will not be a single winner, but rather better and worse designs based on their relative Pareto optimality (up to 3 design points are allowed per submission). We will collaborate with the authors to correctly visualize the results and SW/HW/model configurations on a public scoreboard while grouping them according to categories of their choice (e.g. embedded vs. server). A unique submission may define a category in its own right. To win, the results of an entry will normally lie close to the Pareto-optimal frontier in its category. However, a winning entry can also be praised for its originality, reproducibility, adaptability, scalability, portability, ease of use, etc.
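To make the notion of Pareto optimality concrete, here is a minimal Python sketch for two metrics: accuracy (higher is better) and latency (lower is better). It is an illustration only, not the AEC's actual evaluation code; the `DesignPoint` type, the function names and all numbers are hypothetical. A design point is on the frontier if no other point is at least as good on both metrics and strictly better on at least one.

```python
from typing import List, NamedTuple


class DesignPoint(NamedTuple):
    name: str
    accuracy: float    # e.g. top-1 accuracy; higher is better
    latency_ms: float  # e.g. time per image; lower is better


def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """True if a is no worse than b on both metrics and better on one."""
    no_worse = a.accuracy >= b.accuracy and a.latency_ms <= b.latency_ms
    strictly_better = a.accuracy > b.accuracy or a.latency_ms < b.latency_ms
    return no_worse and strictly_better


def pareto_frontier(points: List[DesignPoint]) -> List[DesignPoint]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical design points; the values are made up for illustration.
points = [
    DesignPoint("A", 0.70, 40.0),
    DesignPoint("B", 0.75, 90.0),
    DesignPoint("C", 0.68, 55.0),   # dominated by A (less accurate, slower)
    DesignPoint("D", 0.75, 120.0),  # dominated by B (same accuracy, slower)
]

frontier = pareto_frontier(points)
frontier_names = [p.name for p in frontier]
print(frontier_names)  # ['A', 'B']
```

Neither A nor B dominates the other: A is faster while B is more accurate, so both remain on the frontier as different trade-offs. This is why such a comparison yields a set of good designs rather than a single winner.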
Step 1: Present at the ReQuEST workshop at ASPLOS'18
We will announce accepted SW/HW/model configurations at the end of February, and invite the authors to present their work at the 1st ReQuEST workshop co-located with ASPLOS 2018 (ACM conference on Architectural Support for Programming Languages and Operating Systems, which is the premier forum for multidisciplinary systems research spanning computer architecture and hardware, programming languages and compilers, operating systems and networking). This will give the authors an opportunity to share their research and implementation insights with the research community as well as discuss future R&D directions.
A common academic and industrial panel will be held at the end of the workshop to discuss how to improve the common SW/HW co-design methodology and infrastructure for deep learning and other real-world workloads.
Step 2: Publish in the ACM Digital Library
The authors of the winning submissions will publish their extended abstracts with an Artifact Appendix and related artifacts in the ACM Digital Library (even if their techniques have already been published, since the workshop focuses on validated and reusable artifacts). Furthermore, we have partnered with ACM to award "available / reusable / replicated" badges to all the winning artifacts. This will make them discoverable via the ACM Digital Library (check this out in the ACM DL advanced search by selecting "Artifact Badge" as a field and then choosing any badge you wish).
After the workshop, we will prepare a public report for the ReQuEST Advisory/Industrial Board. The board members will provide their feedback on the results, collaborate on a common methodology for reproducible evaluation and optimization, suggest realistic workloads, help provide the Artifact Evaluation Committee with access to rare hardware platforms for future tournaments, and provide prizes for distinguished entries. We will use our practical experience reproducing experimental results from ReQuEST submissions to help set up artifact evaluation at the upcoming SysML 2019, and to suggest new algorithms for inclusion in the MLPerf benchmark.