Converting ad-hoc code and data into reusable components with CK Python wrappers and JSON API
CK lets users give a common structure to their local ad-hoc code and data,
and pack them with the associated CK Python wrappers and JSON API into CK repositories
(a: typical ad-hoc experimental packs for Artifact Evaluation;
b: unified and reusable experimental repositories).
Such repositories can be easily shared and reused via public or private services including GitHub, BitBucket and GitLab.
We keep track of some CK repositories and
reusable CK modules.
Feel free to add your own! In the future, we would like to assign DOIs to stable artifacts
and provide a web service that automatically finds their location.
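As a minimal sketch of what such a wrapper could look like, the hypothetical module below follows a dictionary-in, dictionary-out convention: each action takes one JSON-compatible dictionary and returns a dictionary whose `return` key is 0 on success, or a positive code accompanied by an `error` message. The `count_lines` action, its `text` key and the `call_json` helper are invented for illustration and are not part of CK.

```python
import json


def count_lines(i):
    """Hypothetical action: count the non-empty lines in i['text'].

    Input and output are plain dictionaries, so they can be
    serialized to JSON and passed between tools unchanged.
    """
    text = i.get('text')
    if not isinstance(text, str):
        # A non-zero 'return' signals failure; 'error' explains why.
        return {'return': 1, 'error': "'text' must be a string"}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    return {'return': 0, 'lines': len(lines)}


def call_json(action, payload):
    """Call an action with a JSON string and return a JSON string."""
    try:
        i = json.loads(payload)
    except ValueError as exc:
        return json.dumps({'return': 1, 'error': 'bad JSON: %s' % exc})
    if not isinstance(i, dict):
        return json.dumps({'return': 1, 'error': 'input must be a JSON object'})
    return json.dumps(action(i))
```

Because errors travel inside the returned dictionary rather than as exceptions, a caller on another machine or in another language only needs a JSON parser to use such a component.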
Converting ad-hoc experimental scripts into customizable and sustainable CK workflows
Researchers can now use the CK SDK to convert their complex, ad-hoc experimental scripts into unified CK workflows assembled from shared CK artifacts like LEGO bricks.
Furthermore, CK has its own portable package manager
for Linux, Windows, Android and macOS that automatically adapts such workflows to the underlying software,
detects multiple versions of installed dependencies, which can co-exist and be used in parallel
(see available CK software detection plugins),
and automatically installs missing packages.
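The idea of several co-existing versions can be illustrated with a small sketch. It is not CK's package manager: the `detected` registry, the `resolve` helper and the version numbers are all hypothetical. The code only shows how a workflow might pick one of several detected versions, or report that a package still needs to be installed.

```python
def parse_version(s):
    """Turn '7.3.0' into (7, 3, 0) so that versions compare numerically."""
    return tuple(int(p) for p in s.split('.'))


def resolve(detected, name, min_version=None, max_version=None):
    """Pick the newest detected version of `name` within the given bounds.

    `detected` maps a tool name to the version strings found on the
    machine. Returns the chosen version string, or None when nothing
    suitable is installed (the caller would then install a package).
    """
    lo = parse_version(min_version) if min_version else None
    hi = parse_version(max_version) if max_version else None
    candidates = []
    for v in detected.get(name, []):
        pv = parse_version(v)
        if lo is not None and pv < lo:
            continue
        if hi is not None and pv > hi:
            continue
        candidates.append((pv, v))
    if not candidates:
        return None
    return max(candidates)[1]


# Hypothetical result of scanning one machine.
detected = {'gcc': ['4.9.2', '7.3.0', '10.2.0'], 'llvm': ['6.0.0']}
```

Comparing parsed tuples rather than raw strings matters here: as strings, '10.2.0' would sort before '4.9.2'.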
Such an approach can considerably simplify artifact evaluation and validation of experimental results
at conferences and journals.
Furthermore, such workflows can be reused and collaboratively improved by the community
(rather than just archiving Docker images, which quickly become outdated), thus enabling
practical, agile and open research!
For example, the experimental workflow for the CGO'17 article by University of Cambridge researchers, which won the distinguished artifact award, was implemented using CK.
Enabling interactive dashboards and articles connected to CK repositories
CK has an integrated web server allowing users to quickly prototype
various web-based dashboards to run their workflows and analyze experimental results in workgroups.
CK even allows creation of interactive reports and articles assembled from CK artifacts.
Here are some real examples of practical CK dashboards and interactive articles:
Grigori Fursin's personal website with all cross-linked CK artifacts from past R&D:
[ fursin.net/research ]
Crowdsourcing experiments using public or private CK repositories
The integrated CK web server, combined with the unified JSON API of CK components, allows workflows
running on different machines to interact with each other via JSON-based web interfaces.
This, in turn, lets users crowdsource experiments across diverse platforms
(mobile devices, tablets, laptops, servers, cloud, supercomputers) provided by supporters,
similar to SETI@home. Results can be aggregated in public or private CK repositories
for further visualization, analysis and improvement.
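The exchange described above can be sketched as a handler that accepts JSON requests from participating machines and aggregates their results. This is only an illustration under stated assumptions: the `ResultRepository` class, the `submit` and `summary` actions and the field names are hypothetical, the storage is in memory, and the HTTP transport is left out entirely.

```python
import json
from collections import defaultdict


class ResultRepository:
    """Hypothetical in-memory stand-in for a shared results repository."""

    def __init__(self):
        # platform name -> list of reported execution times (seconds)
        self.times = defaultdict(list)

    def handle(self, request_json):
        """Process one JSON request string and return a JSON reply string."""
        try:
            req = json.loads(request_json)
            action = req['action']
        except (ValueError, KeyError, TypeError):
            return json.dumps({'return': 1, 'error': 'malformed request'})

        if action == 'submit':
            try:
                platform = str(req['platform'])
                seconds = float(req['seconds'])
            except (KeyError, TypeError, ValueError):
                return json.dumps({'return': 1, 'error': 'missing or invalid fields'})
            self.times[platform].append(seconds)
            return json.dumps({'return': 0})

        if action == 'summary':
            # Report the best (smallest) time seen on each platform.
            best = {p: min(t) for p, t in self.times.items()}
            return json.dumps({'return': 0, 'best_seconds': best})

        return json.dumps({'return': 1, 'error': 'unknown action'})
```

Since both requests and replies are plain JSON strings, the same handler could sit behind any web server, and clients on phones, laptops or servers need nothing beyond a JSON encoder.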
Collaboratively benchmarking realistic workloads across diverse platforms
CK helps our partners collaboratively benchmark their workloads, such as deep learning,
across diverse platforms. You can try a simple example of compiling and running a shared
workload on your own machine as follows:
$ ck pull repo:ck-autotuning
$ ck pull repo:ctuning-programs
You can list shared programs in the CK format (JSON meta information
describing how to compile and run a shared program with all its dependencies and data sets):
$ ck list program
You can now compile a given program as follows:
$ ck compile program:cbench-automotive-susan --speed
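The commands in this section share one shape: an action, a module with an optional `:entry` suffix, and `--key` or `--key=value` flags. The sketch below shows how such a command line could be turned into a single dictionary request. It is a simplified illustration and not CK's own parser: the key names `action`, `module` and `entry`, and the choice to store value-less flags as `True`, are assumptions made for this example.

```python
def parse_ck_style(argv):
    """Split ['compile', 'program:name', '--speed'] into a dictionary.

    Key names ('action', 'module', 'entry') are chosen for this
    sketch only; flags without a value are stored as True.
    """
    if len(argv) < 2:
        raise ValueError('expected at least an action and a module')
    action, target = argv[0], argv[1]
    # 'program:name' -> module 'program', entry 'name'; entry may be absent.
    module, _, entry = target.partition(':')
    request = {'action': action, 'module': module}
    if entry:
        request['entry'] = entry
    for arg in argv[2:]:
        if not arg.startswith('--'):
            raise ValueError('unexpected argument: %r' % arg)
        key, sep, value = arg[2:].partition('=')
        request[key] = value if sep else True
    return request
```

For instance, `parse_ck_style(['compile', 'program:cbench-automotive-susan', '--speed'])` yields a dictionary with the action, module, entry and a `speed` flag, which is the kind of request a dictionary-based API can consume directly.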
Unifying and crowd-sourcing multi-objective and multi-dimensional optimization and co-design
High-level CK workflows, together with the unified JSON API, allowed us to implement universal, customizable,
multi-objective and multi-dimensional autotuning.
Furthermore, we can now crowdsource exploration of large and non-linear design and optimization spaces
to improve performance, energy, accuracy, memory consumption and other characteristics
or automatically detect bugs across diverse workloads and platforms provided by volunteers.
You can try our shared workflow for crowdsourcing optimization as follows:
$ ck pull repo:ck-crowdtuning
$ ck crowdsource experiment
For example, you can run the shared workflow for collaborative program optimization
with all related artifacts, and start participating in multi-objective crowdtuning,
as follows:
$ ck crowdtune program
You can also crowd-tune GCC on Windows as follows:
$ ck crowdtune program --gcc --target_os=mingw-64
If you have GCC or LLVM compilers installed, you can continuously crowd-tune
their optimization heuristics in quiet mode (for example, overnight) via:
$ ck crowdtune program --llvm --quiet
$ ck crowdtune program --gcc --quiet
This experimental workflow optimizes different shared workloads
for multiple objectives (execution time, code size, energy, compilation time, etc.)
using all exposed design and optimization knobs, while sending the most profitable
optimization choices to the public CK-based server.
The CK server, in turn, performs on-line learning to classify optimizations
versus workloads, which can be useful for compiler and hardware designers and
performance engineers (described in more detail in this article).
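What counts as a profitable choice under several objectives can be sketched with a Pareto filter: a configuration is kept only if no other configuration is at least as good on every objective and strictly better on at least one. This is a generic technique and not necessarily the selection rule used by the CK server; the measurements below are made up for illustration.

```python
def dominates(a, b):
    """True if point a is no worse than b everywhere and better somewhere.

    Points are tuples of objectives where smaller is better,
    e.g. (execution_time_s, code_size_bytes).
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(results):
    """Return the names of non-dominated configurations, in input order."""
    front = []
    for name, point in results.items():
        if not any(dominates(other, point)
                   for other_name, other in results.items()
                   if other_name != name):
            front.append(name)
    return front


# Made-up measurements: (execution time in seconds, code size in bytes).
results = {
    '-O0': (4.0, 9000),
    '-O2': (2.0, 12000),
    '-O3': (1.8, 16000),
    '-Os': (2.6, 8000),
    '-O2 -funroll-loops': (2.1, 15000),
}
```

With these numbers, `pareto_front(results)` keeps `-O2`, `-O3` and `-Os`: `-O0` is both slower and larger than `-Os`, and the unrolled variant is both slower and larger than plain `-O2`. Only the surviving trade-offs would be worth reporting.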
You can even use our small Android application
to crowdsource tuning of GCC and LLVM compiler optimization heuristics
while continuously learning and aggregating optimization results
in the public CK repository.
You can also participate in crowdtuning of popular third-party OpenCL, OpenMP and CUDA-based mathematical libraries
as described here.
Collaboratively creating realistic, representative and diverse training sets
Having a common experimental infrastructure allows us to
build reusable, realistic, diverse,
and continuously evolving training sets in a common format
(programs, data sets, models, unexpected behavior, mispredictions)
with the help of our partners and the community.
See the following examples of shared training sets:
Surviving in a Cambrian AI/SW/HW explosion with the Collective Knowledge and open AI research
Our ultimate personal goal behind CK development is to
a) reinvent computer engineering and make it more collaborative, reproducible and reusable,
b) develop efficient and reliable computer systems from IoT to supercomputers,
c) enable open science via reusable and customizable artifacts, and
d) create a public repository of reusable AI artifacts (models, data sets, tools, etc.)
and portable AI algorithms/workflows (classification, detection, etc.).
This should help us enable open AI research,
boost innovation in science and technology,
get back to our AI-related projects,
develop an artificial brain and have fun ☺!