Grigori Fursin, PhD


I am a British expat based in the Greater Paris area. I am using AI to co-design more efficient and cost-effective AI systems at FlexAI. I am also a Founder and Architect of cKnowledge.org and cTuning.org, Former Founder and Architect of a collaborative AI benchmarking and optimization platform acquired by OctoAI (now Nvidia), Former VP of MLOps at OctoAI, Former Co-Director of the Intel Exascale Lab, Former Senior Tenured Scientist at INRIA, and Former Adjunct Professor at the University of Paris-Saclay with a PhD in Self-Optimizing Compilers and Systems from the University of Edinburgh.


Brief biography:

I am a forward-thinking and agile computer scientist, software engineer, educator, startup and investor advisor, technical director, open source contributor, and open science advocate. I hold a PhD in self-optimizing compilers and systems from the University of Edinburgh. My interdisciplinary background spans computer engineering (with expertise in co-designing the full hardware-software stack from the cloud to the edge), machine learning, AI systems, data analytics, workflow automation, knowledge management, and physics and electronics.

This foundation allowed me, as an undergraduate in the 1990s, to prototype Hopfield-based analog semiconductor neural networks with full software/hardware automation for training and inference. Later, during my PhD at the University of Edinburgh and postdoctoral research at INRIA, it helped me pioneer and champion the use of machine learning, AI, crowd-tuning, and crowd-learning to co-design more efficient, cost-effective, and scalable computer systems (including compilers, runtimes, software, and hardware).

This work addressed the growing complexity of modern systems, and served as a precursor to AutoML, workflow automation, agent-based optimization, and federated learning. It also enabled me to initiate and support open science and reproducibility initiatives starting in 2008, when I launched cTuning.org (followed by cKnowledge.org with my Collective Knowledge Technology aka CK in 2014) and released all my research code, data, models, and experiments for our ML-based self-optimizing compiler—considered the first of its kind (ACM TechTalk'21). I was honored to receive the ACM CGO Test of Time Award, multiple Best Paper Awards, the INRIA Award for Scientific Excellence, and the EU HiPEAC Technology Transfer Award for this research and open-source tools.

After serving as a senior tenured research scientist at INRIA, an adjunct professor at the University of Paris-Saclay, and co-director of the Intel Exascale Lab, I transitioned my research and open-source tools into industry. I first established a non-profit cTuning foundation and co-founded a successful engineering company to automatically benchmark and optimize deep learning across diverse software and hardware stacks, with a focus on mobile phones and edge devices. I helped bootstrap it as CTO and Chief Architect, quickly growing it to $1M+ in revenue with just 4 people, thanks to my CK automation technology. I then joined Entrepreneur First, a highly selective company-building program for scientists and technologists, where I learned to build lean startups and avoid common pitfalls. As a result, I founded and bootstrapped two startups in the fields of performance optimization, MLOps automation, and knowledge management—the latter of which was acquired by OctoAI (now part of NVIDIA).

At the same time, I remained actively involved in community service and open-source initiatives. I helped establish MLCommons and launch reproducibility efforts at ACM and IEEE conferences (cTuning.org/ae). I also introduced a unified artifact appendix, which has since been adopted by major conferences such as ASPLOS, CGO, PPoPP, SuperComputing and MICRO. Finally, I co-organized several successful Quantum Hackathons, including one at Ecole 42 in Paris, where we used my CK workflow automation and platform for collaborative benchmarking and optimization of Quantum workloads (see the Hackathon page and the list of my events).

Throughout my career, I’ve been honored to collaborate with and learn from brilliant minds across leading universities, non-profits, startups, and companies — including Google, Amazon, Meta, Arm, AMD, Intel, IBM, Qualcomm, NVIDIA, Raspberry Pi, OpenAI, Tesla, OctoAI, Neural Magic, Red Hat, Dell, HPE, Lenovo, Apple, INRIA, ACM, IEEE, HiPEAC, MLCommons, the Linux Foundation, and Hugging Face: Acknowledgments (1), Acknowledgments (2), and Acknowledgments (3).

My passion lies in using this knowledge and experience to help startups, established companies, universities, non-profits, researchers, students, and investors rapidly prototype novel ideas, launch innovative deep-tech projects, reduce time to market, and deliver real-world impact through collaborative, reproducible, and automated R&D methodologies, tools, and platforms.

I usually take on roles as a strategic advisor, technical program manager, head of an R&D lab, or individual contributor, helping connect research, engineering, and product teams. I support them in adapting to the complex and rapidly evolving technological landscape, managing project complexity, avoiding common pitfalls, and achieving meaningful progress quickly—even with limited resources and time.

In 2024, I began architecting and prototyping the next generation of automation, self-optimizing and self-learning technologies to make it easier to co-design more efficient and cost-effective AI/ML systems. This effort builds on my existing initiatives, including Collective Mind, virtualized MLOps, MLPerf, Collective Knowledge Playground, and reproducible optimization tournaments - see my white paper for more details, and feel free to get in touch if you are interested in learning more! I also joined FlexAI as Head of the R&D Lab, where I am applying this experience to co-design more efficient software and hardware for AI inference and training.

Feel free to check my latest software developments:

Feel free to explore my key presentations and publications to gain insight into my projects and long-term vision:


Brief summary of my current activities:
  • Founder, President, and Chief Scientist of cTuning.org, a non-profit educational organization and founding member of MLCommons that has been developing open-source tools and methodologies to support reproducibility initiatives, artifact evaluation and open science in collaboration with ACM, IEEE and MLCommons since 2008. Please see the Artifact Evaluation page for more details.
  • Founder and Architect of the Collective Knowledge Playground, an educational platform for learning how to co-design software and hardware to run AI, ML and other emerging workloads efficiently and cost-effectively across diverse models, datasets, software and hardware (trading off performance, power consumption, accuracy, cost and other characteristics). The CK playground leverages the MLCommons CMX workflow automation framework with virtual MLOps developed in collaboration with MLCommons, cTuning.org and other organizations. Please see the arXiv white paper and the online catalog of reusable and virtual automation recipes for MLOps and DevOps.
  • Head of R&D Lab at FlexAI, coordinating efforts to leverage AI for co-designing more efficient and cost-effective AI systems.
    Core technologies used: HuggingFace models and datasets, vLLM, PyTorch, Triton, TensorRT, Nsight, MLPerf, OpenSearch, MLCommons CMX, FastAPI, Docker, Bayesian search, reinforcement learning and LLMs, Nvidia and AMD GPUs.
  • Organizer of reproducibility initiatives and artifact evaluation for AI, ML and Systems conferences and MLPerf benchmarks in collaboration with ACM, IEEE and MLCommons since 2013. I am leading the development of a common interface and automation language to make it easier to rerun and reuse code, data and experiments from published papers - see my ACM Tech Talk'21, ACM REP'23 keynote and white paper'24 for more details.
  • Member of the Program Committee at ACM Conference on Reproducibility and Replicability 2025.

Brief summary of my past activities:
  • founder and co-chair of the MLCommons Task Force on Automation and Reproducibility to modularize and automate MLPerf benchmarks using my CM framework (white paper);
  • author and tech lead of the Collective Mind workflow automation framework (CM) adopted by MLCommons and the Autonomous Vehicle Computing Consortium (AVCC) to modularize MLPerf benchmarks and make it easier to run them across diverse models, data sets, software and hardware from different vendors using portable, reusable and technology-agnostic automation recipes (see the online catalog of MLOps and MLPerf scripts and the online docs to run MLPerf inference benchmarks). I donated this open-source technology to MLCommons to benefit everyone and continue developing it as a community effort. You can learn more about this project in this white paper. In 2025, we split CM development into an extended version of CM (CMX) and a simplified version of CM for MLPerf. I thank our great contributors for their feedback and support;
  • vice president of MLOps at OctoML where I prototyped the first version of CM and CM4MLOps together with the cTuning foundation before donating it to MLCommons to benefit everyone;
  • founder and chief architect of the virtual MLOps platform (cKnowledge.io) acquired by OctoML (now Nvidia);
  • author of the Collective Knowledge technology (CK) powering cKnowledge.io;
  • author of the Artifact Evaluation and Reproducibility checklist (Unified Artifact Appendix) for ACM/IEEE conferences (see example of my artifact appendix at the end of this ASPLOS'24 paper "PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation");
  • co-founder of the CodeReef platform for universal MLOps with Nicolas Essayan;
  • co-director of the Intel Exascale Lab and tech lead for performance analysis, optimization and co-design of high-performance and cost-effective computer systems;
  • senior tenured scientist at INRIA developing the foundations to co-design more efficient and cost-effective computer systems using auto-tuning, machine-learning and run-time adaptation;
  • research associate at the University of Edinburgh;
  • holder of a PhD in computer science from the University of Edinburgh with the Overseas Research Student Award (self-optimizing compilers, run-time systems and software/hardware co-design);
  • recipient of the European technology transfer award, ACM CGO test of time award and INRIA award of scientific excellence for my original research to use AI, ML, federated learning and collective tuning (cTuning) to automate development of high-performance and cost-effective computer systems and reduce R&D costs and time to market by an order of magnitude.

Timeline: