Collective Knowledge Aggregator proof-of-concept

This page is outdated! New version is available here.

Distinct solutions after online classification (auto/crowd-tune GCC compiler flags (minimize execution time))

Scenario UID: 8289e0cf24346aa7 (experiment.tune.compiler.flags.gcc.e)
Data UID: 047388119420623f
Discuss (optimizations to improve compilers, semantic/data set/hardware features to improve predictions, etc.): GitHub wiki, Google group
Download: [ All solutions in JSON ], [ Solutions' classification in JSON ]
Reproduce all (with reactions): ck replay 8289e0cf24346aa7:047388119420623f
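
The replay command above joins the scenario UID and the data UID with a colon. A minimal Python sketch of assembling that command line is shown here; the helper name `build_replay_command` and the 16-hex-digit check are assumptions made for illustration (based only on the two UIDs shown on this page), not part of CK itself. The sketch only builds and prints the command, it does not run it.

```python
import shlex

HEX_DIGITS = "0123456789abcdef"

def build_replay_command(scenario_uid, data_uid):
    """Return the argv list for `ck replay <scenario>:<data>` without running it.

    Hypothetical helper: the UID check assumes 16 lowercase hex characters,
    which matches the two UIDs shown on this page.
    """
    for uid in (scenario_uid, data_uid):
        if len(uid) != 16 or any(c not in HEX_DIGITS for c in uid):
            raise ValueError(f"not a 16-digit hex UID: {uid!r}")
    return ["ck", "replay", f"{scenario_uid}:{data_uid}"]

cmd = build_replay_command("8289e0cf24346aa7", "047388119420623f")
print(shlex.join(cmd))  # ck replay 8289e0cf24346aa7:047388119420623f
```

Returning an argv list rather than a single string keeps the command safe to hand to `subprocess.run` later, on a machine where CK is installed.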
Compiler: GCC 4.9
CPU: Qualcomm Technologies, Inc SM8150
Improvement key IK1: Main kernel execution time speedup [min]
Improvement key IK2: Code size improvement
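
As a rough illustration of how the two improvement keys can be read, the sketch below treats an improvement as the ratio of the reference value to the tuned value, so values above 1 are better and values below 1 are worse. This definition, the reading of "[min]" as "minimum over repeated runs", and all numbers in the example are assumptions chosen for illustration; none of them are measurements from this page or confirmed by it.

```python
def improvement(reference, tuned):
    """Ratio reference/tuned for a 'smaller is better' metric.

    Assumed definition: > 1 means the tuned build is better, < 1 worse.
    """
    if tuned <= 0:
        raise ValueError("tuned value must be positive")
    return reference / tuned

def min_time_speedup(ref_times, tuned_times):
    """Speedup computed from the minimum time of each set of repetitions."""
    return improvement(min(ref_times), min(tuned_times))

# Hypothetical measurements, NOT taken from this page.
ref_times = [2.10, 2.00, 2.05]      # seconds, reference build (e.g. -O3)
tuned_times = [0.52, 0.50, 0.51]    # seconds, tuned build
ik1 = min_time_speedup(ref_times, tuned_times)   # 2.00 / 0.50
ik2 = improvement(10_000, 40_000)                # code size in bytes

print(ik1, ik2)  # 4.0 0.25
```

Under this reading, a solution with IK1 well above 1 and IK2 well below 1 trades a larger binary for faster execution.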

Improvements (<4% variation) and distinct workload for highest improvement

Solution S1
  Solution UID: 9b6a40ef8524526b
  IK1: 3.99
  IK2: 0.26
  New distinct optimization choices: -O3 -fbranch-target-load-optimize -fcaller-saves -fno-cx-fortran-rules -fdelete-null-pointer-checks -fdevirtualize -fno-inline-functions-called-once -fno-inline-small-functions -fno-ira-hoist-pressure -fno-isolate-erroneous-paths-attribute -fkeep-inline-functions -fno-loop-nest-optimize -flto -fno-function-cse -fno-peephole -fpredictive-commoning -free -fno-schedule-insns2 -fno-shrink-wrap -ftree-dce -fno-tree-loop-linear -ftree-pre -fno-tree-tail-merge -ftree-vrp -fno-web -falign-jumps=0 --param min-vect-loop-bound=2 --param large-function-growth=43 --param large-unit-insns=10987 --param large-stack-frame-growth=1365 --param max-unrolled-insns=259 --param max-completely-peel-times=28 --param scev-max-expr-complexity=5 --param vect-max-version-for-alias-checks=14 --param min-size-for-stack-sharing=52 --param l1-cache-size=51 --param use-canonical-types=0 --param min-insn-to-prefetch-ratio=10 --param ipa-max-agg-items=2 --param uninit-control-dep-attempts=105
  Ref: -O3
  Best species: 1
  Worst species: 0
  Touched: 2
  Iters: 1
  Program: milepost-codelet-mibench-automotive-bitcount-src-bitcnts-codelet-1-1
  CMD: default
  CPU freq (MHz): 825.6, 825.6, 825.6, 825.6, 825.6, 825.6, 825.6, 825.6
  Cores: 1
  Platform: SAMSUNG SM-G970U
  OS: Android 10

Solution S2
  Solution UID: dcbed24f50a4307b
  IK1: 1.36
  IK2: 1.03
  New distinct optimization choices: -O3 -fconserve-stack -fno-dse -fno-expensive-optimizations -fno-indirect-inlining -fno-ipa-reference -fkeep-static-consts -fno-move-loop-invariants -fno-toplevel-reorder -fomit-frame-pointer -foptimize-sibling-calls -fprefetch-loop-arrays -fno-reorder-functions -fno-rerun-cse-after-loop -fno-sched-pressure -fselective-scheduling2 -fno-signaling-nans -ftree-fre -ftree-loop-vectorize -fsched-stalled-insns-dep=0 --param max-inline-insns-recursive-auto=132 --param comdat-sharing-probability=8 --param gcse-after-reload-critical-fraction=13 --param gcse-unrestricted-cost=6 --param max-hoist-depth=59 --param max-unswitch-insns=50 --param sms-dfa-history=0 --param vect-max-peeling-for-alignment=59 --param selsched-max-sched-times=3 --param sched-mem-true-dep-cost=0 --param max-jump-thread-duplication-stmts=2 --param max-vartrack-expr-depth=16 --param tm-max-aggregate-size=8 --param lto-partitions=53 --param asan-instrument-reads=1
  Ref: -O3
  Best species: 1
  Worst species: 0
  Touched: 2
  Iters: 1
  Program: milepost-codelet-mibench-automotive-bitcount-src-bitcnt-1-codelet-2-1
  CMD: default
  CPU freq (MHz): 1171.2, 1171.2, 1171.2, 1171.2, 1171.2, 1171.2, 1171.2, 1171.2
  Cores: 1
  Platform: SAMSUNG SM-G970U
  OS: Android 10
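
Each solution's optimization choices form a single long string mixing an optimization level, on/off `-f` flags, valued `-f` flags, and `--param name=value` pairs. A minimal Python sketch for splitting such a string into those groups follows; the function name `parse_gcc_choices` and the returned structure are assumptions for illustration, and the sample string is an abbreviated excerpt of solution S1's choices rather than the full list.

```python
import shlex

def parse_gcc_choices(choices):
    """Split a GCC flag string into (opt_level, flags, params).

    flags maps a flag name to True (-fname), False (-fno-name),
    or a string value (-fname=value); params maps --param names to ints.
    """
    tokens = shlex.split(choices)
    opt_level, flags, params = None, {}, {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "--param" and i + 1 < len(tokens):
            # "--param name=value" spans two tokens.
            name, _, value = tokens[i + 1].partition("=")
            params[name] = int(value)
            i += 2
            continue
        if tok.startswith("-O"):
            opt_level = tok
        elif tok.startswith("-fno-"):
            flags[tok[len("-fno-"):]] = False
        elif tok.startswith("-f"):
            name, sep, value = tok[len("-f"):].partition("=")
            flags[name] = value if sep else True
        i += 1
    return opt_level, flags, params

# Abbreviated excerpt of solution S1's choices (not the full list).
sample = ("-O3 -fbranch-target-load-optimize -fcaller-saves "
          "-fno-cx-fortran-rules -falign-jumps=0 "
          "--param min-vect-loop-bound=2 --param large-function-growth=43")
opt_level, flags, params = parse_gcc_choices(sample)
print(opt_level, flags, params)
```

A structured form like this makes it easier to compare two solutions, for example by listing which flags S1 disables that S2 leaves untouched.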


Developed by Grigori Fursin           
Implemented as a CK workflow