Papers
Autoscheduling: http://graphics.cs.cmu.edu/projects/halidesched/
Examples
Sorting and median: https://github.com/halide/Halide/blob/e9ece5ee8ee9cb62295d5e7e5409e54390bb01b4/test/correctness/sort_exprs.cpp
CMake example for Halide with AOT compilation and minimal scaffolding: https://github.com/halide/Halide/blob/master/apps/wavelet/CMakeLists.txt
Tiling for GPUs w/out enough memory: https://github.com/halide/Halide/blob/master/apps/interpolate/interpolate.cpp#L155
NL Means implementations: https://github.com/ravi-teja-mullapudi/Halide/tree/master/apps/non_local_means
Use GenGen for AOT compilation to create a single static library that includes all targets (no need for separate binaries for each architecture): http://stackoverflow.com/a/38704281/14878
Interesting
Halide demosaicer for RawTherapee: https://github.com/Beep6581/RawTherapee/issues/2934. Source for the code mentioned there: https://github.com/bobobo1618/RawTherapee/blob/ca0c7e0663d6217bf9c153e38b5fb04654b3abb6/rtengine/halide/README.md
Fast GPU NL means: https://users.soe.ucsc.edu/~milanfar/publications/conf/PPNLM_ICIP16_Final.pdf