* squash
* bump tg
* bump tg
* debump tinygrad
* bump tinygrad
* bump tg
* Skip init iteration
* fixes
* cleanups
* skip first test sample
* typos
* linter unhappy
* update cpu usage
* OPENCL just zeros for now
* imports
* Try printing
* Runs again, but slower
* unused import
* Allow more buffer with tg and all on gpu
* bump tinygrad
* seems ok
* stricter timings for driving, looser for dm
* try llvm
* check nvidia
* More timeout for now
* make test pass
* Revert "try llvm"
This reverts commit ef136e478320101fea262bae3579e558da991902.
* small fixes
* whitespace
* revert test timeout
* No model runners
* Always CPU always fast
* No onnx runtime GPU
* more cores
* cleanup
* Is this faster
* Is this faster
* at least runs
* FP32 is faster than 16
* fix deps
* whitespace
* comment
---------
Co-authored-by: Adeeb Shihadeh <adeebshihadeh@gmail.com>
* Started work on model runner refactor
* Fixed some compile errors
* everything compiles
* Fixed bug in SNPEModel
* updateInput -> setInputBuffer
* I understand nothing
* whoops lol
* use std::string instead of char*
* Move common logic into RunModel
* formatting fix
old-commit-hash: c9f00678af
* delete unused stuff
* remove CL interceptor from thneed since we don't use SNPE anymore
* remove dead files from release
* that's removed
* oops, didn't save
old-commit-hash: 6c39382d71
* pc thneed prereqs
* ugh, out of date
* that can stay private
* memcpy here is fine in SNPE variant
* release files
* thneed docs don't work anymore. they didn't look too useful
Co-authored-by: Comma Device <device@comma.ai>
old-commit-hash: b6e355a933
* get log
* simplify two nonsense
* not needed
* libyuv is a joke
* clean up
* try small
* fast but not bad
* working
* clean up driverview
* simplified
* that's mirrored
* smol
* tweak
* ref is screen
* w/ ee
* update camera model
* no if TICI
* start
* update pose thresh
* less cpu more dsp
* new libyuv
* new snpe
* add files
* test
* should be fast
* update out len
* trigger test
* use master snpe
* add cereal
* update cereal
* refactor parsing
* missing ;
* get
* wrong type
* test model
* use driver data
* 10829278-72fe-4283-a118-2cef959ce174/1550
* no pf
* adapt driverview
* ;
* rhd learner
* update libyuv build x64
* ad4337ea
* remove blink slack
* test
* no
* use toggle
* b16
* fix for nv12
* 5b02cff5 both
* update test
* update cereal
* update cereal
* update cereal
* v2 packets
* revert libyuv
* no /
* update snpemodel
* ;
* memcpy
* fix test
* use toggle in driverview
* update power
* update replay
* Revert "update replay"
This reverts commit 1d0979ca59.
* update model ref
* halve cpu
* fake 8bit onnx runner
* same thresh as report
* cereal master
Co-authored-by: Comma Device <device@comma.ai>
old-commit-hash: d6c07a6b15
* update cereal
* run but not use
* log distraction type
* regression scaling
* clean up naming
* add calib buf
* add to header
* fake model
* no calib model
* adjust threshs
* 018a305f
* fix bn
* tweak1
* tweak2
* 0ff2/666
* tweak3
* t4
* t5
* fix out of bound
* skip when replaying old segments
* update ref
* fix onnxmodel
* get calib
* update model replay refs
* up ref
old-commit-hash: de4031c98e
* Added wide cam vipc client and bigmodel transform logic
* Added wide_frame to ModelState, should still work normally
* Refactored image input into addImage method, should still work normally
* Updated thneed/compile.cc
* Bigmodel, untested: 44f83118-b375-4d4c-ae12-2017124f0cf4/200
* Have to initialize extra buffer in SNPEModel
* Default parameter value in the wrong place I think
* Move USE_EXTRA to SConscript
* New model: 6c34d59a-acc3-4877-84bd-904c10745ba6/250
* move use extra check to runtime, not on C2
* this is always true
* more C2 checks
* log if frames are out of sync
* more logging on no frame
* store in pointer
* print sof
* add sync logic
* log based on sof difference as well
* keep both models
* less assumptions
* define above thneed
* typo
* simplify
* no need for second client if main is already wide
* more comments update
* no optional reference
* more logging to debug lags
* add to release files
* both defines
* New model: 6831a77f-2574-4bfb-8077-79b0972a2771/950
* Path offset no longer relevant
* Remove duplicate execute
* Moved bigmodel back to big_supercombo.dlc
* add wide vipc stream
* Tici must be tici
* Needs state too
* add wide cam support to model replay
* handle syncing better
* ugh, c2
* print that
* handle ecam lag
* skip first one
* so close
* update refs
Co-authored-by: mitchellgoffpc <mitchellgoffpc@gmail.com>
Co-authored-by: Harald Schafer <harald.the.engineer@gmail.com>
Co-authored-by: Adeeb Shihadeh <adeebshihadeh@gmail.com>
Co-authored-by: Comma Device <device@comma.ai>
old-commit-hash: 85efde269d
* add thneed optimizer
* local work group opt
* kernels and final mods
* release files
* build system touchups
* fix kernel path, rand inputs for self test
* broken since extra is gone
* update model replay ref
Co-authored-by: Comma Device <device@comma.ai>
old-commit-hash: 90beaebefb
* use cstring instead of string.h
* use cstdio instead of stdio.h
* remove inttypes.h
* use cstdlib instead of stdlib.h
* use cstdint instead of stdint.h
* #include <cstddef>
* cstdlib
* use cmath
* remove stddef.h
* use cassert
* use csignal
* use ctime
* use cerrno
* rebase master
old-commit-hash: c53cb5d570
* no need to malloc one extra byte
* combine two read_file into a faster one
* cleanup #include
* use resize
* apply suggestions from review
* space
* rebase master
old-commit-hash: fe2f63849a
* enable Wunused, first pass
* unused stuff in snpe model
* these are used on phone
* handle sigint and sigterm in modeld
* fix phone build
* camera qcom
* QCOM build works
* delete unused camerad vars
Co-authored-by: Comma Device <device@comma.ai>
old-commit-hash: eb1aa3d831
* add thneed self test
* don't do the memset in thneed, shouldn't matter though
Co-authored-by: Comma Device <device@comma.ai>
old-commit-hash: 6c0ad1e675
* thneed runs the model
* thneed is doing the hooking
* set kernel args
* thneeding the buffers
* print the images well
* thneeds with better buffers
* includes
* disasm adreno
* parse packets
* disasm works
* disasm better
* more thneeding
* much thneeding
* much more thneeding
* thneed works i think
* thneed is patient
* thneed works
* 7.7%
* gpuobj sync
* yay, it mallocs now
* cleaning it up, Thneed
* sync objs and set power
* thneed needs inputs and outputs
* thneed in modeld
* special modeld runs
* can't thneed the DSP
* test is weird
* thneed modeld uses 6.4% CPU
* add thneed to release
* move to debug
* delete some junk from the pr
* always track the timestamp
* timestamp hacks in thneed
* create a new command queue
* fix timestamp
* pretty much back to what we had, you can't use SNPE with thneed
* improve thneed test
* disable save log
Co-authored-by: Comma Device <device@comma.ai>
old-commit-hash: 302d06ee70
* add traffic convention
* hope this works
* no comment
* latest and greatest
* big gru model
* 1af55c7d-ee15-414a-9e98-a0cb08c3441f/75
* much later in training
* wrong temporal size
* converged
* fix lane changes
old-commit-hash: d3edc594ce
* cleanup simulator files
* minor updates
* update readme
* keras runner builds
* hmm, still doesn't work
* keras runner works
* should work with python3 keras mod
* touchups
old-commit-hash: c50c718293