					@ -2,7 +2,7 @@ | 
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					To view the architecture of the ONNX networks, you can use [netron](https://netron.app/) | 
					 | 
					 | 
					 | 
					To view the architecture of the ONNX networks, you can use [netron](https://netron.app/) | 
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					
 | 
					 | 
					 | 
					 | 
					
 | 
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					## Supercombo | 
					 | 
					 | 
					 | 
					## Supercombo | 
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					### Supercombo input format (Full size: 393738 x float32) | 
					 | 
					 | 
					 | 
					### Supercombo input format (Full size: 799906 x float32) | 
				
			
			
				
				
			
		
	
		
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					* **image stream** | 
					 | 
					 | 
					 | 
					* **image stream** | 
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					  * Two consecutive images (256 * 512 * 3 in RGB) recorded at 20 Hz : 393216 = 2 * 6 * 128 * 256 | 
					 | 
					 | 
					 | 
					  * Two consecutive images (256 * 512 * 3 in RGB) recorded at 20 Hz : 393216 = 2 * 6 * 128 * 256 | 
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					    * Each 256 * 512 image is represented in YUV420 with 6 channels : 6 * 128 * 256 | 
					 | 
					 | 
					 | 
					    * Each 256 * 512 image is represented in YUV420 with 6 channels : 6 * 128 * 256 | 
				
			
			
		
	
	
		
		
			
				
					@ -16,11 +16,11 @@ To view the architecture of the ONNX networks, you can use [netron](https://netr | 
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					      * Channel 4 represents the half-res U channel | 
					 | 
					 | 
					 | 
					      * Channel 4 represents the half-res U channel | 
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					      * Channel 5 represents the half-res V channel | 
					 | 
					 | 
					 | 
					      * Channel 5 represents the half-res V channel | 
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					* **desire** | 
					 | 
					 | 
					 | 
					* **desire** | 
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					  * one-hot encoded vector to command model to execute certain actions, bit only needs to be sent for 1 frame : 8 | 
					 | 
					 | 
					 | 
					  * buffer of one-hot encoded vectors that command the model to execute certain actions; the buffer must cover the past 5 seconds (at 20 FPS) : 100 * 8 | 
				
			
			
				
				
			
		
	
		
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					* **traffic convention** | 
					 | 
					 | 
					 | 
					* **traffic convention** | 
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					  * one-hot encoded vector to tell model whether traffic is right-hand or left-hand traffic : 2 | 
					 | 
					 | 
					 | 
					  * one-hot encoded vector to tell model whether traffic is right-hand or left-hand traffic : 2 | 
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					* **recurrent state** | 
					 | 
					 | 
					 | 
					* **feature buffer** | 
				
			
			
				
				
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					  * The recurrent state vector that is fed back into the GRU for temporal context : 512 | 
					 | 
					 | 
					 | 
					  * A buffer of intermediate features that is appended to the current feature to form a 5-second temporal context (at 20 FPS) : 99 * 128 | 
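
The image stream layout described above can be sketched with numpy. This is a minimal illustration, not the project's actual preprocessing: the function name `pack_yuv420` is hypothetical, and the way the full-res Y plane is split across channels 0 to 3 (strided subsampling) is an assumption, since only channels 4 and 5 are described in the lines shown here.

```python
import numpy as np

def pack_yuv420(y, u, v):
    """Pack one YUV420 frame into a (6, H/2, W/2) float32 array (hypothetical helper)."""
    h, w = y.shape
    assert u.shape == (h // 2, w // 2) and v.shape == (h // 2, w // 2)
    return np.stack([
        y[0::2, 0::2],  # channels 0-3: ASSUMED strided subsampling of the full-res Y plane
        y[0::2, 1::2],
        y[1::2, 0::2],
        y[1::2, 1::2],
        u,              # channel 4: half-res U channel
        v,              # channel 5: half-res V channel
    ]).astype(np.float32)

# Dummy planes standing in for two consecutive 256 x 512 frames.
frames = []
for _ in range(2):
    y = np.zeros((256, 512), dtype=np.uint8)
    u = np.zeros((128, 256), dtype=np.uint8)
    v = np.zeros((128, 256), dtype=np.uint8)
    frames.append(pack_yuv420(y, u, v))

# Two frames of 6 channels each: 2 * 6 * 128 * 256 values in total.
image_stream = np.concatenate(frames)
print(image_stream.shape, image_stream.size)
```

With real camera data, `y`, `u` and `v` would come from the decoded YUV420 frame instead of zero arrays.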
				
			
			
				
				
			
		
	
		
		
	
		
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					
 | 
					 | 
					 | 
					 | 
					
 | 
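
The new desire and feature buffer inputs both carry 5 seconds of history at 20 FPS. One way to maintain them is a pair of fixed-length rolling buffers, sketched below. All names here (`step`, `one_hot`, the constants) are hypothetical, and the zero-filled initial history and the meaning of each desire index are assumptions not stated in the README.

```python
from collections import deque

FPS = 20
HISTORY_SECONDS = 5
N_FRAMES = FPS * HISTORY_SECONDS  # 100 frames of history
DESIRE_LEN = 8
FEATURE_LEN = 128

def one_hot(index, length):
    """Return a list of `length` floats with a single 1.0 at `index`."""
    vec = [0.0] * length
    vec[index] = 1.0
    return vec

# ASSUMPTION: history starts zero-filled. deque(maxlen=...) drops the oldest entry on append.
desire_buffer = deque([[0.0] * DESIRE_LEN for _ in range(N_FRAMES)], maxlen=N_FRAMES)
feature_buffer = deque([[0.0] * FEATURE_LEN for _ in range(N_FRAMES - 1)], maxlen=N_FRAMES - 1)

def step(desire_index, previous_feature):
    """Push the newest desire and the feature from the previous model run, then flatten."""
    desire_buffer.append(one_hot(desire_index, DESIRE_LEN))
    feature_buffer.append(list(previous_feature))
    flat_desire = [x for vec in desire_buffer for x in vec]      # 100 * 8 values
    flat_features = [x for vec in feature_buffer for x in vec]   # 99 * 128 values
    return flat_desire, flat_features

flat_desire, flat_features = step(3, [0.5] * FEATURE_LEN)
print(len(flat_desire), len(flat_features))
```

The feature buffer holds 99 entries rather than 100 because, per the description above, the model appends the current feature itself to complete the 5-second context.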
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					
 | 
					 | 
					 | 
					 | 
					
 | 
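
As a sanity check on the "Full size" figures in the heading, the per-input sizes can be summed directly. The old layout adds up exactly to 393738. The inputs visible in this diff for the new layout add up to 406690, which is 393216 short of the stated 799906. That gap is exactly the size of one more image stream, so a plausible reading (an assumption, since those lines are collapsed in this view) is that the unchanged lines between the two hunks describe a second image input.

```python
# Sizes of the old inputs, as listed on the removed side of the diff.
old_inputs = {
    "image stream": 2 * 6 * 128 * 256,
    "desire": 8,
    "traffic convention": 2,
    "recurrent state": 512,
}

# Sizes of the new inputs that are visible in this diff.
new_inputs_listed = {
    "image stream": 2 * 6 * 128 * 256,
    "desire": 100 * 8,
    "traffic convention": 2,
    "feature buffer": 99 * 128,
}

OLD_STATED_TOTAL = 393738
NEW_STATED_TOTAL = 799906

old_total = sum(old_inputs.values())
new_listed_total = sum(new_inputs_listed.values())
# Part of the new stated total that the visible lines do not account for.
unlisted = NEW_STATED_TOTAL - new_listed_total

print(old_total, new_listed_total, unlisted)
```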
				
			
			
		
	
		
		
			
				
					
					 | 
					 | 
					 | 
					### Supercombo output format (Full size: XXX x float32) | 
					 | 
					 | 
					 | 
					### Supercombo output format (Full size: XXX x float32) | 
				
			
			
		
	
	
		
		
			
				