|  |  | @ -5,14 +5,14 @@ To view the architecture of the ONNX networks, you can use [netron](https://netr | 
			
		
	
		
		
			
				
					
					|  |  |  | ### Supercombo input format (Full size: 393738 x float32) |  |  |  | ### Supercombo input format (Full size: 393738 x float32) | 
			
		
	
		
		
			
				
					
					|  |  |  | * **image stream** |  |  |  | * **image stream** | 
			
		
	
		
		
			
				
					
					|  |  |  |   * Two consecutive images (256 * 512 * 3 in RGB) recorded at 20 Hz : 393216 = 2 * 6 * 128 * 256 |  |  |  |   * Two consecutive images (256 * 512 * 3 in RGB) recorded at 20 Hz : 393216 = 2 * 6 * 128 * 256 | 
			
		
	
		
		
			
				
					
					|  |  |  |     * Each 256 * 512 image is represented in YUV with 6 channels : 6 * 128 * 256 |  |  |  |     * Each 256 * 512 image is represented in YUV420 with 6 channels : 6 * 128 * 256 | 
			
				
				
			
		
	
		
		
	
		
		
			
				
					
					|  |  |  |       * Channels 0,1,2,3 represent the full-res Y channel and are represented in numpy as Y[::2, ::2], Y[::2, 1::2], Y[1::2, ::2], and Y[1::2, 1::2] |  |  |  |       * Channels 0,1,2,3 represent the full-res Y channel and are represented in numpy as Y[::2, ::2], Y[::2, 1::2], Y[1::2, ::2], and Y[1::2, 1::2] | 
			
		
	
		
		
			
				
					
					|  |  |  |       * Channel 4 represents the half-res U channel |  |  |  |       * Channel 4 represents the half-res U channel | 
			
		
	
		
		
			
				
					
					|  |  |  |       * Channel 4 represents the half-res V channel |  |  |  |       * Channel 4 represents the half-res V channel | 
			
		
	
		
		
			
				
					
					|  |  |  | * **desire** |  |  |  | * **desire** | 
			
		
	
		
		
			
				
					
					|  |  |  |   * one-shot encoded vector to command model to execute certain actions, bit only needs to be sent for 1 frame : 8 |  |  |  |   * one-hot encoded vector to command model to execute certain actions, bit only needs to be sent for 1 frame : 8 | 
			
				
				
			
		
	
		
		
	
		
		
			
				
					
					|  |  |  | * **traffic convention** |  |  |  | * **traffic convention** | 
			
		
	
		
		
			
				
					
					|  |  |  |   * one-shot encoded vector to tell model whether traffic is right-hand or left-hand traffic : 2 |  |  |  |   * one-hot encoded vector to tell model whether traffic is right-hand or left-hand traffic : 2 | 
			
				
				
			
		
	
		
		
	
		
		
			
				
					
					|  |  |  | * **recurrent state** |  |  |  | * **recurrent state** | 
			
		
	
		
		
			
				
					
					|  |  |  |   * The recurrent state vector that is fed back into the GRU for temporal context : 512 |  |  |  |   * The recurrent state vector that is fed back into the GRU for temporal context : 512 | 
			
		
	
		
		
			
				
					
					|  |  |  | 
 |  |  |  | 
 | 
			
		
	
	
		
		
			
				
					|  |  | 
 |