
Ideally, each benchmark evaluation would involve the sensor undergoing an identical sequence of rotations while running the estimation filter. One method to accomplish this is to mount three sensors on a platform and direct the signals to the hardware systems under evaluation. This is cumbersome, however, and requires more components than were available. Instead, for each evaluation we repeated a sequence of rotations of the sensor lasting approximately 30 seconds in total. The sequence consists of a ±90° rotation in the yaw axis, a ±90° rotation in the roll axis, and a ±90° rotation in the pitch axis. A sample rotation sequence is shown in Fig. 6.2. The duration of every iteration of the algorithm was timed with the internal hardware timers available in the three systems. The algorithm updated every 20 msec, or at 50 Hz. The resulting update duration and the attitude estimate were transmitted to a computer that stored the data.

Several conclusions can be drawn from this benchmark study. For the microcontroller systems the order of the four AHRS algorithms from lowest latency to highest was consistent: SQM, DQM, DQA, DDA. Comparing SQM to DQM, we can see that single precision calculations are faster than double precision ones, all else being equal. This is unsurprising for 32 bit microcontrollers, where double precision calculations can be expected to take twice as long as single precision ones. Because the bulk of the latency is driven by floating point operations, we should expect the total latency to increase by as much as a factor of two. The results bear this out, with an increase of 95% for the UC32, 43% for the OSAVC, and 65% for the Pico.
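As a minimal illustration of the measurement method, the sketch below times each filter update from Python. The names `read_imu` and `ahrs_update` are hypothetical stand-ins; the benchmarked systems ran compiled firmware and read the hardware timers directly rather than going through an OS clock.

```python
import time

UPDATE_PERIOD_S = 0.020  # 50 Hz update rate, matching the benchmark

def read_imu():
    # Stub standing in for the sensor driver; returns gyro, accel, mag triples.
    return (0.0, 0.0, 0.0), (0.0, 0.0, 9.81), (0.2, 0.0, 0.4)

def ahrs_update(gyro, accel, mag):
    # Placeholder for one iteration of the attitude filter under test.
    pass

latencies_us = []
next_deadline = time.perf_counter()
for _ in range(1500):  # roughly 30 s of rotations at 50 Hz
    gyro, accel, mag = read_imu()
    t0 = time.perf_counter_ns()
    ahrs_update(gyro, accel, mag)
    latencies_us.append((time.perf_counter_ns() - t0) / 1e3)
    next_deadline += UPDATE_PERIOD_S
    time.sleep(max(0.0, next_deadline - time.perf_counter()))

print(f"mean latency: {sum(latencies_us) / len(latencies_us):.1f} usec")
```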

Another observation is that automatic code generation introduces significant latency increases. Comparing versions DQM and DQA on the OSAVC, we see an increase of 78 µsec when using the Matlab-generated functions. The comparison is even more striking on the Pico, which exhibits an increase of 130 µsec on average. These differences are notable, but not so large as to discourage the use of automatic code generation. This points to a feasible path for developing algorithms on the computer using Matlab and compiling them for the microcontroller without the introduction of bugs and with a manageable increase in latency. This is important because it demonstrates a means for vehicle developers to implement real-time control or estimation algorithms without needing to be specialists in embedded firmware.

Finally, when comparing quaternions to DCMs, we see that DCMs require longer computation times, ranging from roughly 5% to 15%. This is due to the larger number of floating point operations required by the redundant information carried in the DCM, as well as the more expensive method needed to integrate the result at each time step. This last is a subtle point. DCMs require the use of the matrix exponential for integration to ensure numerical accuracy of the result, that is, to ensure that the attitude resulting from the DCM algorithm is orthonormal. For quaternion integration, it is possible to integrate using Euler integration and then re-normalize the result if necessary.
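To make the integration distinction concrete, the following is a minimal numpy/scipy sketch of the two update rules, not the benchmarked firmware itself. The quaternion step pays only for a Euler step and a vector renormalization, while the DCM step pays for a matrix exponential to keep the result orthonormal.

```python
import numpy as np
from scipy.linalg import expm

def skew(w):
    # Cross-product (skew-symmetric) matrix of the angular rate vector.
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def quat_step(q, w, dt):
    # Euler-integrate the quaternion kinematics q_dot = 0.5 * Omega(w) * q
    # (scalar-first convention), then renormalize to restore unit norm.
    omega = np.array([[0.0, -w[0], -w[1], -w[2]],
                      [w[0], 0.0, w[2], -w[1]],
                      [w[1], -w[2], 0.0, w[0]],
                      [w[2], w[1], -w[0], 0.0]])
    q = q + 0.5 * dt * omega @ q
    return q / np.linalg.norm(q)

def dcm_step(R, w, dt):
    # Integrate the DCM kinematics with the matrix exponential so the
    # result stays orthonormal; expm is far costlier than a vector norm.
    return R @ expm(skew(w) * dt)

q = np.array([1.0, 0.0, 0.0, 0.0])
R = np.eye(3)
w = np.array([0.0, 0.0, np.pi / 2])  # 90 deg/s yaw rate
for _ in range(50):                  # one second at the 50 Hz update rate
    q = quat_step(q, w, 0.02)
    R = dcm_step(R, w, 0.02)
```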

There are a few conclusions we can draw regarding the microcontroller systems themselves. Clearly, the Digilent UC32 development board performs much worse than either the OSAVC or the Pico. This is puzzling given the similar microcontroller specifications and architecture of the UC32 and the OSAVC. The OSAVC performed slightly worse than the Pico when evaluating the hand-coded algorithms; in all likelihood this is due to the faster clock speed of the Pico (120 MHz vs 80 MHz). Interestingly, that result is reversed when comparing the automatically generated code, in which case the OSAVC outperformed the Pico. It isn't clear why this is the case, but it may be due to overhead introduced by the Pico SDK as opposed to bare-metal programming.

When evaluating the variation in latency, however, the OSAVC performed better than the Pico for all AHRS versions. The OSAVC had less than 0.6% standard deviation over mean for all AHRS variants. The Pico, on the other hand, performed four to five times worse than this for all variants. This appears to be due to the bimodal distribution of latency for the Pico, possibly a result of its optimized math libraries. Although 2.5% latency variation is small, it translates into additional noise in the attitude estimate due to inaccuracy in the integration time constant inside the algorithm. If we consider that most vehicles will have an update period on the order of milliseconds to tens of milliseconds, both the Pico and the OSAVC can conceivably perform the attitude estimation without consuming the bulk of the time budget allocated to real-time control. This is an encouraging result.
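The variation metric quoted above is the standard deviation divided by the mean (the coefficient of variation) of the logged per-iteration latencies. A short sketch of how it can be computed from a latency log; the file name is hypothetical:

```python
import numpy as np

latencies = np.loadtxt("latency_log_us.txt")  # per-iteration latencies in usec
cv = latencies.std() / latencies.mean()
print(f"mean: {latencies.mean():.1f} usec, cv: {100 * cv:.2f}%")
# A histogram (e.g. np.histogram(latencies, bins=50)) reveals the Pico's
# bimodal distribution, which the mean and cv alone do not show.
```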

The last system to discuss is the Raspberry Pi. Its behavior is different from that of the microcontroller systems. The benchmark experiments demonstrate that floating point precision doesn't affect the latency of the algorithms, at least within the bounds of the experiments. This is likely because the Raspberry Pi runs a 64 bit OS, so all calculations are computed using 64 bit registers regardless of the specified precision. Although the Raspberry Pi is much faster than any of the microcontroller systems, as one would expect given the difference in clock speed and register width, its latency is not normally distributed. This demonstrates the issue of using a system without a real-time OS. During these studies, the Raspberry Pi was not running other processes besides the OS in the background. If, for example, the Raspberry Pi were performing a background task that was delayed, this would affect the calculation of the attitude in a detrimental way. This is in contrast to the OSAVC, with its deterministic code execution. Put another way, while the Pi has much more computational power and speed than any of the microcontrollers, its OS cannot guarantee a fixed latency, whereas bare-metal programming can, if designed appropriately, because there is no OS.

In summary, the OSAVC performed nearly as well in terms of latency as the fastest microcontroller studied, the Raspberry Pi Pico, but had smaller latency variation. The UC32 had the worst latency of all the systems. The Raspberry Pi performed the best by far in terms of latency, but its latency distribution relative to the mean was the worst; understandably so, as it is not a real-time processor.

Edge TPUs for ML applications have recently become widely available to the consumer market. There are two notable models: the Intel Neural Compute Stick 2 and the Coral USB Accelerator. Both connect to a host computer using a USB interface, cost less than $100, and are small. These attributes allow them to be used with compatible SBCs. Again, our choice of the Raspberry Pi 4b does not constrain a new vehicle developer to this SBC, as the Coral communicates over USB and the object detection models are compatible with many different Linux distributions. The main difference between the two TPUs is the software interface: the Coral unit is manufactured for Google and has native support for their TensorFlow Lite ML applications, whereas the NCS2 uses translation software to convert ML models to a compatible form. For the proposed architecture we chose the Coral because of its native support of TensorFlow Lite image classification and object detection models.

What follows is a discussion of some of the interesting capabilities enabled by this architecture. The first is using the TPU to identify landmarks in the environment and the CPU to provide guidance based on that information. The second is onboard optimal sensor calibration. A third capability that we have developed is presented in the next chapter when discussing optimal guidance strategies.

The AGV uses a camera to identify markers that determine the boundaries of a closed circuit. The course markers are cones of different colors to differentiate between the left boundary and the right boundary of the circuit. Through a research project funded by the Google Summer of Code, we developed and trained several machine learning models to identify the cones, including the MobileNet V2 model, the YOLOv5 model, and the EfficientDet model. After successfully running these on standard computers, we ported them to the AGV. We first implemented them on the SBC, where we achieved object detection at about 1 frame per second. Fig. 7.2 shows the results of the YOLOv5 model running on the SBC. This model was accurate in identifying cones at a useful inference rate, even in shadow. As seen in the figure, the model returns the bounding box coordinates and confidence level of the object detection inference. We have been able to compile and test the EfficientDet model on the TPU, offloading the object detection entirely from the CPU and achieving frame rates in excess of 10 frames per second.
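As an illustration of the inference path, the sketch below shows how a detection model compiled for the Edge TPU can be invoked from Python using the pycoral library. The model and image file names are hypothetical, and the exact pre-processing depends on the model used.

```python
from PIL import Image
from pycoral.adapters import common, detect
from pycoral.utils.edgetpu import make_interpreter

MODEL = "efficientdet_cones_edgetpu.tflite"  # hypothetical compiled model

interpreter = make_interpreter(MODEL)  # loads the model onto the Coral TPU
interpreter.allocate_tensors()

image = Image.open("frame.jpg").convert("RGB")
image = image.resize(common.input_size(interpreter))
common.set_input(interpreter, image)
interpreter.invoke()                   # inference runs on the TPU, not the CPU

for obj in detect.get_objects(interpreter, score_threshold=0.5):
    b = obj.bbox
    print(f"class {obj.id} conf {obj.score:.2f} "
          f"bbox ({b.xmin}, {b.ymin}, {b.xmax}, {b.ymax})")
```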

A key requirement for autonomous navigation is sensor calibration, in particular the calibration of magnetometer and accelerometer scaling, offset, and alignment variables. There are numerous algorithms for inertial sensor calibration, most relying on fitting a set of data, taken while rotating the sensor around all three axes, to an ellipsoid and determining the calibration parameters through a least squares fit. Because inertial sensors like magnetometers and accelerometers experience a constant force regardless of orientation when at rest, a set of data sampled slowly (to minimize accelerations) and uniformly during rotations around all three axes should describe a sphere centered at the origin. Rather than perform a direct fit to an ellipsoid, we demonstrate in Fig. 7.3 a simulated calibration using a method that iteratively fits the data to a unit sphere; a minimal sketch of such a fit appears at the end of this section. This algorithm is written in Python, runs easily on the AGV SBC, and computes the calibration parameters without assuming ellipsoidal constraints on the data. We have employed this algorithm successfully on real data taken on the vehicle while rotating it through a random series of orientations. We used the distributed control architecture to sample and store the data, as well as to calculate the calibration parameters. Implementing this algorithm as part of a vehicle calibration suite is a rich area of future research on the platform.

The platform used as a development test bed is the DFRobot Asurada GPX. This model has been replaced with a substantially similar one, the DFRobot ROB0170 NXP Cup Race Car Chassis. Sold as a kit, it consists of a small aluminum chassis that allows for extensive modification due to numerous mounting points. It is driven by dual BLDC rear motors rated for 12 V operation at 930 Kv. A 30 A ESC with BLHeli firmware in bi-directional mode controls each motor, meaning each motor can be operated both forward and in reverse. A 13:50 gear reduction is applied to increase the torque at the wheels. The vehicle uses an Ackerman steering mechanism controlled by a servo motor. It has a wheelbase of approximately 174 mm and four 65 mm diameter hollow rubber wheels. It is powered by up to a three-cell lithium polymer or equivalent 12 V battery. Fig. 7.4 shows the lower chassis with the steering assembly and the motors installed; the battery and ESCs are located on the lower chassis plane in front of the motors.

The vehicle is equipped with a GPS sensor, IMU, four rotary encoders, and a LiDAR. The vehicle has four outputs controlling the two ESCs for the drive motors, the steering servo, and a servo to control the pointing direction of the LiDAR. The motors are powered from the battery directly; the servo motors are powered by the onboard 5 V LDO. The vehicle is also equipped with a serial radio for telemetry communication to a remote ground station, and a radio control receiver for manual control.

The OSAVC is the real-time controller of the overall system, but it is part of a larger distributed control system architecture discussed in more detail in the subsequent section. The OSAVC communicates over the USB port to a Raspberry Pi 4b SBC using the MAVLink protocol. The SBC connects to the Raspberry Pi V2 camera, used to record video and detect obstacles or landmarks. It also connects to the Coral TPU for ML inferencing. The complete block diagram of the AGV system is shown in Fig. 7.6.
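As referenced in the calibration discussion above, the following is a minimal numpy sketch of one way an iterative unit-sphere fit can be structured. The Gauss-Newton formulation and the synthetic demo values are illustrative assumptions, not the vehicle's actual calibration code.

```python
import numpy as np

def sphere_fit(raw, iters=20):
    """Iteratively fit per-axis scale and offset so the corrected samples
    (raw - offset) * scale lie on a unit sphere (Gauss-Newton on the
    radius residual)."""
    scale = np.ones(3)
    offset = raw.mean(axis=0)
    for _ in range(iters):
        corrected = (raw - offset) * scale
        radii = np.linalg.norm(corrected, axis=1)
        residual = radii - 1.0
        unit = corrected / radii[:, None]
        # Jacobian of each radius w.r.t. the parameters [scale, offset]
        J = np.hstack([unit * (raw - offset), -unit * scale])
        step, *_ = np.linalg.lstsq(J, residual, rcond=None)
        scale -= step[:3]
        offset -= step[3:]
    return scale, offset

# Synthetic demo: recover a known offset and scale from samples on a sphere.
rng = np.random.default_rng(0)
dirs = rng.normal(size=(500, 3))
dirs /= np.linalg.norm(dirs, axis=1)[:, None]
raw = dirs / np.array([0.9, 1.1, 1.0]) + np.array([0.2, -0.1, 0.05])
print(sphere_fit(raw))
```

Because the fit solves directly for scale and offset against a unit-sphere target, it avoids imposing the ellipsoidal constraints mentioned above, at the cost of a small least-squares solve per iteration.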