I’ve done some testing today with regard to this and I thought I’d share my findings. These aren’t wholly conclusive, but I’m working towards understanding the source of the problem.
I’ll start with the design of the robot. Currently, there are two computing modules within the robot’s head, a 410 module and an 820 module, both from Intrinsyc. The 410 module is running Windows 10 IoT Core and the 820 is running Android 8.1. In a normal, factory-fresh robot today, all wifi communication goes through the 820, with specific ports routed to the 410. ICMP requests are served by the 820.
My experiment centered around understanding average ping times to the robot, max ping times, and the number of timeouts that occur. To that end, I created a script that pinged the robot’s IP address 1000 times, with a 200 ms delay between requests. I set the timeout to 2000 ms, even though that’s ridiculous.
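A minimal Python sketch of a test like this is below. The language, the function names (`ping_once`, `summarize`, `run`), and the parsing of the system `ping` output are all assumptions for illustration, not the actual script. It shells out to the OS `ping` command once per request, treats anything without a parsable round-trip time as an error, and reports the same three metrics used in this post.

```python
import platform
import re
import subprocess
import time


def ping_once(host, timeout_ms=2000):
    """Send one ICMP echo via the system ping; return RTT in ms, or None on failure."""
    if platform.system() == "Windows":
        # Windows ping: -n count, -w timeout in milliseconds
        cmd = ["ping", "-n", "1", "-w", str(timeout_ms), host]
    else:
        # Linux ping: -c count, -W timeout in whole seconds (assumed Linux iputils)
        cmd = ["ping", "-c", "1", "-W", str(max(1, timeout_ms // 1000)), host]
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_ms / 1000 + 1
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        return None
    # Matches "time=12ms", "time<1ms", and "time=0.045 ms"
    match = re.search(r"time[=<]\s*([\d.]+)\s*ms", result.stdout)
    return float(match.group(1)) if match else None


def summarize(samples):
    """Compute average, max, and error count; None entries count as errors."""
    ok = [s for s in samples if s is not None]
    return {
        "avg": round(sum(ok) / len(ok), 2) if ok else None,
        "max": max(ok) if ok else None,
        "errors": len(samples) - len(ok),
    }


def run(host, count=1000, delay_ms=200, timeout_ms=2000):
    """Ping host `count` times with a fixed delay between requests."""
    samples = []
    for _ in range(count):
        samples.append(ping_once(host, timeout_ms))
        time.sleep(delay_ms / 1000)
    return summarize(samples)
```

Calling `run("192.168.1.50")` (a placeholder address) would take a little over three minutes at these settings and return a dictionary with `avg`, `max`, and `errors` keys.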
My first experiment tested a robot that had just booted, both charging and not charging, with the thought that the wireless charger was somehow interfering. The resulting metrics under these conditions were similar (times in ms):
Cold boot, not charging: Avg: 64.66, Max: 1672, Errors: 1
Cold boot, charging: Avg: 69.94, Max: 1742, Errors: 21
WOW, the ping times were high, and the errors during charging were unexpected. I’m still testing the not charging case to prove that the recorded error count is correct (I suspect it was a fluke), but that’s an exercise for tomorrow.
I tried subsequent runs after 30 minutes, 60 minutes, 90 minutes, and 120 minutes, both charging and not charging, all of which looked similar:
30 minutes: Avg: 79.37, Max: 1647, Errors: 28
60 minutes: Avg: 84.13, Max: 1423, Errors: 25
90 minutes: Avg: 82.67, Max: 414, Errors: 20
120 minutes: Avg: 90.37, Max: 1026, Errors: 35
That all looks pretty poor but doesn’t show appreciable degradation over time. For funsies, I connected an Ethernet-to-USB dongle to the backpack (which takes the wifi component and the communication bridge from the 820 to the 410 out of the picture) and pinged the Windows instance directly, just to see if it was also responding poorly:
Avg: 5.97, Max: 29, Errors: 0
That looks better, but isn’t useful in getting Android to respond correctly. Next, I took the 820 development kit and ran the same ping tests:
Avg: 61.94, Max: 631, Errors: 0
HMMMM… that’s interesting. The max response time is still awful, but the average is plausible and the errors are gone. Next, I started commenting out sections of our init scripts on the 820 to see if we’re doing something with the networking on the device to cause the errors. There were four methods that looked suspect, so I commented them out in various combinations to see if I could get the adapter to respond in a reasonable way. After a few tries, I got to a place where I’m at ~0 errors. The ping times, however, are still quite bad, with the average being ~100 ms.
I’ll be doing more testing this week to see if I can get better information, but it looks like I’m narrowing in on the source of the errors. I’m still a bit puzzled by the swings in ping times. I’ll be enhancing my testing methodology this week to track standard deviation, and probably playing with the timeouts to see what shakes out as nominal.
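As a rough sketch of what tracking the spread could look like, the Python snippet below adds standard deviation and median to the per-run summary. The function name `spread` and the convention of recording a timeout as `None` are assumptions for illustration; it uses only the standard library `statistics` module.

```python
import statistics


def spread(samples):
    """Summarize the variability of ping times in ms; None entries (timeouts) are skipped."""
    ok = [s for s in samples if s is not None]
    if len(ok) < 2:
        # Sample standard deviation needs at least two data points
        return None
    return {
        "mean": statistics.mean(ok),
        "stdev": statistics.stdev(ok),    # sample standard deviation
        "median": statistics.median(ok),  # less sensitive to a few huge spikes
    }
```

A median far below the mean, together with a large standard deviation, would suggest that a handful of very slow replies are dragging the average up rather than every reply being slow.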
More to come.