At the end of every calendar year, the complaints from autonomous vehicle companies start piling up. This annual tradition is the result of a requirement by the California Department of Motor Vehicles that AV companies deliver “disengagement reports” by January 1 of each year showing the number of times an AV operator had to disengage the vehicle’s autonomous driving function while testing the vehicle.
Disengagement reports have one thing in common: their usefulness is widely criticized by the very companies that have to submit them. The CEO and founder of a San Francisco-based self-driving car company publicly stated that disengagement reporting is “woefully inadequate … to give a meaningful signal about whether an AV is ready for commercial deployment.” The CEO of a self-driving technology startup called the metrics “misguided.” Waymo stated in a tweet that the metric “does not provide relevant insights” into its self-driving technology or “distinguish its performance from others in the self-driving space.”
1/7 We appreciate what the California DMV was trying to do when creating this requirement, but the disengagement metric does not provide relevant insights into the capabilities of the Waymo Driver or distinguish its performance from others in the self-driving space.
— Waymo (@Waymo) February 26, 2020
Why do AV companies object so strongly to California’s disengagement reports? They argue that the metric is misleading because it lacks context about the companies’ varied testing strategies. I would argue that a lack of guidance on the language used to describe disengagements also makes the data misleading. Furthermore, the metric incentivizes testing in less difficult circumstances and favors real-world testing over more insightful virtual testing.
Understanding California reporting metrics
To test an autonomous vehicle on public roads in California, an AV company must obtain an AV Testing Permit. As of June 22, 2020, there were 66 Autonomous Vehicle Testing Permit holders in California and 36 of those companies reported autonomous vehicle testing in California in 2019. Only five of those companies have permits to transport passengers.
To operate on California public roads, each permitted company must report any collision that results in property damage, bodily injury, or death within 10 days of the incident.
There have been 24 autonomous vehicle collision reports in 2020 thus far. Although the majority of those incidents occurred in autonomous mode, they were almost exclusively the result of the autonomous vehicle being rear-ended. In California, rear-end collisions are almost always deemed the fault of the rear-ending driver.
The usefulness of collision data is evident — consumers and regulators are most concerned with the safety of autonomous vehicles for pedestrians and passengers. If an AV company reports even one accident resulting in substantial damage to the vehicle or harm to a pedestrian or passenger while the vehicle operates in autonomous mode, the implications and repercussions for the company (and potentially the entire AV industry) are substantial.
However, the usefulness of disengagement reporting data is much more questionable. The California DMV requires AV operators to report the number and details of disengagements while testing on California public roads by January 1 of each year. The DMV defines this as “how often their vehicles disengaged from autonomous mode during tests (whether because of technical failure or situations requiring the test driver/operator to take manual control of the vehicle to operate safely).”
Operators must also report whether each disengagement was the result of a software malfunction, human error, or was made at the option of the vehicle operator.
AV companies have kept a tight lid on measurable metrics, often only sharing limited footage of demonstrations performed under controlled settings and very little data, if any. Some companies have shared the occasional “annual safety report,” which reads more like a promotional deck than a source of data on AV performance. Furthermore, there are almost no reporting requirements for companies doing public testing in any other state. California’s disengagement reports are the exception.
This AV information desert means that disengagement reporting in California has often been treated as our only source of information on AVs. The public is forced to judge AV readiness and relative performance based on this disengagement data, which is incomplete at best and misleading at worst.
Disengagement reporting data offers no context
Most AV companies claim that disengagement reporting data is a poor metric for judging advancement in the AV industry due to a lack of context for the numbers: knowing where those miles were driven and the purpose of those trips is essential to understanding the data in disengagement reports.
Some in the AV industry have complained that miles driven in sparsely populated areas with arid climates and few intersections are not comparable to miles driven in a city like San Francisco, Pittsburgh, or Atlanta. As a result, the disengagement counts reported by companies that test in the former geography cannot be meaningfully compared with those reported by companies that test in the latter.
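To make the context problem concrete, here is a minimal Python sketch. Every figure in it is hypothetical and chosen only for illustration; none comes from an actual DMV report, and the company names, settings, and helper functions are assumptions of this example. It shows how a single headline rate (disengagements per 1,000 miles) can rank two companies in the opposite order from the rates within each driving setting, simply because the companies split their miles differently.

```python
from dataclasses import dataclass


@dataclass
class TestLog:
    company: str
    setting: str          # hypothetical label for where the miles were driven
    miles: float
    disengagements: int


def per_1000_miles(miles: float, disengagements: int) -> float:
    """Disengagements per 1,000 autonomous miles."""
    return 1000 * disengagements / miles


# Hypothetical data: Company A tests mostly in a dense city,
# Company B tests mostly on quiet suburban roads.
LOGS = [
    TestLog("Company A", "dense urban", 9_000, 45),
    TestLog("Company A", "sparse suburban", 1_000, 1),
    TestLog("Company B", "dense urban", 1_000, 10),
    TestLog("Company B", "sparse suburban", 9_000, 18),
]


def headline_rate(company: str) -> float:
    """The single number a reader would compute from an annual report."""
    rows = [log for log in LOGS if log.company == company]
    total_miles = sum(log.miles for log in rows)
    total_disengagements = sum(log.disengagements for log in rows)
    return per_1000_miles(total_miles, total_disengagements)


def rate_by_setting(company: str) -> dict:
    """The same data, broken out by where the miles were driven."""
    return {
        log.setting: per_1000_miles(log.miles, log.disengagements)
        for log in LOGS
        if log.company == company
    }


if __name__ == "__main__":
    for name in ("Company A", "Company B"):
        print(name, f"headline: {headline_rate(name):.1f}", rate_by_setting(name))
```

In this made-up data, Company B's headline rate (2.8 per 1,000 miles) looks better than Company A's (4.6), yet Company A disengages half as often as Company B in both the urban and the suburban setting. Without the per-setting breakdown, the aggregate number points readers to the wrong conclusion.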
It’s also important to understand that disengagement reporting requirements influence AV companies’ decisions on where and how to test. Testing that produces frequent disengagements, even when conducted safely, is discouraged because it makes the company look less ready for commercial deployment than its competitors. In reality, such testing may result in the most commercially ready vehicle. Indeed, some in the AV industry have accused competitors of manipulating disengagement reporting metrics by easing the difficulty of the miles they drive over time, so that falling disengagement counts look like real progress.
Furthermore, while data can look particularly good when a company favors easy drives and clear roads, it can look particularly bad when a company uses its road miles strategically to improve its AV software.
Let’s consider an example provided by Jack Stewart, a reporter for NPR’s Marketplace covering transportation:
“Say a company rolls out a brand-new build of their software, and they’re testing that in California because it’s near their headquarters. That software could be extra buggy at the beginning, and you could see a bunch of disengagements, but that same company could be running a commercial service somewhere like Arizona, where they don’t have to collect these reports.
That service could be running super smoothly. You don’t really get a picture of a company’s overall performance just by looking at this one really tight little metric. It was a nice idea of California some years ago to start collecting some information, but it’s not really doing what it was originally intended to do nowadays.”
Disengagement reports lack prescriptive language
The disengagement reports are also misleading due to a lack of guidance and uniformity in the language used to describe the disengagements. For example, while AV companies used a variety of language, “perception discrepancies” was the most common term used to describe the reason for a disengagement — however, it’s not clear that the term “perception discrepancies” has a set meaning.
Several operators used the phrase “perception discrepancy” to describe a failure to detect an object correctly. Valeo North America described a similar error as “false detection of object.” Toyota Research Institute almost exclusively described its disengagements vaguely as “Safety Driver proactive disengagement,” the meaning of which is “any kind of disengagement.” Pony.ai, by contrast, described each instance of disengagement with particularity.
Many other operators reported disengagements that were “planned testing disengagements” or that were described with such insufficient particularity as to be virtually meaningless.
For example, “planned disengagements” could mean the testing of intentionally created malfunctions, or it could simply mean the software is so nascent and unsophisticated that the company expected the disengagement. Similarly, “perception discrepancy” could mean anything from precautionary disengagements to disengagements due to extremely hazardous software malfunctions. “Perception discrepancy,” “planned disengagement” or any number of other vague descriptions of disengagements make comparisons across AV operators virtually impossible.
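The comparison problem can be sketched in a few lines of Python. The three cause categories below are the ones the reporting requirement names (software malfunction, human error, operator option); everything else, including the `PLAUSIBLE_CAUSES` table and the judgment about which causes each label could plausibly cover, is an assumption made for illustration and not an official DMV taxonomy.

```python
from enum import Enum


class Cause(Enum):
    SOFTWARE_MALFUNCTION = "software malfunction"
    HUMAN_ERROR = "human error"
    OPERATOR_OPTION = "at the option of the vehicle operator"


# Hypothetical reading of each free-text label: the set of causes it could
# plausibly describe. These assignments are illustrative assumptions only.
PLAUSIBLE_CAUSES = {
    "false detection of object": frozenset({Cause.SOFTWARE_MALFUNCTION}),
    "perception discrepancy": frozenset(
        {Cause.SOFTWARE_MALFUNCTION, Cause.OPERATOR_OPTION}
    ),
    "planned testing disengagement": frozenset(
        {Cause.SOFTWARE_MALFUNCTION, Cause.OPERATOR_OPTION}
    ),
    "safety driver proactive disengagement": frozenset(Cause),
}


def classify(description: str):
    """Return the single cause a label implies, or None if it is ambiguous."""
    key = description.strip().lower()
    # An unrecognized label could mean anything, so treat it as fully ambiguous.
    candidates = PLAUSIBLE_CAUSES.get(key, frozenset(Cause))
    if len(candidates) == 1:
        return next(iter(candidates))
    return None


if __name__ == "__main__":
    for label in PLAUSIBLE_CAUSES:
        print(f"{label!r}: {classify(label)}")
```

Under these assumptions, only the most specific label resolves to a single category. The vaguer labels return `None`, which means any tally built on them mixes precautionary takeovers with genuine malfunctions.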
So, for example, while it appears that a San Francisco-based AV company’s disengagements were exclusively precautionary, the lack of guidance on how to describe disengagements and the many vague descriptions provided by AV companies have cast a shadow over disengagement descriptions, calling them all into question.
Regulations discourage virtual testing
Today, software is the real product of AV companies. The hardware and physical components of AVs (lidar, sensors, etc.) have become so uniform that they’re practically off-the-shelf. What is really being tested is the software. It’s well known that software bugs are best found by running the software as often as possible; road testing simply can’t reach the sheer numbers necessary to find all the bugs. What can reach those numbers is virtual testing.
However, the regulations discourage virtual testing: a company that shifts testing into simulation reports fewer road miles, which would seem to imply that it is not road-ready.
Jack Stewart of NPR’s Marketplace expressed a similar point of view:
“There are things that can be relatively bought off the shelf and, more so these days, there are just a few companies that you can go to and pick up the hardware that you need. It’s the software, and it’s how many miles that software has driven both in simulation and on the real roads without any incident.”
So, where can we find the real data we need to compare AV companies? One company runs over 30,000 instances daily through its end-to-end, three-dimensional simulation environment. Another company runs millions of off-road tests a day through its internal simulation tool, running driving models that include scenarios that it can’t test on roads involving pedestrians, lane merging, and parked cars. Waymo drives 20 million miles a day in its Carcraft simulation platform — the equivalent of over 100 years of real-world driving on public roads.
One CEO estimated that a single virtual mile can be just as insightful as 1,000 miles collected on the open road.
Jonathan Karmel, Waymo’s product lead for simulation and automation, similarly explained that Carcraft provides “the most interesting miles and useful information.”
Where we go from here
Clearly there are issues with disengagement reports — both in relying on the data therein and in the negative incentives they create for AV companies. However, there are voluntary steps that the AV industry can take to combat some of these issues:
- Prioritize and invest in virtual testing. Developing and operating a robust system of virtual testing may be expensive for AV companies, but it also presents the opportunity to dramatically shorten the pathway to commercial deployment by testing far more scenarios, including more complex and higher-risk ones.
- Share data from virtual testing. Voluntary disclosure of virtual testing data will reduce reliance on disengagement reports by the public. Commercial readiness will be pointless unless AV companies have provided the public with reliable data on AV readiness for a sustained period.
- Seek the greatest value from on-road miles. AV companies should continue using on-road testing in California, but they should use those miles to fill in the gaps from virtual testing. They should seek the greatest value possible out of those slower miles, accept the higher percentage of disengagements they will be required to report, and when reporting on those miles, describe their context with particularity.
With these steps, AV companies can lessen the pain of California’s disengagement reporting data and advance more quickly to an AV-ready future.