A new, first-of-its-kind NIST study of face recognition algorithms found that accuracy varies widely across demographic groups defined by age, sex, and either race or country of birth. The resulting report, NISTIR 8280 Face Recognition Vendor Test (FRVT) Part 3: Demographic Effects, provides an overview of FRVT for developers, integrators, end users, policy makers, and others who are familiar with biometrics applications and performance metrics. It also outlines how NIST conducted the tests and analyzed demographic differentials across nearly 200 face recognition algorithms from nearly 100 developers, using four collections of photographs containing more than 18 million images of more than 8 million people.
Using both ‘one-to-one verification’ and ‘one-to-many identification’ algorithms submitted to NIST, the FRVT report evaluates how well each algorithm performs one of these two tasks, which are among face recognition’s most common applications. This is also the first report to describe and quantify demographic differences for ‘one-to-many identification’, that is, determining whether a person in a photo has any match in a database.
To evaluate each algorithm’s performance on its task, the team measured the two classes of error the software can make: false positives and false negatives. A false positive means that the software wrongly considered photos of two different individuals to show the same person, while a false negative means that the software failed to match two photos that do in fact show the same person. Making these distinctions is important because the class of error and the search type can carry vastly different consequences depending on the real-world application.
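As a rough illustration of the two error classes described above, the following Python sketch counts false positives and false negatives per demographic group at a fixed decision threshold. It is not NIST's methodology: the function name error_rates, the group labels, the similarity scores, and the threshold of 0.6 are all hypothetical and chosen only to make the definitions concrete.

```python
from collections import defaultdict


def error_rates(comparisons, threshold):
    """Compute per-group false positive and false negative rates.

    comparisons: iterable of (group, score, same_person) tuples, where
    score is a similarity score and same_person is the ground truth.
    The algorithm is assumed to declare a match when score >= threshold.
    """
    counts = defaultdict(lambda: {"fp": 0, "impostor": 0, "fn": 0, "genuine": 0})
    for group, score, same_person in comparisons:
        c = counts[group]
        declared_match = score >= threshold
        if same_person:
            c["genuine"] += 1
            if not declared_match:
                c["fn"] += 1  # same person, but the software missed the match
        else:
            c["impostor"] += 1
            if declared_match:
                c["fp"] += 1  # different people, but the software matched them

    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "false_positive_rate": c["fp"] / c["impostor"] if c["impostor"] else None,
            "false_negative_rate": c["fn"] / c["genuine"] if c["genuine"] else None,
        }
    return rates


# Hypothetical comparison results: (group, similarity score, same person?)
sample = [
    ("A", 0.91, True),
    ("A", 0.42, True),   # false negative at threshold 0.6
    ("A", 0.10, False),
    ("A", 0.75, False),  # false positive at threshold 0.6
    ("B", 0.88, True),
    ("B", 0.95, True),
    ("B", 0.20, False),
    ("B", 0.30, False),
]

rates = error_rates(sample, threshold=0.6)
for group, r in sorted(rates.items()):
    print(group, r)
```

Comparing the per-group rates side by side is one simple way to see a demographic differential; in this made-up sample, group A has higher rates of both error types than group B.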
The report's results show a wide range in accuracy across developers, with the most accurate algorithms producing far fewer errors. These algorithms can therefore be expected to have smaller demographic differentials. Contemporary face recognition algorithms exhibit demographic differentials of various magnitudes. The main result is that false positive differentials are much larger than false negative differentials and exist broadly, across many, but not all, of the algorithms tested.
We are happy to speak with you if you have questions or feedback. Please contact Jennifer Huergo (Jennifer.Huergo@NIST.gov) or Chad Boutin (Charles.Boutin@NIST.gov) of the NIST Public Affairs Office.