1
Fighting Crime With Science
As a teenager, Dr. Susan Walsh loved the TV show “The X-Files.” She was particularly drawn to the character of Dana Scully, a hyper-rational doctor-cum-FBI agent who brought a scientist’s skepticism to investigations of paranormal phenomena and deployed her medical training to determine cause of death for the show’s victims.
The fact that Scully used science to solve problems and pursue justice intrigued Walsh. She wanted to explore a career in forensics but was on the fence about how to do it. Should she go into law enforcement? Become a scientist? The show helped her to decide. She loved the science. “It did start with Scully, if I’m being honest,” she said.
Walsh studied biochemistry and, while working on her master’s degree in DNA profiling, she happened onto a research paper that caught her attention. Australian scientists had found DNA markers corresponding to eye color, and Walsh began to wonder whether those techniques could be applied to criminal investigations. If crime-scene DNA could be analyzed for markers that relate to physical appearance, Walsh suspected that could help investigators identify suspects — and take crime fighting to a new level.
“Oh wow, that’s so cool that we’ll one day be able to predict what people look like,” using DNA, she thought. “In an application of a forensic setting, that’s amazing.”
That was 2005. Today, Walsh is at the top of her field. An assistant professor in the School of Science at Indiana University Indianapolis, she runs a lab researching what is now known as forensic DNA phenotyping, or FDP. Walsh has worked on locating genes related to eye, hair, and skin color and has built an open-source tool for people, including in law enforcement, who want to use DNA to predict those traits. She has also investigated connections between DNA markers and the appearance of various facial features, known as facial morphology.
Through her research, she came to learn that FDP works as she imagined it could: An unknown DNA sample can be parsed for genetic markers related to various traits, like hair or eye color, offering criminal investigators a glimpse into what the owner of the DNA might look like. That, in turn, could be useful information for prioritizing suspects to investigate. If the DNA says a person is likely to have red hair, for example, detectives could bump redheads to the top of their suspect list.
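The marker-to-trait logic described above can be sketched in miniature. Everything in the snippet below is invented for illustration: the genotype labels and probability table are hypothetical, and real classifiers combine many markers trained on large reference panels.

```python
# Toy sketch of DNA phenotyping's core idea: map a genotype at one
# marker to a probability distribution over a visible trait.
# The genotypes and probabilities here are hypothetical; real tools
# combine many markers at once.

EYE_COLORS = ("blue", "intermediate", "brown")

# Hypothetical genotype -> probability table for a single illustrative SNP.
GENOTYPE_PROBS = {
    "GG": (0.80, 0.12, 0.08),
    "AG": (0.35, 0.25, 0.40),
    "AA": (0.05, 0.10, 0.85),
}

def predict_eye_color(genotype):
    """Return the most likely eye color and its probability."""
    probs = GENOTYPE_PROBS[genotype]
    i = max(range(len(probs)), key=probs.__getitem__)
    return EYE_COLORS[i], probs[i]
```

An investigator reading such an output would see, say, an 85 percent chance of brown eyes: a likelihood for one trait, not a portrait.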
Still, Walsh remains cautious about how she describes what DNA can and cannot tell us about what a person might look like. At present, the idea that DNA can be used to predict facial structure — for example, what a person’s chin might look like — is more science fiction, like her beloved “X-Files,” and less science fact. The human face is a complicated structure defined by both nature (so, DNA) and nurture (like, if you’ve had your nose broken). Like others in her field, Walsh is unsure that research into morphology will ever bear reliable fruit. “We can’t even do a nose right now,” she said.
Walsh is adamant: It’s scientifically premature to deploy these methods to predict a person’s face, especially when life and liberty are at stake. Not everyone in the field has been as chary.
“The Science Isn’t There”
A private company based in Reston, Virginia, Parabon NanoLabs was founded in 2008 with the mission of creating “breakthrough products” using DNA, with an initial focus on developing cancer therapies. It has since evolved into a prominent purveyor of forensic products, including DNA phenotyping, to police agencies. Though it’s well known among forensic scientists, it maintains a fairly low public profile and publishes few details about its operation online.
According to Parabon, its Snapshot FDP System “accurately” predicts not only eye, hair, and skin color, but also face shape. For a fee, the company will provide law enforcement agencies with a rendering of its predictions in the form of a color composite sketch, along with a “corresponding measure of confidence” in the predicted traits. The company says it has worked with hundreds of police agencies in the nine years it’s been doing this work.
As Parabon’s foothold in the world of forensics deepened, so did the concern among scientists and legal experts, who warn that the company’s sketches are, at best, misleading. Leading experts agree the science has not evolved enough to accurately and reliably provide the kind of singular image Parabon produces for police investigations. Even a scientist who helped develop the technology says it’s not ready for real-world use.
Parabon’s methodology for generating its phenotype predictions is a closely guarded secret; its system has not faced independent scientific verification and validation — the gold standard among scientists for vetting the efficacy of computer-based programs — nor has it been peer reviewed. Still, Parabon insists that its phenotyping work is based on good science. While it acknowledges that its program has not gone through traditional scientific review processes, it says the proof of Snapshot’s ability and value is in the number of law enforcement agencies that use it and say it has helped them solve cases.
For years, Walsh privately pressed the company to explain its work and grew frustrated by Parabon’s refusal to engage with her questions. Her concerns were not just hypothetical: In a criminal legal system rife with wrongful convictions and racial bias, there are countless ways using an unproven tool to solve crimes can, and does, go wrong.
Those frustrations came to a head during a March 2024 workshop at the National Academy of Sciences covering the good and bad of several next-generation forensic tools used by law enforcement, where Walsh and others sharply criticized Parabon. Selling these singular images to police is “detrimental to the field and something we need to stop,” Walsh said.
Police pay hundreds per case for appearance prediction, yet “how these tools function remains shrouded in secrecy,” noted Rebecca Brown, the former policy director for the Innocence Project and the founder of Maat Strategies, a criminal legal policy consulting firm. Speaking at the workshop, Brown cautioned against the use of FDP and other novel disciplines absent robust validation and regulation. There are “too many examples of investigative tools that become runaway trains,” she said.
Parabon’s FDP service follows a predictable pattern in forensic science: Novel techniques are developed, often by private industry, and pressed into service for law enforcement purposes before their limitations have been fully assessed and addressed.
As with other forensic innovations, like forensic genetic genealogy or facial recognition, FDP is sold as an “investigative tool” — that is, a product not intended for use as evidence in a criminal proceeding, but as a behind-the-scenes aide to police searching for perpetrators. But selling a scientifically questionable product as a mere investigative tool can have real-world consequences.
For FDP in particular, experts warn that the composite images can reinforce racial stereotypes, encourage the over-surveillance of marginalized communities, and deny criminal defendants important information about how they became a target of an investigation, which raises serious implications for Fourth Amendment privacy rights. Composites like those Parabon sells could also inadvertently taint the memories of eyewitnesses to a crime, risking potentially valuable evidence.
Paula Armentrout, Parabon’s co-founder, provided written responses to questions from The Intercept about the company’s Snapshot program. In part, the company said that The Intercept “should not quote any of the presenters” at the NAS workshop, who it claims “made many false, uninformed, and misleading statements that were not based on evidence or facts, but on misinformation propagated by inaccurate media articles, hearsay, and their own personal and political agendas.”
Walsh insists her criticisms are motivated solely by her fidelity to the science and to ensuring the transparency and accuracy of forensic tools used in the criminal legal system. To that end, she was emphatic during the workshop: Law enforcement should not be allowed to purchase phenotyping composites. “The science isn’t there. We shouldn’t be doing it,” she said. At this juncture, she said, those sketches are about as scientific as “my son drawing them.”
2
Marketing a DNA Blueprint
Parabon’s foray into forensics began in 2009, when the company secured the first of several contracts with the Pentagon’s Defense Threat Reduction Agency, which was looking for a way to identify individuals in combat zones responsible for building improvised explosive devices. Parabon proposed extracting physical traits from DNA collected from the weapons to get the job done, and a subsequent 2012 contract led to the development of the Snapshot system. “Traditional DNA analysis treats DNA like a fingerprint, useful for identification,” Parabon co-founder and CEO Steven Armentrout told the military’s Success Stories publication in 2022. “But Snapshot treats it like a blueprint for how to build a human.”
The company began marketing the service to police agencies in 2015, an effort that has been “extremely successful,” Ellen McRae Greytak, the company’s director of bioinformatics, said during a webinar for a military organization in 2020. In her presentation, Greytak briefly outlined Parabon’s work to create Snapshot: how researchers collected existing DNA information for individuals across the world to home in not only on markers for hair, skin, and eye color, but also for specific geographic ancestry information; how they used machine learning to create the algorithm that generates predictions; and how, at the time, the company was developing a phone app to help gather three-dimensional images of faces to aid its morphology work.
Once the software makes a phenotype prediction, a forensic artist steps in to shade the composite. Of course, the process has its limitations, Greytak acknowledged. It can’t predict hairstyle, for example, or any other form of non-genetic modification — like scarring, tattoos, or dyed hair — and it can’t discern a person’s weight. Parabon’s composites are developed for what a person would look like as “a young adult at a normal body weight,” she said, which the company defines as a body mass index of 22.
Parabon had already worked on “hundreds of cases,” Greytak said during the webinar, sharing a couple of alleged success stories. In 2016, Massachusetts police investigating the 24-year-old cold-case murder of Lisa Ziegert used crime-scene DNA to obtain a Parabon sketch of her possible murderer.
Detectives used the composite information to narrow down the pool of “thousands” of people who, over the years, had been noted in the case file, Greytak said. There “were maybe five guys who closely matched the predictions we made,” she said, so the cops went knocking on their doors. Gary Schara wasn’t home when the police arrived at his place, so they told Schara’s roommate to pass on the message that “we’d like to speak to him,” Greytak explained. “When Gary hears that, he flees.” Police were eventually able to track Schara down and to match his DNA to the crime, she said, prompting him to confess. “They were finally able to close this homicide case.”
According to news reports, Schara was more than just a note in the case file. In fact, he had long been a suspect: His wife gave him up to police in 1993, and he was subsequently interviewed multiple times by investigators, including from the FBI.
After police received the Parabon phenotyping report and returned once again, talking to his roommate, Schara penned a confession and tried to kill himself. Police found him the next day in a Connecticut hospital. Schara ultimately pleaded guilty and was sentenced to life in prison.
It is unclear why detectives were unable to close the case years earlier. The Hampden district attorney’s office did not respond to The Intercept’s requests for comment, but in 2019, MassLive reported that District Attorney Anthony Gulluni said the “embrace of new technology” had helped to solve the case. Still, it appears the most Parabon can claim credit for is reminding cops of at least one of their top suspects.
A Singular Image
Susan Walsh had been working on FDP for nearly a decade when Parabon’s service debuted for law enforcement agencies. Back then, Walsh was mostly curious. She started asking Parabon questions. “I was saying, ‘Oh, what [DNA] markers are you using? And where’s your paper? Where can I read it? And what data set are you working with? And what’s your algorithm?’” she recalled. “And I was just getting nothing back.”
She approached company representatives at conferences and asked how the program worked. “They just didn’t answer my questions,” she said. “And then I was like, ‘OK. Well, I don’t think that you should be allowed in the field if you’re not going to answer the questions a scientist asks you.’” Scientists should be open to having their work scrutinized by peers, she said; they should be forthcoming about what parameters they’re using, about what their tool does well — and where it fails. “I was a bit curious at first and then kind of a little bit angry.” It felt to her like snake oil, selling hope in the form of a tool that could provide answers in cases that had long gone cold.
Walsh repeatedly tried to raise the alarm within the forensics community, but “it still wasn’t working.” By the time the NAS workshop rolled around in March, she did not mince words. Parabon’s sketches are “detrimental,” she said to the scientists, legal scholars, academics, and advocates gathered at the National Academies’ headquarters in Washington, D.C., for the two-day event. “I was just sick of saying it all the time — that we need science,” she later told The Intercept. “We need publications. We need peer review.”
Walsh emphasized that she believes selling composite images is scientifically indefensible. Experts agree that the most accurate way to describe phenotypic predictions is individually — the likelihood of brown eyes or blonde hair, for example — which offers police solid and actionable intelligence without tipping into science fiction, she said. Currently, each of the three predictions available via Walsh’s tool, which has been validated and peer reviewed, is reported to be approximately 80 percent accurate.
Although Walsh’s tool is available to law enforcement agencies free of charge, she said she doesn’t get that many cases. She suspects that’s because she won’t offer the cops a composite. “They go off and they pay because they want that singular image.”
For that, they can turn to Parabon.
3
Proprietary Methods
For Parabon, independent verification and peer review are superfluous pursuits. In response to a series of questions from The Intercept, the company said its program can’t be externally vetted because the code is “proprietary.” As for peer review, while it is a “valuable process for academic research because it allows researchers to contribute to the broader body of knowledge,” the company said, Parabon instead focuses on “delivering actionable results” to law enforcement customers.
“Unlike academics, whose primary goal is to contribute to scientific literature and educate, our priority is to serve the immediate needs of our clients,” the company wrote. Peer review can “sometimes become bogged down in theoretical debates,” it opined, noting that if Parabon had gone that route and hadn’t started selling its system to police, the service “would still not be available to them.”
The proof that its system works, Parabon argues, lies in the real-world validation it has received from the law enforcement agencies that have hired it to help solve cases. The 70 composites the company has posted online “from actual cases where identifications were later made,” it wrote, “represent the most stringent and authentic performance evaluation possible.” Many of those cases “would not have been solved without Snapshot phenotyping,” the company insists, “a fact to which the involved agencies can attest.”
The company sidestepped specific questions about how it is able to predict facial characteristics when other scientists say that isn’t currently possible. Instead, the company said it approaches things differently than the “academic literature,” using what’s known as principal component analysis — a statistical method that essentially sorts and makes sense of complex, noisy data — to fuel its predictions.
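Parabon’s inputs and model are secret, but principal component analysis itself is a standard, well-documented technique. The sketch below runs PCA on synthetic random data standing in for facial measurements; the shapes and numbers are arbitrary and bear no relation to Parabon’s system.

```python
import numpy as np

# Minimal PCA via singular value decomposition, on synthetic data:
# 100 hypothetical "faces", each described by 6 made-up measurements.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))

Xc = X - X.mean(axis=0)             # center each measurement
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:2]                 # top 2 principal directions
scores = Xc @ components.T          # each face summarized as 2 numbers
```

PCA compresses many correlated measurements into a few summary axes; face-shape research applies the same idea to landmark coordinates. Compression alone, however, says nothing about whether those axes can be predicted from DNA.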
The company said its predictions are based on data collected from more than 1,000 mostly young-adult volunteers, 37 percent of whom “self-identify as White.” Although asked twice to do so, Parabon did not supply the total number of volunteers or a detailed breakdown of the population. Instead, it said the entire sample is “diverse and balanced,” including individuals from various ethnic backgrounds, “such as African, Asian, European, Hispanic/Latino, and Middle Eastern,” as well as individuals with mixed heritage. “This diversity helps ensure the robustness and applicability of our predictions.”
Parabon also provided two diagrams that purport to show how its Snapshot system sorts data to predict face shape, using DNA from Greytak and co-founder Paula Armentrout as an example. The first diagram features a star-like array of blank, gray faces, which Greytak said represent the five main face shapes deduced through principal component analysis. To the side is a heat map of those five faces, which supposedly shows which portions of each face are fueling the ultimate prediction. The second is a sparser but similar diagram showing the two women’s faces alongside the face shapes the program predicted.
The company declined to say which DNA markers it uses in this process, saying the specific genetic markers were “chosen based on our proprietary analysis.”
Mark Shriver, a geneticist and professor of anthropology at Penn State University who is a leading expert on phenotyping, reviewed the diagrams and relevant portions of the responses that Parabon provided to The Intercept. He said they are fundamentally flawed. “If you want to study variation within a population, then you need a large sample from just that population,” he said. “If you want to distinguish the two white women like they were doing in their example figure … then you need 1,000 white people.”
Garbage In/Garbage Out
Shriver knows better than most how Parabon’s model works. More than a decade ago, Shriver collaborated with the company on its Pentagon contract. He and a colleague conducted the research that now underpins the Snapshot system, he said, including the information from the 1,000 or so volunteers. It was designed more as proof of concept, and in need of significantly more time, research, and work to transform into a truly predictive model. But Parabon was not interested in doing that work, Shriver said, which led him and his colleague to part ways with the company. “It became clear they just wanted to take it to market immediately,” he said.
In Parabon’s telling, its relationship with Shriver “ended without acrimony” at the conclusion of his subcontract. His “concerns … were never communicated to us,” Paula Armentrout wrote to The Intercept in an email.
Shriver told The Intercept that Parabon’s data set is far too small to support the kind of individualizing predictions the company sells to police. “And this was one of the points I made clear to them from the start,” Shriver said.
“One of the phrases that goes way back in computer science is ‘garbage in, garbage out,’” he said. “The input data is fundamental to any kind of analysis, any kind of conclusions, any kind of predictions you’re going to be able to do from it.” A thousand volunteers from one population could, “perhaps, start to get you some information about what’s going on within that population,” he said. But the sample Parabon is working with was selected to cover a “bunch of populations.” Meaning, the system is primed for drawing general conclusions, but not for making detailed predictions about individuals.
The company pursued an approach that differs from the “methods being explored in academia,” Armentrout reiterated in response to questions about Shriver. “Dr. Shriver was developing his own face prediction methods for casework, although we’re not aware if they have ever been used in a forensic case.”
Armentrout is right that Shriver hasn’t deployed his research forensically in the way Parabon has — with good reason. Though his research now includes data from tens of thousands of people — from both diverse populations and within closed groups, including families — he cautions that there is still more to be done to develop an effective, predictive tool. He won’t put it to work until it has been tested, validated, and peer reviewed.
“It really isn’t science until it’s been looked at by somebody who could understand what you did wrong and what you did right,” Shriver said. “And not just one person, but the whole community has to be able to review what you’ve done if you want to call it science.”
“Otherwise,” he said, “you’re just playing games in the closet.”
Photo: Artur Widak/NurPhoto via AP
“That Could Be the Guy”
Investigators at the Edmonton Police Service in Alberta, Canada, were desperate to solve the violent rape of a young woman in March 2019. The man who attacked her was a stranger and had been bundled up against the cold, leaving her with few details about his appearance. There was no CCTV footage or other witnesses, save for DNA left behind.
Three years later, the department turned to Parabon for help. The company used the DNA to generate a sketch of a nondescript Black man. According to Parabon, the suspect is of East African descent — as well as part South and West African — and likely has dark skin, dark hair, dark eyes, and no freckles. The police department posted the generic image online, including to its social media accounts.
The backlash was fierce. The image did little more than implicate nearly every Black man in Edmonton, critics noted, essentially encouraging racial profiling and the continued over-surveillance of minority and other marginalized communities. “If they’re generating an image of a face of a Black person, like what happened in Canada, and then releasing that image to the general public … then you have a bunch of white people who are looking at Black people around them and thinking, ‘Oh, well, that could be the guy,’ and then they just report on that person,” Jennifer Lynch, general counsel at the Electronic Frontier Foundation, told The Intercept.
“It obviously doesn’t help the investigation in any sense,” Lynch continued, “because it’s not a real image of a person, certainly not the real image of the perpetrator, and it can only harm both the investigation and communities of color, because it puts them at greater risk of arrest for things that they didn’t do.”
Two days after posting the image, the Edmonton police pulled it offline and issued a statement. “The potential that a visual profile can provide far too broad a characterization from within a racialized community and in this case, Edmonton’s Black community, was not something I adequately considered,” Enyinnah Okere, the agency’s chief operating officer, said.
Despite the police department’s actions, Parabon kept the image on its website. The sketch merely reported “what the signals in the DNA” indicated about the perpetrator’s “traits and biogeographic ancestry,” the company told The Intercept. It was “unfortunate the community misunderstood the purpose of the composite and reacted the way it did.” Besides, Parabon added, it had been told an arrest was made in the case and that “our prediction was accurate.”
That was news to the Edmonton police. In emails to The Intercept, spokesperson Sgt. Dan Tames said no suspect has been arrested in the case. He also said that after receiving The Intercept’s inquiry, the agency asked Parabon to remove the image from its website. Nearly two years after it was posted, the image was finally removed. Parabon did not respond to an additional request for comment.
The case is a potent example of the way that FDP, and Parabon’s composites in particular, can perpetuate other harmful practices within the criminal legal system. Faulty eyewitness identifications are a leading cause of wrongful convictions, and science has repeatedly demonstrated that people have a harder time correctly identifying people of a different race.
Research has also shown that introducing a composite image to a witness can reshape their memory, potentially corrupting their initial recollection. “The presentation of a single photograph explicitly to ask about whether or not that person is maybe who the witness saw commit the crime has been found to be really suggestive,” said Dr. Kara Moore, a professor of psychology at the University of Utah.
And if police were to tell a witness that a composite is based on DNA phenotyping, that could be even more suggestive, Moore said. “People find DNA evidence to be really persuasive. So this idea that this facial composite was based on DNA may have some implications for accuracy in the person’s mind,” she said. “People might truly believe this is really what the person who committed the crime looks like.”
“The accuracy of the composite is an interesting component too,” she added. “If it’s wrong, you’re negatively contaminating the eyewitness’s memory and really harming your eyewitness. But even if it’s right, you might be artificially inflating the person’s memory and confidence for the face.”
For Walsh, the potential conflation of ancestry with appearance is another cause for concern. While DNA can offer ancestral information, that information cannot be translated into assumptions about what a person looks like, including about facial features and skin color. “Some individuals can be biased by skin pigmentation to infer ancestry, or ancestry to infer pigmentation,” she wrote in an email. “Unless you actually test for the specific trait … you cannot assume either.” Cautioning that she doesn’t know how Parabon’s system works, she said she worries that using ancestral data to produce an image could cause police to “focus on a particular population without foundation.”
In a January 2024 story in Wired, Greytak seemed to suggest that Parabon’s system does take ancestry into account when making some phenotypic predictions. “What we are predicting is more like — given this person’s sex and ancestry, will they have wider-set eyes than average,” she said. But, she said, “there’s no way you can get individual identifications from that.”
Parabon did not directly address The Intercept’s question about Greytak’s comments to Wired, but insisted that it does not use ancestry categories to inform its morphology predictions. “Categorical divisions are artificial and not reflective of the continuous nature of human genetic variation across the globe,” it said.
Either way, critics say current science does not support Parabon’s individualizing composites. As Rebecca Brown, the policy consultant at Maat Strategies, put it, the automated facial composites are “putting a veneer of science on an already problematic identification procedure.”
4
Behind the Scenes
Parabon markets its Snapshot phenotyping service not as a tool for positive identification, but as a tool to generate investigative leads. The company stressed this in its responses to The Intercept. “It’s crucial to understand that the DNA phenotyping information we provide to agencies is not used for definitive identification or conviction,” it wrote. That is, the phenotyping is only intended for use in developing suspects; from there, law enforcement agencies would try to use traditional forensic DNA testing to see if the suspect can be linked to crime-scene evidence. “Our work does not change this process in any way,” the company insisted.
But using such a program merely to generate leads is itself questionable. Parabon told The Intercept that it does “not have an exact count” of all the law enforcement agencies that have purchased its phenotyping services but said that “hundreds of agencies” have used Snapshot for casework. Of those, the company only posts to its website images that its client police agencies have already made public.
To date, Parabon has published only 70 composites. That means there are potentially hundreds of cases where law enforcement has used a composite behind the scenes to inform an investigation — information that almost certainly has not been, or will not be, made available to the defense in a criminal prosecution, even if it did help to narrow the cops’ focus onto a particular individual.
That’s because the tools police use to generate investigative leads are generally not considered evidence in criminal cases, meaning the state is not required to share information about those tools or the leads they generate with defense lawyers. So, for example, if police use a Snapshot composite to lead them to a suspect who they then charge with a crime, the defense will likely never know unless the police choose to publicize it.
The lack of transparency is alarming to defense attorneys and civil libertarians. “I would say, if there’s a single biggest issue here, it’s that,” said Clare Garvie, a lawyer with the Fourth Amendment Center at the National Association of Criminal Defense Lawyers.
Garvie is an expert on the use of face recognition, another tool whose outputs are often hidden from scrutiny during criminal prosecutions. “The logic behind asserting that it’s an investigative lead only, is, in theory, to protect people from having adverse action taken against them based on unreliable methods,” she said. “But what it has functionally meant is that — in the face recognition context, but very, very likely in other investigative contexts — the defense never finds out that these searches are run.”
Back in 2016, for example, Garvie discovered that police in Pinellas County, Florida, who had been using facial recognition technology since 2001, were using it, “on average, 8,000 times a month.” But at the same time, she noted, the public defender’s office there “had never had a single case in which it had been disclosed.”
Photo: Charles Eckert/Newsday via AP
Ensnared in a Dragnet
Where FDP is concerned, there is at least one current case where police use of Parabon’s work to identify a suspect is being challenged in court. After the 2016 murder of Karina Vetrano, who was killed while jogging near her family home in Queens, the New York Police Department hired Parabon to do phenotyping. The results reportedly came back that the suspect was of African descent, which the NYPD apparently took to mean the person was Black, subsequently undertaking a vast DNA dragnet of hundreds of Black men in the area. Ultimately, the cops landed on a young, developmentally delayed man named Chanel Lewis, who could not be excluded as a source of a trace amount of DNA found at the crime scene.
The fact that the police had used Parabon’s service at all contradicted their public stance about the case — the official line was that a policeman’s hunch and shoe-leather investigation had cracked it — and the prosecution failed to tell Lewis’s defense the whole story. Eventually the fact that the NYPD had employed Parabon was leaked to Lewis’s trial attorneys by a department insider. After a hung jury during his first trial, Lewis was found guilty in 2019. He has appealed his conviction, which his lawyers argue was tainted by the state’s failure to disclose its questionable use of phenotyping to target their client.
At specific issue is whether police violated Lewis’s Fourth Amendment rights when they collected his DNA as part of the dragnet — a question that largely turns on what police had in mind when they approached Lewis. Did they have a reasonable and individualized suspicion that Lewis might be Vetrano’s killer? And, importantly, what was it that made them suspicious of him? Was it solely the phenotyping prediction that the killer was a Black male?
“If you’re getting a phenotyping conclusion that says, it was a Black man, and then you have an investigative strategy where you only take DNA samples from Black men,” then you are using the phenotyping not just to eliminate people, but to target them, said Rhidaya Trivedi, one of Lewis’s attorneys. “Then the scientific integrity of phenotyping enters that Fourth Amendment inquiry: Was it reasonable that they thought [the suspect] was a Black man?”
None of the questions about the scientific integrity of Parabon's phenotyping have been answered in court. "It's a huge question, an unanswered question: Can police use phenotyping to affirmatively generate suspicion?" Trivedi asked. "And if so, under what circumstances? Because I doubt that Chanel's case is the only one where this happened."
In an expert affidavit filed with Lewis's appeal, Shriver, the Penn State geneticist, detailed at length the kinds of questions that law enforcement agencies and courts should be asking of any phenotyping service before it is deployed. That includes whether and how the program has been validated, how any results were explained to police, and whether the distinction between geographic ancestry and any facial trait predictions was "communicated and understood."
Silencing Critics
Jeanna Matthews is something of an evangelist for verification and validation of computer programs used in the criminal legal system. A professor of computer science at Clarkson University, she is also the vice chair of the AI Policy Committee at the Institute of Electrical and Electronics Engineers, known as the IEEE, which has long promulgated standards for ensuring the scientific integrity of computer-based systems.
For Matthews, the issue is straightforward: Forensic tools like Parabon’s phenotyping program need to be independently verified and validated against accepted scientific standards, like those developed by the IEEE, if they’re going to be deployed in the criminal legal system. Put simply, the tools need to be fully reviewed from code to output to determine whether they are built and function as intended.
This kind of detailed, ground-up review is common in mission-critical fields — like with medical devices or air traffic control systems — but it has not been implemented in the criminal legal system. The verification and validation process, known as V&V, “is pretty much ubiquitous when we all agree that it’s important that the software be accurate,” Matthews said. “Why isn’t it done for criminal justice software? We don’t all seem to agree it’s important enough to do it carefully.”
In part, the problem is that many newer forensic tools are developed by private companies that, like Parabon, describe their systems as "proprietary" or invoke trade secrets to keep outsiders from looking closely at the tools they're selling. And that, experts say, should be unacceptable in a system that routinely locks people up or kills them.
“The idea that anyone is hiding behind trade secrets when life and liberty is at stake, we have to ask ourselves some serious questions about what we’re about,” said Rebecca Wexler, a professor at the University of California, Berkeley School of Law, “if we’re sort of like, ‘Nope, that profit motive must transcend this person’s ability to prove their innocence.’”
Parabon, it seems, is not only uninterested in having its phenotyping program externally vetted, but also is not too keen on hearing any criticisms of its work.
In addition to admonishing The Intercept not to quote from the NAS workshop in which its Snapshot system was discussed, Parabon said that it had approached the organization about the workshop and was "pleased to report that after an internal review," the NAS had removed video recordings of the event from its website.
An NAS spokesperson acknowledged that the videos were removed but did not respond to repeated questions about the specific reason. “Concerns were raised about comments made at the workshop,” the spokesperson said in a statement to The Intercept. “Although any statements made at the workshop solely reflect the personal opinions of individual presenters and not the views of all workshop participants or the National Academies, we decided to remove the videos of the workshop from our website.”
Though it may no longer be accessible online, the workshop had a lasting impact on Walsh. She said it helped her to think about her work — and its implications — in new ways. In particular, she more seriously ponders how her work could be misused and how she can counteract that possibility. She wants to be sure her predictions are made based on robust population samples and that her tools are described openly, and accurately, so that anyone can understand what they can and cannot do.
“I think along the lines of, ‘How can I protect people more?’” she said. “‘How can I make sure that there is no way this can be used badly?’”