
Is Forensic Speaker Recognition The Next "Fingerprint?"

Take a fingerprint... for that matter, go ahead and take a palm print. Now, take a voiceprint. In this day and age, forensic biometric analysis is extraordinarily complex. In a world where we analyze everything from irises to earlobes, what can science tell us about voice?

One increasingly popular form of analysis is forensic speaker recognition (aka voice biometrics or biometric acoustics). Forensic speaker recognition (FSR) has unequivocal potential as a supplementary analytic methodology, with applications in both law enforcement and counterterrorism (for details, see the last section of the 2012 book on FSR Applications to Law Enforcement and Counter-terrorism).

The utility of the FSR process is either one of identification (1:N or N:1) or verification (1:1).
  • 1:N Identification -- Imagine you have a recording of a voice making threats over the phone. The speaker identification process allows you to query a database of acoustic recordings of known suspects for comparison against your target voice to identify more threats he/she might have made.
  • N:1 Identification -- Imagine you have a bunch of voice recordings and you want to know in which of them, if any, a certain speaker participates. 
  • 1:1 Verification -- Imagine you wish to grant someone access to a building or secure location by assessing whether or not they are who they say they are (this aspect of speaker recognition is less applicable to analysis and more applicable to security). 
That said, the CIA, the NSA and the Swiss IDIAP all turned to automatic speaker verification systems in 2003 to analyze the so-called Osama tapes (for details of the approach, see Graphing the Voice of Terror). This case provides an excellent opportunity to note the distinction between automatic speaker recognition performed by an algorithmic machine and aural speaker recognition performed by acoustic experts. 
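The identification and verification modes described above can be sketched in a few lines of Python. This is only a toy illustration, not how any operational system works: it assumes each voice has already been reduced to a short numeric feature vector, and the vectors, the names (`suspect_a`, `call_01`, etc.), the cosine similarity measure and the 0.9 threshold are all made up for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(probe, enrolled):
    """1:N -- rank every enrolled speaker against one unknown voice."""
    scored = [(cosine(probe, vec), name) for name, vec in enrolled.items()]
    return sorted(scored, reverse=True)

def search_recordings(target, recordings, threshold):
    """N:1 -- find the recordings in which one known speaker seems to appear."""
    return [rid for rid, vec in recordings.items()
            if cosine(target, vec) >= threshold]

def verify(probe, claimed, threshold):
    """1:1 -- accept or reject a single claimed identity."""
    return cosine(probe, claimed) >= threshold

# Hypothetical feature vectors (illustrative values only).
enrolled = {
    "suspect_a": [1.0, 0.2, 0.1],
    "suspect_b": [0.1, 1.0, 0.3],
    "suspect_c": [0.2, 0.1, 1.0],
}
recordings = {
    "call_01": [0.95, 0.25, 0.05],
    "call_02": [0.1, 0.9, 0.4],
    "call_03": [1.0, 0.1, 0.2],
}
probe = [0.9, 0.3, 0.1]  # the threatening phone call

best_score, best_name = identify(probe, enrolled)[0]
print("1:N best match:", best_name)
print("N:1 hits:", search_recordings(enrolled["suspect_a"], recordings, 0.9))
print("1:1 accepted:", verify(probe, enrolled["suspect_a"], 0.9))
```

The only real difference between the three modes is what gets compared with what: one probe against many enrolled voices, one known voice against many recordings, or one probe against one claimed identity.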

The cornerstone methodology supporting forensic speaker recognition is voiceprint analysis, or spectrographic analysis, a process that visually displays the acoustic signal of a voice as a function of time (seconds or milliseconds) and frequency (hertz) such that all components are visible (formants, harmonics, fundamental frequency, etc.).
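For readers who want to see what "a function of time and frequency" means in practice, here is a minimal sketch of a spectrogram computed with a short-time Fourier transform in plain Python. Everything in it is an assumption made for illustration: the 8000 Hz sample rate, the 64-sample frames, the Hann window, and the synthetic test signal (a 1000 Hz tone followed by a 2000 Hz tone) standing in for a real voice recording.

```python
import cmath
import math

def spectrogram(samples, frame_len, hop):
    """Return one list of frequency-bin magnitudes per time frame."""
    # Hann window to reduce leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_len)]
        # Direct DFT of the frame, keeping only non-negative frequencies.
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames

SAMPLE_RATE = 8000  # assumed sample rate in Hz
FRAME_LEN = 64      # 8 ms frames, so each bin is 125 Hz wide

# Synthetic signal: 1000 Hz for the first 256 samples, 2000 Hz afterwards.
signal = [math.sin(2 * math.pi * (1000 if n < 256 else 2000) * n / SAMPLE_RATE)
          for n in range(512)]

spec = spectrogram(signal, FRAME_LEN, FRAME_LEN)

# Strongest frequency in each time frame, in Hz.
peaks = [mags.index(max(mags)) * SAMPLE_RATE / FRAME_LEN for mags in spec]
print(peaks)
```

Each row of `spec` is one slice of time and each column is one frequency band. Plotting the magnitudes as shades of gray gives the familiar spectrogram picture. A real voice would show several bands at once (the fundamental, its harmonics and the formants) rather than a single peak per frame.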
(Note:  For those who are more acoustically inclined and would enjoy a well-written read on all things acoustic from military strategy to frog communication, Seth Horowitz's new book The Universal Sense: How Hearing Shapes the Mind comes with my highest recommendation.)
Spectrographic analysis differs from human speaker recognition in that it provides a more quantifiable comparison between two speech signals. Under favorable conditions, both approaches yield strong results: 85 percent identification accuracy (McGehee 1937), 96 percent accuracy (Espy-Wilson 2006), 98 percent accuracy (Clifford 1980), 100 percent accuracy (Bricker and Pruzansky 1966). These approaches, however, do not come without caveats.

Forensic speaker recognition has many limitations and is currently inadmissible in federal court as expert testimony. Bonastre et al. (2003) summarize these limitations quite well:
"The term voiceprint gives the false impression that voice has characteristics that are as unique and reliable as fingerprints... this is absolutely not the case."
The thing about voices is that they are susceptible to a myriad of external factors such as psychological/emotional state, age, health, weather... the list goes on. From an application standpoint, the most prominent of these factors is intentional vocal disguise. There are a number of things people can intentionally do to their voices to drastically reduce the ability of a machine or a human expert to identify them correctly (you would be amazed at how difficult, nearly impossible, it is to identify a whispered voice). Under these conditions, identification accuracy falls to 40 to 52 percent (Thompson 1987), 36 percent (Andruski 2007), 26 percent (Clifford 1980).
Top: Osama bin Laden's "dirty" 2003 telephonic spectrogram
Bottom: Osama bin Laden's "clean" spectrogram
Source: Owl Investigations


More problematic still is communication by telephone. Much of the input law enforcement and national security analysts have to work with comes from telephone wiretaps or calls made from jail cells. Telephones, cellphones in particular, filter the acoustic signal: acoustic information below a certain frequency simply does not get transmitted, and some of the key characteristics for voice identification lie within that lost range.
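A small sketch makes the filtering problem concrete. The numbers here are assumptions for illustration, not measurements from any real phone network: a passband of roughly 300 to 3400 Hz (the nominal range usually quoted for traditional narrowband telephony) and a toy "voice" made of a 125 Hz fundamental plus a weaker 1000 Hz component.

```python
import cmath
import math

SAMPLE_RATE = 8000               # assumed sample rate in Hz
N = 64                           # samples analyzed, so each bin is 125 Hz wide
LOW_HZ, HIGH_HZ = 300.0, 3400.0  # assumed nominal telephone passband

def half_dft(x):
    """DFT of a real signal, keeping only the non-negative frequency bins."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

# Toy "voice": a 125 Hz fundamental plus a weaker 1000 Hz component.
signal = [math.sin(2 * math.pi * 125 * t / SAMPLE_RATE)
          + 0.5 * math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
          for t in range(N)]

spectrum = half_dft(signal)

# Idealized telephone channel: anything outside the passband is dropped.
filtered = [c if LOW_HZ <= k * SAMPLE_RATE / N <= HIGH_HZ else 0j
            for k, c in enumerate(spectrum)]

for hz in (125, 1000):
    k = hz * N // SAMPLE_RATE  # bin index for this frequency
    print(hz, "Hz before:", round(abs(spectrum[k]), 3),
          "after:", round(abs(filtered[k]), 3))
```

In this idealized channel the 125 Hz fundamental disappears entirely while the 1000 Hz component passes through untouched. That is the kind of loss an analyst faces when the only sample available is a phone call.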

While the forensic speaker recognition capability has come a long way since 2003, the consensus among the analytic community remains that it is not a stand-alone methodology but rather a promising supplementary tool. Biometric analysis was also a topic brought to the Intelligence Technology panel of this year's Global Intelligence Forum (2013). Of note were the expanding applicability and increasing capabilities of all biometric technologies.

Thus far, the Spanish Guardia Civil is the only law enforcement agency worldwide to have a fully operational acoustic biometric system (called SAIVOX, the Automatic System for the Identification of Voices). In the Spanish booking process, just as we take fingerprints, they take voice samples that they then contribute to a corpus of over 3,500 samples linked with well-known criminals and certain types of crime.

In 2011, the FBI commissioned NIST to launch a program on "investigatory voice biometrics." The goal of the committee is to develop best practices and collection standards to launch an operational voice biometric system, modeled on the Spanish system, with corpora robust enough to serve as a useful tool in ongoing investigations. (This is an ongoing project and you can read the full report here.)

FSR is not a perfect methodology, but one that can add substantial value on a case-by-case basis. It is of high interest to the US national security and law enforcement analytic communities.

Additional reading:
Andruski, J., Brugnone, N., & Meyers, A. (2007). Identifying disguised voices through speakers' vocal pitches and formants. 153rd ASA meeting.
Bonastre, J. F., Bimbot, F., Boe, L. J., Campbell, J. P., Reynolds, D. A., & Magrin-Chagnolleau, I. (2003). Person authentication by voice: A need for caution. Eurospeech 2003.
Bricker, P. D., & Pruzansky, S. (1966). Effects of stimulus content and duration on talker identification. The Journal of the Acoustical Society of America, 40, 1441-1449.
Clifford, B. R. (1980). Voice identification by human listeners: On earwitness reliability. Law and human behavior, 4(4), 373-394.
Espy-Wilson, C. Y., Manocha, S., & Vishnubhotla, S. (2006). A new set of features for text-independent speaker identification.
McGehee, F. (1937). The reliability of the identification of the human voice. Journal of general psychology, 31, 53-65.
Parmar, P. (2012). Voice fingerprinting: A very important tool against crime. J Indian academy forensic med., 34(1), 70-73. doi: 0971-0973
