The Simulated Hazardous Operational Tasks Laboratory was created in 2008 with the assistance of the Critical Job Tasks Simulation Laboratory Expansion for WSU Sleep & Performance Research Center (Vila, PI) grant from the US DOD Office of Naval Research under the Defense University Research Instrumentation Program (DURIP).
There is a critical lack of scientific evidence about whether deadly force management, accountability, and training practices actually affect police officer performance in deadly force encounters, how strong any such effect is, or whether alternative approaches to managing deadly force could be more effective. The primary cause of this lack is that current tools for evaluating officer-involved shootings are too coarse or ambiguous to adequately measure such highly variable and complex events. There are also substantial differences in how key issues associated with police deadly encounters are conceptualized, even by subject matter experts, in how agencies can or should train for them, and in what officers should (or reasonably can) be held accountable for. As a consequence, trainers and policy makers have generally been limited to subjective or rough assessments of deadly force performance or of how challenging a deadly force situation was.
Our research addressed this problem by using a novel pairing of two well-established research methods, Thurstone scaling and concept mapping. With them, we developed measurement scales that dramatically improve our ability to measure police officer performance in deadly force encounters. We expect that these metrics will make it possible to better evaluate the impact of management and training practices, refine them, and make assessment of accountability more just and reasonable.
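To illustrate how Thurstone scaling can turn expert judgments into an interval-level scale, the following is a minimal sketch of one common variant (Case V, based on paired comparisons). The source does not say which variant of Thurstone scaling was used, so the function name, the clipping value, and the example proportions below are all hypothetical and serve only as an illustration.

```python
from statistics import NormalDist, mean


def thurstone_case_v(prefer_matrix, clip=0.01):
    """Return interval-level scale values for a set of items.

    prefer_matrix[i][j] is the proportion of judges who rated item j
    above item i (for example, as the more difficult encounter).
    Proportions are clipped away from 0 and 1 so the inverse normal
    CDF stays finite; the clip value is an assumption of this sketch.
    """
    norm = NormalDist()
    n_items = len(prefer_matrix)
    scale = []
    for j in range(n_items):
        z_scores = []
        for i in range(n_items):
            # An item compared with itself is treated as a 50/50 split.
            p = 0.5 if i == j else prefer_matrix[i][j]
            p = min(max(p, clip), 1 - clip)
            z_scores.append(norm.inv_cdf(p))
        # Case V scale value: mean z-score of the item's column.
        scale.append(mean(z_scores))
    # Anchor the lowest item at zero; only differences are meaningful.
    lowest = min(scale)
    return [value - lowest for value in scale]


# Hypothetical judgments for three items A, B, and C (not real data).
proportions = [
    [0.5, 0.7, 0.9],
    [0.3, 0.5, 0.8],
    [0.1, 0.2, 0.5],
]
scale_values = thurstone_case_v(proportions)
```

Because the result is an interval scale, the distances between items carry meaning, which is what allows scores built on such scales to be averaged and compared statistically.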
Accelerating Realistic Deadly-Force Judgment and Decision Making Training
Defense Advanced Research Projects Agency (DARPA) through Advanced Brain Monitoring, Inc (Vila, PI)
Johnson et al (2014) Identifying psychophysiological indices of expert versus novice performance in deadly force judgment and decision making. Frontiers in Human Neuroscience 8(512). doi:10.3389/fnhum.2014.00512
Objective: To demonstrate that psychophysiology may have applications for objective assessment of expertise development in deadly force judgment and decision making (DFJDM).
Background: Modern training techniques focus on improving decision-making skills with participative assessment between trainees and subject matter experts primarily through subjective observation. Objective metrics need to be developed. The current proof of concept study explored the potential for psychophysiological metrics in deadly force judgment contexts.
Method: Twenty-four participants (novice, expert) were recruited. All wore a wireless Electroencephalography (EEG) device to collect psychophysiological data during high-fidelity simulated deadly force judgment and decision-making simulations using a modified Glock firearm. Participants were exposed to 27 video scenarios, one-third of which would have justified use of deadly force. Pass/fail was determined by whether the participant used deadly force appropriately.
Results: Experts had a significantly higher pass rate compared to novices (p < 0.05). Multiple metrics were shown to distinguish novices from experts. Hierarchical regression analyses indicate that psychophysiological variables are able to explain 72% of the variability in expert performance, but only 37% in novices. Discriminant function analysis (DFA) using psychophysiological metrics was able to discern between experts and novices with 72.6% accuracy.
Conclusion: While limited due to small sample size, the results suggest that psychophysiology may be developed for use as an objective measure of expertise in DFJDM. Specifically, discriminant function measures may have the potential to objectively identify expert skill acquisition.
Application: Psychophysiological metrics may create a performance model with the potential to optimize simulator-based DFJDM training. These performance models could be used for trainee feedback, and/or by the instructor to assess performance objectively.
Experimental Test of the Impact of Work-Related Fatigue on Police Officer Vehicle Collision Risk
California Commission on Police Officer Standards and Training (CA POST) (Vila, PI)
James, S.M. (2015) Distracted driving impairs police patrol officer driving performance. Policing: An International Journal of Police Strategies & Management 38(3), 505-516.
James, S.M., & Vila, B. (2012) Driven to distraction. The Journal Of California Law Enforcement 46(2), 14-18.
James, S. M., & Vila, B. (2015) Police drowsy driving: predicting fatigue-related performance decay. Policing: An International Journal of Police Strategies & Management 38(3), 517-538.
Impact of Work-Related Fatigue on Deadly Force Judgment and Decision Making Performance and Driving Performance Among Day vs. Night Sleepers
US DOD Office of Naval Research (Vila, PI)
James, L., James, S.M., & Vila, B. (2016) The reverse racism effect: are cops more hesitant to shoot black suspects? Criminology and Public Policy 15(2), 457-479.
James, L., James, S.M., & Vila, B. (2017) Does the “reverse racism effect” withstand the test of police officer fatigue? Policing: An International Journal of Police Strategies & Management 40(2), 184-196. doi:10.1108/PIJPSM-01-2016-0006
Empowering the Strategic Corporal: Training Young Warfighters to be Socially Adept with Strangers in Any Culture
Defense Advanced Research Projects Agency (DARPA)
EXECUTIVE SUMMARY from Final Report
The interactions of young enlisted warfighters with strangers often form the operational center of gravity in counterinsurgency, peacekeeping, nation-building, and humanitarian missions. Consequences from the decisions they make in fast paced, low information encounters with strangers can reverberate across tactical, strategic, and political boundaries. Despite the critical nature of their decisions in the field, however, our “strategic corporals” frequently are teenagers whose frontal lobes have yet to develop fully.
The DARPA-funded research reported here took an important step toward empowering these young warfighters to do something that is vital to the success of our nation’s strategic interests, but which few of them are well equipped to do: interact successfully during ambiguous operational encounters in very foreign lands with people who are very different from themselves.
Although most warfighters receive pre-deployment training that touches on language and cultural skills, or teaches them to better attend to the human terrain as they hunt for foes and watch for threats, that training tends to be focused on the characteristics of the place to which they are being deployed. This isn’t efficient when today’s warfighter may be assigned to Iraq on one tour and Afghanistan on the next, then suddenly be rerouted to Uganda or Indonesia. In a world where young enlisted warfighters may be sent anywhere, there is a critical need to help them learn the fundamental skills needed to adapt rapidly in any culture.
Our highly experienced interdisciplinary research and training development team attacked this critical gap using a novel process that included:
1. Logic model and metric development that used novel research techniques we pioneered to rapidly identify generic causal models for understanding the fundamental dynamics of stranger encounters, and to develop interval-level metrics for measuring both the relative difficulty of those encounters and individual performance in them.
2. Training test instrument development that used rigorous experimentation to create novel instruments and techniques that enable trainers—and our research team—to readily assess trainee baseline capabilities, strengths and weaknesses in ways that are objective, scientifically valid, and reliable; measure the relative impact of each training technique and module as well as overall training program success; and also use non-intrusive ambulatory neurophysiological measurement devices to track and differentiate between trainee engagement, frustration, and cognitive workload whenever possible during training in order to assess the individual training dosage received.
3. Tactical social interaction (TSI) curriculum development that identified novel training techniques that connect with young enlisted warfighters and give them the foundation for figuring out how to interact effectively in a novel environment. The elements of the TSI curriculum also were designed to be modular, scalable, and blendable. Modularity makes it possible to teach the curriculum as a whole or piece by piece. Scalability makes it easy to adapt the curriculum and related tools to presentation in settings ranging from the schoolhouse to the company, platoon or even fire-team level. Blendability—an attribute recommended by the Marine Corps’ Training and Education Command—makes it possible to integrate TSI training modules seamlessly into existing training programs to minimize costs and encourage warfighters to see tactical social interaction as a core part of their fieldcraft.
4. Pilot testing of the TSI curriculum using military and police students who learned from the curriculum, critiqued it, and helped refine both the curriculum and test instruments.
5. Assessment of the TSI curriculum using the training test instruments developed to assess students’ behavioral, cognitive, and affective pre- and post-training improvements.
6. Ongoing coordination and liaison with performers from other technical areas in DARPA’s Strategic Social Interaction Modules (SSIM) program to absorb as much of their knowledge into our work as possible, then transition the results of each phase of our research to technology developers, social scientists, and evaluators.
In short, this research nailed down logic models and metrics that give trainers, evaluators, and future technology developers a unified framework from which they can proceed and helps assure that the work of one complements each of the others. The interval-level performance metrics we created give research teams, trainers, technology developers, and evaluators a common yardstick that makes it possible to use powerful mathematical and statistical techniques, and also is valuable for software and hardware development. These new capabilities will enable advances in the science, systems, and devices used to train young warfighters and build social skills that are invaluable in counterinsurgency, peacekeeping, nation-building, and humanitarian operations.
Military success requires understanding threat capabilities, intentions, and activities, as well as local human, social, cultural, and behavioral factors. Many assume that the skills necessary to do this require social graces and nuanced insights that are beyond the experience or ability of young warfighters. However, our research challenges that assumption. Every social creature from ants to dogs, dolphins, and people is naturally equipped with the potential to learn social skills and nuances—and a drive to do so. As a species, humans tend to excel at reading one another, establishing connections, and finding ways to communicate. Even though performing these mundane tasks can be problematic in a foreign culture, learning to solve these problems in an organized and intuitively reasonable manner is half the battle.
Our research has demonstrated that creative training approaches which focus on conveying the fundamental dynamics of encounters with strangers can be taught in ways that engage warfighters and help them learn to be better at observing what matters, solving problems, and connecting with people who seem different from themselves. Our work has the potential to radically change established training practices and increase the effectiveness of warfighters on the ground in counterinsurgency, peacekeeping, nation building, and humanitarian missions.
TSI delivered in Colorado
Training Partners: I2s
We applied research techniques pioneered by our team to rapidly identify generic causal models for understanding the fundamental dynamics of situations involving mental illness. During this phase, we brought together 20 law enforcement and mental health professional (MHP) subject matter experts in a two-day focus group to identify key indicators for measuring both the relative difficulty of crisis encounters and individual performance in them. Using this information, we created and widely distributed surveys to identify the level of importance that law enforcement officers and MHPs place on difficulty and performance indicators.
WSU conducted controlled laboratory experimental trials designed to test the validity of cognitive test batteries to predict driving performance in high-fidelity driving simulators. The validation study design required twenty-four adult participants to take part in 1) a three-hour screening and training session and 2) a six-hour period of data collection at the Simulated Hazardous Operational Tasks Laboratory at the Sleep and Performance Research Center.
The research analyzed exploratory data from ONR-funded experiments to identify and develop new ways to manage fatigue and understand its impact on warfighters’ safety and health, interactions with non-combatants, and driving.
Fatigue management applies to every current and long-term ONR expeditionary warfare goal and focus area involving human decision making, information collection, communication and reporting, adaptability in complex combat environments, or operational safety and health. Yet relatively little is known about how to manage operational fatigue in the types of counterinsurgency, stabilization and humanitarian missions that dominate contemporary expeditionary and irregular warfare.
Our focus on individual-level effects of fatigue is especially important in these highly distributed global operations, during which small teams must conduct missions in extreme, politically unstable environments while sleep deprived and physically depleted. Fatigue-related degradation of the strategic corporal’s perception, judgment, decision making, performance, and stress management can undermine both tactical and strategic imperatives. Thus, our lack of knowledge about how to manage the individual-level effects of fatigue constitutes a critical need.
1. Provide Navy/Marine Corps with an empirical basis for setting work-hours, scheduling, and equipment-use policies, training drivers to better manage fatigue and distraction load, and improving the structure and presentation of instruments and equipment inside motorized vehicles;
2. Identify individual risk factors associated with performance of operational driving, deadly force judgment and decision making, and tactical social interaction in order to understand the extent to which performance is affected by fatigue-related risk propensity, PTSD symptomology, and mood;
3. Assess the impact of fatigue on warfighters’ tactical social interaction skills and other behaviors that influence non-combatants’ perceptions of their legitimacy, fairness and civility; and
4. Assess the extent to which fatigue-related driving accidents may be reduced by understanding the effects of fatigue and the timing of work shifts on collision risks as well as operational costs such as fuel consumption and maintenance.
The Simulated Hazardous Operational Tasks Laboratory at Washington State University (WSU) Spokane provided Training Development Assistance to the Spokane Police Department (SPD). The aim of this contract was to assist the SPD in enhancing the department’s training capabilities to improve officer safety and wellbeing, better serve the community, and meet the recommendations set forth by the U.S. Department of Justice, Office of Community Oriented Policing Services’ Collaborative Reform Process.
In response to its consent decree, the Seattle Police Department (SPD) implemented an Early Intervention System (EIS) to identify officers who may be exhibiting potentially concerning behaviors. One of the main concerns when implementing an EIS, however, is the selection of “triggers” used to identify such officers. This is problematic because, for example, citizen complaints do not take into account individual officers’ exposure rates to high-risk encounters and situations. As a consequence, officers who are more proactive and/or respond to more calls for service are more likely to receive citizen complaints, regardless of their behavior. The goal of the study was to evaluate the ability of SPD’s EIS to correctly identify officers who are behaving in ways that are truly problematic and warrant investigation. To do this, we used our newly developed, NIJ-funded, interval-level metrics to evaluate the field performance of officers flagged by the EIS. In doing so, we determined how many officers flagged by the EIS were actually exhibiting problem behaviors.
The aim of this contract was to assist the Oregon DPSST in enhancing DPSST’s Basic Police training program via the identification and measurement of decision points and behaviors, in dynamic social encounters, that are most likely to contribute to police legitimacy.
We reviewed the existing online training program, ‘NIOSH training for nurses on shift work and long work hours’ (www.cdc.gov/niosh/docs/2015-115/), and assisted the Program Manager with tailoring the content for law enforcement. This included providing photographic images for the training (taken of local law enforcement officers, who were reimbursed for allowing us to use their images), narration services (sourced in house to avoid cost), and videos for the training (6 “motivational” interviews with police officers and experts sourced locally).
In response to broad concerns about racially motivated policing, implicit bias training is becoming a staple among many police departments. Two modalities for implicit bias training exist: a classroom-based academic presentation on the science of bias, and simulation-based training to teach officers to focus on objective threat indicators over suspect characteristics. The problem, however, is that our knowledge of the effectiveness and persistence of implicit bias training is severely limited. Furthermore, no evidence exists for which implicit bias training modality is superior (from the perspective of the persistence of training-related behavior change over time), or whether both types are required to have an impact on police decision making on the street.
Our subjects were 400 officers, assigned to patrol in diverse metropolitan departments with nationally representative demographics.
Patrol officers were randomly assigned to one of four groups: the first received classroom-based implicit bias training, the second received simulation-based implicit bias training, the third received both types of training, and the fourth served as a no-training control group.
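The four-group random assignment described above can be sketched as follows. This is a minimal illustration only: the source does not describe the actual randomization procedure, so the equal group sizes, the fixed seed, the group labels, and the use of integer officer identifiers are all assumptions.

```python
import random
from collections import Counter

# Hypothetical labels for the four study arms described in the text.
GROUPS = ("classroom", "simulation", "both", "control")


def assign_groups(officer_ids, seed=0):
    """Randomly assign each officer to one of the four groups.

    Officers are shuffled and then dealt out in rotation, which keeps
    group sizes as equal as possible (an assumption of this sketch).
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(officer_ids)
    rng.shuffle(shuffled)
    return {
        officer: GROUPS[index % len(GROUPS)]
        for index, officer in enumerate(shuffled)
    }


# 400 officers, identified here by placeholder integers 0..399.
assignment = assign_groups(range(400))
group_sizes = Counter(assignment.values())
```

Dealing shuffled officers out in rotation gives 100 officers per group for a sample of 400, which keeps the comparison between training modalities balanced.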
Test measures included: 1) Fairness in officer decision making (measured by scoring body camera footage using custom metrics for measuring officer performance during police-citizen encounters); 2) Citizen perceptions of police legitimacy (measured by citizen complaints); 3) Arrestee perceptions of how fairly they were treated by police (measured by survey); and, 4) Police perceptions of training effectiveness (measured by survey and focus groups).
Although ample evidence exists that shift-work is dangerous for patients and nurses, very little is known about optimal shift scheduling. Experiments which quantify the fatigue-related risks associated with shiftwork are desperately needed to inform policy regulating shift scheduling. To meet this need, we studied the impact of shift-accumulated fatigue on the spectrum of daily activities nurses engage in: from patient care to post-shift drive home. The between-groups, repeated-measures quasi-experiment was conducted in the Washington State University (WSU) College of Nursing and Sleep and Performance Research Center. Nurse participants (N=100) reported to WSU for testing on two separate occasions—once immediately following their 3rd consecutive 12-hour shift and once on their 3rd consecutive day (72 hours) off work.
This research provided objective evidence of the impact of shift work on nurses’ patient care-related critical skills and risk of collisions during their post-shift drive home. This resulted in concrete recommendations regarding safe shift-scheduling for day and nightshift nurses. The information we generated may provide the push needed to set national work hour policies for nurses. Given that doctors have had regulations on work hours since 1987, it is unacceptable that nurses still do not have set policies protecting them against safety risks and protecting their patients against preventable medical errors.
The issue of how to measure the impact of situational-, suspect-, and officer-level factors on police actions has long been debated in the policing literature. One promising method is to use interval-level metrics developed via a combined method of concept mapping and Thurstone scaling. Our objective here was to use these metrics to score 667 incident reports from a large (n ∼ 1,500) urban police department. From this process, we explored significant trends in how police officers perform during encounters with the public. We found that officers performed better in “higher stakes” encounters and excelled in vigilance situational assessment as well as use of tactics and adapting tactics. Officers tended to receive the worst scores in routine police–citizen interactions and the highest in crisis encounters. Interpretation and implications of these findings for American policing are discussed.
Police departments around the country are implementing Early Intervention Systems (EIS) to identify officers who may be exhibiting problematic or unprofessional behaviors. The goal of EIS is to minimize officer misconduct and increase officer accountability. To evaluate whether EIS can actually differentiate “problem” from “non-problem” officers, we analyzed the performance of officers from incident reports of police–citizen interactions.
Using a blind scoring method, we evaluated performance from 1,000 police reports: 500 randomly selected reports from EIS-flagged officers (treatment group) and 500 randomly selected reports from non-flagged officers (control group). Six hundred and sixty-seven reports contained relevant performance data. The interval-level metrics used to score officer performance were developed by Vila and colleagues (2016, 2018) to assess performance, expressed as a percentage, across a range of police–citizen encounters.
The overall performance score assigned to officers across all 667 incident reports was 80.46% (SD = 8.75%). When separated into EIS-flagged and non-EIS–flagged incidents, performance scores were 80.63% (SD = 8.58%) and 80.27% (SD = 8.95%), respectively. There was not a statistically significant difference between EIS-flagged and non-EIS–flagged performance.
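As a rough illustration of how a difference between two group means can be checked from summary statistics alone, the sketch below computes a Welch t statistic. The source reports the means and standard deviations but does not state which test was used or how the 667 scored reports were split between groups, so the group sizes below are hypothetical placeholders and the result should not be read as a reproduction of the study's analysis.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    var_a = sd_a ** 2 / n_a  # squared standard error of mean A
    var_b = sd_b ** 2 / n_b  # squared standard error of mean B
    t_stat = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Means and SDs are taken from the text; the group sizes are HYPOTHETICAL
# (the source gives only the combined total of 667 scored reports).
N_FLAGGED = 334
N_NOT_FLAGGED = 333
t_stat, dof = welch_t(80.63, 8.58, N_FLAGGED, 80.27, 8.95, N_NOT_FLAGGED)
```

Under these assumed group sizes the t statistic falls well below the conventional two-sided 5% critical value of about 1.96, which is consistent with the reported absence of a statistically significant difference.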
The EIS evaluated does not appear to be differentiating between problem behavior and non-problem behavior. This suggests that the “thresholds” used to identify problem officers are not working effectively.