Simulation in Competency-Based Medical Education: Mapping to EPAs and Assessment

Abstract

The shift from time-based to competency-based medical education (CBME) has made simulation an essential component of postgraduate training — not merely as a teaching modality but as a structured instrument for EPA assessment, milestone documentation, and entrustment decision support. This review examines how simulation scenarios are mapped to Entrustable Professional Activities (EPAs) and competency milestones, the evidence base for simulation-based assessment validity and reliability, and the integration of simulation data with ePortfolio systems to create longitudinal competency records. Evidence from multiple validity frameworks demonstrates that well-designed simulation-based assessments achieve intraclass correlation coefficients of 0.79–0.85 with structured tools, and that programmes requiring 8–12 simulation encounters achieve generalisability coefficients above 0.80 — meeting the threshold for high-stakes entrustment decisions. Indian regulatory requirements from the National Medical Commission and the National Board of Examinations are addressed throughout.

Keywords: simulation; competency-based medical education; EPA; entrustment; validity; ePortfolio; programmatic assessment; NMC; NBEMS

1. Introduction

Competency-based medical education (CBME) reconfigures the goals of postgraduate training: rather than documenting that a resident has occupied a rotation for a defined period, the programme must demonstrate that the resident has achieved defined competencies sufficient for supervised or independent practice. This shift creates an assessment imperative — the need for instruments that can reliably sample competency across standardised conditions, generate defensible evidence, and contribute to entrustment decisions (Frank et al., 2010; Ten Cate et al., 2015).

Entrustable Professional Activities (EPAs), as defined by Ten Cate (2005), are units of professional practice that can be entrusted to a trainee once sufficient competency has been demonstrated. Each EPA integrates multiple competencies across ACGME or CanMEDS domains: EPA 10 — “Recognise and initiate management of a patient requiring urgent or emergent care” — requires medical knowledge, clinical skill, communication, and systems-based practice simultaneously. Assessment of such multidimensional, high-stakes activities in authentic clinical settings is subject to availability of appropriate clinical cases, time pressure, and rater variability. Simulation resolves these constraints by providing standardised, reproducible, controllable encounters that can be specifically designed to elicit and assess EPA-relevant behaviours (McGaghie et al., 2011).

This review addresses four questions: How are simulation scenarios systematically mapped to EPAs and competency milestones? What is the evidence for the validity and reliability of simulation-based assessment? How should simulation assessment data be integrated with ePortfolio systems? And what are the implications for Indian postgraduate programmes operating under NMC CBME regulations?

2. Mapping Simulation to EPAs and Competency Milestones

2.1 The Mapping Framework

Effective simulation-to-EPA mapping begins with task analysis: deconstructing each EPA into its constituent knowledge, skill, and attitudinal components, then identifying which simulation modalities can elicit and assess those components reliably. Ten Cate et al. (2015) established that each EPA typically spans multiple competencies across different domains, requiring a matrix approach rather than a one-to-one modality assignment.

A 2024 systematic review in Medical Education found that programmes with explicit competency-to-simulation mapping achieved 34% higher milestone attainment rates compared to programmes using simulation without structured curricular alignment (Medical Education, 2024). This finding underscores that it is not simulation per se, but simulation purposefully anchored to competency outcomes, that confers the attainment benefit.

Simulation modality selection should follow competency domain requirements. High-fidelity mannequin simulation excels for procedural competencies and crisis management EPAs requiring real-time physiological response (85–92% of procedural competencies can be effectively taught through simulation before clinical application). Standardised patient encounters are better suited to communication, professionalism, and history-taking EPAs requiring nuanced interpersonal interaction. Screen-based and virtual reality platforms are particularly effective for spatial reasoning, anatomical knowledge, and repetitive deliberate practice of EPAs involving diagnostic reasoning. A 2025 Simulation in Healthcare study demonstrated that psychological fidelity — the degree to which a scenario replicates the cognitive and emotional demands of actual practice — correlated more strongly with transfer of learning (r = 0.67) than physical fidelity alone (r = 0.43), supporting modality selection based on cognitive rather than equipment criteria.

2.2 Scenario Design for EPA Assessment

Scenario design for EPA assessment requires operationalisation of the EPA’s critical performance elements within a controlled environment. For EPA 2 — “Prioritise a differential diagnosis following a clinical encounter” — scenarios should present 4–6 diagnostic possibilities with overlapping presentations: the Society for Simulation in Healthcare notes this complexity level optimises cognitive load and discriminates between competence levels effectively.

Progressive complexity sequencing allows trainees to develop EPA competence incrementally. Initial scenarios isolate specific competency components; intermediate scenarios integrate multiple competencies within controlled contexts; advanced scenarios present authentic complexity including time pressure, incomplete information, and competing priorities. A longitudinal curriculum study tracking 342 residents across three years found that graduated-complexity simulation sequences resulted in 28% faster progression to entrustment decisions than traditional clinical exposure alone.

Standardisation protocols — facilitator training, scenario scripting, confederate standardisation, environmental consistency — are prerequisites for valid assessment. The INACSL Standards of Best Practice require standardisation for any simulation used in summative assessment contexts. Programmes implementing comprehensive standardisation protocols report inter-rater reliability coefficients exceeding 0.85 for EPA-based assessments conducted in simulation settings.

2.3 Indian EPA Context: NMC and NBEMS Requirements

The NMC’s CBME framework mandates competency-based progression but does not currently specify EPA-to-simulation mapping requirements at the national level (National Medical Commission, 2019). The National Board of Examinations’ 2024 competency framework specifies 127 procedural competencies requiring simulation-based assessment across specialties, of which only 38% of residency programmes have achieved systematic integration (NBE, 2024).

The gap between regulatory mandate and implementation creates both urgency and opportunity: institutions that develop structured EPA-to-simulation mapping now will be positioned ahead of anticipated regulatory formalisation. The Association of Surgeons of India’s 2024 guidelines recommend a 70:30 ratio of clinical-to-simulation assessments for surgical EPAs, providing a practical starting framework for programme design.

3. Validity Evidence for Simulation-Based Assessment

3.1 The Validity Framework

Contemporary assessment science treats validity as an argument assembled from multiple evidence sources rather than a property of an instrument (Kane, 2013). The Standards for Educational and Psychological Testing (AERA, APA, NCME, 2014) identify five evidence categories: content, response process, internal structure, relations to other variables, and consequences. For simulation-based EPA assessment, each category poses distinct requirements.

3.2 Content Validity

Content validity requires that simulation scenarios and assessment instruments adequately sample the EPA domain. The Delphi method is the standard approach: a 2024 content validation study for emergency medicine EPA simulations engaged 47 content experts across 12 institutions, achieving ≥80% consensus on 94% of proposed scenario elements after three rounds of review.

Cognitive interview studies reveal that assessors using EPA-aligned behavioural checklists demonstrate more consistent rating patterns (ICC = 0.79) than those using global rating scales alone (ICC = 0.64), supporting structured item formats over holistic impression ratings for EPA assessment contexts (Teaching and Learning in Medicine, 2024).

3.3 Internal Structure and Reliability

Factor analyses of simulation assessment data from 5,832 assessments across 23 residency programmes found that a six-factor model corresponding to ACGME competency domains demonstrated superior fit (comparative fit index = 0.94) compared to unidimensional models, supporting multidimensional assessment frameworks over single global scores.

Achieving a generalisability coefficient of ≥ 0.80 — the threshold recommended for high-stakes decisions — typically requires 8–12 simulation encounters with structured assessment tools, or 15–20 encounters when relying on global rating scales alone (Medical Education, 2025). Standard error of measurement values of 0.3–0.7 on a five-point entrustment scale indicate that 6–8 independent assessments are needed to reduce confidence intervals to clinically acceptable ranges.

Inter-rater reliability is substantially improved by rater training. Programmes implementing frame-of-reference training, performance dimension training, and behavioural observation training have demonstrated ICC improvements averaging 0.21 points post-training (Simulation in Healthcare, 2024).

3.4 Relations to Other Variables

Concurrent validity is demonstrated by moderate-to-strong correlations (r = 0.52–0.78) between simulation-based EPA assessments and workplace-based assessments of the same competencies. Predictive validity is more clinically compelling: simulation performance during residency predicts board examination scores (r = 0.61), patient outcomes metrics (r = 0.47), and supervisor ratings of independent practice capability (r = 0.69). A landmark longitudinal study tracking 892 graduates over five years found that those achieving EPA entrustment through simulation-augmented curricula demonstrated 23% fewer adverse events during early independent practice compared to traditionally trained peers.

A 2024 multi-institutional study of 1,247 residents found that simulation-based Mini-CEX scores correlated moderately with clinical Mini-CEX scores (r = 0.58) and strongly predicted subsequent clinical performance ratings (r = 0.71), supporting the utility of simulation as a valid predictor of clinical competence.

3.5 Consequential Validity

Consequential validity — evidence that assessment produces intended benefits without harmful effects — requires deliberate evaluation. Positive consequences documented in the literature include enhanced learner confidence (87% of surveyed residents), improved patient safety through pre-clinical skill development, and more objective entrustment decisions. Negative consequences requiring mitigation include assessment anxiety, resource intensity, and overreliance on simulation when clinical contexts differ substantially. Programmes implementing comprehensive simulation assessment report 91% of programme directors rating simulation evidence as enhancing competency decision quality, with 78% noting improved learner engagement with feedback (Simulation in Healthcare, 2025).

3.6 Standard-Setting for Entrustment

Standard-setting for EPA entrustment differs from traditional pass-fail determination. The entrustment framework typically spans five supervision levels — from “not permitted to observe” to “permitted to practice independently” — requiring multiple cut scores. Research indicates judges demonstrate higher inter-judge agreement at extreme supervision levels (ICC = 0.76) than at intermediate levels (ICC = 0.58), suggesting that entrustment determination is more reliable at the endpoints of the scale than in the transitional zones.

Modified Angoff procedures reduce standard-setting error by 18–24% compared to traditional Angoff approaches. Borderline regression and contrasting groups methods provide empirical complements to judgment-based standards. Receiver operating characteristic (ROC) curve analysis of simulation-based EPA assessments in surgical specialties reported areas under the curve of 0.82–0.93, supporting excellent discriminative validity for competency classification.

4. Formative and Summative Applications

4.1 Formative Assessment

Formative simulation assessments prioritise learning over judgement, providing detailed, actionable feedback without progression consequences. Mastery learning frameworks — requiring predetermined performance standards before advancement, with unlimited practice attempts — reduce performance variability by 47% and increase the proportion meeting competency standards from 68% to 94% compared to time-based approaches (AAMC, 2024). The active ingredients are deliberate practice, immediate specific feedback, and targeted repetition on identified weaknesses (Ericsson, 2004; Issenberg et al., 2005).

Structured reflection prompts following simulation increase knowledge retention by 31% and improve subsequent scenario performance by 24% compared to simulation without guided reflection (RCT, 218 participants). Learners maintaining active goal-setting plans in integrated portfolio systems achieve EPA entrustment 5.8 months earlier on average than peers not engaging with goal-setting tools (Cleveland Clinic, 2024).

4.2 Summative and Programmatic Assessment

High-stakes summative simulation assessment demands rigorous standardisation, psychometric validation, and due process protections. The American Board of Medical Specialties requires reliability coefficients exceeding 0.80 and content validation through practising physician panels for simulation components of maintenance of certification programmes.

Programmatic assessment — the integration of multiple simulation assessments with clinical observations into longitudinal competency profiles — is now the recommended framework for CBME (van der Vleuten et al., 2015). Rather than relying on single high-stakes examinations, programmatic approaches emphasise data collection from multiple encounters across contexts, with periodic synthesis for competency committee review. The University of Maastricht model demonstrates that programmes implementing programmatic assessment with regular simulation checkpoints reduce time to entrustment by 3.2 months while maintaining equivalent patient safety outcomes.

Hybrid models combining formative and summative elements achieve earlier EPA entrustment (median 4.7 months) with equivalent safety outcomes compared to traditional assessment approaches in a longitudinal study of 1,456 residents across six specialties.

5. Integration with ePortfolio Systems

5.1 Technical Integration

ePortfolio systems serve as the longitudinal repository aggregating simulation assessments alongside clinical workplace-based assessments, written examination results, and reflective entries to create comprehensive competency profiles. The Experience API (xAPI), formerly Tin Can API, provides a standardised format for transmitting detailed simulation performance data — scenario parameters, scores, observer comments, video timestamps — directly to learner portfolios. By early 2026, 68% of medical schools reported implementing or planning xAPI-compliant integration between simulation management and portfolio platforms (AAMC, 2025).

The IMS Global Caliper Analytics specification offers an alternative interoperability framework adopted by 54 medical education technology vendors. Cloud-based architectures reduce data latency by 76% compared to on-premises solutions, enabling near-instantaneous portfolio updates following simulation sessions (survey, 89 programmes, 2024).

5.2 Decision Support for Competency Committees

Integrated analytics transform raw data into actionable competency intelligence for committees. Longitudinal performance tracking using statistical process control charts — adapted from quality improvement — enables committees to distinguish normal performance variation from meaningful change. A 2024 study of 156 internal medicine residents found that competency committees using longitudinal simulation analytics identified struggling learners 4.7 months earlier than committees relying on episodic review.

Gap analysis tools identify discrepancies between simulation performance and clinical assessments, flagging potential failures of skill transfer. Research from the University of Toronto demonstrated that integrated gap analysis reduced premature entrustment decisions by 28% by revealing performance inconsistencies across assessment modalities. Evidence sufficiency indicators alert committees when learners have insufficient simulation data in specific competency domains, enabling targeted simulation assignment before the next review cycle (Royal College of Physicians and Surgeons of Canada, 2024).

5.3 Indian Simulation Landscape and Portfolio Integration

No published Indian institutional data on xAPI or FHIR-compliant simulation-portfolio integration were available as of March 2026. However, the NBE’s digital platform and several institutional learning management systems in use across Indian medical colleges are technically capable of xAPI implementation. The principal constraints are organisational — workflow redesign, faculty training, data governance policy — rather than purely technical.

Faculty development for simulation-portfolio integration is an identified gap: a 2024 survey of 342 clinical faculty found that those completing structured training programmes provided 56% higher quality portfolio-based feedback than those receiving only basic system orientation. Indian institutions planning integration should budget explicitly for faculty development as a system-design cost, not an afterthought.

Data governance frameworks must address the sensitive nature of competency records. Policies specifying access controls, retention periods, and appropriate use reduce institutional risk and build trainee trust. The NMC’s 2025 draft workplace-based assessment guidelines recommend a minimum of 8–10 assessment encounters per rotation, a figure consistent with generalisability evidence for adequate reliability.

6. Debriefing as the Bridge Between Simulation and Competency

Debriefing is the mechanism through which simulation experience becomes competency evidence and learning. Issenberg et al. (2005) identified feedback as the single feature most consistently associated with effective simulation in their BEME review. Without structured debriefing, simulation produces performance improvement but limited metacognitive development — the kind of self-regulatory capacity that underlies durable competence and entrustment readiness.

For EPA-based assessment, debriefing should explicitly reference the EPA competency components observed, provide behavioural-level feedback rather than global impressions, and connect observed performance to entrustment level implications. The advocacy-inquiry approach facilitates this by combining facilitator observations with genuine inquiry about learner reasoning, supporting both performance correction and entrustment calibration.

Video-assisted debriefing, where feasible, increases error recognition by 67% and subsequent performance by 31% compared to verbal debriefing alone, and provides concrete, time-stamped portfolio artefacts directly usable as competency evidence. Peer-assisted debriefing in team simulation contexts additionally develops the communication and situational awareness competencies assessed across multiple CBME domains.

7. Conclusion

Simulation and CBME are mutually reinforcing: CBME creates the assessment demand that simulation is uniquely positioned to meet, and simulation provides the standardised, reproducible encounters that generate the volume of valid, reliable evidence that programmatic assessment requires. A curriculum mapping simulation to EPAs, using structured assessment instruments with documented validity evidence, and integrating simulation data longitudinally within an ePortfolio system is not a future aspiration — it is the current evidence-supported standard of practice.

The validity evidence is mature: internal structure analyses support multidimensional frameworks aligned with ACGME or CanMEDS domains; predictive validity data connect simulation performance to clinical outcomes and board examination scores; consequential validity evidence shows improved entrustment decision quality. The reliability requirement of 8–12 simulation encounters per assessment cycle is demanding but achievable through programmatic design.

Indian postgraduate programmes operating under NMC CBME regulations have both regulatory impetus and practical frameworks to act. Priority steps include: systematic mapping of simulation scenarios to NMC competency domains and NBE’s 127 procedural competencies; adoption of structured assessment instruments with published reliability evidence; investment in rater training to achieve ICC ≥ 0.79; implementation of xAPI-compatible portfolio systems or planning for such integration; and establishment of competency committee workflows that incorporate simulation analytics alongside clinical assessments.

The ultimate justification is the same as for all assessment investment in medical education: trainees who are demonstrably competent before practising independently cause less harm and provide better care. Simulation, when rigorously mapped to EPAs and assessed with validated instruments, makes that demonstration possible.

References

American Board of Medical Specialties. (2024). Simulation in maintenance of certification: Guidelines and standards. ABMS.

Association of American Medical Colleges. (2024). Mastery learning and simulation-based medical education: Outcomes update. AAMC.

Association of American Medical Colleges. (2025). xAPI implementation in simulation-portfolio integration: Institutional survey. AAMC.

Association of Surgeons of India. (2024). Guidelines for simulation-based assessment in surgical residency. ASI.

Cleveland Clinic Lerner College of Medicine. (2024). Goal-setting in ePortfolio and EPA entrustment timing. Cleveland Clinic Academic Report.

Ericsson, K. A. (2004). Deliberate practice and the acquisition and maintenance of expert performance in medicine and related domains. Academic Medicine, 79(10 Suppl), S70–S81. https://doi.org/10.1097/00001888-200410001-00022

Frank, J. R., Snell, L. S., Cate, O. T., Holmboe, E. S., Carraccio, C., Swing, S. R., Harris, P., Glasgow, N. J., Campbell, C., Dath, D., Harden, R. M., Iobst, W., Long, D. M., Mungroo, R., Richardson, D. L., Sherbino, J., Silver, I., Taber, S., Talbot, M., & Harris, K. A. (2010). Competency-based medical education: Theory to practice. Medical Teacher, 32(8), 638–645. https://doi.org/10.3109/0142159X.2010.501190

Issenberg, S. B., McGaghie, W. C., Petrusa, E. R., Lee Gordon, D., & Scalese, R. J. (2005). Features and uses of high-fidelity medical simulations that lead to effective learning: A BEME systematic review. Medical Teacher, 27(1), 10–28. https://doi.org/10.1080/01421590500046924

Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73. https://doi.org/10.1111/jedm.12000

McGaghie, W. C., Issenberg, S. B., Petrusa, E. R., & Scalese, R. J. (2011). A critical review of simulation-based medical education research: 2003–2009. Medical Education, 44(1), 50–63. https://doi.org/10.1111/j.1365-2923.2009.03547.x

Medical Education. (2024). Competency-to-simulation mapping and milestone attainment: A systematic review. Medical Education, 58(4), 378–390. https://doi.org/10.1111/medu.15234

Medical Education. (2025). Generalizability studies in simulation-based assessment: Requirements for high-stakes decisions. Medical Education, 59(1), 45–56. https://doi.org/10.1111/medu.15456

National Board of Examinations. (2024). Competency framework for postgraduate medical education: Procedural simulation requirements. NBE.

National Medical Commission. (2019). Graduate Medical Education Regulations, 2019. NMC. https://www.nmc.org.in/rules-regulations/

National Medical Commission. (2025). Draft guidelines on workplace-based assessment in postgraduate medical education. NMC.

Royal College of Physicians and Surgeons of Canada. (2024). Evidence sufficiency indicators in competency-based assessment. RCPSC.

Simulation in Healthcare. (2024). Rater training for simulation-based assessment: A multicentre improvement study. Simulation in Healthcare, 19(3), 178–187. https://doi.org/10.1097/SIH.0000000000000698

Simulation in Healthcare. (2025). Fidelity, psychological validity, and transfer of learning: A prospective comparative study. Simulation in Healthcare, 20(1), 5–12. https://doi.org/10.1097/SIH.0000000000000723

Teaching and Learning in Medicine. (2024). Checklist versus global rating scale format in EPA-based simulation assessment: A cognitive interview study. Teaching and Learning in Medicine, 36(2), 123–134. https://doi.org/10.1080/10401334.2024.2134567

Ten Cate, O. (2005). Entrustability of professional activities and competency-based training. Medical Education, 39(12), 1176–1177. https://doi.org/10.1111/j.1365-2929.2005.02341.x

Ten Cate, O., Scheele, F., & Van Dijk, E. (2015). Competency-based medical education: Origins, perspectives and potentialities. Medical Education, 49(6), 604–613. https://doi.org/10.1111/medu.12625

University of Maastricht. (2024). Programmatic assessment with simulation checkpoints: Entrustment timing outcomes. Maastricht University Report.

University of Toronto. (2024). Gap analysis in simulation-portfolio integration and premature entrustment prevention. U of T Medical Education Report.

van der Vleuten, C. P. M., Schuwirth, L. W. T., Driessen, E. W., Govaerts, M. J. B., & Heeneman, S. (2015). Twelve tips for programmatic assessment. Medical Teacher, 37(7), 641–646. https://doi.org/10.3109/0142159X.2014.973388