Technical Skills Assessment Toolbox
Conclusions
This study provides an assessment toolbox for common surgical skills and procedures, with appraisals of the validity evidence for the instruments in the toolbox. Our review shows that few authors have used the contemporary unitary concept of validity to develop and appraise their assessment tools. Given the current level of validity evidence, we recommend that most assessment tools be used only for instructional purposes and/or formative assessment. Before these tools are applied to summative and high-stakes assessments, such as medical certification and recertification, extensive validation must first take place. As part of this process, researchers should determine the short- and long-term impact on trainees of the results these tools generate. These outcomes can be used to establish evidence-based pass–fail scores. In this context, generalizability theory can help researchers determine how many assessments are needed to obtain an accurate measure of ability. Using this rigorous method, researchers will be able to establish different pass–fail cut points for various levels of performance in different contexts. Ideally, such cut points can serve as a complementary tool for deciding whether and when trainees who have practiced in a simulation-based curriculum are ready for the operating room (OR), whether they should pass a rotation, or whether they should be promoted at the end of a postgraduate year. Eventually, such thresholds can be used as part of a multisource competency-based assessment for graduation of surgeons from residency programs. For example, Fundamentals of Laparoscopic Surgery (FLS) certification is one of the American Board of Surgery (ABS) requirements for board eligibility.
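To make this concrete, the sketch below shows the kind of decision study (D-study) that generalizability theory supports: it estimates variance components from a fully crossed trainee-by-rater design and projects the generalizability (G) and dependability (Phi) coefficients for different numbers of raters. The design, function names, and simulated scores are illustrative assumptions on our part, not data or methods drawn from the studies reviewed.

# A minimal D-study sketch, assuming a fully crossed trainee x rater
# design with one score per cell. All names and the simulated scores
# are illustrative assumptions, not data from the review.
import numpy as np

def d_study(scores, planned_raters):
    """scores: (n_trainees, n_raters) matrix of ratings."""
    n_p, n_r = scores.shape
    grand = scores.mean()
    p_means = scores.mean(axis=1)   # trainee means
    r_means = scores.mean(axis=0)   # rater means

    # Mean squares for the two-way crossed design (one observation per cell)
    ms_p = n_r * np.sum((p_means - grand) ** 2) / (n_p - 1)
    ms_r = n_p * np.sum((r_means - grand) ** 2) / (n_r - 1)
    resid = scores - p_means[:, None] - r_means[None, :] + grand
    ms_pr = np.sum(resid ** 2) / ((n_p - 1) * (n_r - 1))

    # ANOVA estimates of the variance components (floored at zero)
    var_pr = ms_pr                            # trainee-by-rater error
    var_p = max((ms_p - ms_pr) / n_r, 0.0)    # true trainee variance
    var_r = max((ms_r - ms_pr) / n_p, 0.0)    # rater stringency

    # Project reliability for each planned number of raters
    for n in planned_raters:
        g = var_p / (var_p + var_pr / n)              # relative decisions
        phi = var_p / (var_p + (var_r + var_pr) / n)  # absolute decisions
        print(f"{n} rater(s): G = {g:.2f}, Phi = {phi:.2f}")

# Hypothetical example: 8 trainees each scored by 3 raters
rng = np.random.default_rng(0)
true_ability = rng.normal(20.0, 4.0, size=(8, 1))
scores = true_ability + rng.normal(0.0, 2.0, size=(8, 3))
d_study(scores, range(1, 6))

In practice, one would raise the planned number of raters (or stations, or occasions) until the dependability coefficient clears a prespecified threshold; this is how generalizability theory answers the "how many assessments?" question for absolute, pass–fail decisions.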
As we move toward a competency-based training and assessment model, future studies of assessment instruments should provide evidence for all sources of validity (especially consequences), address the lack of data on the generalizability of current assessments, and determine the appropriateness of these methods and instruments for formative and summative assessment. These studies should focus on filling gaps by providing further validity evidence for existing assessment tools. The results of a decade of research on developing assessment tools provide a platform and foundation for future work. As demonstrated in this study, some of these tools have flaws in their design and conceptualization. Future studies should improve existing tools and take advantage of work already done, instead of "reinventing the wheel" by creating new tools where suitable ones already exist.
Although the feasibility of implementing these assessment tools in a given training program is beyond the scope of this article, we would like to emphasize this crucial consideration. Training programs should take into account barriers such as faculty time constraints, residents' duty-hour regulations, and faculty and residents' lack of familiarity with these tools. For successful implementation, program directors should select tools that are easy to use and for which well-established sources of validity evidence exist in the literature. As with evidence-based medicine, surgical educators, program directors, and other teaching faculty should embrace these tools only after significant, positive educational outcomes have been demonstrated. These deliberations, combined with faculty development, would most likely result in high rates of faculty compliance in using these tools.