Received: 3 October 2022 | Revised: 13 April 2023 | Accepted: 19 April 2023
DOI: 10.1002/acm2.14022

JOURNAL OF APPLIED CLINICAL MEDICAL PHYSICS
RADIATION ONCOLOGY PHYSICS

Deep learning-based classification of organs at risk and delineation guideline in pelvic cancer radiation therapy

Michael Lempart1,2 | Jonas Scherman1 | Martin P. Nilsson3 | Christian Jamtheim Gustafsson1,2

1 Radiation Physics, Department of Hematology, Oncology, and Radiation Physics, Skåne University Hospital, Lund, Sweden
2 Department of Translational Medicine, Medical Radiation Physics, Lund University, Malmö, Sweden
3 Department of Hematology, Oncology and Radiation Physics, Skåne University Hospital, Lund, Sweden

Correspondence
Christian Jamtheim Gustafsson, Radiation Physics, Department of Hematology, Oncology, and Radiation Physics, Skåne University Hospital, Klinikgatan 5, 221 85 Lund, Sweden.
Email: christian.k.gustafsson@skane.se

Funding information
Swedish VINNOVA, Grant/Award Number: 2019-04735

Abstract
Deep learning (DL) models for radiation therapy (RT) image segmentation require accurately annotated training data. Multiple organ delineation guidelines exist; however, information on the used guideline is not provided with the delineation. Extraction of training data with coherent guidelines can therefore be challenging. We present a supervised classification method for pelvis structure delineations where bowel cavity, femoral heads, bladder, and rectum data, with two guidelines, were classified. The impact on DL-based segmentation quality using mixed guideline training data was also demonstrated. Bowel cavity was manually delineated on CT images for anal cancer patients (n = 170) according to guidelines Devisetty and RTOG. The DL segmentation quality from using training data with coherent or mixed guidelines was investigated. A supervised 3D squeeze-and-excite SENet-154 model was trained to classify two bowel cavity delineation guidelines. In addition, a pelvis CT dataset with manual delineations from prostate cancer patients (n = 1854) was used where data with an alternative guideline for femoral heads, rectum, and bladder were generated using commercial software. The model was evaluated on internal (n = 200) and external test data (n = 99). By using mixed, compared to coherent, delineation guideline training data mean DICE score decreased 3% units, mean Hausdorff distance (95%) increased 5 mm and mean surface distance (MSD) increased 1 mm. The classification of bowel cavity test data achieved 99.8% unweighted classification accuracy, 99.9% macro average precision, 97.2% macro average recall, and 98.5% macro average F1. Corresponding metrics for the pelvis internal test data were all 99% or above and for the external pelvis test data they were 96.3%, 96.6%, 93.3%, and 94.6%. Impaired segmentation performance was observed for training data with mixed guidelines. The DL delineation classification models achieved excellent results on internal and external test data. This can facilitate automated guideline-specific data extraction while avoiding the need for consistent and correct structure l