Towards Better Cephalometric Landmark Detection with Diffusion Data Generation

Dongqian Guo*1, Wencheng Han*1, Pang Lyu2, Yuxi Zhou3, Jianbing Shen†1
1University of Macau, Macau, 2Zhongshan Hospital, Fudan University, Shanghai, China, 3Stomatology Hospital of Guangzhou Medical University, China
* Equal contribution, † Corresponding author

Abstract

Cephalometric landmark detection is essential for orthodontic diagnosis and treatment planning. However, the scarcity of collected samples and the extensive effort required for manual annotation have severely limited the availability of diverse datasets, which in turn restricts the effectiveness of deep learning-based detection methods, particularly those built on large-scale vision models. To address these challenges, we develop a data generation method that produces diverse cephalometric X-ray images along with corresponding annotations, without any human intervention. Our approach first constructs new cephalometric landmark annotations from anatomical priors, and then employs a diffusion-based generator to synthesize realistic X-ray images that closely match these annotations. To precisely control the attributes of the generated samples, we introduce a novel prompt cephalometric X-ray image dataset that pairs real cephalometric X-ray images with detailed medical text prompts describing them. Leveraging these prompts, our method controls the styles and attributes of the generated images. Enabled by the large and diverse generated data, we bring large-scale vision detection models into the cephalometric landmark detection task to improve accuracy. Experimental results demonstrate that training with the generated data substantially improves performance: compared to methods that do not use it, our approach raises the Success Detection Rate (SDR) by 6.5%, reaching a notable 82.2%.

Method

Overview of the AICG Framework
Figure 1. Overview of the Anatomy-Informed Cephalometric X-ray Generation (AICG) framework. The pipeline consists of three stages: Condition Generation (highlighted in blue), Image Generation (marked in yellow), and Landmark Detection (outlined in purple), running from condition preparation through image synthesis to landmark detection.
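To make the three stages concrete, the following Python sketch shows how they could compose end to end. It is a minimal illustration under stated assumptions: the function names, the Gaussian shape prior standing in for the anatomical prior, and the 512×512 canvas are placeholders, not the released implementation.

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_landmarks(mean_shape, cov):
        # Stage 1 -- Condition Generation: perturb a 38-point anatomical template
        # with a shape prior to obtain a new, plausible landmark configuration.
        flat = rng.multivariate_normal(mean_shape.ravel(), cov)
        return flat.reshape(-1, 2)

    def generate_xray(condition_map, prompt):
        # Stage 2 -- Image Generation: a diffusion model conditioned on the rendered
        # landmark/topology map and a medical text prompt synthesizes a realistic
        # cephalometric X-ray. Stubbed out here; see the prompt-controlled sketch
        # in the Visualization section.
        return np.zeros((512, 512), dtype=np.uint8)  # placeholder image

    def train_detector(pairs):
        # Stage 3 -- Landmark Detection: generated (image, annotation) pairs are
        # mixed with real data to train a large-scale vision detection model.
        raise NotImplementedError

    mean_shape = rng.uniform(64, 448, size=(38, 2))  # stand-in template positions
    cov = 4.0 * np.eye(38 * 2)                       # small per-coordinate variance
    landmarks = sample_landmarks(mean_shape, cov)    # the annotation comes for free
    image = generate_xray(condition_map=None, prompt="lateral cephalogram, adult")
    print(landmarks.shape, image.shape)              # (38, 2) (512, 512)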
Anatomy-Informed Topology (AIT) Module
Figure 2. The Anatomy-Informed Topology (AIT) module. (a) The 38 cephalometric landmarks with their positions and names. (b) The AIT module's workflow: a graph is built over critical landmarks and Distance-Based Coloring is applied to represent anatomical relationships intuitively.
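The sketch below illustrates the idea of Distance-Based Coloring on a rendered condition map: landmarks are connected into a graph and each edge is colored by its length. The toy chain of edges and the blue-to-red color ramp are illustrative assumptions; the actual graph follows the anatomy shown in Figure 2.

    import numpy as np
    from PIL import Image, ImageDraw

    def distance_color(d, d_min, d_max):
        # Map an edge's length onto a blue-to-red ramp so that short and long
        # anatomical connections are visually distinguishable.
        t = float(np.clip((d - d_min) / (d_max - d_min + 1e-6), 0.0, 1.0))
        return (int(255 * t), 0, int(255 * (1.0 - t)))

    def render_condition(landmarks, edges, size=512):
        # Draw the landmark graph onto a black canvas: colored edges first,
        # then the landmark points on top.
        canvas = Image.new("RGB", (size, size), "black")
        draw = ImageDraw.Draw(canvas)
        lengths = [float(np.linalg.norm(landmarks[i] - landmarks[j])) for i, j in edges]
        d_min, d_max = min(lengths), max(lengths)
        for (i, j), d in zip(edges, lengths):
            draw.line([tuple(landmarks[i]), tuple(landmarks[j])],
                      fill=distance_color(d, d_min, d_max), width=2)
        for x, y in landmarks:
            draw.ellipse([x - 3, y - 3, x + 3, y + 3], fill="white")
        return canvas

    rng = np.random.default_rng(0)
    landmarks = rng.uniform(64, 448, size=(38, 2))  # stand-in for real annotations
    edges = [(i, i + 1) for i in range(37)]         # toy chain; the real graph is anatomical
    render_condition(landmarks, edges).save("condition_map.png")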

Visualization

Figure 3. Qualitative results of images generated with different prompts. We use seven different prompts to generate the images.
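For readers who want to reproduce this kind of prompt-driven control with public tools, the sketch below uses an off-the-shelf ControlNet pipeline from Hugging Face diffusers. This is an assumption-laden stand-in: the paper trains its own conditioned generator, and the checkpoints and prompts listed here are illustrative, not the authors' models or prompt set.

    import torch
    from PIL import Image
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    # Publicly available checkpoints used only as stand-ins for the paper's generator.
    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
    ).to("cuda")

    condition = Image.open("condition_map.png")  # rendered from the sampled landmarks
    prompts = [                                  # illustrative style/attribute descriptions
        "lateral cephalometric X-ray, adult patient, high contrast",
        "lateral cephalometric X-ray, adolescent, visible orthodontic braces",
        "lateral cephalometric X-ray, low exposure, soft-tissue profile emphasized",
    ]
    for i, prompt in enumerate(prompts):
        # Same spatial condition, different prompt -> same landmarks, new appearance.
        image = pipe(prompt, image=condition, num_inference_steps=30).images[0]
        image.save(f"generated_{i}.png")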

BibTeX


      @article{guo2025towards,
        title={Towards Better Cephalometric Landmark Detection with Diffusion Data Generation},
        author={Guo, Dongqian and Han, Wencheng and Lyu, Pang and Zhou, Yuxi and Shen, Jianbing},
        journal={IEEE Transactions on Medical Imaging},
        year={2025},
        publisher={IEEE}
      }