Articulation modeling aims to infer the movable parts of a 3D object and their motion parameters, enabling interactive animation, simulation, and shape editing. Prior work studies the problem from different perspectives---including articulation perception, articulated object reconstruction, and generative modeling. These methods typically take images, videos, or point clouds as input and estimate articulation by leveraging priors learned from the public PartNet-Mobility dataset or its variants. Despite the promising progress, existing methods often provide limited controllability, generalize poorly beyond the training distribution, and rely heavily on dataset-specific priors, making them difficult to deploy in real design workflows.
In this paper, we present Sketch2Arti, the first sketch-based articulation modeling system for CAD objects. Our key observation is that designers naturally communicate articulation intent through lightweight sketches (e.g., arrows and strokes) that indicate how parts should move, yet translating such sketches into articulated 3D models remains largely manual. Sketch2Arti bridges this gap by enabling users to specify articulation through simple 2D sketches drawn from a chosen viewpoint. Given a CAD model and user sketches, our approach automatically discovers the corresponding movable parts and predicts their motion parameters, allowing iterative modeling of multiple articulations on complex objects with fine-grained control. Importantly, Sketch2Arti is trained in a category-agnostic manner, without object category labels, and thus generalizes well to diverse objects beyond existing articulation datasets. Moreover, for shell models lacking interior structures, Sketch2Arti supports controllable internal completion guided by user sketches, generating plausible internal components consistent with the existing geometry and the predicted motion constraints. Comprehensive experiments and user evaluations demonstrate the effectiveness, controllability, and generalization of Sketch2Arti.
Figure 5: Overview. (a) Given an input 3D shape and the user sketches, our method Sketch2Arti addresses the where and how challenges by (b) identifying movable parts (i.e., the two doors) and inferring their articulation parameters. (c) The predicted motion reveals missing internal structure (e.g., an empty drawer), which users can further specify via sketches. Sketch2Arti then tackles the what challenge by (d) generating the full drawer geometry while adhering to both the existing shape and the inferred articulation.
Figure 6: Articulation prediction. Given a static 3D object, we apply category-agnostic articulation recognition to a localized region surrounding the sketch, with local context captured by depth and normal maps. A trained U-Net module predicts the articulation parameters as 2D maps and in 3D local camera coordinates, as well as the motion type. The 2D part mask is then back-projected onto the object surface and used to filter a hierarchy of segments produced by the PartField foundation model, selecting the best-matching segment, at whichever hierarchy level it lies, as the movable 3D part.
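The mask-guided part selection described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the precomputed vertex projection `vertex_uv`, and the flat list representation of the segment hierarchy are all assumptions, and the matching score is taken to be a simple IoU between back-projected mask vertices and each candidate segment.

```python
import numpy as np

def select_movable_part(mask2d, vertex_uv, segment_hierarchy):
    """Pick the segment whose vertices best overlap the back-projected mask.

    mask2d: (H, W) boolean part mask predicted by the U-Net.
    vertex_uv: (N, 2) integer pixel coordinates of the visible mesh
               vertices under the sketch viewpoint (assumed precomputed).
    segment_hierarchy: list of (N,) boolean vertex masks, one per candidate
                       segment, pooled across all hierarchy levels.
    """
    # Back-projection: a vertex belongs to the sketched part if its
    # projected pixel falls inside the predicted 2D mask.
    hit = mask2d[vertex_uv[:, 1], vertex_uv[:, 0]]
    best_seg, best_iou = None, -1.0
    for seg in segment_hierarchy:
        inter = np.logical_and(hit, seg).sum()
        union = np.logical_or(hit, seg).sum()
        iou = inter / union if union else 0.0
        if iou > best_iou:
            best_seg, best_iou = seg, iou
    return best_seg, best_iou
```

Because candidates from every level of the hierarchy compete under the same score, the granularity of the selected part is decided by the data rather than fixed in advance, which is what lets the same procedure pick a whole door on one object and a small latch on another.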
Figure 7: Interior shape completion. Our approach leverages 2D and 3D generative models to complete the interior structures exposed by articulated parts. Given a 3D object with a recognized articulated part and its motion parameters, the top branch applies a 2D generative model (e.g., Nano Banana) to obtain a high-quality reference image, which guides a 3D generative model (e.g., Trellis) in creating the interior structure. Crucially, to obtain structure-preserving interiors, we build loose and strict masks that control the flow-based generation process of the 3D model and adjust the completed part interior, respectively. Finally, the completed part is refined for kinematic validity and converted into separate meshes that are readily usable as URDF models.
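The URDF export mentioned at the end of the pipeline pairs each separated mesh with the recognized motion parameters. As a hedged illustration (the link and joint names here are hypothetical; only the URDF format itself is standard), a completed drawer with a recognized sliding motion would map to a prismatic joint:

```xml
<robot name="cabinet">
  <link name="body"/>
  <link name="drawer"/>
  <!-- Recognized sliding articulation becomes a prismatic joint;
       axis and limits come from the predicted motion parameters. -->
  <joint name="drawer_slide" type="prismatic">
    <parent link="body"/>
    <child link="drawer"/>
    <axis xyz="0 1 0"/>
    <limit lower="0.0" upper="0.4" effort="10" velocity="0.5"/>
    <origin xyz="0 0 0.3" rpy="0 0 0"/>
  </joint>
</robot>
```

A recognized rotating part (e.g., a door) would instead use `type="revolute"` with the predicted hinge axis and angular limits.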
Figure 8: Dataset gallery and statistics. Left: Representative samples from SketchMobility. Note the presence of uncommon articulated objects (e.g., helicopters and motorbikes), which are rarely considered in existing articulation modeling benchmarks. Right: Category distribution of SketchMobility. We report major categories (>=1.5%) individually, while merging minor categories into Others (17.9%).
Figure 10: Results gallery. We show representative articulation modeling sessions using Sketch2Arti. For each example, user sketches are overlaid on the rendered shape under the chosen viewpoint, and the inferred movable parts are color-coded. The black arrow indicates the iterative modeling order across views/parts.