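Sungho Kim, A Hierarchical Graphical Model-Based Object Recognition and Categorization Method Using Visual Context Information, Korea Advanced Institute of Science and Technology (KAIST), February 2007.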
The goal of object recognition is to label objects from images and
to estimate the poses of the labeled objects. The field of object
recognition has seen tremendous progress with successful
applications in some specific domains such as face recognition.
However, the current state-of-the-art methods show unsatisfactory
results for more general object domains in complex natural
environments with visual ambiguities. In this dissertation, we aim
to improve object identification and categorization by exploiting
visual context and graphical models.
We propose a general framework in which object identification and
object categorization cooperate. Exemplars used in identification
provide useful similarity information for categorization. Conversely,
novel objects are rejected by identification, but the proposed object
categorization can label them and segment them out so that the
identification database can be updated.
In the first part of the work, we propose a hierarchical graphical
model (HGM) for the disambiguation of blurred objects. We define
three types of visual context (spatial, hierarchical, and temporal),
each of which provides a strong cue for disambiguation. To handle
both visual relations and uncertainty, we model them with the HGM,
which consists of a part layer, an object layer, and a place node.
Pose information is stored in the nodes of the part and object layers
so that part-object context can be used. Because of the complexity of
the graphical model, we apply piecewise learning, which makes
learning the HGM practical, and propose context-guided sample
generation and pruning for variable graph estimation and
distribution estimation. The bidirectional interaction in the HGM
discriminates ambiguous objects and places simultaneously in real
environments. Large-scale experiments on building guidance validate
its robustness. As a direct extension, the HGM is adapted to video
interpretation by incorporating additional temporal context.
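The bidirectional interaction between the object layer and the place node can be illustrated with a toy joint inference. This is only a minimal sketch under stated assumptions, not the HGM itself: it has one object node and one place node, no part layer and no pose, and performs exact enumeration instead of the sampling-based inference described above. All labels and probability values (`PLACE_PRIOR`, `OBJECT_GIVEN_PLACE`, `APPEARANCE`, `PLACE_EVIDENCE`) are hypothetical.

```python
from itertools import product

# Hypothetical toy numbers, chosen only for illustration.
PLACE_PRIOR = {"office": 0.5, "kitchen": 0.5}
# Context term: how likely each object is in each place.
OBJECT_GIVEN_PLACE = {
    "office": {"monitor": 0.8, "microwave": 0.2},
    "kitchen": {"monitor": 0.1, "microwave": 0.9},
}
# Weak appearance evidence for a blurred object (nearly ambiguous).
APPEARANCE = {"monitor": 0.45, "microwave": 0.55}
# Weak global scene evidence for the place (nearly ambiguous).
PLACE_EVIDENCE = {"office": 0.4, "kitchen": 0.6}


def joint_posterior():
    """Enumerate every (place, object) pair and normalize the joint score."""
    scores = {}
    for place, obj in product(PLACE_PRIOR, APPEARANCE):
        scores[(place, obj)] = (
            PLACE_PRIOR[place]
            * PLACE_EVIDENCE[place]
            * OBJECT_GIVEN_PLACE[place][obj]
            * APPEARANCE[obj]
        )
    total = sum(scores.values())
    return {pair: s / total for pair, s in scores.items()}


def marginal(posterior, index):
    """Sum the joint posterior over the other variable (0: place, 1: object)."""
    result = {}
    for pair, p in posterior.items():
        result[pair[index]] = result.get(pair[index], 0.0) + p
    return result


posterior = joint_posterior()
place_marginal = marginal(posterior, 0)
object_marginal = marginal(posterior, 1)
print(place_marginal)
print(object_marginal)
```

In this toy setting each variable sharpens the other: the object marginal becomes more confident than the appearance evidence alone, and the place marginal becomes more confident than the scene evidence alone.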
In the second part of the work, we propose a directed graphical
model, a variant of the HGM, for the simultaneous segmentation and
categorization in cluttered environments. Conventional methods
perform poorly because of figure-ground ambiguity. We improve
categorization with the proposed online boost, which is based on
part-part and part-object context and provides robust bottom-up
proposals for clutter reduction. The boosted MCMC (Markov Chain
Monte Carlo) jointly optimizes categorization and segmentation, and
samples drawn from the bottom-up boost make it fast and accurate.
The proposed system shows improved performance in cluttered
environments.
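The idea of driving MCMC with bottom-up proposals can be sketched with a generic Metropolis-Hastings independence sampler. This is a minimal illustration under stated assumptions, not the boosted MCMC of the dissertation: the state is a single category label instead of a joint categorization and segmentation, and the scores in `TARGET` and `PROPOSAL` are hypothetical stand-ins for the posterior and for the bottom-up proposal distribution.

```python
import random
from collections import Counter

# Hypothetical unnormalized posterior scores over category labels.
TARGET = {"car": 6.0, "bicycle": 2.0, "person": 1.0, "background": 1.0}
# Hypothetical bottom-up proposal distribution; it must be positive
# wherever TARGET is positive.
PROPOSAL = {"car": 0.5, "bicycle": 0.2, "person": 0.2, "background": 0.1}


def propose(rng):
    """Draw a candidate label from the bottom-up proposal distribution."""
    labels = list(PROPOSAL)
    weights = [PROPOSAL[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


def run_chain(n_steps, seed=0):
    """Run a Metropolis-Hastings independence sampler and count visits."""
    rng = random.Random(seed)
    state = "background"
    visits = Counter()
    for _ in range(n_steps):
        candidate = propose(rng)
        # Acceptance ratio for an independence proposal q:
        # p(x') q(x) / (p(x) q(x')).
        ratio = (TARGET[candidate] * PROPOSAL[state]) / (
            TARGET[state] * PROPOSAL[candidate]
        )
        if rng.random() < min(1.0, ratio):
            state = candidate
        visits[state] += 1
    return visits


visits = run_chain(5000)
best_label = visits.most_common(1)[0][0]
print(best_label, dict(visits))
```

Because the proposal already concentrates mass on plausible labels, most candidates are accepted and the chain spends little time on unlikely states, which is the intuition behind using bottom-up samples to speed up the search.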