December 16, 2023
Slide overview
Hiroya Ichihara, Kazushi Okamoto, Atsushi Shibata: Real estate property image classification based on optimal transport costs, The 7th Asian Conference of Management Science and Applications (ACMSA2023), 2023.12, Onna, Okinawa, Japan.
Data Science Research Group, The University of Electro-Communications
Real Estate Property Image Classification Based on Optimal Transport Costs
Hiroya Ichihara, Kazushi Okamoto, Atsushi Shibata
Department of Informatics, Graduate School of Informatics and Engineering, The University of Electro-Communications
Graduate School of Industrial Technology, Advanced Institute of Industrial Technology
2023.12.16 ACMSA 2023 1 / 19
Classification task for real estate property images
- Estimating age and rent from real estate property images [You+ 17]
- Label classification task for real estate property images [Bappy+ 17]
- The quality of the labels assigned to the property images used to train classification models has been highlighted as a problem [Kiyota 21]
Problems in real estate property images
- Single Label (SL) noise: images with incorrect labels. Example: assigned label Surroundings, true label Kitchen.
- Multi Label (ML) noise: images that should be assigned multiple labels but have a single label. Example: assigned label Kitchen, true labels Kitchen and Entrance.
- No-groundtruth Label (NL) noise: images for which no true label exists. Example: assigned label Exteriors, no true label.
LIFULL HOME'S dataset
Examine the distribution of each type of label noise in the LIFULL HOME'S dataset.
Images with one of the following 14 representative labels in the LIFULL HOME'S dataset:
Floor plan, Exterior, Surroundings, Entrance, Kitchen, Bedroom, Bathroom, Toilet, Washroom, Storage, Facilities, Balcony, Shared space, Parking lots
Distribution of each label noise in 30,800 images:
Label noise | SL noise | ML noise | NL noise | Total
# of images | 2,680 | 2,412 | 164 | 5,256
Ratio (%) | 8.7 | 7.8 | 0.5 | 17
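The ratios in the table can be checked directly from the image counts. The snippet below is only an arithmetic check; the variable names are illustrative and not from the slides.

```python
# Counts taken from the table on the slide.
TOTAL_IMAGES = 30_800
noise_counts = {"SL": 2_680, "ML": 2_412, "NL": 164}

# Percentage of the 30,800 images affected by each noise type, to one decimal.
ratios = {name: round(100 * count / TOTAL_IMAGES, 1)
          for name, count in noise_counts.items()}

total_noisy = sum(noise_counts.values())
total_ratio = round(100 * total_noisy / TOTAL_IMAGES, 1)

print(ratios, total_noisy, total_ratio)
```

The overall ratio comes out at about 17%, which agrees with the rounded value on the slide.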
Problems in real estate property images
The effects of each noise type on the classification model of real estate property images were investigated in [Ichihara+ 22]:
- SL and NL noise should be removed from the training data
- ML noise should not be removed: it is potentially useful data that carries information on co-occurrence relationships between labels
Co-occurrence relationship: the tendency for objects of different labels to appear simultaneously in an image
Target task
Label classification task for real estate property images
- Learning process: the classification model is trained on the training data
- Prediction process: the classification model generates a probability vector over the labels when an unknown image is input (the number of labels is 14 in this study)
- Loss function during the learning process: Categorical Cross Entropy (CCE)
  - Excessively penalizes training on ML noise images
  - Does not consider relationships between labels
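A small numerical illustration of the two CCE issues, under assumed toy values: three labels instead of 14, and made-up probability vectors. With a one-hot target, CCE depends only on the probability given to the assigned label, so a prediction that also sees the Entrance in a Kitchen image is penalized exactly as much as one that sees an unrelated Toilet.

```python
import math

def categorical_cross_entropy(target, predicted, eps=1e-12):
    """CCE between a target distribution and a predicted distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))

# Toy label order (an assumption for this example): Kitchen, Entrance, Toilet.
target = [1.0, 0.0, 0.0]            # image labeled "Kitchen" only (ML noise)
co_occurring = [0.5, 0.5, 0.0]      # model also detects the entrance
unrelated = [0.5, 0.0, 0.5]         # model puts the same mass on an unrelated label

loss_co_occurring = categorical_cross_entropy(target, co_occurring)
loss_unrelated = categorical_cross_entropy(target, unrelated)
print(loss_co_occurring, loss_unrelated)
```

Both losses equal log 2, so CCE cannot tell a plausible co-occurring label from an implausible one.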
Use of optimal transport costs as loss functions
To address this problem, we focus on optimal transport costs as a loss function.
Optimal transport is a type of optimization problem, and optimal transport costs can be used as a measure of the discrepancy between probability distributions.
The optimal transport cost from the predictive distribution to the label distribution is defined in terms of:
- a regularization parameter
- the entropy of the matrix
- an approximate solution of the optimal transport matrix, obtained by the Sinkhorn algorithm
- a cost matrix
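For orientation, the entropy-regularized transport cost of [Cuturi 13] can be written as below. The symbols are assumptions that follow the conventions of that paper, not necessarily the notation used on the slide: $p$ is the predictive distribution, $q$ the label distribution, $C$ the cost matrix, $\lambda$ the regularization parameter, $h$ the entropy, and $U(p, q)$ the set of nonnegative matrices with row sums $p$ and column sums $q$.

```latex
d_C^{\lambda}(p, q) = \langle P^{\lambda}, C \rangle,
\qquad
P^{\lambda} = \operatorname*{arg\,min}_{P \in U(p, q)}
  \; \langle P, C \rangle - \frac{1}{\lambda}\, h(P),
\qquad
h(P) = -\sum_{i,j} P_{ij} \log P_{ij}
```

In this formulation, $P^{\lambda}$ is the matrix that the Sinkhorn algorithm approximates.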
Use of optimal transport costs as loss functions
- Differentiable [Cuturi 13]
- The cost matrix allows comparison between distributions while taking into account the conceptual distance structure between labels
[Figure: a predicted distribution and a label distribution over Floor plan, Bathroom, Toilet, and Exterior]
The knowledge that bathrooms and toilets are close to each other can be embedded in the cost matrix, so the loss can be calculated by considering the relationships between labels
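A minimal sketch of how such a cost could be computed with Sinkhorn iterations, in plain Python. The cost values (0.2 between Bathroom and Toilet, 1.0 elsewhere), the regularization strength `reg`, and the iteration count are assumptions chosen for illustration; they are not the values used in the study.

```python
import math

def sinkhorn_cost(p, q, cost, reg=10.0, n_iters=200):
    """Entropy-regularized OT cost <P, C>, with P found by Sinkhorn scaling."""
    n, m = len(p), len(q)
    # Gibbs kernel K = exp(-reg * C); larger reg means weaker smoothing.
    kernel = [[math.exp(-reg * cost[i][j]) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals p and q.
        u = [p[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [q[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    return sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))

# Label order: Floor plan, Bathroom, Toilet, Exterior (hypothetical costs).
cost = [[0.0, 1.0, 1.0, 1.0],
        [1.0, 0.0, 0.2, 1.0],
        [1.0, 0.2, 0.0, 1.0],
        [1.0, 1.0, 1.0, 0.0]]
label = [0.0, 1.0, 0.0, 0.0]             # true label: Bathroom
near_miss = [0.0, 0.6, 0.4, 0.0]         # some mass on Toilet
far_miss = [0.0, 0.6, 0.0, 0.4]          # same mass on Exterior

cost_near = sinkhorn_cost(near_miss, label, cost)
cost_far = sinkhorn_cost(far_miss, label, cost)
print(cost_near, cost_far)
```

Unlike CCE, the transport cost is lower when the misplaced probability mass lands on a label that the cost matrix marks as close. Every step is a product, sum, or division, so in an automatic differentiation framework the same iterations could in principle be differentiated with respect to the prediction.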
Research contents
Problems of conventional methods
- Training does not take into account the relationships between labels of real estate property images
Purpose of this study (RQ)
- Validate the hypothesis that a classification model using optimal transport costs as a loss function, to capture relationships between labels, is effective in improving prediction accuracy
Proposed model
- The cost matrix used to calculate the optimal transport cost is based on similarity by language and by image
Cost matrix by language
The cost matrix by language uses the linguistic similarity among the 14 label names.
1. Extract 9 similar words for each of the 14 labels. Together with the label names, a total of 140 words is obtained. Similar words shown on the slide include "house layout", "LDK", "hand washing", and "restroom".
2. Calculate a 140×140 matrix of cosine similarities between the 140 words
3. Average the matrix over every 10×10 block to obtain a 14×14 similarity matrix S
4. Compute 1-S and set its diagonal components to 0
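The four steps can be sketched as follows. The word embeddings here are random stand-ins, since the slides do not say which embedding model was used, and the code assumes the 140 words are ordered so that each consecutive group of 10 (a label name plus its 9 similar words) belongs to one label.

```python
import math
import random

N_LABELS, WORDS_PER_LABEL, DIM = 14, 10, 16  # DIM is an arbitrary toy size

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def language_cost_matrix(word_vectors):
    """Block-average a word-level cosine similarity matrix into a label-level cost."""
    n_words = len(word_vectors)
    sim = [[cosine(word_vectors[i], word_vectors[j]) for j in range(n_words)]
           for i in range(n_words)]                       # step 2: 140 x 140
    cost = []
    for a in range(N_LABELS):
        row = []
        for b in range(N_LABELS):
            block = [sim[i][j]
                     for i in range(a * WORDS_PER_LABEL, (a + 1) * WORDS_PER_LABEL)
                     for j in range(b * WORDS_PER_LABEL, (b + 1) * WORDS_PER_LABEL)]
            s_ab = sum(block) / len(block)                # step 3: 10 x 10 average
            row.append(0.0 if a == b else 1.0 - s_ab)     # step 4: 1 - S, zero diagonal
        cost.append(row)
    return cost

rng = random.Random(0)
# Placeholder vectors standing in for real word embeddings.
word_vectors = [[rng.gauss(0.0, 1.0) for _ in range(DIM)]
                for _ in range(N_LABELS * WORDS_PER_LABEL)]
cost_matrix = language_cost_matrix(word_vectors)
```

The result is a symmetric 14×14 matrix with a zero diagonal, so moving probability mass within the same label costs nothing.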
Cost matrix by image
Extraction of prototypes representing each label [Han+ 19]
1. Randomly sample images from each label, obtain the embedding of each image with a pretrained model, and compute the matrix of their pairwise cosine similarities
2. Extract 8 images as prototypes based on this similarity matrix
3. Design the cost matrix by image from the similarity between the prototypes of each label, in the same way as the cost matrix by language
[Figure: images randomly sampled from the Toilet label are embedded as vectors, and Toilet prototypes are selected from them]
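A simplified sketch of prototype selection for a single label. It ranks sampled images by a density score only (how many other samples are more similar than a threshold). This is a stand-in: the selection rule of [Han+ 19] is more involved and is not reproduced here. The embeddings are random placeholders, and `quantile` and the sample size of 50 are assumptions.

```python
import math
import random

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def select_prototypes(embeddings, n_prototypes=8, quantile=0.6):
    """Pick the n_prototypes samples with the highest similarity density."""
    m = len(embeddings)
    sim = [[cosine(embeddings[i], embeddings[j]) for j in range(m)] for i in range(m)]
    # Threshold: the similarity value at the given quantile of all distinct pairs.
    off_diagonal = sorted(sim[i][j] for i in range(m) for j in range(m) if i != j)
    threshold = off_diagonal[int(quantile * (len(off_diagonal) - 1))]
    # Density of sample i: number of other samples whose similarity exceeds the threshold.
    density = [sum(1 for j in range(m) if j != i and sim[i][j] > threshold)
               for i in range(m)]
    ranked = sorted(range(m), key=lambda i: (-density[i], i))
    return ranked[:n_prototypes], density

rng = random.Random(0)
# Placeholder embeddings for 50 images sampled from one label.
embeddings = [[rng.gauss(0.0, 1.0) for _ in range(32)] for _ in range(50)]
prototypes, density = select_prototypes(embeddings)
```

Repeating this per label and block-averaging the similarities between prototype sets would mirror the construction used for the cost matrix by language.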
Experiment summary
Purpose
- Validate the effectiveness of a classification model using optimal transport costs as a loss function
Approach
- Compare the prediction accuracy obtained when varying α for each cost matrix with that of the baseline model without optimal transport costs
Metric
- Accuracy: percentage of images whose label predicted by the model matches the true label
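The accuracy metric as described can be written in a few lines. The label lists below are made-up examples.

```python
def accuracy(true_labels, predicted_labels):
    """Percentage of samples whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * matches / len(true_labels)

true_labels = ["Kitchen", "Toilet", "Bathroom", "Toilet"]
predicted_labels = ["Kitchen", "Bathroom", "Bathroom", "Toilet"]
score = accuracy(true_labels, predicted_labels)  # 3 of 4 match
```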
Experimental setup
Classification model
- ResNet50 fine-tuned with real estate property images
Hyper-parameters
- The optimization method is Stochastic Gradient Descent (SGD)
- The number of epochs is 100 and the batch size is 32
- The learning rate was determined through a grid search using an 8:2 holdout method
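A sketch of a learning-rate grid search with an 8:2 holdout split. The candidate learning rates and the `toy_score` function are hypothetical stand-ins. In the actual setup, the scoring step would fine-tune ResNet50 on the training part and return the accuracy on the validation part.

```python
import math
import random

def holdout_split(samples, train_fraction=0.8, seed=0):
    """Shuffle and split samples into training and validation parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def grid_search_learning_rate(samples, candidates, train_and_score):
    """Return the candidate with the best validation score, plus all scores."""
    train_part, valid_part = holdout_split(samples)
    scores = {lr: train_and_score(train_part, valid_part, lr) for lr in candidates}
    return max(scores, key=scores.get), scores

def toy_score(train_part, valid_part, learning_rate):
    # Placeholder for "train the model, then measure validation accuracy".
    return -abs(math.log10(learning_rate) + 2.0)

samples = list(range(10))
train_part, valid_part = holdout_split(samples)
best_lr, scores = grid_search_learning_rate(samples, [0.1, 0.01, 0.001], toy_score)
```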
Dataset
Training dataset
- For training the classification model, 2,200 samples were randomly sampled from each label, for a total of 30,800 samples
Test dataset
- 200 new samples were randomly sampled from each label, for a total of 2,800 samples
- 2,723 images are used after manual label correction
Result: change in accuracy with α for each cost matrix
[Figure: accuracy versus α for each cost matrix; plot labels: OT cost, CCE]
Compared with the baseline without optimal transport costs, accuracy improves → optimal transport costs are expected to improve the performance of the classification model
Qualitative comparison
Confusion matrix: the match or mismatch between the predicted labels and the true labels
[Figure: two confusion matrices under different settings]
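For reference, a confusion matrix of this kind can be tallied as below. The three labels and the predictions are made-up examples. Rows index the true label and columns the predicted label.

```python
def confusion_matrix(true_labels, predicted_labels, label_names):
    """Count matrix with counts[true][predicted] for each label pair."""
    index = {name: k for k, name in enumerate(label_names)}
    counts = [[0] * len(label_names) for _ in label_names]
    for t, p in zip(true_labels, predicted_labels):
        counts[index[t]][index[p]] += 1
    return counts

label_names = ["Bathroom", "Toilet", "Exterior"]
true_labels = ["Bathroom", "Bathroom", "Toilet", "Toilet", "Exterior"]
predicted_labels = ["Bathroom", "Toilet", "Toilet", "Toilet", "Exterior"]
matrix = confusion_matrix(true_labels, predicted_labels, label_names)
```

Off-diagonal entries show which pairs of labels the model confuses, such as one Bathroom image predicted as Toilet in this toy example.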
[Figure: two confusion matrices under different settings]
Robust to images with ambiguous data
Conclusion
Summary
- There are data with co-occurrence relationships between labels
- However, conventional methods do not consider the relationships between labels
→ Examined the effectiveness of a classification model using optimal transport costs as a loss function
Experiment result
- Incorporating optimal transport costs into the loss function is expected to improve the performance of the classification model
Future perspective
- Validate the cost matrix design methodology
References
[You+ 17] Q. You, R. Pang, L. Cao, and J. Luo, "Image based appraisal of real estate properties," IEEE Trans. on Multimedia, vol. 19, no. 12, pp. 2751-2759, 2017.
[Bappy+ 17] J.H. Bappy, J.R. Barr, N. Srinivasan, and A.K. Roy-Chowdhury, "Real Estate Image Classification," Proc. of 2017 IEEE Winter Conf. on Appl. of Comput. Vis., pp. 373-381, 2017.
[Kiyota 21] Y. Kiyota, "Frontiers of computer vision technologies on real estate property photographs and floorplans," Front. of Real Estate Sci. in Jpn, pp. 325-337, Feb. 2021.
[Ichihara+ 22] H. Ichihara, K. Okamoto, and A. Shibata, "Effects of Noisy Labels on Real Estate Property Image Classification" (in Japanese), Proc. of the Annual Conf. of JSAI, vol. JSAI2022, pp. 4Yin244, 2022.
[Cuturi 13] M. Cuturi, "Sinkhorn Distances: Lightspeed Computation of Optimal Transport," Adv. in Neural Inf. Process. Syst., vol. 26, pp. 2292-2300, 2013.
[Han+ 19] J. Han, P. Luo, and X. Wang, "Deep Self-Learning From Noisy Labels," Proc. of the IEEE/CVF Int. Conf. on Comput. Vis., pp. 5138-5147, 2019.