Wei-Ta Chu and Chih-Hao Chiu
Multimedia Computing Laboratory
Dept. of Computer Science and Information Engineering
National Chung Cheng University
1. Introduction
Facial images carry age, gender, and other rich cues that are implicitly related to occupation. In this work, we argue that predicting occupation from a single facial image is a feasible computer vision problem. We extract multilevel hand-crafted features encoded with locality-constrained linear coding, together with convolutional neural network features, as image occupation descriptors. To avoid the curse of dimensionality and overfitting, a boosting strategy called multichannel SVM is used to integrate features from the face and body. Intra- and interclass visual variations are jointly considered in the boosting framework to further improve performance. In the evaluation, we verify the effectiveness of predicting occupation from the face and demonstrate promising performance obtained by combining face and body information. More importantly, our work integrates deep features into the multichannel SVM framework and shows significantly better performance than the state of the art.
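To make the idea of combining face and body channels concrete, the following is a minimal sketch of weighted late fusion of per-channel classifier scores. It is a simplified stand-in and not the boosting-based multichannel SVM procedure from the paper: the function names, the channel names, and the fixed weights are all hypothetical, and the per-class scores are assumed to come from classifiers trained separately on each channel.

```python
from typing import Dict, List


def fuse_channel_scores(
    channel_scores: Dict[str, List[float]],
    channel_weights: Dict[str, float],
) -> List[float]:
    """Return the weighted sum of per-class scores across channels.

    Assumption: every channel reports one score per class, in the same
    class order. The weights are hypothetical; a boosting framework
    would learn them rather than take them as input.
    """
    if not channel_scores:
        raise ValueError("at least one channel is required")
    n_classes = len(next(iter(channel_scores.values())))
    fused = [0.0] * n_classes
    for name, scores in channel_scores.items():
        if len(scores) != n_classes:
            raise ValueError(f"channel {name!r} has a different number of classes")
        weight = channel_weights[name]
        for i, score in enumerate(scores):
            fused[i] += weight * score
    return fused


def predict_occupation(
    channel_scores: Dict[str, List[float]],
    channel_weights: Dict[str, float],
    labels: List[str],
) -> str:
    """Pick the label whose fused score is largest."""
    fused = fuse_channel_scores(channel_scores, channel_weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best]
```

With equal weights the body channel can overrule a weak face prediction, while a larger face weight lets the face channel dominate. This illustrates why the relative contribution of each channel matters when face and body information are combined.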
2. DB1 [5.3 MB] (Link)
Images for DB1 were downloaded from the official websites of Asian organizations such as hospitals, universities, and TV channels. Because the images come from official websites, their reliability and quality are guaranteed. We exclude ethnic variations by focusing on East Asian people in DB1. In total, there are 2,062 images belonging to five occupations: doctor, anchorperson, athlete, policeman, and professor. The number of images for each occupation ranges from 300 to 500.
3. DB2 [43 MB] (Link)
Images for DB2 were collected from two popular image search engines, Google Images and Bing Images, using text queries related to occupations. Twenty-one occupations were selected from more than 100 well-defined occupations listed on Wikipedia. DB2 has ethnic diversity and high intraclass variations, making it more challenging than DB1. It contains 5,671 images in total, and the number of images for each occupation ranges from 122 to 553.
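After downloading and extracting either database, it can be useful to check the per-occupation image counts against the figures stated above. The sketch below assumes a hypothetical layout with one subfolder per occupation under a root directory. The actual archive layout is not described here, so the root path, the function name, and the set of file suffixes would need to be adapted.

```python
from collections import Counter
from pathlib import Path

# Assumed image file suffixes; adjust to match the extracted archive.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(root: str) -> Counter:
    """Count image files in each immediate subfolder of `root`.

    Assumption (hypothetical layout): each subfolder is named after
    one occupation and holds that occupation's images.
    """
    counts: Counter = Counter()
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue
        counts[class_dir.name] = sum(
            1
            for path in class_dir.rglob("*")
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Calling `count_images_per_class` on the extracted folder and comparing `sum(counts.values())`, `min(counts.values())`, and `max(counts.values())` with the totals and ranges given for DB1 and DB2 is a quick sanity check that the download is complete.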
4. Citation
Please cite our work if you utilize this dataset.
Wei-Ta Chu and Chih-Hao Chiu, "Predicting Occupation from Images by Combining Face and Body Context Information," ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 13, no. 1, Article No. 7, 2017.
Last Updated: January 6, 2018