ESLD: eyes segment and landmark detection in the wild [ESLD: an eye segmentation and landmark localization dataset for ordinary cameras under natural light]

Indexed by：

Scopus

Abstract：

Objective: The physiological features of the human eye, which can reflect health, fatigue and emotion, are challenging to capture. Fatigue can be judged from the state of a patient's eyes. Instructors can use the state of in-class students' eyes to analyze their emotion, psychology and cognition. Target consumers can be identified from where they gaze while shopping. However, ordinary camera shots struggle to capture changes in pupil size and orientation in the wild, and there is a lack of eye-behavior datasets with fine landmark and segmentation annotations that resemble real application scenarios. Near-infrared and head-mounted cameras can be used to capture eye images; their light helps distinguish the iris from the pupil and yields high-quality images. Head posture, illumination, occlusion and user-camera distance may all affect image quality, so images collected in a laboratory environment are difficult to apply in the real world.

Method: An eye region segmentation and landmark detection dataset can help resolve the mismatch between results in indoor and outdoor scenarios. Given the lack of datasets for fine landmark detection and eye region segmentation, our research focuses on the collection and annotation of a new dataset (eye segment and landmark detection dataset, ESLD) that contains multiple types of eye images. First, facial images are collected from three sources: facial images of users while they use a computer, images from public datasets captured by ordinary cameras, and synthesized eye images. These sources contribute 1386, 804 and 1600 images, respectively. Second, the eye region is cropped from the original image. For complete face images, Dlib is used to detect landmarks and the eye region is cropped according to them; for incomplete face images, the eye region is cropped manually. All eye region images are then normalized to 256 × 128 pixels and stored in folders according to acquisition type. Finally, annotators are trained and then label the images manually. To reduce labeling errors caused by human factors, each annotator first labels four images of each type, and an experienced annotator checks the completed landmarks. The remaining images are labeled only once the annotation standard is reached. Landmark locations are saved as JSON files, and labelme is used to segment the eye region based on those files. A total of 2404 images are obtained. Each image contains 16 landmarks around the eye, 12 landmarks around the iris and 12 landmarks around the pupil. The segmentation labels cover the sclera, the iris, the pupil and the skin around the eye.

Result: The dataset is divided into training, testing and validation sets in a ratio of 0.6:0.2:0.2. We evaluate the proposed dataset with deep learning algorithms and provide a baseline for each experiment. First, a model is trained on synthesized eye images, and an experiment is conducted to recognize whether an eye image is real or synthesized. The results show that the model cannot accurately distinguish real images from synthesized ones, which indicates that synthesized eye images can be used as training data. Next, deep learning-based algorithms are applied to eye region segmentation. Mask region convolutional neural network (Mask R-CNN) models with different backbones are trained. Backbones with deeper network structures obtain higher segmentation accuracy for the same number of training epochs, and the mean average precision (mAP) is 0.965. Finally, Mask R-CNN is modified for the landmark detection task. Euclidean distance is used to evaluate the model, and the error is 5.828. Compared with eye region segmentation, landmark detection is more difficult because the eye region is small. With the eye region mask, a deeper structure is effective in increasing the accuracy of landmark detection.

Conclusion: ESLD focuses on multiple types of eye images in real environments and bridges the gap in fine landmark detection and segmentation of the eye region. To study the relationship between eye state and emotion, deep learning algorithms can be developed further by combining ESLD with other datasets. © 2022 Editorial and Publishing Board of JIG. All rights reserved.
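
The 0.6:0.2:0.2 split and the Euclidean-distance landmark error mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `split_dataset` and `mean_landmark_error` are hypothetical, the seeded random shuffle is an assumption (the abstract does not say how images were assigned to the three sets), and the error is assumed to be a mean over landmarks in pixel units, which the abstract does not state.

```python
import math
import random


def split_dataset(items, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle items and split them into (train, test, validation) lists.

    Assumption: a seeded random shuffle; the abstract only gives the ratio.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * ratios[0])
    n_test = int(n * ratios[1])
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]  # remainder goes to validation
    return train, test, val


def mean_landmark_error(predicted, truth):
    """Mean Euclidean distance between matching (x, y) landmark pairs."""
    if len(predicted) != len(truth) or not predicted:
        raise ValueError("landmark lists must be non-empty and equal in length")
    total = sum(math.dist(p, t) for p, t in zip(predicted, truth))
    return total / len(predicted)


if __name__ == "__main__":
    # 2404 is the total image count reported in the abstract.
    train, test, val = split_dataset(range(2404))
    print(len(train), len(test), len(val))

    # Two toy landmarks: one exact match, one offset by a 3-4-5 triangle.
    print(mean_landmark_error([(0, 0), (3, 4)], [(0, 0), (0, 0)]))
```

With 40 landmarks per image (16 eye, 12 iris, 12 pupil), `mean_landmark_error` would be called once per image on the 40 predicted and annotated points and then averaged over the test set, again under the assumptions above.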

Keyword：

E-learning; pupil segment; dataset; user identification; in the wild; landmark detection

Author Community：

[ 1 ] [Zhang J.] Beijing University of Technology, Beijing, 100024, China
[ 2 ] [Sun G.] Beijing University of Technology, Beijing, 100024, China
[ 3 ] [Zheng K.] Beijing University of Technology, Beijing, 100024, China
[ 4 ] [Li Y.] Beijing University of Technology, Beijing, 100024, China
[ 5 ] [Fu X.] Beijing University of Technology, Beijing, 100024, China
[ 6 ] [Ci K.] Beijing University of Technology, Beijing, 100024, China
[ 7 ] [Shen J.] Beijing University of Technology, Beijing, 100024, China
[ 8 ] [Meng F.] Beijing University of Technology, Beijing, 100024, China
[ 9 ] [Kong J.] Beijing University of Technology, Beijing, 100024, China
[ 10 ] [Zhang Y.] Beijing University of Technology, Beijing, 100024, China
