Exclusive sharing from AI masters
AI competitions give many industry workers and college students a chance to prove their strength and hone their skills.
Recently, the “Short Video Face Parsing Challenge”, sponsored by Madacode, the NVIDIA Inception Program, and the China Association of Image and Graphics, ran from late April to mid-June and received much attention.
Face parsing is a face-centric analysis task widely used in virtual reality (VR), video surveillance, entertainment and social applications, facial expression analysis, and more. It divides the face captured in an image into several semantically consistent regions, such as the eyes and mouth. As a fine-grained semantic segmentation task, it is more challenging than detecting face contours and key points.
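The distinction above can be made concrete: instead of a handful of landmark coordinates, face parsing assigns every pixel a semantic class. A minimal sketch of what such an output looks like (the class IDs and names here are illustrative only; the challenge dataset defines its own label set):

```python
import numpy as np

# Hypothetical label IDs for illustration; real face-parsing datasets
# define their own class lists.
CLASSES = {0: "background", 1: "skin", 2: "eyes", 3: "mouth"}

def region_masks(label_map: np.ndarray) -> dict:
    """Split a per-pixel label map (H x W array of class IDs) into one
    binary mask per semantic region."""
    return {name: (label_map == cls_id) for cls_id, name in CLASSES.items()}

# Toy 2x3 "image": every pixel carries a class ID, not just a contour point.
labels = np.array([[1, 2, 1],
                   [1, 3, 3]])
masks = region_masks(labels)
print(masks["mouth"].sum())  # number of pixels labelled "mouth" -> 2
```

This per-pixel formulation is what makes the task fine-grained: a model must get region boundaries right everywhere, not just at a few key points.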
The competition’s rich results promoted academic exchange between industry practitioners and researchers and advanced graphics and image technology in China. It attracted more than 100 teams; 30 made it to the finals and 21 won prizes, with the top three receiving generous bonuses.
Are you curious about how to stand out in such a competition and win a prize? What are the winners’ secrets?
Today, we interviewed the members of team “1324A”, which won the third prize in the “Short Video Face Parsing Challenge”. They come from the School of Automation, University of Science and Technology Beijing, and the Institute of Automation, Chinese Academy of Sciences. They will reveal the story behind their winning in this AI competition!
Hanna: Hello, everyone! I am Hanna from Madacode, the host of this interview. Welcome!
Zhao Yikai: Hello, everyone!
Zheng Yang: Hello!
He Xingjian: Hello!
Hanna: In this “Short Video Face Parsing Challenge”, more than 100 teams participated. You stood out from them and won the third prize. It’s really an excellent result.
Hanna: Thank you for taking the time to be interviewed and to share your experience and insights. First of all, please introduce yourselves so that we can get to know you better.
Zhao Yikai: My name is Zhao Yikai. Now I am a second-year graduate student in the School of Automation, University of Science and Technology Beijing. My research interests are mainly semantic segmentation of images and videos based on deep learning, and model compression of deep network models.
Zheng Yang: Hello, my name is Zheng Yang. I am also a second-year graduate student in the School of Automation, University of Science and Technology Beijing. My main research directions are object detection and multi-object tracking.
He Xingjian: Hi, my name is He Xingjian. I am a second-year doctoral student in the Institute of Automation, Chinese Academy of Sciences. At present, my main research directions are image semantic segmentation, instance segmentation, and detection.
Hanna: Thank you. So, two of your team members are from the University of Science and Technology Beijing and one is from the Chinese Academy of Sciences. How did you find each other and form this team?
Zhao Yikai: It’s a long story. When I graduated with my bachelor’s degree, I received a postgraduate recommendation and found my graduate supervisor, Professor Li Jiangyun. At that time, our laboratory had only just entered the AI field and had not accumulated much experience in it, so my supervisor decided to send me to do research at the Chinese Academy of Sciences under Professor Liu Jing, who is He Xingjian’s supervisor. I wanted to learn more advanced knowledge there, then bring my experience back to our lab and promote its development. He Xingjian is my senior schoolmate at the Institute of Automation, and Zheng Yang has been my roommate and classmate for six years, so the three of us formed a team.
Hanna: Good. You said your lab was not originally in the AI field. What did it do before?
Zhao Yikai: In fact, our lab comes from an industrial background. Our supervisor has done many industrial projects, such as cooperating with steel mills. About five years ago, when AI was on the rise but not as hot as it is now, he very wisely foresaw the current wave of AI, so our lab began to transition into deep learning. Now our lab combines AI algorithms with practical industrial projects, such as surface defect detection on steel plates, conveyor-belt deviation detection, and coal-pile volume estimation. Last year, our lab also cooperated with a Beijing company called Tianxingyuanjing on an unmanned supermarket project.
Hanna: How did you learn about our competition? Why did you want to participate in it?
Zheng Yang: Let me answer this question. I was the one who initiated our entry: my supervisor, Prof. Fu Dongmei, posted a message about the contest in our research group’s WeChat group. I was interested, so I got in touch with my teammates. When we discussed it, we first felt that the topic was closely related to our research directions, and my teammates also had experience with segmentation, so I wanted to test my usual research results through this competition. On the other hand, many universities and companies were also participating, and we wanted to compete with these teams in practice and challenge ourselves.
Hanna: You have challenged yourselves and achieved good results, which shows that you are really strong. Why do you think your team won third place? What’s the secret of winning? For example, are there any tricks in model building and data processing?
Zhao Yikai: The most important reason our team achieved such a result is that we were down-to-earth. Given the data provided by the competition and the face parsing task it defined, we made many attempts in data processing, model building, model training, parameter tuning, and final model testing, and developed our own understanding of each stage. Instead of simply taking some existing theory and applying it as-is, we incorporated a lot of our own experience. As for the specific methods, we will give a detailed report in our presentation at the final.
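The stages Zhao lists (data processing, model building, training, parameter tuning, and testing) form the standard supervised-segmentation workflow. A deliberately framework-free sketch of how these stages fit together; the toy “model” below is only a stand-in for a real segmentation network, not the team’s actual method:

```python
import numpy as np

def preprocess(images, labels):
    """Data processing: normalise pixel values to [0, 1]."""
    return images / 255.0, labels

def build_model(num_classes):
    """Model building: here just a per-pixel class-score vector,
    standing in for a real segmentation network."""
    rng = np.random.default_rng(0)
    return rng.random((num_classes,))

def train(model, images, labels, lr=0.1, epochs=5):
    """Training / parameter tuning: nudge each class score toward
    that class's observed pixel frequency (lr is the tunable knob)."""
    freq = np.bincount(labels.ravel(), minlength=len(model)) / labels.size
    for _ in range(epochs):
        model = model + lr * (freq - model)
    return model

def evaluate(model, labels):
    """Testing: pixel accuracy of always predicting the top-scoring class."""
    pred = int(np.argmax(model))
    return float((labels == pred).mean())

images = np.full((2, 4, 4), 128.0)           # toy grayscale batch
labels = np.array([[[0, 0, 1, 1]] * 4] * 2)  # toy per-pixel labels
images, labels = preprocess(images, labels)
model = train(build_model(num_classes=2), images, labels)
print(evaluate(model, labels))  # 0.5: the two toy classes are balanced
```

Each function corresponds to one of the stages Zhao names; in a real entry, each would be a place to “do a lot of attempts” as he describes.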
Hanna: Ok, may I ask when you started preparing for the competition? How long did you spend on it each day?
Zhao Yikai: We signed up in early May. The competition opened around April 20th, but we only confirmed our participation in early May. As for how long we spent each day, the initial investment was a bit larger, because we needed to do a lot of research and build scaffolding for the code and the model. In the later period, we mostly carried out the strategies we had worked out early on and made further attempts. In fact, much of the time went to running models on the GPU; then, based on a model’s results, we would target specific problems with specific optimizations.
Hanna: Ok, thanks. What have you learned and gained from this competition? And how does that help your future research and work?
He Xingjian: Through this competition, we gained a better understanding of some of the industry’s needs. In future work, we can continue to follow these requirements and dig deeper into which areas are worth cultivating and which are probably now saturated and no longer need attention. It has a guiding effect on our scientific research.
Hanna: Could you please explain that in detail? What do you find worth cultivating?
He Xingjian: For example, face parsing for short video may be a field where current industry demand is relatively large and strong, while relatively few people are working on it. In the future, as more and more people learn about this task, more will work on it, and the algorithms will get better and better.
Hanna: Thanks. What advice do you have for other people who are learning AI and data annotation?
He Xingjian: I think current AI models are mostly data-driven, and the quantity and quality of the data have a big impact on a model’s performance. So data annotation is a very complicated process. For an image-segmentation dataset, for example, it may take ten minutes, dozens of minutes, or even an hour to annotate a single image, so data annotation is actually a labor-intensive task. But with the development of artificial intelligence, I think it should shift from purely labor-intensive to both labor- and technology-intensive; that is, we should do a good job of human-machine integration. The “machine” here refers to the current AI model: the AI model can produce a rough annotation first, which is then confirmed and fine-tuned manually. What I just described is a relatively simple form of human-machine integration. What I think is ideal is not just combining human and machine work but having them interact. Take segmentation again: annotating a segmentation dataset is quite tedious, but with human-computer interaction the annotator can first click a position with the mouse, and the AI model then marks a region directly based on that position. The annotator can annotate an entire image with just a few clicks, which greatly reduces the workload. Therefore, I think annotation will gradually transition to a mode dominated by AI and assisted by humans.
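The click-to-region workflow He Xingjian describes can be illustrated with a simple region-growing routine standing in for the AI model. Real interactive-segmentation systems feed the click into a learned network; the flood fill below is only a toy assumption used to show the interaction pattern:

```python
from collections import deque

import numpy as np

def region_from_click(image: np.ndarray, click: tuple, tol: int = 10) -> np.ndarray:
    """Grow a binary mask outward from the clicked pixel, adding
    4-connected neighbours whose intensity is within `tol` of the seed.
    A real system would replace this with a learned segmentation model."""
    h, w = image.shape
    seed = int(image[click])
    mask = np.zeros((h, w), dtype=bool)
    queue = deque([click])
    while queue:
        y, x = queue.popleft()
        if not (0 <= y < h and 0 <= x < w) or mask[y, x]:
            continue
        if abs(int(image[y, x]) - seed) > tol:
            continue
        mask[y, x] = True
        queue.extend([(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)])
    return mask

# Toy "face" image: a bright 2x2 patch (say, the mouth) on a dark background.
img = np.zeros((4, 4), dtype=np.uint8)
img[1:3, 1:3] = 200
mask = region_from_click(img, click=(1, 1))
print(mask.sum())  # one click labels all 4 patch pixels -> 4
```

One click yields a whole region mask that the annotator can then confirm or fine-tune manually, which is exactly the AI-dominated, human-assisted division of labor described above.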
Hanna: Thanks for your sharing about data annotation. I would also like to ask about your future career plans. What do you want to do in the future, and what role does AI play in the industry you want to work in?
He Xingjian: Let me go first. Since we’ve been doing this for many years, I still want to work in AI. What my team and I are doing now is mainly analyzing and understanding images and videos, so in the future I will continue research on image and video understanding. I think its prospects are broad. There used to be a buzzword, “Internet Plus”; now the more popular one is “AI+”, as in “AI+ education”, “AI+ agriculture”, and “AI+ security”, and in these fields images and videos play a very important role. So in the future I will continue along this road.
Zheng Yang: In the future, I will do further research in the field of computer vision. My current research direction is mainly multi-object detection and tracking in video, which is widely used in areas such as driverless cars. In autonomous driving, the positions and trajectories of pedestrians and vehicles in the scene are determined by detecting and tracking them, and the vehicle then decides its direction of movement accordingly. I hope I can turn my current research into concrete applications in the future.
Zhao Yikai: I should also continue in the AI direction. My main focus right now is image and video understanding, especially semantic segmentation of images and videos, and I hope to go deeper in the future. I have been working on this for several years and can continue; the applications of this direction are very extensive, and there are many unanswered questions right now. To give a few examples of applications: face parsing, as in this competition, and on a slightly larger scale human parsing, which segments the parts of the human body. Segmentation also has wide applications in driverless cars, remote sensing, and so on. One area that has been getting a lot of attention lately is medicine, which uses image segmentation on scans such as X-rays to segment diseased regions. It is really a very hot direction right now.
Hanna: Ok, thanks. After listening to your sharing, I have learned that AI is developing very fast and its application prospects are very broad. I hope your research will promote the development of AI in China in the future. Thank you for sharing today; I feel I have learned a lot, and I believe that when this knowledge is shared with more readers through our public account, they will benefit a lot as well. That’s all for today’s interview. Thank you!