Interview of the champion team in “Short Video Face Parsing Challenge”

  • Posted by 马达爱
  • September 23, 2020
  • Updated 12:39 am

Exclusive insights from ZTE’s AI experts

AI competitions provide a chance for many industry workers and college students to prove their strength and hone their skills.

Recently, the “Short Video Face Parsing Challenge”, sponsored by Madacode, the NVIDIA Inception Program and the China Association of Image and Graphics, was held from late April to mid-June and attracted considerable attention.

Face parsing is a face-centric analysis task that is widely used in virtual reality (VR), video surveillance, entertainment and social applications, facial expression analysis, and more. It divides the face captured in an image into several semantically consistent regions, such as the eyes and mouth. As a fine-grained semantic segmentation task, it is more challenging than finding face contours and key points.

The competition has promoted academic exchange between industry practitioners and researchers and advanced the development of graphics and image technology in China. It attracted more than 100 teams, 30 of which made it to the finals; 21 of them won prizes, with the top three receiving generous prize money.

Today, let us hear the story behind the team that won the championship. We interviewed members of the champion team “AI-Jieson”, who are from ZTE Corporation. Let us step into the AI world and learn more about ZTE’s AI business!

Hanna: Hello, I’m Hanna from Madacode, and I’m the host of this interview. Welcome, Zheng Chengjian and Wang Bofei from the AI-Jieson team! Congratulations on winning the championship in the “Short Video Face Parsing Challenge”. Thank you for taking the time to speak with us and share your experience of the competition. I know that you are from ZTE Corporation, so today I also want to talk with you about the application of AI in the field of communication. First of all, so that we can get to know you better, please introduce yourselves!

Zheng Chengjian: Hello, my name is Zheng Chengjian. I graduated from Shanghai University and now work at ZTE Corporation. My work focuses on video and image processing, such as object detection, object segmentation, face recognition and so on.

Wang Bofei: Hello, my name is Wang Bofei. I did my graduate studies at Huazhong University of Science and Technology. I also work on video and image processing and analysis at ZTE Corporation. We are on the same team at the company, which develops algorithms for video processing and analysis.

Hanna: May I ask how you learned about this competition? Why did you decide to take part?

Zheng Chengjian: One of our colleagues is a member of The Chinese Society of Image and Graphics. He saw the email sent by the society and learned about the competition. Our team happened to be working in this field, so he shared the information with us. We thought it was quite interesting: the algorithm segments the face into semantic parts such as the facial features, which enables applications such as intelligent beauty makeup, facial expression recognition and cartoon avatars. In addition, as just mentioned, we have worked in related areas, including object detection and image segmentation, which match the task well, so we thought we could apply our knowledge to this scenario. That is why we chose to participate.

Hanna: More than 100 teams participated in this competition, so it was very difficult to win first prize. Do you have any secrets of winning that you can share with us? Did you work in any particular way?

Zheng Chengjian: In fact, there is no “secret”. It mainly comes from what we have accumulated in our everyday work. At the beginning, we applied transfer learning with an instance segmentation network directly to the whole image, and soon found that the background was not clean enough: there were false detections in which background was mistaken for parts of the face, resulting in wrong labels. Then we came up with the idea of detecting the head first and segmenting only the area inside the detection box, so that the background stayed clean, and the results improved immediately. We also borrowed the fusion idea we had used earlier in video segmentation and applied simple fusion to the segmentation results, which further improved face parsing accuracy. Beyond that, there are network training and tuning skills, and team members communicated a lot with each other to avoid repeating experiments, since time was relatively tight.
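The two ideas in this answer, segmenting only inside a detected head box and then fusing several segmentation results, can be illustrated with a minimal sketch. It is not the team’s actual implementation: `parse_face_in_box`, `fuse_by_majority_vote`, the `segment_fn` callback and the per-pixel majority vote are all hypothetical choices made here for illustration, since the interview does not say which detector, segmentation network or fusion rule was used.

```python
import numpy as np

def parse_face_in_box(image, box, segment_fn, background_label=0):
    """Segment only the pixels inside `box`; everything else stays background.

    image:      (H, W, 3) array
    box:        (x0, y0, x1, y1) head box from some detector (assumed given)
    segment_fn: hypothetical callable mapping a crop to an (h, w) label map
    """
    h, w = image.shape[:2]
    x0, y0, x1, y1 = box
    # Clip the box to the image so slicing never goes out of range.
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(w, x1), min(h, y1)
    labels = np.full((h, w), background_label, dtype=np.int64)
    if x1 > x0 and y1 > y0:
        labels[y0:y1, x0:x1] = segment_fn(image[y0:y1, x0:x1])
    return labels

def fuse_by_majority_vote(label_maps, num_classes):
    """Fuse several label maps of equal shape by per-pixel majority vote.

    Ties go to the lowest class index (behaviour of argmax).
    """
    stacked = np.stack(label_maps)  # (n, H, W)
    votes = np.stack([(stacked == c).sum(axis=0) for c in range(num_classes)])
    return votes.argmax(axis=0)     # (H, W)

# Toy stand-in for a segmentation network: bright pixels are "face" (label 1).
def toy_segment(crop):
    return (crop.mean(axis=-1) > 128).astype(np.int64)

image = np.zeros((8, 8, 3), dtype=np.uint8)
image[3, 3] = 255   # bright pixel inside the head box
image[0, 7] = 255   # bright background pixel outside the box
boxed = parse_face_in_box(image, (2, 2, 6, 6), toy_segment)
whole = toy_segment(image)  # whole-image segmentation, for comparison
```

In this toy case the whole-image result labels the bright background pixel as face, while the boxed result leaves it as background, which mirrors the false detections described above.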

Hanna: Can you share with us what you have gained from this competition? Is it helpful to your work?

Zheng Chengjian: I have gained a lot. First of all, thank you very much for your hard work. I am very happy to have taken part in this competition and won first prize. Through this competition, we have deepened our command of segmentation networks. In the past, we segmented larger targets in video, such as pedestrians, cars and animals. This time, we segmented more fine-grained targets: the different parts of the face. It is very helpful for our work, because we can apply the face parsing results to intelligent beauty makeup and other areas. If you are doing face detection or face recognition, for example, you can use this to augment the face data, experiment with how different makeup looks affect face recognition, and compare recognition results on beautified and unbeautified images. Because image segmentation is a relatively basic technique in visual analysis, it has many applications. In fact, our team is building up this kind of core technology within the company, and segmentation can be used in company applications such as smartphone beautification and video conferencing systems, for example the background replacement we are using now.

Hanna: It seems that both of you got a lot out of the competition. Next, I want to talk about ZTE. What are your needs for AI data in your work at ZTE? Does ZTE have a dedicated data department?

Wang Bofei: The industry has now reached a consensus that data is an important component of AI. If the data is not good enough, it is very difficult to do algorithm research and, ultimately, to expand the business. Therefore, the company attaches great importance to this. For all kinds of AI data, structured and unstructured, there is a dedicated big data product line that covers everything from collection and annotation to storage, data security, data development and data opening. The company as a whole takes this seriously, and the product line should already be in wide use at dozens of sites in China.

Hanna: It can be seen that ZTE has a lot of demands and applications in data. Could you please tell us something about ZTE’s research and development in the AI field?

Zheng Chengjian: ZTE has been investing in AI for a long time and there are a lot of teams working on this. An AI Technical Expert Committee was established to take charge of the company’s overall AI technical planning. There are more than 1,000 R&D personnel in the AI field building a comprehensive AI technical capability that covers every level: underlying AI hardware, AI technical framework, AI algorithms, AI applications and so on. The company also has a unified AI big data platform to support applications for operators, government and enterprises, terminals and other fields.

There is also a unified AI framework. We have a unified development and deployment architecture that supports mainstream deep learning frameworks such as TensorFlow and Caffe, and machine learning libraries such as scikit-learn and Spark MLlib. We have carried out in-house research and optimization on key technologies such as parallel computing acceleration and model inference and pruning.

Our team mainly focuses on AI algorithm components: we have developed a large number of algorithms in fields including communication networks, audio and video, natural language processing and robot motion.

In terms of AI applications, the work mainly covers intelligent network applications, intelligent industry applications and intelligent terminal applications, and serves the needs of both business (2B) and consumer (2C) customers. ZTE has accumulated hundreds of AI-related patent applications and is actively building independent intellectual property. It holds important positions in a number of domestic and international standards and open-source organizations: chair of the architecture group of ITU-T (International Telecommunication Union) ML5G, founding member of the ETSI (European Telecommunications Standards Institute) ISG ZSM, board member of the Linux Foundation’s AI organization, and TSC chair of the Adlik model optimization project. ZTE is an active contributor to and participant in standards and open-source organizations at home and abroad.

Hanna: What do you think of the application of AI in the field of communication, and what are your prospects for its future?

Zheng Chengjian: AI applications have broad prospects in the field of communication. Introducing AI technology into communication networks can give network operations in the 5G era new capabilities, and it will play an important role in four respects: reducing simple, repetitive network operations; proactive prevention before problems occur; prediction based on historical data; and highly complex multidimensional analysis to find the optimal match between resources and business requirements. At present, ZTE has planned 60+ network AI scenarios covering the full business life cycle of planning, construction, maintenance, optimization and operation. It has cooperated extensively with global operators and implemented more than 40 commercial and pilot AI cases. Looking ahead, the evolution of communication networks toward intelligence and autonomy is the general trend. The next 10 years will be a key period for the intelligent transformation of operator networks. ZTE is willing to work with operators and partners from all walks of life to jointly promote the development of the AI ecosystem.

Wang Bofei: AI has been a very hot field in recent years, and as you can see, there is a lot of shared experience on the Internet. Speaking from our personal point of view, and setting aside the points everyone already agrees on, we think the most important thing for a newcomer is to get hands-on and try things, because, as the saying goes, what is learned only on paper always feels shallow. Start with simple problems, such as binary classification or handwriting recognition, and get through the whole process yourself: data preparation and construction, building the network, and the network’s forward inference and back-propagation. Use these simple problems to become fluent in the whole training process of a deep neural network. On that basis, you can take a problem you need to solve yourself and go deeper with the help of papers or open-source code online, for example building more complex networks, preparing more data, cleaning the data, and working through network tuning details such as optimizer parameters. With more hands-on practice and gradually greater depth, you move from beginner, to being familiar with the field, and eventually to expert.

There are two aspects of data annotation that we think are important. The first is the annotation rules. Because the data itself is closely tied to the business scenario or research content, the rules need to be clearly defined at the beginning, such as which features should be labeled. In addition, specific examples should be spelled out clearly, otherwise different people may label differently. For example, in a labeling scheme for face detection, if the rules are not clear, some people may draw a larger box while others draw a smaller one, which may hurt the overall training.
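The point about annotators drawing larger or smaller boxes can be made measurable. The sketch below is one possible check, not something described in the interview: it compares two annotators’ boxes for the same faces using intersection over union (IoU) and flags pairs that fall below a threshold. The function names and the 0.8 threshold are assumptions chosen for illustration.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # Width and height of the overlap; zero when the boxes do not intersect.
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0

def flag_disagreements(boxes_a, boxes_b, min_iou=0.8):
    """Return indices where two annotators' boxes for the same face disagree.

    boxes_a[i] and boxes_b[i] are assumed to label the same face;
    min_iou is an illustrative threshold, not a recommended value.
    """
    return [
        i for i, (a, b) in enumerate(zip(boxes_a, boxes_b))
        if box_iou(a, b) < min_iou
    ]

# One annotator boxes the face tightly, the other leaves a margin.
tight = (10, 10, 50, 50)
loose = (5, 5, 55, 55)
flagged = flag_disagreements([tight, (0, 0, 4, 4)], [loose, (0, 0, 4, 4)])
```

Pairs that are flagged often point to a rule that needs a clearer definition or a worked example, rather than to a careless annotator.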

The other important thing for annotation is to develop good annotation tools, or to download good ones from the Internet and improve them. In fact, our company attaches great importance to this as well. Our data scale is large, and our team has some dedicated semi-automatic annotation platforms for big data: our own tools or semi-automatic processing produce initial results, which are then corrected manually. This saves a lot of time and makes annotation accuracy more reliable. On this point, your company, Madacode, specializes in this business and probably has more experience than we do.

Hanna: Thank you both for sharing so much today. I’m very happy to have chatted with you, and we now have a deeper understanding of AI in the field of communication. Thank you again for your time. I hope you continue to achieve excellent results in the finals, and I look forward to more opportunities to cooperate with you in the future! Goodbye!
