Differentiate Containers Scheduling for Deep Learning Applications

Yun Song, Fordham University


The advent of deep learning has completely reshaped our world. Our daily lives are now filled with well-known applications that adopt deep learning techniques, such as self-driving cars and face recognition. Furthermore, robotics has developed further technologies that share the same principle as face recognition, such as hand pose recognition and fingerprint recognition. Image recognition requires a huge database and various learning algorithms, such as convolutional neural networks and recurrent neural networks, which in turn require substantial computational power from CPUs and GPUs. As a result, the computational resources of a local machine are no longer sufficient for clients, and cloud resource platforms emerged to meet this demand. Docker containers play a significant role in the next generation of microservices-based applications. However, they do not guarantee quality of service. Clients, for their part, have to balance budget against quality of experience (e.g., response time). The budget depends on the individual business owner, and the required Quality of Experience (QoE) depends on the usage scenario of each application: for instance, an autonomous vehicle requires real-time response, whereas unlocking a smartphone can tolerate delays. Many ongoing projects have developed user-oriented resource allocation optimizations to improve quality of service. Taking into account users' specifications, including accelerating the training process and specifying the quality of experience, this thesis proposes two differentiated container scheduling approaches for deep learning applications: TRADL and DQoES. In TRADL, developers have the option to specify a two-tier target. If the accuracy of the model reaches a target, the model can be delivered to clients while training continues to improve its quality.
The experiments show that TRADL significantly reduces the time needed to reach the target, by as much as 48.2%. DQoES, in turn, is a differentiated quality of experience scheduler for deep learning applications. It accepts clients' specifications of targeted QoEs and dynamically adjusts resources to approach those targets. In extensive cloud-based experiments, DQoES schedules multiple concurrent jobs with respect to various QoEs and achieves up to 8x more satisfied models than the existing system.

Subject Area

Computer science|Information Technology

Recommended Citation

Song, Yun, "Differentiate Containers Scheduling for Deep Learning Applications" (2020). ETD Collection for Fordham University. AAI27960639.