IEEE/CIC ICCC'21, Xiamen, China, 28-30 July 2021
Distributed computing can effectively exploit the computation resources of the edge and cloud for many large-scale machine learning applications, but it also raises concerns about data privacy and the straggler effect. A promising method to address both issues is coding. In this work, we design a general computation framework that incorporates multi-stage computing tasks with multiple inputs and can be expressed as a multi-variable, arbitrary-degree polynomial function f, executed by N distributed servers (workers) over a batch of data D drawn from different sources. We propose a privacy-preserving and straggler-robust coding scheme based on Lagrange polynomials, which tolerates up to S straggling workers and up to L colluding workers. We prove the optimality of the proposed scheme in terms of downlink communication efficiency, defined as the ratio of the number of bits of desired results to the number of bits downloaded, and obtain an explicit expression for the capacity: C = (N - S - d(L - 1) - 1) / (d(N - S)), which is the supremum of the downlink communication efficiency over all feasible encoding schemes, where d is the degree of the function f.
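To make the scheme concrete, the following is a minimal sketch of Lagrange-polynomial coded computation for the simple case f(z) = z^2 (so d = 2), with illustrative parameters K = 2 secret inputs, L = 1 colluding worker, and S = 1 straggler. The parameter names and the helper `lagrange_eval` are ours, not from the paper; exact rational arithmetic stands in for the finite-field arithmetic a real deployment would use.

```python
from fractions import Fraction
import random

def lagrange_eval(xs, ys, x):
    """Evaluate at x the unique degree-(len(xs)-1) polynomial through (xs, ys)."""
    total = Fraction(0)
    for i in range(len(xs)):
        term = Fraction(ys[i])
        for j in range(len(xs)):
            if j != i:
                term *= Fraction(x - xs[j], xs[i] - xs[j])
        total += term
    return total

# Illustrative parameters: f(z) = z**2 has degree d = 2.
d, K, L, S = 2, 2, 1, 1
N = d * (K + L - 1) + S + 1          # = 6 workers, the minimum feasible N

data  = [Fraction(3), Fraction(7)]            # K secret inputs
masks = [Fraction(random.randint(0, 10**6))]  # L random masks for privacy

# Interpolation points carry the data and masks; workers get evaluations
# of the encoding polynomial u at distinct points betas.
alphas = list(range(1, K + L + 1))
betas  = list(range(K + L + 1, K + L + 1 + N))
shares  = [lagrange_eval(alphas, data + masks, b) for b in betas]
results = {b: s ** 2 for b, s in zip(betas, shares)}  # workers apply f

# Master: drop S stragglers, interpolate f(u(z)) (degree d*(K+L-1) = 4)
# from the remaining N - S = 5 results, then read f off at the data points.
survivors = list(results.items())[:N - S]
xs = [b for b, _ in survivors]
ys = [y for _, y in survivors]
decoded = [lagrange_eval(xs, ys, a) for a in alphas[:K]]
print([int(v) for v in decoded])     # recovers [f(3), f(7)] = [9, 49]

# Consistency check against the capacity expression from the abstract:
# the achieved efficiency K/(N-S) meets C = (N-S-d(L-1)-1)/(d(N-S)).
C = Fraction(N - S - d * (L - 1) - 1, d * (N - S))
assert C == Fraction(K, N - S)
```

Note that any N - S of the N results suffice for decoding, and any L shares alone reveal nothing about the data because of the uniformly random masks; both properties follow from standard polynomial interpolation arguments.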