Note that since the $x_i$'s are known constants, this is a univariate polynomial in the unknown $y$, of degree $n$. The coefficient of $y^k$ in $P(y)$ is exactly $S^n_k$, so to evaluate all of $S^n_0, \dots, S^n_n$, it suffices to compute $P(y)$.
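As a small worked instance, take $n = 3$, assuming $P(y)$ is the product of the factors $(1 + x_i y)$:

$$P(y) = (1 + x_1 y)(1 + x_2 y)(1 + x_3 y) = 1 + (x_1 + x_2 + x_3)\,y + (x_1 x_2 + x_1 x_3 + x_2 x_3)\,y^2 + x_1 x_2 x_3\,y^3 .$$

Reading off the coefficients gives $S^3_0 = 1$, $S^3_1 = x_1 + x_2 + x_3$, $S^3_2 = x_1 x_2 + x_1 x_3 + x_2 x_3$, and $S^3_3 = x_1 x_2 x_3$.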
This makes it possible to compute $P(y)$ in $O(n \lg^2 n)$ time: build a balanced binary tree of polynomials with the $(1 + x_i y)$'s at the leaves, and multiply the polynomials up the tree. Multiplying two polynomials of degree $d$ takes $O(d \lg d)$ time using FFT techniques, so we get the recurrence $T(n) = 2T(n/2) + O(n \lg n)$, which solves to $T(n) = O(n \lg^2 n)$. For convenience, I am ignoring $\operatorname{poly}(\lg \lg n)$ factors.
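Here is a minimal Python sketch of the product tree, just to make the recursion concrete. The names `poly_mul` and `product_tree` are my own, and for simplicity `poly_mul` is the quadratic schoolbook product rather than an FFT-based one, so this sketch does not achieve the $O(n \lg^2 n)$ bound as written; swapping in an FFT-based multiplication would be needed for that.

```python
def poly_mul(a, b):
    """Schoolbook product of two coefficient lists (lowest degree first).

    Assumption: this stands in for an FFT-based multiplication,
    which is what the O(n lg^2 n) bound requires.
    """
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def product_tree(xs):
    """Coefficients of prod(1 + x*y for x in xs), lowest degree first.

    Entry k of the result is the k-th elementary symmetric
    polynomial evaluated at xs.
    """
    if not xs:
        return [1]          # empty product
    if len(xs) == 1:
        return [1, xs[0]]   # leaf: 1 + x_i * y
    mid = len(xs) // 2      # split evenly to keep the tree balanced
    return poly_mul(product_tree(xs[:mid]), product_tree(xs[mid:]))


print(product_tree([1, 2, 3]))  # [1, 6, 11, 6]
```

With $x = (1, 2, 3)$ the output lists $S^3_0, \dots, S^3_3$: the sum is 6, the sum of pairwise products is 11, and the full product is 6.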
If you care about the case where $k$ is very small, you can compute $S^n_0, \dots, S^n_k$ in $O(n \lg^2 k)$ time using similar tricks, keeping in mind that you only care about $P(y) \bmod y^{k+1}$ (i.e., you can throw away all terms in $y^{k+1}$ or higher powers of $y$).
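One way to sketch the truncation idea in Python is to drop every term of degree above $k$ after each multiplication in the tree. The names `poly_mul_trunc` and `first_coeffs` are hypothetical, and the multiplication is again the schoolbook product standing in for an FFT-based one, so the sketch shows the structure but not the stated running time.

```python
def poly_mul_trunc(a, b, k):
    """Product of coefficient lists a and b modulo y^(k+1).

    Assumption: schoolbook multiplication stands in for an
    FFT-based product; only terms of degree <= k are kept.
    """
    out = [0] * min(len(a) + len(b) - 1, k + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j > k:
                break  # larger j only gives higher degrees
            out[i + j] += ai * bj
    return out


def first_coeffs(xs, k):
    """Coefficients of prod(1 + x*y for x in xs) up to degree k."""
    if not xs:
        return [1]
    if len(xs) == 1:
        return [1, xs[0]][: k + 1]  # leaf, truncated if k == 0
    mid = len(xs) // 2
    left = first_coeffs(xs[:mid], k)
    right = first_coeffs(xs[mid:], k)
    return poly_mul_trunc(left, right, k)


print(first_coeffs([1, 2, 3, 4], 2))  # [1, 10, 35]
```

No polynomial in the tree ever has more than $k + 1$ coefficients, which is where the savings for small $k$ come from.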
Of course, the FFT uses subtraction, so naively it's not expressible in a monotone circuit. I don't know whether there's some other way to multiply polynomials efficiently with monotone arithmetic circuits, but any efficient monotone method for polynomial multiplication immediately leads to an algorithm for your problem as well. So, lower bounds on your problem require/imply lower bounds for polynomial multiplication.