We propose the use of sequences of separable, piecewise-linear approximations for solving classes of nondifferentiable stochastic optimization problems. The approximations are estimated adaptively using a combination of stochastic gradient information and, possibly, sample information on the objective function itself. We prove the convergence of several versions of such methods when the objective function is separable, and we illustrate their behavior on numerical examples. We then demonstrate their performance on nonseparable problems arising in the context of two-stage stochastic programming, showing that these techniques provide near-optimal solutions with a very fast rate of convergence compared with other solution techniques.
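For intuition only, the following is a minimal sketch, not the paper's algorithm, of adaptively estimating one separable, concave, piecewise-linear component of an objective from sampled stochastic slopes. The function names (update_slope, project_concave), the count-based stepsize, and the Poisson demand model are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def project_concave(v):
    """Pool-adjacent-violators projection onto nonincreasing slopes,
    which keeps the piecewise-linear approximation concave."""
    vals, wts = [], []
    for x in v:
        vals.append(float(x)); wts.append(1)
        while len(vals) > 1 and vals[-2] < vals[-1]:  # order violated
            w = wts[-2] + wts[-1]
            m = (wts[-2] * vals[-2] + wts[-1] * vals[-1]) / w
            vals[-2:] = [m]; wts[-2:] = [w]
    return np.array([m for m, w in zip(vals, wts) for _ in range(w)])

def update_slope(v, counts, k, slope_sample):
    """Smooth a sampled stochastic (sub)gradient into slope k,
    then restore concavity by projection."""
    counts[k] += 1
    alpha = 1.0 / counts[k]  # declining stepsize per slope
    v[k] = (1.0 - alpha) * v[k] + alpha * slope_sample
    return project_concave(v)

# Toy use (illustrative): estimate the slopes of F(x) = E[min(x, D)]
# with D ~ Poisson(5); the true slope of segment [k, k+1) is P(D > k).
rng = np.random.default_rng(0)
K = 10
v, counts = np.zeros(K), np.zeros(K, dtype=int)
for _ in range(5000):
    k = rng.integers(K)                            # point to probe
    slope = 1.0 if rng.poisson(5.0) > k else 0.0   # sampled subgradient
    v = update_slope(v, counts, k, slope)
# v[k] now approximates P(D > k).
```

The projection step is one way to keep each estimated component concave after a stochastic update; the slope estimates in this sketch converge to the true marginal values of the separable toy objective.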