Learning and convergence properties of linear threshold elements or percept,rons are well understood for the case where the input vectors (or the training sets) to the perceptron are linearly separable. However, little is known about the behavior of a linear threshold element when the training sets are linearly non-separable. In this paper we present the first known results on the structure of linearly non-separable training sets and on the behavior of perceptrons when the set of input vectors is linearly non-separable. More precisely, we show that using the well known perceptron learning algorithm a linear threshold element can learn the input vectors that are provably learnable, and identify those vectors that cannot be learned without co...