Optimal binary labelings, input distributions, and input alphabets are analyzed for the so-called bit-interleaved coded modulation (BICM) capacity, paying special attention to the low signal-to-noise ratio (SNR) regime. For 8-ary pulse amplitude modulation (PAM) and for 0.75 bit/symbol, the folded binary code (FBC) results in a higher capacity than the binary reflected Gray code (BRGC) and the natural binary code (NBC). The 1 dB gap between the additive white Gaussian noise (AWGN) capacity and the BICM capacity with the BRGC can be almost completely removed if the input symbol distribution is properly selected. First-order asymptotics of the BICM capacity for arbitrary input alphabets and distributions, dimensions, mean, variance, and binary labe...
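
To make the three labelings named above concrete, the following Python sketch lists them for 8-PAM. It is an illustration only, under stated assumptions: the function names (`nbc`, `brgc`, `fbc`, `pam_points`) are hypothetical, the constellation is assumed to be equally spaced and normalized to unit average energy under a uniform input distribution, and the FBC is written in one common sign-magnitude convention (first bit selects the half of the constellation, remaining bits are the natural binary code of the distance from the center), which may differ from the convention used in the paper by a bit inversion or a reflection.

```python
import math


def nbc(m):
    """Natural binary code: label i is the m-bit binary expansion of i."""
    return [format(i, f"0{m}b") for i in range(2 ** m)]


def brgc(m):
    """Binary reflected Gray code: label i is i XOR (i >> 1)."""
    return [format(i ^ (i >> 1), f"0{m}b") for i in range(2 ** m)]


def fbc(m):
    """Folded binary code (assumed sign-magnitude convention).

    The lower half of the constellation gets first bit 0 and the upper
    half gets first bit 1; the remaining m-1 bits are the NBC of the
    distance from the center, so the two halves mirror each other.
    """
    if m == 1:
        return ["0", "1"]
    magnitudes = nbc(m - 1)
    lower = ["0" + s for s in reversed(magnitudes)]
    upper = ["1" + s for s in magnitudes]
    return lower + upper


def pam_points(m):
    """Equally spaced M-PAM points, M = 2**m, scaled to unit average energy
    (assuming equiprobable symbols)."""
    big_m = 2 ** m
    raw = [2 * i - (big_m - 1) for i in range(big_m)]  # e.g. -7, -5, ..., 7
    scale = math.sqrt(sum(x * x for x in raw) / big_m)
    return [x / scale for x in raw]


if __name__ == "__main__":
    points = pam_points(3)
    print("point      NBC  BRGC FBC")
    for x, a, b, c in zip(points, nbc(3), brgc(3), fbc(3)):
        print(f"{x:+.4f}    {a}  {b}  {c}")
```

In this sketch, adjacent BRGC labels differ in exactly one bit, whereas the FBC labels are mirror images about the center of the constellation apart from the first bit.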