Compiler optimization passes employ cost models to determine if a code transformation will yield performance improvements. When this assessment is inaccurate, compilers apply transformations that are not beneficial, or refrain from applying ones that would have improved the code. We analyze the accuracy of the cost models used in LLVM's and GCC's vectorization passes for two different instruction set architectures. In general, speedup is over-estimated, resulting in mispredictions and a weak to medium correlation between predicted and actual performance gain. We therefore propose a novel cost model that is based on a code's intermediate representation with refined memory access pattern features. Using linear regression techniques, this plat...