To achieve structural patterns in parameters, we focus on two challenging cases: (1) a hierarchical sparsity pattern is desired, such that one group of parameters is set to zero whenever another is set to zero; and (2) many of the features are counts of rarely occurring events, and appropriate aggregation of these rare features may lead to better estimation. In both cases, the methods under consideration use a tree or a directed acyclic graph (DAG) that encodes relations among parameters as side information. For achieving hierarchical sparsity patterns in parameters, we investigate the differences between group lasso (GL) and latent overlapping group lasso (LOG) in terms of their statistical properties and computational effi...