This thesis describes methods for optimizing the energy consumption of offline bipedal walking trajectories through hip height control. The experiments were carried out on a miniature humanoid robot in the Webots simulation environment. Zero Moment Point (ZMP) preview control methods were implemented in MATLAB to produce a stable walking trajectory for the robot with a fixed hip height. The hip height trajectory was then developed using an observation-based Q-learning method that considers both stability and energy consumption. With the Q-learning method, average energy consumption decreased by approximately 9%. Additionally, an increase in stability was observed.

M.S., Mechanical Engineering and Mechanics -- Drexel Univ...