Dynamic Discrete Choice Estimation using Reinforcement Learning with Applications in Online Food Markets