Harnessing Temporal Information: Methods For Long-Term Dependency Learning In Autoregressive Models