Over the past decade, machine learning (ML), and deep learning in particular, has enabled many successful data-driven AI applications. Much real-world data is intrinsically sequential: text, speech, music, time series, DNA sequences, and unfolding events are all examples. However, conventional deep learning methods can process only short sequences of up to a few thousand steps. Existing approaches often suffer from slow inference, vanishing or exploding gradients, and difficulty capturing long-term dependencies. In this project, we develop a scalable machine learning method that enables efficient and accurate inference on very long sequences of up to millions or even billions of steps. By the end of the project, we will deliver a versatile ML framework based on deep neural networks, together with efficient optimization algorithms, software, and visualization tools. Our research findings will be applied to two focus areas: 1) microbiology and infectious disease epidemiology and 2) remote sensing pattern recognition. Moreover, because long sequential data is common in many areas, our method can serve as a critical component in a wide range of tasks, including scientific research, next-generation DNA sequence analysis, natural language processing, financial data analysis, and market studies.
Project leader: Zhirong Yang
Institution: Institutt for datateknologi og informatikk