Available courses

Interest in the filtering problem dates back to the late 1930s and early 1940s. It was considered in Kolmogorov's work on time series and in Wiener's work on prediction for anti-aircraft fire control during WWII; the latter first appeared in 1942 as a classified memorandum nicknamed "The Yellow Peril" after the colour of the paper on which it was printed.

Kalman extended this work to non-stationary processes. This work had military applications, notably the prediction of ballistic missile trajectories, whose launch and re-entry phases cannot be realistically modelled as stationary. Of course, non-stationary processes abound in other fields: even the standard Brownian motion of the basic Bachelier model in finance is non-stationary. Kalman's fellow electrical engineers initially met his ideas with scepticism, so he ended up publishing in a mechanical engineering journal. In 1960, Kalman visited the NASA Ames Research Center, where Stanley F. Schmidt took an interest in his work. This led to its adoption by the Apollo programme and other projects in aerospace and defence. The discrete-time version of the filter derived by Kalman is now known as the Kalman filter.
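As a taster of the material covered in the course, here is a minimal sketch of the discrete-time predict/update recursion for a scalar linear-Gaussian model; the parameters a, h, q, r below are made up purely for illustration.

```python
import numpy as np

# Scalar linear-Gaussian state-space model (illustrative parameters):
#   x_t = a * x_{t-1} + process noise,  process noise ~ N(0, q)
#   y_t = h * x_t     + obs noise,      obs noise     ~ N(0, r)
a, h, q, r = 0.95, 1.0, 0.1, 0.5

def kalman_step(m, p, y):
    """One predict/update cycle: (posterior mean, variance) -> updated pair."""
    # Predict: propagate the current estimate through the state dynamics.
    m_pred = a * m
    p_pred = a * p * a + q
    # Update: correct the prediction with the new observation y.
    s = h * p_pred * h + r          # innovation variance
    k = p_pred * h / s              # Kalman gain
    m_new = m_pred + k * (y - h * m_pred)
    p_new = (1.0 - k * h) * p_pred
    return m_new, p_new

# Filter a short synthetic observation sequence.
rng = np.random.default_rng(0)
x, m, p = 0.0, 0.0, 1.0
for t in range(5):
    x = a * x + rng.normal(scale=np.sqrt(q))
    y = h * x + rng.normal(scale=np.sqrt(r))
    m, p = kalman_step(m, p, y)
    print(f"t={t}: state={x:+.3f}, estimate={m:+.3f}, variance={p:.3f}")
```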

For general, nonlinear and non-Gaussian models, however, the optimal filtering solutions are infinite-dimensional and not easily applicable; in practice, numerical approximations are employed. Particle filters constitute a particularly important class of such approximations. These methods are sometimes referred to as sequential Monte Carlo (SMC), a term coined by Liu and Chen. The Monte Carlo techniques required for particle filtering date back to the work of Hammersley and Morton. Sequential importance sampling (SIS) dates back to the work of Handschin and Mayne. The important resampling step was added by Gordon, Salmond, and Smith, based on an idea by Rubin, to obtain the first sequential importance resampling (SIR) filter, which, in our experience, remains the most popular particle filtering algorithm used in practice.
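To make the SIR recursion concrete, the following sketch implements a single propagate-weight-resample step for the same kind of scalar linear-Gaussian toy model; the parameters and observations are illustrative, not drawn from the cited papers.

```python
import numpy as np

rng = np.random.default_rng(1)
a, h, q, r = 0.95, 1.0, 0.1, 0.5   # illustrative model parameters
n = 1000                            # number of particles

particles = rng.normal(size=n)      # initial particle cloud

def sir_step(particles, y):
    """One SIR step: propagate, weight by the likelihood, resample."""
    # Sequential importance sampling: move particles through the dynamics.
    particles = a * particles + rng.normal(scale=np.sqrt(q), size=n)
    # Weight each particle by the likelihood of the observation y.
    w = np.exp(-0.5 * (y - h * particles) ** 2 / r)
    w /= w.sum()
    # Resampling (the step added by Gordon, Salmond, and Smith): draw
    # particles in proportion to their weights to combat weight degeneracy.
    idx = rng.choice(n, size=n, p=w)
    return particles[idx]

for y in [0.2, 0.5, 0.1]:           # toy observation sequence
    particles = sir_step(particles, y)
    print(f"y={y}: posterior mean ~ {particles.mean():+.3f}")
```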

Markov chain Monte Carlo (MCMC) originates in the work of Nicholas Metropolis, Marshall N. Rosenbluth, Arianna W. Rosenbluth, Edward Teller, and Augusta H. Teller at Los Alamos on simulating a liquid in equilibrium with its gas phase. The breakthrough came when the authors realized that, instead of simulating the exact dynamics, they could simulate a certain Markov chain with the same equilibrium distribution.
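This idea fits in a few lines of code. The following sketch of the random-walk Metropolis algorithm targets a standard normal distribution, chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

def log_target(x):
    """Unnormalized log-density of the target; a standard normal here."""
    return -0.5 * x * x

# Random-walk Metropolis: propose a local move and accept it with a
# probability chosen so the chain's equilibrium distribution is the target.
x, chain = 0.0, []
for _ in range(10_000):
    proposal = x + rng.normal(scale=1.0)
    if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
        x = proposal
    chain.append(x)

samples = np.array(chain[2000:])    # discard burn-in
print(f"mean ~ {samples.mean():.3f}, std ~ {samples.std():.3f}")
```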

In this course we consider the theory and practice of Kalman and particle filtering and employ probabilistic programming (PP) languages, such as the veteran BUGS/WinBUGS/OpenBUGS and the more recent PyStan, PyMC3, and PyMC4, to perform MCMC analysis.

Statistical inference is the process of drawing conclusions about populations or scientific truths from a sample of data. Inference can take many forms, but the primary inferential aims are often point estimation, which provides a "best guess" of an unknown parameter, and interval estimation, which produces ranges for unknown parameters that are supported by the data.

Under the frequentist approach, parameters and hypotheses are viewed as unknown but fixed (nonrandom) quantities, and consequently there is no possibility of making probability statements about these unknowns. As the name suggests, the frequentist approach is characterized by a frequency view of probability, and the behaviour of inferential procedures is evaluated under hypothetical repeated sampling of the data.

Under the Bayesian approach, a different line is taken: the parameters themselves are regarded as random variables. As part of the model, a prior distribution for the parameters is introduced, which expresses a state of knowledge or ignorance about them before the data are obtained (ex ante). Given the prior distribution, the probability model, and the data, it is then possible to calculate the posterior (ex post) distribution of the parameters given the data, and from this distribution inferences about the parameters are made.
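As a minimal illustration of the prior-to-posterior update, consider a coin-flip model with a conjugate Beta prior; the data below are made up for the example:

```python
from scipy import stats

# Prior: Beta(2, 2) expresses mild prior belief that the coin is fair.
alpha, beta = 2, 2

# Data (made up for the example): 7 heads in 10 flips.
heads, flips = 7, 10

# Conjugacy makes the ex post distribution available in closed form:
# posterior = Beta(alpha + heads, beta + tails).
posterior = stats.beta(alpha + heads, beta + (flips - heads))

print(f"posterior mean: {posterior.mean():.3f}")             # point estimate
print(f"95% credible interval: {posterior.interval(0.95)}")  # interval estimate
```

Note how the posterior delivers both of the inferential aims mentioned above: its mean serves as a point estimate, and its quantiles yield an interval estimate.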

In the modern era, statistical inference is facilitated by probabilistic programming (PP) languages. The veteran BUGS/WinBUGS/OpenBUGS has been complemented by PyStan and PyMC3, and now PyMC4 is on the horizon.
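For instance, the coin-flip model above can be written in a few lines of PyMC3; this is a sketch, and the sampler settings are illustrative:

```python
import pymc3 as pm

# The same coin-flip model as above, expressed as a PyMC3 program.
with pm.Model() as model:
    p = pm.Beta("p", alpha=2, beta=2)                # prior on the coin's bias
    obs = pm.Binomial("obs", n=10, p=p, observed=7)  # likelihood of the data
    # Draw posterior samples by MCMC; ask for the classic trace object.
    trace = pm.sample(2000, tune=1000, return_inferencedata=False)

print(trace["p"].mean())   # MCMC estimate of the posterior mean
```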

In this course we will examine both frequentist and Bayesian approaches, explain probabilistic programming (PP) languages, and illustrate their usage with examples.

Time series databases are increasingly being recognized as a powerful way to manage big data. They can be used to instrument, learn from, and automate applications and systems; to enable real-time and predictive analytics across business processes; and to provide operational intelligence for improved and faster decisions based on what is occurring now, what has occurred in the past, and what is predicted to take place in the future.

Kdb+ is a time series database optimized for big data analytics. The columnar design of kdb+ means it offers greater speed and efficiency than typical relational databases, and its native support for time series operations vastly improves the speed of queries, aggregation, and analysis of structured data.

Kdb+ is also different from other popular databases because it has built-in proprietary languages, k and q, allowing it to operate directly on the data in the database, removing the need to ship data to other applications for analysis.

As the volume of data and the speed at which it arrives continue to grow, traditional relational database management systems face ever more challenging workloads that they were never designed to support. By contrast, kdb+ is ideally suited to these increasing demands because of its unique combination of features:

VOLUME: Kdb+/q enables decision-making in real time on vast volumes of data.

SPEED: Kdb+/q is exceptionally fast, being an in-memory database with an integrated vector-oriented programming system.

ANALYTICS: Kdb+/q and its associated products, such as kdb+/tick, are designed to deliver analytics at speed and in real time.

INTEROPERABILITY: Kdb+/q works in combination with other state-of-the-art technologies, such as Python and Kafka (see the sketch below).
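By way of illustration, the following sketch queries a kdb+ process from Python using the open-source qPython client. It assumes a kdb+ instance listening on localhost port 5000 and a hypothetical trade table; neither is part of the course materials as such.

```python
from qpython import qconnection

# Connect to a kdb+ process assumed to be listening on localhost:5000.
q = qconnection.QConnection(host="localhost", port=5000, pandas=True)
q.open()
try:
    # Run a q query server-side and pull the result back as a DataFrame:
    # average price by symbol over a hypothetical `trade` table.
    df = q("select avgPrice: avg price by sym from trade")
    print(df)
finally:
    q.close()
```

The query itself executes inside kdb+, next to the data; only the (small) aggregated result crosses the wire to Python.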

Your course has been carefully designed to flatten the usually steep learning curve of kdb+/q. It is updated every year to cater for the latest versions of kdb+/q and to cover the most recent technologies. It has been thoroughly tested at bulge bracket investment banks, sell-side companies, and leading universities.

REMOTE DELIVERY: While we are happy to welcome you at our office in the heart of Canary Wharf, during the pandemic we offer fully remote delivery.

BROWSER-BASED: We will go through the kdb+/q installation with you. However, if you prefer, the training can be fully browser-based with no need for local installation.

JUPYTER NOTEBOOKS: The course is delivered using Jupyter notebooks, an industry-standard research environment.

NUMEROUS EXERCISES: We have prepared hundreds of exercises to support the learning process. These exercises are an integral part of your course.

PRACTICAL TIPS & TRICKS: We include numerous case studies from practical work and present tips and tricks which are difficult to come by other than through extensive practice.

EXPERIENCED INSTRUCTORS: Our instructors have decades of experience working as quants and developers in the industry, and they are eager to share their knowledge.

