The Ellipsoid method is an approach proposed by Schor [1] to solve convex optimization problems. It was further developed by Yudin and Nemirovskii [2], and Leonid Khachiyan [3, 4] later used it in his derivation of the first polynomial-time algorithm for linear programming. On a theoretical basis, Khachiyan’s ellipsoid algorithm has polynomial bounds and is thus superior to the Simplex algorithm [5] as a linear program, however in practice the Simplex algorithm is more efficient. Despite this, Khachiyan’s ellipsoid method is of fundamental importance to the analysis of linear programs and more generally in combinatorial optimization theory. In this post I summarize an approach [6-10] based on Khachiyan’s algorithm. I also wrote a small c++ library that implements the algorithm. In part II of this post, I discuss the better-known Karmarkar’s algorithm [11], which was the first efficient polynomial time linear program.

# A primer on deep reinforcement learning

Reinforcement learning methods are an class of algorithms used to learn from an agent’s experience in an episodic setting. Much like how we learn from experience, an RL algorithm learns by sequentially executing actions, observing the outcome of those actions, and updating its behavior so as to achieve better outcomes in the future. This approach is distinct from traditional supervised learning, in which we attempt to model some unobserved probability distribution from a finite set of observations. In a certain sense, RL algorithms generate their own labels through experimentation. As labels are generated only through simulation, though, two characteristics of RL methods become evident: (i) sample inefficiency–a single observation may require a very large number of actions, and (ii) reward-allocation enefficiency–for instance, although a sequence of chess moves may have produced a win, it says nothing about the relative quality of each individual move; thus, we cannot avoid reinforcing bad actions and penalizing good ones. The former can be viewed as a problem of sparse rewards while the second one introduced noise into the signal that we use in optimizing our learning model. Despite these drawbacks, RL methods are unique in that they enable us to optimize an agent to perform complex tasks of our choosing, such as playing chess, walking, or playing video games, and not just map inputs to outputs in a dataset. Within RL there are a few different entities that we can try to learn. Many of the modern RL methods, such as the actor-critic approach, combine them. Here we’ll discuss some RL basics that I’ve used in developing a super-human checkers bot.

# OrbTouch -> using deformation as a medium for human-computer interaction

This mini-post explores the use of deformation as a medium for human computer interaction. The question being asked is the following: can we use the shape of an object, and how it changes in time, to encode information? Here I present a deep learning approach to this problem and show that we can use a balloon to control a computer.

# A comparison of robust regression methods based on iteratively reweighted least squares and maximum likelihood estimation

Time series data often have heavy-tailed noise. This is a form of heteroskedasticity, and it can be found throughout economics, finance, and engineering. In this post I share a few well known methods in the robust linear regression literature, using a toy dataset as an example throughout.

# Cephalopod-inspired mechanical systems

With the exception of computer automation, the design of mechanical systems has not fundamentally changed since the industrial revolution. Even though our philosophy, aesthetics, and tools are far more refined and capable, we still design mechanical systems around bars, linkages, and motors. Similarly, electronic systems have largely been constrained to rigid circuits. These constraints are the central challenge being addressed by two emerging fields: soft robotics and soft electronics. In this post I’ll talk briefly about why non-rigid systems are interesting (and potentially useful), and present results from recent work that I published in Science Magazine on this topic.