A team led by Georgia Tech mathematician Anton Leykin has developed a powerful new technique for solving problems related to 3D reconstruction. The research team’s open access paper, "Learning to Solve Hard Minimal Problems", has also won the prestigious best paper award at CVPR 2022, the Computer Vision and Pattern Recognition Conference (CVPR) — selected from a pool of over 8,000 papers submitted this year.
The team’s research idea revolved around developing a new way to solve a family of problems known as hard minimal problems, which are essential for 3D reconstruction. “A minimal problem is a smallest geometric problem one can consider in the 3D reconstruction context,” Leykin explained. “For example, recovering a 3D scene consisting of 5 points from 2 views (2-dimensional images of 5 points in the plane) without knowing the relative position and orientation of the second camera with respect to the first.”
In other words, the problem focuses on “solving” how to see in three dimensions by analyzing multiple two-dimensional perspectives — this is how humans and self-driving cars see in 3D. One way to understand this is by imagining our eyes as cameras. Both eyes capture two-dimensional images, each from a slightly different perspective. By considering the perspective of the image sent by each eye, our brains create a 3D rendering of these two-dimensional images. While our brains might do this with seeming ease in the case of our vision, solving these problems mathematically can be more difficult.
Petr Hruby, currently a Ph.D. student at the ETH Zurich Department of Computer Science, with a recent Master’s degree from Czech Technical University, serves as the paper’s lead author. He is joined by co-authors Leykin, a professor in the School of Mathematics at Georgia Tech; Timothy Duff, NSF Postdoctoral Fellow at the University of Washington (Georgia Tech Ph.D. in Algorithms, Combinatorics, and Optimization, 2021); and Tomas Pajdla, professor at the Czech Technical University in Prague Czech Institute of Informatics, Robotics and Cybernetics. The core of the team started working together during the Institute for Computational and Experimental Research in Mathematics (ICERM) semester on Nonlinear Algebra in 2018, of which Leykin was the primary organizer.
After their first project won the best student paper award at the 2019 International Conference on Computer Vision (ICCV), the team decided to pursue research in hard minimal problems.
Since the technique the researchers developed is general, Leykin said it can be applied to many other situations with similar mathematical problems. In addition, the software pieces derived from the researchers’ findings are in the public domain, and can be used by a broad computer vision community.
Solve-and-Pick vs. Pick-and-Solve
Solving minimal problems can be difficult, because they often have many spurious solutions (solutions that might solve the equation, but are ultimately unhelpful or unexpected).
Previously, the state-of-the-art technique for solving minimal problems used a “solve-and-pick” approach. Solve-and-pick involves first determining all of the possible solutions to a problem, and then picking the optimal solutions — this is done by removing non-real solutions, using inequalities, and evaluating how well they support the solution. But, when there are many spurious solutions, this type of optimization can be costly and time-consuming.
Instead of using this traditional solve-and-pick approach, the researchers investigated the opposite: a “pick-and-solve” technique that learns, for a given data sample, how to first pick a promising starting point and then continue it to a meaningful solution. This approach is unique in that it avoids computing large numbers of spurious solutions.
By selecting a suitable starting point and solving from that point (instead of solving from all points), the method can quickly find and track a path to the solution more quickly, learning how to find that target solution more efficiently.
“Instead of finding all possible solutions and then deciding which one is relevant, we aimed at ‘guessing’ which path leads to one physically meaningful solution — as long as the guess is correct with high probability, this becomes practically useful,” said Leykin. “For a ‘hard’ minimal problem, this is like finding a needle in a haystack — we need to guess one correct path out of several hundreds.”
To do so, the research combined concepts spanning several fields of mathematics: algebra, geometry, numerical analysis, and statistics. Computer science and engineering components also played a vital role: “We had to use neural networks for one particular task and, of course, implement the algorithms efficiently,” Leykin said. Since the minimal problem solvers are executed as subroutines millions, billions, or trillions of times, efficiency was essential.
Solving the hard problems
To test their method, the researchers developed a solver using their pick-and-solve technique for a well-known problem in the field. They benchmarked and studied their engineering choices with another familiar problem.
Finally, they applied their technique to a harder problem – reconstructing a 3D view using four 2D points in three views. The researchers’ implementation of their method solves this problem in about 70 microseconds on average – ten times faster than any other method.
The team hopes that their solution could change how these problems are approached and solved in the future. “Previously, ‘hard’ minimal problems were avoided in practical applications, since there were no fast reliable solvers for them,” Leykin said. “We hope that, over time, our work will convince the industry to reconsider – the ‘hard’ problems are not that hard after all!”
Hruby, Petr & Duff, Timothy & Leykin, Anton & Pajdla, Tomas. (2021). Learning to Solve Hard Minimal Problems.