Computer vision research is tough and, more annoyingly, quite competitive. The field moves extremely quickly, and many ideas are being worked on simultaneously by amazing groups worldwide. Recently, I wondered if I could satisfy my research inclinations by working on its more applied side, one that is industry-facing and results-oriented, rather than in academia. It was a lot of fun, but I discovered that it was in fact just as tough, albeit in a very different way.


When I first started out doing robotics research, I often felt overwhelmed by the number of papers coming out in machine learning, each of which promised to be the new state of the art. Every once in a while, a paper would also claim to change the way we train models altogether, which severely confused this new student in ML. For instance, I remember reading a paper on grokking, the phenomenon where a model achieves generalization long after it has overfit to a dataset. At the time, while trying to improve the generalization of my own model, I wondered if my training paradigm was what was wrong, not the model. Looking back, I see it differently.


Academic research, in my view, is producing novel insights with the luxury, or even the penalty, of certain assumptions. Applied research is the translation of those ideas into practice while removing as many of these assumptions as is economically possible.


In the context of grokking, my research question was less about how to obtain the best metric on a leaderboard and more about how to learn something new about the task at hand using many of the same assumptions as prior work, including their training paradigm. Academic research is very focused and very precise, to the point that it's okay to ignore developments like grokking for a project. One of my advisors at the time pointed out that if something doesn't singularly further that one research question, it's maybe not that relevant. Furthermore, if your approach is changing too often as a result of new papers, maybe the original question itself is too noisy.


On the other hand, applied research is a lot less focused. For instance, a lot of robotics research assumes you have a single monocular camera and tries to push 3D perception using just that. In practice, self-driving cars and trucks have multiple cameras, LiDARs, and other redundant sensors. In this case, the assumptions made in academia make the problem tougher than it is in real life. So, when you move to industry, the toolkit is a lot wider, the available techniques more numerous, and the possibility of poor results higher as well, since everything is deployed in the real world. As a result, the type of creative thinking you do is very different too.


I interned at a self-driving truck startup this summer - Plus AI - and this was one of the surprising things I learnt. If much of the noise in academic research comes from determining what to work on, the noise in applied research comes from determining which ideas potentially matter. In academia, it's okay to work with select datasets, but in industry that's not enough. Hence, during the internship I found it quite hard to form a clear idea of how best to frame a problem. At one point I was thinking about 3D object reconstruction on highways and felt overwhelmed by the wide variety of solutions I could have used (optimization-based, feed-forward, 2D priors, 3D priors, etc.). The question was no longer what's novel enough for a conference, but what's robust enough to work for many months or years. To be honest, even after the internship I'm not completely sure! All I know is that the best skill to have, regardless of what you're doing, is understanding concepts at various levels of abstraction while being opinionated about them.