June 28, 2023

What's the difference between vision-based and LiDAR-based SLAM?

For the past few decades, our society has been pioneering robots that can automate tasks across industries like vehicle manufacturing, retail, and security. However, those tasks are often extremely repeatable: take part A and weld it to part B in exactly the same spot every time. We covered this broadly in a previous post about automation vs autonomy. When you start creating an autonomous system that lets a robot explore an unknown environment, though, the robot needs to understand where it is inside that environment without a human in the loop.

An "easy" way to solve this problem is to track the robot's movements, for instance the revolutions of a wheel. A sensor called an encoder tracks how far the wheel has turned, and from that data you can extrapolate the robot's position. However, this approach isn't foolproof. The wheel could slip on a surface and cause a misreading, or the encoder itself could slip. We generally refer to this as an encoder slip. These slips cause uncertainty in the robot's position, meaning it could run into a wall or even fall off a surface.
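To make the encoder idea concrete, here is a minimal sketch of wheel odometry in Python. It assumes a two-wheeled, differential-drive robot, and the function names, tick count, wheel diameter, and track width are all hypothetical values chosen for illustration, not parameters of any particular robot. Note that the math silently trusts the wheels: if one slips, the estimated pose is wrong and nothing in this code can tell.

```python
import math


def wheel_distance(ticks, ticks_per_rev, wheel_diameter_m):
    """Distance a wheel has rolled, assuming it never slips."""
    revolutions = ticks / ticks_per_rev
    return revolutions * math.pi * wheel_diameter_m


def dead_reckon(pose, left_ticks, right_ticks,
                ticks_per_rev=1024, wheel_diameter_m=0.1, track_width_m=0.3):
    """Advance an (x, y, heading) pose using encoder ticks from two wheels.

    All default values are illustrative assumptions.
    """
    x, y, heading = pose
    d_left = wheel_distance(left_ticks, ticks_per_rev, wheel_diameter_m)
    d_right = wheel_distance(right_ticks, ticks_per_rev, wheel_diameter_m)

    d_center = (d_left + d_right) / 2.0             # forward travel of the robot
    d_heading = (d_right - d_left) / track_width_m  # change in heading (radians)

    # Midpoint approximation: move along the average heading for this step.
    mid_heading = heading + d_heading / 2.0
    return (x + d_center * math.cos(mid_heading),
            y + d_center * math.sin(mid_heading),
            heading + d_heading)


# One full revolution of both wheels moves the robot straight ahead
# by one wheel circumference.
pose = dead_reckon((0.0, 0.0, 0.0), left_ticks=1024, right_ticks=1024)
print(pose)
```

Calling dead_reckon repeatedly with each new batch of ticks accumulates the pose over time, which is also why small slip errors add up.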

SLAM algorithms take odometry and IMU data to determine the robot's position and orientation (pose)

The best way to combat that uncertainty is by adding an additional sensor, normally an inertial measurement unit, or IMU. This sensor detects any motion the robot makes, so now you have a second data set to compare against the encoder. Let's walk through that. The robot's IMU detects that it has moved forward one foot. The robot's autonomy algorithm can then check that movement against the encoder data: did it indeed move that far? If the two agree, the robot has a better understanding of how it's moving through the environment.
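The cross-check described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how any production stack does it: it assumes the IMU readings have already been turned into a displacement estimate, and the function name, tolerance, and fallback rule are hypothetical. Real systems typically blend the two sources with a probabilistic filter rather than a hard threshold.

```python
def check_motion(encoder_m, imu_m, tolerance_m=0.05):
    """Compare two independent estimates of how far the robot moved.

    Returns (best_estimate_m, sources_agree). The tolerance and the
    fallback rule below are illustrative assumptions.
    """
    disagreement = abs(encoder_m - imu_m)
    if disagreement <= tolerance_m:
        # The sensors agree, so average them to smooth out small noise.
        return (encoder_m + imu_m) / 2.0, True
    # The sensors disagree, which may indicate wheel or encoder slip.
    # This sketch falls back to the IMU estimate for the step.
    return imu_m, False


ONE_FOOT_M = 0.3048

# Encoder and IMU both report roughly one foot of travel.
print(check_motion(encoder_m=0.31, imu_m=ONE_FOOT_M))

# The encoder reports twice the distance: likely a wheel spinning in place.
print(check_motion(encoder_m=0.61, imu_m=ONE_FOOT_M))
```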

That is, at a very simplistic level, how most autonomous robots understand and move through the world. They detect motion and then check their sensors to validate where they are relative to other objects. This is the foundation of an algorithm used in a lot of autonomous robots: Simultaneous Localization and Mapping, or SLAM. Today we see robots with an array of sensors autonomously navigating underwater, flying in GPS-denied environments, or scanning whole buildings.

You can think of those sensors like eyes. And there are so many types of sensors that help robots "see" the world around them. But for this blog post we wanted to focus on two of the most common types: vision-based and LiDAR-based SLAM. 

What is vision-based SLAM? 
Vision-based SLAM creates a map of its environment using visual information captured by monocular, stereo, or RGB-D cameras. In this approach, the camera captures images of the environment, and an algorithm extracts features or landmarks from them. These features are then matched across consecutive images to estimate the camera's motion and build a map of the environment.
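As a rough illustration of the matching step, the Python sketch below pairs features between two frames by descriptor similarity and takes the median pixel displacement as a stand-in for motion. Everything here is a simplifying assumption: the features are hand-written (pixel position, descriptor) pairs rather than the output of a real detector, the function names are hypothetical, and a 2D pixel shift is far cruder than the full camera pose a real vision pipeline estimates.

```python
import math
from statistics import median


def match_features(prev_features, curr_features, max_descriptor_dist=0.5):
    """Pair each previous feature with its most similar current feature.

    Each feature is ((x, y), descriptor); descriptors are tuples of floats.
    """
    matches = []
    for prev_xy, prev_desc in prev_features:
        best_xy, best_desc = min(
            curr_features, key=lambda f: math.dist(prev_desc, f[1]))
        # Reject pairs whose descriptors are too different to be the same landmark.
        if math.dist(prev_desc, best_desc) <= max_descriptor_dist:
            matches.append((prev_xy, best_xy))
    return matches


def estimate_pixel_shift(matches):
    """Median displacement of matched features; the median resists outliers."""
    dx = median(curr[0] - prev[0] for prev, curr in matches)
    dy = median(curr[1] - prev[1] for prev, curr in matches)
    return dx, dy


# Toy data: the same three landmarks, shifted by (5, -2) pixels between frames,
# with slightly noisy descriptors.
frame_a = [((10.0, 20.0), (1.0, 0.0)),
           ((50.0, 60.0), (0.0, 1.0)),
           ((30.0, 40.0), (1.0, 1.0))]
frame_b = [((15.0, 18.0), (0.9, 0.1)),
           ((55.0, 58.0), (0.1, 0.9)),
           ((35.0, 38.0), (1.0, 0.9))]

matches = match_features(frame_a, frame_b)
print(estimate_pixel_shift(matches))
```

If the features shift one way in the image, the camera most likely moved the opposite way relative to the scene, which is the intuition behind visual odometry.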

A familiar example is NASA's Ingenuity helicopter, currently still (!) flying on Mars. It uses a downward-facing camera to identify unique geometric features on the Martian surface and tracks visual odometry from them.

Image of NASA's autonomous Ingenuity helicopter currently flying on Mars.

Thanks to the small size of camera sensors today, autonomous robotic platforms can be made smaller and lighter without compromising on their capabilities. However, this approach does come with limitations. Robots using vision-based SLAM require a well-lit environment in order to operate and can run into issues if they encounter fog, dust, or other particulates in the air. Glare and other environmental factors can affect the SLAM as well.

What is LiDAR-based SLAM?
LiDAR-based SLAM is almost the exact same algorithm used for vision, except it uses a LiDAR sensor as its "eyes". A LiDAR sensor emits laser pulses and measures the time it takes for each pulse to reflect back from surrounding objects. By analyzing where these points are in relation to the robot's inertial measurement unit (IMU), the system can create a 3D point cloud map of the environment. The robot then uses this point cloud map to estimate its position and intelligently navigate its environment. Once the vehicle lands, or completes its scan of the area, it uses the position and orientation data to further refine the point cloud map and create a more accurate representation of the environment.
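The time-of-flight measurement is simple enough to sketch in Python. The example below converts a pulse's round-trip time into a range, turns that range plus the beam angles into a 3D point in the sensor's frame, and then places the point in the world using the robot's pose. The function names are hypothetical, and the pose is simplified to a position plus a yaw angle (a real system would use the full 3D orientation from the IMU and SLAM estimate).

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def tof_to_range(round_trip_s):
    """Range to the target; the pulse travels out and back, hence the / 2."""
    return SPEED_OF_LIGHT_M_S * round_trip_s / 2.0


def beam_to_point(range_m, azimuth_rad, elevation_rad):
    """Convert one range reading and its beam angles to (x, y, z) in the sensor frame."""
    horizontal = range_m * math.cos(elevation_rad)
    return (horizontal * math.cos(azimuth_rad),
            horizontal * math.sin(azimuth_rad),
            range_m * math.sin(elevation_rad))


def to_world(point, pose):
    """Place a sensor-frame point in the world frame.

    pose is (x, y, z, yaw_rad); ignoring roll and pitch is a simplifying assumption.
    """
    sx, sy, sz = point
    px, py, pz, yaw = pose
    return (px + math.cos(yaw) * sx - math.sin(yaw) * sy,
            py + math.sin(yaw) * sx + math.cos(yaw) * sy,
            pz + sz)


# A pulse that returns after about 66.7 nanoseconds hit something roughly 10 m away.
range_m = tof_to_range(66.7e-9)
point = beam_to_point(range_m, azimuth_rad=0.0, elevation_rad=0.0)
print(to_world(point, pose=(1.0, 2.0, 0.0, math.pi / 2)))
```

Repeating this for every pulse in a scan, each with the pose estimated at that instant, is what accumulates into a point cloud. It also shows why better pose estimates after the flight yield a sharper map: every point is only as well placed as the pose used to transform it.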

The ExynAero, with a gimbaled LiDAR sensor mounted in front for maximum FOV.

One major benefit of using a LiDAR-based SLAM platform is that the sensor can operate in an environment with little to no light at all. This is crucial for data capture in search & rescue situations and for applications such as underground cavity monitoring in the mining industry.

LiDAR-based SLAM is widely used in autonomous driving, robotics, and mapping applications. It has several advantages over vision-based SLAM, including the ability to operate in low-light or no-light conditions and to generate highly accurate and detailed maps of the environment.

Which is better: LiDAR-based SLAM or vision-based SLAM? 
There is no clear winner between vision-based and LiDAR-based SLAM, as both have their strengths and weaknesses and are better suited for different applications and environments.

Vision-based SLAM has the advantage of being less expensive and easier to implement, as it uses standard cameras that are widely available. It can also provide detailed visual information about the environment, such as texture and color, which can be useful in some applications. However, vision-based SLAM can be less accurate than LiDAR-based SLAM, especially in low-light or dynamic lighting conditions, and can be more sensitive to visual occlusions or cluttered environments.

LiDAR-based SLAM, on the other hand, has the advantage of providing highly accurate and precise 3D maps of the environment, even in low-light or no-light conditions. It is also less sensitive to visual occlusions or cluttered environments and can be more robust to changes in lighting conditions. However, LiDAR sensors can be expensive and require significant computational resources to process the large amounts of data they generate.

The Power of ExynAI
One of the benefits of our autonomy software stack, ExynAI, is that it can fuse any type of visual sensor (RGB cameras, LiDAR, thermal, etc.) and use that as the 'eyes' of the SLAM pipeline. At the moment we've successfully integrated our software with a LiDAR-based SLAM robot. In the future, we plan to create smaller platforms that utilize the visual spectrum for close-quarters industrial inspection.

Once we've tackled more of these fundamental problems with building an autonomous robot with a variety of sensors, there's almost nowhere our SLAM pipeline wouldn't be able to map. Sending an autonomous robot underwater? You bet! There are already robust sensors in development that can see farther and crisper underwater. Well, certainly this couldn't be used in space? Think again!

Right now, humanity is very good at sending a rocket from Earth orbit to, say, Mars. But if we had to dock with a space station in orbit once we got there, that would require precise human control. However, if we were to equip that ship with ExynAI, our autonomy could take over when the ships were close enough, like an autopilot taking over for landing a plane, and successfully dock the ships in orbit. These are exactly the types of problems we need to solve in order to bring our sci-fi vision of the future closer to reality.

LiDAR-based SLAM is a powerful tool that can benefit a wide range of industries and applications, from industrial inspection to underground mining. If you're interested in learning more about how LiDAR-based SLAM can help your organization, we encourage you to book a demo and see the power of true autonomy in action. 
