Types of 3D Cameras
In the fields of robotics and engineering, finding the distance to an object can be very handy. For example, if we want to pick up a ball 50 m away, we first have to get closer to it before we can pick it up. There are various types of sensors for measuring the distance to an object. They can be categorized as follows.
1. Time-of-flight sensors: These sensors measure the distance to a single point in space; ultrasonic sensors and lidars are common examples. By attaching a lidar to a motor and recording the angle it has rotated at each timestamp, we can turn it into a 2D scanner with excellent accuracy (see the sketch after this list). We can also scan a full 3D environment by adding a servo for the z-axis.
2. Depth images: Another excellent way to measure distance is to build a depth map of the environment. Each pixel in the map has a number assigned to it that represents the distance of that point from the 3D camera sensor.
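Returning to the scanner mentioned in point 1, here is a minimal Python sketch of how a rotating range sensor's (angle, distance) readings become 2D points in the sensor frame; the function name and sample readings are purely illustrative.

```python
import math

def scan_to_points(angles_deg, ranges_m):
    """Convert (angle, distance) pairs from a rotating range sensor
    into 2D Cartesian points in the sensor frame."""
    points = []
    for angle_deg, r in zip(angles_deg, ranges_m):
        theta = math.radians(angle_deg)
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points

# One reading every 2 degrees, for example:
print(scan_to_points([0, 2, 4], [1.00, 1.02, 0.98]))
```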
Depth images can then be converted into point clouds, letting us visualize the environment and see how far away the objects in it are.
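As a minimal sketch of that conversion, the standard pinhole camera model back-projects each pixel (u, v) with depth Z to the 3D point X = (u - cx)·Z/fx, Y = (v - cy)·Z/fy. The intrinsics fx, fy, cx, cy are assumed to come from calibration; the values used here are placeholders.

```python
import numpy as np

def depth_to_pointcloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (meters per pixel) into an Nx3
    point cloud using the pinhole camera model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.dstack((x, y, z)).reshape(-1, 3)

# Placeholder intrinsics; real values come from camera calibration.
cloud = depth_to_pointcloud(np.ones((480, 640)), fx=525.0, fy=525.0, cx=320.0, cy=240.0)
print(cloud.shape)  # (307200, 3)
```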
There are three ways of producing depth images, and each has its pros and cons.
ToF (time-of-flight) camera:
This is a particular type of 3D camera that outputs values in RGBD format, where RGB represents color intensity and D represents how far the object at that pixel is from the camera.
In addition to the RGB sensor, the camera has a near-infrared (NIR) laser emitter; for each pixel, it measures how long the emitted light takes to bounce off the scene and return to the sensor (similar to lidar, but done for every pixel). One such 3D camera is the Azure Kinect.
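The underlying arithmetic is simple: the light travels to the object and back, so the distance is half the round-trip time multiplied by the speed of light. A tiny sketch (the function name is illustrative):

```python
C = 299_792_458.0  # speed of light, m/s

def tof_distance(round_trip_time_s: float) -> float:
    """Distance from a time-of-flight reading: the pulse travels
    to the object and back, so halve the round trip."""
    return C * round_trip_time_s / 2.0

print(tof_distance(20e-9))  # a 20 ns round trip is roughly 3 m
```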
3D cameras like these use less computational power than other 3D cameras: they only need a processor capable of displaying 3D points and require no further calculations. However, unlike other depth cameras, they cannot be used outdoors, as sunlight is strong enough to interfere with the NIR sensor, not only corrupting the results but also potentially damaging the sensor.
Stereo Camera:
This is the most prevalent 3D camera setup used in industry for making depth maps. It perceives depth with a two-camera setup, just like our eyes, and, also like our eyes, it estimates depth using a technique known as disparity mapping. Disparity mapping uses the geometry of the setup and the known distance between the cameras to estimate depth.
First, we have to calibrate the cameras: remove any distortion (such as fisheye) and correct for sensors that are not perfectly parallel to the image plane. Next, we have to accurately measure the other parameters, such as the distance between the cameras (the baseline) and the angle between them, and check that both cameras have equal resolution and matching color response. Only after that can depth mapping be done.
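As a rough sketch of the distortion-removal step with OpenCV, assuming the intrinsic matrix and distortion coefficients have already been obtained from a calibration routine such as cv2.calibrateCamera (the numbers and filename below are placeholders):

```python
import numpy as np
import cv2

# Placeholder intrinsics and distortion coefficients; in practice
# these come from cv2.calibrateCamera with a checkerboard pattern.
K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])
dist = np.array([-0.25, 0.10, 0.0, 0.0, 0.0])  # k1, k2, p1, p2, k3

img = cv2.imread("left.png")  # placeholder filename
undistorted = cv2.undistort(img, K, dist)
```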
The difference between the two images is larger when an object is close to the camera and negligible when the object is far away. A computer quantifies these differences by creating a disparity map, which shows how far the same point has shifted between the two images. If the disparity is large, the object is close; depth is proportional to 1/disparity. This technique is used by Intel RealSense 3D cameras, and unlike the cameras above they are not vulnerable to sunlight. However, they need very careful calibration, and both sensors must lie on the same horizontal plane.
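Here is a minimal sketch of computing a disparity map with OpenCV's block matcher and turning it into depth; the frames are assumed to be rectified, and the filenames, focal length, and baseline are placeholder values.

```python
import numpy as np
import cv2

# Rectified left/right frames from a calibrated stereo pair
# (filenames are placeholders).
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block matching over 64 disparity levels with a 15x15 window.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0  # StereoBM returns fixed-point values

# Depth is inversely proportional to disparity:
# Z = fx * baseline / disparity (fx in pixels, baseline in meters).
fx, baseline = 700.0, 0.06  # placeholder calibration values
valid = disparity > 0       # disparity <= 0 means no match found
depth = np.zeros_like(disparity)
depth[valid] = fx * baseline / disparity[valid]
```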
Depth estimation through CNN models
Here the depth map is computed from a single RGB camera (typically any camera will do); however, this relies heavily on neural-network computation. Depth estimation of this kind became possible by training the weights and biases of neural networks on multiple depth datasets; some of these datasets are ReDWeb, DIML, Movies, MegaDepth, WSVD, TartanAir, and HRWSI, and there are many more. The de-facto model for depth estimation is MiDaS (https://github.com/isl-org/MiDaS.git). Although AI depth estimation is usable, it is not yet on par with the other two camera types, as it relies on heavy computation and can cause serious performance issues. But as processors get faster and cheaper, these models may come to replace other 3D cameras, and they are already used in many applications that do not depend on frame rate or that work mostly with still images, since they only need a plain RGB image as input.
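A minimal sketch of running MiDaS, following the torch.hub usage documented in the MiDaS repository; the input filename is a placeholder, and note that MiDaS predicts relative inverse depth rather than metric distance.

```python
import cv2
import torch

# Load the small MiDaS model and its matching input transform.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.small_transform

img = cv2.imread("scene.jpg")                # placeholder filename
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)   # MiDaS expects RGB

with torch.no_grad():
    prediction = midas(transform(img))
    # Resize the prediction back to the original image resolution.
    prediction = torch.nn.functional.interpolate(
        prediction.unsqueeze(1),
        size=img.shape[:2],
        mode="bicubic",
        align_corners=False,
    ).squeeze()

depth_relative = prediction.cpu().numpy()  # relative inverse depth map
```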