With the rapid progress in autonomous driving systems, few today question the practicality of driverless vehicles, at least within restricted operational design domains (ODDs). The launch of Audi’s A8 with Level 3 functions, which reached the deployment stage in mid-2017, increased confidence among the majority of OEMs and Tier-1 suppliers; however, Level 4 and Level 5 vehicles still need considerable time and testing before they reach public roads.
Perception is the basis for a vehicle to drive itself without a driver. A highly automated vehicle must be trained well enough to track, classify, and differentiate the objects in its vicinity in order to decide on its course of action.
Moreover, predicting the path of moving entities is the next most important capability for highly automated vehicles. This can be achieved through rigorous testing and validation on very large datasets covering multiple scenarios. Data annotation, or the labeling of objects, plays a vital role in this by helping to automate and fast-track the process.
What is Annotation?
Annotation is the process of labeling objects of interest in an image or video, typically with bounding boxes, to help AI or machine learning models understand and recognize the objects detected by sensors.
In the ADAS development process, a high volume of data is acquired from the test fleet through cameras, ultrasonic sensors, radar, LiDAR, and GPS, and is then ingested from the vehicle into a data lake. The ingested data is labeled and processed to build a test suite for the simulation, validation, and verification of ADAS models. Getting autonomous vehicles onto public roads quickly requires a huge amount of training data, and the current shortage of it is the biggest challenge.
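To make the idea concrete, the sketch below shows one way a single bounding-box label might be stored. It is a minimal illustration only: the `BoxAnnotation` class, its field names, and the sample file name are assumptions made for this example and do not correspond to the format of any particular tool or dataset.

```python
from dataclasses import dataclass, asdict


@dataclass
class BoxAnnotation:
    """One labeled object in one image (hypothetical schema)."""
    image_id: str
    label: str      # object class, e.g. "pedestrian"
    x_min: float    # left edge, in pixels
    y_min: float    # top edge, in pixels
    x_max: float    # right edge, in pixels
    y_max: float    # bottom edge, in pixels

    def __post_init__(self):
        # Reject boxes with zero or negative width or height.
        if self.x_max <= self.x_min or self.y_max <= self.y_min:
            raise ValueError("box must have positive width and height")

    def area(self) -> float:
        """Box area in square pixels."""
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


# Example values are invented for illustration.
ann = BoxAnnotation("frame_000123.jpg", "pedestrian", 410.0, 220.0, 470.0, 380.0)
record = asdict(ann)  # plain dict, ready to be serialized (e.g. as JSON)
```

A record like this pairs the class of the object with its location in the image, which is the information a perception model needs in order to learn from the frame.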
Types of Labeling: Who Are the Developers?
A large amount of rich and diverse labeled data is the most precious asset required for training and validating autonomous vehicles. Ground truth annotation involves collecting information on location, allowing the image data to be related to the reality on the ground.
This annotated data assists in training and validating the perception and prediction models with precision. For autonomous vehicles, ground truth labeling helps in annotating urban scenarios, highway environments, road markings and sign boards, and different weather conditions, which enables models to be trained efficiently to detect moving objects.
Manual labeling of these huge datasets requires significant resources, time, and money. Several automation software tools and labeling apps that have evolved recently provide frameworks for creating algorithms that automate the labeling process while ensuring the same precision and safety.
Available annotation tools, both commercial and open source, include Amazon SageMaker Ground Truth, MathWorks Ground Truth Labeler app, Intel’s Computer Vision Annotation Tool (CVAT), Microsoft’s Visual Object Tagging Tool (VoTT), DataTurks, LabelMe, Fast Image Data Annotation Tool (FIAT), COCO Annotator, Scalabel by DeepDrive, RectLabel, and Cloud-LSVA.
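As a rough illustration of how automation can reduce manual effort, the sketch below accepts model-generated labels automatically when the model is confident and sends the rest to a human annotator. This is an assumed workflow written for this article, not a description of how any tool listed above works; the `PreLabel` type, the `triage` function, and the 0.9 threshold are all hypothetical.

```python
from typing import NamedTuple


class PreLabel(NamedTuple):
    """A label proposed by a model for one frame (hypothetical type)."""
    frame_id: str
    label: str
    confidence: float  # model score in [0, 1]


def triage(pre_labels, threshold=0.9):
    """Split proposed labels into auto-accepted and needs-review lists.

    The threshold is an arbitrary example value, not a recommendation.
    """
    accepted, review = [], []
    for p in pre_labels:
        if p.confidence >= threshold:
            accepted.append(p)
        else:
            review.append(p)
    return accepted, review


# Example values are invented for illustration.
labels = [
    PreLabel("f1", "car", 0.97),
    PreLabel("f2", "cyclist", 0.62),
    PreLabel("f3", "traffic_sign", 0.91),
]
accepted, review = triage(labels)
```

In this example two of the three proposed labels clear the threshold, so only one frame would need a manual check. Where the threshold sits is a trade-off between labeling cost and the risk of accepting a wrong label.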
Competition in the AV Industry
The autonomous driving race among the different players in the ecosystem is intensifying, with each aiming to showcase the most precise and fluent system capable of operating in any weather conditions.
The majority of players are adopting Artificial Intelligence (AI) and Machine Learning (ML) to train their AVs. The huge volume of data captured from sensors needs to be labeled, or annotated, to train these machine learning models accurately. This market holds billion-dollar business potential behind the AV industry itself.
The majority of automotive OEMs and Tier-1s have started outsourcing data labeling, while a few find paying third parties painful and hence prefer in-house data annotation. For example:
Waymo, which has the highest number of autonomous test miles traveled, has in-house annotated datasets of approximately 25 million 3D bounding boxes and 22 million 2D bounding boxes.
Tesla has also gathered 1.3 million miles of data from its Autopilot-equipped vehicles.
As companies move towards the production stage of AVs, the data annotation requirement is scaling up rapidly. It becomes challenging for companies to meet this mounting demand for training datasets internally, and hence they are moving towards outsourcing data annotation.
Specialized annotation companies serving the self-driving industry include CMORE Automotive, Understand.ai, Annotell, and FEV Group from Germany; the United States-based Cogito Tech, Scale AI, Anolytics, Basic AI, Deepen.ai, Samasource, Inc., Appen, and Lionbridge Technologies Inc.; and Playment, mCYCLOID, GTS Ltd, Infolks Group, and Oclavi, which are a few of the well-known companies headquartered in India.
The data labeling industry in India has developed massively over the past two to three years. Several start-ups have emerged in the region, making it a hub for ML datasets with quality and innovative solution offerings. CMORE Automotive, a well-known German provider of software tools and measurement systems, has formed a joint venture with Expert Global Solutions (EGS), based in Aurangabad, India, named ‘EC. Mobility’, which is focused on autonomous driving data annotation.
Other companies with high growth potential in this field include Egypt-based Avidbeam, Israel’s Dataloop, and Canada’s Awakening Vector. Among these, Avidbeam has comparatively more years of experience, with 30+ experts and engineers working on the annotation database and serving industries such as smart cities, retail, automotive, industrial, and consumer.