As the name implies, object recognition, a central part of computer vision, is usually centered on the object. It's what you want to detect, after all. But what if we told you that the environment those objects are in is just as important for your model?
You might find yourself in a situation where the low environment diversity in your training dataset has a negative impact on your metrics.
Our KIADAM tool lets you freely choose your backgrounds, which can help you bypass this constraint. We'll show you how and why it works, and help you choose the kind of backgrounds that will work best for you.
Our Experiment
To run this experiment, we use two testing sets containing the same objects in two drastically different environments. Below are excerpts of each testing set. Each has the same four objects (cans of Coke, Pepsi, Schweppes and Sprite), placed in an apartment in one set and outdoors in a park in the other.
Want to generate your own Dataset? Follow our comprehensive tutorial
For the training, we used KIADAM to generate four datasets of 1000 images each. Each image is generated by pasting photos of our cans onto the chosen backgrounds, combined with various data augmentation techniques.
In this case, the photos of the cans, the transformations and all other parameters are exactly the same for the four training sets. The only difference is the backgrounds.
We have:
One with backgrounds from the apartment of the first testing set
One with backgrounds from the park of the second testing set
One with backgrounds from both
A final set with a collection of 120 varied backgrounds from https://unsplash.com/, a website for free-to-use professional photos. This background set contains a mix of indoor and outdoor images unrelated to either of the testing sets. You can download it here
See the results of this generation down below
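To make the paste-based generation more concrete, here is a minimal sketch of one step: choosing where to paste an object cutout on a background and writing the matching bounding-box label. This is not KIADAM's actual implementation. The function name `place_object`, the scale range and the image sizes are assumptions made for illustration, and the label uses the normalized `class x_center y_center width height` format that YOLOv5 reads.

```python
import random


def place_object(bg_w, bg_h, obj_w, obj_h, class_id, rng, scale_range=(0.5, 1.5)):
    """Pick a random scale and position for an object cutout on a background.

    Hypothetical helper: returns the top-left paste corner, the scaled size,
    and a YOLO-format label line with coordinates normalized to [0, 1].
    """
    scale = rng.uniform(*scale_range)
    # Cap the scale so the object always fits inside the background.
    scale = min(scale, bg_w / obj_w, bg_h / obj_h)
    w = max(1, int(obj_w * scale))
    h = max(1, int(obj_h * scale))
    # Random top-left corner such that the whole object stays in frame.
    x = rng.randint(0, bg_w - w)
    y = rng.randint(0, bg_h - h)
    x_center = (x + w / 2) / bg_w
    y_center = (y + h / 2) / bg_h
    label = f"{class_id} {x_center:.6f} {y_center:.6f} {w / bg_w:.6f} {h / bg_h:.6f}"
    return (x, y), (w, h), label


# Assumed sizes: a 1280x720 background and a 120x240 cutout of a can.
rng = random.Random(0)
corner, size, label = place_object(1280, 720, 120, 240, class_id=0, rng=rng)
print(corner, size, label)
```

Because the same cutouts and the same parameters can be reused with a different background folder, only the backgrounds change between two generated datasets, which is the setup used in this experiment.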
First Results
We trained YOLOv5 on these four training sets using Ultralytics' distribution, with 150 epochs for each training run.
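As an illustration, the four training runs could be launched with the `train.py` script from the Ultralytics YOLOv5 repository. The sketch below only builds and prints the commands. The dataset YAML file names, the `yolov5s.pt` starting weights and the 640 pixel image size are assumptions, since the exact settings beyond the 150 epochs are not stated here.

```python
import shlex

# Hypothetical dataset config files, one per background set.
DATASETS = {
    "apartment": "apartment.yaml",
    "outdoors": "outdoors.yaml",
    "apartment_outdoors": "apartment_outdoors.yaml",
    "unsplash_120": "unsplash_120.yaml",
}


def train_command(data_yaml, run_name, epochs=150, weights="yolov5s.pt", img_size=640):
    """Build the argument list for one YOLOv5 training run.

    The weights and image size are assumed defaults, not the values
    used in the experiment.
    """
    return [
        "python", "train.py",
        "--data", data_yaml,
        "--weights", weights,
        "--epochs", str(epochs),
        "--img", str(img_size),
        "--name", run_name,
    ]


commands = [train_command(yaml_file, name) for name, yaml_file in DATASETS.items()]
for cmd in commands:
    # Print instead of executing; run each line from the yolov5 repository root.
    print(shlex.join(cmd))
```

Keeping every argument identical except `--data` mirrors the experiment: the only thing that changes from one run to the next is the background set behind the dataset.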
Here is a compilation of our results. The comparison metric used is the mean Average Precision or mAP. The only difference between all training sets is the background, everything else being strictly identical (same number of images, same generation parameters, trained with the same number of epochs...).
Type of Background | mAP - Apartment Testing Set | mAP - Outdoors Testing Set |
---|---|---|
Apartment | 0.419 | 0.179 |
Outdoors | 0.443 | 0.776 |
Apartment & Outdoors | 0.545 | 0.713 |
120 Images from Unsplash | 0.594 | 0.64 |
Which backgrounds should you choose?
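As you can see, changing the backgrounds can cause some wild variations in performance. On the Outdoors testing set, the mAP goes from 0.179 with the Apartment backgrounds to 0.713 with the Apartment & Outdoors backgrounds, and up to 0.776 with the Outdoors backgrounds, simply by changing the backgrounds! The sample standard deviation of the four models on the Outdoors testing set is as high as 0.27.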
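The spread on the Outdoors testing set can be checked directly from the table. The short sketch below uses the four mAP values reported above and Python's `statistics` module. It assumes the figure quoted is the sample standard deviation (`stdev`), which rounds to 0.27. The population version (`pstdev`) would give a smaller number.

```python
from statistics import mean, pstdev, stdev

# mAP on the Outdoors testing set, copied from the results table.
outdoors_map = {
    "Apartment": 0.179,
    "Outdoors": 0.776,
    "Apartment & Outdoors": 0.713,
    "120 Images from Unsplash": 0.64,
}

values = list(outdoors_map.values())
sample_std = stdev(values)        # divides by n - 1
population_std = pstdev(values)   # divides by n
spread = max(values) - min(values)

print(f"mean mAP:       {mean(values):.3f}")
print(f"sample std:     {sample_std:.2f}")
print(f"population std: {population_std:.2f}")
print(f"max - min:      {spread:.3f}")
```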
Let's look a bit deeper, and try to figure out which set of backgrounds is the one you should use.
The Apartment backgrounds set significantly underperforms on both testing sets.
The Outdoors backgrounds set is the best on the Outdoors test, but almost the worst on the Apartment test.
The "Apartment & Outdoors" and "120 Images from Unsplash" sets both hold up pretty well on both tests.
Let's run a test with all test images at once, outdoors and apartment, to see which is better overall.
Testing on both sets at once
Type of Background | mAP - Apartment & Outdoors Testing Set |
---|---|
Apartment | 0.270 |
Outdoors | 0.606 |
Apartment & Outdoors | 0.660 |
120 Images from Unsplash | 0.591 |
The Outdoors set and the Unsplash background set are quite close to each other, with a mAP of 0.606 and 0.591 respectively. However, this is explained by the fact that the model trained with Outdoors backgrounds does very well on the Outdoors testing set and (relatively) poorly on the other.
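One simple way to see what the combined score hides is to compute, for each model, the gap between its two per-environment scores. The sketch below uses only the values from the first results table. The helper name `environment_gap` is made up for this example.

```python
# (mAP on Apartment testing set, mAP on Outdoors testing set), from the first table.
results = {
    "Apartment": (0.419, 0.179),
    "Outdoors": (0.443, 0.776),
    "Apartment & Outdoors": (0.545, 0.713),
    "120 Images from Unsplash": (0.594, 0.64),
}


def environment_gap(apartment_map, outdoors_map):
    """Absolute difference between the two per-environment mAP scores."""
    return round(abs(apartment_map - outdoors_map), 3)


gaps = {name: environment_gap(*scores) for name, scores in results.items()}

# Smallest gap first: the most consistent model across environments.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name:<26} gap = {gap:.3f}")
```

A small gap means the model behaves similarly in both environments. A large one means a single combined mAP is averaging a strong result with a weak one.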
Is the Outdoors background trained set really this good?
It is more common to see cans of soda in an apartment than in a park. What would happen if we adjusted the distribution of the testing set to include more apartment pictures?
Type of Background | mAP - 75% Apartment Test |
---|---|
Apartment | 0.328 |
Outdoors | 0.521 |
Apartment & Outdoors | 0.602 |
120 Images from Unsplash | 0.576 |
As you can see, the mAP of the Outdoors background trained model crumbles rapidly under these circumstances, while the other two sets are more resilient.
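If you want to run the same kind of check with your own images, a testing set with a chosen apartment share can be assembled by subsampling one of the two image lists. This is a minimal sketch, not the exact procedure used for the table above. The function name `mix_test_set`, the file paths and the set sizes are hypothetical.

```python
import random


def mix_test_set(apartment, outdoors, apartment_share=0.75, seed=0):
    """Keep every apartment image and subsample outdoors images so that
    apartment images make up roughly `apartment_share` of the mix."""
    if not 0 < apartment_share < 1:
        raise ValueError("apartment_share must be strictly between 0 and 1")
    rng = random.Random(seed)
    # n_out / (n_apt + n_out) = 1 - share  =>  n_out = n_apt * (1 - share) / share
    n_out = round(len(apartment) * (1 - apartment_share) / apartment_share)
    n_out = min(n_out, len(outdoors))  # cannot sample more than we have
    mixed = list(apartment) + rng.sample(list(outdoors), n_out)
    rng.shuffle(mixed)
    return mixed


# Hypothetical file lists standing in for the two testing sets.
apartment_images = [f"apartment/img_{i:03d}.jpg" for i in range(90)]
outdoors_images = [f"outdoors/img_{i:03d}.jpg" for i in range(90)]

mixed = mix_test_set(apartment_images, outdoors_images, apartment_share=0.75)
share = sum(path.startswith("apartment/") for path in mixed) / len(mixed)
print(len(mixed), share)
```

Matching the test mix to where your model will really be used gives a more honest picture than an even split between environments.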