Rémi Coscoy

The importance of Backgrounds in Object Recognition with KIADAM

Updated: Aug 29, 2023

As the name implies, Object Recognition, a central part of computer vision, is usually centered on the object. It's what you want to detect, after all. But what if we told you that the environment those objects are in is just as important for your model?

You might find yourself in a situation where the low environment diversity in your training dataset has a negative impact on your metrics.


Our KIADAM tool, by allowing you to freely choose your backgrounds, can help you bypass this constraint. We'll show you how and why it works, and help you choose the kind of backgrounds that will work best for you.

Our Experiment

Run it yourself with our Train and Test Google Colab notebooks

To run this experiment, we use two testing sets containing the same objects in two drastically different environments. Below are excerpts of each testing set. Each has the same four objects (cans of Coke, Pepsi, Schweppes and Sprite), placed in an apartment in one set and outdoors in a park in the other.




Want to generate your own Dataset? Follow our comprehensive tutorial

For the training, we used KIADAM to generate four datasets of 1000 images each. The generation works by pasting photos of our cans onto the chosen backgrounds, combined with various data augmentation techniques.
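To make the pasting step concrete, here is a minimal sketch of how a random paste position can be turned into a bounding-box label. It is not KIADAM's actual implementation, which this article does not describe. The function name `paste_position_and_label` and the example sizes are assumptions made for illustration; the label layout (class, x center, y center, width, height, all normalized by the image size) is the standard YOLO text format.

```python
import random


def paste_position_and_label(bg_size, obj_size, class_id, rng):
    """Pick a random paste position and return the matching YOLO label.

    bg_size and obj_size are (width, height) in pixels.
    Hypothetical helper for illustration, not part of KIADAM.
    """
    bg_w, bg_h = bg_size
    obj_w, obj_h = obj_size
    if obj_w > bg_w or obj_h > bg_h:
        raise ValueError("object does not fit inside the background")

    # Top-left corner chosen so the object stays fully inside the background.
    x = rng.randint(0, bg_w - obj_w)
    y = rng.randint(0, bg_h - obj_h)

    # YOLO labels store the box center and size, normalized to [0, 1].
    label = (
        class_id,
        (x + obj_w / 2) / bg_w,
        (y + obj_h / 2) / bg_h,
        obj_w / bg_w,
        obj_h / bg_h,
    )
    return (x, y), label


rng = random.Random(0)
position, label = paste_position_and_label((1280, 720), (120, 240), 0, rng)
print(position, label)
```

Because the label is computed from the paste position, the same can photos can be reused on any background without manual annotation.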


In this case, the photos of cans, the transformations and all other parameters are exactly the same for the four training sets. The only difference is the backgrounds.

We have:

  • One with backgrounds from the apartment of the first testing set

  • One with backgrounds from the park of the second testing set

  • One with backgrounds from both

  • A final set with a collection of 120 varied backgrounds from https://unsplash.com/, a website offering free-to-use professional photos. This background set contains a mix of indoor and outdoor images unrelated to either of the testing sets. You can download it here

See the results of this generation down below




First Results


We trained YOLOv5 on these four training sets using Ultralytics' distribution, with 150 epochs for each training run.
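As a rough illustration, the four training runs could be launched with commands like the ones built below. This sketch only prints the commands and does not run them. The `--data`, `--epochs`, `--weights`, `--img` and `--name` options belong to the `train.py` script of the Ultralytics YOLOv5 repository, but the dataset file names, the `yolov5s.pt` checkpoint and the 640-pixel image size are assumptions: the article only states the number of epochs.

```python
def build_train_command(data_yaml, run_name, epochs=150,
                        weights="yolov5s.pt", img_size=640):
    """Build a YOLOv5 train.py command line as a list of arguments.

    weights and img_size are assumed defaults, not values from the article.
    """
    return [
        "python", "train.py",
        "--data", data_yaml,
        "--epochs", str(epochs),
        "--weights", weights,
        "--img", str(img_size),
        "--name", run_name,
    ]


# Hypothetical dataset description files, one per background set.
training_sets = {
    "apartment": "apartment.yaml",
    "outdoors": "outdoors.yaml",
    "apartment_outdoors": "apartment_outdoors.yaml",
    "unsplash": "unsplash.yaml",
}

commands = [build_train_command(path, name) for name, path in training_sets.items()]
for command in commands:
    print(" ".join(command))
```

Keeping every argument except `--data` and `--name` identical mirrors the setup of the experiment, where only the backgrounds change between runs.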

Here is a compilation of our results. The comparison metric used is the mean Average Precision or mAP. The only difference between all training sets is the background, everything else being strictly identical (same number of images, same generation parameters, trained with the same number of epochs...).

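| Type of Background       | mAP - Apartment Testing Set | mAP - Outdoors Testing Set |
|--------------------------|-----------------------------|----------------------------|
| Apartment                | 0.419                       | 0.179                      |
| Outdoors                 | 0.443                       | 0.776                      |
| Apartment & Outdoors     | 0.545                       | 0.713                      |
| 120 Images from Unsplash | 0.594                       | 0.64                       |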
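For readers unfamiliar with the metric, here is a minimal sketch of how an average precision (AP) can be computed for one class, and how the mAP averages it over classes. It assumes each detection has already been matched against the ground truth (true or false positive) and uses all-point interpolation of the precision/recall curve. Evaluation tools differ in their matching thresholds and interpolation details, so this is an illustration of the idea rather than the exact computation behind the numbers above. The toy detections are made up.

```python
def average_precision(detections, num_ground_truth):
    """AP for one class.

    detections: list of (confidence, is_true_positive) pairs.
    num_ground_truth: number of annotated objects of that class.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    true_pos = 0
    recalls, precisions = [], []
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_pos += 1
        recalls.append(true_pos / num_ground_truth)
        precisions.append(true_pos / rank)

    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the precision/recall curve.
    area, previous_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


def mean_average_precision(ap_per_class):
    """mAP is the plain average of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


# Toy example: 3 detections for a class with 2 annotated objects.
toy_ap = average_precision([(0.9, True), (0.8, False), (0.7, True)], 2)
print(toy_ap)
```

A model that misses objects or fires on the background ranks false positives high and loses recall, which is why an unfamiliar environment can lower the mAP so sharply.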

Which backgrounds should you choose?


As you can see, changing only the backgrounds can cause wild variations in performance: on the Outdoors testing set, the mAP ranges from 0.179 to 0.776! The standard deviation of the four scores on that testing set is as high as 0.27.
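The standard deviation quoted above can be checked from the four Outdoors testing set scores in the table. The short sketch below uses the sample standard deviation from Python's `statistics` module; the variable names are ours.

```python
import statistics

# mAP on the Outdoors testing set, one value per training background set.
outdoors_scores = {
    "Apartment": 0.179,
    "Outdoors": 0.776,
    "Apartment & Outdoors": 0.713,
    "120 Images from Unsplash": 0.64,
}

values = list(outdoors_scores.values())
spread = statistics.stdev(values)  # sample standard deviation (n - 1)
print(f"mean = {statistics.mean(values):.3f}, stdev = {spread:.2f}")
```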


Let's look a bit deeper, and try to figure out which set of backgrounds is the one you should use.

  • The Apartment backgrounds set significantly underperforms on both testing sets.

  • The Outdoors backgrounds set is the best on the Outdoors test, but second to last on the Apartment test.

  • The "Apartment & Outdoors" and "120 Images from Unsplash" sets hold up well on both tests.
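One way to make these observations systematic is to look at the worst and the average score of each training set across the two testing sets. The sketch below does this with the values from the first results table. Treating the worst-case score as a robustness indicator is our own choice for illustration, not a metric used in the experiment.

```python
# (mAP on Apartment testing set, mAP on Outdoors testing set)
results = {
    "Apartment": (0.419, 0.179),
    "Outdoors": (0.443, 0.776),
    "Apartment & Outdoors": (0.545, 0.713),
    "120 Images from Unsplash": (0.594, 0.64),
}

# Worst-case and average mAP for each training background set.
summary = {
    name: {"worst": min(scores), "mean": sum(scores) / len(scores)}
    for name, scores in results.items()
}

for name, stats in summary.items():
    print(f"{name}: worst = {stats['worst']:.3f}, mean = {stats['mean']:.3f}")

most_robust = max(summary, key=lambda name: summary[name]["worst"])
print("Best worst-case score:", most_robust)
```

A training set with a high best score but a low worst score, like the Outdoors one, is only a good choice when the production environment is known in advance.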

Let's run a test with all test images at once, outdoors and apartment, to see which set is better overall.


Testing on both sets at once

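| Type of Background       | mAP - Apartment & Outdoors Testing Set |
|--------------------------|----------------------------------------|
| Apartment                | 0.270                                  |
| Outdoors                 | 0.606                                  |
| Apartment & Outdoors     | 0.660                                  |
| 120 Images from Unsplash | 0.591                                  |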

The Outdoors set and the Unsplash background set are quite close to each other, with a mAP of 0.606 and 0.591 respectively. The Outdoors score, however, is largely explained by the fact that the model trained with Outdoors backgrounds does very well on the Outdoors testing set and (relatively) poorly on the Apartment one.


Is the Outdoors background trained set really this good?


Cans of soda are more commonly seen in an apartment than in a park. What would happen if we adjusted the distribution of the testing set to include more apartment pictures?
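A rebalanced testing set of this kind can be assembled by sampling from the two original sets in the desired proportion. The sketch below is one possible way to do it; the article does not say how the 75% Apartment test was built, and the function name, file names and set sizes are assumptions made for illustration.

```python
import random


def build_mixed_test_set(apartment_images, outdoor_images,
                         apartment_share, total, seed=0):
    """Sample a testing set with a given share of apartment images."""
    rng = random.Random(seed)
    n_apartment = round(total * apartment_share)
    n_outdoor = total - n_apartment
    if n_apartment > len(apartment_images) or n_outdoor > len(outdoor_images):
        raise ValueError("not enough images to reach the requested mix")

    # Sample without replacement from each environment, then shuffle.
    mixed = (rng.sample(apartment_images, n_apartment)
             + rng.sample(outdoor_images, n_outdoor))
    rng.shuffle(mixed)
    return mixed


# Hypothetical file names standing in for the two testing sets.
apartment = [f"apartment_{i:03d}.jpg" for i in range(50)]
outdoors = [f"outdoors_{i:03d}.jpg" for i in range(50)]

mixed_test_set = build_mixed_test_set(apartment, outdoors, 0.75, 40)
print(len(mixed_test_set), mixed_test_set[:3])
```

Note that the mAP on a mixed set is computed over all its images together, so it is generally not equal to the weighted average of the two per-environment mAP values.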

| Type of Background       | mAP - 75% Apartment Test |
|--------------------------|--------------------------|
| Apartment                | 0.328                    |
| Outdoors                 | 0.521                    |
| Apartment & Outdoors     | 0.602                    |
| 120 Images from Unsplash | 0.576                    |

As you can see, the mAP of the model trained with Outdoors backgrounds drops noticeably under these circumstances (from 0.606 to 0.521), while the "Apartment & Outdoors" and Unsplash sets are more resilient.


Conclusion


If you know for sure where your object will be detected in your production environment, it is better to use photos from these environments as backgrounds. Beware, however, that your model will struggle to detect objects in environments you did not plan for, as did the model trained with only Outdoors backgrounds.


If you do not know exactly all the settings in which your object might be detected, we recommend using a large collection of varied backgrounds, such as the one we put together. As you can see, its mAP is close to that of the best training set, without using any information about the testing sets.
