While Google’s sync tool does know how to avoid uploading exact duplicates of photos, it doesn’t do any similarity matching on image content, so thumbnails (with the same content but lower resolution) and alternate formats (sidecar JPEGs that go along with camera raw files) end up duplicated in the Google Photos library. It seems this wasn’t a problem with the old “Backup and Sync” application because it supported excluding some files from backup (so I could have it ignore the directory that thumbnails get put into), but the new Drive application lacks such a feature.
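To illustrate why an exact-duplicate check misses thumbnails while a content-based comparison would not, here is a minimal sketch of an "average hash" over plain grayscale pixel grids. This is not how Google's tools work internally; it is a swapped-in, well-known technique, and the function names (`shrink`, `average_hash`, `hamming`) and the synthetic gradient image are made up for the example. Real code would load pixels from files with an imaging library rather than build them by hand.

```python
def shrink(pixels, size=8):
    """Block-average a square grayscale grid down to size x size."""
    step = len(pixels) // size
    return [
        [
            sum(pixels[y * step + dy][x * step + dx]
                for dy in range(step) for dx in range(step)) / (step * step)
            for x in range(size)
        ]
        for y in range(size)
    ]


def average_hash(pixels, size=8):
    """One bit per cell of the shrunken grid: is the cell brighter than the mean?"""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | int(v > mean)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Synthetic stand-ins: a 32x32 "original" and a 16x16 "thumbnail" made from it.
original = [[x * 4 + y * 2 for x in range(32)] for y in range(32)]
thumbnail = shrink(original, 16)

# The pixel data differs, so an exact comparison sees two distinct images...
is_exact_duplicate = original == thumbnail
# ...but the content hashes agree, so a similarity check would pair them up.
distance = hamming(average_hash(original), average_hash(thumbnail))
```

A small Hamming distance between hashes suggests the same picture at different resolutions; the threshold for "small" is a tuning choice that this sketch does not try to settle.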
I recently had a bit of a problem with the files that had ended up in Google Photos on my account: the Google Drive desktop synchronization app seemed to have noticed the many (reasonably-high-resolution) thumbnails that my local photo management application (Lightroom) creates, and had uploaded many near-duplicate images. I found that a burst of images taken in raw mode on a standalone camera were marked as duplicates by Photos: the burst was so fast that the four images, named differently, were taken within the same second and had nearly identical poses among the subjects.

Managing Google Photos duplicates with Python
30 April, 2022
In some cases, you might find that you have a low-resolution or metadata-scrubbed version of an image alongside one at higher resolution or with more embedded data. For instance, if you post a photo via Instagram in iOS, Instagram makes a copy at a lower resolution with most metadata removed. That reformulated image appears identical, and Photos marks such images as duplicates. In other cases, it might be more subtle: the location is missing from one image but present in the other, or they’re identical except one is slightly downsampled.
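Given a group of images already flagged as duplicates, one plausible rule for deciding which copy to keep is to prefer more pixels first, then the presence of a location, then the amount of other metadata. The sketch below assumes that ordering; the `Candidate` record, its fields, and the sample file names are hypothetical and do not correspond to any Google Photos API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Candidate:
    """Hypothetical summary of one image in a duplicate group."""
    name: str
    width: int
    height: int
    location: Optional[Tuple[float, float]] = None  # (lat, lon) if embedded
    metadata_fields: int = 0  # count of other embedded metadata fields


def keep_score(c):
    # Tuples compare element by element: resolution first, then whether a
    # location is present, then how much other metadata survived.
    return (c.width * c.height, c.location is not None, c.metadata_fields)


def pick_keeper(group):
    """Return the candidate judged most worth keeping."""
    return max(group, key=keep_score)


group = [
    Candidate("IMG_0001.HEIC", 4032, 3024, location=(37.77, -122.42), metadata_fields=40),
    Candidate("instagram_copy.jpg", 1080, 810, metadata_fields=3),
]
keeper = pick_keeper(group)
discard = [c.name for c in group if c is not keeper]
```

Here the full-resolution file wins on pixel count alone; the location and metadata fields only matter as tie-breakers, which covers the subtler case of two same-sized copies where one has lost its location.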