또르르's Dev Story


[13-3] Downloading Google Image Data

또르르21 2021. 2. 4. 02:21

This post shows a way to download image data from Google in bulk.

It relies on a module made available by Joe Clinton.

!pip install --upgrade git+https://github.com/Joeclinton1/google-images-download.git

 

Google images can be downloaded with the arguments documented at the following link.

https://github.com/BoostcampAITech/lecture-note-python-basics-for-ai/blob/main/codes/pytorch/00_utils/google%20images%20download.md
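The same arguments can also be passed from Python as a dictionary instead of command-line flags. The following is a minimal sketch, assuming this fork keeps the upstream module layout (`google_images_download.googleimagesdownload` and its `download` method). The helper names `build_arguments` and `download_images` and the sample keywords are hypothetical and used only for illustration.

```python
def build_arguments(keywords, limit, image_format, output_directory):
    """Build the argument dictionary; keys mirror the CLI flags without the leading dashes."""
    return {
        "keywords": ",".join(keywords),  # the tool expects one comma-separated string
        "limit": limit,                  # number of images per keyword
        "format": image_format,          # e.g. "png"
        "output_directory": output_directory,
    }


def download_images(arguments):
    """Run the download. The import is lazy so this file loads without the package installed."""
    # Assumption: the fork exposes the same module and class names as the upstream project.
    from google_images_download import google_images_download

    downloader = google_images_download.googleimagesdownload()
    return downloader.download(arguments)


# Hypothetical keywords, used only for illustration.
arguments = build_arguments(["red panda", "snow leopard"], 5, "png", "data")
print(arguments)
```

Calling `download_images(arguments)` would then start the actual download. It is left out of the sketch because it needs the package installed and network access.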

 

For example, it can be run as follows.

  • Search keywords : 블랙핑크 지수 / 블랙핑크 제니 / 블랙핑크 리사 / 블랙핑크 로제 (BLACKPINK Jisoo / Jennie / Lisa / Rosé)
  • Number of downloads : 5 per keyword
  • Image file format : png
  • Download location : ./data/
googleimagesdownload --keywords "블랙핑크 지수,블랙핑크 제니,블랙핑크 리사,블랙핑크 로제" --limit 5 --format png --output_directory data
Item no.: 1 --> Item name = \ube14\ub799\ud551\ud06c \uc9c0\uc218
Evaluating...
Starting Download...
Completed Image ====> 1.450.png
Completed Image ====> 2.54b618a84b9e8b6172331ee5822a29fd.png
Completed Image ====> 3.img.png
Completed Image ====> 4.131ff670694240bfdebbb51be94a5353.png
Completed Image ====> 5.image.png

Errors: 0


Item no.: 2 --> Item name = \ube14\ub799\ud551\ud06c \uc81c\ub2c8
Evaluating...
Starting Download...
Completed Image ====> 1.img.png
Completed Image ====> 2.997f194f5c3158cc35.png
Completed Image ====> 3.art_1599187660283_523786.png
Completed Image ====> 4.image.png
Invalid image format 'text/html'. Skipping...
Completed Image ====> 5.p179579579671534_735.png

Errors: 1


Item no.: 3 --> Item name = \ube14\ub799\ud551\ud06c \ub9ac\uc0ac
Evaluating...
Starting Download...
Completed Image ====> 1.lisa_%281%29.png
Completed Image ====> 2.50a2ec797a6cb60362ef0958924c6fbc.png
Completed Image ====> 3.img.png
Completed Image ====> 4..png
Completed Image ====> 5.5d1fc430ef0b7b27927288f023d0322b.png

Errors: 0


Item no.: 4 --> Item name = \ube14\ub799\ud551\ud06c \ub85c\uc81c
Evaluating...
Starting Download...
Completed Image ====> 1.%ef%bb%bf%eb%b8%94%eb%9e%99%ed%95%91%ed%81%ac_%eb%a1%9c%ec%a0%9c_%eb%8b%a4%ec%9d%b4%ec%96%b4%ed%8a%b8_%ec%9e%90%ea%b7%b9%ec%82%ac%ec%a7%84(%eb%82%98%ec%9d%b4_%ea%b5%ad%ec%a0%81_%eb%aa%b8%eb%ac%b4%ea%b2%8c)_(1).png
Completed Image ====> 2.ff8ab818ef6fbbfadfc777419b5ee01b.png
Completed Image ====> 3.se-f871f494-653c-4b95-91d7-1f754333da9b.png
Completed Image ====> 4.image_readtop_2020_78497_15797604314062929.png
Completed Image ====> 5.bed0d8af2161450d9a3d32cfdabfc9b2.png

Errors: 0


Everything downloaded!
Total errors: 1
Total time taken: 24.805182695388794 Seconds
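Because the log reports one skipped item, it can be worth checking how many images actually landed on disk. The following is a minimal sketch, assuming the tool writes each keyword's images into its own subfolder under the output directory (`data` here). The helper name `count_images` and the suffix list are hypothetical.

```python
from pathlib import Path

# File extensions treated as images in this sketch (an assumption, adjust as needed).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".webp"}


def count_images(root):
    """Return {subfolder name: number of image files} for each folder directly under root."""
    counts = {}
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            counts[folder.name] = sum(
                1
                for item in folder.iterdir()
                if item.is_file() and item.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


# Only report if the download directory from the example exists.
if Path("data").is_dir():
    for keyword, n_images in count_images("data").items():
        print(f"{keyword}: {n_images} images")
```

In the run above, every keyword should show 5 images, since the skipped `text/html` item was followed by another successful download.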