Why can't I download images from Google using Python?

9

This code let me download a batch of images from Google. It worked for a few days, and now it has suddenly stopped working.

Code:

# importing google_images_download module 
from google_images_download import google_images_download  

# creating object 
response = google_images_download.googleimagesdownload()  

search_queries = ['Apple', 'Orange', 'Grapes', 'water melon'] 


def downloadimages(query): 
    # keywords is the search query 
    # format is the image file format 
    # limit is the number of images to be downloaded 
    # print_urls prints the image file URL
    # size is the image size which can 
    # be specified manually ("large, medium, icon") 
    # aspect ratio denotes the height width ratio 
    # of images to download. ("tall, square, wide, panoramic") 
    arguments = {"keywords": query, 
                 "format": "jpg", 
                 "limit":4, 
                 "print_urls":True, 
                 "size": "medium", 
                 "aspect_ratio": "panoramic"} 
    try: 
        response.download(arguments) 

    # Handle FileNotFoundError by retrying without the aspect_ratio filter
    except FileNotFoundError:  
        arguments = {"keywords": query, 
                     "format": "jpg", 
                     "limit":4, 
                     "print_urls":True,  
                     "size": "medium"} 

        # Providing arguments for the searched query 
        try: 
            # Downloading the photos based 
            # on the given arguments 
            response.download(arguments)  
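        except Exception as exc:
            # Report the failure instead of silently ignoring it
            print("Download failed for {!r}: {}".format(query, exc))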

# Driver Code 
for query in search_queries: 
    downloadimages(query)  
    print()

Output log:

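Item no.: 1 --> Item name = Apple Evaluating... Starting Download...

Unfortunately all 4 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0

Item no.: 1 --> Item name = Orange Evaluating... Starting Download...

Unfortunately all 4 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0

Item no.: 1 --> Item name = Grapes Evaluating... Starting Download...

Unfortunately all 4 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0

Item no.: 1 --> Item name = water melon Evaluating... Starting Download...

Unfortunately all 4 could not be downloaded because some images were not downloadable. 0 is all we got for this search filter!

Errors: 0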

It does create a folder for each query, but the folders contain no images.

python python-3.x google-image-search

— Sai Krishnadas

1

I don't understand why this post has 2 downvotes.

— Sai Krishnadas

1

I have the same problem. It was working fine a few days ago.

— 阿米斯

0

There seems to be a problem with the package itself. See these open PRs: PR1 and PR2

— Ali Cirik

I looked at those a long time ago, but they still don't solve the problem.

— Sai Krishnadas

0

I think Google is changing the DOM. The element with class="rg_meta notranslate" no longer exists. It has changed to class="rg_i ..."


import sys
import urllib.request

import wget
from bs4 import BeautifulSoup


def get_soup(url, header):
    # Fetch the page with a browser-like User-Agent and parse the HTML
    request = urllib.request.Request(url, headers=header)
    return BeautifulSoup(urllib.request.urlopen(request), 'html.parser')


def main(args):
    query = "typical face"
    # Join the words with '+' to form the search query string
    query = '+'.join(query.split())
    url = "https://www.google.co.in/search?q=" + query + "&source=lnms&tbm=isch"
    headers = {}
    headers['User-Agent'] = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36"
    soup = get_soup(url, headers)
    # Select images by the new "rg_i" class instead of "rg_meta notranslate"
    for a in soup.find_all("img", {"class": "rg_i"}):
        wget.download(a.attrs["data-iurl"], a.attrs["data-iid"])


if __name__ == '__main__':
    try:
        main(sys.argv)
    except KeyboardInterrupt:
        pass
    sys.exit()
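
The class change described above can be checked offline before sending any request. The following is only a sketch using the standard library: `SAMPLE_HTML` is made-up markup shaped like the description (it is not real Google output), and `build_search_url` and `RgImageParser` are hypothetical helper names. The `data-iurl` attribute name is taken from the code above and may well change again.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


def build_search_url(query):
    # urlencode escapes spaces and special characters in the query
    params = {"q": query, "source": "lnms", "tbm": "isch"}
    return "https://www.google.co.in/search?" + urlencode(params)


class RgImageParser(HTMLParser):
    """Collects the attributes of every <img> whose class list contains 'rg_i'."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if "rg_i" in classes:
            self.images.append(attr_map)


# Hypothetical markup, only shaped like the description above
SAMPLE_HTML = """
<div>
  <img class="rg_i extra" data-iurl="https://example.com/a.jpg" data-iid="a1">
  <img class="logo" src="https://example.com/logo.png">
  <img class="rg_i" data-iid="b2">
</div>
"""

parser = RgImageParser()
parser.feed(SAMPLE_HTML)

# Keep only the images that actually carry a data-iurl attribute
urls = [img["data-iurl"] for img in parser.images if img.get("data-iurl")]
print(build_search_url("typical face"))
print(urls)
```

Filtering on `img.get("data-iurl")` avoids the `KeyError` that direct indexing would raise when an image lacks the attribute, which is one plausible way a scraper ends up with zero downloads after a markup change.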

— Nguyentran

So how do I change them?

— Sai Krishnadas

0

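Indeed, this problem appeared not long ago, and there are already many similar GitHub issues.

Unfortunately, there is no official fix yet; you can use the temporary workarounds offered in those discussions.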

— 滑稽的人

0

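The reason this doesn't work is that Google changed the way they do everything, so you now need an api_key included in the search string. As a result, packages such as google-images-download no longer work, even at version 2.8.0, because they have no placeholder for the api_key string, which you must register for with Google in order to get 2500 free downloads per day.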

So the best approach right now is to use the pip package google-search-results and supply the api_key as part of the query parameters.

params = {
           "engine" : "google",
           ...
           "api_key" : "secret_api_key" 
}

Supply your own API key here, and then call:

client = GoogleSearchResults(params)
results = client.get_dict()

This returns a dictionary, parsed from the JSON response, that contains all the image URL links, which you can then download directly.
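
As a rough sketch of that last step, the helpers below pull URLs out of a result dictionary and save them to a folder. Several things here are assumptions rather than anything confirmed above: the `images_results` and `original` key names, the hypothetical helper names `extract_image_urls` and `download_all`, and the `sample_results` data, which merely stands in for whatever `client.get_dict()` returns.

```python
import os
import urllib.request


def extract_image_urls(results, list_key="images_results", url_key="original"):
    # list_key and url_key are assumptions about the response layout
    items = results.get(list_key, [])
    return [item[url_key] for item in items if item.get(url_key)]


def download_all(urls, dest_dir, fetch=urllib.request.urlretrieve):
    # fetch is injectable so the function can be tried without network access
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        target = os.path.join(dest_dir, "image_{:03d}.jpg".format(index))
        try:
            fetch(url, target)
        except OSError as exc:
            # Report the failure instead of hiding it, then continue
            print("Skipping {}: {}".format(url, exc))
            continue
        saved.append(target)
    return saved


# Hypothetical response data standing in for client.get_dict()
sample_results = {
    "images_results": [
        {"original": "https://example.com/apple-1.jpg"},
        {"thumbnail": "https://example.com/thumb-only.jpg"},
        {"original": "https://example.com/apple-2.jpg"},
    ]
}

image_urls = extract_image_urls(sample_results)
print(image_urls)
```

With real results, something like `download_all(extract_image_urls(results), "downloads/Apple")` would then fetch the files. Printing each failure, rather than swallowing it, makes it much easier to see why a folder ends up empty.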

— Eamonn Kenny

Where do I get the API key?

— Sai Krishnadas