How to upload a file to a directory in an S3 bucket using Boto

107

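I want to copy a file into an S3 bucket using Python.

For example: my bucket name is test. In the bucket, I have 2 folders named "dump" and "input". Now I want to copy a file from a local directory to the S3 "dump" folder using Python... Can anyone help me?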

— Dheeraj Gundra
source

105

Try this...

import sys

import boto
import boto.s3.connection  # provides Location.DEFAULT used below
from boto.s3.key import Key

AWS_ACCESS_KEY_ID = ''
AWS_SECRET_ACCESS_KEY = ''

bucket_name = AWS_ACCESS_KEY_ID.lower() + '-dump'
conn = boto.connect_s3(AWS_ACCESS_KEY_ID,
        AWS_SECRET_ACCESS_KEY)


bucket = conn.create_bucket(bucket_name,
    location=boto.s3.connection.Location.DEFAULT)

testfile = "replace this with an actual filename"
print('Uploading %s to Amazon S3 bucket %s' %
      (testfile, bucket_name))

def percent_cb(complete, total):
    sys.stdout.write('.')
    sys.stdout.flush()


k = Key(bucket)
k.key = 'my test file'
k.set_contents_from_filename(testfile,
    cb=percent_cb, num_cb=10)

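[UPDATE] I am not a Pythonista, so thanks for the heads-up about the import statements. Also, I would not recommend placing credentials inside your own source code. If you are running this inside AWS, use IAM credentials with instance profiles (http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html), and to keep the same behaviour in your dev/test environment, use something like Hologram from AdRoll (https://github.com/AdRoll/hologram).

— Felipe Garcia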
source
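
To make the advice about keeping credentials out of the source a bit more concrete, here is a minimal sketch. It assumes boto3 rather than the legacy boto package used in the answer above, and the function name upload_without_inline_keys and its optional client parameter are made up for illustration. When no keys are passed, boto3 looks them up through its default credential chain (environment variables, the shared ~/.aws/credentials file, or an instance profile on EC2).

```python
def upload_without_inline_keys(local_path, bucket_name, key, client=None):
    """Upload local_path to the given bucket/key and return its s3:// URI."""
    if client is None:
        # Imported lazily so the helper can be exercised with a stand-in client.
        import boto3

        # No keys are passed here: boto3 resolves them from its default
        # credential chain instead of from values written into the script.
        client = boto3.client("s3")

    client.upload_file(Filename=local_path, Bucket=bucket_name, Key=key)
    return "s3://{}/{}".format(bucket_name, key)
```

For the bucket in the question, a call might look like upload_without_inline_keys("/home/me/data.csv", "test", "dump/data.csv").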

8

I would avoid the multiple import lines; they are not Pythonic. Move the import lines to the top, and for boto you can use from boto.s3.connection import S3Connection; conn = S3Connection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY); bucket = conn.create_bucket(bucketname...); bucket.new_key(keyname,...).set_contents_from_filename....

— cgseller, 2015

2

boto.s3.key.Key doesn't exist in 1.7.12

— Alex Pavy

For an update as of April 2020, follow this link: upload_file_to_s3_using_python

— Prayag Sharma

48

No need to make it that complicated:

import boto
import boto.s3.key

s3_connection = boto.connect_s3()
bucket = s3_connection.get_bucket('your bucket name')
key = boto.s3.key.Key(bucket, 'some_file.zip')
# Open in binary mode so the archive bytes are sent unchanged
with open('some_file.zip', 'rb') as f:
    key.send_file(f)

— vcarel
source

This will work, but for large .zip files you may need to use chunking. elastician.com/2010/12/s3-multipart-upload-in-boto.html

— cgseller
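
The chunking mentioned above can also be left to boto3, whose managed upload_file switches to a multipart upload once a file reaches a configurable threshold. This is a sketch under assumptions, not the approach from the linked post: it swaps in boto3's TransferConfig for boto 2's manual part upload, and the names estimate_part_count and upload_large_file as well as the 8 MB chunk size are illustrative choices.

```python
import math

MB = 1024 * 1024


def estimate_part_count(file_size, chunk_size=8 * MB):
    """Return how many chunks of chunk_size cover file_size bytes (at least 1)."""
    return max(1, math.ceil(file_size / chunk_size))


def upload_large_file(local_path, bucket_name, key, chunk_size=8 * MB, client=None):
    """Upload a file, letting boto3 split it into parts when it is large."""
    # Imported lazily so estimate_part_count works without boto3 installed.
    import boto3
    from boto3.s3.transfer import TransferConfig

    config = TransferConfig(
        multipart_threshold=chunk_size,  # size at which multipart kicks in
        multipart_chunksize=chunk_size,  # size of each uploaded part
    )
    client = client or boto3.client("s3")
    client.upload_file(local_path, bucket_name, key, Config=config)
```

With these settings a 20 MB archive would go up in three parts; S3 itself requires every part except the last to be at least 5 MB.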

2

Yes.. a simpler and commonly used approach

— Leo Prince

1

I tried this and it doesn't work, but k.set_contents_from_filename(testfile, cb=percent_cb, num_cb=10) does

— Simon

1

Are you on the latest boto 2? Anyway, set_contents_from_filename is an even simpler option. Go for it!

— vcarel

3

key.set_contents_from_filename('some_file.zip') would also work here. See the doc. The equivalent code for boto3 can be found here.

— Greg Sadetsky '17

44

import boto3

s3 = boto3.resource('s3')
BUCKET = "test"

s3.Bucket(BUCKET).upload_file("your/local/file", "dump/file")

— Boris
source

Can you explain this line: s3.Bucket(BUCKET).upload_file("your/local/file", "dump/file")

— venkat

@venkat "your/local/file" is a file path, such as "/home/file.txt", on the computer running python/boto, and "dump/file" is the key name under which the file is stored in the S3 bucket. See: boto3.readthedocs.io/en/latest/reference/services/...

— Josh S.
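
Since the second argument is just a key name, the "folder" is nothing more than a prefix ending in a slash. A small sketch of building such a key from a local path; make_object_key is a hypothetical helper, not part of boto3:

```python
import posixpath
from pathlib import PurePath


def make_object_key(local_path, folder=""):
    """Build the S3 key "<folder>/<file name>" for a local file path."""
    # Only the file name is kept; the local directory is not part of the key.
    file_name = PurePath(local_path).name
    # posixpath always joins with "/", which is what S3 keys use.
    return posixpath.join(folder.strip("/"), file_name)
```

The result can be passed straight through, for example s3.Bucket(BUCKET).upload_file(path, make_object_key(path, "dump")).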

1

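It looks like the user has pre-configured AWS keys. To do this, open an Anaconda command prompt and type aws configure, enter your info, and boto3 will pick up the credentials automatically. Check boto3.readthedocs.io/en/latest/guide/quickstart.html

— seeiespi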

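Simplest solution IMO, just as easy as tinys3 but without the need for another external dependency. I strongly recommend setting up your AWS keys with aws configure ahead of time to make your life easier.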

— barlaensdoonn

What happens when there are multiple profiles in the credentials file? How do I pass specific credentials?

— Tara Prasad Gurung
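
One way to pick a specific profile, as far as I know, is to build a boto3.Session with profile_name and create the client from it; setting the AWS_PROFILE environment variable should have the same effect without code changes. In this sketch, client_for_profile and the session_factory parameter are hypothetical names, and "dev" stands for whatever profile name exists in your own ~/.aws/credentials.

```python
def client_for_profile(profile_name, session_factory=None):
    """Return an S3 client that uses the named profile's credentials."""
    if session_factory is None:
        # Imported lazily so the helper can be tried with a stand-in session.
        import boto3

        session_factory = boto3.Session

    # The session reads the [profile_name] section of the shared AWS config.
    session = session_factory(profile_name=profile_name)
    return session.client("s3")
```

A call such as client_for_profile("dev").upload_file("your/local/file", "test", "dump/file") would then use that profile's keys.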

36

I used this and it is very simple to implement

import tinys3

conn = tinys3.Connection('S3_ACCESS_KEY','S3_SECRET_KEY',tls=True)

with open('some_file.zip','rb') as f:
    conn.upload('some_file.zip',f,'my_bucket')

https://www.smore.com/labs/tinys3/

— Oren Efron
source

I don't think this works for large files. I had to use this: docs.pythonboto.org/en/latest/s3_tut.html#storing-large-data

— wordforthewise

This also led me to this fix: github.com/boto/boto/issues/2207#issuecomment-60682869 and to this: stackoverflow.com/questions/5396932/…

— wordforthewise

6

Since the tinys3 project is abandoned, you should not use it. github.com/smore-inc/tinys3/issues/45

— Halil Kaskavalci

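This flat out didn't work for me anymore in 2019. tinys3 isn't just abandoned... I don't think it works anymore. For anyone else who decides to try this, don't be surprised if you get 403 errors. A simple boto3.client solution (like Manish Mehra's answer) worked immediately, though.

— Ross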

16

from boto3.s3.transfer import S3Transfer
import boto3
#have all the variables populated which are required below
client = boto3.client('s3', aws_access_key_id=access_key,aws_secret_access_key=secret_key)
transfer = S3Transfer(client)
transfer.upload_file(filepath, bucket_name, folder_name+"/"+filename)

— Manish Mehra
source

What is filepath, and what is folder_name+filename? It's confusing

— colintobing

@colintobing filepath is the path of the file on the cluster, and folder_name/filename is the naming convention you want the file to have inside the S3 bucket

— Manish Mehra

2

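@ManishMehra The answer would be better if you edited it to clarify colintobing's point of confusion. Without checking the docs or reading the comments, it is not obvious which parameters refer to local paths and which to S3 paths. (Once that's done, you can flag to have all the comments here purged, since they will be obsolete.)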

— Mark Amery

aws_access_key_id and aws_secret_access_key can also be configured with the AWS CLI and stored outside of the script, so that `client = boto3.client('s3')` can be called

— yvesva

16

Upload a file to S3 within a session that carries the credentials.

import boto3

session = boto3.Session(
    aws_access_key_id='AWS_ACCESS_KEY_ID',
    aws_secret_access_key='AWS_SECRET_ACCESS_KEY',
)
s3 = session.resource('s3')
# Filename - File to upload
# Bucket - Bucket to upload to (the top level directory under AWS S3)
# Key - S3 object name (can contain subdirectories)
s3.meta.client.upload_file(Filename='input_file_path', Bucket='bucket_name', Key='s3_output_key')

— Roman Orac
source

What is s3_output_key?

— Roelant

It is the name of the file in the S3 bucket.

— Roman Orac

12

This will also work:

import boto
import boto.s3.connection

try:

    conn = boto.s3.connect_to_region('us-east-1',
    aws_access_key_id = 'AWS-Access-Key',
    aws_secret_access_key = 'AWS-Secret-Key',
    # host = 's3-website-us-east-1.amazonaws.com',
    # is_secure=False,              # uncomment if you are not using ssl
    calling_format = boto.s3.connection.OrdinaryCallingFormat(),
    )

    bucket = conn.get_bucket('YourBucketName')
    key_name = 'FileToUpload'
    path = 'images/holiday' # Directory under which the file should be uploaded
    # S3 keys always use forward slashes, so join with '/' rather than os.path.join
    full_key_name = path + '/' + key_name
    k = bucket.new_key(full_key_name)
    k.set_contents_from_filename(key_name)

except Exception as e:
    print(str(e))
    print("error")

— Piyush S.Wanare
source

7

Here is a three-liner. Just follow the instructions in the boto3 documentation.

import boto3
s3 = boto3.resource(service_name = 's3')
s3.meta.client.upload_file(Filename = 'C:/foo/bar/baz.filetype', Bucket = 'yourbucketname', Key = 'baz.filetype')

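Some important arguments are:

Parameters:

Filename (str) - The path to the file to upload.

Bucket (str) - The name of the bucket to upload to.

Key (str) - The name that you want to assign to your file in your S3 bucket. This can be the same as the name of the file or a different name of your choice, but the file type should remain the same.

Note: I assume that you have saved your credentials in a ~\.aws folder, as suggested in the best configuration practices in the boto3 documentation.

— Samuel Nde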
source

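Thanks Nde Samuel, that worked for me... One additional thing that was required in my case was to have the bucket already created, to avoid the error "The specified bucket does not exist".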

— HassanSh__3571619
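
To check for the bucket up front, one possible approach is a head_bucket call, which fails with a 404 error code when the bucket is missing. A hedged sketch: bucket_exists is a hypothetical helper, and it reads the error code from the exception's response attribute (the shape botocore's ClientError uses) rather than importing botocore.

```python
def bucket_exists(client, bucket_name):
    """Return True if the bucket is reachable, False if S3 reports it missing."""
    try:
        client.head_bucket(Bucket=bucket_name)
        return True
    except Exception as error:
        # botocore's ClientError carries the service reply in .response;
        # read it defensively so unrelated exceptions are re-raised untouched.
        response = getattr(error, "response", None) or {}
        code = str(response.get("Error", {}).get("Code", ""))
        if code in ("404", "NoSuchBucket"):
            return False
        raise
```

With a real client this would be called as bucket_exists(boto3.client("s3"), "yourbucketname") before the upload; other failures, such as a 403 on a bucket you cannot access, are still raised.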

@HassanSh__3571619 I'm glad it helped.

— Samuel Nde

5

import boto
from boto.s3.key import Key

AWS_ACCESS_KEY_ID = ''
AWS_SECRET_ACCESS_KEY = ''
END_POINT = ''                          # eg. us-east-1
S3_HOST = ''                            # eg. s3.us-east-1.amazonaws.com
BUCKET_NAME = 'test'        
FILENAME = 'upload.txt'                
UPLOADED_FILENAME = 'dumps/upload.txt'
# include folders in file path. If it doesn't exist, it will be created

s3 = boto.s3.connect_to_region(END_POINT,
                           aws_access_key_id=AWS_ACCESS_KEY_ID,
                           aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
                           host=S3_HOST)

bucket = s3.get_bucket(BUCKET_NAME)
k = Key(bucket)
k.key = UPLOADED_FILENAME
k.set_contents_from_filename(FILENAME)

— Shakti
source

4

Using boto3

import logging
import boto3
from botocore.exceptions import ClientError


def upload_file(file_name, bucket, object_name=None):
    """Upload a file to an S3 bucket

    :param file_name: File to upload
    :param bucket: Bucket to upload to
    :param object_name: S3 object name. If not specified then file_name is used
    :return: True if file was uploaded, else False
    """

    # If S3 object_name was not specified, use file_name
    if object_name is None:
        object_name = file_name

    # Upload the file
    s3_client = boto3.client('s3')
    try:
        response = s3_client.upload_file(file_name, bucket, object_name)
    except ClientError as e:
        logging.error(e)
        return False
    return True

— 努法瓦尔瓦拉
source

1

For an example of uploading a folder, see the following code and the S3 folder picture

import boto
import boto.s3
import boto.s3.connection
import os.path
import sys    

# Fill in info on data to upload
# destination bucket name
bucket_name = 'willie20181121'
# source directory
sourceDir = '/home/willie/Desktop/x/'  #Linux Path
# destination directory name (on s3)
destDir = '/test1/'   #S3 Path

#max size in bytes before uploading in parts. between 1 and 5 GB recommended
MAX_SIZE = 20 * 1000 * 1000
#size of parts when uploading in parts
PART_SIZE = 6 * 1000 * 1000

access_key = 'MPBVAQ*******IT****'
secret_key = '11t63yDV***********HgUcgMOSN*****'

conn = boto.connect_s3(
        aws_access_key_id = access_key,
        aws_secret_access_key = secret_key,
        host = '******.org.tw',
        is_secure=False,               # plain HTTP; set to True to use SSL
        calling_format = boto.s3.connection.OrdinaryCallingFormat(),
        )
bucket = conn.create_bucket(bucket_name,
        location=boto.s3.connection.Location.DEFAULT)


uploadFileNames = []
for (sourceDir, dirname, filename) in os.walk(sourceDir):
    uploadFileNames.extend(filename)
    break

def percent_cb(complete, total):
    sys.stdout.write('.')
    sys.stdout.flush()

for filename in uploadFileNames:
    sourcepath = os.path.join(sourceDir, filename)
    destpath = os.path.join(destDir, filename)
    print ('Uploading %s to Amazon S3 bucket %s' % \
           (sourcepath, bucket_name))

    filesize = os.path.getsize(sourcepath)
    if filesize > MAX_SIZE:
        print ("multipart upload")
        mp = bucket.initiate_multipart_upload(destpath)
        # The with block closes the source file once all parts are sent
        with open(sourcepath,'rb') as fp:
            fp_num = 0
            while (fp.tell() < filesize):
                fp_num += 1
                print ("uploading part %i" %fp_num)
                mp.upload_part_from_file(fp, fp_num, cb=percent_cb, num_cb=10, size=PART_SIZE)

        mp.complete_upload()

    else:
        print ("singlepart upload")
        k = boto.s3.key.Key(bucket)
        k.key = destpath
        k.set_contents_from_filename(sourcepath,
                cb=percent_cb, num_cb=10)

PS: For more details, see the reference URL

— 郑威
source
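
For comparison, roughly the same folder upload can be sketched with boto3, whose upload_file handles multipart transfers for large files by itself, so the manual MAX_SIZE/PART_SIZE logic goes away. This is an illustration under assumptions, not a port of the code above: upload_directory is a hypothetical name, and unlike the original it also descends into subdirectories.

```python
import os


def upload_directory(source_dir, bucket_name, dest_prefix, client=None):
    """Upload every file under source_dir, mirroring its layout below dest_prefix."""
    if client is None:
        # Imported lazily so the helper can be tried with a stand-in client.
        import boto3

        client = boto3.client("s3")

    uploaded_keys = []
    for root, _dirs, files in os.walk(source_dir):
        for name in sorted(files):
            local_path = os.path.join(root, name)
            # Path relative to source_dir, converted to forward slashes for S3.
            relative = os.path.relpath(local_path, source_dir).replace(os.sep, "/")
            parts = (dest_prefix.strip("/"), relative)
            key = "/".join(part for part in parts if part)
            client.upload_file(local_path, bucket_name, key)
            uploaded_keys.append(key)
    return uploaded_keys
```

With the values from the answer, the call would be upload_directory('/home/willie/Desktop/x/', 'willie20181121', '/test1/'), which produces keys such as test1/<file name> without a leading slash.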

0

import boto
import boto.s3.connection

# etree, listings, access_key and secret_key are assumed to be defined elsewhere
xmlstr = etree.tostring(listings,  encoding='utf8', method='xml')
conn = boto.connect_s3(
        aws_access_key_id = access_key,
        aws_secret_access_key = secret_key,
        # host = '<bucketName>.s3.amazonaws.com',
        host = 'bycket.s3.amazonaws.com',
        #is_secure=False,               # uncomment if you are not using ssl
        calling_format = boto.s3.connection.OrdinaryCallingFormat(),
        )
conn.auth_region_name = 'us-west-1'

bucket = conn.get_bucket('resources', validate=False)
# new_key works whether or not the object exists; get_key returns None for a missing key
key = bucket.new_key('filename.txt')
key.set_contents_from_string("SAMPLE TEXT")
key.set_canned_acl('public-read')

— Martin
source

A textual explanation of what your code does would be nice!

— Nick
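
For readers who want the gist of the snippet above: it connects to S3, takes a key in the bucket, writes a string as the object's content and makes the object publicly readable. A rough boto3 equivalent might look like the sketch below; put_public_text is a hypothetical helper name, and buckets with ACLs disabled will likely reject the ACL argument.

```python
def put_public_text(client, bucket_name, key, text):
    """Store text as an S3 object and mark it publicly readable."""
    client.put_object(
        Bucket=bucket_name,
        Key=key,
        Body=text.encode("utf-8"),  # S3 stores bytes, so encode the string
        ACL="public-read",          # same canned ACL as set_canned_acl above
    )
```

With a real client this would be put_public_text(boto3.client("s3"), "resources", "filename.txt", "SAMPLE TEXT").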

0

I have something that, to me, seems a bit more organized:

import boto3
from pprint import pprint
from botocore.exceptions import NoCredentialsError


class S3(object):
    BUCKET = "test"
    connection = None

    def __init__(self):
        try:
            # get_s3_credentials is the author's own helper (not shown); it is
            # assumed here to return a dict holding the two keys used below
            credentials = get_s3_credentials("aws")
            self.connection = boto3.resource(
                's3',
                aws_access_key_id=credentials['aws_access_key_id'],
                aws_secret_access_key=credentials['aws_secret_access_key'])
        except Exception as error:
            print(error)
            self.connection = None


    def upload_file(self, file_to_upload_path, file_name):
        if file_to_upload_path is None or file_name is None: return False
        try:
            pprint(file_to_upload_path)
            # Prefix the key with the target "folder" inside the bucket
            file_name = "your-folder-inside-s3/{0}".format(file_name)
            self.connection.Bucket(self.BUCKET).upload_file(file_to_upload_path, 
                                                                      file_name)
            print("Upload Successful")
            return True

        except FileNotFoundError:
            print("The file was not found")
            return False

        except NoCredentialsError:
            print("Credentials not available")
            return False

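There are three important variables here: the BUCKET constant, file_to_upload_path and file_name

BUCKET: the name of your S3 bucket

file_to_upload_path: must be the path of the file you want to upload

file_name: the resulting file name and path inside your bucket (this is where you add folders or anything else)

There are many ways to do this, but you can reuse this code in another script like this: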

import S3

def some_function():
    S3.S3().upload_file(path_to_file, final_file_name)

— Jesus Walker
source