Python library and collection of command line programs for interfacing with AWS S3. Uses buffering and fixed memory usage, where possible, so that operations on large buckets and objects are safe and easy.
- Memory-efficient streaming for large objects
- Batch operations for large buckets
- Support for custom S3-compatible endpoints
- Simple credential management
- Both library and CLI interfaces
```
pip install s3lib
```

S3Lib supports multiple authentication methods (in order of precedence):
- Command-line argument: Use `--creds <path>` to specify a credentials file
- Environment variables: Set `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`
- Credentials file: Create `~/.s3` with your credentials (default)
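The precedence above can be sketched in a few lines of stdlib Python. This is an illustrative model only, not s3lib's actual resolver; the function and variable names are made up for the example:

```python
import os

def resolve_creds_source(cli_creds_path=None, default_path="~/.s3"):
    """Illustrative only: mirrors the documented precedence --
    CLI flag first, then environment variables, then the default file."""
    if cli_creds_path:
        return ("file", cli_creds_path)
    if "AWS_ACCESS_KEY_ID" in os.environ and "AWS_SECRET_ACCESS_KEY" in os.environ:
        return ("env", None)
    return ("file", os.path.expanduser(default_path))
```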
Create a file at `~/.s3` (or any path you specify) with:

```
<AWS_ACCESS_KEY_ID>
<AWS_SECRET_ACCESS_KEY>
```
Example:

```
AKIAIOSFODNN7EXAMPLE
wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```
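The two-line format can be read with a couple of lines of stdlib Python. This is a sketch of the file format only; s3lib ships its own loader, and `load_creds` is a name invented for this example:

```python
import os

def load_creds(path):
    # Parse the two-line credentials file: access key ID on the first
    # line, secret access key on the second. Illustrative sketch only.
    with open(os.path.expanduser(path)) as f:
        access_id = f.readline().strip()
        secret = f.readline().strip()
    return access_id, secret
```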
List all buckets:

```
s3ls
```

List objects in a bucket:

```
s3ls mybucket
```

List with a prefix filter:

```
s3ls mybucket --prefix logs/2024/
```

List with custom fields:

```
s3ls mybucket --fields Key Size LastModified
```

Available fields: Key, LastModified, ETag, Size, StorageClass

Options:
- `--host HOST` - Custom S3 endpoint hostname
- `--port PORT` - Custom port
- `--output FILE` - Write output to file
- `--creds FILE` - Path to credentials file
- `--mark MARKER` - Start listing from this key
- `--prefix PREFIX` - Filter by prefix
- `--batch SIZE` - Batch size for API calls (default: 1000)
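The `--mark` and `--batch` options map onto S3's marker-based listing: each request returns up to a batch of keys lexically after the marker, and the last key of a page becomes the marker for the next request. A toy model (not s3lib's implementation; names are illustrative):

```python
def list_paged(all_keys, mark=None, batch=1000):
    """Toy model of marker-based listing: each 'request' yields up to
    `batch` keys lexically after `mark`, like S3's ListObjects marker."""
    keys = sorted(all_keys)
    while True:
        page = [k for k in keys if mark is None or k > mark][:batch]
        if not page:
            return
        yield from page
        mark = page[-1]  # resume the next page after the last key seen
```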
Download an object:

```
s3get mybucket myfile.txt --output local-file.txt
```

Download to stdout:

```
s3get mybucket logs/app.log | grep ERROR
```

Download multiple objects:

```
s3get mybucket file1.txt file2.txt --output combined.txt
```

Options:
- `--host HOST` - Custom S3 endpoint hostname
- `--port PORT` - Custom port
- `--output FILE` - Write output to file (default: stdout)
- `--creds FILE` - Path to credentials file
Upload a file:

```
s3put mybucket remote-file.txt local-file.txt
```

Upload from stdin:

```
echo "Hello World" | s3put mybucket hello.txt
```

Upload with custom headers:

```
s3put mybucket file.txt local.txt --header "Content-Type:text/plain" --header "Cache-Control:max-age=3600"
```

Options:
- `--host HOST` - Custom S3 endpoint hostname
- `--port PORT` - Custom port
- `--creds FILE` - Path to credentials file
- `--header KEY:VALUE` - Add custom HTTP headers (repeatable)
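Since header values may themselves contain colons (for example `Cache-Control:max-age=3600`), a `KEY:VALUE` argument has to be split on the first colon only. A sketch of that parsing rule (illustrative; not s3lib's actual argument parser):

```python
def parse_header(arg):
    # Split a repeatable --header "KEY:VALUE" argument on the FIRST
    # colon, so values like "max-age=3600" survive intact.
    key, sep, value = arg.partition(":")
    if not sep:
        raise ValueError(f"expected KEY:VALUE, got {arg!r}")
    return key.strip(), value.strip()
```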
Get metadata for objects:

```
s3head mybucket file1.txt file2.txt
```

Get metadata in JSON format:

```
s3head mybucket file.txt --json
```

Options:
- `--host HOST` - Custom S3 endpoint hostname
- `--port PORT` - Custom port
- `--creds FILE` - Path to credentials file
- `--json` - Output in JSON format
Copy an object within or between buckets:

```
s3cp source-bucket source-key dest-bucket dest-key
```

Copy with custom metadata:

```
s3cp mybucket old.txt mybucket new.txt --header "Content-Type:application/json"
```

Options:
- `--host HOST` - Custom S3 endpoint hostname
- `--port PORT` - Custom port
- `--creds FILE` - Path to credentials file
- `--header KEY:VALUE` - Set metadata headers (repeatable)
Delete objects:

```
s3rm mybucket file1.txt file2.txt
```

Delete with verbose output:

```
s3rm mybucket file.txt --verbose
```

Batch delete with a custom batch size:

```
s3rm mybucket file*.txt --batch 100
```

Options:
- `--host HOST` - Custom S3 endpoint hostname
- `--port PORT` - Custom port
- `--creds FILE` - Path to credentials file
- `-v, --verbose` - Show files as they are deleted
- `--batch SIZE` - Batch size for delete operations (default: 500)
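`--batch` bounds how many keys go into each multi-object delete request. The chunking itself is just list slicing; a minimal sketch (illustrative, not s3lib's code):

```python
def batches(keys, size=500):
    # Split a key list into chunks of at most `size`, the way --batch
    # bounds each delete request (default 500). Illustrative only.
    for i in range(0, len(keys), size):
        yield keys[i:i + size]
```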
Sign a policy document for browser-based uploads:

```
s3sign policy.json
```

This outputs the base64-encoded policy and signature.

Options:
- `--creds FILE` - Path to credentials file
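For context, the classic S3 POST-policy scheme base64-encodes the policy document and then signs that encoded string with HMAC-SHA1 under the secret key. The sketch below shows that widely documented scheme; s3sign's exact output format may differ:

```python
import base64
import hashlib
import hmac

def sign_policy(policy_json, secret_key):
    """Classic S3 POST-policy signing: base64-encode the policy
    document, then HMAC-SHA1 the encoded bytes with the secret key.
    Illustrative sketch, not s3sign's implementation."""
    policy_b64 = base64.b64encode(policy_json.encode("utf-8"))
    sig = hmac.new(secret_key.encode("utf-8"), policy_b64, hashlib.sha1)
    return policy_b64.decode(), base64.b64encode(sig.digest()).decode()
```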
```python
from s3lib import Connection

# Create connection
access_id = "AKIAIOSFODNN7EXAMPLE"
secret = b"wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

with Connection(access_id, secret) as s3:
    # List buckets
    for bucket in s3.list_buckets():
        print(bucket)

    # List objects in a bucket
    for key in s3.list_bucket("mybucket"):
        print(key)

    # List with metadata
    for obj in s3.list_bucket2("mybucket"):
        print(obj['Key'], obj['Size'], obj['LastModified'])

    # Get object
    response = s3.get_object("mybucket", "myfile.txt")
    data = response.read()

    # Upload object
    s3.put_object("mybucket", "newfile.txt", b"Hello World")

    # Upload from file
    with open("local.txt", "rb") as f:
        s3.put_object("mybucket", "remote.txt", f)

    # Copy object
    s3.copy_object("bucket1", "file.txt", "bucket2", "file.txt")

    # Delete object
    s3.delete_object("mybucket", "oldfile.txt")

    # Bulk delete
    keys = ["file1.txt", "file2.txt", "file3.txt"]
    for key, result in s3.delete_objects("mybucket", keys):
        print(f"{key}: {result}")

    # Get object metadata
    headers = s3.head_object("mybucket", "file.txt")
    print(dict(headers))

    # Get object URL
    url = s3.get_object_url("mybucket", "file.txt")
    print(url)
```

```python
# Custom endpoint
with Connection(access_id, secret, host="s3.us-west-2.amazonaws.com") as s3:
    pass

# Custom port
with Connection(access_id, secret, port=9000) as s3:
    pass

# Connection timeout
with Connection(access_id, secret, conn_timeout=60) as s3:
    pass
```

The library is designed for memory efficiency with large files:
```python
# Download large file
with Connection(access_id, secret) as s3:
    response = s3.get_object("mybucket", "largefile.bin")
    with open("local-large.bin", "wb") as f:
        chunk = response.read(65536)  # 64 KB chunks
        while chunk:
            f.write(chunk)
            chunk = response.read(65536)
```
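The same fixed-size chunked copy can be written with the standard library's `shutil.copyfileobj`, which works with any pair of file-like objects. Shown here with in-memory streams so the sketch is self-contained; in practice the source would be the S3 response object and the destination an open local file:

```python
import io
import shutil

# Copy between two file-like objects in 64 KB chunks without ever
# loading the whole payload into memory.
src = io.BytesIO(b"x" * 200_000)
dst = io.BytesIO()
shutil.copyfileobj(src, dst, length=65536)
```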
```python
# Upload large file
with Connection(access_id, secret) as s3:
    with open("large-local.bin", "rb") as f:
        s3.put_object("mybucket", "large-remote.bin", f)
```

See MAINTAINING.md for development and maintenance instructions.
```
# Install development dependencies
pip install tox

# Run tests
tox

# Run tests for specific Python version
tox -e py39

# Run linting
tox -e lint
```

MIT License - See setup.py for details.
Andrew Thomson (athomsonguy@gmail.com)