Create your first S3 bucket, upload and download files, and set up the right access controls — without accidentally making everything public.
By the end of this post you'll have an S3 bucket created, files uploaded and downloaded, lifecycle rules to age out old data, and the right permissions configured so your bucket isn't accidentally world-readable. About 25 minutes. Stays in the free tier.
You'll need an AWS account and the AWS CLI configured (aws configure).
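A quick sanity check that the CLI can actually reach your account (it prints the account ID and IAM identity your credentials resolve to):

aws sts get-caller-identity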
S3 (Simple Storage Service) is object storage. Each "object" is a file plus metadata. Objects live in "buckets" — flat namespaces with a globally unique name across all of AWS.
S3 is not a filesystem. There are no real directories — what looks like /photos/2024/cat.jpg is just one object with the literal key photos/2024/cat.jpg. The slashes are part of the name. S3 fakes a directory structure in the console for convenience, but underneath it's flat.
That model — flat key-value, infinite scale, $0.023/GB-month for the standard tier — is why S3 backs almost every backup, log archive, build artifact, ML dataset, and static site on AWS.
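You can see the flat model directly in the API. Once you have a bucket with a few objects under a prefix like photos/, a delimiter-based listing is the only thing that produces "folders" (the bucket and prefix here are illustrative):

# "Folders" only show up as CommonPrefixes in the response;
# the objects themselves are plain flat keys.
aws s3api list-objects-v2 --bucket my-bucket --prefix photos/ --delimiter /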
Bucket names are globally unique, must be lowercase, and have other naming rules (no underscores, no caps, no IP-address-shaped names). A reasonable pattern: <your-name>-tutorial-<short-uuid>.
BUCKET="alice-tutorial-$(uuidgen | tr 'A-Z' 'a-z' | cut -c1-6)"
echo "Bucket name: $BUCKET"
aws s3api create-bucket --bucket $BUCKET --region us-east-1
For regions other than us-east-1, you also need --create-bucket-configuration LocationConstraint=<region>. (us-east-1 is the original region and gets special-cased for legacy reasons.)
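For example, the same bucket in eu-west-1 (region chosen purely for illustration) would be created like this:

aws s3api create-bucket --bucket $BUCKET --region eu-west-1 \
  --create-bucket-configuration LocationConstraint=eu-west-1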
Verify:
aws s3 ls
You should see the new bucket listed.
echo "Hello S3" > hello.txt
aws s3 cp hello.txt s3://$BUCKET/hello.txt
You'll see upload: ./hello.txt to s3://.../hello.txt. Confirm:
aws s3 ls s3://$BUCKET/
aws s3 cp s3://$BUCKET/hello.txt downloaded.txt
cat downloaded.txt
Two-line round trip: Hello S3. That's the core S3 interaction.
The CLI also supports aws s3 sync for whole directories, and aws s3 rm to delete. The mental model: it's cp, ls, rm, sync — same verbs as a filesystem, but the URLs are s3://bucket/key.
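For instance, pushing a local directory up to a prefix and deleting a single object looks like this (./reports and the reports/ prefix are just illustrative names):

# Upload anything under ./reports that isn't already in the bucket
aws s3 sync ./reports s3://$BUCKET/reports/
# Delete one object
aws s3 rm s3://$BUCKET/reports/old.csv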
By default, S3 buckets are private and have public-access-block enabled, which is the safe state. Let's verify:
aws s3api get-public-access-block --bucket $BUCKET
You should see all four flags set to true. That blocks public ACLs and public bucket policies — even if you accidentally try to make something public, it'll be blocked at the bucket level.
Leave this on. Almost every real S3 leak in industry incidents traces back to this protection being disabled.
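If you ever inherit an older bucket where those flags aren't all true, you can enforce them explicitly (shown here against the tutorial bucket):

aws s3api put-public-access-block --bucket $BUCKET \
  --public-access-block-configuration \
  BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true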
Need to share a single object with someone for a limited time? Generate a presigned URL:
aws s3 presign s3://$BUCKET/hello.txt --expires-in 3600
You'll get a long URL with ?X-Amz-...&X-Amz-Signature=... parameters. Anyone with that URL can download the object for the next hour. After that, it stops working.
This is the right pattern for "let the user download this report" or "embed this image in an email." No public bucket required.
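To convince yourself it behaves as described, fetch the object with plain curl and no AWS credentials (quote the URL so the shell doesn't mangle the & characters):

URL=$(aws s3 presign s3://$BUCKET/hello.txt --expires-in 3600)
curl -s "$URL"
# Prints: Hello S3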
Bucket-level encryption is on by default for new buckets in modern AWS, but verify:
aws s3api get-bucket-encryption --bucket $BUCKET
You'll see AES256 — Amazon-managed encryption. Fine for most cases. If you need to control the key (audit, compliance), you can switch to KMS:
aws s3api put-bucket-encryption --bucket $BUCKET \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}
    }]
  }'
That's a real production thing to do. For this tutorial, AES256 is fine.
Objects accumulate. Lifecycle rules age them out automatically:
cat > lifecycle.json <<EOF
{
  "Rules": [{
    "ID": "expire-old-logs",
    "Status": "Enabled",
    "Filter": {"Prefix": "logs/"},
    "Expiration": {"Days": 30}
  }]
}
EOF
aws s3api put-bucket-lifecycle-configuration \
  --bucket $BUCKET \
  --lifecycle-configuration file://lifecycle.json
Now any object with a key starting with logs/ is automatically deleted 30 days after upload. Lifecycle rules also support transitions to cheaper storage tiers (STANDARD_IA, GLACIER_IR, DEEP_ARCHIVE) — same pattern, different action.
The cost difference is real. Standard is $0.023/GB-month; Deep Archive is $0.001/GB-month. For backup data nobody touches, that's a 23x reduction.
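As a sketch, a transition rule that moves backups/ objects to Deep Archive after 90 days (the prefix and day count are arbitrary examples) looks just like the expiration rule above. Keep in mind that put-bucket-lifecycle-configuration replaces the whole configuration, so in practice you'd put all rules in one file:

cat > archive.json <<EOF
{
  "Rules": [{
    "ID": "archive-old-backups",
    "Status": "Enabled",
    "Filter": {"Prefix": "backups/"},
    "Transitions": [{"Days": 90, "StorageClass": "DEEP_ARCHIVE"}]
  }]
}
EOF
aws s3api put-bucket-lifecycle-configuration \
  --bucket $BUCKET \
  --lifecycle-configuration file://archive.json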
One more safety net worth flipping on: versioning.

aws s3api put-bucket-versioning --bucket $BUCKET \
  --versioning-configuration Status=Enabled
Now every overwrite of an object keeps the previous version. Accidentally deleted something? Restore the previous version. Costs slightly more (you're storing more bytes) but is the cheapest insurance against rm -rf mistakes.
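Recovery is two steps: find the version you want, then copy it back over the current one (the version ID below is a placeholder for whatever list-object-versions reports):

aws s3api list-object-versions --bucket $BUCKET --prefix hello.txt
aws s3api copy-object --bucket $BUCKET --key hello.txt \
  --copy-source "$BUCKET/hello.txt?versionId=EXAMPLE_VERSION_ID"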
When you're done experimenting, clean up:

# Empty the bucket (versioning makes this a few extra steps)
aws s3 rm s3://$BUCKET --recursive
# Delete remaining versions and the delete markers left behind by rm
# (both are needed if you enabled versioning, or delete-bucket fails)
aws s3api delete-objects --bucket $BUCKET \
  --delete "$(aws s3api list-object-versions --bucket $BUCKET --output json \
    --query '{Objects: Versions[].{Key: Key, VersionId: VersionId}}')"
aws s3api delete-objects --bucket $BUCKET \
  --delete "$(aws s3api list-object-versions --bucket $BUCKET --output json \
    --query '{Objects: DeleteMarkers[].{Key: Key, VersionId: VersionId}}')"
# Delete the bucket
aws s3api delete-bucket --bucket $BUCKET
Making a bucket public to "share files with users." Almost never the right move. Use presigned URLs (single-object, time-limited) or CloudFront with signed URLs. Public buckets show up in security headlines for a reason.
Putting credentials in object metadata or filenames. Bucket access logs and CloudTrail include keys. If a credential is part of the key, it leaks every time the object is touched.
Disabling public-access-block to "make development easier." Once disabled, it never gets turned back on. Production data ends up in dev buckets. Things leak. Leave it on.
Not enabling versioning on important buckets. rm mistakes happen. Versioning is the cheapest mitigation. The storage cost is small relative to the recovery insurance.
Using S3 like a filesystem. Listing a "directory" with millions of objects is slow because S3 is paginating through a flat key namespace. Design key prefixes that match your access patterns; partition by date, by tenant, by whatever you'll filter on.
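For example, keys partitioned by tenant and date let a listing touch only the slice you care about instead of paginating the whole bucket (tenant-a and the date layout are illustrative):

# Keys like logs/tenant-a/2024/06/01/app.log make targeted listings cheap
aws s3api list-objects-v2 --bucket $BUCKET --prefix logs/tenant-a/2024/06/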
You've got the basics.
S3 is one of the most-used AWS services for a reason: simple model, infinite scale, low cost. Once you've put the safety basics in place (private by default, encryption, versioning, lifecycle), the rest is just cp with extra steps.