Reduce AWS S3 Costs by Automating PDF Compression
A user uploads a 20MB PDF. You store it in S3. Do this a few thousand times, and suddenly your AWS bill is climbing. The fix? Compress PDFs before storing them.
PDF compression works especially well on documents with embedded images. A scanned contract or image-heavy report can shrink by 80-90% without visible quality loss. That translates directly into storage savings.
The Quick Solution
import okhttp3.*;

public class CompressPdf {
    public static void main(String[] args) throws Exception {
        OkHttpClient client = new OkHttpClient();

        // The API fetches the PDF from the URL you pass in the "file" field
        FormBody formBody = new FormBody.Builder()
                .add("file", "https://your-server.com/large-document.pdf")
                .build();

        Request request = new Request.Builder()
                .url("https://apdf.io/api/pdf/file/compress")
                .addHeader("Authorization", "Bearer YOUR_API_TOKEN")
                .addHeader("Accept", "application/json")
                .addHeader("Content-Type", "application/x-www-form-urlencoded")
                .post(formBody)
                .build();

        // try-with-resources ensures the response body is closed
        try (Response response = client.newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}
That is it. The API returns a URL to the compressed PDF, plus the original and compressed file sizes so you can calculate exactly how much storage you saved.
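The response is JSON. Based on the fields the Step 3 service parses, it looks roughly like this (illustrative only; the download URL is a placeholder and the live payload may include additional fields):

{
    "file": "https://.../compressed.pdf",
    "size_original": 20971520,
    "size_compressed": 2097152
}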
The Math: Why This Matters
Let's do some quick calculations. Say you store 10,000 user-uploaded PDFs per month, averaging 15MB each:
- Without compression: 10,000 × 15MB = 150GB/month
- With 80% compression: 10,000 × 3MB = 30GB/month
- Monthly savings: 120GB of S3 storage
At S3 Standard pricing (about $0.023 per GB-month in us-east-1), 120GB is roughly $2.76 in the first month. More importantly, stored objects accumulate: each month of uploads adds another 120GB you are no longer paying for, so the savings compound, as the quick estimate below shows.
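Here is a back-of-envelope estimator for the first year. The price per GB-month is an assumption (S3 Standard, us-east-1, first 50TB tier); plug in your own region's rate:

public class S3SavingsEstimator {
    public static void main(String[] args) {
        double pricePerGbMonth = 0.023;  // assumed S3 Standard rate; check your region
        double savedGbPerMonth = 120;    // from the example above: 150GB - 30GB

        // Storage accumulates: after n months you store n x 120GB less, so
        // total GB-months saved over a year is 120 * (1 + 2 + ... + 12)
        int months = 12;
        double gbMonthsSaved = savedGbPerMonth * months * (months + 1) / 2.0;

        System.out.printf("First month: ~$%.2f saved%n", savedGbPerMonth * pricePerGbMonth);
        System.out.printf("First year:  ~$%.2f saved%n", gbMonthsSaved * pricePerGbMonth);
    }
}

The numbers look small until you remember they scale linearly with volume: at 100,000 uploads per month, the first-year figure is ten times larger.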
Step 1: Get Your API Token
- Sign up at aPDF.io (it is free).
- Copy your API Token from the dashboard.
Step 2: Add Dependencies
If you are using Maven, add OkHttp to your pom.xml:
<dependency>
    <groupId>com.squareup.okhttp3</groupId>
    <artifactId>okhttp</artifactId>
    <version>4.12.0</version>
</dependency>
For Gradle:
implementation 'com.squareup.okhttp3:okhttp:4.12.0'
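The service class in Step 3 also uses org.json to parse the API response, so add that as well (any recent version works; the one shown is an example):

<dependency>
    <groupId>org.json</groupId>
    <artifactId>json</artifactId>
    <version>20240303</version>
</dependency>

For Gradle:

implementation 'org.json:json:20240303'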
Step 3: Build a Compression Service
Here is a complete Java class that compresses a PDF and returns the savings percentage:
import okhttp3.*;
import org.json.JSONObject;

public class PdfCompressionService {

    private static final String API_URL = "https://apdf.io/api/pdf/file/compress";

    private final String apiToken;
    private final OkHttpClient client;

    public PdfCompressionService(String apiToken) {
        this.apiToken = apiToken;
        this.client = new OkHttpClient(); // reuse one client; it pools connections
    }

    public CompressionResult compress(String pdfUrl) throws Exception {
        FormBody formBody = new FormBody.Builder()
                .add("file", pdfUrl)
                .build();

        Request request = new Request.Builder()
                .url(API_URL)
                .addHeader("Authorization", "Bearer " + apiToken)
                .addHeader("Accept", "application/json")
                .addHeader("Content-Type", "application/x-www-form-urlencoded")
                .post(formBody)
                .build();

        try (Response response = client.newCall(request).execute()) {
            if (!response.isSuccessful()) {
                throw new RuntimeException("API error: " + response.code());
            }

            String responseBody = response.body().string();
            JSONObject json = new JSONObject(responseBody);

            long originalSize = json.getLong("size_original");
            long compressedSize = json.getLong("size_compressed");
            String compressedUrl = json.getString("file");

            double savingsPercent = (1.0 - (double) compressedSize / originalSize) * 100;
            return new CompressionResult(compressedUrl, originalSize, compressedSize, savingsPercent);
        }
    }

    public static class CompressionResult {
        public final String url;
        public final long originalSize;
        public final long compressedSize;
        public final double savingsPercent;

        public CompressionResult(String url, long originalSize, long compressedSize, double savingsPercent) {
            this.url = url;
            this.originalSize = originalSize;
            this.compressedSize = compressedSize;
            this.savingsPercent = savingsPercent;
        }
    }
}
Step 4: Integrate Into Your Upload Flow
Here is how you would use this service when a user uploads a PDF:
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class FileUploadHandler {

    private final PdfCompressionService compressionService;
    private final S3Client s3Client;

    public FileUploadHandler(String apiToken) {
        this.compressionService = new PdfCompressionService(apiToken);
        this.s3Client = S3Client.builder().build();
    }

    public void handleUpload(String uploadedPdfUrl, String s3Bucket, String s3Key) {
        try {
            System.out.println("Compressing PDF before S3 upload...");

            // Compress the PDF
            PdfCompressionService.CompressionResult result =
                    compressionService.compress(uploadedPdfUrl);

            System.out.printf("Original: %d bytes%n", result.originalSize);
            System.out.printf("Compressed: %d bytes%n", result.compressedSize);
            System.out.printf("Savings: %.1f%%%n", result.savingsPercent);

            // Download the compressed PDF and upload to S3
            // Note: The compressed URL is valid for 1 hour
            OkHttpClient client = new OkHttpClient();
            Request downloadRequest = new Request.Builder()
                    .url(result.url)
                    .build();

            try (Response downloadResponse = client.newCall(downloadRequest).execute()) {
                if (!downloadResponse.isSuccessful()) {
                    throw new RuntimeException("Download failed: " + downloadResponse.code());
                }
                byte[] compressedPdfBytes = downloadResponse.body().bytes();

                // Upload to S3 (using AWS SDK v2)
                s3Client.putObject(
                        PutObjectRequest.builder()
                                .bucket(s3Bucket)
                                .key(s3Key)
                                .contentType("application/pdf")
                                .build(),
                        RequestBody.fromBytes(compressedPdfBytes)
                );
                System.out.println("Compressed PDF uploaded to S3: " + s3Key);
            }
        } catch (Exception e) {
            System.err.println("Compression failed: " + e.getMessage());
            // Fallback: upload original file
        }
    }
}
Example Output
Compressing PDF before S3 upload...
Original: 20971520 bytes
Compressed: 2097152 bytes
Savings: 90.0%
Compressed PDF uploaded to S3: documents/contract-2025-001.pdf
When Compression Works Best
Not all PDFs compress equally. Here is what to expect:
- Scanned documents: 70-90% reduction (high-resolution images compress well)
- PDFs with photos: 60-85% reduction
- Text-heavy documents: 5-15% reduction (already compact)
- Already compressed PDFs: Minimal reduction
The biggest wins come from user-uploaded content, such as scanned contracts, receipts, and image-heavy reports. These are exactly the files that inflate your S3 costs the most.
Batch Processing Existing Files
Already have a bucket full of uncompressed PDFs? Here is a script to process them in batch:
import java.util.List;

public class S3BatchCompressor {

    public static void main(String[] args) {
        String apiToken = "YOUR_API_TOKEN";
        PdfCompressionService service = new PdfCompressionService(apiToken);

        // List of PDF URLs to compress (from S3 presigned URLs or public URLs;
        // see the presigning sketch below)
        List<String> pdfUrls = List.of(
                "https://your-bucket.s3.amazonaws.com/doc1.pdf",
                "https://your-bucket.s3.amazonaws.com/doc2.pdf",
                "https://your-bucket.s3.amazonaws.com/doc3.pdf"
        );

        long totalOriginal = 0;
        long totalCompressed = 0;

        for (String url : pdfUrls) {
            try {
                System.out.println("Processing: " + url);
                PdfCompressionService.CompressionResult result = service.compress(url);
                totalOriginal += result.originalSize;
                totalCompressed += result.compressedSize;
                System.out.printf("  Compressed: %.1f%% savings%n", result.savingsPercent);
                System.out.printf("  Download: %s%n", result.url);
            } catch (Exception e) {
                System.err.println("  Failed: " + e.getMessage());
            }
        }

        // Guard against division by zero if every request failed
        if (totalOriginal > 0) {
            double totalSavings = (1.0 - (double) totalCompressed / totalOriginal) * 100;
            System.out.printf("%nTotal: %d bytes -> %d bytes (%.1f%% savings)%n",
                    totalOriginal, totalCompressed, totalSavings);
        }
    }
}
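For private buckets, the API needs URLs it can actually fetch. Here is a minimal sketch of generating a presigned GET URL with the AWS SDK v2 S3Presigner; the bucket, key, and duration are placeholders you would adjust:

import java.time.Duration;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.presigner.S3Presigner;
import software.amazon.awssdk.services.s3.presigner.model.GetObjectPresignRequest;

public class PresignExample {
    public static void main(String[] args) {
        try (S3Presigner presigner = S3Presigner.create()) {
            GetObjectRequest getRequest = GetObjectRequest.builder()
                    .bucket("your-bucket")
                    .key("doc1.pdf")
                    .build();

            GetObjectPresignRequest presignRequest = GetObjectPresignRequest.builder()
                    .signatureDuration(Duration.ofMinutes(30)) // long enough for the API to fetch the file
                    .getObjectRequest(getRequest)
                    .build();

            // Feed this URL into PdfCompressionService.compress(...)
            System.out.println(presigner.presignGetObject(presignRequest).url());
        }
    }
}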
Conclusion
PDF compression is one of the easiest infrastructure optimizations you can make. Add a single API call to your upload pipeline, and you immediately start saving on storage costs. The aPDF.io API handles the heavy lifting, so there is no CPU load on your servers.
For high-volume applications, the savings compound quickly. A 10TB S3 bucket of uncompressed PDFs could shrink to 2TB, cutting your storage bill significantly.
Next Steps
- Extract text for search: Use the Extract Content endpoint to make your PDFs searchable without storing extra metadata.
- Split large documents: If users upload multi-hundred-page PDFs, use the Split endpoint to break them into smaller, faster-loading chunks.