Remove Password Protection from Archived PDFs with Ruby
Organizations often have archives of password-protected PDFs. Maybe it was company policy to protect all invoices with a standard password. Or perhaps a vendor always sends encrypted reports using a shared credential.
Now you need to migrate these documents to a new system, or make them easily accessible to internal teams. Opening each file manually and re-saving it without protection isn't practical when you have hundreds of files.
The solution: remove the password protection in bulk through an API. You provide the PDF's URL and the known password, and get back a URL to an unlocked copy. No manual work, no desktop software.
Important: This endpoint requires you to know the password. It's designed for legitimate scenarios where you have authorized access to protected documents, not for bypassing security on files you shouldn't access.
Quick Example
require 'net/http'
require 'uri'
require 'json'
API_TOKEN = 'YOUR_API_TOKEN'
API_URL = 'https://apdf.io/api/pdf/security/remove'
uri = URI(API_URL)
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
request = Net::HTTP::Post.new(uri)
request['Authorization'] = "Bearer #{API_TOKEN}"
request['Content-Type'] = 'application/json'
request['Accept'] = 'application/json'
request.body = {
  file: 'https://example.com/protected-document.pdf',
  password: 'known-password-123'
}.to_json
response = http.request(request)
result = JSON.parse(response.body)
puts "Unlocked PDF: #{result['file']}"
The API returns a URL to the unlocked PDF. The original file is unchanged.
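The quick example assumes the request succeeds. As a minimal sketch of a stricter check, the helper below validates the status code and the presence of the `file` key before returning the URL. The name `extract_unlocked_url` is hypothetical, and because this article does not specify the shape of error responses, the raw body is simply passed through in the error message.

```ruby
require 'json'

# Hypothetical helper: returns the unlocked PDF URL from a response,
# or raises with a descriptive message. It takes the status code and
# body as plain values so it works with any HTTP client.
def extract_unlocked_url(status_code, body)
  unless status_code.to_s == '200'
    raise "Unlock request failed with HTTP #{status_code}: #{body}"
  end

  result = JSON.parse(body)
  url = result.is_a?(Hash) ? result['file'] : nil
  unless url.is_a?(String) && !url.empty?
    raise "Response did not contain a 'file' URL: #{body}"
  end

  url
rescue JSON::ParserError => e
  raise "Response was not valid JSON: #{e.message}"
end
```

With this in place, the last two lines of the quick example could become `puts "Unlocked PDF: #{extract_unlocked_url(response.code, response.body)}"`.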
Real-World Scenario: Migrating Protected Archives
Your company is moving to a new document management system. The old system required all financial documents to be protected with the password "Finance2020". You need to unlock thousands of files before migration.
require 'net/http'
require 'uri'
require 'json'
API_TOKEN = 'YOUR_API_TOKEN'
API_URL = 'https://apdf.io/api/pdf/security/remove'
# Standard password used by the legacy system
ARCHIVE_PASSWORD = 'Finance2020'
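# Sends one unlock request. Returns the parsed JSON response on success,
# or a hash with an 'error' key on failure.
def remove_password(pdf_url, password)
  uri = URI(API_URL)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true
  http.read_timeout = 60
  request = Net::HTTP::Post.new(uri)
  request['Authorization'] = "Bearer #{API_TOKEN}"
  request['Content-Type'] = 'application/json'
  request['Accept'] = 'application/json'
  request.body = {
    file: pdf_url,
    password: password
  }.to_json
  response = http.request(request)
  if response.code == '200'
    JSON.parse(response.body)
  else
    { 'error' => "HTTP #{response.code}: #{response.body}" }
  end
rescue JSON::ParserError, Timeout::Error, SystemCallError, SocketError => e
  # A network failure or malformed response should not abort the whole batch
  { 'error' => "#{e.class}: #{e.message}" }
end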
def process_archive(documents)
  results = { success: [], failed: [] }
  documents.each do |doc|
    puts "Processing: #{doc[:name]}"
    result = remove_password(doc[:url], ARCHIVE_PASSWORD)
    if result['file']
      puts " -> Unlocked: #{result['file']}"
      results[:success] << {
        name: doc[:name],
        original: doc[:url],
        unlocked: result['file']
      }
    else
      # Fall back to a generic message if the response has neither key
      error = result['error'] || 'No file URL in response'
      puts " -> Failed: #{error}"
      results[:failed] << {
        name: doc[:name],
        error: error
      }
    end
    # Simple rate limiting: pause between requests
    sleep(0.5)
  end
  results
end
# Documents to process
archived_docs = [
  { name: 'Invoice-2020-001', url: 'https://storage.example.com/invoices/2020-001.pdf' },
  { name: 'Invoice-2020-002', url: 'https://storage.example.com/invoices/2020-002.pdf' },
  { name: 'Invoice-2020-003', url: 'https://storage.example.com/invoices/2020-003.pdf' }
]
results = process_archive(archived_docs)
puts "\n=== Summary ==="
puts "Successful: #{results[:success].count}"
puts "Failed: #{results[:failed].count}"
# In production: download the unlocked PDFs and store them in your new system
Note: The unlocked PDF URLs are valid for 1 hour. Download and store them in your storage system promptly.
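Because the links expire, the download step is worth automating right after the batch finishes. The following is a minimal sketch, assuming the `results[:success]` structure built by `process_archive` above. The helper names (`local_path_for`, `download_file`, `save_unlocked`) and the target directory are hypothetical, and the sketch does not follow redirects or retry failed downloads.

```ruby
require 'net/http'
require 'uri'
require 'fileutils'

# Hypothetical helper: turns a document name into a safe local file path.
def local_path_for(doc_name, target_dir)
  safe_name = doc_name.gsub(/[^A-Za-z0-9._-]/, '_')
  File.join(target_dir, "#{safe_name}.pdf")
end

# Downloads the file at `url` to `path`, streaming the body in chunks
# so large PDFs are not held in memory. Raises on a non-2xx response.
def download_file(url, path)
  uri = URI(url)
  FileUtils.mkdir_p(File.dirname(path))
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request_get(uri.request_uri) do |response|
      unless response.is_a?(Net::HTTPSuccess)
        raise "Download failed with HTTP #{response.code} for #{url}"
      end
      File.open(path, 'wb') do |file|
        response.read_body { |chunk| file.write(chunk) }
      end
    end
  end
  path
end

# Saves every successfully unlocked document into `target_dir`
# and returns the list of local paths that were written.
def save_unlocked(results, target_dir)
  results[:success].map do |entry|
    download_file(entry[:unlocked], local_path_for(entry[:name], target_dir))
  end
end
```

Calling `save_unlocked(results, 'unlocked')` immediately after `process_archive` keeps the gap between unlocking and downloading short.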
Handling Different Passwords
Sometimes different batches of documents use different passwords. Here's how to handle that:
# Password mapping by document source or date range
PASSWORD_MAP = {
  '2019' => 'Finance2019',
  '2020' => 'Finance2020',
  '2021' => 'Finance2021',
  'vendor_a' => 'VendorReports123',
  'vendor_b' => 'SecureDoc456'
}
# Determines the password from the document naming convention.
# Returns nil when no rule matches.
def get_password_for_document(doc_name)
  if doc_name.include?('2019')
    PASSWORD_MAP['2019']
  elsif doc_name.include?('2020')
    PASSWORD_MAP['2020']
  elsif doc_name.include?('2021')
    PASSWORD_MAP['2021']
  elsif doc_name.start_with?('VendorA')
    PASSWORD_MAP['vendor_a']
  elsif doc_name.start_with?('VendorB')
    PASSWORD_MAP['vendor_b']
  else
    nil # Unknown password
  end
end
# Returns the API result, or nil when the document was skipped,
# so callers must check for nil before reading result['file'].
def process_with_auto_password(doc)
  password = get_password_for_document(doc[:name])
  if password.nil?
    puts " -> Skipped: Unknown password for #{doc[:name]}"
    return nil
  end
  remove_password(doc[:url], password)
end
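As the number of passwords grows, the `if`/`elsif` chain gets harder to maintain. One alternative, sketched here under the same naming conventions, is an ordered list of pattern and password pairs. `PASSWORD_RULES`, `password_for`, and `partition_by_known_password` are hypothetical names, and the rules are checked in the same order as the chain above, so a name such as `VendorA-2021` still matches the year rule first.

```ruby
# Each rule pairs a pattern (matched against the document name) with a
# password. Rules are checked top to bottom; the first match wins.
PASSWORD_RULES = [
  [/2019/, 'Finance2019'],
  [/2020/, 'Finance2020'],
  [/2021/, 'Finance2021'],
  [/\AVendorA/, 'VendorReports123'],
  [/\AVendorB/, 'SecureDoc456']
].freeze

# Returns the password for the first matching rule, or nil if none match.
def password_for(doc_name)
  _pattern, password = PASSWORD_RULES.find { |pattern, _| pattern.match?(doc_name) }
  password
end

# Splits documents into [known, unknown] so that files without a
# matching rule can be reviewed before any API calls are made.
def partition_by_known_password(documents)
  documents.partition { |doc| !password_for(doc[:name]).nil? }
end
```

Partitioning up front makes the unknown cases visible before the batch starts, rather than discovering them one skipped file at a time.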
Checking If a PDF Is Protected First
Before attempting to unlock, you can check if a PDF is actually encrypted using the metadata endpoint:
METADATA_URL = 'https://apdf.io/api/pdf/metadata/read'
# Returns true if the metadata endpoint reports the PDF as encrypted.
# If the check itself fails, returns true so the unlock attempt still runs
# and reports its own error instead of the file being silently skipped.
def encrypted?(pdf_url)
  uri = URI(METADATA_URL)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true
  request = Net::HTTP::Post.new(uri)
  request['Authorization'] = "Bearer #{API_TOKEN}"
  request['Content-Type'] = 'application/json'
  request['Accept'] = 'application/json'
  request.body = { file: pdf_url }.to_json
  response = http.request(request)
  return true unless response.code == '200'
  result = JSON.parse(response.body)
  result['encrypted'] == true
end
# Smart processing: only unlock if needed
def smart_unlock(doc)
  if encrypted?(doc[:url])
    puts " -> Encrypted, removing password..."
    remove_password(doc[:url], ARCHIVE_PASSWORD)
  else
    puts " -> Not encrypted, skipping"
    { 'file' => doc[:url], 'skipped' => true }
  end
end
Next Steps
Once your documents are unlocked, you can further process them:
- Extract text for indexing: Use the Content Read endpoint to extract text from the now-accessible documents.
- Compress for storage: Use the Compress endpoint to reduce file sizes before archiving in your new system.