
Privacy-First Practices #1: Minimize Data Retention
- Djeditech
- Privacy, Best practices
- November 10, 2025
The Privacy Principle
Keep only what you need, delete what you don’t.
Data retention is a critical privacy practice that’s often overlooked. Every piece of data you store is a liability - it can be breached, misused, or become a compliance burden. The best way to protect data is to not have it in the first place.
Why Data Minimization Matters
The Risks of Over-Retention
- Breach Exposure: More data = larger breach impact
- Compliance Burden: Old data creates GDPR/CCPA liabilities
- Storage Costs: Unnecessary data costs money to store
- Discovery Risk: Old data can be subpoenaed in legal cases
- Staleness: Outdated data leads to poor decisions
The Benefits of Minimal Retention
- Reduced Attack Surface: Less data to protect
- Easier Compliance: Simpler to manage what you have
- Lower Costs: Less storage and backup needs
- Faster Processing: Smaller datasets perform better
- Trust Building: Shows you value privacy
Implementing Data Retention Policies
1. Define Retention Requirements
Identify what you must keep:
- Legal requirements (tax records, contracts)
- Regulatory compliance (HIPAA, SOX, etc.)
- Business operations (active customer data)
- Statistical/analytics (aggregated only)
Example Retention Periods:
Active customer data: Duration of relationship + 1 year
Transaction records: 7 years (a common tax-law period; confirm for your jurisdiction)
Support tickets: 2 years
Marketing analytics: 1 year (anonymized)
Application logs: 30-90 days
Website logs: 14 days
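The periods above can be encoded as data so that every record gets an expiry date when it is created. The following is a minimal sketch under stated assumptions, not a definitive implementation: the `RETENTION_DAYS` map, its category names, and the `expiresAt` and `isExpired` helpers are hypothetical. The day counts restate the example periods, using the 90-day upper bound for application logs and approximating a year as 365 days. Active customer data is left out because its period depends on when the relationship ends.

```typescript
// Hypothetical retention map; day counts mirror the example periods above.
const RETENTION_DAYS = {
  transactionRecord: 7 * 365,
  supportTicket: 2 * 365,
  marketingAnalytics: 365,
  applicationLog: 90, // upper bound of the 30-90 day range
  websiteLog: 14,
} as const;

type DataCategory = keyof typeof RETENTION_DAYS;

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the moment after which a record of this category should be deleted.
function expiresAt(category: DataCategory, createdAt: Date): Date {
  return new Date(createdAt.getTime() + RETENTION_DAYS[category] * MS_PER_DAY);
}

// True once the retention period for the category has elapsed.
function isExpired(
  category: DataCategory,
  createdAt: Date,
  now: Date = new Date()
): boolean {
  return now.getTime() >= expiresAt(category, createdAt).getTime();
}
```

Storing the result of `expiresAt` alongside each record at write time means a cleanup job only has to compare one column against the current time.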
2. Automate Data Deletion
Database-level automation:
-- PostgreSQL example: Auto-delete old records
CREATE OR REPLACE FUNCTION delete_old_logs()
RETURNS void AS $$
BEGIN
  DELETE FROM application_logs
  WHERE created_at < NOW() - INTERVAL '30 days';
END;
$$ LANGUAGE plpgsql;
-- Schedule it (cron.schedule is provided by the pg_cron extension)
SELECT cron.schedule(
  'delete-old-logs',
  '0 2 * * *', -- 2 AM daily
  'SELECT delete_old_logs();'
);
Application-level cleanup:
// Automated cleanup service
// Assumes the node-cron package and a Prisma-style `db` client defined elsewhere
import cron from 'node-cron';

class DataRetentionService {
  async cleanupOldData() {
    const thirtyDaysAgo = new Date();
    thirtyDaysAgo.setDate(thirtyDaysAgo.getDate() - 30);
    // Delete old logs
    await db.logs.deleteMany({
      where: { createdAt: { lt: thirtyDaysAgo } }
    });
    // Anonymize old analytics
    await db.analytics.updateMany({
      where: {
        createdAt: { lt: thirtyDaysAgo },
        anonymized: false
      },
      data: {
        userId: null,
        ipAddress: null,
        anonymized: true
      }
    });
  }
}
// Run daily at 2 AM; log failures so a broken cleanup job does not go unnoticed
cron.schedule('0 2 * * *', () => {
  new DataRetentionService()
    .cleanupOldData()
    .catch((err) => console.error('Retention cleanup failed:', err));
});
3. Anonymize Before Deletion
When you need statistics but not personal data:
// Convert personal data to anonymous statistics
async function anonymizeUserData(userId: string) {
  // Extract statistics
  const stats = await db.userActivity.aggregate({
    where: { userId },
    _sum: { pageViews: true },
    _avg: { sessionDuration: true }
  });
  // Store the aggregated stats and delete the personal data in one
  // transaction, so a failure cannot leave one step done without the other
  await db.$transaction([
    db.anonymousStats.create({
      data: {
        pageViews: stats._sum.pageViews,
        avgSessionDuration: stats._avg.sessionDuration,
        date: new Date()
      }
    }),
    db.userActivity.deleteMany({ where: { userId } })
  ]);
}
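A statistics row derived from a single user can sometimes still be linked back to that person. One way to reduce that risk is to aggregate across users and suppress small groups. The sketch below illustrates this with in-memory data. It is an addition, not part of the code above: `ActivityRow`, `DailyStat`, `aggregateDaily`, and the `minUsers` threshold are hypothetical names and choices.

```typescript
interface ActivityRow {
  userId: string;
  day: string; // calendar day, e.g. "2025-11-10"
  pageViews: number;
}

interface DailyStat {
  day: string;
  users: number;
  pageViews: number;
}

// Aggregates per-user rows into per-day totals and drops days with fewer than
// `minUsers` distinct users, so no output row describes a very small group.
function aggregateDaily(rows: ActivityRow[], minUsers: number): DailyStat[] {
  const byDay = new Map<string, { users: Set<string>; pageViews: number }>();
  for (const row of rows) {
    const bucket = byDay.get(row.day) ?? { users: new Set<string>(), pageViews: 0 };
    bucket.users.add(row.userId);
    bucket.pageViews += row.pageViews;
    byDay.set(row.day, bucket);
  }
  const stats: DailyStat[] = [];
  byDay.forEach((bucket, day) => {
    if (bucket.users.size >= minUsers) {
      stats.push({ day, users: bucket.users.size, pageViews: bucket.pageViews });
    }
  });
  return stats;
}
```

The output contains no user identifiers, only counts, which is what "keep insights, not personal data" amounts to in practice. The right threshold depends on the dataset and is a judgment call.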
4. Document Retention Policies
Create a data retention schedule:
| Data Type | Retention Period | Deletion Method | Reason |
|---|---|---|---|
| Active accounts | Account lifetime + 1 year | Hard delete | Legal requirement |
| Inactive accounts | 2 years | Anonymize then delete | Business need |
| Payment records | 7 years | Archive then delete | Tax law |
| Support tickets | 2 years | Hard delete | Business practice |
| Application logs | 30 days | Rolling delete | Operations |
| Marketing analytics | 1 year anonymized | Anonymize immediately | Analytics |
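Keeping the schedule in code as well as in a document makes it possible to check that every stored data type has a documented rule. The following is a rough sketch: the `RetentionRule` shape, the `SCHEDULE` constant, and the `undocumentedTypes` helper are hypothetical, and the rows simply copy the table above.

```typescript
type DeletionMethod =
  | "hard-delete"
  | "anonymize-then-delete"
  | "archive-then-delete"
  | "rolling-delete"
  | "anonymize-immediately";

interface RetentionRule {
  dataType: string;
  retention: string; // human-readable period, as written in the table
  method: DeletionMethod;
  reason: string;
}

// Rows copied from the retention schedule table.
const SCHEDULE: RetentionRule[] = [
  { dataType: "Active accounts", retention: "Account lifetime + 1 year", method: "hard-delete", reason: "Legal requirement" },
  { dataType: "Inactive accounts", retention: "2 years", method: "anonymize-then-delete", reason: "Business need" },
  { dataType: "Payment records", retention: "7 years", method: "archive-then-delete", reason: "Tax law" },
  { dataType: "Support tickets", retention: "2 years", method: "hard-delete", reason: "Business practice" },
  { dataType: "Application logs", retention: "30 days", method: "rolling-delete", reason: "Operations" },
  { dataType: "Marketing analytics", retention: "1 year anonymized", method: "anonymize-immediately", reason: "Analytics" },
];

// Returns the data types that are stored but have no documented rule.
function undocumentedTypes(storedTypes: string[], schedule: RetentionRule[]): string[] {
  const documented = new Set(schedule.map((rule) => rule.dataType));
  return storedTypes.filter((type) => !documented.has(type));
}
```

A check like `undocumentedTypes` could run in CI against the list of tables or collections, so that a new data type cannot ship without a retention decision.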
Storage Tiers Strategy
Implement progressive data aging:
- Hot Storage (0-30 days): Fast access, full fidelity
- Warm Storage (30-90 days): Slower access, compressed
- Cold Storage (90 days - retention limit): Archived, anonymized
- Deletion: After retention period expires
// Progressive data aging
// Oldest data is handled first so each record is moved at most once per run
async function ageData() {
  const now = new Date();
  const dayMs = 24 * 60 * 60 * 1000;
  // Delete (365 days old here; substitute the retention limit for the data type)
  const deleteCutoff = new Date(now.getTime() - 365 * dayMs);
  await deleteData({ olderThan: deleteCutoff });
  // Move to cold storage (90 days old)
  const coldCutoff = new Date(now.getTime() - 90 * dayMs);
  await moveToColdStorage({ olderThan: coldCutoff });
  // Move to warm storage (30 days old)
  const warmCutoff = new Date(now.getTime() - 30 * dayMs);
  await moveToWarmStorage({ olderThan: warmCutoff });
}
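The tier boundaries can also be expressed as a pure function that maps a record's age to its target tier, which is easy to unit test separately from the storage code. This is a sketch only: `StorageTier`, `tierForAge`, and the `retentionLimitDays` parameter are hypothetical names, and the 30 and 90 day thresholds come from the tier list above.

```typescript
type StorageTier = "hot" | "warm" | "cold" | "delete";

const DAY_MS = 24 * 60 * 60 * 1000;

// Maps a record's age to the tier it belongs in. The retention limit is
// checked first, so a short limit wins over the warm and cold thresholds.
function tierForAge(createdAt: Date, now: Date, retentionLimitDays: number): StorageTier {
  const ageDays = (now.getTime() - createdAt.getTime()) / DAY_MS;
  if (ageDays >= retentionLimitDays) return "delete";
  if (ageDays >= 90) return "cold";
  if (ageDays >= 30) return "warm";
  return "hot";
}
```

Checking the retention limit before the tier thresholds matters for data types with short limits: a record with a 60-day limit should be deleted at 60 days rather than sit in warm storage until day 90.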
User Control Over Their Data
Provide self-service data management:
// User data export (GDPR Article 20)
async function exportUserData(userId: string) {
  const userData = {
    profile: await db.user.findUnique({ where: { id: userId } }),
    orders: await db.order.findMany({ where: { userId } }),
    support: await db.ticket.findMany({ where: { userId } })
  };
  return JSON.stringify(userData, null, 2);
}
// User data deletion (GDPR Article 17 - Right to be forgotten)
// Orders and tickets are not touched here. Records under a legal retention
// requirement (e.g. payment records) may have to be kept; anything else that
// holds personal data should be erased or anonymized as well.
async function deleteUserData(userId: string) {
  await db.$transaction([
    db.userActivity.deleteMany({ where: { userId } }),
    // deleteMany does not throw when the user has no preferences row
    db.preferences.deleteMany({ where: { userId } }),
    db.user.delete({ where: { id: userId } })
  ]);
}
Monitoring and Auditing
Track retention compliance:
// Retention audit report
// `findRetentionViolations` is assumed to be defined elsewhere
async function generateRetentionAudit() {
  const report = {
    oldestRecord: await db.data.findFirst({ orderBy: { createdAt: 'asc' } }),
    // Record counts per type, limited to records created in the last 90 days
    dataByAge: await db.data.groupBy({
      by: ['type'],
      _count: true,
      where: {
        createdAt: {
          gte: new Date(Date.now() - 90 * 24 * 60 * 60 * 1000)
        }
      }
    }),
    retentionViolations: await findRetentionViolations()
  };
  return report;
}
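The audit above calls `findRetentionViolations` without showing it. One possible shape for that check, written as a pure function over already-loaded records, is sketched below. Everything here is an assumption for illustration: the `StoredRecord` and `Violation` types, the `findViolations` name and signature, and the per-type `maxAgeDays` map.

```typescript
interface StoredRecord {
  id: string;
  type: string;
  createdAt: Date;
}

interface Violation {
  id: string;
  type: string;
  daysOverdue: number;
}

// Flags records that are older than the maximum age configured for their type.
// Records whose type has no configured limit are skipped in this sketch.
function findViolations(
  records: StoredRecord[],
  maxAgeDays: Record<string, number>,
  now: Date
): Violation[] {
  const msPerDay = 24 * 60 * 60 * 1000;
  const violations: Violation[] = [];
  for (const record of records) {
    const limit = maxAgeDays[record.type];
    if (limit === undefined) continue;
    const ageDays = (now.getTime() - record.createdAt.getTime()) / msPerDay;
    if (ageDays > limit) {
      violations.push({
        id: record.id,
        type: record.type,
        daysOverdue: Math.floor(ageDays - limit),
      });
    }
  }
  return violations;
}
```

In a real system the same comparison would more likely run as a database query per data type. The pure version is mainly useful for testing the policy logic itself.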
Action Items
- Document retention requirements for each data type
- Implement automated deletion for logs and temporary data
- Set up anonymization before deletion for statistical data
- Create retention policy documentation
- Schedule regular retention audits
- Provide user data export and deletion capabilities
- Monitor compliance with retention policies
Key Takeaways
- Default to Deletion: Set expiration dates on all new data by default
- Automate Cleanup: Manual deletion doesn’t scale and gets forgotten
- Anonymize for Statistics: Keep insights, not personal data
- Document Everything: Clear policies protect you legally
- Empower Users: Give users control over their own data
Remember: The data you don’t have can’t be stolen, leaked, or misused. Minimal retention is minimal risk.
Part of the Privacy-First Practices series - practical privacy engineering for modern applications.