Security is hard.
Security is one of the most difficult system properties to guarantee. In a sense, you can’t guarantee it, in that there is no such thing as a system that is 100% secure. Instead, security becomes a matter of cost-effectively raising the cost of attacking your system to the point that the potential prize is no longer worth it.
Even when your software system is in a facility 100 feet underground, off-network, with a dedicated power source, and encased in lead, one can come up with possible threats. And an attacker needs only one vulnerability to compromise your system. As a result, a system is only as strong as its weakest link. If your system has an administrator backdoor, your system is only as secure as that backdoor. If all data passes through a server, your data is only as secure as access to that server.
As a system architect, you have great incentive to reduce your attack surface as much as possible and increase the cost of attack wherever possible. However, at some point, it becomes prohibitively expensive to protect against a long tail of low-probability events. Do we ever need to think about protecting against random bit flips caused by cosmic rays from space? No, unless of course our software is destined for outer space, in which case yes: we’ll need error correction algorithms to counter this real physical phenomenon.
Many smart people have thought about security and have come up with a collective set of security best practices and principles. My team has been successful in building our systems over the years by heeding those principles. To summarize them, here is a partial run-through of our hit list when evaluating a system from a security perspective:
Use vetted tooling. Do not reinvent the wheel, especially where it can impact security.
Never trust input, whether from users or from other external systems. Always validate it, and escape it for the context in which it is used.
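As a minimal sketch of what validating and escaping can look like, here is a Python example using only the standard library. The username rule and the function names are hypothetical and chosen for illustration; your own rules will differ, and a templating engine with auto-escaping is usually the better tool for HTML output.

```python
import html
import re

# Hypothetical allowlist rule: 3-30 characters of letters, digits, "_" or "-".
USERNAME_RE = re.compile(r"[A-Za-z0-9_-]{3,30}")


def validate_username(raw: str) -> str:
    """Return the username unchanged if it matches the allowlist, else raise."""
    if not USERNAME_RE.fullmatch(raw):
        raise ValueError("invalid username")
    return raw


def render_comment(raw: str) -> str:
    """Escape user-supplied text before embedding it in HTML."""
    return "<p>" + html.escape(raw) + "</p>"
```

Validation rejects input that does not fit the expected shape; escaping makes whatever remains safe for one specific output context (HTML here). The two are complementary, not interchangeable.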
Security and correctness go hand in hand. Protect against system drift and protect against deploying incorrect code. Incorrect code can lead to security being compromised.
Use end-to-end encryption. Encrypt data over the wire and at rest. Examine every leg of your data communication and storage. Use HTTPS, and redirect HTTP to HTTPS, but be mindful that SSL is, well, only as good as SSL.
Never commit secrets, private keys, or passwords to git repositories. These should always be configuration that is only stored on the server or deployment system. If they are accidentally exposed, they should be considered compromised and rotated immediately.
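One common way to keep secrets out of the repository is to read them from the environment at startup. The sketch below assumes the deployment system injects them as environment variables; `load_secret` and the variable name in the usage note are hypothetical.

```python
import os


def load_secret(name: str) -> str:
    """Read a required secret from the environment; fail loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required secret: {name}")
    return value
```

A settings module might then call something like `load_secret("DATABASE_PASSWORD")`. Failing at startup when a secret is absent tends to be safer than silently falling back to a default value checked into the code.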
Salt and use a slow hashing function to hash passwords and any other data that only needs comparison and not retrieval (e.g. SSNs).
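To illustrate the salt-plus-slow-hash idea, here is a sketch using PBKDF2 from Python's standard library. The iteration count is an assumed work factor to tune for your hardware, and the function names are hypothetical. In keeping with the advice on vetted tooling, a framework's built-in password hasher is usually preferable to rolling this by hand.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune for your hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) using a fresh random salt for each password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)
```

The salt is stored alongside the digest. It is not secret; its job is to make sure two users with the same password end up with different hashes.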
Validate everything you want to validate on the backend, too. Never rely on just frontend/client validation.
Run a test that dynamically walks all of your URLs and checks each for authentication decorators (with explicitly listed exceptions).
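The shape of such a test depends on your framework, so the following is only a framework-free sketch. It assumes a hypothetical `login_required` decorator that tags the views it wraps, a hypothetical `URLS` mapping of paths to views, and an explicit `PUBLIC_URLS` exception list.

```python
import functools

def login_required(view):
    """Hypothetical decorator: rejects anonymous requests and tags the view."""
    @functools.wraps(view)
    def wrapper(request, *args, **kwargs):
        if not getattr(request, "user", None):
            return "401 Unauthorized"
        return view(request, *args, **kwargs)
    wrapper.requires_login = True  # marker the test below looks for
    return wrapper


def home(request):
    return "welcome"


@login_required
def dashboard(request):
    return "dashboard"


# Hypothetical URL map and the explicit list of intentionally public URLs.
URLS = {"/": home, "/dashboard": dashboard}
PUBLIC_URLS = {"/"}


def unprotected_urls(urls, public):
    """Return paths that are neither decorated nor explicitly listed as public."""
    return sorted(
        path
        for path, view in urls.items()
        if path not in public and not getattr(view, "requires_login", False)
    )


def test_all_urls_require_login():
    assert unprotected_urls(URLS, PUBLIC_URLS) == []
```

The point of the explicit exception list is that a new, undecorated view fails the test by default; someone has to decide deliberately that it is public.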
Barring accessibility requirements, use CAPTCHA or an alternative after a number of failed login/create account attempts from an IP address within a time window to prevent brute force attacks. Keep in mind that CAPTCHA has its weaknesses.
Don’t allow your website to be embedded in an iframe by default to avoid clickjacking. Exempt only the pages for which this is needed functionality.
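A deny-by-default framing policy can be sketched as a small WSGI middleware that sets the `X-Frame-Options` response header. The `deny_framing` name and the `exempt_paths` parameter are hypothetical; many frameworks ship an equivalent middleware, which would be the vetted choice.

```python
def deny_framing(app, exempt_paths=frozenset()):
    """WSGI middleware that adds X-Frame-Options: DENY except on exempt paths."""
    def middleware(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            if environ.get("PATH_INFO", "") not in exempt_paths:
                # Drop any existing value so the header is not duplicated.
                headers = [(k, v) for k, v in headers if k.lower() != "x-frame-options"]
                headers.append(("X-Frame-Options", "DENY"))
            return start_response(status, headers, exc_info)
        return app(environ, patched_start_response)
    return middleware
```

Wrapping an application as `deny_framing(app, exempt_paths={"/embed"})` would leave only the hypothetical `/embed` page frameable. The `frame-ancestors` directive of `Content-Security-Policy` serves the same purpose in modern browsers.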
Monitor errors and logs. Set up error reporting and system monitoring (e.g. Sentry, UptimeRobot).
Block IP addresses that are making bad requests to your site. For example, temporarily block IP addresses with too many failed logins.
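Counting failed logins per IP address within a time window, whether to trigger a CAPTCHA or a temporary block, can be sketched as follows. The thresholds and the `FailedLoginTracker` class are assumptions for illustration, and the in-memory store only works for a single process; a real deployment would more likely keep the counts in a shared store.

```python
import time
from collections import defaultdict, deque

MAX_FAILURES = 5       # assumed threshold
WINDOW_SECONDS = 300   # assumed window length


class FailedLoginTracker:
    """Sliding-window count of failed logins per IP (in-memory, single process)."""

    def __init__(self, max_failures=MAX_FAILURES, window=WINDOW_SECONDS,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.window = window
        self.clock = clock
        self._failures = defaultdict(deque)

    def record_failure(self, ip: str) -> None:
        self._failures[ip].append(self.clock())

    def is_blocked(self, ip: str) -> bool:
        cutoff = self.clock() - self.window
        attempts = self._failures[ip]
        # Forget failures that have aged out of the window.
        while attempts and attempts[0] <= cutoff:
            attempts.popleft()
        return len(attempts) >= self.max_failures
```

Because old failures age out of the window, the block lifts on its own, which matches the "temporarily" in the advice above.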
Make sure your database only has its firewall open to the application and worker servers that need to access it.
If using a load balancer, make sure the application servers are only open to the load balancer.
Use large-bit (2048 or 4096 bit) RSA keys for SSH access to servers instead of passwords, if not using something like AWS SSM.
Open your system only as far as is minimally needed. Ports that don’t need to be open should be closed by default. Only open them when necessary and only to the necessary IP addresses, and close them when they are no longer needed.
Update your system packages and system version as security updates are released and versions fall out of support. Test for system correctness.
Update your libraries and frameworks as bug fixes are released and versions fall out of support. Test for system correctness.
Reduce your attack surface area. Remove or disable unused or insecure software, such as older versions of SSL/TLS, unused HTTP methods, etc.
For systems with session cookies, cryptographically sign your session cookie.
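Signing a cookie value typically means appending a keyed MAC so the server can detect tampering. Here is a minimal sketch with HMAC-SHA256 from the standard library; the `sign` and `unsign` helpers are hypothetical, and most web frameworks already provide a vetted equivalent that should be preferred.

```python
import base64
import hashlib
import hmac


def sign(value: str, key: bytes) -> str:
    """Return 'value.mac', where mac is an HMAC-SHA256 over the value."""
    mac = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return value + "." + base64.urlsafe_b64encode(mac).decode("ascii")


def unsign(signed: str, key: bytes):
    """Return the original value if the signature checks out, else None."""
    value, sep, _mac = signed.rpartition(".")
    if not sep:
        return None
    expected = sign(value, key)
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(expected.encode("utf-8"), signed.encode("utf-8")):
        return value
    return None
```

Note that signing protects integrity only: the client can still read the value. The key is a secret and belongs in server-side configuration, not in the repository.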
If not all subdomains are trusted content, sign and check signatures on CSRF cookies to protect against the possibility of a CSRF attack from a subdomain.
For HTTP interfaces, make sure data modifications only happen through POST/PUT/DELETE operations and keep in mind that CSRF protection does not apply to GET requests.
Think of the physical security of your system. Here at Zagaran, we typically offload this burden to cloud providers that detail their physical security commitments.
Prior to launch and at other critical junctures, conduct penetration testing. For web systems, test against the OWASP checklist and run automated tools to check your deployment as a first pass on vulnerabilities:
And the list goes on. With so many diverse aspects to cover, a canonical checklist to run through can help a great deal. Careful, methodical application of best practices, meticulous attention to detail, and your ability to add further layers of Swiss cheese are your foremost allies in this war.
If you are looking to build secure software or get your software system’s security vetted, you can reach out to my team. We serve as software experts for dozens of groups, and we’d be happy to serve as those experts for you as well.