The new OWASP API Security Top 10 2019 was recently published as a release candidate, and it is a breath of fresh air for the field of security and penetration testing. As the naming suggests, it reads, at least to me, as a tribute to its old pal, the OWASP Backend Security Project, released in 2008. But deep down, they are two different guides with two different approaches to security. The new one takes a Top 10 approach built around scenarios, while the old one presents a multi-phase approach, the phases being development, hardening and testing.
In this article I will walk through the new list, using the tools and techniques currently available to defend your APIs against these threats. I will focus on Python, since it is my area of expertise, and will present packages and tools that relate to it either directly or generally.
It is really interesting to see how the Top 10 list of API vulnerabilities differs, priority-wise, from the original OWASP Top 10 list. This sheds light on which measures matter most for APIs.
The following is the current Top 10 list of vulnerabilities presented in the document, which I will go through one by one afterwards:
API1:2019-Broken Object Level Authorization
API2:2019-Broken Authentication
API3:2019-Excessive Data Exposure
API4:2019-Lack of Resources & Rate Limiting
API5:2019-Broken Function Level Authorization
API6:2019-Mass Assignment
API7:2019-Security Misconfiguration
API8:2019-Injection
API9:2019-Improper Assets Management
API10:2019-Insufficient Logging & Monitoring
API1:2019 - Broken Object Level Authorization
As the title implies, it is all about giving unwanted parties or attackers access to information they should not have. The AAA (Authentication, Authorization and Accounting) model of network protocols also applies to high-level applications sitting at the Application Layer (L7 in OSI terms) of the network protocol stack, although in this article I use Access in place of Accounting. The vulnerability only mentions authorization, but I think the steps around it, authentication that precedes it and access that follows it, should also be explored, which I’ll do below.
Authentication is a mechanism by which one’s identity is verified. The system decides whether a party is who they claim to be, based on the identities it already knows. This ensures that no one can take over anyone else’s identity. In applications, identity management is most easily described as user management: each user/actor of the system is an identity known to the system, which can therefore be verified through a mechanism or a protocol.
Authentication usually happens via a mechanism or protocol, either application-based or protocol-based, working at different layers of the networking stack. An example of application-based authentication is the user authentication module in Django, which usually uses a ModelBackend to look up a user (by username or email) and check the matching password stored in a database or a data store. Authentication protocols either work at the application layer, like digest access authentication, or operate on lower layers in the stack, e.g. IEEE 802.1X. Many application-layer mechanisms are also offered through the SASL framework.
These are the things to consider while performing any kind of authentication and authorization and granting access:
- NEVER EVER TRUST USER INPUT! **' or '1'='1**
- Authenticate properly, either via SSO, SAML, social authentication or whatever method you choose to authenticate against!
- Always use a rate limiting mechanism so that nobody can try username and password pairs against the same account many times in a row. This helps prevent brute-force attacks and credential stuffing.
- Keep track of the IP addresses users authenticate from, and limit the number of authentication attempts coming from a single IP address. This slows down the same brute-force attacks. Man-in-the-middle attacks, on the other hand, are addressed by encrypting the transport, not by IP tracking.
- Always use the most secure protocol or mechanism to authenticate users. There is always a trade-off between security and usability, so beware!
- Authentication tokens are not as secure as they seem to be, since stolen tokens are the key to everything. So, always use some sort of expiry, like refreshing the token every once in a while.
- Use end-to-end encryption.
- Use a strong password hashing mechanism that supports per-password salts generated from a good source of randomness.
- Firewalls are dumb. Use them, but don’t trust them fully. The same applies for other defensive mechanisms.
- Misconfigurations are the root of all evil. Take them seriously!
- Use CAPTCHAs and account lockout mechanisms, which complement the rate limiting mentioned above.
- Never permit weak passwords!
- Never send authentication details in URL!
- Always verify tokens, in terms of validity, expiry, etc!
- Never reply with meaningful sensitive information. For example, revealing whether a user exists in response to a forgot-password request helps the attacker enumerate the users of your API or website. Ambiguous messages and the 204 No Content status code help prevent this issue.
- Follow the guidelines in OWASP Authentication Cheatsheet.
- API keys are for projects, authentication is for users.
- Use multi-factor authentication (MFA) where possible, but beware of SIM swap attacks, which can undermine SMS-based factors.
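To make the hashing item in the list above more concrete, here is a minimal sketch of salted password hashing using only the Python standard library. The function names and the iteration count are my own assumptions for illustration, not a recommendation from the OWASP document, and a maintained password hashing library is usually a better choice in production.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor, for illustration only; tune it for your hardware.
PBKDF2_ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) for the given password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for every password
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

Store the salt next to the derived key. The constant-time comparison avoids leaking how many leading bytes of a guess were correct.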
The second step is authorization, in which the system decides whether the authenticated user is authorized to access a resource. Imagine you have a website selling stuff. You don’t want each user to be able to see what other users have ordered, well, for privacy reasons. So the object of concern here is the order history, and only the owner of the order(s) can view it. If, by some mechanism, users can view each other’s order history, your website becomes insecure and untrustworthy.
In terms of an API, you should disallow unauthorized access to resources via some authorization mechanism. In practice, it is best to authorize users based on their access to objects, and hence models. I mention models because nowadays it is widely accepted to use an OxM (Object-to-X-Mapper, like ORM, OGM, ODM, etc.) to access your database tables, rows, nodes, relations, documents or the like. Since models are the single source of truth, it is better to authorize users based on them rather than on routes: a route-level check can be missed or bypassed, letting a user reach a route they are not authorized for and jeopardizing the security of your API. This may seem a little complicated, but just keep in mind that authorizing based on models is easier and safer. You can do both, but it easily gets messy and hard to manage.
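The order history example can be sketched as an ownership check performed every time an object is loaded. Everything below (the Order class, the in-memory ORDERS store and the Forbidden exception) is hypothetical and only stands in for your own models and data store.

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Raised when a user asks for an object they do not own."""


@dataclass(frozen=True)
class Order:
    id: int
    owner_id: int
    item: str


# Hypothetical in-memory store standing in for a database table.
ORDERS = {
    1: Order(id=1, owner_id=10, item="book"),
    2: Order(id=2, owner_id=20, item="laptop"),
}


def get_order(current_user_id: int, order_id: int) -> Order:
    """Return the order only if it belongs to the authenticated user."""
    order = ORDERS.get(order_id)
    # Same error for "missing" and "not yours", so IDs cannot be enumerated.
    if order is None or order.owner_id != current_user_id:
        raise Forbidden("order not found")
    return order
```

The point is that the check depends on the object itself, not just on the route: knowing or guessing another order's ID is not enough to read it.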
The usual method is to have a “User ↔ Role ↔ Permission” set of models, in which authentication and user profile management are handled by the user object, and each user can have one or multiple role(s), which, in turn, can have certain permissions granted to them. This is usually referred to as role-based access control (RBAC).
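A bare-bones version of that User, Role and Permission relationship could look like the sketch below. The class names, role names and permission strings are made up for the example; a real project would normally back them with database models.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass(frozen=True)
class Role:
    name: str
    permissions: FrozenSet[str]


@dataclass
class User:
    username: str
    roles: List[Role] = field(default_factory=list)

    def has_permission(self, permission: str) -> bool:
        """A user holds a permission if any of their roles grants it."""
        return any(permission in role.permissions for role in self.roles)


# Hypothetical roles and permissions for a small shop API.
customer = Role("customer", frozenset({"order:view"}))
support = Role("support", frozenset({"order:view", "order:refund"}))

alice = User("alice", roles=[customer])
bob = User("bob", roles=[customer, support])
```

A user with no roles gets no permissions at all, which keeps the default on the safe side.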
Another method is to have a “User ↔ Group ↔ Permission” set of models, in which a user belongs to one or more group(s) that may have a certain number of permissions. In this method, the group may have an arbitrary set of permissions, completely unrelated to the concept of its designation or role.
The hardest part is when a user has more than one role or belongs to more than one group. In this case, the best approach is to apply the principle of least privilege (PoLP), meaning that the lowest permission(s) always win, or simply put, a user doesn’t need administrative access to view their order history. Also, make sure that you understand the object-capability model to its fullest.
Sometimes the user is authenticated and authorized to access a resource, but the resource itself needs some special permissions, like delete permissions. Users may be able to view their order history, but may (or should) not be able to change it. This boils down to the specific object-level permissions a resource may have.
So, authorization is the permission(s) the user has, while access is the permission(s) a resource needs, for the system to be able to grant access to it.
This subtle difference is usually disregarded or misinterpreted, which leads to the topic of the current vulnerability I am talking about, broken object level authorization. From the user’s perspective, when they are in, they are in, but from the system’s and security perspective, this may not be the case. So, try to implement fine-grained access controls to restrict users and attackers from gaining access to your valuable resources.
API2:2019 - Broken Authentication
This topic is well discussed above, so if you want to know more about this vulnerability, make sure you read the first section, API1:2019-Broken Object Level Authorization.
API3:2019 - Excessive Data Exposure
It is considered a bad design decision, security-wise, to rely on the client to filter the data returned by the API, since an attacker can sit in between, listen to all the communications (requests and responses) and sniff out sensitive information that should not be there in the first place.
Always review the response data returned by every endpoint, and validate it against a schema to prevent exposure. A typical example of this vulnerability is returning whole user objects to the client, including password hashes and other sensitive data.
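One way to picture the schema idea is to build every response from an explicit list of public fields instead of dumping the stored record. The schema and field names below are assumptions for the example; a serialization or validation library would normally do this job.

```python
from typing import Any, Dict

# Assumed public schema for a user resource: field name -> expected type.
PUBLIC_USER_SCHEMA = {"id": int, "username": str, "display_name": str}


def serialize_user(record: Dict[str, Any]) -> Dict[str, Any]:
    """Build the response from the schema, never from the raw record."""
    response = {}
    for name, expected_type in PUBLIC_USER_SCHEMA.items():
        value = record[name]
        if not isinstance(value, expected_type):
            raise TypeError(f"field {name!r} must be {expected_type.__name__}")
        response[name] = value
    return response
```

Because the output is assembled from the schema, a new sensitive column added to the record later does not leak by accident.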
API4:2019 - Lack of Resources & Rate Limiting
Rate limiting has already been discussed in API1:2019-Broken Object Level Authorization as a way to prevent brute-force attacks. It also applies to other system resources like CPU, memory, etc., which are always scarce, no matter how much you have, and can easily be saturated by overuse. Scarcity calls for different resource management strategies. Rate limiting is one of them. Another is to cap resource usage with something like Docker, which can limit the CPU and memory available to a container.
Relying on the client alone for data validation is a true mistake and should be avoided at all costs: client-side validation must always be accompanied by server-side validation. This also includes limiting the size of incoming data to a specific amount.
There are cases like zip bombs that can exhaust the resources on your server, so verify the compression ratio of compressed files before accepting and extracting them.
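To illustrate rate limiting, here is a small in-memory token bucket keyed by client (an IP address or a username, for instance). The class name and parameters are my own, and it is only a sketch: it is not thread-safe and keeps state in a single process, whereas a production setup usually relies on a shared store or on the API gateway.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Allow bursts of up to `capacity` requests, refilled at `rate` per second."""

    def __init__(
        self,
        rate: float,
        capacity: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        # key -> (tokens left, time of the last request)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, key: str) -> bool:
        now = self.clock()
        tokens, last = self._buckets.get(key, (float(self.capacity), now))
        # Refill according to the time elapsed since the last request.
        tokens = min(float(self.capacity), tokens + (now - last) * self.rate)
        if tokens < 1:
            self._buckets[key] = (tokens, now)
            return False
        self._buckets[key] = (tokens - 1, now)
        return True
```

A request handler would call `limiter.allow(client_ip)` first and answer with 429 Too Many Requests when it returns False.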
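A compression ratio check along those lines can be sketched with the standard zipfile module. The two thresholds are arbitrary values I picked for the example, so adjust them to what your API legitimately receives.

```python
import io
import zipfile

MAX_RATIO = 100                      # assumed maximum compression ratio
MAX_UNCOMPRESSED = 50 * 1024 * 1024  # assumed cap of 50 MiB once extracted


def is_safe_zip(data: bytes) -> bool:
    """Inspect the archive's declared sizes without extracting anything."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        infos = archive.infolist()
        total_uncompressed = sum(info.file_size for info in infos)
        total_compressed = sum(info.compress_size for info in infos)
    if total_uncompressed > MAX_UNCOMPRESSED:
        return False
    if total_compressed == 0:
        return total_uncompressed == 0
    return total_uncompressed / total_compressed <= MAX_RATIO
```

The sizes come from the archive's own headers, which an attacker can forge, and nested archives are not inspected. So treat this as a first filter and also cap the number of bytes actually written during extraction.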
API5:2019 - Broken Function Level Authorization
We have already discussed object level authorization in API1:2019-Broken Object Level Authorization, which deals with unauthorized access to objects. The current vulnerability covers cases where the attacker is already authorized, or sends malicious input to an endpoint that requires little or no authorization, like endpoints meant for anonymous users.
A small change to an HTTP request method can be enough to exploit a vulnerable endpoint. For example, changing a GET request to POST may make the endpoint create or update a record with the same anonymous access rights. This also applies to the request URL, where simply changing /users to /admins may return all the data relating to privileged admin users, which is horrible. Just keep in mind that almost everything in the request can be changed. Burp Suite is just one example of a tool that acts as a proxy between the client and the API and lets you change the requests in between.
The best mechanism to prevent most of these vulnerabilities is to tighten the enforcement of authorization rules, by denying everything by default and then explicitly granting specific roles access to each function. This is also one of the most important topics in firewall configuration, in which the last rule is to deny all: if no rule matches the packet, it is denied and not passed through.
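The deny-by-default idea can be sketched as a policy table keyed by HTTP method and route. The routes and role names are hypothetical; the important part is that anything missing from the table is rejected.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# Hypothetical policy: (HTTP method, route) -> roles allowed to call it.
POLICY: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("GET", "/users"): frozenset({"anonymous", "user", "admin"}),
    ("POST", "/users"): frozenset({"admin"}),
    ("GET", "/admins"): frozenset({"admin"}),
}


def is_allowed(method: str, route: str, roles: Iterable[str]) -> bool:
    """Deny unless the (method, route) pair explicitly lists one of the roles."""
    allowed_roles = POLICY.get((method.upper(), route), frozenset())
    return not allowed_roles.isdisjoint(roles)
```

With this shape, switching GET to POST or /users to /admins lands on a different policy entry, or on no entry at all, instead of inheriting the access rights of the original request.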
API6:2019 - Mass Assignment
As said before, **NEVER EVER TRUST USER INPUT!** I just can’t put more emphasis on this advice. Certain things in the database and workflow of the API are driven by user input, like a user profile or the metadata passed along with an upload request. Binding users’ input parameters directly to objects without any validation and filtering is a recipe for disaster. Imagine an attacker setting a simple is_admin flag to True while sending a request to update a user’s profile. If this is not taken seriously, it makes the attacker an admin in the system right away. Take input validation seriously, so that the next attacker or curious user is not able to update their account’s balance via an otherwise legitimate request to the system. Also be aware of fields like created and updated timestamps, which should almost always be handled internally and never be taken from the user directly.
Use whitelist and blacklist features to prevent some fields from being updated via requests.
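Here is a minimal sketch of the whitelist approach for a profile update. The field names are assumptions for the example; the idea is simply that is_admin, balance and friends never appear in the list of editable fields.

```python
from typing import Any, Dict

# Assumed set of fields a user may change on their own profile.
EDITABLE_PROFILE_FIELDS = frozenset({"display_name", "email", "bio"})


def apply_profile_update(
    profile: Dict[str, Any], payload: Dict[str, Any]
) -> Dict[str, Any]:
    """Return an updated copy of the profile, rejecting non-editable fields."""
    unexpected = set(payload) - EDITABLE_PROFILE_FIELDS
    if unexpected:
        raise ValueError(f"fields not allowed: {sorted(unexpected)}")
    updated = dict(profile)
    updated.update(payload)
    return updated
```

Rejecting the whole request, rather than silently dropping the extra fields, also gives you an input validation error worth logging.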
API7:2019 - Security Misconfiguration
As I said above, misconfigurations, especially security misconfigurations, are the root of all evil. Taking security configurations for granted is a grave mistake. Since security is too often treated as an afterthought, resist the temptation to forgo implementing security recommendations, like encryption, hashing, rate limiting, CORS, TLS, etc.
There are some general pieces of advice:
- Integrate a hardening process into your workflow.
- Regularly review and update configurations and packages on your projects and infrastructure. Always apply the latest security patches.
- Use E2E encryption!
- Try to integrate automated configuration analysis and vulnerability assessment in your workflow.
- Never send stack traces produced by exceptions to the user (or potential attacker). Always validate the response against a schema.
- Static analysis and linting can catch many misconfigurations and bugs early on.
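For the stack trace advice in the list above, a framework-agnostic sketch could wrap each handler so that the traceback only ever reaches the server log. The safe_call name and the response shape are my own choices; most web frameworks offer an equivalent error handler hook that should be preferred.

```python
import logging
import uuid
from typing import Any, Callable, Dict, Tuple

logger = logging.getLogger("api.errors")


def safe_call(handler: Callable[[], Dict[str, Any]]) -> Tuple[int, Dict[str, Any]]:
    """Run a handler and turn any unexpected error into a generic 500 body."""
    try:
        return 200, handler()
    except Exception:
        error_id = uuid.uuid4().hex
        # The full traceback goes to the server log only.
        logger.exception("unhandled error %s", error_id)
        # The client gets a generic message plus an ID to quote to support.
        return 500, {"error": "internal server error", "error_id": error_id}
```

The error ID lets you find the matching traceback in the logs without exposing any internals to the caller.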
API8:2019 - Injection
Injection, injection, injection! It is almost 2020 and we still have this injection issue! Neither OxMs nor any other piece of software today can fully prevent it from being exploited. Well, there have been some advances in IDSes and cloud services to inspect and prevent such attacks, but it can still be exploited in the wild. Just keep in mind that NoSQL is also vulnerable to injection attacks.
Nowadays, working without any relation to a third-party API is just not feasible anymore. Payment service providers, authentication providers and the like are all examples of such services. We, as developers, usually tend to think that when we pay for something, it is to be trusted, but sometimes the reverse proves to be the case! So, I suggest you not trust any data coming from external systems and APIs, and validate it, too! For example, an attack on a third-party API may propagate to the APIs that depend on it.
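The classic SQL case is easy to demonstrate with the sqlite3 module from the standard library. The table and function names are made up for the demo; the contrast between string formatting and a bound parameter is what matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password_hash TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("alice", "not-a-real-hash"))


def find_user_unsafe(username: str):
    # Vulnerable: the user input becomes part of the SQL text.
    query = f"SELECT username FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def find_user_safe(username: str):
    # The placeholder keeps the input as data, never as SQL.
    return conn.execute(
        "SELECT username FROM users WHERE username = ?", (username,)
    ).fetchall()


payload = "' or '1'='1"
```

With the payload above, the unsafe version matches every row in the table, while the parameterized version looks for a user literally named `' or '1'='1` and finds none.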
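Validating third-party data can be as simple as the sketch below, which checks a callback from an imaginary payment provider before anything else touches it. The field names and allowed statuses are invented for the example and do not describe any real provider's API.

```python
from typing import Any, Dict

# Assumed set of statuses our own code knows how to handle.
ALLOWED_STATUSES = frozenset({"paid", "failed", "pending"})


def parse_payment_callback(data: Dict[str, Any]) -> Dict[str, Any]:
    """Validate a (hypothetical) payment callback and keep only known fields."""
    order_id = data.get("order_id")
    amount = data.get("amount_cents")
    status = data.get("status")
    if not isinstance(order_id, str) or not order_id.isalnum():
        raise ValueError("invalid order_id")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, int) or amount < 0:
        raise ValueError("invalid amount_cents")
    if not isinstance(status, str) or status not in ALLOWED_STATUSES:
        raise ValueError("invalid status")
    return {"order_id": order_id, "amount_cents": amount, "status": status}
```

Only the validated fields are passed on, so unexpected extras from the external system never reach your models or queries.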
These are some resources to learn more about ways to prevent injection:
API9:2019 - Improper Assets Management
APIs do not usually exist as separate, isolated entities, without any relation or connection to their environment and the outside world. They are like living beings that sit there, listen to requests and eventually return a response. They can also be seen as assets of a company, making enough money for it to survive and grow. APIs and their belongings, like environments, accesses, …, should be treated as an inventory and should all be well-documented and accounted for. In turn, the roles in the system and its data flow should also be documented. Security policies also need their own documentation.
Versioning is an important part of product development, and it shows up concretely in API development, where the routes themselves contain version information, e.g. /v1/products or /v2/consumers. Along the lifecycle of the API, some endpoints become obsolete or deprecated, so they should be decommissioned at some point. Keeping track of these endpoints in older versions of the API helps reduce its attack surface, in case there is a gem for attackers hidden inside. At the same time, fixes for vulnerabilities found in the new endpoints may need to be backported to the old ones. These backports should be taken seriously by QAs, reviewers and testers, and be labeled separately in any project management or ticketing solution to provide more visibility into the matter.
API10:2019 - Insufficient Logging & Monitoring
With enough logging and monitoring, issues can be unearthed and examined before they cause severe damage to the system (API) and to the credibility of the company.
It is really easy to saturate and overwhelm the log server with useless log messages. In terms of security, focus on special log messages, called security events, like failed authentication attempts, denied accesses and input validation errors, and try to log enough data to help uncover the issue. Logs are considered sensitive data, so handle them with care!
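A security event logger can be sketched with the standard logging and json modules. The logger name, the event names and the list of sensitive keys are assumptions for the example; the structured, redacted output is the part worth copying.

```python
import json
import logging

security_log = logging.getLogger("security")

# Assumed names of fields that must never reach the logs in clear text.
SENSITIVE_KEYS = frozenset({"password", "token", "authorization"})


def format_security_event(event: str, **details) -> str:
    """Render a security event as one JSON line, redacting sensitive fields."""
    safe = {
        key: ("[redacted]" if key.lower() in SENSITIVE_KEYS else value)
        for key, value in details.items()
    }
    return json.dumps({"event": event, **safe}, sort_keys=True, default=str)


def log_security_event(event: str, **details) -> None:
    security_log.warning(format_security_event(event, **details))
```

For example, `log_security_event("auth_failed", username="alice", ip="203.0.113.5")` produces a single JSON line that a log aggregator can parse and alert on.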
Implement and configure a monitoring solution to continuously monitor every asset in your system and use a Security Information and Event Management (SIEM) solution to aggregate logs and provide meaningful insights out of them.
Last but not least, do not stick with the default dashboards; customize them to show important security events first.
These are some recommendations, but you know best!
What I’ve tried to achieve in this long article is to inform you of the Top 10 list of vulnerabilities found in today’s APIs that threaten their existence. I’ve mentioned many Python packages, cheatsheets, related resources and software needed for the job. But the developer always knows best. Yeah, common sense can alleviate most of these.
Try to take advice from experts in the field, especially those who contributed to the making of the Top 10 list, and NEVER EVER TRUST USER INPUT!