Security architecture
=====================

So far: security for network communication.
  Sending messages over a network.
  Network might be controlled by the adversary.
    Observe, modify, inject, drop messages.
  Good model for a large network with many participants.
  But also a good model for an adversary targeting a small network.
  Goals: authentication / integrity, confidentiality.
  Know how to achieve security for communication primitives.
  But we ultimately want secure applications.

Next level: security for computers running code.
  Almost never the case that humans directly send/receive messages.
  Instead, devices / computers run code and communicate on behalf of users.
  And servers intermediate the communication, store and process data.
  What kinds of attacks could arise when we consider computers?
  What techniques can help defend against attacks?

Broad discussion of attacks we might worry about, above communication.
  Software bugs.
    Adversary might take advantage of a bug to subvert software.
    As a result, software might not behave like we were expecting.
    Worst case, it will execute arbitrary code from the adversary.
      Malware, key loggers, ...
    Will have an entire module on software security in a few weeks.
  User mistakes.
    Adversary takes advantage of user mistakes.
    Sort of like bugs in the user's head, as opposed to the user's software.
    E.g., phishing attacks.
    Will talk a little bit about this towards the end of the class.
    Good idea to avoid situations where the user could get confused.
  Malicious components controlled by the adversary.
    Could be that there are no bugs, but the adversary can still run software.
    E.g., software on the same cloud provider as the victim.
    E.g., adversary wrote a library that you inadvertently used.
    E.g., adversary sold you a malicious USB storage device.
    E.g., adversary sold you a laptop with a backdoor in its hardware.
  Adversary has some access to the system.
    Adversary guesses the admin's password, or bribes an administrator.
    Adversary gets a job as an administrator, developer, etc.
  One common theme: addressing mistakes and malicious components together.
    Techniques for malicious software work for software bugs or mistakes.
    Techniques for malicious users/admins work for user/admin mistakes.
    Not always, as we'll see in software security.

This module in the class: how to design systems for security.
  This lecture: broad ideas for how to structure the system.
    Overall design elements that help with security.
    But of course, low-level details also matter.
  Next few lectures: focus on specific techniques.
    Isolation, trust, hardware.
  Relevant case study: Google's security architecture.
    [[ Ref: https://cloud.google.com/security/infrastructure/design ]]
  Some of the big ideas: Butler Lampson's overview paper.
    [[ Ref: https://www.microsoft.com/en-us/research/uploads/prod/2020/11/69-SecurityRealIEEE.pdf ]]
  Some of the specific details on how to think about delegation: Taos.
    [[ Ref: https://dl.acm.org/doi/pdf/10.1145/174613.174614 ]]

Goal is typically to limit the damage of malicious components.
  In some cases this means strong guarantees akin to network communication.
    E.g., sharing a server between two unrelated applications.
    Ensure the adversary's application doesn't tamper with our app.
  In other cases this means reducing damage.
    E.g., a bug in the Google Calendar app on my phone.
    Would be great if the adversary could not take over my entire phone.
    Might be difficult to provide strong guarantees about the calendar itself.

Starting point: isolated execution environments.
  Need to run code in a way that the adversary cannot tamper with: "boxes".
  Critical for protecting some parts of the system from adversaries.
  Host enforces isolation.
    Host-enforced isolation is typically subject to some assumptions.
    E.g., correctness of the host is critical for isolation.
    Often configuration is part of the host's correctness.
  Isolation: keeping the bad guy out, and keeping the bad guy in.
  Allows sharing computer resources for unrelated activities.

Many kinds of isolation environments used in practice.
  Processes in an operating system.
    Host is the OS kernel.
  Virtual machines.
    Host is the VMM.
  Software isolation -- JavaScript, WebAssembly.
    Host is the language runtime.
  Physical isolation: separate computers.
    Used for critical applications.
      HSMs holding cryptographic keys.
      USB devices for two-factor authentication.
    Host is "physics".
  Not perfect.
    Physical security is part of the assumption for all of the above.
    Plus some part of the software stack (e.g., OS + runtime for sw isolation).
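Aside: a minimal sketch of process-based isolation in Python, where the OS
kernel (the host) enforces limits on an untrusted child process.  The rlimit
values and the "untrusted.py" script name are made-up examples, and this is
POSIX-only; a real sandbox would also confine the filesystem and network.

  import resource
  import subprocess

  def limit_child():
      # Runs in the child just before exec; the kernel enforces these
      # limits regardless of what the untrusted code later does.
      resource.setrlimit(resource.RLIMIT_CPU, (2, 2))                  # CPU seconds
      resource.setrlimit(resource.RLIMIT_AS, (256 << 20, 256 << 20))   # memory
      resource.setrlimit(resource.RLIMIT_NOFILE, (8, 8))               # open files

  proc = subprocess.run(
      ["python3", "untrusted.py"],
      preexec_fn=limit_child,    # apply limits in the child, not the parent
      capture_output=True,
      timeout=10,                # wall-clock backstop enforced by the parent
  )
  print(proc.returncode, proc.stdout[:100])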
Above isolation: controlled sharing.
  Explicit communication between isolated components.
  Check each request to determine if it should be allowed.

Overall plan for controlled sharing: the "gold standard".
  (A small code sketch of this plan appears after the access control discussion below.)
  Authentication: what principal is issuing the request?
    We've talked about this in the context of authenticating users.
    Often requests come from other components (software, devices, etc.).
    Authentication using the underlying host for local channels.
    Authentication using cryptographic techniques over the network.
  Authorization: is this principal allowed to perform this request?
  Audit: keep a log of requests and decisions.
    Log should ideally be an isolated component.
    Collecting log records and detecting anomalies is commonly done.

Authorization specifies the policy.
  When managing access to many objects, often think of the policy as a matrix.
    Principal vs. object, with cells specifying allowed operations.
    E.g., user IDs and files in a file system.
    E.g., email addresses and docs in Google Docs.
  Often an object's slice of the matrix is stored along with that object:
    an access control list (ACL).

Who sets the policy / permissions?
  Users or applications: discretionary access control (DAC).
    User can set permissions on their files.
    Application can set permissions on objects it creates (files, databases, etc.).
    Benefit: flexible, lets users share data.
    Downside: a buggy or malicious user/app can make permissions too lax.
  Security administrator: mandatory access control (MAC).
    Administrator decides what policy should be enforced.
      E.g., corporate email can be accessed only from company computers, not user phones.
    Benefit: ensures broad policies are enforced.
    Benefit: some defense against malicious/compromised users or applications.
    Downside: hard for the administrator to enforce fine-grained policies.
      Hard to determine what objects matter, especially in large applications.
      E.g., only certain staff can access student financial info in the registrar.
        But where is student financial data stored in the registrar's database?
  DAC/MAC terminology comes from the perspective of a security administrator.
  Fine-grained hybrid plan: role-based access control (RBAC).
    Application specifies coarse units at which policy can be specified ("roles").
    Specifies what roles need access to which underlying objects.
    Administrator can make policy decisions with fine-grained enforcement.
    E.g., fiscal staff in the registrar's application.
      Administrator decides who should be "fiscal staff".
      Application developer specifies which database rows require the "fiscal staff" role.
    "Mandatory" enforcement of policy.
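Aside (the sketch promised above): a reference monitor implementing the gold
standard.  Authentication is assumed to have already happened (via the host or
cryptography), authorization checks a per-object ACL, and every decision is
audited.  All principals, objects, and operations are hypothetical examples.

  import time

  ACLS = {
      # object -> principal -> set of allowed operations
      "grades.db":  {"alice": {"read", "write"}, "bob": {"read"}},
      "payroll.db": {"carol": {"read"}},
  }

  AUDIT_LOG = []  # ideally kept in an isolated, append-only component

  def handle_request(principal, op, obj):
      # `principal` is the output of authentication: e.g., the host tells
      # us which local process sent the request, or we verified a signature.
      allowed = op in ACLS.get(obj, {}).get(principal, set())
      AUDIT_LOG.append((time.time(), principal, op, obj, allowed))
      if not allowed:
          raise PermissionError(f"{principal} may not {op} {obj}")
      return f"performing {op} on {obj}"

  handle_request("alice", "read", "grades.db")   # allowed
  handle_request("bob", "write", "grades.db")    # raises PermissionError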
Delegation.
  Chain of many components (principals) between a user and an object.
  Especially common in distributed systems.
  E.g., fetching an email message from a Gmail account (simplified):
    User -> web browser -> Google front-end -> Gmail app server -> Storage server.
  E.g., employee accessing Google's corporate HR directory:
    Employee -> Google laptop -> HR server -> Database.
  Storage server might have an ACL for objects (email messages, employee records).
    E.g., only nickolai.zeldovich@gmail.com can access this email message.
  But the request is coming from the Gmail app server, not directly from the user.
    Why should the storage server allow the request?

What principal is the request coming from?
  A -> B -> C: is it A or B?
  Perhaps the request is coming from the server principal (B)?
    But then this means the server can access data for any user.
    E.g., a bug in Gmail code means all users' data could be leaked.
  Perhaps the request is coming from the user principal (A)?
    But then this means we can't rely on, say, a trusted laptop.
    Google might want to say that corporate records are accessible to employees
      only when accessed from a trusted company-provided computer.
    E.g., an employee's leaked password cannot be used to access internal systems.
  In general, we can think of the request as coming from a compound principal: "B for A".
    "Gmail for nickolai.zeldovich@gmail.com".
    "Some-corporate-laptop for alice@corp.google.com".
    Allows capturing the dependency chain in the resulting access control policy.
    Compound "for" principal is neither weaker nor stronger than either constituent.
      E.g., Gmail on its own can't access the user's data, but the user alone can't either.
      E.g., the laptop on its own can't access internal Google servers,
        but the user alone can't either.
    In theory, could talk about complete chains from the user to some server:
      Gmail for (Frontend for (Browser for User)).

Two related questions:
  1. How do we deal with these compound principals?
     Concretely, what are the entries in the ACLs?
  2. How do we know if a request from B is actually from the "B for A" principal?
     Concretely, how do we authenticate these compound principals?

Weak answer to 1: "B for A" is just "B".
  Storage server might not even have an explicit ACL for user records.
  Fairly common with web applications.
  Benefit: no need to explicitly define a policy.
    Easy to start writing the application, evolve it over time, etc.
  Benefit: no need to answer question #2, because it doesn't matter.
  Downside: application bugs or compromises are big security problems.
    Storage server cannot enforce fine-grained policy on who reads/writes data.

Stronger answer to 1: "B for A" translates into two levels of access.
  Require the "B" principal to access the database at all.
  But then require the "A" principal to access a specific record.
  E.g., only the Gmail server can connect to the messages DB,
    but then a specific user principal is needed to access a specific message.
  E.g., only trusted corporate laptops can connect to the HR service,
    but then only specific employees can access specific records.

Weak answer to 2: assume B is trusted to say that it speaks as "B for A".
  E.g., the application server is trusted to represent all users.
    Definitionally, the app server speaks for all users in the system.
  Storage server authenticates the app server but does not explicitly authenticate users.
  App server's job is to authenticate the user and authorize requests
    before sending anything to the storage server.

Strong answer to 2: explicit delegation.
  Storage server requires explicit proof that the user asked the app server to do something.
  In Gmail's design, this is derived from the HTTP cookie in the user's web browser.
    Browser gets an HTTP cookie when the user first logs into their Gmail account.
    Browser sends the cookie to the front-end server when issuing a request.
    Front-end server gives the cookie to the Gmail app server (simplified).
    App server gives the cookie to the storage server.
    Storage server can verify that the cookie represents a particular user.
    Will not allow reading data unless the user's cookie is present.
  Benefit: storage server can enforce policy.
    Even if the app is malicious, data won't leak unless the user sends a request.
  Cost: sophisticated authentication / delegation system.
  Cost: writing explicit policies on the storage server.
  Another example of explicit delegation: browser speaks for the user.
    User types password into the browser; Google servers won't trust the browser otherwise.
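Aside: a minimal sketch combining the "two levels of access" with explicit
delegation.  The storage server checks both the app server principal (B) and
a MAC-protected user cookie proving that a specific user (A) asked.  The
cookie format and shared key are illustrative assumptions, not Gmail's
actual mechanism.

  import hashlib
  import hmac

  COOKIE_KEY = b"login-service-secret"   # assumed shared with the login service

  def make_cookie(user):
      # Issued by the login service when the user authenticates.
      mac = hmac.new(COOKIE_KEY, user.encode(), hashlib.sha256).hexdigest()
      return f"{user}:{mac}"

  def storage_read(app_server, cookie, record_owner):
      # Level 1: only the Gmail app server may talk to the messages DB at all.
      if app_server != "gmail":
          raise PermissionError("unknown app server")
      # Level 2: the request must carry proof that the record's owner asked.
      user, mac = cookie.rsplit(":", 1)
      expected = hmac.new(COOKIE_KEY, user.encode(), hashlib.sha256).hexdigest()
      if not hmac.compare_digest(mac, expected) or user != record_owner:
          raise PermissionError("no valid delegation from record owner")
      return f"record of {record_owner}"

  c = make_cookie("nickolai.zeldovich@gmail.com")
  print(storage_read("gmail", c, "nickolai.zeldovich@gmail.com"))  # allowed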
Delegation scope.
  When a user delegates their privileges, what exactly are they delegating?
  Easy to delegate everything: send the user's password (or HTTP cookie).
    Not great in terms of limiting the damage of compromised components.
    E.g., what if the user sent their password to the Gmail app server?
      A compromised app server could not just read email, but also delete photos,
        reset the phone, change the password, ...
  Within a system, relatively easy to add a restriction when delegating.
    Google front-end server can send a restricted version of the HTTP cookie.
      Effectively "for use in the context of the Gmail app".
      Other contexts (e.g., photos, calendar, etc.) will not accept this delegation.
    Can use certificate-like constructions for this.
  Often hard to agree on a delegation format across systems, though.

Capabilities: finer-grained delegation across systems.
  Name of an object represents the right to access the object.
    E.g., a URL that contains a secret hard-to-guess string.
    Anyone with this URL is assumed to have the right to access the object.
  Not just for web URLs, though.
    E.g., Android uses delegation when the Gmail app opens an attachment in a PDF viewer app.
  Great for delegation: can pass an object reference through many layers.
    No need to agree on a format for delegating credentials, etc.
    Fine-grained support: delegating just the object whose name we passed.
  Need to manage the lifetime of capabilities.
    Capabilities mean you can't answer "who has access to this object?".
    Typically capabilities are short-lived.
    Generating a capability requires being on the (long-lived) ACL.
    (See the sketch at the end of these notes.)

Which delegated privileges to use?
  A component might have many delegated privileges.
    E.g., the Gmail app server might have privileges from many different users.
    Fan-in when looking at a graph of delegation between principals.
  When issuing requests, which delegated privileges should be used?
  Dangerous: implicitly use all of the delegated privileges.
    Typical plan with implicit delegation.
    Risk: one user might trick the app into requesting another user's data.
      Storage server has no idea which user really issued the request.
    "Confused deputy" problem.
    Also shows up in the web browser: the browser has the user's credentials for many web sites.
      When opening a link, the browser uses the user's cookies (credentials) for that web site.
      Can trick the user into opening a link with the user's credentials.
  Two solutions: explicitly specify privileges, or use capabilities.
    Explicit: Gmail app server passes the user's cookie to the storage server each time.
    Capabilities: cannot accidentally access data on behalf of another user.

Summary.
  Big architectural ideas for how to improve security.
    Isolation.
    Controlled sharing.
    Delegation.
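Aside (the capability sketch referenced above): short-lived capability
tokens, where minting a capability requires being on the long-lived ACL,
but anyone who later holds the resulting URL can open the object.  Names,
URL format, and lifetime are illustrative assumptions.

  import secrets
  import time

  ACL = {"photo123": {"alice"}}   # long-lived policy: who may mint capabilities
  CAPS = {}                       # token -> (object, expiry time)

  def mint_capability(principal, obj, ttl=300):
      # Only principals on the object's ACL can create a capability for it.
      if principal not in ACL.get(obj, set()):
          raise PermissionError("not on ACL")
      token = secrets.token_urlsafe(32)        # secret, hard-to-guess name
      CAPS[token] = (obj, time.time() + ttl)   # short-lived by default
      return f"https://example.com/obj/{token}"

  def open_capability(url):
      # No principal argument: holding the URL *is* the authorization.
      token = url.rsplit("/", 1)[1]
      obj, expiry = CAPS.get(token, (None, 0))
      if time.time() > expiry:
          raise PermissionError("capability expired or unknown")
      return f"contents of {obj}"

  url = mint_capability("alice", "photo123")
  print(open_capability(url))    # anyone holding the URL can open the object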