Privilege separation
====================

Privilege separation.
  Problem: what if you can't find all of the bugs in your code?
  Plan for compromises.
  Limit the damage of compromise.

Principle of Least Privilege.
  Each component should have the least privileges needed to do its job.

Privilege separation.
  Break up the system into components to profitably apply PoLP.

Challenge: how to split up the application?
  One component for each application feature?
  One component for each user of the application?

  One component for the most likely-to-be-buggy code?
  One component for the most exposed interfaces (attack surface)?
    "Keep the bad guy in a box."
  One component for the code handling most sensitive data?
    "Keep the bad guy out of the box."

Challenge: how to make the application work despite the separation?
  Need to architect interfaces between components.
  These interfaces need to limit privilege but still enable the app to work.
  Side benefit: explicit interfaces typically lead to a simpler, cleaner spec for each component.

Challenge: how to retain good performance?
  Communication between components can make things less efficient.
  Need to take this into account when designing the separation plan.

Relatively easier: how to isolate components?
  VMs, containers, processes, physical machines, ...
  We already saw how to do this in earlier lectures.

Example: logging.
  Data split: log vs everything else.
  Interface: append log entry, read log.
  Probably not "remove log entry".
  Bugs outside the logger cannot delete logs.
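
Sketch: append-only logger interface.
  A minimal Python illustration of the interface above, not a real isolation mechanism.
  The class name AppendOnlyLog and its methods are hypothetical.
  In a real design the logger would run in its own process, VM, or machine;
    a Python class alone does not stop code in the same process from reaching its internals.

```python
class AppendOnlyLog:
    """Logger component: exposes append and read, but no delete or modify."""

    def __init__(self):
        self._entries = []  # state owned by the logger component

    def append(self, entry: str) -> int:
        """Append an entry and return its index in the log."""
        self._entries.append(entry)
        return len(self._entries) - 1

    def read(self) -> tuple:
        """Return an immutable snapshot so callers cannot edit the log through it."""
        return tuple(self._entries)


log = AppendOnlyLog()
log.append("user alice logged in")
log.append("payment 22 initiated")
snapshot = log.read()
```

  The point is what the interface omits: there is no "remove" call for a buggy caller to misuse.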

Example: cryptographic key management (smartcard, HSM).
  Data split: signing key vs everything else.
  Interface: sign, get public key.
  Maybe generate fresh key.
  Maybe get a log of signed messages.
  Probably not "get private key".
  Bugs outside the HSM cannot leak private key.
    But can probably sign lots of messages.

Example: user authentication devices (cryptocurrency hardware wallets, U2F).
  Data split: same as above, signing key vs everything else.
  Interface: sign this message.
  Device gets user approval to sign the message.
  Bugs outside the device cannot leak private key.
    And can't even sign many messages -- user would need to approve.
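
Sketch: signing interface with user approval.
  A minimal Python illustration of the two signing examples above.
  Assumptions: HMAC stands in for a public-key signature (the Python standard library
    has no asymmetric signing), so verification also happens inside the device here.
  SigningDevice and user_approves are hypothetical names.
  An HSM-style component would correspond to an approval callback that always says yes.

```python
import hashlib
import hmac
import os


class SigningDevice:
    """Holds a key; exposes sign and verify, but never the key itself."""

    def __init__(self, approve):
        self._key = os.urandom(32)  # generated on the device, never exported
        self._approve = approve     # hypothetical stand-in for a button press

    def sign(self, message: bytes):
        """Return a tag for message, or None if the user declines."""
        if not self._approve(message):
            return None
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def verify(self, message: bytes, tag: bytes) -> bool:
        """Check a tag; done on-device only because HMAC is symmetric."""
        expected = hmac.new(self._key, message, hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)


def user_approves(message: bytes) -> bool:
    # Hypothetical policy: the user only approves login requests.
    return message.startswith(b"login:")


device = SigningDevice(user_approves)
tag = device.sign(b"login:example.com")
refused = device.sign(b"transfer:all-funds")
```

  Compromised code outside the device can request signatures, but each one needs approval.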

Example: media codecs (JPEG, MPEG, etc) in a web browser.
  Data split: media data vs everything else.
  Interface: decode media, produce image / video output.
  Bugs in the codec cannot get any data other than the media file.
  Challenge: compatibility with existing codec interfaces.
    Passing pointers, callbacks, memory allocation, etc.
  [[ Ref: https://www.usenix.org/system/files/sec20-narayan.pdf ]]

Example: front-end vs back-end code in web application.
  Data split: TLS key + DoS protection + logging vs application data.
  Interface: execute this HTTP request.
  Front-end responsible for TLS, parsing requests, DoS protection, logging, etc.
  Back-end responsible for application-specific logic and data.
  Bugs in application code cannot get TLS key, corrupt log, etc.
  Bugs in front-end cannot (directly) get application-level data.

Example: payment processing.
  Data split: financial / payment information vs everything else.
  Interface: initiate a charge; list payments.
  Maybe "refund payment".
  Probably not "send money to address".
  Probably not "list saved credit card numbers".
  Bugs in payment system cannot access any application data.
  Bugs in application cannot compromise payment data or send out money (but what about refunds?).
  Potential bug: inconsistent context (such as payment amount).
    Application -> user: pay $10 for txn 22.
    User -> payment system: pays $1 for txn 22.
    Payment system -> application: txn 22 was paid.
    [[ Ref: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/caas-oakland-final.pdf ]]
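
Sketch: binding the amount into the payment confirmation.
  One plausible way to avoid the inconsistent-context bug above, not the method from the reference.
  The payment system authenticates (txn, amount) together; the application checks both.
  Assumptions: a shared HMAC key between payment system and application, and the
    hypothetical function names confirm_payment and accept_confirmation.

```python
import hashlib
import hmac
import json


def confirm_payment(key: bytes, txn_id: int, amount_cents: int):
    """Payment system side: state exactly what was paid, then authenticate it."""
    body = json.dumps({"txn": txn_id, "amount_cents": amount_cents},
                      sort_keys=True).encode()
    tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    return body, tag


def accept_confirmation(key: bytes, body: bytes, tag: str,
                        expected_txn: int, expected_amount_cents: int) -> bool:
    """Application side: verify the tag, then compare against its own records."""
    expected_tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected_tag):
        return False
    claim = json.loads(body)
    return (claim["txn"] == expected_txn
            and claim["amount_cents"] == expected_amount_cents)


key = b"shared-secret-for-illustration-only"
# User pays $1 for txn 22, but the application asked for $10.
body, tag = confirm_payment(key, 22, 100)
underpaid_accepted = accept_confirmation(key, body, tag, 22, 1000)
```

  A confirmation that only says "txn 22 was paid" would have been accepted here.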

Example: user authentication to a server.
  Data split: credentials vs application data.
    Components must agree on usernames.
  Interface: authentication server signs message when user logs in.
    Strawman: token, sign_AS(token || username || timestamp)
    Ambiguous: token || Bob || 15-Nov-2021 == token || Bob1 || 5-Nov-2021
    Critical to be explicit about input structure for hashing / signatures.
    Keeps happening.
      Flickr authentication bug: http://netifera.com/research/flickr_api_signature_forgery.pdf
      Amazon authentication bug: http://www.daemonology.net/blog/2008-12.html
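
Sketch: ambiguous vs unambiguous input encoding.
  A minimal Python illustration of the ambiguity above and one common fix (length prefixes).
  Assumptions: HMAC stands in for sign_AS, and the function names are hypothetical.

```python
import hashlib
import hmac


def encode_naive(fields):
    """Plain concatenation: field boundaries are lost."""
    return b"".join(fields)


def encode_length_prefixed(fields):
    """Prefix each field with its 4-byte length so boundaries are explicit."""
    out = b""
    for field in fields:
        out += len(field).to_bytes(4, "big") + field
    return out


def sign(key: bytes, encoded: bytes) -> bytes:
    # Stand-in for sign_AS; a real authentication server would likely use
    # a public-key signature.
    return hmac.new(key, encoded, hashlib.sha256).digest()


key = b"authentication-server-key-for-illustration"
bob = [b"token", b"Bob", b"15-Nov-2021"]
bob1 = [b"token", b"Bob1", b"5-Nov-2021"]

naive_collides = sign(key, encode_naive(bob)) == sign(key, encode_naive(bob1))
prefixed_collides = (sign(key, encode_length_prefixed(bob))
                     == sign(key, encode_length_prefixed(bob1)))
```

  With plain concatenation, a signature issued for Bob also verifies for Bob1 with a different date.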

Example: web applications in the browser.
  How to safely open an attachment in Gmail?
  Privilege separation: open attachment in separate isolation "box".
  Browser unit of isolation is (effectively) a domain name.
  Google uses googleusercontent.com for untrusted data like attachments.
  Browser opens attachment in isolation from Gmail's own domain.
  If something goes wrong and the attachment runs JavaScript code in the user's browser,
    at least it can't access Gmail state (or state of any other interesting domain).

Example: web application logic.
  Backend database has many users' data.
  How to reduce privileges of web application code?
  Split by user (each user's requests handled by separate process)?
    Maybe a good fit if users do not need to access each other's data.
    Probably authentication would happen outside of per-user logic.
    Ensures we don't miss authentication checks.
  Split by function (messaging, view profiles, ..).
    Maybe a good fit if there's sharing.
    Bugs in profile-viewing code cannot steal messages, etc.
  DB front-end to limit database access.
    Authenticate individual components (per user or per function).
    Policy to determine what queries are allowed from each component.
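
Sketch: DB front-end with a per-component policy.
  A minimal Python illustration of the split-by-function policy above.
  Assumptions: the component name has already been authenticated (not shown), the
    "database" is an in-memory dict, and DBFrontEnd / POLICY are hypothetical names.

```python
# Which (table, action) pairs each component may use.
POLICY = {
    "messaging": {("messages", "read"), ("messages", "write")},
    "profiles": {("profiles", "read")},
}


class DBFrontEnd:
    """Sits between application components and the database."""

    def __init__(self, policy, tables):
        self._policy = policy
        self._tables = tables

    def query(self, component, table, action, key, value=None):
        # Deny by default: unknown components get an empty permission set.
        if (table, action) not in self._policy.get(component, set()):
            raise PermissionError(f"{component} may not {action} {table}")
        if action == "read":
            return self._tables[table].get(key)
        if action == "write":
            self._tables[table][key] = value
            return None
        raise ValueError(f"unknown action: {action}")


db = DBFrontEnd(POLICY, {
    "messages": {"alice": "hi bob"},
    "profiles": {"alice": "Alice, class of 2024"},
})
profile = db.query("profiles", "profiles", "read", "alice")
try:
    db.query("profiles", "messages", "read", "alice")
    leaked = True
except PermissionError:
    leaked = False
```

  A bug in the profile-viewing component can still misuse profile reads, but not read messages.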

Quick sanity-check: how to organize logging?
  Application --> Payment
           \----> Logging
    or
  Application --> Payment --> Logging

  What about web app DB instead of payments?
    Could go either way; probably both.

  Web applications: database vs app logic, a la OKWS.
    Different app logic features might have their own DB credentials.

Example: Google Chrome.
  Separate process for GPU interactions: buggy, complex code.
    Similar to codec sandboxing in some ways.
  Separate process for every site.
    Bugs triggered by one site cannot steal another site's data.
    Big challenge in designing interface: many interactions between sites.
    See https://www.chromium.org/Home/chromium-security/site-isolation
  Separate process for top-level UI ("chrome").
    Address bar, bookmarks, etc.
    Ensures user knows what site they are interacting with.
  Components: https://docs.google.com/drawings/d/1TuECFL9K7J5q5UePJLC-YH3satvb1RrjLRH-tW_VKeE
  Sandboxing machinery: https://www.chromium.org/Home/chromium-security/guts

Example: NTP server.
  [[ Ref: http://www.openbsd.org/papers/ven05-deraadt/mgp00035.html ]]
  NTP server needs root privileges to change system time.
  NTP server also needs to talk over the network to synchronize time.
  Bugs could allow adversary to take control of NTP process, get root access.
  What are the important components?
    Port 123 for NTP packets.
    Processing network packets.
    Setting the time.
  Privilege-separated design.
    Initial process: binds to port 123, spawns two children.
    Time-setting process: runs as root, performs time adjustments.
    Network process: runs as NTP user (no privileges), talks NTP on the network.
    API: network process can ask time-setter to adjust time.
      Time-setter can implement some checks, like bounding big time jumps, etc.
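
Sketch: time-setter checks on requests from the network process.
  A minimal Python illustration of the bounding check above, using a simulated clock.
  Assumptions: the 600-second bound, the TimeSetter class, and its adjust method are
    hypothetical; a real time-setter would run as root in a separate process and
    receive requests over a pipe or socket.

```python
import math

MAX_STEP_SECONDS = 600.0  # hypothetical policy: refuse jumps over 10 minutes


class TimeSetter:
    """Privileged side: validates every request before touching the clock."""

    def __init__(self, clock: float = 0.0):
        self.clock = clock  # simulated system clock, in seconds

    def adjust(self, offset_seconds) -> bool:
        """Apply the offset if it passes the checks; report whether it did."""
        # Treat the request as untrusted: the network process may be compromised.
        if not isinstance(offset_seconds, (int, float)):
            return False
        if not math.isfinite(offset_seconds):
            return False
        if abs(offset_seconds) > MAX_STEP_SECONDS:
            return False
        self.clock += offset_seconds
        return True


setter = TimeSetter(clock=1000.0)
small_ok = setter.adjust(0.25)
huge_ok = setter.adjust(1e9)
```

  A compromised network process can still nudge the clock, but only within the bound per request.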

Example: OpenSSH.
  What are the important components?
    Port 22.
    Exposure to arbitrary network inputs.
    User authentication.
    Running as the user.
    Network crypto protocol.
    User session.
  Sophisticated design.
    "Monitor" process runs as root.
      Has access to port 22, private key, etc.
    Fresh worker process handles incoming network connection.
      No privileges, no access to files, just the network socket.
      Need three main APIs from monitor:
        Sign key exchange (since worker has no access to private key).
        Authenticate user (password, key-auth, etc).
        Start user session (after authentication), passing crypto state.
    Another worker for user session.
      Runs with user's privileges, spawns user shell, etc.
      Starts with imported crypto state from network worker.
      Allows this worker to continue speaking crypto protocol on network.
  [[ Ref: https://www.usenix.org/legacy/event/sec03/tech/full_papers/provos_et_al/provos_et_al.pdf ]]
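
Sketch: monitor API that enforces ordering.
  A toy Python illustration of the three monitor APIs above, not OpenSSH's actual code.
  It shows one property: the monitor refuses to start a session before authentication succeeds,
    so a compromised network worker cannot skip that step.
  Assumptions: HMAC stands in for the host-key signature, passwords are a plaintext dict,
    and the Monitor class and its method names are hypothetical.

```python
import hashlib
import hmac
import os


class Monitor:
    """Privileged side: holds the host key and the authentication state."""

    def __init__(self, passwords):
        self._host_key = os.urandom(32)  # never handed to the worker
        self._passwords = passwords      # toy credential store
        self._authenticated_user = None

    def sign_key_exchange(self, exchange_hash: bytes) -> bytes:
        """Sign on the worker's behalf (HMAC as a stand-in signature)."""
        return hmac.new(self._host_key, exchange_hash, hashlib.sha256).digest()

    def authenticate(self, user: str, password: str) -> bool:
        """Record the user only if the credentials check out."""
        expected = self._passwords.get(user)
        ok = expected is not None and hmac.compare_digest(
            expected.encode(), password.encode())
        if ok:
            self._authenticated_user = user
        return ok

    def start_session(self, crypto_state: bytes) -> dict:
        """Hand off to a user-privileged worker; the monitor picks the user."""
        if self._authenticated_user is None:
            raise PermissionError("authentication required before session")
        return {"user": self._authenticated_user, "crypto_state": crypto_state}


monitor = Monitor({"alice": "s3cret"})
try:
    monitor.start_session(b"state")
    skipped_auth = True
except PermissionError:
    skipped_auth = False
```

  The worker never names the session user; the monitor uses the identity it verified itself.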

Example: mail servers (qmail, Postfix).
  [[ Ref: http://www.postfix.org/OVERVIEW.html ]]
  [[ Ref: http://cr.yp.to/qmail/pictures.html ]]
  What matters?
    Port 25.
    Exposure to arbitrary network inputs.
    Confidential email messages.
    Durability of emails (don't lose them).
  How to design?
    Core data: queue of messages, typically implemented as a directory.
    Privileged process accepts connections, passes them to worker.
    Worker accepts message, writes it to queue.
      Can only create new file; cannot read, write, or delete existing files.
      Adversary exploiting bug can at most inject garbage message into queue.
    Another worker monitors queue to handle messages.
      Spawns worker to handle each queued message.
      Independent of other messages: a compromised worker can at most corrupt handling of its own message.
    In practice more complications, with more isolation domains.
      Local delivery vs remote delivery.
      Filtering of messages (spam filtering, processing rules, etc).
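
Sketch: queue writes that can only create new files.
  A minimal Python illustration of the "create new file only" step above.
  O_CREAT | O_EXCL makes the open fail if the file already exists, so this code path
    never overwrites a queued message.
  Assumptions: create_new and enqueue are hypothetical names, and this is not how qmail
    or Postfix name their queue files. Keeping a compromised worker from reading or
    deleting other files needs OS-level permissions, which this sketch does not set up.

```python
import os
import tempfile
import uuid


def create_new(queue_dir: str, name: str, message: bytes) -> str:
    """Write message to a brand-new file; raise FileExistsError otherwise."""
    path = os.path.join(queue_dir, name)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(message)
    return path


def enqueue(queue_dir: str, message: bytes) -> str:
    """Pick a fresh random name and add the message to the queue."""
    return create_new(queue_dir, uuid.uuid4().hex, message)


with tempfile.TemporaryDirectory() as queue_dir:
    first = enqueue(queue_dir, b"Subject: hello\n\nbody\n")
    try:
        # Simulate a buggy or malicious attempt to replace a queued message.
        create_new(queue_dir, os.path.basename(first), b"overwrite attempt")
        overwrote = True
    except FileExistsError:
        overwrote = False
    with open(first, "rb") as f:
        stored = f.read()
```

  The overwrite attempt fails and the queued message is unchanged.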

How would you privilege-separate MIT's registrar system?
  Websis, etc.
  What might be the important components?
    Enrollment / registration.
    Grades.
    Logs.
    User authentication.
    Access control: who defines TAs, course staff, etc.
    Interact with financial system?
      Might not allow registration if your account is unpaid?
      Might change tuition based on a heavy or light registration load?