DS 593: Privacy-Conscious Computer Systems

Projects

Final projects should seek to answer a research question through implementation of a new idea in a real system. This could take one of several forms:

Prototype a new, privacy-centered system design.
Apply a privacy-enhancing or privacy-preserving technique in an existing system, and measure its impact.
Conduct a study of privacy risks and deficiencies in existing software, and analyze what it would take to address them.

You may work on projects individually, or in groups of two to four students. Your project deliverables include a proposal, a progress report, a final paper describing design and implementation, your code, and a presentation. I will post the final presentation and writeup to the course website (unless you explicitly want it kept confidential for a good reason).

Important dates

Friday, October 3, 2025: submit your project proposal (by 11pm).
Friday, November 7, 2025: submit an individual progress report via Google form (by 11pm).
Thursday, December 4, 2025 and Tuesday, December 9, 2025: presentation and demo (in class).
Friday, December 12, 2025: submit your code and final report (by 11pm).

Project proposal

Please use the OSDI 2024 submission template. Your proposal should be a one- to two-page summary of what your idea is, how you plan to go about investigating it, and what techniques you will apply (or need to learn about beyond the course material). Your proposal should roughly have the following structure:

Problem statement & motivation: What is the problem you are trying to solve? Why is it important? You can describe an example of how your work can be used in practice in some application, or an example of why existing approaches do not work or are harmful to privacy.
Idea: What is your high level idea? Is it a new system? A new approach to designing applications? A user study? Tell me a little about your proposed design: the components you think you will have and what their roles are. You can use a rough figure if it is helpful to you.
Related work/relevant techniques: Is this idea related to papers in the course (or papers you read elsewhere)? How so? What kind of techniques and skills will you need to learn and apply during the project?
Evaluation plan: What questions determine whether your approach is successful (usually 1-3 questions)? What is your plan for evaluating these questions? This is usually done via some kind of experiment. Tell me a little about your experiment scenario and what you will measure in the experiment.
Project plan: What deliverables will you produce at the end of the project? Are any components "reach-goals"? If you are working in a group, give me a rough estimate for how you will divide the work fairly between group members.

Please send your project proposal drafts by email to babman@bu.edu by Friday, October 3, 2025 (11pm).

The first proposal you submit is a draft. I will give you feedback and likely ask you to make some modifications to it, to help you make sure it has the right scope and is neither too small nor too ambitious. After you address the feedback, you will send me an updated proposal, and I will approve it when it's satisfactory.

Project ideas

Here's a list of some starter ideas to get you thinking. Please feel free to pursue your own ideas!

Reproducing results is an important part of research, and an awesome way to really learn how something works! Pick any of the systems we learn about in the course and implement your own version of it. Then try to reproduce their results! You may wish to simplify some aspects of the system to make the reproduction practical within the time available.

The best way to evaluate whether a system or technique is effective is to apply it to a real-world application! A thorough case study entails comparing the performance of the modified application relative to the original, reasoning about the application-level guarantees provided by the system, and discussing the overall experience of applying the system. Here are some example ideas:

Apply K9db or RuleKeeper to existing web applications.
Apply the methodology of the GDPR comparative evaluation to Sesame or Riverbed.
Build a censorship-resilient application on top of Nostr, e.g., GitHub or Piazza.
Implement a differentially-private version of RateMyProfessor (or similar review aggregating applications).
Build a reusable MPC survey application (e.g. an MPC version of Google Forms or SurveyMonkey).
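For the differentially private review aggregator idea above, one possible starting point is the Laplace mechanism applied to a bounded average. The sketch below is only an illustration under stated assumptions: ratings are clamped to [1, 5], each reviewer contributes at most one rating per professor, the number of ratings is treated as public, and neighboring datasets differ by replacing one rating. The function names and parameters are hypothetical and are not taken from any system covered in the course.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials.

    Floating-point Laplace sampling has known side channels, so this is
    for illustration only and not for production use.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_average_rating(ratings, epsilon, lo=1.0, hi=5.0, rng=None):
    """Return an epsilon-DP estimate of the mean rating (assumptions above)."""
    if not ratings:
        raise ValueError("need at least one rating")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()

    # Clamp so that a single rating has bounded influence on the mean.
    clamped = [min(max(r, lo), hi) for r in ratings]
    n = len(clamped)

    # Replacing one rating changes the mean by at most (hi - lo) / n.
    sensitivity = (hi - lo) / n
    noisy = sum(clamped) / n + laplace_noise(sensitivity / epsilon, rng)

    # Clamping the output is post-processing and does not weaken the guarantee.
    return min(max(noisy, lo), hi)
```

A real project would also need to account for the privacy budget consumed across repeated queries, which this sketch does not attempt.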

Several of the systems we read about have complementary guarantees or cover different parts of the web ecosystem. Perhaps they can be combined to offer stronger protections! Some examples include: (1) combining K9db and Sesame to create a complete GDPR compliance package, (2) applying IFC techniques to MPC frameworks to ensure computations are allowed and secure, and (3) tracking differential privacy budgets via Sesame policies.

In Resin and Sesame, each sensitive datum is stored within a privacy container with an associated policy. The datum is outside the application's direct reach and can only be manipulated through the system, which ensures that the system keeps track of the datum's policy.

These systems require developers to specify their desired policies in Python or Rust, and manually review filter objects or review and sign critical regions. Some ideas to reduce this effort include:

Create a small domain-specific language (DSL) for formally expressing privacy policies, including generating APIs for combining these policies and reasoning about their exact meaning.
Create a mechanism for verifying that a filter object or privacy region satisfies the requirements of the associated policy and context, e.g. by automatically analyzing the code via a theorem prover (e.g. Z3) or Rust verification tools (e.g. Verus).
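To make the container-and-policy idea above more concrete, here is a toy Python sketch of a value wrapped with a policy, plus one way policies might be combined. It is not the API of Resin or Sesame: every name here (`PolicyContainer`, `OwnerOnly`, `AllowedSink`, `And`, `release`) is hypothetical, and Python cannot truly prevent code from reaching into the container, which is part of what the real systems address.

```python
from dataclasses import dataclass
from typing import Any, Callable, Protocol


class Policy(Protocol):
    def check(self, context: dict) -> bool: ...


class PolicyViolation(Exception):
    """Raised when data would be released in a context its policy forbids."""


@dataclass(frozen=True)
class OwnerOnly:
    owner: str

    def check(self, context: dict) -> bool:
        return context.get("viewer") == self.owner


@dataclass(frozen=True)
class AllowedSink:
    sink: str

    def check(self, context: dict) -> bool:
        return context.get("sink") == self.sink


@dataclass(frozen=True)
class And:
    """Conjunction: both sub-policies must allow the release."""
    left: Policy
    right: Policy

    def check(self, context: dict) -> bool:
        return self.left.check(context) and self.right.check(context)


class PolicyContainer:
    """Holds a value together with its policy (hypothetical sketch)."""

    def __init__(self, value: Any, policy: Policy) -> None:
        self.__value = value      # name-mangled: a convention, not real isolation
        self.__policy = policy

    def map(self, fn: Callable[[Any], Any]) -> "PolicyContainer":
        # Derived data keeps the policy of the data it came from.
        return PolicyContainer(fn(self.__value), self.__policy)

    def combine(self, other: "PolicyContainer",
                fn: Callable[[Any, Any], Any]) -> "PolicyContainer":
        # Data derived from two sources must satisfy both policies.
        return PolicyContainer(fn(self.__value, other.__value),
                               And(self.__policy, other.__policy))

    def release(self, context: dict) -> Any:
        # The only exit point: the policy is checked against the context.
        if not self.__policy.check(context):
            raise PolicyViolation(f"policy rejects context {context!r}")
        return self.__value

    def __repr__(self) -> str:
        return "PolicyContainer(<redacted>)"
```

A DSL project could start from combinators like `And` and ask what other operators are needed and what their precise semantics should be.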

Systems like Resin and Sesame track policies and data within the application process and ensure data leaves the application only when the policy allows it. This is not ideal for large-scale applications, where data frequently leaves the boundary of a process (but not the application), for example, when the application is distributed, or when it uses underlying data storage and processing systems.

Extend the protections of a policy enforcement system to cover the entire application rather than a single process. You can do this by developing new wrappers and shims for popular systems like memcached or Amazon S3, or by modifying existing systems, such as SQL databases, to track and combine policies as they process data.

Alternatively, help us improve and evaluate a new Sesame-based system for enforcing policies in distributed and micro-service-based applications, by applying this new system to realistic applications.
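For the wrapper-and-shim option above, one way to picture the idea is a thin layer that stores a serialized policy next to each value, so that a different process reading the same key can re-check the policy. The sketch below is an assumption-laden illustration: a plain `dict` stands in for memcached or S3, the policy is just a list of allowed viewers, and `PolicyTaggedStore` is a hypothetical name rather than part of any existing system.

```python
import json


class PolicyTaggedStore:
    """Hypothetical shim: every value is stored together with its policy."""

    def __init__(self, backend) -> None:
        # `backend` is any dict-like mapping from str keys to bytes values;
        # in a real shim this would be a client for an external store.
        self._backend = backend

    def put(self, key: str, value, allowed_viewers) -> None:
        record = {
            "value": value,
            "policy": {"allowed_viewers": sorted(allowed_viewers)},
        }
        self._backend[key] = json.dumps(record).encode("utf-8")

    def get(self, key: str, viewer: str):
        record = json.loads(self._backend[key].decode("utf-8"))
        # Re-check the policy that travelled with the data.
        if viewer not in record["policy"]["allowed_viewers"]:
            raise PermissionError(f"{viewer!r} may not read {key!r}")
        return record["value"]
```

This only helps if every process goes through the shim; anything that reads the backend directly bypasses the check, which is one of the questions such a project would need to address.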

We read a paper on evaluating the usability of DP tools in the class, which studies how data practitioners use existing DP tools and libraries, and how effective these tools are at helping practitioners correctly apply DP and understand its various parameters and guarantees.

Design a similar study to evaluate the usability and effectiveness of existing systems for application developers. For example, you could evaluate two or three existing MPC frameworks, or similar IFC systems.

Alternatively, you can help us continue development of Carousels, a resource estimation tool that aims to help non-experts implement correct and efficient MPC programs, and evaluate the effectiveness of this tool via a user study.

If you are considering a project of this kind, we suggest you take a look at the DP tools paper. We will hold your project to a similar standard in terms of study design, methodology, and analysis, although we naturally expect that you will have far fewer participants.

Note that the study design needs to be completed several weeks ahead of project presentations, so that participants have enough time to complete the study and you have enough time to analyze the collected data and form conclusions.

DELF and K9db help applications comply with the GDPR right to deletion. However, these systems have some limitations. You can make them better!

Investigate an alternative design to K9db that does not create duplicate copies for records that are jointly owned, e.g., using wrapped encryption keys.
Extend K9db to support account and data recycling, by setting a time-to-live for pieces of data, and automatically deleting unused data that exceeds that threshold.
Investigate how to help applications delete user data effectively when that data may have already been used to train machine learning models.
Investigate transitive data deletion across organizations, e.g., when an organization shares data with other services or organizations, or sells the data to third-party brokers.

These ideas may entail extending K9db's open-source implementation. Familiarity with C++, or a willingness to learn it, is helpful.
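As a rough picture of the time-to-live idea in the list above, the sketch below deletes rows that have not been used within a per-table threshold. It uses Python's built-in `sqlite3` module purely for illustration and is not K9db's interface; the `last_used` column, the `purge_expired` function, and the per-table TTL mapping are all assumptions made for this example.

```python
import sqlite3


def purge_expired(conn: sqlite3.Connection, table_ttls: dict, now: float) -> dict:
    """Delete rows whose `last_used` timestamp is older than the table's TTL.

    `table_ttls` maps a table name to a TTL in seconds. Table names cannot be
    bound as SQL parameters, so they must come from trusted configuration,
    never from user input.
    """
    deleted = {}
    for table, ttl in table_ttls.items():
        cutoff = now - ttl
        cur = conn.execute(
            f"DELETE FROM {table} WHERE last_used < ?", (cutoff,)
        )
        deleted[table] = cur.rowcount  # number of rows removed from this table
    conn.commit()
    return deleted
```

A real extension would also have to decide what counts as "use" of a record and how deletion of one record should cascade to data owned jointly with other users.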

Feel free to come up with your own ideas. We will help you tease these ideas apart and scope them properly. Come to office hours and tell us about a paper you are really excited by or a problem that you find motivating, and we can take it from there!