# Computer Science Theory at Northeastern

## Theory Seminar

This semester the theory seminar will be every Tuesday from 3:30-5:00pm in ISEC 142 . The Theory Seminar will occur every week whether or not we have a speaker. If we do not have a speaker, we will have short presentations from students or faculty. Write to Jonathan Ullman with suggestions for speakers!

## Spring Semester, 2018

 Jan 16, 20183:30PM Steven Wu, MSR A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem 142 ISEC Abstract Bandit learning is characterized by the tension between long-term exploration and short-term exploitation. However, as has recently been noted, in settings in which the choices of the learning algorithm correspond to important decisions about individual people (such as criminal recidivism prediction, lending, and sequential drug trials), exploration corresponds to explicitly sacrificing the well-being of one individual for the potential future benefit of others. This raises a fairness concern. In such settings, one might like to run a “greedy” algorithm, which always makes the (myopically) optimal decision for the individuals at hand — but doing this can result in a catastrophic failure to learn. In this paper, we consider the linear contextual bandit problem and revisit the performance of the greedy algorithm. We give a smoothed analysis, showing that even when contexts may be chosen by an adversary, small perturbations of the adversary’s choices suffice for the algorithm to achieve “no regret”, perhaps (depending on the specifics of the setting) with a constant amount of initial training data. This suggests that “generically” (i.e. in slightly perturbed environments), exploration and exploitation need not be in conflict in the linear setting. Jan 23, 20183:30PM Michael Mitzenmacher Cuckoo Filters, and Adaptive Cuckoo Filters 142 ISEC Abstract We present cuckoo filters, an efficient data structure for approximate set membership that improves on the well-known Bloom filter. We then discuss recent work on how to make cuckoo filters adaptive in response to false positives, which can be important for many practical problems. Jan 30, 20183:30PM Audra McMillan, University of Michigan Local Differential Privacy for Physical Sensor Data and Sparse Recovery 142 ISEC Abstract Physical sensors (thermal, light, motion, etc.) are becoming ubiquitous and offer important benefits to society. However, allowing sensors into our private spaces has resulted in considerable privacy concerns. Differential privacy has been developed to help alleviate these privacy concerns. In this talk, we’ll develop and define a framework for releasing physical data that preserves both utility and provides privacy. Our notion of closeness of physical data will be defined via the Earth Mover Distance and we’ll discuss the implications of this choice. Physical data, such as temperature distributions, are often only accessible to us via a linear transformation of the data. We’ll analyse the implications of our privacy definition for linear inverse problems, focusing on those that are traditionally considered to be "ill-conditioned”. We’ll then instantiate our framework with the heat kernel on graphs and discuss how the privacy parameter relates to the connectivity of the graph. Our work indicates that it is possible to produce locally private sensor measurements that both keep the exact locations of the heat sources private and permit recovery of the general geographic vicinity'' of the sources. Joint work with Anna C. Gilbert. Feb 13, 20183:30PM Jack Doerner, Northeastern University Scaling ORAM for Secure Computation 142 ISEC Abstract We design and implement a Distributed Oblivious Random Access Memory (DORAM) data structure that is optimized for use in two-party secure computation protocols. We improve upon the access time of previous constructions by a factor of up to ten, their memory overhead by a factor of one hundred or more, and their initialization time by a factor of thousands. We are able to instantiate ORAMs that hold 2^34 bytes, and perform operations on them in seconds, which was not previously feasible with any implemented scheme. Unlike prior ORAM constructions based on hierarchical hashing, permutation, or trees, our Distributed ORAM is derived from the new Function Secret Sharing scheme introduced by Boyle, Gilboa and Ishai. This significantly reduces the amount of secure computation required to implement an ORAM access, albeit at the cost of O(n) efficient local memory operations. We implement our construction and find that, despite its poor O(n) asymptotic complexity, it still outperforms the fastest previously known constructions, Circuit ORAM and Square-root ORAM, for datasets that are 32 KiB or larger, and outperforms prior work on applications such as stable matching or binary search by factors of two to ten. Feb 27, 20183:30PM Albert Cheu, Northeastern University Skyline Identification in Bandits 142 ISEC Abstract The multi-armed bandit setting is a well-studied learning model. Within this setting, the approximate best arm identification problem has been solved optimally, using a clever technique to avoid a union bound. In our work, we pose a variant problem and present an optimal algorithm, which also avoids a union bound. We give intuition for the solution and, time permitting, the matching impossibility result. Joint work with Ravi Sundaram and Jonathan Ullman. March 20, 20183:30PM Ryan Williams Circuit Lower Bounds for Nondeterministic Quasi-Polytime 142 ISEC Abstract I will give an overview of the recent proof (with my PhD student Cody Murray) that the class NQP = NTIME$(n^{polylog n})$ does not have ACC circuits of $n^{log n}$ size (even with a layer of threshold gates at the bottom). This is achieved by strengthening a key component of the proof of NEXP not in ACC; in particular, we prove a new "Easy Witness Lemma" for NQP. March 27, 20183:30PM Nithin Varma, Boston University Property testing in the presence of erased data 142 ISEC Abstract Property testing is a formal framework to design algorithms for large datasets. The traditional definition of property testing [Rubinfeld & Sudan '96, Goldreich, Goldwasser & Ron '98] abstracts datasets as functions and assumes that an algorithm (also called tester) can query points in the domain of a function to obtain the corresponding values. The task of the tester is to distinguish, with probability 2/3, between the case that the function f being queried belongs to a set (property) P and the case that f is `far' from every function in P, for an appropriate measure of distance. One of the key assumptions in the above definition, that the tester has access to function values at all of the queried domain points, is not applicable to most real-life datasets. It could be the case that some or several of the function values are kept hidden for privacy reasons, or erased by mistake or by an adversary. Querying such erased points does not provide an algorithm with any information about the function. Moreover, it might happen that, an algorithm, by not knowing the locations of these erasures, makes a lot of queries to erased points and turns out entirely useless. In the talk, I will describe the erasure-resilient property testing model that we defined to account for the occurrence of erasures in datasets. We will then present and analyse our erasure-resilient tester for monotonicity of functions. Joint work with Kashyap Dixit, Sofya Raskhodnikova and Abhradeep Thakurta. April 3, 20183:30PM Aanchal Malhotra, Boston University A Universally Composable Treatment of Network Time 142 ISEC Abstract The security of almost any real-world distributed system today depends on the participants having some "reasonably accurate" sense of current real time. Indeed, to name one example, the very authenticity of practically any communication on the Internet today hinges on the ability of the parties to accurately detect revocation of certificates, or expiration of passwords or shared keys. However, as recent attacks show, the standard protocols for determining time are subvertible, resulting in wide-spread security loss. Worse yet, we do not have security notions for network time protocols that (a) can be rigorously asserted and (b) rigorously guarantee security of applications that require a sense of real time. We propose such notions, within the universally composable (UC) security framework. That is, we formulate ideal functionalities that capture a number of prevalent forms of time measurement within existing systems. We show how they can be realized by real-world protocols, and how they can be used to assert security of time-reliant applications --- specifically, certificates with revocation and expiration times. This allows for relatively clear and modular treatment of the use of time in security-sensitive systems. Our modeling and analysis are done within the existing UC framework, in spite of its asynchronous, event-driven nature. This allows incorporating the use of real time within the existing body of analytical work done in this framework. In particular it allows for rigorous incorporation of real time within cryptographic tools and primitives. With Ran Canetti, Kyle Hogan, and Mayank Varia