Computer Science Theory at Northeastern
Click here to subscribe to our mailing list.
Click here to subscribe to our Google Calendar.
Theory Seminar
This semester the theory seminar will be held on Wednesdays, 12:00 - 1:30 PM, and is co-organized by Andisheh Ghasemi and John Wilkins. Write to John with suggestions for speakers!
Fall Semester, 2024
October 2, 12:00 PM, East Village 102 |
Mingda Qiao
Abstract:
In sequential calibration, a forecaster makes probabilistic predictions on a sequence of T adversarially chosen binary outcomes. The predictions are called perfectly calibrated if, among the steps on which each value p in [0, 1] is predicted, exactly a p fraction of the outcomes are ones. Since perfectly calibrated forecasts are often unachievable, calibration measures have been introduced to quantify the deviation from perfect calibration.
We initiate the study of the truthfulness of calibration measures. A calibration measure is said to be truthful if the forecaster (approximately) minimizes the expected penalty by predicting the conditional expectation of the next outcome, given the prior distribution of outcomes. Our main contribution is the introduction of a new calibration measure, termed the Subsampled Smooth Calibration Error (SSCE), under which truthful prediction is optimal up to a constant factor. In contrast, all the existing calibration measures are far from being truthful: there are simple distributions on which a polylogarithmic (or even zero) penalty is achievable, while truthful prediction leads to a polynomial penalty. Based on joint works with Nika Haghtalab, Kunhe Yang, Eric Zhao, and Letian Zheng. Papers available at https://arxiv.org/abs/2402.07458, https://arxiv.org/abs/2407.13979. Speaker Bio: Mingda Qiao is a postdoc at MIT, and an incoming assistant professor at UMass Amherst (starting Fall'25). His research focuses on the theory of prediction, learning, and decision-making in sequential settings, as well as collaborative federated learning. Prior to MIT, Mingda was a FODSI postdoc at UC Berkeley, received his PhD in Computer Science from Stanford University, and received his BEng in Computer Science from Tsinghua University. |
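For readers unfamiliar with calibration measures, the following is a small illustrative Python sketch (not the SSCE introduced in the talk) that computes a basic binned calibration error: for each predicted value p, it compares p to the empirical frequency of ones among the steps on which p was predicted, so a perfectly calibrated forecast scores zero.

```python
# Illustrative sketch only: a basic (binned) calibration error, not the
# Subsampled Smooth Calibration Error (SSCE) introduced in the talk.
from collections import defaultdict

def binned_calibration_error(predictions, outcomes):
    """Sum over predicted values p of |#steps predicting p| *
    |p - empirical frequency of ones on those steps|, divided by T."""
    assert len(predictions) == len(outcomes)
    buckets = defaultdict(list)
    for p, y in zip(predictions, outcomes):
        buckets[p].append(y)
    T = len(outcomes)
    error = 0.0
    for p, ys in buckets.items():
        error += len(ys) * abs(p - sum(ys) / len(ys))
    return error / T

# A perfectly calibrated forecast has error 0: among the steps where 0.5 is
# predicted, exactly half of the outcomes are ones.
print(binned_calibration_error([0.5, 0.5, 1.0], [0, 1, 1]))  # 0.0
print(binned_calibration_error([0.9, 0.9], [0, 1]))          # 0.4
```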
Spring Semester, 2024
April 17, 12:40 PM, Hastings 209 |
Rachel Redberg
Differential privacy (DP) provides rigorous guarantees which can be used to provably bound the privacy loss of running an algorithm on sensitive data. But there is still a large gap between theory and practice which presents challenges to the widespread deployment of DP in machine learning (ML) applications. I plan to talk about several tools and algorithms to bridge this gap.
Using a data-dependent version of DP can help to improve the privacy-utility trade-off of a DP algorithm — but this must be done with great care as data-dependent privacy losses are themselves a function of sensitive data. First I will discuss how to privately publish data-dependent privacy losses for the objective perturbation mechanism. I’ll then demonstrate how data-dependent privacy losses can be used to develop DP algorithms which can adapt to favorable properties of the data, in order to achieve a better privacy-utility trade-off. I’ll conclude by returning to the objective perturbation mechanism, and discuss new tools and privacy analyses that allow it to compete with more modern algorithms. Speaker Bio: Rachel Redberg is a postdoctoral fellow at Northeastern’s Institute for Experiential AI. She obtained a PhD (computer science) in December 2023 from UC Santa Barbara and a BA (applied math) from UC Berkeley in 2015. |
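As background for the objective perturbation mechanism mentioned above, here is a minimal numpy/scipy sketch in the spirit of Chaudhuri, Monteleoni, and Sarwate: a random linear term is added to a regularized logistic-regression objective before minimization. The noise scale `b_scale` is a placeholder, not the calibrated, data-dependent analysis discussed in the talk.

```python
# Minimal sketch of objective perturbation for regularized logistic regression.
# The noise scale `b_scale` is a placeholder: a real deployment must calibrate
# it to the privacy parameters and the sensitivity of the loss, which is
# exactly the kind of analysis the talk covers.
import numpy as np
from scipy.optimize import minimize

def objective_perturbation(X, y, lam=1.0, b_scale=1.0, rng=None):
    rng = np.random.default_rng(rng)
    n, d = X.shape
    b = rng.normal(scale=b_scale, size=d)   # random linear perturbation term

    def perturbed_objective(w):
        logits = X @ w
        loss = np.mean(np.logaddexp(0.0, -y * logits))   # logistic loss
        return loss + 0.5 * lam * w @ w + (b @ w) / n    # + regularizer + noise term

    return minimize(perturbed_objective, np.zeros(d)).x

# Toy usage with labels in {-1, +1}.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.sign(X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200))
print(objective_perturbation(X, y, rng=1))
```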
April 10, 12:40 PM, Hastings 209 |
Yair Zick
I will discuss practical and theoretical challenges in the analysis of fair allocation of indivisible goods, covering problems along the algorithmic fair allocation pipeline: from preference elicitation, via algorithmic challenges, to the guarantees we might offer. I will present recent results on simple algorithms for the fair allocation of indivisible resources. We advocate the use of simple algorithmic techniques - i.e., ones that are easy to implement and understand by non-expert stakeholders - in real-world applications. The mechanisms we propose are rather intuitive, yet provide strong fairness guarantees as well as high social welfare.
Speaker Bio: Yair Zick is an assistant professor at the Manning College of Information and Computer Sciences, UMass Amherst. Prior to that, he was an assistant professor at the NUS School of Computing. He obtained his PhD (mathematics) from Nanyang Technological University, Singapore in 2014, and a B.Sc. (mathematics, "Amirim" honors program) from the Hebrew University of Jerusalem. His research interests include computational fair division, computational social choice, algorithmic game theory and algorithmic transparency. He is the recipient of the 2011 AAMAS Best Student Paper award, the 2014 Victor Lesser IFAAMAS Distinguished Dissertation award, the 2016 ACM EC Best Paper award, and the 2017 Singapore NRF Fellowship. He is a 2023 UMass Public Interest Technology Fellow, and a 2023 Sloan Faculty Diversity Fellow. His work is generously supported by the NSF and the DoD. |
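As a concrete example of the kind of simple, easy-to-explain mechanism the abstract advocates (though not one of the mechanisms from the talk), the classic round-robin procedure allocates indivisible goods and guarantees envy-freeness up to one good (EF1) under additive valuations:

```python
# Round-robin allocation of indivisible goods: a classic simple algorithm that
# is EF1 for additive valuations.  Shown only to illustrate the kind of simple,
# explainable mechanism the talk advocates; it is not the paper's mechanism.

def round_robin(valuations):
    """valuations[i][g] = value of good g to agent i.  Agents pick their
    favorite remaining good in a fixed cyclic order."""
    n_agents = len(valuations)
    n_goods = len(valuations[0])
    remaining = set(range(n_goods))
    bundles = [[] for _ in range(n_agents)]
    turn = 0
    while remaining:
        agent = turn % n_agents
        best = max(remaining, key=lambda g: valuations[agent][g])
        bundles[agent].append(best)
        remaining.remove(best)
        turn += 1
    return bundles

print(round_robin([[5, 1, 3, 2], [2, 4, 4, 1]]))  # [[0, 2], [1, 3]]
```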
Mar 27, 12:40 PM, Hastings 209 |
Connor Wagaman
Releasing differentially private statistics about social network data is challenging: one individual's data consists of a node and all of its connections, and typical analyses are sensitive to the insertion of a single unusual node in the network. This challenge is further complicated in the continual release setting, where the network varies over time and one wants to release information at many time points as the network grows. Previous work addresses node-private continual release by assuming an unenforced promise on the maximum degree in a graph; indeed, the algorithms from these works exhibit blatant privacy violations when the degree bound is not met.
In this work, we describe the first algorithms that satisfy the usual notion of node-differential privacy in the continual release setting (i.e., without an assumed promise on the input streams). These algorithms are accurate on sparse graphs, for several fundamental graph problems: counting edges, triangles, other subgraphs, and connected components; and releasing degree histograms. Up to lower-order terms, our unconditionally private algorithms have the same (generally optimal) error as the previously known algorithms that only satisfy a restricted notion of privacy. We provide general transformations that take a base algorithm for the continual release setting, which need only be private for streams satisfying a promised degree bound, and produce an algorithm that is unconditionally private yet mimics the base algorithm when the stream meets the degree bound (and adds only linear overhead to the time and space complexity of the base algorithm). To do so, we design new projection algorithms for graph streams, based on the batch-model techniques of Day et al. 2016 and Blocki et al. 2013, which modify the stream to limit its degree. Our main technical innovation is to show that the projections are stable---meaning that similar input graphs have similar projections---when the input stream satisfies a privately testable safety condition. Our transformation then follows a novel online variant of the Propose-Test-Release framework (Dwork and Lei, 2009), privately testing the safety condition before releasing output at each step. Joint work with Palak Jain and Adam Smith. https://arxiv.org/abs/2403.04630 Speaker Bio: Connor Wagaman is a second-year PhD student in computer science at Boston University, where he is advised by professors Adam Smith and Marco Gaboardi. He is interested in data privacy (e.g., differential privacy), and in issues of privacy and security more generally. |
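To make the setting concrete, the following toy sketch combines a naive degree-bounding projection of an edge-insertion stream with continual release of a Laplace-noised edge count. It is not the paper's stable projection or its Propose-Test-Release-based transformation; it only illustrates the objects involved.

```python
# Toy sketch of two ingredients from the setting above, greatly simplified:
# (1) a projection enforcing a degree bound D on an edge-insertion stream by
#     dropping edges whose endpoints are already at the bound, and
# (2) continual release of a noisy edge count (Laplace noise at every step).
# This is NOT the paper's stable projection or its Propose-Test-Release based
# transformation.
import random
from collections import defaultdict

def project_and_release(edge_stream, D, noise_scale):
    degree = defaultdict(int)
    kept_edges = 0
    outputs = []
    for (u, v) in edge_stream:
        if degree[u] < D and degree[v] < D:   # projection: enforce degree bound
            degree[u] += 1
            degree[v] += 1
            kept_edges += 1
        # Laplace(0, noise_scale) noise as a difference of two exponentials
        noise = random.expovariate(1 / noise_scale) - random.expovariate(1 / noise_scale)
        outputs.append(kept_edges + noise)
    return outputs

stream = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4)]
print(project_and_release(stream, D=2, noise_scale=2.0))
```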
Mar 20, 12:40 PM, Hastings 209 |
Sammy Khalife
In this talk, I will present new results on the formal expressivity of graph neural networks (GNNs). GNNs form a popular and efficient computational framework for various learning tasks involving network data.
In the first part, I will present results about the expressive power of GNNs when compared to canonical message-passing algorithms such as color refinement. Our first result states that if the size of the GNN is not allowed to grow with the size of the input graphs (uniform expressivity), GNNs have strictly weaker expressivity when the activation functions are piecewise polynomial. More precisely, we prove that for any GNN with piecewise polynomial activations (for instance, ReLUs), there are trees of depth two whose source vertices color refinement can distinguish in two iterations, but which no GNN with piecewise polynomial activations can distinguish in any number of iterations. Second, if one allows non-piecewise-polynomial activations such as sigmoids, a single-neuron perceptron can distinguish the source vertices of these tree structures in only two iterations. This result is based on transcendental number theory. The expressive power of GNNs can also be described through the lens of descriptive complexity. Namely, any query of the two-variable fragment of graded modal logic (GC2) interpreted over labeled graphs can be uniformly expressed using a GNN whose size depends only on the depth of the query. This description holds for a family of activation functions, leaving the possibility of a hierarchy of logics uniformly expressible by GNNs depending on the chosen activation function. My related result states that such a hierarchy among activation functions exists, by proving that GC2 queries cannot be expressed uniformly by GNNs with polynomial activation functions, although they can with piecewise linear activations. Bio: Sammy Khalife has been a postdoctoral researcher at Johns Hopkins University since 2021, in the Department of Applied Mathematics and Statistics, mentored by Amitabh Basu. His current research interests are a mix of Discrete Optimization, Theoretical Deep Learning and Data Science. He holds a PhD in Computer Science from Ecole Polytechnique (France), obtained in 2020. Before that, he graduated from ENSTA Paristech (MSc), holds an MSc in Mathematical Optimization from Paris 1 Sorbonne, and an MSc from ENS Paris Saclay (Mathematics, Vision, Learning). |
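For reference, the color-refinement (1-WL) procedure that the talk compares GNNs against can be written in a few lines of Python; this sketch operates on a graph given as an adjacency list.

```python
# Color refinement (1-WL) on a graph given as an adjacency list: the classical
# message-passing procedure that GNN expressivity is compared against.
def color_refinement(adj, rounds):
    colors = {v: 0 for v in adj}                    # start with uniform colors
    for _ in range(rounds):
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # relabel signatures with small integers to get the new coloring
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[signatures[v]] for v in adj}
    return colors

# A path a-b-c: after two rounds the endpoints share one color, the middle
# vertex gets another.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(color_refinement(adj, rounds=2))   # {'a': 0, 'b': 1, 'c': 0}
```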
Mar 13, 12:40 PM, Hastings 209 |
Mahdi Haghifam
The amount of information that a learning algorithm uses from its training set to produce its output is a natural and important quantity to study. A central idea in learning theory is that a learning algorithm that only uses a small amount of information from its input sample will generalize well. In this talk, we investigate the interplay between information and learning in the context of stochastic convex optimization (SCO). In particular, we answer the question: “How much information about the training set is revealed by a learning algorithm in the context of SCO?”
Bio: Mahdi Haghifam is a distinguished postdoctoral fellow at Northeastern University's Khoury College of Computer Sciences. He holds a PhD from the University of Toronto, where he was also a graduate student researcher at the Vector Institute for Artificial Intelligence. Mahdi received his Bachelor's and Master's degrees in Electrical Engineering from Sharif University of Technology. His current research focuses broadly on statistical learning theory and differential privacy. |
Feb 28, 12:40 PM, Hastings 209 |
Ilya Volkovich
Abstract:
We study close connections between Indistinguishability Obfuscation (IO) and the Minimum Circuit Size Problem (MCSP), and argue that algorithms for one of MCSP or IO would empower the other one. Some of our main results are: (1) If there exists a perfect (resp. imperfect) IO that is computationally secure against non-uniform polynomial-size circuits, then we obtain fixed-polynomial lower bounds against NP (resp. MA); in addition, computationally secure IO against non-uniform polynomial-size circuits implies super-polynomial lower bounds against NEXP. (2) If MCSP is in BPP, then statistical security and computational security for IO are equivalent. (3) If computationally secure perfect IO exists, then MCSP is in BPP iff NP = ZPP; as a consequence, we separate ZPEXP from BPP. To the best of our knowledge, this is the first consequence of strong circuit lower bounds from the existence of IO. The results are obtained via a construction of an optimal universal distinguisher, computable in randomized polynomial time with access to an MCSP oracle, that distinguishes any two circuit-samplable distributions with advantage equal to the statistical distance between these two distributions, up to a negligible error term. This is our main technical contribution. As another application, we get a simple proof of the result by Allender and Das (Inf. Comput., 2017) that SZK is contained in BPP^MCSP. Bio: Ilya Volkovich is an Assistant Professor of Computer Science at Boston College. Previously, he was a Postdoctoral Research Associate in the Computer Science Department at Princeton University and held a visiting position at the Institute for Advanced Study. He obtained his Ph.D. in Computer Science from the Technion - Israel Institute of Technology, advised by Prof. Amir Shpilka. His research interests are in the broad area of theoretical computer science. More specifically, he is interested in aspects of algebraic complexity, randomness in computation, computational learning theory and meta-complexity, as well as their applications to cryptography and machine learning. |
Feb 21, 12:40 PM, Hastings 209 |
Evi Micha
Abstract: Algorithms have had a remarkable impact on human lives as they have been used increasingly to automate critical decisions. Consequently, it is more important than ever to design decision-making algorithms that treat people fairly, use limited resources efficiently, and foster social good. To illustrate my research in this direction, I will present two recent examples: in one, we boost the efficiency of COVID testing in a real-world setting, and in the other, we make the selection of citizens' assemblies more representative.
Bio: Evi Micha is a postdoc at Harvard University, under the supervision of Ariel Procaccia. Before that, she obtained her Ph.D. from the University of Toronto, advised by Nisarg Shah. She was also an affiliate of the Vector Institute for Artificial Intelligence and a fellow of the Schwartz Reisman Institute for Technology and Society. Her research interests lie at the intersection of computer science and economics, and span areas such as algorithmic fairness and computational social choice. One of her recent papers was selected as the exemplary paper in the applied modeling track of the ACM Conference on Economics and Computation. |
Feb 14, 12:40 PM, Hastings 209 |
Rebecca Lin
Bio: Rebecca is a PhD student at MIT CSAIL, advised by Erik Demaine. Her research spans several areas, including theoretical computer science, computer graphics, and computational fabrication, with a unifying theme of geometry. This past summer, Rebecca was a visiting researcher at the Origami Lab led by Tomohiro Tachi at the University of Tokyo. Prior to her PhD, she earned her BSc in Computer Science at the University of British Columbia, during which she was fortunate to work with Will Evans, Craig Kaplan, and Alla Sheffer.
Abstract: Inspired by artistic practices such as beadwork and himmeli, we study the problem of threading a single string through a set of tubes, so that pulling the string forms a desired graph. More precisely, given a connected graph (where edges represent tubes and vertices represent junctions where they meet), we give a polynomial-time algorithm to find a minimum-length closed walk (representing a threading of string) that induces a connected graph of string at every junction. The algorithm is based on a surprising reduction to minimum-weight perfect matching. Along the way, we give tight worst-case bounds on the length of the optimal threading and on the maximum number of times this threading can visit a single edge. |
Feb 7, 12:40 PM, Hastings 209 |
Neha Makhija
Reverse data management problems focus on finding the minimum set of changes to a database ("interventions") to produce a certain change in the output of a query. We discuss a novel approach for solving reverse data management problems: instead of creating dedicated algorithms for easy (PTIME) and hard cases (NP-complete), we suggest using (Integer) Linear Programs (ILP/LPs) and Max-Flow Min-Cut (MFMC) based algorithms as "unified" algorithms. These algorithms are unified in the sense that they are guaranteed to be always correct and to terminate in PTIME for all known easy cases.
This talk illustrates this unified approach with two fundamental yet distinct reverse data management problems: (1) Resilience: What is a minimal set of tuples to delete from a database in order to eliminate all query answers? (2) Minimal factorization of provenance expressions: What is an equivalent Boolean expression that minimizes the number of variable repetitions for an input provenance expression? (While general Boolean formula minimization is ΣP2-complete, we show that the problem is NP-complete in our setting.) Our approach yields new complexity results for these problems, including in settings that are typically harder to analyze, such as bag semantics or queries with self-joins. Related work: https://northeastern-datalab.github.io/unified-reverse-data-management/ Makhija, Gatterbauer. A Unified Approach for Resilience and Causal Responsibility with Integer Linear Programming (ILP) and LP Relaxations, SIGMOD 2024. https://arxiv.org/pdf/2212.08898 Makhija, Gatterbauer. Towards a Dichotomy for Minimally Factorizing the Provenance of Self-Join Free Conjunctive Queries, 2023. https://arxiv.org/abs/2105.14307 Bio: Neha Makhija is a PhD student at the DataLab at Northeastern University, advised by Professor Wolfgang Gatterbauer. Her research aims to find efficient algorithms and borders of tractability for problems in database theory, such as problems in reverse data management and minimal representations of knowledge. |
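To make the resilience problem concrete, here is a brute-force definition-checker (in Python) for the tiny conjunctive query Q :- R(x), S(x, y). It is purely illustrative and is not the unified ILP/LP and flow-based approach from the talk.

```python
# Brute-force illustration of the *resilience* problem for the conjunctive
# query Q :- R(x), S(x, y): find a minimum set of input tuples whose deletion
# removes every query answer.
from itertools import combinations

R = [(1,), (2,)]
S = [(1, "a"), (1, "b"), (2, "c")]

def has_answer(r_tuples, s_tuples):
    return any((x,) in r_tuples for (x, _) in s_tuples)

def resilience(R, S):
    tuples = [("R", t) for t in R] + [("S", t) for t in S]
    for k in range(len(tuples) + 1):
        for deleted in combinations(tuples, k):
            r_left = [t for rel, t in tuples if rel == "R" and ("R", t) not in deleted]
            s_left = [t for rel, t in tuples if rel == "S" and ("S", t) not in deleted]
            if not has_answer(r_left, s_left):
                return k, deleted
    return len(tuples), tuples

print(resilience(R, S))  # k = 2: e.g. deleting R(1) and R(2) removes all answers
```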
Jan 17, 12:40 PM, Hastings 209 |
Antonis Skarlatos
Bio: Antonis Skarlatos is a PhD student at the University of Salzburg, supervised by Sebastian Forster. Previously, he completed an internship at the Max Planck Institute for Informatics, and he earned his master's degree at the National and Kapodistrian University of Athens. His research interests include shortest paths and clustering problems, usually in the dynamic setting.
Abstract: The input for many applications can be naturally modeled as a graph, for instance a network of people or a network of roads. One of the most fundamental and well-studied clustering problems is the $k$-center problem. In this problem, given a graph with $n$ vertices and a positive integer $k \leq n$, the goal is to find $k$ vertices, referred to as centers, such that the maximum distance of any vertex to its closest center is minimized. It is NP-hard to get a better than $2$ approximation for this problem. In the real world where everything changes all the time, the relations between the vertices of the graph may change over time. That is, edges may be deleted from the graph or new edges may be inserted into the graph. The goal then is to maintain a good $k$-center solution as fast as possible while the graph is subject to edge updates. All prior works on the $k$-center problem in dynamic settings consider point sets in arbitrary metric spaces, not graphs subject to edge updates. In this talk we give the first efficient algorithms for the $k$-center problem on dynamic graphs undergoing edge updates. In particular, we give a deterministic decremental $(2+\epsilon)$-approximation algorithm and a randomized incremental $(4+\epsilon)$-approximation algorithm, both with amortized update time $kn^{o(1)}$ for weighted graphs. Moreover, we show a reduction that leads to a fully dynamic $(2+\epsilon)$-approximation algorithm for the $k$-center problem, with update time that is within a factor $k$ of the state-of-the-art fully dynamic $(1+\epsilon)$-approximation single-source shortest paths algorithm in graphs. This is joint work with Emilio Cruciani, Sebastian Forster, Gramoz Goranci, and Yasamin Nazari, and it appeared in the proceedings of SODA 2024. |
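For background, the classical static 2-approximation for k-center (Gonzalez' farthest-point greedy) is shown below on an explicit distance matrix; the talk is about maintaining such solutions dynamically under edge updates, which this sketch does not attempt.

```python
# Gonzalez' farthest-point greedy: the classical static 2-approximation for
# k-center, here on an explicit distance matrix.
def gonzalez_k_center(dist, k):
    n = len(dist)
    centers = [0]                              # start from an arbitrary vertex
    d_to_centers = list(dist[0])               # distance of each vertex to its closest center
    while len(centers) < k:
        farthest = max(range(n), key=lambda v: d_to_centers[v])
        centers.append(farthest)
        d_to_centers = [min(d_to_centers[v], dist[farthest][v]) for v in range(n)]
    return centers, max(d_to_centers)          # chosen centers and the k-center radius

# Four points on a line at positions 0, 1, 10, 11.
pos = [0, 1, 10, 11]
dist = [[abs(a - b) for b in pos] for a in pos]
print(gonzalez_k_center(dist, k=2))            # ([0, 3], 1)
```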
Jan 10, 12:30 PM, Hastings 204 |
Jessie Finocchiaro
Bio: Jessie Finocchiaro is an NSF Mathematical Sciences Postdoctoral Research Fellow at Harvard University hosted by Dr. Yiling Chen, and an affiliate of Harvard’s Center for Research on Computation and Society. Previously, she received her doctorate in Computer Science from the University of Colorado Boulder under the supervision of Dr. Rafael Frongillo, where she was awarded an NSF Graduate Research Fellowship. Her research spans theoretical machine learning and algorithmic economics, and has been recognized with a NeurIPS spotlight presentation, awarded to the top 4% of submissions.
|
Fall Semester, 2023
Dec 6, 12:30 PM, WVH 366 |
Rahul Ilango
In this talk, I will discuss progress on showing hardness of the Minimum Circuit Size Problem (MCSP). Understanding the complexity of MCSP is a longstanding open question (since at least Levin's seminal work on NP-completeness). Over the past several years, researchers have utilized seemingly special properties of MCSP (not known for SAT) to show important connections between MCSP and other areas of theoretical computer science, including learning theory, cryptography, average-case complexity, and proof complexity. We give, in our view, the strongest evidence yet that MCSP is in fact NP-complete (and thus that SAT does indeed have these special properties). Specifically, we show that, with probability one, there is a P/poly reduction from the NP-hard problem of approximating vertex cover on hypergraphs to MCSP on circuits that have access to a uniformly random oracle O. Building on this, we give the first candidate reduction from SAT to MCSP, and conclude that the existence of sufficiently "unstructured" functions implies a problem with a known (non-black-box) worst-case to average-case reduction is NP-complete. Curiously, a key part of our reduction is computing a cryptographic proof of work. No background knowledge on MCSP or other areas of complexity is required! |
Nov 30, 12:30 PM, WVH 166/168 |
Fabian Spaeh
Display Ads and the generalized assignment problem are two well-studied online packing problems with important applications in ad allocation and other areas. In both problems, ad impressions arrive online and have to be allocated immediately to budget-constrained advertisers. Worst-case algorithms that achieve the ideal competitive ratio are known, but might act overly conservatively given the predictable and usually tame nature of real-world input. Given this discrepancy, we develop algorithms for both problems that incorporate machine-learned predictions and can thus improve the performance beyond the worst case. Our algorithm is based on the work of Feldman et al. (2009) and similar in nature to that of Mahdian et al. (2007), who were the first to develop an algorithm with predictions for the related, but more structured, AdWords problem. We use a novel analysis to show that our algorithm is able to capitalize on a good prediction, while being robust against poor predictions. We experimentally evaluate our algorithm on synthetic and real-world data over a wide range of predictions. Our algorithm consistently outperforms the worst-case algorithm without predictions. Joint work with Alina Ene. |
Nov 15, 12:30 PM, WVH 366 |
Jad Silbak
Guruswami and Smith (J. ACM 2016) considered codes for computationally bounded channels that modify at most a p-fraction of the bits of the codeword. This class of channels is significantly stronger than Shannon’s binary symmetric channel, which flips each bit independently with probability p, but weaker than Hamming’s channels, which may flip any p-fraction of bits and are computationally unbounded.
|
Nov 1, 12:30 PM, WVH 366 |
Hsin-Hao Su
Traditional studies on symmetry breaking focus on breaking symmetries on atomic objects such as nodes or edges (e.g., the maximal independent set problem and the maximal matching problem). Their generalizations to higher-order objects have the potential to be used for solving optimization problems in distributed and parallel models. In this talk, I will discuss how such primitives play a role in our two recent results as follows:
Based on joint works with Nairen Cao and Shang-En Huang. |
October 25, 12:30 PM, WVH 366 |
Sandeep Silwal
The talk is motivated by the question: how efficiently can we search distributions? The problem is modeled as follows: we are given knowledge of k discrete distributions v_i for 1 <= i <= k over the domain [n] = {1,...,n}, which we can preprocess. Then we get samples from an unknown discrete distribution p, also over [n]. The goal is to output the closest distribution to p among the v_i's in TV distance (up to some small additive error). The state-of-the-art algorithms require Theta(log k) samples and run in near-linear time. We introduce a fresh perspective on the problem and ask if we can output the closest distribution in sublinear time. This question is particularly motivated as the problem is a generalization of the traditional nearest neighbor search problem: if we take enough samples, we can learn p explicitly up to low TV distance and then find the closest v_i in o(k) time using standard nearest neighbor search algorithms. However, this approach requires Omega(n) samples. Thus, it is natural to ask: can we obtain both a sublinear number of samples and sublinear query time? We present some nice progress towards this question. This is joint work with Anders Aamand, Alex Andoni, Justin Chen, Piotr Indyk, and Shyam Narayanan, and appeared in ICML 23. |
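The naive baseline from the abstract (learn p empirically, then scan the candidates) can be written down directly; this numpy sketch is the Theta(n)-sample approach the talk aims to improve upon, and for simplicity it uses a plain O(k) linear scan rather than a nearest-neighbor data structure.

```python
# Naive baseline for searching distributions: estimate p empirically from
# samples, then scan all k candidates for the closest one in TV distance.
import numpy as np

def tv_distance(p, q):
    return 0.5 * np.abs(p - q).sum()

def closest_distribution(candidates, samples, domain_size):
    counts = np.bincount(samples, minlength=domain_size)
    p_hat = counts / counts.sum()                       # empirical distribution
    return min(range(len(candidates)),
               key=lambda i: tv_distance(candidates[i], p_hat))

rng = np.random.default_rng(0)
n = 6
v = [np.full(n, 1 / n),                                 # uniform
     np.array([0.5, 0.3, 0.1, 0.1, 0.0, 0.0])]
samples = rng.choice(n, size=2000, p=v[1])
print(closest_distribution(v, samples, n))              # 1
```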
October 18, 12:30 PM, WVH 366 |
Thien Nguyen
Stochastic optimization is a well-studied area with many applications ranging from machine learning to operations research, numerical linear algebra, and beyond. In contrast to deterministic algorithms, stochastic algorithms might fail, and a pertinent question is how often failure happens and how to increase the success rate. These questions are especially important in critical applications where failure is not tolerable, or when a single run is costly in time and resources. Our recent works employ a new technique that obtains improved high-probability convergence guarantees for well-known stochastic gradient methods (SGD, SMD, Clipped-SGD, ASGD, Adagrad, etc.) in a variety of settings (sub-Gaussian gradient noise, heavy-tailed gradient noise, arbitrary domain, etc.) for both convex and non-convex optimization. |
October 11 12:30 PM WVH 366 |
Nadya Voronova
The approximate degree of a Boolean function is the minimum degree of a real polynomial that approximates it pointwise. For any Boolean function, its approximate degree serves as a lower bound on its query complexity (both randomized and quantum), and generically lifts to a quantum communication lower bound for a related function. We introduce a framework for proving approximate degree lower bounds for certain oracle identification problems, where the goal is to recover a hidden binary string x∈{0,1}^n given possibly non-standard oracle access to it. Our lower bounds apply to decision versions of these problems, where the goal is to compute the parity of x. We apply our framework to the ordered search and hidden string problems, proving nearly tight approximate degree lower bounds of Ω(n/log^2(n)) for each. These lower bounds generalize to the weakly unbounded error setting, with optimal dependency on the error parameter in this regime. Our lower bounds are driven by randomized communication upper bounds for the greater-than and equality functions. Based on joint work with Mark Bun. |
October 4 12:30 PM WVH 366 |
Prantar Ghosh
A (possibly multi-pass) graph streaming algorithm is called semi-streaming if it processes an n-vertex graph using O(n polylog(n)) space per pass. The following question arises naturally in the study of semi-streaming algorithms: Is there any graph problem which is "not too hard'', in the sense that it can be solved efficiently with O(n polylog(n)) total communication, but for which, nonetheless, any semi-streaming algorithm needs a polynomial n^Ω(1) number of passes? Assadi, Chen, and Khanna [STOC 2019] were the first to give an affirmative answer to this question, albeit with some rather non-standard graph problems. In this talk, I shall present the first polynomial-pass lower bounds for natural "not too hard'' graph problems: finding a k-core of a graph or even the value of its degeneracy. First I shall describe a novel communication protocol for both problems with O(n polylog n) total communication, thus showing that they are indeed "not too hard.'' Next, I shall describe how we prove that any semi-streaming algorithm (exactly) solving these problems requires (almost) Ω(n^{1/3}) passes. This proof uses a reduction from a generalized version of the Hidden Pointer Chasing (HPC) communication problem introduced by the aforementioned paper of Assadi, Chen, and Khanna. I shall sketch this reduction and give a high level idea of how we improve upon the communication lower bound of HPC to the optimal bound, which allows us to prove stronger (by a polynomial factor) semi-streaming pass lower bounds for a certain class of graph problems. This is based on a recent joint work with Sepehr Assadi, Bruno Loff, Parth Mittal, and Sagnik Mukhopadhyay. |
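For context, the degeneracy discussed above is easy to compute offline by repeatedly peeling a minimum-degree vertex; the talk shows that, in contrast, any semi-streaming algorithm computing it exactly needs polynomially many passes. A short Python sketch of the offline peeling:

```python
# Offline computation of graph degeneracy by repeatedly removing a
# minimum-degree vertex.
def degeneracy(adj):
    adj = {v: set(nbrs) for v, nbrs in adj.items()}
    degen = 0
    while adj:
        v = min(adj, key=lambda u: len(adj[u]))   # peel a minimum-degree vertex
        degen = max(degen, len(adj[v]))
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]
    return degen

# A triangle with a pendant vertex: degeneracy 2.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
print(degeneracy(adj))  # 2
```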
Spring Semester, 2023
April 26 12:30 PM ISEC 655 |
Anamay Chaturvedi
In this work, we study the problem of privately maximizing a submodular function in the streaming setting. Extensive work has been done on privately maximizing submodular functions in the general case when the function depends upon the private data of individuals. However, when the data stream drawn from the domain of the objective function is large or arrives very fast, one must privately optimize the objective within the constraints of the streaming setting. We establish fundamental differentially private baselines for this problem and then derive better trade-offs between privacy and utility for the special case of decomposable submodular functions. A submodular function is decomposable when it can be written as a sum of submodular functions; this structure arises naturally when each summand function models the utility of an individual and the goal is to study the total utility of the whole population, as in the well-known Combinatorial Public Projects Problem. Finally, we complement our theoretical analysis with experimental corroboration. Joint work with Huy Nguyen and Thy Nguyen. |
April 12 12:30 PM ISEC 655 |
Soheil Behnezhad
Linear-time algorithms have long been the gold standard of algorithm design. But can we design algorithms that run even faster, i.e., in time *sublinear* in the input size? In this talk, I will show how this is possible for the fundamental maximum matching problem. I will also discuss how sublinear time algorithms for maximum matching can be used to answer a longstanding open problem in the area of dynamic algorithms for maintaining a better than 1/2 approximation of maximum matching in polylog time. |
March 1 12:30 PM ISEC 655 |
Ethan Mook
Can we design a private variant of "Google search" that would enable users to search the Internet privately without revealing their queries to Google? Fully homomorphic encryption gives a solution to this problem, where users would send their search queries to Google encrypted in a way that allows Google to compute the encrypted search results, without learning anything about the query itself. However, this solution would require Google to run a huge computation that touches the entire content of the Internet to answer each encrypted search query, making it realistically infeasible! Is there a way for Google to preprocess the Internet content into a special data structure that would allow it to answer any encrypted query efficiently by only accessing some small number of locations in the data structure? We give a solution to this problem as a special case of our work. Concretely, we construct the first schemes for "doubly efficient private information retrieval" and "fully homomorphic encryption in the RAM model" under standard cryptographic hardness assumptions (Ring Learning with Errors). Previously we only had heuristic candidate constructions that only satisfied relaxed variants of these primitives. Our new solutions combine tools from cryptography together with data structures for fast polynomial evaluation (Kedlaya and Umans '08). Based on: https://eprint.iacr.org/2022/1703. |
February 15 12:30 PM Ryder Hall 153 |
Rajesh Jayaram
We consider the problem of computing the cost of the Euclidean Minimum Spanning Tree (MST) of an n-point set X in R^d, where the points in X are presented in an arbitrary order stream of insertions and deletions. In the streaming setting, the goal is to maintain an approximation to the MST cost in small space. In low dimensions, (1+ε) approximations are possible in sublinear (in n) space [Frahling, Indyk, Sohler, SoCG '05]. However, for high dimensional spaces the best known approximation was Õ(log n), due to [Chen, Jayaram, Levi, Waingarten, STOC '22], and prior to that it was O(log^2 n) due to [Indyk, STOC '04] and [Andoni, Indyk, Krauthgamer, SODA '08]. In this talk, we describe the first algorithm which achieves a constant factor approximation to the Euclidean MST in sublinear space. Specifically, for any ε > 0, our algorithm obtains a Õ(1/ε^2) approximation in n^{O(ε)} space. We show the approximation is tight up to a constant, by proving an Omega(sqrt{n}) space lower bound for any algorithm that beats a 1.1 approximation. Furthermore, the algorithm can be modified to give a (1+ε) approximation in O(1/ε)-passes over the stream, which separates the complexity for single vs multi-pass streaming. Joint work with Xi Chen, Vincent Cohen-Addad, Amit Levi, and Erik Waingarten (to appear in STOC '23). |
February 1 12:30 PM ISEC 655 |
Jason Li
Minimum cut problems are among the most well-studied questions in combinatorial optimization. In this talk, I will introduce a simple but powerful new tool for solving minimum cut problems called the minimum isolating cuts. I will show how this tool can be employed to obtain faster algorithms for several fundamental min-cut problems, namely global min-cut, Steiner min-cut, and all-pairs min-cut. For these problems, the new results represent the first improvement in their runtimes in several decades. These results are in collaboration with Amir Abboud, Robert Krauthgamer, Danupon Nanongkai, Thatchaphol Saranurak, and Ohad Trabelsi. |
January 25 12:30 PM Ryder Hall 153 |
Zihan Tan
Graph Crossing Number is a fundamental and extensively studied problem with wide-ranging applications. In this problem, the goal is to draw an input graph G in the plane so as to minimize the number of crossings between the images of its edges. The problem is notoriously difficult, and despite extensive work, non-trivial approximation algorithms are only known for bounded-degree graphs. Even for this special case, the best current algorithm achieves a O(sqrt(n))-approximation, while the best current negative results do not rule out a constant-factor approximation. All current approximation algorithms for the problem build on the same paradigm, which is also used in practice: compute a set E’ of edges (called a planarizing set) such that G \ E’ is planar; compute a planar drawing of G \ E’; then add the drawings of the edges of E’ to the resulting drawing. Unfortunately, there are examples of graphs G in which any implementation of this method must incur Ω(OPT^2) crossings, where OPT is the value of the optimal solution. This barrier seems to doom the only currently known approach to designing approximation algorithms for the problem, and to prevent it from yielding a better than O(sqrt(n))-approximation. We propose a new paradigm that allows us to overcome this barrier. Using the new paradigm, we reduce the Crossing Number problem to Crossing Number with Rotation System – a variant of the Crossing Number problem in which the ordering of the edges incident to every vertex is fixed as part of the input. We then show a randomized algorithm for this new problem, which allows us to obtain a subpolynomial approximation for Graph Crossing Number on low-degree graphs. This talk is based on joint work with Julia Chuzhoy and Sepideh Mahabadi. |
January 18 12:30 PM WVH 366 |
Christopher Tosh
Multi-group agnostic learning is a formal learning criterion that is concerned with the conditional risks of predictors within subgroups of a population. This criterion addresses recent practical concerns in machine learning such as subgroup fairness and hidden stratification. In this talk, we will discuss the structure of solutions to the multi-group learning problem and provide corresponding simple and near-optimal algorithms. |
Fall Semester, 2022
December 7 12:30 PM ISEC 655 |
François Sellier
We consider the maximum weight b-matching problem, assuming all weights are small integers drawn from [1,W]. We design a generalized weighted version of edge-degree constrained subgraphs (EDCS), a matching sparsifier originally developed by Bernstein and Stein that has been used for dynamic and streaming matchings in graphs. Our new subgraph has bounded vertex degrees (hence uses only a small number of edges) and can be easily computed. Moreover, it contains a (2 - 1/(2W) + epsilon) approximation of the maximum weight matching. |
November 30, 12:30 PM, ISEC 655 |
Eysa Lee
We propose a secure multiparty signing protocol for the BBS+ signature scheme; in other words, an anonymous credential scheme with threshold issuance. We prove that due to the structure of the BBS+ signature, simply verifying the signature produced by an otherwise semi-honest protocol is sufficient to achieve composable security against a malicious adversary. Consequently, our protocol is extremely simple and efficient: it involves a single request from the client (who requires a signature) to the signing parties, one round-trip communication among the signing parties, and finally a response to the client; in some deployment scenarios the concrete cost bottleneck may be the client's local verification of the signature that it receives. Furthermore, our protocol can be extended to support strongly-blind signing and to serve as a distributed evaluation protocol for the Dodis-Yampolskiy Oblivious VRF. We validate our efficiency claims by implementing and benchmarking our protocol. This is joint work with Jack Doerner, Yashvanth Kondi, abhi shelat, and LaKyah Tyner. |
November 16, 12:30 PM, WVH 366 |
Daniel Reichman
Networks can be conducive to the spread of undesirable phenomena, from false information to bankruptcy of financial institutions and contagious disease. How can we leverage algorithms to stop or slow down contagious processes? I will survey some of the literature, focusing mostly on the bootstrap percolation model, which is a simple contagion model where every uninfected node is infected if it has at least r infected neighbors for some r>1. I will present some improved algorithms for the binomial random graph and conclude with several open questions. Based on joint works with Hermish Mehta, Uri Feige, Michael Krivelevich and Amin Coja-Oghlan. |
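The bootstrap percolation process itself is easy to simulate; the following short Python sketch (illustrative only, not the algorithms from the talk) infects any uninfected node once it has at least r infected neighbors and repeats until nothing changes.

```python
# Simulation of the bootstrap percolation process: an uninfected node becomes
# infected once it has at least r infected neighbors; repeat until stable.
def bootstrap_percolation(adj, initially_infected, r):
    infected = set(initially_infected)
    changed = True
    while changed:
        changed = False
        for v in adj:
            if v not in infected:
                if sum(1 for u in adj[v] if u in infected) >= r:
                    infected.add(v)
                    changed = True
    return infected

# Path 0-1-2-3-4 with nodes 0 and 2 initially infected and threshold r = 2:
# node 1 becomes infected (two infected neighbors); nodes 3 and 4 never do.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(sorted(bootstrap_percolation(adj, {0, 2}, r=2)))  # [0, 1, 2]
```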
November 9, 12:30 PM, ISEC 655 |
Anthimos-Vardis Kandiros
Statistical estimation has largely focused on estimating models from independent and identically distributed observations. This assumption is, however, too strong. In many applications, observations are collected on nodes of a network, or some spatial or temporal domain, and are dependent. Examples abound in financial and meteorological applications, and dependencies naturally arise in social networks through peer effects. We study statistical estimation problems wherein responses at the nodes of a network are not independent conditional on the nodes’ features, but dependent. We model their dependencies through a Markov Random Field with a log-density that captures individual effects and peer effects. Importantly, we allow dependencies to be substantial, i.e., we do not assume that the Markov Random Field is in the high-temperature regime. Our model generalizes the standard statistical estimation model with independent observations, which is obtained by setting the temperature to infinity. As our main contribution, we provide algorithms and statistically efficient estimation rates for our model, giving several instantiations of our bounds in logistic regression, sparse logistic regression, and neural network estimation problems with dependent data. Our estimation guarantees follow from novel results for estimating the parameters (i.e., external fields and interaction strengths) of Ising models from a single sample. Based on joint works with Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala and Surbhi Goel. |
November 2, 12:30 PM, ISEC 655 |
Bogdan Kulynych
Having similar behavior at training time and test time — what we call a “What You See Is What You Get” (WYSIWYG) property — is desirable in machine learning. Models trained with standard stochastic gradient descent (SGD), however, do not necessarily have this property, as their complex behaviors such as robustness or subgroup performance can differ drastically between training and test time. In contrast, we show that Differentially-Private (DP) training provably ensures the high-level WYSIWYG property, which we quantify using a notion of Distributional Generalization (DG). Applying this connection, we introduce new conceptual tools for designing deep-learning methods by reducing generalization concerns to optimization ones: to mitigate unwanted behavior at test time, it is provably sufficient to mitigate this behavior on the training data. By applying this novel design principle, which bypasses “pathologies” of SGD, we construct simple algorithms that are competitive with SOTA in several distributional-robustness applications, significantly improve the privacy vs. disparate impact trade-off of DP-SGD, and mitigate robust overfitting in adversarial training. Finally, we also improve on theoretical bounds relating DP, stability, and distributional generalization. This talk is based on the upcoming NeurIPS 2022 paper “What You See is What You Get: Principled Deep Learning via Distributional Generalization” by Bogdan Kulynych (EPFL), Yao-Yuan Yang (UC San Diego), Yaodong Yu (UC Berkeley), Jarosław Błasiok (Columbia University), and Preetum Nakkiran (UC San Diego). The preprint version is available at https://arxiv.org/abs/2204.03230.
|
October 26, 1:00 PM, Hybrid, Richards Hall 238 |
Pratik Chaudhari
Accepted statistical wisdom suggests that the larger the model class, the more likely it is to overfit the training data. And yet, deep networks generalize extremely well. The larger the deep network, the better its accuracy on new data. This talk seeks to shed light upon this apparent paradox. We will argue that deep networks are successful because of a characteristic structure in the space of learning tasks. The input correlation matrix for typical tasks has a peculiar (“sloppy”) eigenspectrum where, in addition to a few large eigenvalues (salient features), there are a large number of small eigenvalues that are distributed uniformly over a very large range. This structure in the input data is strongly mirrored in the representation learned by the network. A number of quantities such as the Hessian, the Fisher Information Matrix, as well as others such as correlations of activations or Jacobians, are also sloppy. Even if the model class for deep networks is very large, there is only a tiny subset of models that fit such sloppy tasks. Using these ideas, this talk will demonstrate an analytical non-vacuous generalization bound for deep networks that does not use compression. It will also discuss how these ideas can be harnessed into algorithms that learn from unlabeled data optimally. |
October 19, 1:00 PM, Hybrid, West Village H 366 |
Santhoshini Velusamy
Constraint satisfaction problems (CSPs) are ubiquitous in computer science and encompass optimization problems such as Max-CUT, Max-DICUT, Max-k-SAT, Max-q-Coloring, Unique Games, etc. Until recently, Max-CUT and a couple of other Max-2CSPs were the only CSPs studied in the streaming setting. In this talk, I will first describe our results giving new upper and lower bounds on the approximability of every CSP in the single-pass streaming setting. In particular, we prove the following theorem: For every CSP, there is a constant α such that the CSP is α-approximable in polylogarithmic space by a linear sketching algorithm and any sketching algorithm that beats α-approximation requires polynomial space. I will conclude with some interesting results from recent follow-up works and also discuss open problems in this direction. |
September 28, 12:30 PM, Richards Hall 238 |
Nicole Wein
A shortcut set is a (small) set of edges that when added to a directed graph G produces a graph with the same reachability structure as G while reducing the diameter as much as possible. A hopset is defined similarly but for approximate distances instead of reachability. One motivation for such structures is that in many settings (e.g. distributed, parallel, dynamic, streaming), computation is faster when the diameter of the graph is small. Thus, to compute, for instance, st-reachability in such a setting, it is useful to find a shortcut set as a preprocessing step, and then run an st-reachability algorithm on the resulting graph. This talk is about existential bounds for shortcut sets and hopsets: given a budget of ~O(n) added edges, what is the smallest diameter achievable? A folklore algorithm says that diameter O(n^{1/2}) is possible for any graph. A recent breakthrough of Kogan and Parter improves this to O(n^{1/3}) for shortcut sets and O(n^{2/5}) for hopsets. In recent work, we ask whether hopsets (approximate distances) are truly harder than shortcut sets (reachability). We close the gap between hopsets and shortcut sets by providing an O(n^{1/3})-diameter construction for hopsets. Joint work with Aaron Bernstein. |
Spring Semester, 2022
April 28, 11 AM, ISEC 655 (in-person)/Zoom |
Lucianna Kiffer
In cryptocurrencies, the block reward is meant to serve as the incentive mechanism for miners to commit resources to create blocks and in effect secure the system. Existing systems primarily divide the reward in proportion to expended resources and follow one of two static models for total block reward: (i) a fixed reward for each block (e.g., Ethereum), or (ii) one where the block reward halves every set number of blocks (e.g., the Bitcoin model of halving roughly every 4 years) but otherwise remains fixed between halvings. In recent work, a game-theoretic analysis of the static model under asymmetric miner costs showed that an equilibrium always exists and is unique [1]. Their analysis also reveals how asymmetric costs can lead to large-scale centralization in blockchain mining, a phenomenon that has been observed in Bitcoin and Ethereum and highlighted by other studies including [2,3]. In this work we introduce a novel family of mining reward functions, HaPPY-Mine (HAsh-Pegged Proportional Yield), which peg the value of the reward to the hashrate of the system, decreasing the reward as the hashrate increases. HaPPY-Mine distributes rewards in proportion to expended hashrate and inherits the safety properties of the generalized proportional reward function established in [4]. We study HaPPY-Mine under a heterogeneous miner cost model and show that an equilibrium always exists with a unique set of miner participants and a unique total hashrate. Significantly, we prove that a HaPPY-Mine equilibrium is more decentralized than the static model equilibrium under a set of metrics including number of mining participants and hashrate distribution. Finally, we show that any HaPPY-Mine equilibrium is also safe against collusion and sybil attacks, and explore how the market value of the currency affects the equilibrium.
|
March 31, 11 AM, ISEC 655 (in-person)/Zoom |
LaKyah Tyner
Property-preserving hashing (PPH) consists of a family of compressing hash functions h such that, for any two inputs x, y, we can correctly identify whether some property P(x, y) holds given only the digests h(x), h(y). In a basic PPH, correctness should hold with overwhelming probability over the choice of h when x, y are chosen a-priori and independently of h, while in an adversarially robust PPH (RPPH), correctness must hold even when x, y are chosen adversarially and adaptively depending on h. Here, we study (R)PPH for the property that the Hamming distance between x and y is at most t. The notion of (R)PPH was introduced by Boyle, LaVigne and Vaikuntanathan (ITCS '19), and further studied by Fleischhacker, Simkin (Eurocrypt '21) and Fleischhacker, Larsen, Simkin (Eurocrypt'22). In this work, we obtain improved constructions that are conceptually simpler, have nearly optimal parameters, and rely on more general assumptions than prior works. Our results are:
|
March 24, 11 AM, Zoom/ISEC 655 (screening) |
Hongyang Zhang
We live in a society with vast amounts of information. New information content such as news articles, technical reports, short videos, and images are created on the Internet every day. These contents have led to the creation of many public datasets and (massive) deep neural networks trained on them. A broad research question is whether and how we can utilize numerous datasets and models to build better ML models. This requires rethinking the underlying algorithms and the learning-theoretic foundations. This talk will describe my lab's recent works towards addressing these questions. I will first describe the problem of transfer learning from pretrained deep neural networks and our approach to achieve robust transfer. Then, I will present a PAC-Bayesian analysis of the generalization performance of our algorithm: The key idea being the measurement of Hessians. Lastly, I will talk about the problem of negative transfer in multitask learning. I will close my talk with ongoing and future directions about sharpness, distribution shifts, and building generalizable models across multiple tasks. |
Fall Semester, 2021
September 28, 12 PM, ISEC 655 |
Omri Ben-Eliezer, MIT
Streaming algorithms are traditionally divided into two categories: deterministic algorithms and randomized ones. These two types of algorithms exhibit a tradeoff between robustness and efficiency; on one hand, deterministic algorithms always return a correct output, but for many important streaming problems, they are provably inefficient. On the other hand, efficient randomized algorithms are known for a very wide range of problems, but their correctness typically relies on the assumption that the stream is fixed in advance, an assumption that is commonly unrealistic in practice. In this talk, I will focus on a middle ground that combines both robustness and efficiency: adversarially robust algorithms, whose output is correct with high probability even when the stream updates are adaptively chosen as a function of previous outputs. This regime has received surprisingly little attention until recently, and many intriguing problems are still open. I will mention some of the recent results, discussing algorithms that are well-suited for the adversarially robust regime, algorithms that are not robust, and efficient techniques to turn algorithms that work in the standard static setting into robust streaming algorithms. The results demonstrate strong connections between streaming and other areas such as differential privacy, online learning, cryptography, and adaptive data analysis. Based on joint works with Noga Alon, Yuval Dagan, Talya Eden, Rajesh Jayaram, Shay Moran, Moni Naor, Krzysztof Onak, David Woodruff, and Eylon Yogev. (https://arxiv.org/abs/1906.11327, https://arxiv.org/abs/2003.14265, https://arxiv.org/abs/2101.09054, https://arxiv.org/abs/2109.03785) |
October 5, 12 PM, ISEC 655 |
Harm Derksen, NEU
Suppose we are given a matrix whose entries depend linearly on some variables. Edmonds' problem from 1967 asks for the rank of such a matrix over the field of rational functions. Plugging in random values for the variables gives a good probabilistic algorithm, but it is not known whether there is a deterministic polynomial time algorithm for finding this rank. Surprisingly, there exists a deterministic polynomial time algorithm if the variables are not assumed to commute. This also leads to an interesting notion of non-commutative rank. I will discuss the commutative and non-commutative Edmonds' problem, and how work of Visu Makam and myself has contributed to this area. |
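The probabilistic algorithm alluded to above is easy to sketch: substitute random values for the variables and compute an ordinary numerical rank, which with high probability equals the rank over the field of rational functions. A small numpy illustration, assuming the symbolic matrix is given by its coefficient matrices:

```python
# Randomized rank estimation for Edmonds' problem: plug random values into the
# symbolic matrix A(x) = sum_i x_i * coefficient_matrices[i] and take the
# numerical rank; with high probability this equals the rank over the field of
# rational functions.
import numpy as np

def random_substitution_rank(coefficient_matrices, trials=5, rng=0):
    rng = np.random.default_rng(rng)
    best = 0
    for _ in range(trials):
        x = rng.integers(1, 10**6, size=len(coefficient_matrices))
        A = sum(xi * Mi for xi, Mi in zip(x, coefficient_matrices))
        best = max(best, np.linalg.matrix_rank(A))
    return best

# A(x, y) = [[x, y], [y, x]] has rank 2 as a matrix of rational functions.
M1 = np.array([[1, 0], [0, 1]])   # coefficient of x
M2 = np.array([[0, 1], [1, 0]])   # coefficient of y
print(random_substitution_rank([M1, M2]))  # 2
```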
October 12, 12 PM, ISEC 655 |
Lydia Zakynthinou, NEU
Covariance-aware mean estimation is a fundamental problem in statistics, where we are given n i.i.d. samples from a d-dimensional distribution with mean $\mu$ and covariance $\Sigma$ and the goal is to find an estimator $\hat\mu$ with small error $\|\hat\mu-\mu\|_{\Sigma}\leq \alpha$, where $\|\cdot\|_{\Sigma}$ denotes the Mahalanobis distance. It is known that the empirical mean of the dataset achieves this guarantee if we are given at least $n=\Omega(d/\alpha^2)$ samples. The empirical mean and other statistical estimators can reveal sensitive information about the samples of the training dataset. To protect the privacy of the individuals who participate in the dataset, we study statistical estimators which satisfy differential privacy, a condition that has become a standard criterion for individual privacy in statistics and machine learning. In this talk, I will present two new differentially private mean estimators for d-dimensional (sub)Gaussian distributions with unknown covariance whose sample complexity is optimal up to logarithmic factors and matches the non-private one in many parameter regimes. All previous estimators with the same guarantee either require strong a priori bounds on the covariance matrix or require $\Omega(d^{3/2})$ samples. We will then focus on one of the estimators, which is based on a simple idea of perturbing the empirical mean of the dataset with noise calibrated to the empirical covariance, and we will explain the challenges and technical steps involved in making this estimator differentially private.
|
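The basic idea behind one of the estimators, as described in the abstract, can be sketched in a few lines of numpy: output the empirical mean plus Gaussian noise shaped by the empirical covariance. The multiplier `noise_mult` below is a placeholder; calibrating it, and the additional steps needed to actually achieve differential privacy, are the technical content of the talk.

```python
# Sketch of "empirical mean + noise calibrated to the empirical covariance".
# The scaling of the noise is a placeholder, NOT a differential privacy
# guarantee; the talk explains how to calibrate and privatize this estimator.
import numpy as np

def covariance_calibrated_mean(X, noise_mult=0.1, rng=0):
    rng = np.random.default_rng(rng)
    n, d = X.shape
    mu_hat = X.mean(axis=0)
    sigma_hat = np.cov(X, rowvar=False)
    noise_cov = (noise_mult / n) ** 2 * sigma_hat   # placeholder scaling
    return mu_hat + rng.multivariate_normal(np.zeros(d), noise_cov)

rng = np.random.default_rng(1)
X = rng.multivariate_normal([1.0, -2.0], [[2.0, 0.5], [0.5, 1.0]], size=500)
print(covariance_calibrated_mean(X))
```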
October 19, 12 PM, ISEC 655 |
William Kuszmaul, MIT
The linear-probing hash table is one of the oldest and most widely used data structures in computer science. However, linear probing also famously comes with a major drawback: as soon as the hash table reaches a high memory utilization, elements within the hash table begin to cluster together, causing insertions to become slow. This phenomenon, now known as "primary clustering", was first captured by Donald Knuth in 1963; at a load factor of $1 - 1/x$, the expected time per insertion becomes $\Theta(x^2)$, rather than the more desirable $\Theta(x)$. We show that there is more to the story than the classic analysis would seem to suggest. It turns out that small design decisions in how deletions are implemented have dramatic effects on the asymptotic performance of insertions. If these design decisions are made correctly, then even a hash table that is continuously at a load factor $1 - \Theta(1/x)$ can achieve average insertion time $\tilde{O}(x)$. A key insight is that the tombstones left behind by deletions cause a surprisingly strong "anti-clustering" effect, and that when insertions and deletions are one-for-one, the anti-clustering effects of deletions actually overpower the clustering effects of insertions. We also present a new variant of linear probing, which we call "graveyard hashing", that completely eliminates primary clustering on any sequence of operations. If, when an operation is performed, the current load factor is $1 - 1/x$ for some $x$, then the expected cost of the operation is $O(x)$. One corollary is that, in the external-memory model with a data block size of $B$, graveyard hashing offers the following remarkable guarantee: at any load factor $1-1/x$ satisfying $x = o(B)$, graveyard hashing achieves $1 + o(1)$ expected block transfers per operation. Past external-memory hash tables have only been able to offer a $1+o(1)$ guarantee when the block size $B$ is at least $\Omega(x^2)$. Based on joint work with Michael A. Bender and Bradley C. Kuszmaul. https://arxiv.org/abs/2107.01250 To appear in FOCS 2021. |
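To make the role of deletions concrete, here is a toy Python linear-probing table that marks deleted slots with tombstones instead of emptying them, the design choice whose anti-clustering effect the talk analyzes (graveyard hashing itself is not implemented here).

```python
# Toy linear-probing hash table with tombstone deletions.
EMPTY, TOMBSTONE = object(), object()

class LinearProbingTable:
    def __init__(self, capacity=16):
        self.slots = [EMPTY] * capacity

    def _probe(self, key):
        i = hash(key) % len(self.slots)
        while True:
            yield i
            i = (i + 1) % len(self.slots)

    def insert(self, key):
        for i in self._probe(key):
            if self.slots[i] is EMPTY or self.slots[i] is TOMBSTONE:
                self.slots[i] = key
                return
            if self.slots[i] == key:
                return

    def delete(self, key):
        for i in self._probe(key):
            if self.slots[i] is EMPTY:
                return
            if self.slots[i] == key:
                self.slots[i] = TOMBSTONE   # leave a tombstone, don't empty the slot
                return

    def contains(self, key):
        for i in self._probe(key):
            if self.slots[i] is EMPTY:
                return False
            if self.slots[i] == key:
                return True

t = LinearProbingTable()
for k in [3, 19, 35]:        # these keys collide modulo 16 and form a run
    t.insert(k)
t.delete(19)
print(t.contains(35), t.contains(19))   # True False
```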
October 26, 12 PM, ISEC 655 | No talk scheduled |
November 2, 12 PM |
Konstantina Bairaktari, NEU
A challenge in fair algorithm design is that, while there are compelling notions of individual fairness, these notions typically do not satisfy desirable
composition properties, and downstream applications based on fair classifiers might not preserve fairness. To study fairness under composition, Dwork and Ilvento (2019)
introduced an archetypal problem called the fair-cohort-selection problem, where a single fair classifier is composed with itself to select a group of candidates of a given size,
and proposed a solution to this problem.
|
November 912PMISEC 655 |
Peter Ivanov, NEU
We study a simple and general template for constructing affine extractors by composing a linear transformation with resilient functions. Using this we show that good affine extractors can
be computed by non-explicit circuits of various types, including AC0-Xor circuits: AC0 circuits with a layer of parity gates at the input. We also show that one-sided extractors can be
computed by small DNF-Xor circuits and separate these circuits from other well-studied classes. As further motivation for studying DNF-Xor circuits we show that if they can approximate
inner product then small AC0-Xor circuits can compute it exactly – a long-standing open problem.
|
November 1612PMISEC 655 |
Maryam Aliakbarpour, BU+NEU
In this talk, we present a new setting for testing properties of distributions while receiving samples from several distributions, but few samples per distribution. The main motivation
for considering this setting is that it captures data collection in the real world. We give a brief description of our testers for the following problems in this setting, when given
samples from s distributions, p_1, p_2, . . . , p_s: (1) Uniformity Testing: Testing whether all the p_i’s are uniform or eps-far from being uniform in \ell_1-distance (2) Identity Testing:
Testing whether all the p_i’s are equal to an explicitly given distribution q or eps-far from q in \ell_1-distance, and (3) Closeness Testing: Testing whether all the p_i’s are equal to a
distribution q which we have sample access to, or eps-far from q in \ell_1-distance. By assuming an additional natural condition about the source distributions, we provide sample optimal
testers for all of these problems.
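For context, the classical single-distribution uniformity tester counts collisions among the samples and compares the empirical collision rate with the value 1/n attained by the uniform distribution. The Python sketch below illustrates that baseline only; the acceptance threshold and sample size are illustrative assumptions, and the testers in this talk, which must cope with samples spread across many source distributions, are different.

from itertools import combinations
import random

def collision_uniformity_test(samples, domain_size, eps):
    """Accept if the empirical collision rate looks consistent with uniformity.

    The uniform distribution has collision probability exactly 1/n, while a
    distribution eps-far from uniform in L1 has a noticeably larger L2 norm;
    the threshold (1 + eps^2/2)/n below is a simple illustrative choice.
    """
    pairs = list(combinations(range(len(samples)), 2))
    collisions = sum(samples[i] == samples[j] for i, j in pairs)
    collision_rate = collisions / len(pairs)
    return collision_rate <= (1 + eps ** 2 / 2) / domain_size

random.seed(0)
n = 100
uniform_samples = [random.randrange(n) for _ in range(1000)]
print(collision_uniformity_test(uniform_samples, n, eps=0.5))   # accepts with high probability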
|
November 2312PMISEC 655 |
Willy Quach, NEU
The talk will be about so-called lossy functions, a family of cryptographic primitives with a wide variety of applications. While the original notion of lossy trapdoor functions,
introduced by Peikert and Waters (STOC '08), has found countless applications, it is a relatively high-end primitive that can only be built from ``public-key'' computational assumptions.
In this work, we define and explore a relaxation that we call targeted lossy functions. We show that such a relaxation is meaningful by (1) building it from substantially weaker assumptions
(namely one-way functions and variants) and (2) showing several applications.
|
November 3012PMISEC 655 |
Chara Podimata, Harvard
We study contextual search, a generalization of binary search in higher dimensions, which captures settings such as feature-based dynamic pricing. Standard game-theoretic formulations of this
problem assume that agents act in accordance with a specific behavioral model. In practice, however, some agents may not subscribe to the dominant behavioral model or may act in ways that seem
to be arbitrarily irrational. Existing algorithms heavily depend on the behavioral model being (approximately) accurate for all agents and have poor performance in the presence of even a few such
arbitrarily irrational agents. We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying behavioral model. In particular, we provide
two algorithms, one based on multidimensional binary search methods, and one based on gradient descent. We show that these algorithms attain near-optimal regret guarantees in the absence of
irrational agents and their performance degrades gracefully with the number of such agents, providing the first results for contextual search in any adversarial noise model. Our techniques
draw inspiration from learning theory, game theory, high-dimensional geometry, and convex analysis.
|
December 712PMISEC 655 |
Benjamin Lovitz, U. Waterloo
Tensors are natural generalizations of matrices to higher-way arrays. Decompositions of tensors into sums of product (rank-one) tensors are useful in many areas for compressing
and interpreting the information stored in a tensor. The tensor rank, which naturally generalizes matrix rank, is the smallest number of product tensors that can decompose a
given tensor. In contrast to matrices, tensor rank decompositions are often unique (up to trivialities). Uniqueness is useful in applications, as it corresponds to a unique
interpretation of the information stored in a tensor. I will review a concrete application of uniqueness in unsupervised learning. |
Spring Semester, 2020
February 5 |
Jinshuo Dong, UPenn
It is important to understand the exact privacy guarantee provided by private algorithms developed in the past prosperous decade of differential privacy. In particular, any underestimation of the actual privacy means unnecessary noise in the algorithm and loss in the final accuracy. We observe a central limit behavior in iterative private algorithms, which demonstrates the limits of the common $(\varepsilon,\delta)$ parametrization in most application scenarios. As a remedy, a new notion called Gaussian Differential Privacy (GDP) is proposed and a complete toolkit is developed. We carry out various experiments to show how much unnecessary loss of accuracy can be avoided in deep learning applications. Based on joint work with Aaron Roth, Weijie Su, Zhiqi Bu and Qi Long. |
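For readers who want to relate GDP back to the usual parametrization, the sketch below evaluates the conversion from a $\mu$-GDP guarantee to an $(\varepsilon,\delta)$ guarantee using the formula $\delta(\varepsilon) = \Phi(-\varepsilon/\mu + \mu/2) - e^{\varepsilon}\,\Phi(-\varepsilon/\mu - \mu/2)$ reported in the GDP line of work; the exact expression should be checked against the paper and is used here only as an illustration.

from math import exp
from scipy.stats import norm

def gdp_to_delta(mu, eps):
    """delta(eps) implied by a mu-GDP guarantee (formula assumed from the GDP papers)."""
    return norm.cdf(-eps / mu + mu / 2) - exp(eps) * norm.cdf(-eps / mu - mu / 2)

# Example: a 1-GDP mechanism viewed at eps = 1.
print(gdp_to_delta(mu=1.0, eps=1.0))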
February 1212:30PMISEC 655 |
Shibani Santurkar, MIT
We present a new perspective on the phenomenon of wide-spread vulnerability of current machine learning models to adversarial examples. Specifically, we discuss how the existence of adversarial examples stems, in large part, from a fundamental misalignment in the way humans and machine learning models solve classification tasks. This view also brings to light some benefits of robust models—such as their ability to learn representations that are more human-aligned. Overall, this perspective highlights the utility of adversarial robustness as a goal beyond the security context. |
February 1912:30PMISEC 655 |
Ramesh Krishnan S. Pallavoor, Boston University
We study the problem of approximating the distance to monotonicity of Boolean functions over the hypercube domain. For a function f:{0,1}^n -> {0,1}, the distance to monotonicity of f is the minimum fraction of points at which f has to be modified to make it monotone. We give a nonadaptive algorithm that, given f:{0,1}^n -> {0,1} whose distance to monotonicity is at least \alpha, makes poly(n, 1/\alpha) queries and returns an O(\sqrt{n} polylog n)-multiplicative approximation to the distance to monotonicity of f. Furthermore, we show that for any constant k > 0, approximating the distance to monotonicity up to n^{0.5 - k}-factor requires 2^{n^k} nonadaptive queries, thereby ruling out a poly(n, 1/\alpha)-query nonadaptive algorithm for such approximations. This answers a question of Seshadhri (Property Testing Review ’14) for the case of nonadaptive algorithms. Approximating the distance to a property is equivalent to tolerantly testing that property. Our lower bound stands in contrast to standard (non-tolerant) testing of monotonicity that can be done nonadaptively with O(\sqrt{n} polylog n) queries. We obtain our lower bound by proving an analogous bound for erasure-resilient testers. Our method yields the same lower bounds for other properties (unateness and being a k-junta). These lower bounds improve exponentially on the existing lower bounds for these properties. This talk is based on a joint work with Sofya Raskhodnikova and Erik Waingarten that appeared in SODA 2020. |
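As a reminder of the quantity being approximated, the following brute-force Python sketch computes the exact distance to monotonicity for tiny n by enumerating all functions on the hypercube; it only illustrates the definition and is unrelated to the query-efficient algorithms and lower bounds in the talk.

from itertools import product

def is_monotone(f, n):
    """f maps each point of {0,1}^n (as a tuple) to {0,1}."""
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]
                if f[x] > f[y]:          # a violated hypercube edge certifies non-monotonicity
                    return False
    return True

def distance_to_monotonicity(f, n):
    """Minimum fraction of points whose values must change to make f monotone."""
    points = list(product((0, 1), repeat=n))
    best = len(points)
    for values in product((0, 1), repeat=len(points)):   # enumerate all 2^(2^n) functions
        g = dict(zip(points, values))
        if is_monotone(g, n):
            best = min(best, sum(f[p] != g[p] for p in points))
    return best / len(points)

# Example on n = 3: the anti-dictator function f(x) = 1 - x_0.
n = 3
f = {x: 1 - x[0] for x in product((0, 1), repeat=n)}
print(distance_to_monotonicity(f, n))   # 0.5 for this function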
February 2612:30PMISEC 655 |
Gabrielle De Micheli, INRIA
The discrete logarithm problem is one of the two hard mathematical problems at the foundation of today's public key cryptography. In this presentation, I will talk about the discrete logarithm problem at the boundary case between small and medium characteristic finite fields, which is precisely the area where finite fields used in pairing-based cryptosystems live. In order to evaluate the security of pairing-based protocols, we thoroughly analyze the complexity of all the algorithms that coexist at this boundary case: the Quasi-Polynomial algorithms, the Number Field Sieve and its many variants, and the Function Field Sieve. We adapt the latter to the particular case where the extension degree is composite, and show how to lower the complexity by working in a shifted function field. All this study finally allows us to give precise values for the characteristic asymptotically achieving the highest security level for pairings. |
Fall Semester, 2019
October 21:30PMISEC 655 |
Tanay Mehta, Northeastern University
Consider the query reformulation problem where a user enters the shopping query “anxiety toy rotating” into an e-commerce website. This query has no textual similarity to “fidget spinners” which is, in fact, the product the user is looking for. Traditionally, such language processing tasks are tackled using memory-based architectures (LSTMs). We solve the problem of query reformulation by adopting a first principles approach. We propose a simple theoretical model of query generation, AttEST, that views both products and n-grams as vectors embedded in a latent attribute space. We realize a practical implementation of AttEST as a feedforward neural network with an attention mechanism. The AttEST network beats the memory-based architectures by over 20 percentage points (on the F-1 score). We explain this out-performance by giving a theoretical justification for attention as the best linear unbiased estimator. Joint work with M. Dippel, A. Kiezun, R. Sundaram, S. Thirumalai and A. Varma |
October 912:00PMISEC 655 |
Mark Bun, Boston University
We investigate the problem of differentially private hypothesis selection: Given i.i.d. samples from an unknown probability distribution P and a set of m probability distributions H, the goal is to privately output a distribution from H whose total variation distance to P is comparable to that of the best such distribution. We present several algorithms for this problem which achieve sample complexity similar to those of the best non-private algorithms. These give new and improved learning algorithms for a number of natural distribution classes. Our results also separate the sample complexities of private mean estimation under product vs. non-product distributions. |
October 1612:00PMISEC 655 |
Talya Eden, MIT
In this talk, I will discuss the problem of sampling edges in an unknown graph G = (V,E) from a distribution that is pointwise almost uniform over E. Given query access to a graph G over n vertices and m edges, together with an upper bound on the arboricity of G, I will describe an algorithm that performs, in expectation, a number of queries depending on n, m and the arboricity bound, and returns an edge of the graph such that every edge is sampled with probability pointwise close to uniform. We also show that this query complexity is tight (up to poly-logarithmic factors and the dependence on the arboricity). If time permits, I will also discuss a simple algorithm for estimating the number of edges in graphs with bounded arboricity. This talk is based on joint work with Dana Ron and Will Rosenbaum. |
October 2312:00PMISEC 655 |
Archita Agarwal, Brown University
Distributed hash tables (DHT) are a fundamental building block in the design of distributed systems with applications ranging from content distribution networks to off-chain storage networks for blockchains and smart contracts. When DHTs are used to store sensitive information, system designers use end-to-end encryption to guarantee the confidentiality of their data. A prominent example is Ethereum’s off-chain network Swarm. In this work, we formalize the use of end-to-end encryption in DHTs and the many systems they support. We introduce the notion of an encrypted DHT and provide simulation-based security definitions that capture the security properties one would desire from such a system. Using our definitions, we then analyze the security of a standard approach to storing encrypted data in DHTs. Interestingly, we show that this “standard scheme” leaks information probabilistically, where the probability is a function of how well the underlying DHT load balances its data. We also show that to be securely used with the standard scheme, a DHT needs to satisfy a form of equivocation with respect to its overlay. To show that these properties are indeed achievable in practice, we study the balancing properties of the Chord DHT—arguably the most influential DHT—and show that it is equivocable with respect to its overlay in the random oracle model. Finally, we consider the problem of encrypted DHTs in the context of transient networks, where nodes are allowed to leave and join. |
October 3012:00PMISEC 655 |
Nishanth Dikkala, MIT
Statistical learning theory has largely focused on learning and generalization given independent and identically distributed samples. Motivated by settings where data is sampled on a network or a spatial domain, we provide learning and generalization bounds for data that are complexly dependent, yet whose distribution satisfies Dobrushin's condition. Indeed, we show that the standard complexity measures of Gaussian and Rademacher complexities and VC dimension are sufficient measures of complexity for the purposes of bounding the generalization error and learning rates of hypothesis classes in our setting. Moreover, our generalization bounds only degrade by constant factors compared to their i.i.d. analogs, and our learnability bounds degrade by log factors in the size of the training set. This is joint work with my advisor Costis Daskalakis, Yuval Dagan and Siddhartha Jayanti. |
November 612:00PMISEC 655 |
Tal Wagner, MIT
We prove tight bounds on the bit complexity of approximately representing an n-point Euclidean metric space: we determine the number of bits that is both sufficient and necessary for recovering each pairwise distance up to a small multiplicative distortion, as a function of the number of points, the distortion, and the aspect ratio of the metric space (the ratio of the largest to the smallest distance). This improves over the bound that follows from the dimension reduction theorem of Johnson and Lindenstrauss (1984), and over a previous result of the authors (SODA 2017). In words, while the dimension reduction theorem states that each point can be represented by a small number of coordinates, we show that it can be represented by that many bits. Joint work with Piotr Indyk. |
November 1312:00PMISEC 655 |
Chara Podimata, Harvard University
We study the problem of online learning in strategic classification settings from the perspective of the learner, who is repeatedly facing myopically rational strategic agents. We model this interplay as a repeated Stackelberg game, where at each timestep the learner deploys a high-dimensional linear classifier first, and an agent, after observing the classifier and taking into account his real feature vector and his underlying utility function, best-responds with a (potentially altered) feature vector. We measure the performance of the learner in terms of Stackelberg regret for her 0-1 loss function. Surprisingly, we prove that in strategic settings like the one considered in this paper there exist worst-case scenarios where any sequence of actions providing sublinear external regret might result in linear Stackelberg regret, and vice versa. We then provide the Grinder Algorithm, an adaptive discretization algorithm that is potentially of independent interest in the online learning community, and prove a data-dependent upper bound on its Stackelberg regret given oracle access, while remaining computationally efficient. We also provide a nearly matching lower bound for the problem of strategic classification. We complement our theoretical analysis with simulation results, which suggest that our algorithm outperforms the benchmarks, even given access to approximation oracles. Our results advance the known state-of-the-art results in the growing literature of online learning from revealed preferences, which has so far focused on “smoother” utility and loss functions from the perspective of the agents and the learner respectively. Joint work with Yiling Chen and Yang Liu. |
December 412:00PMISEC 655 |
Katerina Sotiraki, MIT
We study the search problem class PPAq defined as a modulo-q analog of the well-known polynomial parity argument class PPA introduced by Papadimitriou (JCSS 1994). Our first result shows that this class can be characterized in terms of PPAp for prime p. Our main result is to establish that an explicit version of a search problem associated to the Chevalley–Warning theorem is complete for PPAp for prime p. This problem is natural in that it does not explicitly involve circuits as part of the input. It is the first such complete problem for PPAp when p ≥ 3. We also discuss connections between Chevalley-Warning theorem and the well-studied short integer solution problem and survey the structural properties of PPAq. Joint work with Mika Göös, Pritish Kamath and Manolis Zampetakis |
December 1112:00PMISEC 655 |
Josh Alman, Harvard University
We study the known techniques for designing Matrix Multiplication algorithms. The two main approaches are the Laser method of Strassen, and the Group theoretic approach of Cohn and Umans. We define a generalization based on restrictions of powers of tensors which subsumes these two approaches, which we call the Universal method. We then prove concrete lower bounds on the algorithms one can design by applying the Universal method to many different tensors. Our proofs use new tools for upper bounding the asymptotic slice rank of a wide range of tensors. Our main result is that the Universal method applied to any Coppersmith-Winograd tensor (the family of tensors used in all record-holding algorithms from the past 30+ years) cannot yield a bound on omega, the exponent of matrix multiplication, better than 2.16805. No background knowledge about matrix multiplication algorithms is assumed. Some of the talk is based on joint work with Virginia Vassilevska Williams. |
Spring Semester, 2019
January 301:00PMWVH 366 |
Hugo Alves Akitaya, Tufts University
An embedding of a graph $G$ is a drawing $\varphi:G\rightarrow M$ of $G$ on a surface $M$ such that each vertex in $V(G)$ is mapped to a distinct point and the edges in $E(G)$ are mapped to interior-disjoint Jordan arcs between the corresponding vertices. A weak embedding is a map $\varphi:G\rightarrow M$ such that, for every $\epsilon>0$, there exists an embedding $\psi_\epsilon:G\rightarrow M$ with $\|\varphi-\psi_\epsilon\|<\epsilon$, where $\|.\|$ is the uniform norm (i.e., $\sup$ norm). We say that the embedding $\psi_\epsilon$ approximates $\varphi$. The study of weak embeddings lies at the interface of several independent lines of research in mathematics and computer science. In particular, it generalizes various graph visualization models and is a special case of the notoriously difficult cluster-planarity problem. It is also related to the foldability problem in computational origami. This talk presents several results on the algorithmic complexity of recognizing weak embeddings. |
February 131:00PMWVH 366 |
Paul Hand, Northeastern University
Deep generative modeling has led to new and state of the art approaches for enforcing structural priors in a variety of inverse problems. In contrast to priors given by sparsity, deep models can provide direct low-dimensional parameterizations of the manifold of images or signals belonging to a particular natural class, allowing for recovery algorithms to be posed in a low-dimensional space. This dimensionality may even be lower than the sparsity level of the same signals when viewed in a fixed basis. In this talk, we will show rigorous recovery guarantees for solving inverse problems under a learned generative prior. First, we will discuss convergence guarantees for compressive sensing under random neural network priors. Then, we will show that generative priors allow for a significant advance to be made in the problem of compressive phase retrieval. To date, no known computationally efficient algorithm exists for solving phase retrieval under a sparsity prior at sample complexity proportional to the signal complexity. With generative priors, we establish a new approach for compressive phase retrieval and establish rigorous guarantees with sample complexity proportional to the signal complexity. |
February 201:00PMWVH 366 |
Ram Ramanathan, Chief Scientist, goTenna
Decentralized infrastructure-less wireless networks can provide instantly-deployable and robust connectivity in the aftermath of natural disasters, in remote areas, and in other contexts such as the Internet-of-Things. After briefly introducing goTenna and multi-hop wireless networking, I will discuss the tradeoffs involved in designing such networks focusing in particular on connectivity properties using insights from percolation theory. I will then consider one particular problem, namely network-wide packet broadcasting, which is closely related to the Connected Dominating Set problem in graphs. I will present a fully distributed zero-control protocol for broadcasting and discuss its correctness and performance. Finally, I will touch upon some research challenges in the area such as incentivizing network growth using cryptocurrencies. |
February 271:00PMWVH 366 |
Jarosław Błasiok, Harvard University
The distinct elements problem is one of the fundamental problems in streaming algorithms --- given a stream of integers in the range $\{1,\ldots,n\}$, we wish to provide a $(1+\varepsilon)$ approximation to the number of distinct elements in the input. After a long line of research, an optimal solution for this problem with constant probability of success, using $\mathcal{O}(\frac{1}{\varepsilon^2}+\log n)$ bits of space, was given by Kane, Nelson and Woodruff in 2010. The standard approach to achieving low failure probability $\delta$ is to run $\log \delta^{-1}$ parallel repetitions of the original algorithm and report the median of the computed answers. We show that such a multiplicative space blow-up is unnecessary: we provide an optimal algorithm using $\mathcal{O}(\frac{\log \delta^{-1}}{\varepsilon^2} + \log n)$ bits of space --- matching known lower bounds for this problem. That is, the $\log\delta^{-1}$ factor does not multiply the $\log n$ term. This settles completely the space complexity of the distinct elements problem with respect to all standard parameters. We also consider the \emph{strong tracking} (or \emph{continuous monitoring}) variant of the distinct elements problem, where we want an algorithm that provides an approximation of the number of distinct elements seen so far, at all times of the stream. We show that this variant can be solved using $\mathcal{O}(\frac{\log \log n + \log \delta^{-1}}{\varepsilon^2} + \log n)$ bits of space, which we show to be optimal. |
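To make the problem concrete, here is a short Python sketch of a classical bottom-k (KMV) estimator for the number of distinct elements; it illustrates the space/accuracy trade-off of sketching but is not the optimal algorithm of Kane, Nelson and Woodruff, nor the new algorithm from the talk, and the hash construction is an assumption made to keep the example self-contained.

import hashlib

def _hash_to_unit(x):
    """Map an item to a pseudo-random point in (0, 1] via a fixed hash (illustrative)."""
    h = int(hashlib.sha256(str(x).encode()).hexdigest(), 16)
    return (h + 1) / float(2 ** 256 + 1)

def kmv_distinct_estimate(stream, k=256):
    """Bottom-k estimator: maintain the k smallest distinct hash values seen so far."""
    smallest = set()
    threshold = 1.0
    for item in stream:
        v = _hash_to_unit(item)
        if len(smallest) >= k and v >= threshold:
            continue                         # cannot be among the k smallest
        smallest.add(v)
        if len(smallest) > k:
            smallest.discard(max(smallest))  # evict the current largest value
            threshold = max(smallest)
    if len(smallest) < k:
        return len(smallest)                 # fewer than k distinct items: exact count
    return (k - 1) / max(smallest)           # k-th smallest hash value ~ k / #distinct

# Example: a stream with 10,000 distinct items, each repeated three times.
stream = [i % 10_000 for i in range(30_000)]
print(round(kmv_distinct_estimate(stream, k=512)))   # close to 10000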
March 271:00PMWVH 366 |
Christina Ilvento, Harvard University
There has been much discussion recently about how fairness should be measured or enforced in classification. Individual Fairness [Dwork et al. 2012], which requires that similar individuals be treated similarly, is a highly appealing definition because it gives very strong guarantees on treatment of individuals. Unfortunately, the need for a task-specific similarity metric has prevented its use in practice. In this work, we set out a framework for approximating a metric for Individual Fairness based on human judgements. Our model assumes that we have access to a human fairness arbiter, who can answer a limited set of queries concerning similarity of individuals for a particular task, is free of explicit biases and possesses sufficient domain knowledge to evaluate similarity. Our contributions include useful definitions for metric approximation relevant for Individual Fairness, constructions for approximations from a limited number of realistic queries to the arbiter on a fixed sample of individuals, and learning procedures to construct hypotheses for metric approximations which generalize to unseen samples under certain assumptions of learnability of distance threshold functions. |
April 101:00PMWVH 366 |
Oxana Poburinnaya, Boston University
While standard encryption guarantees secrecy of the encrypted plaintext only against an attacker that has no knowledge of the communicating parties’ keys and randomness of encryption, deniable encryption [Canetti et al., Crypto’96] provides the additional guarantee that the plaintext remains secret even in face of entities that attempt to coerce (or bribe) the communicating parties to expose their internal states, including the plaintexts, keys and randomness. To achieve this guarantee, deniable encryption equips the parties with faking algorithms which allow them to generate fake keys and randomness that make the ciphertext appear consistent with any plaintext of the parties’ choice. To date, however, only partial results were known: Either deniability against coercing only the sender, or against coercing only the receiver [Sahai-Waters, STOC ‘14] or schemes satisfying weaker notions of deniability [O’Neil et al., Crypto ‘11]. In this paper we present the first fully bideniable interactive encryption scheme, thus resolving this 20-year-old open problem. Our scheme also provides an additional and new guarantee: Even if the sender claims that one plaintext was used and the receiver claims a different one, the adversary has no way of figuring out who is lying - the sender, the receiver, or both. This property, which we call off-the-record deniability, is useful when the parties don’t have the means to agree on what fake plaintext to claim, or when one party defects against the other. Our protocol has three messages, which is optimal [Bendlin et al., Asiacrypt’11], and needs a globally available reference string. We assume subexponential indistinguishability obfuscation (IO) and one-way functions. Joint work with Ran Canetti and Sunoo Park |
April 171:00PMISEC 655 |
Dimitris Tsipras, MIT
Recent progress in machine learning has made the deployment of ML systems in the real world an imminent possibility. But are our current machine learning systems really up to the task? In this talk, I will discuss the brittleness and vulnerability of the existing ML toolkit. I will then describe a conceptual framework that aims to deliver models that are more reliable, and robust to adversarial manipulation. Finally, I will outline how this framework constitutes a new learning paradigm, how it differs from the classic perspective, and what new challenges it gives rise to. |
April 241:00PMWVH 366 |
Mitali Bafna, Harvard University
In the last several years, neural networks have made unprecedented achievements on computational learning tasks like image classification. Despite their remarkable success, neural networks have been shown to be brittle in the presence of adversarial noise. Many effective attacks have been proposed in the context of computer vision that reliably generate small perturbations to input images (sometimes imperceptible to humans) that drastically change the network’s classification of the image. In this work, we give a framework for improving the robustness of classifiers to adversaries with L_0 noise budgets. That is, adversaries are restricted in the number of features they can corrupt, but may corrupt each arbitrarily. Our framework is based on a new Sparse Discrete Fourier transform (DFT) that is robust to worst-case L_0 noise added in the time domain. We give a new algorithm for approximating the DFT of an approximately sparse signal that has been corrupted by worst-case L_0 noise, namely a bounded number of coordinates of the signal have been corrupted arbitrarily. Our techniques generalize to a wide range of linear transformations that are used in data analysis such as the Discrete Cosine and Sine transforms, the Hadamard transform, and their high-dimensional analogs. We use our algorithm to successfully defend against well known L_0 adversaries in the setting of image classification. We give experimental results on the Jacobian-based Saliency Map Attack (JSMA) and the Carlini Wagner (CW) L_0 attack on the MNIST and Fashion-MNIST datasets as well as the Adversarial Patch on the ImageNet dataset. |
Fall Semester, 2018
September 1312:00PM150 Forsyth |
Eylon Yogev, Weizmann Institute of Science
The class TFNP is the search analog of NP with the additional guarantee that any instance has a solution. TFNP has attracted extensive attention due to its natural syntactic subclasses that capture the computational complexity of important search problems from algorithmic game theory, combinatorial optimization and computational topology. Thus, one of the main research objectives in the context of TFNP is to search for efficient algorithms for its subclasses, and at the same time proving hardness results where efficient algorithms cannot exist. Currently, no problem in TFNP is known to be hard under assumptions such as NP hardness, the existence of one-way functions, or even public-key cryptography. The only known hardness results are based on less general assumptions such as the existence of collision-resistant hash functions, one-way permutations, or less established cryptographic primitives (e.g., program obfuscation or functional encryption). Several works explained this status by showing various barriers to proving hardness of TFNP. In particular, it has been shown that hardness of TFNP cannot be based on worst-case NP hardness, unless NP=coNP. Therefore, we ask the following question: What is the weakest assumption sufficient for showing hardness in TFNP? In this talk, I will answer this question and show that hard-on-average TFNP problems can be based on the weak assumption that there exists a hard-on-average language in NP. In particular, this includes the assumption of the existence of one-way functions. In terms of techniques, there is an interesting interplay between problems in TFNP and derandomization techniques. Based on joint works with Pavel Hubáček and Moni Naor. |
September 2012:00PM150 Forsyth |
Alina Ene, Boston University
In this talk, we present recent progress on understanding the tradeoff between the approximation guarantee and adaptivity for submodular maximization. The adaptivity of an algorithm is the number of sequential rounds of queries it makes to the evaluation oracle of the function, where in every round the algorithm is allowed to make polynomially-many parallel queries. Adaptivity is an important consideration in settings where the objective function is estimated using samples and in applications where adaptivity is the main running time bottleneck. We present nearly-optimal algorithms for submodular maximization subject to a cardinality constraint and more generally, subject to multiple packing constraints. This talk is based on joint work with Huy Nguyen (Northeastern) and Adrian Vladu (Boston University). |
September 2712:00PM655 ISEC |
Lucianna Kiffer (internal)
The celebrated Nakamoto consensus protocol ushered in several new consensus applications including cryptocurrencies. A few recent works have analyzed important properties of blockchains, including, most significantly, consistency, which is a guarantee that all honest parties output the same sequence of blocks throughout the execution of the protocol. To establish consistency, the prior analysis of Pass, Seeman and shelat required a careful counting of certain combinatorial events that was difficult to apply to variations of Nakamoto. Work by Garay, Kiayias, and Leonardos provides another method of analyzing the blockchain under the simplifying assumption that the network is synchronous. The contribution of this paper is the development of a simple Markov-chain based method for analyzing consistency properties of blockchain protocols. The method includes a formal way of stating strong concentration bounds as well as easy ways to concretely compute the bounds. We use our new method to answer a number of basic questions about consistency of blockchains:
In each case, we use our framework to experimentally analyze the consensus bounds for various network delay parameters and adversarial computing percentages. We hope our techniques enable authors of future blockchain proposals to provide a more rigorous analysis of their schemes. |
October 412:00PM150 Forsyth |
Maryam Aliakbarpour, MIT
In this talk, we will discuss new directions in testing properties of distributions. Distributions are at the core of data-related scientific endeavors, and due to their importance, they have been studied for past decades in Statistics and Computer Science. The goal is to find an algorithm to test the properties of distributions with provable performance guarantees, using as few samples as possible. We discuss the fundamental problems of identity (goodness of fit) and closeness (equivalence) testing concerning the new challenges:
|
October 1112:00PM150 Forsyth |
Tanay Mehta (internal)
We propose a new model for the study of resilience of coevolving multiplex scale-free networks. Our network model, called preferential interdependent networks, is a novel continuum over scale-free networks parameterized by their correlation ρ, 0 ≤ ρ ≤ 1. Our failure and recovery model ties the propensity of a node, both to fail and to assist in recovery, to its importance. We show, analytically, that our network model can achieve any γ,2 ≤ γ ≤ 3 for the exponent of the power law of the degree distribution; this is superior to existing multiplex models and allows us better fidelity in representing real-world networks. Our failure and recovery model is also a departure from the much studied cascading error model based on the giant component; it allows for surviving important nodes to send assistance to the damaged nodes to enable their recovery. This better reflects the reality of recovery in man-made networks such as social networks and infrastructure networks. Our main finding, based on simulations, is that resilient preferential interdependent networks are those in which the layers are neither completely correlated (ρ = 1) nor completely uncorrelated (ρ = 0) but instead semi-correlated (ρ ≈ 0.1 − 0.3). This finding is consistent with the real-world experience where complex man-made networks typically bounce back quickly from stress. In an attempt to explain our intriguing empirical discovery we present an argument for why semi-correlated multiplex networks can be the most resilient. Our argument can be seen as an explanation of plausibility or as an incomplete mathematical proof subject to certain technical conjectures that we make explicit. Joint work with Auroop Ganguly, Ravi Sundaram and Devesh Tiwari |
October 1812:00PM655 ISEC |
Manolis Zampetakis, MIT
We consider the classical statistical problem of estimating to arbitrary accuracy the parameters of a multidimensional Gaussian distribution under the following real-world obstacles: (1) truncated samples, (2) inhomogeneous population.
Abstracting the use of iterative algorithms in the previous important statistical problems, we consider the general question of analyzing the convergence of iterative algorithms. We prove that contraction maps provide a universal analysis tool for proving global convergence of iterative maps. Based on joint works with: Constantinos Daskalakis, Themis Gouleakis and Christos Tzamos. |
November 112:00PM655 ISEC |
Ran Cohen (internal)
Secure multiparty computation (MPC) on incomplete communication networks has been studied within two primary models: (1) Where a partial network is fixed a priori, and thus corruptions can occur dependent on its structure, and (2) Where edges in the communication graph are determined dynamically as part of the protocol. Whereas a rich literature has succeeded in mapping out the feasibility and limitations of graph structures supporting secure computation in the fixed-graph model (including strong classical lower bounds), these bounds do not apply in the latter dynamic-graph setting, which has recently seen exciting new results, but remains relatively unexplored. In this work, we initiate a similar foundational study of MPC within the dynamic-graph model. As a first step, we investigate the property of graph expansion. All existing protocols (implicitly or explicitly) yield communication graphs which are expanders, but it is not clear whether this is inherent. Our results consist of two types (for constant fraction of corruptions):
More generally, we provide a formal framework for analyzing the evolving communication graph of MPC protocols, giving a starting point for studying the relation between secure computation and further, more general graph properties. |
November 812:00PM655 ISEC |
Seth Neel, University of Pennsylvania
We develop theory for using heuristics to solve computationally hard problems in differential privacy. Heuristic approaches have enjoyed tremendous success in machine learning, for which performance can be empirically evaluated. However, privacy guarantees cannot be evaluated empirically, and must be proven — without making heuristic assumptions. We show that learning problems over broad classes of functions — those that have polynomially sized universal identification sets — can be solved privately and efficiently, assuming the existence of a non-private oracle for solving the same problem. Our first algorithm yields a privacy guarantee that is contingent on the correctness of the oracle. We then give a reduction which applies to a class of heuristics which we call certifiable, which allows us to convert oracle-dependent privacy guarantees to worst-case privacy guarantees that hold even when the heuristic standing in for the oracle might fail in adversarial ways. Finally, we consider classes of functions for which both they and their dual classes have small universal identification sets. This includes most classes of simple boolean functions studied in the PAC learning literature, including conjunctions, disjunctions, parities, and discrete halfspaces. We show that there is an efficient algorithm for privately constructing synthetic data for any such class, given a non-private learning oracle. This in particular gives the first oracle-efficient algorithm for privately generating synthetic data for contingency tables. The most intriguing question left open by our work is whether or not every problem that can be solved differentially privately can be privately solved with an oracle-efficient algorithm. While we do not resolve this, we give a barrier result that suggests that any generic oracle-efficient reduction must fall outside of a natural class of algorithms (which includes the algorithms given in this paper). |
November 1512:00PM655 ISEC |
Albert Cheu (internal)
We consider the problem of designing scalable, robust protocols for computing statistics about sensitive data. Specifically, we look at how best to design differentially private protocols in a distributed setting, where each user holds a private datum. The literature has mostly considered two models: the “central” model, in which a trusted server collects users’ data in the clear, which allows greater accuracy; and the “local” model, in which users individually randomize their data, and need not trust the server, but accuracy is limited. Attempts to achieve the accuracy of the central model without a trusted server have so far focused on variants of cryptographic MPC, which limits scalability. In this paper, we initiate the analytic study of a shuffled model for distributed differentially private algorithms, which lies between the local and central models. This simple-to-implement model, a special case of the ESA framework of [Bittau et al., '17], augments the local model with an anonymous channel that randomly permutes a set of user-supplied messages. For sum queries, we show that this model provides the power of the central model while avoiding the need to trust a central server and the complexity of cryptographic secure function evaluation. More generally, we give evidence that the power of the shuffled model lies strictly between those of the central and local models: for a natural restriction of the model, we show that shuffled protocols for a widely studied selection problem require exponentially higher sample complexity than do central-model protocols. |
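As an illustration of the model, the sketch below runs a shuffled protocol for a binary sum: each user applies randomized response to their bit, the shuffler permutes the reports, and the analyzer debiases the count. The flip probability and debiasing formula are the textbook randomized-response choices, shown only to depict the pipeline; the protocols and the privacy analysis in the paper are more involved.

import random

def local_randomizer(bit, p_flip):
    """Each user flips their bit with probability p_flip (randomized response)."""
    return bit ^ 1 if random.random() < p_flip else bit

def shuffled_sum_protocol(bits, p_flip=0.25):
    reports = [local_randomizer(b, p_flip) for b in bits]
    random.shuffle(reports)                  # the anonymizing shuffler
    observed = sum(reports)                  # the analyzer only sees the shuffled multiset
    n = len(bits)
    # Debias: E[observed] = p_flip * n + (1 - 2 * p_flip) * true_sum.
    return (observed - p_flip * n) / (1 - 2 * p_flip)

random.seed(1)
bits = [1] * 300 + [0] * 700
print(shuffled_sum_protocol(bits))           # roughly 300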
November 2912:00PM655 ISEC |
Ilias Zadik, MIT
Motivated by growing concerns over ensuring privacy on social networks, we develop new algorithms and impossibility results for fitting complex statistical models to network data subject to rigorous privacy guarantees. We consider the so-called node-differentially private algorithms, which compute information about a graph or network while provably revealing almost no information about the presence or absence of a particular node in the graph. We provide new algorithms for node-differentially private estimation for a popular and expressive family of network models: stochastic block models and their generalization, graphons. Our algorithms improve on prior work, reducing their error quadratically and matching, in many regimes, the optimal nonprivate algorithm. We also show that for the simplest random graph models (G(n, p) and G(n, m)), node-private algorithms can be qualitatively more accurate than for more complex models—converging at a rate of $1/(\epsilon^2 n^3)$ instead of $1/(\epsilon^2 n^2)$. This result uses a new extension lemma for differentially private algorithms that we hope will be broadly useful. |
December 1312:00PM655 ISEC |
Alexander Russell, University of Connecticut
Bitcoin has ushered in a new era of interest in cryptocurrencies and distributed algorithms for consensus. While Bitcoin provides a spectacular solution in this space, it uses a "proof-of-work" mechanism that makes quite awesome resource demands: the protocol is projected to burn approximately 70TWh next year, as much as the country of Austria. Proof-of-stake is an alternative framework that has the potential to remove these energy demands. However, proof-of-stake protocols face numerous analytic challenges that do not exist in the proof-of-work setting. We will begin with an overview of the Bitcoin system, focusing on the proof-of-work mechanism. We will then describe the constructions and analyses of the Ouroboros blockchains, the first blockchain protocols providing provable security in the proof-of-stake setting. These new blockchains are now deployed as a part of the Cardano/ADA cryptocurrency. The analysis shows off a number of new analytic features of interest and makes connections to coupled Markov chains, combinatorics, and the UC-model of protocol security. This is joint work with Bernardo David, Aggelos Kiayias, Peter Gazi, and Roman Oliynikov. |
Spring Semester, 2018
Jan 16, 20183:30PM |
Steven Wu, MSR
Bandit learning is characterized by the tension between long-term exploration and short-term exploitation. However, as has recently been noted, in settings in which the choices of the learning algorithm correspond to important decisions about individual people (such as criminal recidivism prediction, lending, and sequential drug trials), exploration corresponds to explicitly sacrificing the well-being of one individual for the potential future benefit of others. This raises a fairness concern. In such settings, one might like to run a “greedy” algorithm, which always makes the (myopically) optimal decision for the individuals at hand — but doing this can result in a catastrophic failure to learn. In this paper, we consider the linear contextual bandit problem and revisit the performance of the greedy algorithm. We give a smoothed analysis, showing that even when contexts may be chosen by an adversary, small perturbations of the adversary’s choices suffice for the algorithm to achieve “no regret”, perhaps (depending on the specifics of the setting) with a constant amount of initial training data. This suggests that “generically” (i.e. in slightly perturbed environments), exploration and exploitation need not be in conflict in the linear setting. |
Jan 23, 20183:30PM |
Michael Mitzenmacher
We present cuckoo filters, an efficient data structure for approximate set membership that improves on the well-known Bloom filter. We then discuss recent work on how to make cuckoo filters adaptive in response to false positives, which can be important for many practical problems. |
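For readers new to the data structure, here is a minimal Python sketch of a (non-adaptive) cuckoo filter using the standard partial-key layout, where each item stores a short fingerprint and its two candidate buckets are related by XOR-ing a hash of that fingerprint; the bucket size, fingerprint length and eviction limit are illustrative choices, and the adaptivity mechanisms discussed in the talk are not included.

import random
import hashlib

def _h(x):
    return int(hashlib.sha256(str(x).encode()).hexdigest(), 16)

class CuckooFilter:
    def __init__(self, num_buckets=1024, bucket_size=4, fp_bits=12, max_kicks=500):
        # num_buckets must be a power of two so that the XOR pairing below is symmetric.
        self.num_buckets, self.bucket_size, self.max_kicks = num_buckets, bucket_size, max_kicks
        self.fp_mask = (1 << fp_bits) - 1
        self.buckets = [[] for _ in range(num_buckets)]

    def _fingerprint(self, item):
        return (_h(("fp", item)) & self.fp_mask) or 1     # avoid the all-zero fingerprint

    def _indices(self, item, fp):
        i1 = _h(("idx", item)) % self.num_buckets
        i2 = (i1 ^ _h(("alt", fp))) % self.num_buckets    # partial-key cuckoo hashing
        return i1, i2

    def insert(self, item):
        fp = self._fingerprint(item)
        i1, i2 = self._indices(item, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets full: evict a random fingerprint and relocate it.
        i = random.choice((i1, i2))
        for _ in range(self.max_kicks):
            j = random.randrange(len(self.buckets[i]))
            fp, self.buckets[i][j] = self.buckets[i][j], fp
            i = (i ^ _h(("alt", fp))) % self.num_buckets  # alternate bucket of the evicted fp
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False                                      # filter is considered full

    def contains(self, item):
        fp = self._fingerprint(item)
        i1, i2 = self._indices(item, fp)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

cf = CuckooFilter()
for w in ("bloom", "cuckoo", "filter"):
    cf.insert(w)
print(cf.contains("cuckoo"), cf.contains("quotient"))     # True, (almost surely) False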
Jan 30, 20183:30PM |
Audra McMillan, University of Michigan
Physical sensors (thermal, light, motion, etc.) are becoming ubiquitous and offer important benefits to society. However, allowing sensors into our private spaces has resulted in considerable privacy concerns. Differential privacy has been developed to help alleviate these privacy concerns. In this talk, we’ll develop and define a framework for releasing physical data that preserves both utility and provides privacy. Our notion of closeness of physical data will be defined via the Earth Mover Distance and we’ll discuss the implications of this choice. Physical data, such as temperature distributions, are often only accessible to us via a linear transformation of the data. We’ll analyse the implications of our privacy definition for linear inverse problems, focusing on those that are traditionally considered to be "ill-conditioned”. We’ll then instantiate our framework with the heat kernel on graphs and discuss how the privacy parameter relates to the connectivity of the graph. Our work indicates that it is possible to produce locally private sensor measurements that both keep the exact locations of the heat sources private and permit recovery of the ``general geographic vicinity'' of the sources. Joint work with Anna C. Gilbert. |
Feb 13, 20183:30PM |
Jack Doerner, Northeastern University
We design and implement a Distributed Oblivious Random Access Memory (DORAM) data structure that is optimized for use in two-party secure computation protocols. We improve upon the access time of previous constructions by a factor of up to ten, their memory overhead by a factor of one hundred or more, and their initialization time by a factor of thousands. We are able to instantiate ORAMs that hold 2^34 bytes, and perform operations on them in seconds, which was not previously feasible with any implemented scheme. Unlike prior ORAM constructions based on hierarchical hashing, permutation, or trees, our Distributed ORAM is derived from the new Function Secret Sharing scheme introduced by Boyle, Gilboa and Ishai. This significantly reduces the amount of secure computation required to implement an ORAM access, albeit at the cost of O(n) efficient local memory operations. We implement our construction and find that, despite its poor O(n) asymptotic complexity, it still outperforms the fastest previously known constructions, Circuit ORAM and Square-root ORAM, for datasets that are 32 KiB or larger, and outperforms prior work on applications such as stable matching or binary search by factors of two to ten. |
Feb 27, 20183:30PM |
Albert Cheu, Northeastern University
The multi-armed bandit setting is a well-studied learning model. Within this setting, the approximate best arm identification problem has been solved optimally, using a clever technique to avoid a union bound. In our work, we pose a variant problem and present an optimal algorithm, which also avoids a union bound. We give intuition for the solution and, time permitting, the matching impossibility result. Joint work with Ravi Sundaram and Jonathan Ullman. |
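For background, the sketch below shows a naive elimination-style baseline for approximate best-arm identification (sample every surviving arm, drop the empirically worse half, repeat with a doubled budget); its analysis is exactly the kind that goes through a union bound, unlike the algorithms in the talk, and the schedule used here is an illustrative choice.

import random

def halving_best_arm(pull, num_arms, budget_per_round=100):
    """Successive halving: keep the empirically better half of the arms each round.

    `pull(i)` returns a stochastic reward of arm i in [0, 1].
    """
    arms = list(range(num_arms))
    t = budget_per_round
    while len(arms) > 1:
        means = {i: sum(pull(i) for _ in range(t)) / t for i in arms}
        arms.sort(key=lambda i: means[i], reverse=True)
        arms = arms[: max(1, len(arms) // 2)]   # eliminate the empirically worse half
        t *= 2                                   # sample the survivors more accurately
    return arms[0]

# Example: Bernoulli arms, arm 3 is best.
random.seed(0)
probs = [0.4, 0.5, 0.45, 0.7, 0.3]
best = halving_best_arm(lambda i: float(random.random() < probs[i]), len(probs))
print(best)   # 3 with high probability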
March 20, 20183:30PM |
Ryan Williams
I will give an overview of the recent proof (with my PhD student Cody Murray) that the class NQP = NTIME$(n^{\mathrm{polylog}(n)})$ does not have ACC circuits of $n^{\log n}$ size (even with a layer of threshold gates at the bottom). This is achieved by strengthening a key component of the proof of NEXP not in ACC; in particular, we prove a new "Easy Witness Lemma" for NQP. |
March 27, 20183:30PM |
Nithin Varma, Boston University
Property testing is a formal framework to design algorithms for large datasets. The traditional definition of property testing [Rubinfeld & Sudan '96, Goldreich, Goldwasser & Ron '98] abstracts datasets as functions and assumes that an algorithm (also called tester) can query points in the domain of a function to obtain the corresponding values. The task of the tester is to distinguish, with probability 2/3, between the case that the function f being queried belongs to a set (property) P and the case that f is `far' from every function in P, for an appropriate measure of distance. One of the key assumptions in the above definition, that the tester has access to function values at all of the queried domain points, is not applicable to most real-life datasets. It could be the case that some or several of the function values are kept hidden for privacy reasons, or erased by mistake or by an adversary. Querying such erased points does not provide an algorithm with any information about the function. Moreover, an algorithm that does not know the locations of these erasures might spend many of its queries on erased points and turn out to be entirely useless. In the talk, I will describe the erasure-resilient property testing model that we defined to account for the occurrence of erasures in datasets. We will then present and analyse our erasure-resilient tester for monotonicity of functions. Joint work with Kashyap Dixit, Sofya Raskhodnikova and Abhradeep Thakurta. |
April 3, 20183:30PM |
Aanchal Malhotra, Boston University
The security of almost any real-world distributed system today depends on the participants having some "reasonably accurate" sense of current real time. Indeed, to name one example, the very authenticity of practically any communication on the Internet today hinges on the ability of the parties to accurately detect revocation of certificates, or expiration of passwords or shared keys. However, as recent attacks show, the standard protocols for determining time are subvertible, resulting in wide-spread security loss. Worse yet, we do not have security notions for network time protocols that (a) can be rigorously asserted and (b) rigorously guarantee security of applications that require a sense of real time. We propose such notions, within the universally composable (UC) security framework. That is, we formulate ideal functionalities that capture a number of prevalent forms of time measurement within existing systems. We show how they can be realized by real-world protocols, and how they can be used to assert security of time-reliant applications --- specifically, certificates with revocation and expiration times. This allows for relatively clear and modular treatment of the use of time in security-sensitive systems. Our modeling and analysis are done within the existing UC framework, in spite of its asynchronous, event-driven nature. This allows incorporating the use of real time within the existing body of analytical work done in this framework. In particular it allows for rigorous incorporation of real time within cryptographic tools and primitives. With Ran Canetti, Kyle Hogan, and Mayank Varia |
Fall Semester, 2017
Dec 06, 201712:00PM |
Prashant Vasudevan, MIT
We present functions that are hard to compute on average for algorithms running in some fixed polynomial time, assuming widely-conjectured worst-case hardness of certain problems from the study of fine-grained complexity. We discuss the relevance of such average-case hardness to cryptography and present, as an illustration, an outline of a proof-of-work protocol constructed based on the hardness and certain structural properties of our functions. Joint work with Marshall Ball, Alon Rosen and Manuel Sabin. |
Nov 29, 201712:00PM |
Ran Cohen, NEU
In the setting of secure multiparty computation (MPC), a set of mutually distrusting parties wish to jointly compute a function, while guaranteeing the privacy of their inputs and the correctness of the output. An MPC protocol is called fully secure if no adversary can prevent the honest parties from obtaining their outputs. A weaker definition is fairness, which ensures that an adversary can prematurely abort the computation only before learning any new information. We present highly efficient transformations from fair computations to fully secure computations, assuming the fraction of honest parties is constant (e.g., 1% of the parties are honest). One application of these transformations is a new coin-flipping protocol, whose round complexity has a super-logarithmic dependency on the security parameter, improving over the linear dependency on the number of parties in Beimel, Omri, and Orlov (Crypto 2010). A second application is a new fully secure protocol for computing the Boolean OR function, with a super-constant round complexity, improving over the protocol of Gordon and Katz (TCC 2009) whose round complexity is linear in the number of corrupted parties. Finally, we prove that for some functions, a super-constant number of invocations of the fair computation is necessary for computing the function in a fully secure manner. This is joint work with Eran Omri, Iftach Haitner, and Lior Rotem. |
Nov 15, 201712:00PM |
Wolfgang Gatterbauer, NEU
We develop upper and lower bounds for the probability of monotone Boolean functions by treating multiple occurrences of variables as independent and assigning them new individual probabilities. We call this approach "dissociation" and give an exact characterization of optimal oblivious bounds, i.e. when the new probabilities are chosen independent of the probabilities of all other variables. Our motivation comes from the weighted model counting problem (or, equivalently, the problem of computing the probability of a Boolean function), which is #P-hard in general. By performing several dissociations, one can transform a Boolean formula whose probability is difficult to compute, into one whose probability is easy to compute, and which is guaranteed to provide an upper or lower bound on the probability of the original formula by choosing appropriate probabilities for the dissociated variables. Our bounds shed light on the connection between previous relaxation-based and model-based approximations and unify them as concrete choices in a larger design space. We also show how our theory allows a standard relational database management system to both upper and lower bound hard probabilistic queries in guaranteed polynomial time. Based on joint work with Dan Suciu from TODS 2014. |
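A tiny worked example may help: take the monotone DNF F = (x AND y) OR (x AND z), dissociate the shared variable x into independent copies x1, x2 that keep the original probability, and compare the two probabilities by brute-force enumeration. That this particular disjunctive dissociation yields an upper bound is a special case of the oblivious bounds; the code below only verifies the example and is not the general construction from the paper.

from itertools import product

def prob(formula, probs):
    """Exact probability of a Boolean formula over independent variables (brute force)."""
    names = list(probs)
    total = 0.0
    for assignment in product((False, True), repeat=len(names)):
        a = dict(zip(names, assignment))
        if formula(a):
            weight = 1.0
            for v in names:
                weight *= probs[v] if a[v] else 1.0 - probs[v]
            total += weight
    return total

p = {"x": 0.5, "y": 0.5, "z": 0.5}
original = prob(lambda a: (a["x"] and a["y"]) or (a["x"] and a["z"]), p)

# Dissociate x into two independent copies carrying the same probability.
p_diss = {"x1": 0.5, "x2": 0.5, "y": 0.5, "z": 0.5}
dissociated = prob(lambda a: (a["x1"] and a["y"]) or (a["x2"] and a["z"]), p_diss)

print(original, dissociated, dissociated >= original)   # 0.375 0.4375 True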
Nov 08, 201712:00PM |
Siyao Guo, NEU
We initiate a quantitative study of the power of negations, asking how many negations are required to compute cryptographic primitives. We provide several lower bounds, some of them tight, for various cryptographic primitives and building blocks including one-way permutations, pseudorandom functions, small-bias generators, hard-core predicates, error-correcting codes, and randomness extractors. Among our results, we highlight the following.
In addition, we will mention some recent results in understanding the power of negations from multiple angles including Boolean formulas and property testing. Based on joint works with Clément L. Canonne, Elena Grigorescu, Ilan Komargodski, Akash Kumar, Tal Malkin, Igor C. Oliveira, Alon Rosen, and Karl Wimmer. |
Nov 01, 201712:00PM | Theory Lunch - no talk 142 ISEC |
Oct 25, 201712:00PM |
Mika Göös, Harvard
For any unsatisfiable CNF formula F that is hard to refute in the Resolution proof system, we show that a "lifted" two-party version of F is hard to refute in any proof system whose lines are computed by efficient communication protocols---or, equivalently, that a monotone function associated with F has large monotone circuit complexity. Our result extends to real monotone circuits, which yields new lower bounds for the Cutting Planes proof system. Joint work with Ankit Garg, Pritish Kamath, and Dmitry Sokolov. |
Oct 18, 201712:00PM |
Ali Vakilian, MIT
We study the Fractional Set Cover problem in the streaming model. That is, we consider the relaxation of the set cover problem over a universe of $n$ elements and a collection of $m$ sets, where each set can be picked fractionally, with a value in $[0, 1]$. We present a randomized $(1+ \epsilon)$-approximation algorithm that makes $p$ passes over the data, and uses $O(\mathrm{polylog}(m, n, 1/\epsilon)\cdot(m n^{O(1/(p \epsilon))} + n))$ memory space. The algorithm works in both the set arrival and the edge arrival models. To the best of our knowledge, this is the first streaming result for the fractional set cover problem. We obtain our results by employing the multiplicative weights update framework in the streaming setting. |
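To make the fractional relaxation concrete, the following Python sketch solves a small fractional set cover instance exactly as a linear program with scipy; it only illustrates the offline problem that the streaming algorithm approximates, and the tiny instance is made up for the example.

import numpy as np
from scipy.optimize import linprog

# Universe {0, 1, 2, 3} and four sets; x_j in [0, 1] is the fraction of set j we pick.
sets = [{0, 1}, {1, 2}, {2, 3}, {0, 3}]
n_elems, m_sets = 4, len(sets)

# Coverage constraints: for every element e, the sum of x_j over sets containing e is >= 1.
A = np.zeros((n_elems, m_sets))
for j, s in enumerate(sets):
    for e in s:
        A[e, j] = 1.0

res = linprog(c=np.ones(m_sets),           # minimize total fractional weight
              A_ub=-A, b_ub=-np.ones(n_elems),
              bounds=[(0, 1)] * m_sets, method="highs")
print(res.x, res.fun)                       # an optimal fractional cover and its value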
Oct 11, 201712:00PM |
Matt Dippel + Tanay Mehta
|
Oct 04, 201712:00PM |
Yashvanth Kondi
Garbled circuits are of central importance in cryptography, finding widespread application in secure computation, zero-knowledge (ZK) protocols, and verifiable outsourcing of computation, to name a few. We are interested in a particular kind of garbling scheme, termed privacy-free in the literature. We show that Boolean formulas can be garbled information-theoretically in the privacy-free setting, producing no ciphertexts at all. Existing garbling schemes either rely on cryptographic assumptions (and thus require cryptographic operations to construct and evaluate garbled circuits), produce garbled circuits of non-zero size, or are restricted to low-depth formulaic circuits. Our result has both theoretical and practical implications for garbled circuits as a primitive. On the theory front, our result breaks the known theoretical lower bound of one ciphertext for garbling an AND gate in this setting. As an interesting implication of producing size zero garbled circuits, our scheme achieves adaptive security for free. On the practical side, our garbling scheme involves only cheap XOR operations and produces size zero garbled circuits. As a side result, we propose several interesting extensions of our scheme. Namely, we show how to garble threshold and high fan-in gates. An aspect of our garbling scheme that we believe is of theoretical interest is that it does not maintain the invariant that the garbled circuit evaluator must not at any point be in possession of both keys of any wire in the garbled circuit. |
Sep 27, 201712:00PM |
Ariel Hamlin
Protected database search systems cryptographically isolate the roles of reading from, writing to, and administering the database. This separation limits unnecessary administrator access and protects data in the case of system breaches. Since protected search was introduced in 2000, the area has grown rapidly; systems are offered by academia, start-ups, and established companies. However, there is no best protected search system or set of techniques. Design of such systems is a balancing act between security, functionality, performance, and usability. This challenge is made more difficult by ongoing database specialization, as some users will want the functionality of SQL, NoSQL, or NewSQL databases. This database evolution will continue, and the protected search community should be able to quickly provide functionality consistent with newly invented databases. At the same time, the community must accurately and clearly characterize the tradeoffs between different approaches. To address these challenges, we provide the following contributions: 1) An identification of the important primitive operations across database paradigms. We find there are a small number of base operations that can be used and combined to support a large number of database paradigms. 2) An evaluation of the current state of protected search systems in implementing these base operations. This evaluation describes the main approaches and tradeoffs for each base operation. Furthermore, it puts protected search in the context of unprotected search, identifying key gaps in functionality. 3) An analysis of attacks against protected search for different base queries. 4) A roadmap and tools for transforming a protected search system into a protected database, including an open-source performance evaluation platform and initial user opinions of protected search. |
Spring Semester, 2017
Apr 27, 20173:30PM |
Omri Weinstein, Columbia
We prove the first super-logarithmic lower bounds on the cell probe complexity of dynamic *boolean* (a.k.a. decision) data structure problems, a long-standing milestone in data structure lower bounds. We introduce a new technique and use it to prove an $\tilde{\Omega}(\log^{1.5} n)$ lower bound on the operational time of a wide range of boolean data structure problems, most notably, on the query time of dynamic range counting *over $GF_2$* ([Pat07]). Proving an $\omega(\log n)$ lower bound for this problem was explicitly posed as one of five important open problems in the late Mihai Patrascu's obituary [Tho13]. This result also implies the first $\omega(\log n)$ lower bound for the classical 2D range counting problem, one of the most fundamental data structure problems in computational geometry and spatial databases. We derive similar lower bounds for boolean versions of dynamic polynomial evaluation and 2D "rectangle stabbing", and for the (non-boolean) problems of range selection and range median. Our technical centerpiece is a new way of "weakly" simulating dynamic data structures using efficient *one-way* communication protocols with small advantage over random guessing. This simulation involves a surprising excursion to low-degree (Chebyshev) polynomials, which may be of independent interest, and offers an entirely new algorithmic angle on the "cell sampling" method of Panigrahy et al. [PTW10]. Joint work with Kasper Green-Larsen and Huacheng Yu. |
Apr 20, 20173:30PM |
Ankur Moitra, MIT
Starting from the seminal works of Tukey (1960) and Huber (1964), the field of robust statistics asks: Are there estimators that provably work in the presence of noise? The trouble is that all known provably robust estimators are also hard to compute in high-dimensions. Here, we study a basic problem in robust statistics, posed in various forms in the above works. Given corrupted samples from a high-dimensional Gaussian, are there efficient algorithms to accurately estimate its parameters? We give the first algorithms that are able to tolerate a constant fraction of corruptions that is independent of the dimension. Additionally, we give several more applications of our techniques to product distributions and various mixture models. This is based on joint work with Ilias Diakonikolas, Jerry Li, Gautam Kamath, Daniel Kane and Alistair Stewart. |
Apr 13, 20173:30PM | No seminar |
Apr 06, 20173:30PM | No seminar |
Mar 30, 20173:30PM |
Ariel Hamlin and Lucianna Kiffer
|
Mar 23, 20173:30PM |
Mehraneh Liaee, NEU
We study the problem of gossip in dynamic networks controlled by an adversary that can modify the network arbitrarily from one round to another, provided that the network is always connected. In the gossip problem, $n$ tokens are arbitrarily distributed among the $n$ network nodes, and the goal is to disseminate all the $n$ tokens to every node. Our focus is on token-forwarding algorithms, which do not manipulate tokens in any way other than storing, copying, and forwarding them. Gossip can be completed in linear time in any static network, but a basic open question for dynamic networks is the existence of a distributed protocol that can do significantly better than an easily achievable bound of $O(n^2)$ rounds. In previous work, it has been shown that under adaptive adversaries, every token forwarding algorithm requires $\Omega(n^2/\log n)$ rounds. In this paper, we study oblivious adversaries, which differ from adaptive adversaries in one crucial aspect: they are oblivious to random choices made by the protocol. We present an $\tilde{\Omega}(n^{3/2})$ lower bound under an oblivious adversary for RANDDIFF, a natural algorithm in which neighbors exchange a token chosen uniformly at random from the difference of their token sets. We also present an $\tilde{\Omega}(n^{4/3})$ bound under a stronger notion of oblivious adversary for symmetric knowledge-based algorithms. On the positive side, we present a centralized algorithm that completes gossip in $\tilde{O}(n^{3/2})$ rounds. We also show an $\tilde{O}(n^{5/3})$ upper bound for RANDDIFF in a restricted class of oblivious adversaries, which we call paths-respecting. |
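As a concrete illustration of RANDDIFF, the following sketch simulates the protocol on a small static graph: along every edge, each endpoint forwards a token chosen uniformly at random from the difference of its token set and its neighbor's. The static cycle, token placement, and round cap are illustrative assumptions; the lower bounds in the talk concern adversarially changing networks.

```python
import random

def randdiff_round(graph, tokens):
    """One synchronous round of RANDDIFF on an undirected graph.
    graph: dict node -> set of neighbors; tokens: dict node -> set of tokens held.
    Along every edge, each endpoint sends a token drawn uniformly at random
    from the difference of its token set and the other endpoint's token set."""
    incoming = {v: set() for v in graph}
    for u in graph:
        for v in graph[u]:
            diff = tokens[u] - tokens[v]
            if diff:
                incoming[v].add(random.choice(sorted(diff)))
    for v in graph:
        tokens[v] |= incoming[v]

def gossip(graph, tokens, max_rounds=10_000):
    """Run RANDDIFF until every node holds every token; return the round count."""
    universe = set().union(*tokens.values())
    for r in range(1, max_rounds + 1):
        randdiff_round(graph, tokens)
        if all(tokens[v] == universe for v in graph):
            return r
    return None

if __name__ == "__main__":
    random.seed(1)
    n = 8
    # A static cycle, used only for illustration; the paper's adversary rewires
    # the (always-connected) network every round.
    cycle = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    toks = {i: {i} for i in range(n)}   # token i starts at node i
    print("rounds to complete gossip:", gossip(cycle, toks))
```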
Mar 16, 20173:30PM |
Yael Kalai, MIT/MSR
We construct an interactive coding scheme, a notion introduced by Schulman (FOCS 1992, STOC 1993). Loosely speaking, we show how to convert any two-party interactive protocol into one that is resilient to a constant fraction of *insertion* and *deletion* errors, while preserving computational efficiency, and blowing up the communication complexity and the *round* complexity by a constant factor that approaches 1 as the error-rate approaches 0. Previous works were not concerned with the round complexity, and typically assumed that one bit is sent per round. Moreover, most previous works (with the exception of the recent work of Braverman et al.) considered only substitution errors or erasure errors. We consider the model where in each round each party may send a message of *arbitrary length*, where the length of the messages and the length of the protocol may be adaptive, and may depend on the private inputs of the parties and on previous communication. This model is known as the (synchronous) message passing model, is commonly used in distributed computing, and is the most common model used in cryptography. This is joint work with Klim Efremenko and Elad Haramaty. |
Mar 9, 20173:30PM | No seminar - Spring Break |
Mar 2, 20173:30PM |
Gautam Kamath, MIT
Given samples from an unknown distribution p, does it have some property, or is it far from every distribution possessing this property? This question has received enormous attention in statistics, information theory, and theoretical computer science. Some properties of interest include:
In this talk, I will discuss some recent results and directions on the frontiers of distribution testing. The main focus will be on testing whether a distribution belongs to a particular class, but I'll also talk about some newer directions, including testing on high-dimensional domains under structural assumptions (i.e., where the underlying distribution is a graphical model) and testing while maintaining differential privacy. Based on joint works with Jayadev Acharya, Bryan Cai, Constantinos Daskalakis, and Nishanth Dikkala. |
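One of the simplest instances of the testing paradigm above is the classical collision-based uniformity tester (a textbook example included for orientation, not one of the new results from the talk): estimate the collision probability of the sample and accept if it is close to the value 1/n achieved by the uniform distribution. The threshold and sample size below are illustrative rather than tight.

```python
import random
from itertools import combinations

def collision_statistic(samples):
    """Fraction of pairs of samples that collide; an unbiased estimate of sum_x p(x)^2."""
    m = len(samples)
    collisions = sum(1 for a, b in combinations(samples, 2) if a == b)
    return collisions / (m * (m - 1) / 2)

def uniformity_test(samples, n, eps):
    """Accept (True) if the samples look uniform over a domain of size n.
    If p is eps-far from uniform in L1 distance, then sum_x p(x)^2 >= (1 + eps^2)/n,
    so we threshold the collision estimate halfway between 1/n and (1 + eps^2)/n."""
    return collision_statistic(samples) <= (1 + eps**2 / 2) / n

if __name__ == "__main__":
    random.seed(0)
    n, eps, m = 100, 0.5, 2000   # sample size chosen generously for the demo
    uniform_samples = [random.randrange(n) for _ in range(m)]
    # A distribution that is far from uniform: all mass on the first n//2 elements.
    skewed_samples = [random.randrange(n // 2) for _ in range(m)]
    print("uniform accepted:", uniformity_test(uniform_samples, n, eps))
    print("skewed accepted: ", uniformity_test(skewed_samples, n, eps))
```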
Feb 21, 20173:30PM |
Jalaj Upadhyay, Penn State
The problem of low-rank factorization of an $m \times n$ matrix $A$ requires outputting a singular value decomposition: an $m \times k$ matrix $U$, an $n \times k$ matrix $V$, and a $k \times k$ diagonal matrix $D$ such that $U D V^T$ approximates the matrix $A$ in the Frobenius norm. In this talk, we study releasing a differentially-private low-rank factorization of a matrix in the general turnstile update model. We give a differentially-private algorithm instantiated with respect to a privacy level stronger than the privacy levels for this and related problems studied in previous works, namely those of Blocki et al. (FOCS 2012), Dwork et al. (STOC 2014), Hardt and Roth (STOC 2012, STOC 2013), and Hardt and Price (NIPS 2014). We consider two matrices $A$ and $A'$ as neighboring if $A - A'$ can be represented as an outer product of two unit vectors. Our private algorithm with respect to this privacy level incurs optimal additive error. We also prove a lower bound that shows that the space required by this algorithm is optimal up to a logarithmic factor. Based on the paper arXiv:1604.01429 |
Feb 16, 20173:30PM |
Adam Sealfon, MIT
Motivated by the goal of improving the concrete efficiency of secure multiparty computation (MPC), we study the possibility of implementing an infrastructure for MPC. We propose an infrastructure based on oblivious transfer (OT), which would consist of OT channels between some pairs of parties in the network. We devise information-theoretically secure protocols that allow additional pairs of parties to establish secure OT correlations using the help of other parties in the network in the presence of a dishonest majority. Our main technical contribution is an upper bound that matches a lower bound of Harnik, Ishai, and Kushilevitz (Crypto 2007), who studied the number of OT channels necessary and sufficient for MPC. In particular, we characterize which n-party OT graphs G allow t-secure computation of OT correlations between all pairs of parties, showing that this is possible if and only if the complement of G does not contain as a subgraph the complete bipartite graph $K_{n-t,n-t}$. Joint work with Ranjit Kumaresan and Srinivasan Raghuraman. |
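The graph-theoretic condition in the result can be checked directly for small networks. The brute-force sketch below is only a naive illustration of the stated condition (exponential in n, and not an algorithm from the paper); the example OT graphs are hypothetical.

```python
from itertools import combinations

def complement_has_biclique(adj, s):
    """Return True if the complement of the graph contains K_{s,s} as a subgraph,
    i.e. there are disjoint vertex sets A, B of size s with NO edge of the
    original graph between A and B. `adj` maps vertex -> set of neighbors."""
    vertices = list(adj)
    for A in combinations(vertices, s):
        rest = [v for v in vertices if v not in A]
        for B in combinations(rest, s):
            if all(b not in adj[a] for a in A for b in B):
                return True
    return False

def ot_correlations_achievable(adj, t):
    """Condition from the talk: t-secure OT correlations between all pairs are
    achievable iff the complement of the OT graph has no K_{n-t,n-t} subgraph."""
    n = len(adj)
    return not complement_has_biclique(adj, n - t)

if __name__ == "__main__":
    # Two toy 4-party OT graphs (undirected adjacency): a path of OT channels,
    # and two disjoint OT channels.
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
    two_edges = {0: {1}, 1: {0}, 2: {3}, 3: {2}}
    for name, g in [("path", path), ("two disjoint edges", two_edges)]:
        print(name, "-> t=2 achievable:", ot_correlations_achievable(g, 2))
```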
Feb 9, 20173:30PM | Talk cancelled due to weather |
Feb 2, 20173:30PM |
Constantinos Daskalakis, MIT
We provide global convergence guarantees for the expectation-maximization (EM) algorithm applied to mixtures of two Gaussians with known covariance matrices. We show that EM converges geometrically to the correct mean vectors, and provide simple, closed-form expressions for the convergence rate. As a simple illustration, we show that in one dimension ten steps of the EM algorithm initialized at infinity result in less than 1% error estimation of the means. Joint work with Christos Tzamos and Manolis Zampetakis. |
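For intuition about the iteration being analyzed, here is a minimal sample-based EM sketch for a balanced mixture of two one-dimensional Gaussians with known unit variance. The parameters, sample size, and initialization are illustrative choices; the talk's guarantees concern the population (infinite-sample) EM iteration.

```python
import math
import random

def em_two_gaussians(xs, mu1, mu2, iters=10):
    """EM for a balanced mixture 0.5*N(mu1,1) + 0.5*N(mu2,1) with known unit variances:
    the E-step computes posterior responsibilities, the M-step re-estimates the means."""
    for _ in range(iters):
        w1 = []
        for x in xs:
            a = math.exp(-0.5 * (x - mu1) ** 2)
            b = math.exp(-0.5 * (x - mu2) ** 2)
            w1.append(a / (a + b))            # P[component 1 | x]
        s1 = sum(w1)
        s2 = len(xs) - s1
        mu1 = sum(w * x for w, x in zip(w1, xs)) / s1
        mu2 = sum((1 - w) * x for w, x in zip(w1, xs)) / s2
    return mu1, mu2

if __name__ == "__main__":
    random.seed(0)
    true_mu = 2.0   # illustrative choice; the mixture is 0.5*N(+2,1) + 0.5*N(-2,1)
    xs = [random.gauss(true_mu if random.random() < 0.5 else -true_mu, 1.0)
          for _ in range(20000)]
    # Initialize the two means far apart (a finite stand-in for "initialized at infinity").
    print(em_two_gaussians(xs, 10.0, -10.0, iters=10))
```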
Jan 26, 20173:30PM | No talk |
Jan 19, 20173:30PM |
Stratis Ioannidis, Northeastern
Secure Function Evaluation (SFE) allows an interested party to evaluate a function over private data without learning anything about the inputs other than the outcome of this computation. This offers a strong privacy guarantee: SFE enables, e.g., a medical researcher, a statistician, or a data analyst, to conduct a study over private, sensitive data, without jeopardizing the privacy of the study's participants (patients, online users, etc.). Nevertheless, applying SFE to "big data" poses several challenges. First, beyond any computational overheads due to encryptions and decryptions, executing an algorithm securely may lead to a polynomial blowup in the total work compared to execution in the clear. For large datasets, even going from linear to quadratic complexity is prohibitive. Second, secure evaluations of algorithms should also maintain parallelizability: an algorithm that is easy to parallelize in the clear should also maintain this property in its SFE version, if its execution is to scale. This poses a challenge, as communication patterns between processors may reveal a lot about the private inputs. In this talk, we show that several machine learning and data mining algorithms can be executed securely while leading to only (a) a logarithmic blow-up in total work and (b) a logarithmic increase in the execution time when executed in parallel. Our techniques generalize to any algorithm that is graph-parallel, i.e., can be expressed through a sequence of scatter, gather, and apply operations. This includes several algorithms of practical interest, including page rank, matrix factorization, and training neural networks. |
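To illustrate the graph-parallel pattern mentioned at the end of the abstract, here is a tiny plaintext (non-secure) PageRank written as explicit scatter, gather, and apply phases. Only the shape of the computation matters here; the graph and damping factor are arbitrary illustrative choices, and nothing below is encrypted.

```python
def pagerank_gas(out_edges, damping=0.85, iters=30):
    """PageRank expressed as scatter / gather / apply rounds.
    out_edges: dict node -> list of out-neighbors (assumed non-empty for simplicity)."""
    nodes = list(out_edges)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Scatter: every node sends rank/out-degree along each outgoing edge.
        messages = {v: [] for v in nodes}
        for u in nodes:
            share = rank[u] / len(out_edges[u])
            for v in out_edges[u]:
                messages[v].append(share)
        # Gather: each node sums its incoming messages.
        gathered = {v: sum(messages[v]) for v in nodes}
        # Apply: each node updates its own rank from the gathered value.
        rank = {v: (1 - damping) / n + damping * gathered[v] for v in nodes}
    return rank

if __name__ == "__main__":
    # A small illustrative directed graph.
    g = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a", "c"]}
    for v, r in sorted(pagerank_gas(g).items()):
        print(v, round(r, 4))
```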
Jan 12, 20173:30PM |
Jonathan Ullman, Northeastern
Adaptivity is an important feature of data analysis - the choice of questions to ask about a dataset often depends on previous interactions with the same dataset. However, statistical validity is typically studied in a nonadaptive model, where all questions are specified before the dataset is drawn. Recent work by Dwork et al. (STOC, 2015) and Hardt and Ullman (FOCS, 2014) initiated a general formal study of this problem, and gave the first upper and lower bounds on the achievable generalization error for adaptive data analysis. Specifically, suppose there is an unknown distribution $P$ and a set of $n$ independent samples $x$ is drawn from $P$. We seek an algorithm that, given $x$ as input, accurately answers a sequence of adaptively chosen "queries" about the unknown distribution $P$. How many samples $n$ must we draw from the distribution, as a function of the type of queries, the number of queries, and the desired level of accuracy? We give new upper bounds on the number of samples n that are needed to answer statistical queries. The bounds improve and simplify the work of Dwork et al. (STOC, 2015), and have been applied in subsequent work by those authors (Science, 2015; NIPS, 2015). As in Dwork et al., our algorithms are based on a connection with algorithmic stability in the form of differential privacy. We extend their work by giving a quantitatively optimal, more general, and simpler proof of their main theorem that the stability notion guaranteed by differential privacy implies low generalization error. Joint work with Raef Bassily, Kobbi Nissim, Adam Smith, Thomas Steinke, and Uri Stemmer. |
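The connection to differential privacy can be made concrete with the standard Laplace mechanism for answering a single statistical query, shown below as a minimal sketch. This is the textbook mechanism rather than the paper's full construction, and the accuracy parameters are arbitrary demo values.

```python
import random

def answer_statistical_query(data, query, epsilon):
    """Answer a statistical query q: X -> [0,1] on a dataset with the Laplace mechanism.
    The empirical mean has sensitivity 1/n, so adding Laplace(1/(n*epsilon)) noise
    gives epsilon-differential privacy, a form of stability that implies generalization."""
    n = len(data)
    empirical_mean = sum(query(x) for x in data) / n
    noise = random.expovariate(1.0) - random.expovariate(1.0)   # standard Laplace sample
    return empirical_mean + noise / (n * epsilon)

if __name__ == "__main__":
    random.seed(0)
    data = [random.random() for _ in range(10000)]   # samples from the "unknown" P
    q = lambda x: float(x > 0.5)                      # an example statistical query
    for eps in (0.1, 1.0):
        print(f"epsilon={eps}: answer = {answer_statistical_query(data, q, eps):.4f}")
```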
Jan 03, 20172:00PM |
Li-Yang Tan, TTI-Chicago
We consider the fundamental derandomization problem of deterministically finding a satisfying assignment to a CNF formula that has many satisfying assignments. We give a deterministic algorithm which, given an $n$-variable $\mathrm{poly}(n)$-clause CNF formula $F$ that has $|F^{-1}(1)| \geq \epsilon 2^n$, runs in time $n^{\tilde{O}((\log\log n)^2)}$ for $\epsilon \geq 1/\mathrm{polylog}(n)$ and outputs a satisfying assignment of $F$. Prior to our work the fastest known algorithm for this problem was simply to enumerate over all seeds of a pseudorandom generator for CNFs; using the best known PRGs for CNFs [DETT10], this takes time $n^{\tilde{\Omega}(\log n)}$ even for constant $\epsilon$. Joint work with Rocco Servedio.
|
Fall Semester, 2016
Sep 21, 201612:30PM |
Daniel Wichs and Jack Doerner
|
Sep 28, 201612:30PM |
Yaron Singer, Harvard
In this talk we will consider the problem of maximizing a monotone submodular function under noise. There has been a great deal of work on optimization of submodular functions under various constraints, with many algorithms that provide desirable approximation guarantees. However, in many applications we do not have access to the function we aim to optimize, but rather to some erroneous or noisy version of it. This raises the question of whether provable guarantees are obtainable in presence of error and noise. We provide initial answers, by focusing on the question of maximizing a monotone submodular function under a cardinality constraint when given access to a noisy oracle of the function.
|
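For context, the sketch below is the classical noise-free greedy algorithm for maximizing a monotone submodular function (here a coverage function) under a cardinality constraint, which attains a (1 - 1/e)-approximation when f can be evaluated exactly. It is a baseline only; the talk asks what is achievable when f is accessed through a noisy oracle, which this code does not address. The ground set is an arbitrary toy example.

```python
def greedy_max_cover(sets, k):
    """Greedy maximization of the (monotone submodular) coverage function
    f(S) = |union of chosen sets|, subject to choosing at most k sets."""
    chosen, covered = [], set()
    for _ in range(k):
        best, best_gain = None, 0
        for name, elems in sets.items():
            if name in chosen:
                continue
            gain = len(elems - covered)       # marginal gain of adding this set
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:                      # no remaining set adds anything
            break
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

if __name__ == "__main__":
    ground_sets = {
        "A": {1, 2, 3, 4},
        "B": {3, 4, 5},
        "C": {5, 6, 7, 8},
        "D": {1, 8},
    }
    picked, covered = greedy_max_cover(ground_sets, k=2)
    print("picked:", picked, "covers", len(covered), "elements")
```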
Oct 05, 201612:15PM |
Huy Nguyen, Northeastern
In the turnstile $L_p$ heavy hitters problem with parameter $\epsilon$, one must maintain a high-dimensional vector $x \in \mathbb{R}^n$ subject to updates of the form $\mathrm{update}(i, \Delta)$, which increases $x_i$ by $\Delta$, where $i \in [n]$ and $\Delta$ is an integer. Upon receiving a query, the goal is to report every "heavy hitter" $i \in [n]$ with $|x_i| \geq \epsilon \|x\|_p$ as part of a list $L \subseteq [n]$ of size $O(1/\epsilon^p)$, i.e., proportional to the maximum possible number of heavy hitters. In fact we solve the stronger tail version, in which $L$ should include every $i$ such that $|x_i| \geq \epsilon \|x_{\mathrm{proj}(1/\epsilon^p)}\|_p$ and $|x_i| > 0$, where $x_{\mathrm{proj}(k)}$ denotes the vector obtained by zeroing out the largest $k$ entries of $x$ in magnitude. We provide a new algorithm, which in the most general turnstile model achieves optimal $O(\epsilon^{-p}\log n)$ space, $O(\log n)$ update time, and fast $O(\epsilon^{-p}\,\mathrm{poly}(\log n))$ query time, providing correctness with high probability. Our main innovation is an efficient reduction from the heavy hitters problem to a clustering problem in which each heavy hitter is encoded as some form of noisy spectral cluster in a much bigger graph, and the goal is to identify every cluster. Since every heavy hitter must be found, correctness requires that every cluster be found. We thus need a "cluster-preserving clustering" algorithm, which partitions the graph into clusters with the promise of not destroying any original cluster. To do this we first apply standard spectral graph partitioning, and then we use some novel combinatorial techniques to modify the cuts obtained so as to make sure that the original clusters are sufficiently preserved. Our cluster-preserving clustering may be of broader interest well beyond heavy hitters. Joint work with Kasper Green Larsen, Jelani Nelson, and Mikkel Thorup. |
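The following is not the algorithm from the talk; it is the classical CountMin sketch for L1 heavy hitters in the strict turnstile model, included only to make the problem setup concrete. The table dimensions, planted frequencies, and reporting threshold are illustrative choices.

```python
import random
from collections import defaultdict

class CountMin:
    """Classical CountMin sketch: supports update(i, delta) and point queries.
    With width O(1/eps) and depth O(log n), point estimates are within eps*|x|_1
    with high probability in the strict turnstile model (all x_i stay non-negative)."""
    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width, self.depth = width, depth
        self.salts = [rng.random() for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def _bucket(self, row, i):
        return hash((self.salts[row], i)) % self.width

    def update(self, i, delta):
        for r in range(self.depth):
            self.table[r][self._bucket(r, i)] += delta

    def estimate(self, i):
        return min(self.table[r][self._bucket(r, i)] for r in range(self.depth))

if __name__ == "__main__":
    random.seed(0)
    cm = CountMin(width=200, depth=5)
    true = defaultdict(int)
    stream = [(random.randrange(1000), 1) for _ in range(20000)]
    stream += [(7, 1)] * 3000 + [(42, 1)] * 2500        # two planted heavy hitters
    for i, d in stream:
        cm.update(i, d)
        true[i] += d
    total = sum(true.values())
    heavy = [i for i in range(1000) if cm.estimate(i) >= 0.05 * total]
    print("reported heavy hitters:", heavy)
```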
Oct 12, 201612:30PM |
Mohsen Ghaffari, MIT/ETH Zurich
How can the computers in a network interact and communicate to solve the network's graph problems efficiently? This is the subject of the area of (local) distributed graph algorithms, which dates back to the 1980s. The recent years have witnessed significant developments in this classical area. In this talk, we survey some of these developments. Noting the gap between randomized and deterministic distributed graph algorithms, the talk has three (closely-related) parts: 1) On the randomized side, we present the first nearly-optimal randomized distributed algorithm for maximal independent set. Using well-known reductions, this provides a unified efficient algorithm for all the classical problems of the area. 2) On the deterministic side, we present an improved deterministic distributed algorithm for edge coloring, which almost settles a quarter-century old open problem. 3) We conclude with mentioning a surprising result that pinpoints the gap between randomized and deterministic distributed algorithms: we present a rudimentary-looking rounding problem, which admits a trivial 0-round randomized algorithm, and prove that if one can solve it deterministically in polylog(n) rounds, we would get polylog(n)-round deterministic distributed algorithms for all the local problems. The latter remains the most well-known open question of the area. |
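For orientation on the randomized side, here is a round-by-round simulation of the classical Luby-style randomized MIS algorithm, which completes in O(log n) rounds with high probability and is the baseline the talk's nearly-optimal algorithm improves upon. The random graph used below is an arbitrary test instance, and the simulation is centralized although the per-round decisions are local.

```python
import random

def luby_mis(adj, seed=0):
    """Classical randomized MIS (Luby-style), simulated round by round.
    Each round, every remaining vertex draws a random value; a vertex joins the MIS
    if its value is smaller than all remaining neighbors' values, and then it and
    its neighbors are removed."""
    rng = random.Random(seed)
    active = set(adj)
    mis = set()
    rounds = 0
    while active:
        rounds += 1
        r = {v: rng.random() for v in active}
        winners = {v for v in active
                   if all(r[v] < r[u] for u in adj[v] if u in active)}
        mis |= winners
        removed = winners | {u for v in winners for u in adj[v]}
        active -= removed
    return mis, rounds

if __name__ == "__main__":
    # Random graph on 200 vertices, used purely for illustration.
    rng = random.Random(1)
    n = 200
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 0.05:
                adj[u].add(v)
                adj[v].add(u)
    mis, rounds = luby_mis(adj)
    # Sanity checks: independence and maximality.
    assert all(adj[u].isdisjoint(mis) for u in mis)
    assert all(v in mis or adj[v] & mis for v in adj)
    print(f"MIS size {len(mis)} found in {rounds} rounds")
```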
Oct 19, 201612:30PM |
Mor Weiss, Northeastern
In the past few decades, probabilistic checking techniques were shown to yield dramatic efficiency improvements in verifying proofs and approximating distance from error-correcting codes. We show connections between cryptography and "zero-knowledge" variants of these techniques, such as Probabilistically Checkable Proofs (PCPs) and PCPs of Proximity (PCPPs), and use them to improve the complexity of cryptographic protocols. In this talk we will discuss two constructions: 1. PCPs for NP with zero-knowledge guarantees, that can be verified non-adaptively, namely by making a single round of queries to the proof. Our construction is based on a novel connection to leakage-resilient circuits. 2. Zero-knowledge PCPPs for NP, a generalization of zero-knowledge PCPs, which we construct by combining standard PCPPs with secure multiparty computation protocols. Based on joint works with Yuval Ishai, Amit Sahai, Michael Viderman and Guang Yang. |
Oct 26, 201612:30PM |
Matt Dippel and Chin Ho Lee
|
November 1, 201612:30PM |
Lorenzo Orecchia, Boston University
Fast iterative methods from Convex Optimization play a crucial role in a number of recent breakthroughs in the design of nearly-linear-time algorithms for fundamental combinatorial problems, such as maximum flow and graph partitioning. Multiplicative Weight Updates, Gradient Descent, and Nesterov's Method have become a mainstay in the construction and analysis of fast algorithms. The power of these approaches raises a number of questions: What is the exact relation between Multiplicative Weight Updates and Gradient Descent? Why do Multiplicative Weight Updates show up in so many settings? What is the intuition behind Nesterov's iterative algorithm that achieves the asymptotically optimal iteration count for smooth convex functions? In this survey talk, I will provide answers by presenting a unified framework that reveals the power and limitations of each method and provides much needed geometric intuition. Among other insights, I will explain how Multiplicative Weight Updates are a particular instance of a dual iterative method, known as Mirror Descent, and how Nesterov's algorithm can be naturally derived as a combination of Mirror and Gradient Descent. As an example of the application of these insights, I will also discuss their crucial role in recent progress in the nearly-linear-time solution of packing linear programs, both in parallel and sequentially. |
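As a small concrete instance of the first method in this landscape, here is the multiplicative weights update rule for prediction with expert advice, which the talk identifies as an instance of Mirror Descent. The losses and learning rate below are arbitrary demo values.

```python
import math
import random

def multiplicative_weights(loss_matrix, eta=0.3):
    """Multiplicative weights over experts: after each round, an expert's weight is
    multiplied by exp(-eta * loss). Returns total algorithm loss and best expert loss."""
    n_experts = len(loss_matrix[0])
    weights = [1.0] * n_experts
    alg_loss = 0.0
    for losses in loss_matrix:                        # one round of losses in [0, 1]
        total = sum(weights)
        probs = [w / total for w in weights]
        alg_loss += sum(p * l for p, l in zip(probs, losses))
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    best = min(sum(col) for col in zip(*loss_matrix))
    return alg_loss, best

if __name__ == "__main__":
    random.seed(0)
    # 3 experts, 200 rounds; expert 0 is slightly better on average.
    rounds = [[random.random() * (0.5 if e == 0 else 1.0) for e in range(3)]
              for _ in range(200)]
    alg, best = multiplicative_weights(rounds)
    print(f"algorithm loss {alg:.1f} vs best expert loss {best:.1f}")
```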
Nov 9, 201612:30PM |
Mitali Bafna and Hamid Jahanjou
|
Nov 16, 201612:30PM |
Harry Lang, Johns Hopkins
In the streaming model, data points arrive sequentially and algorithms attempt to use a sublinear amount of memory. For a class of clustering functions (including $k$-means, $k$-median, and $L_p$ norms), we introduce a new technique for converting a coreset construction in the RAM model into a coreset construction in the streaming model. On a stream of $n$ points, our method yields streaming coreset algorithms requiring the storage of $O(S + k \log n)$ points, where $S$ is the size of the coreset. The previous state of the art required $O(S \log^3 n)$ points. Note that $S = \Omega(k)$, so this improves all parameters. In the RAM setting, we improve the size of coresets for a larger class of problems. This includes coresets for $k$-means with non-negative weights whose size depends on $k$ as $k \log k$, and a Euclidean construction where the coreset size is independent of the dimension. |
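For comparison, the prior state of the art mentioned in the abstract comes from the classical merge-and-reduce technique. The sketch below shows only that bookkeeping, with uniform subsampling standing in (hypothetically) for a real coreset construction, so it carries no approximation guarantee and is not the new technique from the talk.

```python
import random

def reduce_to_size(points, s, rng):
    """Placeholder for a RAM-model coreset construction of size s.
    A real construction would return a weighted coreset with provable error;
    uniform subsampling is used here purely to illustrate the data flow."""
    return points if len(points) <= s else rng.sample(points, s)

def merge_and_reduce_stream(stream, s, rng=random.Random(0)):
    """Classical merge-and-reduce: buffer points into blocks of size s; whenever two
    summaries exist at the same level, merge them and reduce back to size s.
    At any time only O(log n) summaries of size s are stored."""
    levels = {}           # level -> summary of at most s points
    buffer = []
    for p in stream:
        buffer.append(p)
        if len(buffer) == s:
            summary, level = reduce_to_size(buffer, s, rng), 0
            buffer = []
            while level in levels:                # carry, like binary addition
                summary = reduce_to_size(levels.pop(level) + summary, s, rng)
                level += 1
            levels[level] = summary
    pieces = [buffer] + list(levels.values())
    return [p for piece in pieces for p in piece]

if __name__ == "__main__":
    stream = [(random.random(), random.random()) for _ in range(10000)]
    summary = merge_and_reduce_stream(stream, s=64)
    print("stream of length 10000 summarized by", len(summary), "stored points")
```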
Nov 23, 2016 | Thanksgiving - No Seminar |
Nov 30, 201612:30PM |
Ankit Garg, MSR
The celebrated Brascamp-Lieb (BL) inequalities (and their extensions) are an important mathematical tool, unifying and generalizing numerous inequalities in analysis, convex geometry and information theory. While their structural theory is very well understood, far less is known about computing their main parameters. We give polynomial time algorithms to compute these parameters. The algorithms are obtained by a simple efficient reduction of a given BL-datum to an instance of the Operator Scaling problem defined by Gurvits, for which the present authors recently provided a polynomial time algorithm. This talk is based on joint work with Leonid Gurvits, Rafael Oliveira and Avi Wigderson. |
Dec 07, 201612:30PM |
Vitaly Shmatikov, Cornell Tech
Mylar is a framework by Popa et al. that uses multi-key searchable encryption (MKSE) to build Web applications on top of encrypted data. We demonstrate that (1) the Popa-Zeldovich model for MKSE does not imply security against either passive or active attacks; (2) Mylar-based Web applications reveal users’ data and queries to passive and active adversarial servers; and (3) Mylar is generically insecure against active attacks due to system design flaws. Our results show that the problem of securing client-server applications against actively malicious servers is challenging and still unsolved. |
Spring Semester, 2016
Jun 16, 20163:30PM |
Ryan Rogers, University of Pennsylvania
In this work, we initiate a principled study of how the generalization properties of approximate differential privacy can be used to perform adaptive hypothesis testing, while giving statistically valid p-value corrections. We do this by observing that the guarantees of algorithms with bounded approximate max-information are sufficient to correct the p-values of adaptively chosen hypotheses, and then by proving that algorithms that satisfy (approximate) $(\epsilon,\delta)$-differential privacy have bounded approximate max-information when their inputs are drawn from a product distribution. This substantially extends the known connection between differential privacy and max-information, which previously was only known to hold for (pure) $(\epsilon,0)$-differential privacy. It also extends our understanding of max-information as a partially unifying measure controlling the generalization properties of adaptive data analyses. We also show a lower bound, proving that (despite the strong composition properties of max-information), when data is drawn from a product distribution, approximate differentially private algorithms can come first in a composition with other algorithms satisfying max-information bounds, but not necessarily second if the composition is required to itself satisfy a nontrivial max-information bound. This, in particular, implies that the connection between approximate differential privacy and max-information holds only for inputs drawn from product distributions, unlike the connection between pure differential privacy and max-information.
|
Apr 14, 20162:50PM |
Gabor Lippner, Northeastern University Dept. of Mathematics
A local algorithm aims to compute some global structure on a graph using only simultaneous, local, decisions at the nodes. I will review recent developments on the possibilities and the limitations of such algorithms, and explain their connection to the theory of graph limits.
|
Apr 7, 20161:30PM |
Huy Nguyen, TTI Chicago
Challenges abound in modern large scale data analysis, ranging from the sheer volume of the data to its complex structure and the need for timely responses. A promising common approach centers on capturing key information from the data using concise representations that can be constructed in a distributed manner or from an evolving stream of data. In this talk, we will illustrate both the power and limitations of this approach using three research vignettes. In the first example, we describe an algorithmic framework for fundamental linear algebra problems based on short summaries. In the second example, we design distributed algorithms for a family of non-convex problems arising in learning applications. In the third example, we show that even basic statistical estimation tasks require large summaries.
|
Apr 4, 201610:30AM |
Matt Weinberg, Princeton
When real people interact with algorithms (e.g. in auctions, crowdsourcing, Bitcoin, etc.), they impose additional desiderata beyond simply that the algorithm is correct, fast, and uses little storage. People strategize during these interactions, so algorithms deployed in these settings must be robust against strategic manipulation. Additionally, people prefer transparent interactions, so these algorithms must also be as simple as possible. My research addresses these, and other novel challenges that arise when algorithms interact with strategic agents. In this talk, I will focus on robustness against strategic manipulation, and present a new algorithmic framework for these settings. Specifically, I will present a black-box reduction from solving any optimization problem in strategic settings, where the input is held by selfish agents with interests of their own, to solving a perturbed version of that same optimization problem in traditional settings, where the input is directly given. I will further apply this framework to resolve two longstanding open problems: extending Myerson's celebrated characterization of optimal single-item auctions to multiple items (Myerson 1981), and designing truthful mechanisms for job scheduling on unrelated machines (Nisan and Ronen 1999). Finally, I will briefly show how strategic considerations motivate nice questions in "traditional" areas of algorithm design as well, and present some of my work in convex optimization, parallel algorithms, and prophet inequalities. |
Mar 23, 201610:30AM |
Sofya Raskhodnikova, Penn State
Massive datasets are becoming increasingly common. What useful computations can be performed on a dataset when reading all of it is prohibitively expensive? This question, fundamental to several fields, is at the heart of the research area, called sublinear-time algorithms, that has provided important insights into fast approximate computation. In this talk, we will consider types of computational tasks central to sublinear-time algorithms: approximation, testing, and learning. We will see examples of sublinear-time algorithms in several domains. The algorithms themselves are typically simple and efficient, but their analysis requires insights into basic combinatorial, algebraic, and geometric questions. We will also discuss new directions in sublinear-time algorithms, including new computational tasks, new measures for accuracy guarantees, and new models for data access. These directions enable applications of sublinear-time algorithms in privacy, analysis of real-valued data, and situations where the data is noisy or incomplete. |
Mar 16, 201610:30AM |
Mohit Singh, Microsoft Research Redmond
In this talk I will discuss the problem of computing maximum entropy distributions over a discrete set of objects subject to observed marginals. While there has been a tremendous amount of interest in such distributions due to their applicability in areas such as statistical physics, economics, information theory, machine learning, combinatorics and algorithms, a rigorous and systematic study of how to compute such distributions had been lacking. In this talk, I will show that the problem of computing maximum entropy distributions, in a fairly general setting, is equivalent to a counting problem on the underlying set. One of the consequences of this equivalence is that, for several combinatorial settings, there is an efficient algorithm to compute max-entropy distributions. I will also talk about applications of maximum entropy distributions to design approximation algorithms for discrete optimization problems including the Traveling Salesman Problem (TSP). These results include the first improvements over the classical Christofides' algorithm from the 1970s for TSP on graph metrics. |
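A minimal worked instance of the max-entropy problem, using the classical loaded-die example rather than the combinatorial settings from the talk: over the support {1,...,6}, the maximum entropy distribution with a prescribed mean is a Gibbs distribution p(x) proportional to exp(lambda*x), and the single multiplier lambda can be found by bisection.

```python
import math

def gibbs(lmbda, support):
    """Gibbs distribution p(x) proportional to exp(lmbda * x) over the given support."""
    weights = [math.exp(lmbda * x) for x in support]
    z = sum(weights)
    return [w / z for w in weights]

def max_entropy_given_mean(target_mean, support=range(1, 7), lo=-20.0, hi=20.0):
    """Find lambda by bisection so that the Gibbs distribution has the target mean;
    by Lagrangian duality this is the maximum entropy distribution with that mean."""
    support = list(support)
    for _ in range(100):
        mid = (lo + hi) / 2
        p = gibbs(mid, support)
        mean = sum(x * px for x, px in zip(support, p))
        if mean < target_mean:
            lo = mid
        else:
            hi = mid
    return gibbs((lo + hi) / 2, support)

if __name__ == "__main__":
    p = max_entropy_given_mean(4.5)        # a die observed to have mean 4.5
    print([round(px, 4) for px in p])
    print("mean:", round(sum((i + 1) * px for i, px in enumerate(p)), 4))
```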
Mar 15, 20164:00PM |
Nils Olberg, Aachen
In the concurrent multicast problem, we are given a graph representing a communication network together with a set of sender-receiver pairs. The goal is to deliver the message of each sender to the corresponding receiver. The concurrent multicast problem and its variants are motivated by information dissemination problems arising in diverse network scenarios. In this talk, we consider two of the most prominent communication models, namely the telephone and the radio model. For each of these models, our aim is to develop algorithms with good approximation guarantees, since the underlying problems are known to be NP-hard. Using sparse partition covers we provide a first non-trivial bound for radio concurrent multicast. We further demonstrate the connection of telephone concurrent multicast to other network design problems, which motivates the development of approximation algorithms for the latter. Finally, we show that our methods can be extended to yield new results for minimum degree graph spanners. |
Mar 14, 201610:30AM |
Vipul Goyal, Microsoft Research India
A central challenge in the design of secure systems is to defend against man-in-the-middle attacks, where an adversary can arbitrarily tamper with the messages exchanged by two parties over a communication channel. Starting with the early nineties, an important research goal in cryptography has been to build ``non-malleable'' cryptographic protocols that are resilient to such attacks. In this talk, I will describe my work that culminates this two-decade long research quest by constructing round-optimal non-malleable protocols based on almost minimal cryptographic assumptions. I will also discuss how the techniques developed in these works have transcended cryptography and found applications in randomness extraction, coding theory, and complexity theory. I will also briefly talk about my work on applied cryptography and its impact. |
Feb 29, 201610:30AM |
Timothy Chan, University of Waterloo
Although data structures in computational geometry have been studied since the 1970s, recent years have seen exciting new developments that deepen our theoretical understanding of some of the most fundamental problems in the area. In this talk, I will survey work on a number of these core problems, including (static and dynamic) 2D point location, 2D nearest neighbor searching, and orthogonal and nonorthogonal range searching. If time permits, I will also mention some unexpected applications of geometric data structures, to text indexing and all-pairs shortest paths. |
Feb 9, 201610:30AM |
Mary Wootters, Carnegie Mellon University
Error correcting codes are an important tool in communication, and the family of Reed-Solomon (RS) codes is an especially important example. I'll introduce error correcting and Reed-Solomon codes, and I'll talk about three very different ways that RS codes have come up in my research. These three ways (none of which immediately has to do with communication) run the gamut from theory to practice. The first way, on the very theoretical end, is a combinatorial problem called list decoding, which has motivations in complexity theory. The second way, a bit more down-to-earth, has to do with designing schemes for distributed storage: how do we store a file on several servers robustly? The third way, on the practical side of things, is combinatorial group testing: in our work we collaborated with computational biologists to design better algorithms for high-throughput genetic screening and de novo gene sequencing. |
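The definition of a Reed-Solomon code fits in a few lines: interpret the message as the coefficients of a low-degree polynomial over a finite field and output its evaluations at distinct field points. The toy prime-field encoder below illustrates only this definition (no decoder; the field size and parameters are arbitrary).

```python
def rs_encode(message, n, p=101):
    """Encode a message of k field elements (polynomial coefficients over GF(p))
    as its evaluations at the points 0, 1, ..., n-1. Any k of the n symbols
    determine the degree-(k-1) polynomial, so up to n-k erasures can be corrected."""
    assert n <= p and all(0 <= m < p for m in message)
    def poly_eval(coeffs, x):
        result = 0
        for c in reversed(coeffs):           # Horner's rule mod p
            result = (result * x + c) % p
        return result
    return [poly_eval(message, x) for x in range(n)]

if __name__ == "__main__":
    msg = [5, 17, 42]                        # k = 3 message symbols in GF(101)
    codeword = rs_encode(msg, n=7)
    print("codeword:", codeword)
```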
Jan 21, 20163:30PM |
Zhiwei Steven Wu, University of Pennsylvania
We initiate the study of a quantity that we call coordination complexity. In a distributed optimization problem, the information defining a problem instance is distributed among $n$ parties, who need to each choose an action, which jointly will form a solution to the optimization problem. The coordination complexity represents the minimal amount of information that a centralized coordinator, who has full knowledge of the problem instance, needs to broadcast in order to coordinate the $n$ parties to play a nearly optimal solution. We show that upper bounds on the coordination complexity of a problem imply the existence of good jointly differentially private algorithms for solving that problem, which in turn are known to upper bound the price of anarchy in certain games with dynamically changing populations. We show several results. We fully characterize the coordination complexity for the problem of computing a many-to-one matching in a bipartite graph by giving almost matching lower and upper bounds. Our upper bound in fact extends much more generally, to the problem of solving a linearly separable convex program. We also give a different upper bound technique, which we use to bound the coordination complexity of coordinating a Nash equilibrium in a routing game, and of computing a stable matching. |
Jan 6, 201612:00PM |
Chien-Chung Huang, Chalmers University of Technology
We propose new exact and approximation algorithms for the weighted matroid intersection problem. Our exact algorithm is faster than previous algorithms when the largest weight is relatively small. Our approximation algorithm delivers a $(1-\epsilon)$-approximate solution with a running time significantly faster than most known exact algorithms. The core of our algorithms is a decomposition technique: we decompose an instance of the weighted matroid intersection problem into a set of instances of the unweighted matroid intersection problem. The computational advantage of this approach is that we can make use of fast unweighted matroid intersection algorithms as a black box for designing algorithms. Precisely speaking, we prove that we can solve the weighted matroid intersection problem via solving $W$ instances of the unweighted matroid intersection problem, where $W$ is the largest given weight. Furthermore, we can find a $(1-\epsilon)$-approximate solution via solving $O(\epsilon^{-1} \log r)$ instances of the unweighted matroid intersection problem, where $r$ is the smallest rank of the given two matroids. |
Fall Semester, 2015
Dec 8, 20153:30PM |
Elliot Anshelevich, RPI
We examine the quality of social choice mechanisms using a utilitarian view, in which all of the agents have costs for each of the possible alternatives. While these underlying costs determine what the optimal alternative is, they may be unknown to the social choice mechanism; instead the mechanism must decide on a good alternative based only on the ordinal preferences of the agents which are induced by the underlying costs. Due to its limited information, such a social choice mechanism cannot simply select the alternative that minimizes the total social cost (or minimizes some other objective function). Thus, we seek to bound the distortion: the worst-case ratio between the social cost of the alternative selected and the optimal alternative. Distortion measures how good a mechanism is at approximating the alternative with minimum social cost, while using only ordinal preference information. The underlying costs can be arbitrary, implicit, and unknown; our only assumption is that the agent costs form a metric space, which is a natural assumption in many settings. We quantify the distortion of many well-known social choice mechanisms. We show that for both total social cost and median agent cost, many positional scoring rules have large distortion, while on the other hand Copeland and similar mechanisms perform optimally or near-optimally, always obtaining a distortion of at most 5. We also give lower bounds on the distortion that could be obtained by *any* deterministic social choice mechanism, and extend our results on median agent cost to more general objective functions.
|
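A toy distortion computation, with made-up numbers: place agents and alternatives on a line metric, derive each agent's ranking from distances, run plurality (a positional scoring rule of the kind the talk shows can have large distortion), and compare the social cost of its winner against the optimum.

```python
def distortion_example():
    """Distortion of plurality on a small 1-D metric instance.
    Agents and alternatives live on a line; an agent's cost for an alternative is
    their distance, and each agent ranks alternatives by increasing distance."""
    agents = [2.4, 2.4, 2.4, 2.4, 5.0, 5.0, 5.0]    # illustrative positions of 7 agents
    alternatives = {"A": 0.0, "B": 5.0}

    def cost(alt):                                   # total social cost of an alternative
        return sum(abs(a - alternatives[alt]) for a in agents)

    # Plurality only sees each agent's top choice (its closest alternative).
    votes = {alt: 0 for alt in alternatives}
    for a in agents:
        top = min(alternatives, key=lambda alt: abs(a - alternatives[alt]))
        votes[top] += 1
    winner = max(votes, key=votes.get)
    optimal = min(alternatives, key=cost)
    print("votes:", votes, "| plurality winner:", winner, "| optimal:", optimal)
    print("distortion on this instance:", round(cost(winner) / cost(optimal), 3))

if __name__ == "__main__":
    distortion_example()
```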
Nov 12, 20153:30PM |
Cris Moore, Santa Fe Institute
This talk is the fourth and final lecture in Cris Moore's four-lecture series; the full outline of the series follows.
Lecture I (October 22, 3:30-5:00 PM): Computational complexity and landscapes. Topics: two puzzles: Euler vs. Hamilton; P vs. NP and NP-completeness: building computers; when greed works: Minimum Spanning Tree and Max Flow; bumpy energy landscapes and local optima.
Lecture II (October 29, 3:30-5:00 PM): Phase transitions in random graphs. Topics: the emergence and size of the giant component; branching processes and differential equations; power laws at criticality; the k-core and discontinuous transitions.
Lecture III (November 5, 3:30-5:00 PM): Phase transitions in random k-SAT. Topics: early history and phenomenology; first and second moment bounds; algorithms, clustering, and frozen variables; why we believe in a regime where solutions exist but are hard to find.
Lecture IV (November 12, 3:30-5:00 PM): Community detection in networks. Topics: the stochastic block model; the analogy between statistical physics and statistical inference; belief propagation and variational methods; the detectability transition; (if there's time) spectral clustering. |
Nov 10, 201512:00PM |
Bobby Kleinberg
How much information can be learnt about an individual by observing the outcome of a computation? If the computation is differentially private, the answer is: "not much more than if the individual's data had been excluded from the input." A stronger notion of privacy, originally propounded by Dalenius in the 1970's, requires instead that it should not be possible to learn much more about an individual than if the outcome of the computation had never been revealed. Simple examples, as well as a general impossibility theorem of Dwork and Naor, preclude the possibility of supplying this stronger "inferential" guarantee against attackers with arbitrary auxiliary information, assuming the computation is at least minimally useful. In this talk we revisit the notion of inferential privacy and ask: under what limitations on the adversary's side information can we deduce an inferential guarantee from a differential one? We model the adversary's side information as a prior distribution over datasets (or, more generally, a set of possible priors) and prove two main results. The first result pertains to distributions that satisfy a natural positive-affiliation condition, and gives an upper bound on the inferential privacy guarantee for any differentially private mechanism. This upper bound is matched by a simple mechanism that adds Laplace noise to the sum of the data. The second result pertains to distributions that have weak correlations, defined in terms of a suitable "influence matrix". The result provides an upper bound for inferential privacy in terms of the differential privacy parameter and the spectral norm of this matrix. |
Nov 5, 20153:30PM |
Cris Moore, Santa Fe Institute
Lecture III of Cris Moore's four-lecture series (3:30-5:00 PM): Phase transitions in random k-SAT. Topics: early history and phenomenology; first and second moment bounds; algorithms, clustering, and frozen variables; why we believe in a regime where solutions exist but are hard to find. The full outline of the series appears under the November 12 entry above. |
Oct 29, 20153:30PM |
Cris Moore, Santa Fe Institute
Lecture II of Cris Moore's four-lecture series (3:30-5:00 PM): Phase transitions in random graphs. Topics: the emergence and size of the giant component; branching processes and differential equations; power laws at criticality; the k-core and discontinuous transitions. The full outline of the series appears under the November 12 entry above. |
Oct 22, 20153:30PM |
Cris Moore, Santa Fe Institute
Lecture I of Cris Moore's four-lecture series (3:30-5:00 PM): Computational complexity and landscapes. Topics: two puzzles: Euler vs. Hamilton; P vs. NP and NP-completeness: building computers; when greed works: Minimum Spanning Tree and Max Flow; bumpy energy landscapes and local optima. The full outline of the series appears under the November 12 entry above. |