Title
| When
| Abstract
|
Recent Developments in Quantum Copy Protection
By: Mark Zhandry (Host: Dr. Or Sattath) | June 16, 2021 19:00-20:00
>> Zoom Link
| Quantum copy protection is a hypothetical tool which uses unclonable quantum states to prevent programs from being copied. In this talk, I will discuss some recent progress on building such unclonable programs. Along the way, I will highlight interesting connections to watermarking software and obfuscating programs. Webpage Paper (I will also be talking about another paper that was just accepted to CRYPTO 2021, but it hasn't been posted yet).
>> Meeting Recording
|
A Comprehensive trust-based information security model for Online Social Networks
By: Nadav Voloch (Advisors: Prof. Danny Hendler and Prof. Ehud Gudes)
| June 8, 2021 12:00-13:00
>> Zoom Link
| Online Social Networks (OSN) have become a central means of communication and interaction between people around the world. The essence of privacy has been challenged over the past two decades as technological advances brought benefits and social visibility to active members who share content in online communities. While OSN users share personal content with friends and colleagues, they are not always fully aware of the potential unintentional exposure of their information to various people, including adversaries, social bots, fake users, spammers, and data-harvesters. Preventing this information leakage is a key objective of many security models developed for OSNs, including access control, relationship-based models, trust-based models, and information flow control. Following previous research, we assert that a combined approach is required to overcome the shortcomings of each model. In this research we present a new model to protect users' privacy. The basic model is composed of three main phases addressing three of its major aspects: trust, role-based access control, and information flow. This model considers a user's sub-network and classifies the user's direct connections into roles. It relies on public information such as the total number of friends, the age of the user account, and friendship duration to characterize the quality of the network connections. It also evaluates trust between a user and members of the user's network to estimate whether these members are acquaintances or adversaries, based on the paths of the information flow between them. The basic model has three main extensions: first, analyzing attack scenarios on the basic model and proposing methods for its defense; second, refining the model by considering both the category and context of the data instance, and the user profile and past actions regarding different contexts; and third, considering the GDPR regulations, the implementation of our model provides control over all personal data, since its sharing is restricted by the secure network and cannot leak from it. Current work addresses the context part, with the use case of fake-news prevention by detecting potentially problematic users who spread it. This work was published in several conference papers and has been accepted as a journal paper in Elsevier's "Online Social Networks and Media (OSNEM)". This is a joint work with Ehud Gudes and Nurit Gal-Oz.
>> Meeting Recording
|
MGIC: Multigrid-in-Channels Neural Network Architectures
By: Moshe Eliasof, Winner of the 2021 Friedman Award (Advisor: Dr. Eran Treister)
| June 1, 2021 12:00-13:00
>> Zoom Link
| Convolutional Neural Networks are famous for their success in various fields and applications. However, applying CNNs typically comes at the price of computationally expensive systems, consuming large amounts of energy and time. This restriction is undesired, especially for edge devices like smartphones and vehicles. To this end, CNN architectures like MobileNets and ShuffleNets were proposed as lightweight alternatives to the standard CNN blocks, reducing the computational cost while retaining similar accuracy. However, the frequent use of 1x1 convolutions in such networks still imposes a quadratic growth of the number of parameters with respect to the number of channels (width) of the network. In this work, we address the redundancy and quadratic scaling problems in CNNs by introducing a multigrid-in-channels (MGIC) approach. Our MGIC architectures replace each CNN block with an MGIC counterpart that utilizes a hierarchy of nested grouped convolutions of small group size to address the problems above. We show that our proposed architectures scale linearly with respect to the network's width while retaining full coupling of the channels as in standard CNNs. Our extensive experiments on image classification, segmentation, and point cloud classification show that applying this strategy to different architectures like ResNet and MobileNetV3 reduces the number of parameters while obtaining similar or better accuracy.
This is a joint work with Jonathan Ephrath, Lars Ruthotto, and Eran Treister.
I presented this work at the SIAM Conference on Multigrid Methods 2021, where it was nominated for the best student paper.
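The quadratic-versus-linear scaling mentioned in the abstract can be illustrated with a quick parameter count. The sketch below is an illustration only, not the MGIC architecture itself: it compares a dense 1x1 convolution, whose weight count grows quadratically in the channel width, with a grouped 1x1 convolution of a fixed small group size, whose weight count grows linearly.

```python
# Illustrative parameter counts for 1x1 convolutions (not the MGIC code itself).
# A dense 1x1 conv mapping c channels to c channels has c*c weights (quadratic in c),
# while a grouped 1x1 conv with groups of size g has c*g weights (linear in c for fixed g).

def dense_1x1_params(channels: int) -> int:
    return channels * channels

def grouped_1x1_params(channels: int, group_size: int) -> int:
    assert channels % group_size == 0
    groups = channels // group_size
    return groups * (group_size * group_size)  # == channels * group_size

if __name__ == "__main__":
    for c in (64, 128, 256, 512):
        print(f"width {c:4d}: dense {dense_1x1_params(c):8d}  "
              f"grouped(g=8) {grouped_1x1_params(c, 8):7d}")
```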
|
Topics in Secret Sharing
By: Hussien Othman (Advisor: Prof. Amos Beimel)
| May 25, 2021 12:00-13:00
>> Zoom Link
| Secret sharing is a cryptographic primitive in which a dealer, who holds a secret, distributes shares to parties such that an authorized subset of parties can learn the secret and an unauthorized subset can learn no information about the secret. Secret-sharing schemes are an important tool in cryptography and they are used as a building block in many secure protocols, such as threshold cryptography, access control, attribute-based encryption, and multi-party computation (MPC) protocols.
In this talk, I'll present my results from my Ph.D. I will mainly discuss results on evolving secret sharing schemes. Evolving secret-sharing schemes, introduced by Komargodski, Naor, and Yogev (TCC 2016b), are secret-sharing schemes in which there is no a-priori upper bound on the number of parties that will participate. The parties arrive one by one and when a party arrives the dealer gives it a share; the dealer cannot update this share when other parties arrive. Motivated by the fact that when the number of parties is known, ramp secret-sharing schemes are more efficient than threshold secret-sharing schemes, I study evolving ramp secret-sharing schemes and in this talk, I will present my constructions that are better than the best known evolving threshold secret-sharing schemes.
I'll also briefly discuss my results on polynomial secret sharing. Polynomial secret sharing is a class of secret-sharing schemes in which the sharing and/or the reconstruction functions are degree-d polynomials. This is a broader class than linear schemes, which is the most studied class in secret sharing, where the sharing and reconstruction functions are linear. I will present my results on polynomial secret sharing schemes, showing both new upper bounds and lower bounds on their efficiency.
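As background for the primitive defined above, here is a hedged sketch of the classical (t, n) Shamir threshold scheme over a prime field. It is a textbook example only, not the evolving or ramp constructions discussed in the talk; the modulus and parameters are illustrative choices.

```python
import random

# Textbook Shamir (t, n) threshold secret sharing over a prime field.
# Any t shares reconstruct the secret; fewer than t reveal nothing about it.
P = 2**61 - 1  # a prime modulus; the secret must lie in [0, P)

def share(secret: int, t: int, n: int):
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def f(x):  # evaluate the degree-(t-1) polynomial at x
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at x = 0 recovers the constant term (the secret).
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

shares = share(123456789, t=3, n=5)
assert reconstruct(shares[:3]) == 123456789
```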
>> Meeting Recording
|
From Sample Compression to Learning and Back
By: Prof. Aryeh Kontorovich (Host: Dr. Eran Treister)
| May 18, 2021 15:00-16:00
>> Zoom Link
| The Occam's Razor principle guides the scientist to choose the simplest among competing explanations. Statistical machine learning is able to formalize and quantify this principle, to obtain concrete generalization bounds as a function of the amount of compression. Collaborating with several colleagues over the years, we have been able to both make novel use of the compression technique to achieve the strongest possible Bayes-consistency results in metric spaces and to examine the tightness of the bounds themselves, providing lower and upper bounds. The talk will be a survey for the general CS audience.
|
CS First Annual Poster Competition
(Host: Dr. Eran Treister)
| May 11, 2021 12:10-14:00
>> Zoom Link
| In this event our students will showcase some of the work done in the CS department. Tentative Schedule: 12:10-12:30: Virtual gathering and some opening remarks. 12:30-13:00: A blitz of one brief slide per participant (about 30 sec. per slide). 13:00-14:00: Private zoom session for research presentation. You are all invited to take part in the event and vote for this year's winners.
|
The Mathematics of Super-Resolution: how small can we see in principle?
By: Dmitry Batenkov (TAU) (Host: Eran Treister)
| May 4, 2021 12:00-13:00
>> Zoom Link
| The problem of super-resolution appears in many forms in applications, and is essentially asking to recover “small" features of some object from partial information (of low resolution). In this talk I will discuss mathematical aspects of the problem, emphasizing the optimal trade-offs between the type of prior information which must be assumed (otherwise the problem is ill-posed), available data uncertainty (noise), and the best possible accuracy/resolution one might obtain. I will elaborate a bit on the so-called “algebraic" methods of solution, and pose several open problems. Papers: 1. Batenkov D, Demanet L, Mhaskar HN (2019) Stable soft extrapolation of entire functions. Inverse Problems 35: 015011. 2. Batenkov D, Goldman G, Yomdin Y (2020) Super-resolution of near-colliding point sources. Information and Inference: A Journal of the IMA.
|
Does Bitcoin Scale? A Close look at the Lightning Network
By: Aviv Zohar (Host: Or Sattath)
| Apr. 27, 2021 12:00-13:00
>> Zoom Link
| The Lightning Network is considered to be Bitcoin's greatest hope of scaling up the number of transactions that can be done with the currency. In the talk I will explain the principles underlying the operation of this network and discuss several vulnerabilities that we explored in our research. No prior knowledge of blockchains or the lightning network will be assumed.
Papers: 1. Flood & Loot: A Systemic Attack On The Lightning Network (joint work with Jona Harris). 2. Hijacking Routes in Payment Channel Networks: A Predictability Tradeoff (joint work with Saar Tochner and Stefan Schmid). 3. Congestion Attacks in Payment Channel Networks (joint work with Ayelet Mizrahi).
>> Meeting Recording
|
Quantum Garbled Circuits
By: Henry Yuen (Host: Or Sattath)
| Apr. 21, 2021 16:00-17:00
>> Zoom Link
| Yao's garbled circuits construction is a central example of a cryptographic primitive known as a randomized encoding, in which a function f and an input x can be encoded in such a way that only the value f(x) can be recovered, but nothing else about the function f or the input x can be learned. Garbled circuits (and randomized encodings more generally) have had numerous applications to secure multiparty computation, obfuscation, zero-knowledge protocols, parallel cryptography, complexity theory, and more. In this talk I will introduce the notion of quantum randomized encodings, and will present an instantiation with a quantum analogue of Yao's garbled circuits. Time permitting, I will discuss applications of quantum garbled circuits/randomized encodings, including zero-knowledge for QMA and obfuscation of quantum circuits. No quantum knowledge is necessary.
Joint work with Zvika Brakerski - Link to article
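To make the classical primitive concrete, here is a hedged toy sketch of garbling a single AND gate with random wire labels and a hash-based one-time pad. It is for intuition only: it omits point-and-permute, circuit composition, and everything quantum that the talk is actually about, and it should not be read as a secure or complete scheme.

```python
import os, hashlib, random

# Toy garbling of a single AND gate (illustration only; not a secure or complete scheme).
LABEL_LEN, TAG = 16, b"\x00" * 16  # a tag of zeros marks a successful decryption

def H(a: bytes, b: bytes) -> bytes:
    return hashlib.sha256(a + b).digest()  # 32 bytes: enough to mask label || tag

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def garble_and_gate():
    # Two random labels per wire, one representing bit 0 and one representing bit 1.
    A = [os.urandom(LABEL_LEN) for _ in range(2)]
    B = [os.urandom(LABEL_LEN) for _ in range(2)]
    C = [os.urandom(LABEL_LEN) for _ in range(2)]
    table = []
    for a in (0, 1):
        for b in (0, 1):
            out = C[a & b] + TAG
            table.append(xor(H(A[a], B[b]), out))
    random.shuffle(table)  # hide which row corresponds to which input pair
    return A, B, C, table

def evaluate(label_a: bytes, label_b: bytes, table):
    pad = H(label_a, label_b)
    for ct in table:
        pt = xor(pad, ct)
        if pt.endswith(TAG):       # the correct row decrypts to label || zero tag
            return pt[:LABEL_LEN]
    raise ValueError("no row decrypted")

A, B, C, table = garble_and_gate()
# An evaluator holding the labels for a=1, b=0 learns only the label for AND(1,0)=0.
assert evaluate(A[1], B[0], table) == C[0]
```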
>> Meeting Recording
|
Designer protein 2D materials for engineered biology
By: Ariel Ben Sasson (Host: Dr. Eran Treister)
| Apr. 6, 2021 12:00-13:00
>> Zoom Link
| Designer protein materials uniquely offer diverse dynamic and static properties and are able to interact with synthetic and living systems alike. To create these sophisticated materials de novo, we adopt evolutionary guidelines of hierarchical design (omitting their limitations), i.e., design of building blocks and of the set of interactions between them and their environment, to guide a controllable multi-step assembly towards the desired structure/function. The rapid progress in protein design (structure/function to sequence), as in the inverse problem of protein structure prediction (sequence to structure/function) [1,2], affords efficient computational tools to create protein building blocks, yet the rules to generate sophisticated complexes are still lacking. I will present our work, introducing two major steps towards the design of practical bio-materials [3]. First, using the Rosetta modelling tool we developed co-assembling materials: materials built from two or more distinct components that are independently stable. Compared to their single-component counterparts, such as S-layers and designed analogues, binary systems afford facile component production and functionalization, as well as control over the onset, location, and composition aspects of the formation process, which is initiated only by mixing the two different building blocks. Second, we leverage these novel binary-system properties to go beyond homogeneous protein-material formation and demonstrate the formation of biologically active hybrid arrays on mammalian cell membranes and their synthetic counterparts. This work culminates in demonstrating array assembly on cells to tunable dimensions, receptor clustering, biological pathway activation, and endocytosis blocking, which paves the way for novel approaches to interact with and reshape synthetic and living systems.
Papers: 1. Huang, P.-S., Boyken, S. E. & Baker, D. The coming of age of de novo protein design. Nature 537, 320–327 (2016). 2. Service, R. F. 'The game has changed.' AI triumphs at protein folding. Science 370, 1144–1145 (2020). 3. Ben-Sasson, A. J. et al. Design of biologically active binary protein 2D materials. Nature 589, 468– 473 (2021).
Bio: Ariel J. Ben-Sasson is a research fellow and an acting instructor at the Baker Lab, at the Institute for Protein Design, at the University of Washington. Ariel joined the Baker Lab in late 2013 as an HFSP Cross Disciplinary Fellow. His research interests are in the areas of protein engineering, self-assembling systems, and functional/conductive bio-materials. Prior to joining the Baker Lab Ariel received his PhD and MSc at the Organic Materials and Devices lab at the Nano-Science program at the Technion – Israel Institute of Technology, and BSc (cum laude) in the Biomedical Eng. Department (2005). Ariel developed methods to design and implement self-assembling systems of synthetic polymers to logic devices and developed the first block-copolymers based Vertical Organic Field Effect Transistor (VOFET). This research garnered awards from the SPIE, EMRS, MRS, TSMC and the Azrieli Fellowship, and several refereed publications and patents.
|
Towards Solving Extremal Problems in Practice
By: Michael Frank (Advisor: Prof. Mike Codish)
| Mar. 16, 2021 12:00-13:00
>> Zoom Link
| This work is about methods for solving extremal problem instances in practice. Extremal problems are a class of problems that contain especially hard instances, which have remained open for decades despite their relatively "simple" description. This work presents several novel methods for solving extremal problems. Furthermore, we demonstrate these methods by applying them to solve well-known open instances of extremal problems. A classic, and notoriously difficult, extremal problem is the Ramsey problem. Specifically, the calculation of (binary) Ramsey numbers: the (binary) Ramsey number R(s,t) is the smallest number n such that every graph with n vertices has a clique of size s or an independent set of size t. The Ramsey problem was introduced in 1930, and to this day only nine (!) precise values of R(s,t) are known for non-trivial values of s and t. My talk will present the topic of extremal problems and discuss methods for solving them in practice. We will survey the results which were published during my PhD. The main contributions consist of a description of methods which can be used to solve extremal problems, examples in which the methods are applied to solve open instances of extremal problems, and a brief overview of the toolchain we apply. Specifically, we will discuss some of the following results: - a computer-assisted proof that the Ramsey number R(4,3,3)=30 - a computer-assisted proof that the optimal-size sorting network with 9 inputs consists of 25 comparators (and 29 for 10 inputs) - a method for generating sets of error-correcting code words under biological constraints
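As a hedged illustration of why exact Ramsey values are so scarce, the brute-force check below verifies that R(3,3) > 5 and R(3,3) <= 6 by enumerating all two-colorings of the edges of K_5 and K_6. The number of colorings is 2 to the power (n choose 2), which is exactly the explosion that the computer-assisted methods in the talk are designed to tame; real extremal computations use SAT solvers and symmetry breaking rather than enumeration.

```python
from itertools import combinations, product

# Brute-force check that R(3,3) = 6 (an illustration of the search-space blow-up only).

def has_mono_triangle(n, coloring):
    # coloring maps each edge (i, j) with i < j to 0 or 1; look for a monochromatic triangle.
    return any(coloring[(i, j)] == coloring[(i, k)] == coloring[(j, k)]
               for i, j, k in combinations(range(n), 3))

def every_coloring_has_mono_triangle(n):
    edges = list(combinations(range(n), 2))
    return all(has_mono_triangle(n, dict(zip(edges, colors)))
               for colors in product((0, 1), repeat=len(edges)))

print(every_coloring_has_mono_triangle(5))  # False: some 2-coloring of K_5 avoids it, so R(3,3) > 5
print(every_coloring_has_mono_triangle(6))  # True: every 2-coloring of K_6 has one, so R(3,3) <= 6
```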
>> Meeting Recording
|
Nonparametric estimation of high-dimensional shape spaces with applications to structural biology
By Amit Moscovich (Host: Dr. Eran Treister)
| Mar. 9, 2021 16:00-17:00
>> Zoom Link
| Over the last twenty years, there have been major advances in non-linear dimensionality reduction, or manifold learning, and nonparametric regression of high-dimensional datasets with low intrinsic dimensionality. A key idea in this field is the use of data-dependent Fourier-like basis vectors given by the eigenvectors of a graph Laplacian. In this talk, I will discuss the application of these methods for mapping spaces of volumetric shapes with continuous motion. Three lines of research will be presented: (i) High-dimensional nonparametric estimation of distributions of volumetric signals from noisy linear measurements. (ii) Leveraging the Wasserstein optimal transport metric for manifold learning and clustering. (iii) Non-linear independent component analysis for analyzing independent motions.
A key motivation for this work comes from structural biology, where breakthrough advances in cryo-electron microscopy have led to thousands of atomic-resolution reconstructions of various proteins in their native states. However, the success of this field has been mostly limited to the solution of the inverse problem aimed at recovering a single rigid structure, while many important macromolecules contain several parts that can move in a continuous fashion, thus forming a manifold of conformations. The methods described in this talk present progress towards the solution of this grand challenge, namely the extension of reconstruction methods which estimate a single 3D structure to methods that recover entire manifolds of conformations.
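The "data-dependent Fourier-like basis" mentioned above can be illustrated with a minimal Laplacian-eigenmaps sketch: build a k-nearest-neighbor graph over the points, form the graph Laplacian, and use its low eigenvectors as an embedding. This is a generic textbook recipe assuming only numpy, not the specific estimators from the talk.

```python
import numpy as np

# Minimal Laplacian-eigenmaps sketch: embed points using low eigenvectors
# of the Laplacian of a k-nearest-neighbor graph.
def laplacian_eigenmaps(X, n_neighbors=10, n_components=2):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)      # pairwise squared distances
    idx = np.argsort(d2, axis=1)[:, 1:n_neighbors + 1]       # skip self (column 0)
    W = np.zeros_like(d2)
    rows = np.repeat(np.arange(len(X)), n_neighbors)
    W[rows, idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                                    # symmetrize the graph
    L = np.diag(W.sum(axis=1)) - W                            # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, 1:n_components + 1]                        # skip the constant eigenvector

# Example: a noisy circle embeds into coordinates reflecting its intrinsic geometry.
t = np.random.rand(300) * 2 * np.pi
X = np.c_[np.cos(t), np.sin(t)] + 0.05 * np.random.randn(300, 2)
print(laplacian_eigenmaps(X).shape)  # (300, 2)
```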
|
Collaborative Multi-Agent Planning with Partial Observability and Interacting Actions
By Shashank Shekhar (Advisor: Prof. Ronen Brafman)
| Mar. 2, 2021 12:00-13:00
>> Zoom Link
| Collaborative Multi-Agent Planning (MAP) under uncertainty with partial observability is a notoriously difficult problem. Such MAP problems are often modeled as Dec-POMDPs or their qualitative variant, QDec-POMDPs, which is essentially a MAP version of contingent planning. The QDec-POMDP model was introduced with the hope that its simpler, non-probabilistic structure would allow for better scalability. Indeed, at least with deterministic actions, the recent IMAP algorithm scales much better than comparable Dec-POMDP algorithms. This lecture will cover our two recent approaches that show even better scalability and applicability than IMAP. In the first part of the talk, we describe a new approach, which we call QDec-FP, for solving deterministic QDec-POMDPs based on problem factoring. First, we find a solution to a MAP problem such that the results of any observation are available to all agents. This MAP problem is essentially a single-agent planning problem for the entire team. Then, we project the solution tree into sub-trees, one per agent, and let each agent transform its projected tree into a legal local tree. If all agents succeed, we combine the trees into a valid joint plan. Otherwise, we continue to explore the space of team solutions. In the second part of the talk, we describe a new planner that uses richer information about agents' knowledge to improve upon QDec-FP. With this change, the new planner not only scales up to larger problems with more objects but can also support "signaling", where agents signal information to each other by changing the state of the world. We discuss the theoretical properties of both approaches and compare them with state-of-the-art planners.
Website
Paper 1: https://ojs.aaai.org//index.php/ICAPS/article/view/3550 Paper 2 (recently published at AAAI; an earlier workshop version): http://gki.informatik.uni-freiburg.de/papers/epip20/shekharetal.pdf
>> Meeting Recording
|
Towards Reliable Data-Driven Computations
By Yuval Moskovitch (Host: Prof. Moshe Sipper) | Feb. 9, 2021 15:00-16:00
>> Zoom Link
| Data-driven methods are increasingly being used in domains such as fraud and risk detection, where data-driven algorithmic decision making may affect human life. The growing impact of data and data-driven systems on society makes it important that people be able to trust analytical results obtained from data-driven computations. This can be done in two complementary ways: by providing result explanations so that the user understands the computation and the basis for the observed results; and by profiling and monitoring the data used in the computation, to make the results more reliable in the first place. In the first part of the talk, I will present the use of provenance -- information regarding the data origin and computational process -- for providing explanations of computational results. In the second part of the talk, I will present a method for data profiling using labels, as an example of a data-focused technique to facilitate an analyst building a reliable decision-making pipeline.
|
Computational theory of graphs, sets and rigid sets
By Nadav Dym (Host: Prof. Danny Hendler)
| Feb. 8, 2021 16:00-17:00
>> Zoom Link
| Quotient spaces are a natural mathematical tool to describe a variety of algorithmic problems where different objects are to be compared while their natural symmetries are to be ignored. In particular, we will focus on graphs and sets, whose symmetries are permutations of the vertices, and rigid sets, whose symmetries also include rigid motions. All three data types are prevalent in computer vision/graphics and in many other applications.
We will discuss two problems involving these data types: (1) Geometric alignment of graphs/sets/rigid sets, and whether it can be done in polynomial time. (2) Neural network architectures which are invariant to the symmetries of graphs/sets/rigid sets, and whether they are universal (can approximate every invariant continuous function). For both problems we will see that they are tractable for sets and intractable for graphs. We will then explain why rigid sets are an intermediate case, where high-dimensional rigid sets are equivalent to graphs in terms of complexity, while for fixed low dimension they are tractable. The focus of the lecture will be on two of my recent papers which leverage these insights to achieve tractable algorithms for low-dimensional rigid sets, both for geometric alignment and for invariant neural networks.
>> Meeting Recording
|
Better Environments for Better AI
By Sarah Keren (Host: Prof. Moshe Sipper)
| Feb. 2, 2021 12:00-13:00
>> Zoom Link
| Most AI research focuses exclusively on the AI agent itself, i.e., given some input, what are the improvements to the agent's reasoning that will yield the best possible output? In my research, I take a novel approach to increasing the capabilities of AI agents via the use of AI to design the environments in which they are intended to act. My methods identify the inherent capabilities and limitations of AI agents and find the best way to modify their environment in order to maximize performance. I will describe research projects that vary in their design objectives, in the AI methodologies that are applied for finding optimal designs, and in the real-world applications to which they correspond. One example is Goal Recognition Design (GRD), which seeks to modify environments to allow an observing agent to infer the goals of acting agents as soon as possible. A second is Helpful Information Shaping (HIS), which seeks to find the minimal information to reveal to a partially-informed robot in order to guarantee the robot's goal can be achieved. I will also show how HIS can be used in a market of information, where robots can trade their knowledge about the environment and achieve an effective communication that allows them to jointly maximize their performance. The third, Design for Collaboration (DFC), considers an environment with multiple self-interested reinforcement learning agents and seeks ways to encourage them to collaborate effectively. Throughout the talk, I will discuss how the different frameworks fit within my overarching objective of using AI to promote effective multi-agent collaboration and to enhance the way robots and machines interact with humans.
Bio: Sarah Keren is a postdoctoral fellow at The Harvard School of Engineering and Applied Sciences and The Hebrew University of Jerusalem School of Computer Science and Engineering. She received her PhD from the Technion, Israel Institute of Technology. Sarah's research focuses on providing theoretical foundations for AI systems that are capable of effective collaboration with each other and with people. She has received a number of awards, including the ICAPS 2020 Best Dissertation Honorable Mention, the ICAPS 2014 Honorable Mention for Best Paper, the Eric and Wendy Schmidt Postdoctoral Award for Women in Mathematical and Computing Sciences, and the Weizmann Institute of Science National Postdoctoral Award for Advancing Women in Science.
>> Meeting Recording
|
Mechanism Design for Social Good
By Kira Goldner (Host: Dr. Sigal Oren)
| Feb. 1, 2021 15:00-16:00
>> Zoom Link
| Society is run by algorithms, and in many cases, these algorithms interact with participants who have a stake in the outcome. The participants may behave strategically in an attempt to "game the system," resulting in unexpected or suboptimal outcomes. In order to accurately predict an algorithm's outcome and quality, we must design it to be robust to strategic manipulation. This is the subject of algorithmic mechanism design, which borrows ideas from game theory and economics to design robust algorithms. In this talk, I will show how results from the theoretical foundations of algorithmic mechanism design can be used to solve problems of societal concern. I will focus on applications in carbon license allocations, health insurance markets, and online labor markets.
Based primarily on papers: (1) https://arxiv.org/pdf/1912.06428.pdf (ITCS '20) (2) https://arxiv.org/pdf/2002.06326.pdf (MD4SG '19) (3) https://arxiv.org/pdf/1903.06696.pdf (SODA '20)
Bio: Kira Goldner is a postdoctoral researcher in the Computer Science Department and at the Data Science Institute at Columbia University, hosted by Tim Roughgarden, and supported by NSF Math Sciences and Data Science Institute fellowships. Kira uses her background in the foundations of mechanism design to address societal problems, e.g., in healthcare, climate change, and privacy. She has also worked on core problems concerning revenue maximization, simplicity, and robustness. As part of this agenda, Kira co-founded Mechanism Design for Social Good (MD4SG), an interdisciplinary initiative working to improve access to opportunity for historically disadvantaged communities. She received her PhD in computer science and engineering from the University of Washington under the advisement of Anna Karlin, and was supported by a Microsoft Research PhD Fellowship and a Google Anita Borg Scholarship. She has received many awards for her work, including the EC 2019 Best Paper with a Student Lead Author Award and the EC 2020 Best Presentation by a Student or Postdoctoral Researcher Award.
Website
|
Randomness in Computation — Extractors and PRGs
By Dean Doron (Host: Dr. Uri Stemmer)
| Jan. 26, 2021 12:00-13:00
>> Zoom Link
| Randomness is an incredibly useful, yet expensive, resource. In many important applications, randomness is essential, and it needs to be as pure as possible. In other cases, prominently time- and space-bounded algorithms, we can reduce and even eliminate the use of it. In this talk, I will discuss two, not unrelated, results from the theory of pseudorandomness. The first one will overview the recent exciting developments in the construction of two-source extractors, which are deterministic functions that purify randomness from two very weak sources. I will talk about the road to achieving optimal extractors, and our explicit entropy-efficient reduction from these extractors to nonmalleable ones — a reduction which is now used in supporting smaller and smaller entropies.
The second result concerns derandomizing time-bounded algorithms. Although we know that BPP = P follows from circuit lower bounds, the precise tradeoff between hardness and pseudorandomness has yet to be fully explored. Particularly, what is the optimal derandomization slowdown one can achieve? I will show that by assuming exponential lower bounds against nondeterministic circuits, we could convert any randomized algorithm running in time T to a deterministic one running in time T^{2+α} for an arbitrarily small constant α. Under plausible complexity-theoretic assumptions, such a slowdown is nearly optimal. Website
Papers: - An efficient reduction from two-source to nonmalleable extractors: achieving near-logarithmic min-entropy (ECCC link) - Nearly Optimal Pseudorandomness From Hardness (ECCC link)
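For readers new to the area, here is a hedged sketch of the classical inner-product two-source extractor (Chor-Goldreich): it outputs a single nearly unbiased bit whenever each independent n-bit source has min-entropy above n/2, a far weaker regime than the near-logarithmic min-entropy constructions the talk addresses. The demo sources are illustrative assumptions.

```python
import secrets

# Classical inner-product (Chor-Goldreich) two-source extractor over GF(2):
# one output bit that is nearly unbiased when both independent n-bit sources
# have min-entropy above n/2.
def ip_extract(x: int, y: int) -> int:
    return bin(x & y).count("1") % 2

# Toy demo: each 64-bit source fixes 16 bits and leaves 48 uniformly random,
# so each has min-entropy 48 > 32 = n/2.
trials, ones = 20000, 0
for _ in range(trials):
    x = (0xDEAD << 48) | secrets.randbits(48)   # high 16 bits fixed
    y = (secrets.randbits(48) << 16) | 0xBEEF   # low 16 bits fixed
    ones += ip_extract(x, y)
print(ones / trials)  # empirically close to 0.5
```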
|
Recent Lower Bounds in Algebraic Complexity Theory
By Ben Lee Volk (Host: Dr. Uri Stemmer)
| Jan. 19, 2021 15:00-16:00
>> Zoom Link
| Algebraic Complexity Theory studies the complexity of solving algebraic computational tasks using algebraic models of computation. One major problem in this area is to prove lower bounds on the number of arithmetic operations required for computing explicit polynomials. This natural mathematical problem is the algebraic analog of the P vs. NP problem. It also has tight connections to other classical mathematical areas and to fundamental questions in complexity theory. In this talk I will provide background and then present some recent progress in this line of study, particularly in the area of proving lower bounds for computing linear transformations.
>> Meeting Recording
|
Estimation of Manifolds from Point Clouds: Building Models from Data
By Barak Sober (Host: Dr. Uri Stemmer)
| Jan. 14, 2021 15:00-16:00
>> Zoom Link
| A common observation in data-driven applications is that high-dimensional data has a low intrinsic dimension, at least locally. Thus, when one wishes to work with data that is not governed by a clear set of equations, but still wishes to perform statistical or other scientific analysis, one possible model is the assumption of an underlying manifold from which the data was sampled. This manifold, however, is not given explicitly, but we can obtain samples of it (i.e., the individual data points). In this talk, we will consider the mathematical problem of estimating manifolds from a finite set of samples, possibly recorded with noise. Using a Manifold Moving Least-Squares approach, we provide an approximant manifold with high approximation order, both in the Hausdorff sense and in the Riemannian metric (i.e., a near-isometry). In the case of bounded noise, we present an algorithm that is guaranteed to converge in probability when the number of samples tends to infinity. The motivation for this work is based on the analysis of the evolution of shapes through time (e.g., handwriting or primates' teeth), and we will show how this framework can be utilized to answer scientific questions in paleontology and archaeology.
|
Atomic NLP
By Yuval Pinter (Host: Prof. Michael Elhadad)
| Jan. 12, 2021 15:00-16:00
>> Zoom Link
| Over the last few years, deep neural models have taken over the field of natural language processing (NLP), brandishing great improvements on many of its tasks. Despite this success, they still face considerable challenges at the representational level, owing to the fundamental mismatch between methods relying on real-valued vector inputs and the discrete, symbolic medium of natural language text. Linguistic structure and variation make it difficult to determine the basic units, or atoms, most desirable for neural NLP. In my talk, I will discuss the problem of defining and representing atoms in modern NLP systems through a language-aware, integrative approach. I will present my work on this topic, where I assess the ability of systems to handle novel words and concepts, build systems that leverage the complementary advantages of different representation levels, and analyze the interpretations that humans derive from word-level internal states of models. I will conclude by laying out my vision for the future of Atomic NLP.
Papers:
https://www.aclweb.org/anthology/D17-1010/
https://www.aclweb.org/anthology/D19-1002/
https://www.aclweb.org/anthology/2020.findings-emnlp.138/
Website
|
Novel Deep Learning Architectures for Voice Separation and Enhancement
By Yossi Adi (Host: Prof. Michael Elhadad) | Jan. 10, 2021 14:00-15:00
>> Zoom Link
| In real-world acoustic environments, a speech signal is frequently corrupted by interfering sounds such as a noisy environment, room conditions, a multi-talker setup, etc. The ability to separate and enhance speech poses challenging perceptual tasks and is crucial for any speech processing system designed to perform under such conditions.
In this talk, I'll present some of my recent research around developing and designing novel deep learning architectures for voice separation and speech enhancement. Specifically, I'll present methods to separate an unknown (but bounded) number of speakers, suppress background noise in real-time, and leverage generative models for better speech enhancement. The proposed models work directly over the raw waveform and significantly outperform the current state-of-the-art methods.
|
Finding patterns in large datasets of bacterial genomes
By Dina Svetlitsky (Advisor Prof. Michal Ziv-Ukelson) | Jan. 7, 2021 12:00-13:00
>> Zoom Link
| The fast-growing data of microbial genomes statistically empowers genomic-context based approaches to functional analysis. Groups of genes that are clustered locally together in many genomes (denoted "gene clusters") tend to express protein products that interact in the same biological pathway and can be viewed as "biological machines". Hence, identifying and analyzing gene clusters can help in the annotation of unknown genes, or in the discovery of new cellular mechanisms; such information can be applied to help fight diseases caused by bacteria and to the utilization of beneficial bacteria in agriculture, medicine, and other domains of microbiological-based industry. The primary focus of my research is to design and implement novel bioinformatic algorithms and tools for the discovery of gene clusters across thousands of bacterial genomes, for searching for a known gene cluster in a new genome, and for classifying a new bacterial genome as pathogenic (harmful) to humans or not based on its gene content.
In my research, I utilize properties that characterize bacterial genomes and the organization of their genes, and incorporate them in specialized string algorithms to obtain scalable data mining tools. I also employ machine learning algorithms to develop accurate classifiers that yield biologically interpretable results, amenable to microbiological reasoning and analysis.
My advisor is Michal Ziv-Ukelson. During my PhD I collaborated with Vered Caspi, Tal Dagan, Galia Zimerman, Meirav Zehavi, Shaked Naor, and Yaron Orenstein.
>> Meeting Recording
|
Between Online Learning and Reinforcement Learning
By Alon Cohen (Host: Uri Stemmer) | Jan. 5, 2021 12:00-13:00
>> Zoom Link
| In this talk I will describe some of my work around Online Learning and Reinforcement Learning.
Online Learning is a classic sub-domain of Machine Learning that has provided endless contributions to fields such as Statistical Learning, Optimization, Decision Making and others. Unlike Reinforcement Learning which focuses on planning long-term decisions in the face of a non-adversarial environment, Online Learning focuses on making short-term decisions in the face of an adversary - and doing so efficiently. In my research, therefore, I am interested in the potential interface between these fields, looking to design efficient algorithms for long-term decision making in the face of a possibly adversarial environment---a problem with many real-life use cases.
I will focus on two works that leverage efficient Online Learning algorithms for learning Linear Quadratic Regulators and Stochastic Shortest Path problems.
Website | Google Scholar
|
Recoverability: Harnessing Non-Volatile RAM for Enhanced Recovery From Failures
By Ohad Ben Baruch (Advisor Prof. Danny Hendler)
| Dec. 30, 2020 14:00-15:00
>> Zoom Link
| Recent developments foreshadow the emergence of new systems, in which byte-addressable non-volatile main memory (NVRAM), combining the performance benefits of conventional main memory with the durability of secondary storage, co-exists with (or eventually even replaces) traditional volatile memory. This draws attention to the crash-recovery model, in which a failed process may be resurrected after it crashes. Consequently, there is increased interest in recoverable concurrent objects (a.k.a. persistent or durable): objects that are made robust to crash-failures by allowing their operations to recover from such failures. In this talk, I will formally define a model and correctness condition for recoverable objects and discuss its properties and limitations. I will then discuss ways of designing recoverable concurrent objects. I will start by presenting recoverable versions of well-known primitive shared-memory operations. This will be followed by describing lower bounds and impossibility results on implementing such objects, thus demonstrating that our model facilitates rigorous analysis of the limitations of recoverable concurrent objects. Finally, I will discuss general schemes for designing efficient and recoverable data-structure implementations.
The talk is based on joint work with Hagit Attiya, Danny Hendler, Matan Rusanovsky, Panagiota Fatourou and Eleftherios Kosmas.
>> Meeting Recording
|
State evolution of tumors uncovered by dimensionality reduction of single-cell measurements
By Matan Hofree (Host: Chen Keasar)
| Dec. 29, 2020 12:00-13:00
>> Zoom Link
| The evolution of tumors from a single genetically aberrant cell remains poorly understood. I will present key findings shedding light on the evolution of tumors uncovered by analyses of single cell measurements from mouse models and human cancers. Analysis of such data presents a set of unique computational challenges; I will describe my work extending non-negative matrix factorization approaches to identify robust cell-types despite systematic variability and experimental bias in a time course study of lung tumors and prostate regeneration. Next, I will show how I developed an optimal transport approach to examine the transcriptional state evolution across time from separate tumors evolving from a predetermined genetic origin. Finally, I will highlight several ideas which allowed me to translate findings in mouse models to human patient data. Throughout the talk I will highlight the biological discoveries made accessible by these approaches and opportunities for future work.
|
Adversarially Robust Streaming Algorithms
By Eylon Yogev (Host: Uri Stemmer)
| Dec. 22, 2020 12:00-13:00
>> Zoom Link
| A streaming algorithm is given a long sequence of items and seeks to compute or approximate some function of this sequence using a small amount of memory. A body of work has been developed over the last two decades, resulting in optimal streaming algorithms for a wide range of problems. While these algorithms are well-studied, the vast majority of them are defined and analyzed in the static setting, where the stream is assumed to be fixed in advance, and only then the randomness of the algorithm is chosen. In many scenarios, this assumption is unrealistic, making the algorithms prone to adversarial attacks, and unfit for real-life applications. I will investigate the adversarial robustness of streaming algorithms. An algorithm is considered robust if its performance guarantees hold even if the stream is chosen adaptively by an adversary that observes the outputs of the algorithm along the stream and can react in an online manner. I will describe general-purpose methods we have developed to transform standard streaming algorithms to be secure in the adversarial model. These are the first streaming algorithms to work in this model.
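To ground the "static setting" that the abstract contrasts against, here is a hedged sketch of a classical small-memory streaming algorithm, the Misra-Gries frequent-items summary. Its guarantees are proved for streams fixed in advance, which is exactly the assumption the talk's adversarial model drops; this is a generic textbook example, not the transformations from the talk.

```python
# Misra-Gries frequent-items summary: one pass, at most k-1 counters, and each
# reported count undercounts the true frequency by at most len(stream)/k.
# Its standard analysis assumes the stream is fixed in advance (the "static" setting).
def misra_gries(stream, k):
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:  # decrement every counter; drop those that reach zero
            counters = {x: c - 1 for x, c in counters.items() if c > 1}
    return counters

stream = ["a"] * 50 + ["b"] * 30 + list("cdefgh") * 3
print(misra_gries(stream, k=4))  # heavy items "a" and "b" survive with approximate counts
```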
|
Next Generation Programming with Program Synthesis
By Hila Peleg (Host: Dr. Dana Fisman)
| Dec. 15, 2020 12:00-13:00
>> Zoom Link
| Program synthesis is the problem of generating a program to satisfy a specification of user intent. Since these specifications are usually partial, this means searching a space of candidate programs for one that exhibits the desired behavior. The lion's share of the work on program synthesis focuses on new ways to perform the search, but hardly any of this research effort has found its way into the hands of users. We wish to use synthesis to augment the programming process, leveraging both optimized search algorithms and concepts that are part of the programmer's life such as code review and read-eval-print loops (REPL). This talk describes three synthesis-based techniques that bring program synthesis into the development workflow. A major concern in designing for the user is that it can put the interface of the synthesizer at odds with state-of-the-art synthesis techniques. Synthesis is, at best, a computationally hard problem, and any changes made to make the tool more usable can interfere with the synthesizer and its internals. We therefore demonstrate the process of bringing synthesis theory into practice when tool design also requires an algorithm re-design. The main papers discussed in the talk are the following: http://cseweb.ucsd.edu/~hpeleg/snippy-uist2020.pdf http://cseweb.ucsd.edu/~hpeleg/resl-oopsla20.pdf http://cseweb.ucsd.edu/~hpeleg/hplus-oopsla20.pdf
|
Historical document image processing using deep learning
By Berat Kurar (Advisor Prof. Jihad El-Sana) | Dec. 8, 2020 12:00-13:00
>> Zoom Link
| Historical documents are being digitized plentifully for preservation and dissemination. But digital documents further need to be transcribed for easy explorability. Hence there is a significant need for reliable historical document image processing algorithms. Conventional methods falter when faced with the unusual challenges and idiosyncrasies of historical manuscripts and documents. Deep learning, on the other hand, has improved the state of the art in other pattern recognition problems. Therefore we hypothesize that deep learning can handle the complexity of historical documents and improve the performance of historical document image processing. However, effective training of deep learning algorithms requires large amounts of labeled data; nevertheless, we obtain interesting results using unsupervised methods.
>> Meeting Recording
|
Algorithms for Approximation of Random Variables
By: Liat Cohen (Advisor Prof. Gera Weiss)
| Dec. 1, 2020
>> Zoom Link
| In planning algorithms, scheduling, and other domains, there is often a need to run long computations on random variables, and to store intermediate results. A source of computational complexity often neglected in the analysis of such algorithms is that the support, the number of non-zero probability values, of the variables needed as intermediate results may grow exponentially along the computation. To avoid exponential memory and time complexities, we suggest to shrink the support of those variables in ways that yield good approximations. In this work we introduce algorithms and methods for approximating random variables under the Kolmogorov metric, in which the support size can be maintained, however at the cost of an approximation error. As a motivating example that demonstrates the importance of the problem and also accentuates its NP-hardness, we study the use case of estimating the probability that a plan makespan meets a given deadline; we call this problem the Deadline Problem. We present the results, provide a detailed theoretical analysis of the approximation algorithms, and also an empirical evaluation. Finally, we conclude and present the main contributions of this work, the significance of the papers, and suggested future work.
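Here is a hedged sketch of the core operation described above: reducing the support of a discrete random variable while tracking the error in the Kolmogorov metric (maximum absolute CDF difference). The simple quantile-style reduction below is an illustration only, not the optimal or NP-hardness-aware algorithms from the thesis.

```python
import numpy as np

# Illustrative support reduction for a discrete random variable, with the error
# measured in the Kolmogorov metric (max absolute CDF difference).
def reduce_support(values, probs, m):
    values, probs = np.asarray(values, float), np.asarray(probs, float)
    order = np.argsort(values)
    values, probs = values[order], probs[order]
    cdf = np.cumsum(probs)
    levels = np.arange(1, m + 1) / m * cdf[-1]          # m evenly spaced CDF levels
    keep = np.unique(np.minimum(np.searchsorted(cdf, levels), len(values) - 1))
    new_probs = np.diff(np.concatenate(([0.0], cdf[keep])))
    return values[keep], new_probs

def cdf_at(values, probs, x):
    c = np.concatenate(([0.0], np.cumsum(probs)))
    return c[np.searchsorted(values, x, side="right")]

def kolmogorov_distance(v1, p1, v2, p2):
    grid = np.union1d(v1, v2)                            # all jump points of both CDFs
    return np.max(np.abs(cdf_at(v1, p1, grid) - cdf_at(v2, p2, grid)))

# Example: shrink a 1000-point uniform variable to 10 support points.
vals, probs = np.arange(1000.0), np.full(1000, 1 / 1000)
small_vals, small_probs = reduce_support(vals, probs, 10)
print(len(small_vals), kolmogorov_distance(vals, probs, small_vals, small_probs))
```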
>> Meeting Recording
|
Static and Streaming Data Structures for Fréchet Distance Queries
By: Dr. Omrit Filtser
| Nov. 24, 2020
| Measuring the similarity of two curves or trajectories is an important task that arises in various applications. The Fréchet distance and its variants became very popular in the last few decades, and some variants were successfully applied to real data sets. The Fréchet distance between two curves can be computed in quadratic time using a simple dynamic programming algorithm. However, under SETH, no strongly subquadratic algorithms exist, even if the solution may be approximated up to a factor of 3. In applications where there is a need to compute the distance to a single curve many times, or when the input curve is extremely large and quadratic running time is infeasible, a natural solution is to construct a data structure that allows fast distance queries. In this talk Dr. Filtser will discuss approximate distance oracles and nearest neighbour search data structures under the discrete Fréchet distance. In addition, she will present an algorithm that constructs simplifications in the streaming model, an oracle for distance queries to a sub-curve, and introduce the zoom-in problem.
The talk is based on the following papers:
* Static and Streaming Data Structures for Fréchet Distance Queries (SODA'21) - joint work with Arnold Filtser.
* Approximate Nearest Neighbor for Curves - Simple, Efficient, and Deterministic (ICALP'20) - joint work with Arnold Filtser and Matya Katz.
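The quadratic dynamic program mentioned in the abstract is short enough to state in full. Below is a hedged sketch of the standard O(nm) recursion for the discrete Fréchet distance between two polygonal curves; the data structures in the talk are designed precisely to avoid paying this cost per query.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

# Standard O(n*m) dynamic program for the discrete Fréchet distance between
# two polygonal curves given as lists of points.
def discrete_frechet(P, Q):
    n, m = len(P), len(Q)
    D = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                D[i][j] = d
            elif i == 0:
                D[i][j] = max(D[i][j - 1], d)
            elif j == 0:
                D[i][j] = max(D[i - 1][j], d)
            else:
                D[i][j] = max(min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1]), d)
    return D[-1][-1]

P = [(0, 0), (1, 1), (2, 0)]
Q = [(0, 1), (1, 2), (2, 1)]
print(discrete_frechet(P, Q))  # 1.0
```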
|
A Differential Geometry Perspective on Sequential Models
By: Dr. Omri Azencot
| Nov. 17, 2020
| We introduce a new framework for the analysis and processing of time-series data and sequential models. Based on Koopman theory and its application to mappings, we propose a new interpretation for a recent family of state-of-the-art orthogonal recurrent models. We then observe that orthogonal models have limited expressivity, and we suggest a new recurrent architecture that alleviates this shortcoming. Next, we show how to easily control the behavior of our model using simple regularization components. We evaluate our approach on several benchmarks, and we discuss its advantages over prior models.
|
Concise Essence-Preserving Big Data Representations
By Philip Derbeko (Advisors: Prof. Shlomi Dolev, Prof. Ehud Gudes)
| Nov. 10, 2020
| The amount of data grows rapidly with time and shows no signs of stopping. Computers have become smaller and more abundant. Devices get more sensors and wireless connectivity. All of them collect more data about their surroundings and, usually, send it to the cloud servers for processing. In the end, more data is collected and processed to get more insights into how to serve us better. The common premise is that more data leads to better insights. However, as the processing, storage and transfer of the data require an increasing amount of resources, the question is: is more data always worth its cost? In this research, we consider a different approach to dealing with increasing amounts of data. Instead of preparing for more data, we try to reduce the amount of collected, transmitted, stored, and processed data. In doing so, we hope to reduce the resource consumption of all participants from end-devices through the network and to the cloud providers, and support prompt answers to (the statistical in nature) queries. To reduce the amount of data we suggest using a smaller, representative data model. We call it the similitude model. The model represents the original dataset, i.e. it can be used to answer data queries while being much smaller than the data in size.
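As a hedged, generic illustration of the "small model that answers queries in place of the raw data" idea, the sketch below summarizes a large numeric column with an equi-width histogram and answers approximate range-count queries from the summary alone. This is a standard summarization example, not the similitude model itself, and the class and parameter names are illustrative.

```python
import numpy as np

# Generic illustration of a compact data model answering queries instead of the raw
# data (not the similitude model itself): an equi-width histogram summary that
# supports approximate range-count queries.
class HistogramSummary:
    def __init__(self, data, bins=64):
        self.counts, self.edges = np.histogram(data, bins=bins)

    def range_count(self, lo, hi):
        # Approximate number of values in [lo, hi], assuming uniformity inside each bin.
        total = 0.0
        for c, l, r in zip(self.counts, self.edges[:-1], self.edges[1:]):
            overlap = max(0.0, min(hi, r) - max(lo, l))
            if r > l:
                total += c * overlap / (r - l)
        return total

data = np.random.default_rng(0).normal(size=1_000_000)
summary = HistogramSummary(data)            # a few hundred numbers instead of a million
print(summary.range_count(-1, 1), np.sum((data >= -1) & (data <= 1)))
```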
|
Research-oriented zoom meetings for new MSc. students - Round 2
| Nov. 03, 2020 12:00-13:00
>> Zoom Link
| This is the second of two small research-oriented zoom meetings of students and faculty members, intended mostly for new students who have not yet found an adviser (other students and faculty are obviously welcome). There will be 4-5 faculty members presenting their research at each meeting. These faculty members, and others who will also attend the meetings, are looking for new research students, and this is a good opportunity to get acquainted and talk to them. If you haven't done so already, we encourage you to visit our department's research page and follow the links according to your field(s) of interest.
>> Meeting Recording
|
Research-oriented zoom meetings for new MSc. students - Round 1
| Oct. 27, 2020 12:00-13:00
>> Zoom Link
| This is the first of two small research-oriented zoom meetings of students and faculty members, intended mostly for new students who have not yet found an adviser (other students and faculty are obviously welcome). There will be 4-5 faculty members presenting their research at each meeting. These faculty members, and others who will also attend the meetings, are looking for new research students, and this is a good opportunity to get acquainted and talk to them. If you haven't done so already, we encourage you to visit our department's research page and follow the links according to your field(s) of interest.
>> Meeting Recording
|