PodParley

WARNING!! Neural networks can memorize secrets (ep. 100)

Episode 97 of the Data Science at Home podcast, hosted by Francesco Gadaleta, titled "WARNING!! Neural networks can memorize secrets (ep. 100)" was published on March 23, 2020 and runs 24 minutes.

March 23, 2020 · 24m · Data Science at Home



One of the best features of neural networks and machine learning models is their ability to memorize patterns from training data and apply them to unseen observations. That's where the magic is. However, there are scenarios in which the same models learn those patterns so well that they can disclose some of the data they were trained on. This phenomenon is known as unintended memorization, and it is extremely dangerous.

Think about a language generator that discloses the passwords, credit card numbers and social security numbers in the records it was trained on. Or, more generally, think about a synthetic data generator that discloses the very training data it is meant to protect.
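One way to make this risk measurable, following the exposure metric of the Secret Sharer paper listed in the references, is to insert a known fake secret (a canary) into the training data and check how highly the trained model ranks it among all candidate secrets. The sketch below is a minimal rank-based version under stated assumptions: the function name `exposure` and the example scores are hypothetical, and in practice each score would be the model's loss or log-perplexity for one candidate.

```python
import math


def exposure(canary_score, candidate_scores):
    """Rank-based exposure: log2(|R|) - log2(rank of the canary).

    canary_score: the model's score for the inserted secret (lower = more likely).
    candidate_scores: scores for every candidate in the space R, canary included.
    """
    # Rank 1 means the model finds the canary more likely than any other candidate.
    rank = 1 + sum(1 for s in candidate_scores if s < canary_score)
    return math.log2(len(candidate_scores)) - math.log2(rank)


# Hypothetical scores for 8 candidate secrets; 0.5 belongs to the canary.
scores = [5.0, 3.0, 0.5, 4.0, 2.0, 6.0, 7.0, 8.0]
print(exposure(0.5, scores))  # 3.0: the canary is ranked first out of 8
print(exposure(8.0, scores))  # 0.0: the canary is ranked last, no sign of memorization
```

High exposure means the model prefers the canary over almost every alternative, which is the signature of memorization rather than generalization.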

In this episode I explain why unintended memorization is a real problem in machine learning. Apart from differentially private training, there is no other way to mitigate this problem under realistic conditions. At Pryml we are very aware of this, which is why we have been developing a synthetic data generation technology that is not affected by this issue.
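To give a feel for what differentially private training involves, here is a minimal sketch of a single update in the style of DP-SGD, one common approach: each example's gradient is clipped to a maximum norm, then Gaussian noise is added before averaging. This is an illustration under stated assumptions, not Pryml's technology or a complete implementation: the names `clip`, `dp_sgd_step`, `clip_norm` and `noise_multiplier` are hypothetical, gradients are plain Python lists, and the privacy accounting needed to state an actual guarantee is left out.

```python
import math
import random


def clip(grad, clip_norm):
    """Scale one example's gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(weights, per_example_grads, lr=0.1, clip_norm=1.0,
                noise_multiplier=1.1, rng=random):
    """One noisy gradient step: clip per example, sum, add noise, average."""
    clipped = [clip(g, clip_norm) for g in per_example_grads]
    n = len(clipped)
    sigma = noise_multiplier * clip_norm  # noise scale tied to the clipping bound
    new_weights = []
    for j, w in enumerate(weights):
        # Clipping bounds what any single record can contribute to the sum;
        # the Gaussian noise then masks that bounded contribution.
        total = sum(g[j] for g in clipped) + rng.gauss(0.0, sigma)
        new_weights.append(w - lr * total / n)
    return new_weights


# Hypothetical gradients for two training examples and a two-parameter model.
grads = [[3.0, 4.0], [0.3, 0.4]]
print(dp_sgd_step([0.0, 0.0], grads))
```

Because no single record can move the weights by more than the clipping bound, and that bounded contribution is hidden in noise, a rare secret has far less chance of being memorized.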

 

This episode is supported by Harmonizely. Harmonizely lets you build your own unique scheduling page based on your availability, so you can start scheduling meetings in just a couple of minutes. Get started by connecting your online calendar and configuring your meeting preferences. Then start sharing your scheduling page with your invitees!

 

References

The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks https://www.usenix.org/conference/usenixsecurity19/presentation/carlini
