Working at Google: Security, anti-abuse and artificial intelligence
Elie Bursztein joins us on today’s episode to talk all about his role as chief research lead for anti-abuse at Google! Along with Infosec Founder Jack Koziol and Cyber Work Podcast host Chris Sienko, they discuss the difference between the practices of security and anti-abuse, the difference between protecting Google the company and Gmail the product, and the aspects of security and anti-abuse that AI will never be able to do.
0:00 - Intro
2:35 - Starting a career in cybersecurity
12:57 - Entering the industry today
19:09 - Career progression
42:18 - Tech and academia collaboration for anti-abuse research
52:26 - Getting hired in anti-abuse and cybersecurity
1:01:09 - Future of machine learning as AI hacking
1:16:26 - Outro
Have you seen our new, hands-on training series Cyber Work Applied? Tune in every other week as expert Infosec instructors teach you a new cybersecurity skill and show you how that skill applies to real-world scenarios. You’ll learn how to carry out different cyberattacks, practice using common cybersecurity tools, follow along with walkthroughs of how major breaches occurred, and more. And it's free! Click the link below to get started.
– Learn cybersecurity with our FREE Cyber Work Applied training series: https://www.infosecinstitute.com/learn/
– View Cyber Work Podcast transcripts and additional episodes: https://www.infosecinstitute.com/podcast
[00:00:00] Chris Sienko: Today on Cyber Work I’m joined by Infosec CEO and founder, Jack Koziol, and we talk to Elie Bursztein, published author, speaker and Chief Research Lead for Anti-Abuse at Google. On the episode, we’ll talk about the difference between the practices of security and anti-abuse, the difference between protecting Google, the company, and Gmail the product, and the aspects of security and anti-abuse that even AI will never be able to do. That’s all today on Cyber Work.
Also, let’s talk about Cyber Work Applied, a new series from Cyber Work. Tune in as expert Infosec instructors and industry practitioners teach you a new cyber security skill and then show you how that skill applies to real-world scenarios. You’ll learn how to carry out a variety of cyber attacks, practice using common cyber security tools, engage with walkthroughs that explain how major breaches occurred and more. And believe it or not, it’s all free. Go to infosecinstitute.com/learn or check out the link in the description and get started with hands-on training in a fun environment while keeping the cyber security skills that you have relevant. That’s infosecinstitute.com/learn.
And now on with the show.
[00:01:10] CS: Welcome to this week’s episode of the Cyber Work with Infosec podcast. Each week we talk with a different industry thought leader about cyber security trends, the way those trends affect the work of infosec professionals and offer tips for breaking in or moving up the ladder in the cyber security industry. Today’s guest needs no introduction, but that doesn’t free me up from having to write one anyway. Elie Bursztein leads the security and anti-abuse research team at Google. He focuses on deep learning and cryptography research, and among many other accomplishments, broke SHA-1. His website, elie.net is packed with informative articles and online talks he’s given over the years and is a veritable masterclass for any cybersecurity aspirants. Well worth checking out in your spare time. He also describes himself as a wearer of berets and a purveyor of magic tricks in his spare time. So we’re going to talk about the differences between security and anti-abuse, the work he does at Google and the skills and experiences our listeners would need to break into the fields of anti-abuse or cryptography.
In addition, I’m happy to welcome back to the program Infosec’s founder and CEO, Jack Koziol. Jack has been following Elie’s talks since 2014, back when Elie gave a talk about attacking Hearthstone, the game, at Defcon, and has been eager to speak with him ever since. So in the next hour we’ll be covering lots of ground. Grab your caffeine of choice and get comfortable because this is going to be a good one.
Elie, Jack, welcome to Cyber Work.
[00:02:31] Elie Bursztein: Yeah, thanks for having me.
[00:02:35] CS: Elie, I want to start with the place we start on every show. We like to get a sort of origin story, superhero style. How did you first get interested in cyber security? What was the initial attraction for you?
[00:02:47] EB: Okay. I don’t know if I can do the hero style, but I can try.
[00:02:50] CS: Okay.
[00:02:53] EB: How did I get attracted to it? I would say almost by accident. When I was growing up, in high school, I was mostly doing sports. I was playing water polo and was probably not the most astute student you could imagine. And then after high school my mom, who rarely had hard conversations with me, had one and was like, “No. You need like a job to pay the bills. That’s not going to happen. And you should get an education.” She always wanted me to have an education. She dropped out of high school. That was important. And so I did what I’d been told and I ended up in engineering school, which is for computers, and it was literally, “Oh, I like video games. Computer science seems pretty good.” And I asked people around me and they’re like, “Yeah, maybe.” They didn’t really know about higher education or anything like that.
You look in the paper and you’re like, “Okay, it seems to be reasonable.” Somehow, some way I managed to pass the bar. And so I almost ended up there clueless, right? And I’ve been working since I was 16. And so I was a plumber. A plumber when you’re 16 basically means you carry stuff from old bathroom that you would redo, the dome, and then you keep carrying stuff.
[00:04:18] CS: Right. You’re carrying the heavy stuff for the journeyman. You’re the journeyman for the tradesmen. Yeah.
[00:04:24] EB: Exactly. So that’s kind of my job, and it’s a horrible job, it sucks, but it’s very informative and I learned a lot of stuff. But I would go at like 5 am to social buildings and try to refactor them before people wake up. That’s not the greatest. But I mean, it pays the bills. We needed money. So that’s what it is. So then I was in engineering school and now I have to go very, very early to school and then I have to finish each night pretty late. It’s pretty intense. I don’t know. Only a few weeks were, like, by design very hard, but it’s 10 hours of work a day. So I need to sleep basically. Also I need to find a new job.
And so it turns out that one of the first early benefits I got from going to this kind of school was meeting people who said, “Well, I have the same problem. Here’s the thing. A lot of us go and do tech support.” And I said, “Okay, what is it?” And they said, “Well, it’s very simple. There is this company –” At the time we were – I mean, we’re talking 1998. So it’s a very, very long time ago. DSL was being deployed in Europe. And so one of the early ISPs that deployed DSL, called Club Internet in France, hired people to help customers get online and resolve their problems with their DSL, cable, modems and things like that. And so it’s like, “Okay, you’ll take calls, and that’s Saturday, Sunday, and you do that on holidays, right?” The equivalent of national holidays, and the pay is pretty good. I would make more than I would as a plumber, right? So I joined them. And I would be able to do school during the week and work during the – well, holidays and weekends. And I did that for about two years.
And to be honest those were probably two of the most tiring years of my life because you don’t get a break, right? And I think at the end, when I stopped doing that, I had done more than 20 thousand calls, which is kind of – let’s say you do quite a few a day. I don’t know if you ever called your ISP, but they try to go very, very fast and we had target numbers on how many calls you would do per day and things like that.
Like all things, by the way, I think it was a very good experience. I got very used to talking, got very used to making myself understood over the phone line. So you get used to talking, you get used to slowing down your pace. You get used to different intonations, you understand people better, you deal with people who are really unhappy. So I think it was really good. It was maybe not the most pleasant experience, though. And so a lot of people in that job were basically startup owners. So what they would do is they would work on the startup during the week and then they would go and do this kind of work on the weekend as a gig on the side, right?
So I met a lot of very smart and interesting people, probably the smartest bunch of people I had ever met in one go, at least at that point in my life. And one or two of them were very, very into security. I don’t want to say the names, but let’s just say they were quite involved in the security community and they told me about Phrack. They did write for Phrack, which was an old e-zine. I don’t know if you covered it a long time ago on the show, but Phrack was essentially a bible where groups would exchange tips and technical information, right? And I’m like, “What are you doing after work?” And they’re like, “Well, we do this.” And then I see people typing weird things into a computer and things start to get funny and I found that very interesting. And I’m like, “Okay, what are you guys doing?” And they tried to explain to me that this is security and that you can actually make programs misbehave in a very, very nice way, right? Remember, we were sitting next to each other. So that’s like during lunch break or dinner break. We used to do evening call support too, right?
And so that’s how I got started. I got started because I met people who were really good. I had an inclination for that, if you really want to go even deeper. My grandfather left Germany during Kristallnacht because of the Nazis, right? And so he entered the resistance. And I was educated to not take things for granted, right? One of the things he would always ask me, even before I did computers, was how does this work, and to try to make sure you can always understand technology, understand what happens to you, to not be controlled by it. And I think that’s why I like Defcon so much, right? It’s a big conference I’m sure you’ve mentioned multiple times. One of the tenets of it is you need to understand technology or else the technology will control you, right? And that’s something my grandfather always tried to teach me: you need to understand the world and you need to understand things to avoid being controlled by them, right? It was very important for him and it’s kind of in my DNA.
I have a very vivid recollection of, I don’t know, I was 12 or 13 and he’s like, “You know, one of the things which will be interesting,” he told me, “was probably we should learn how to hack a satellite,” probably a horrible idea legally.
[00:09:37] CS: Not a bad idea.
[00:09:38] EB: But I think his point was along the lines of: satellites seem to be very important. He was a very technical person. And that’s something you should probably pay attention to because it might be important, right? GPS was coming online. TV by satellite was a big thing, a big new thing at the time. So I think it could make sense. But I remember even when I was 12 or 13 having this input from my grandfather of like, “You need to understand things deeper.” And I think security is this thirst for knowledge and the thirst for what’s behind the scenes: how does this thing work? What is the limit of the system and what happens when the system breaks, right? And I think that’s a mindset.
So I grew up with a mindset. I didn’t know about it. And then it was revealed. And then security started naturally because some of the people I told you about had this side gig. Some of them took jobs at a famous pen testing company. And so after I learned with them and worked with them on side projects, as we went along they basically invited me to do an internship there. And again, when you step up from tech support to being a security consultant the pay goes up and you feel like you are doing a more interesting job. So I learned a lot by being a security consultant for I think six months as an internship, and it was way better.
And then from there I naturally joined the lab of my school, which is a security lab, right? They have the skills research. So the idea was that then I would not have to pay for my education. I would just help teach security. And that’s what I did. It was really interesting. And then very surprisingly, I think three months in, the person who was in charge of doing the course had other things to – a personal problem or something else to do. And so I ended up being in charge of doing the security course. And now you have one terrified Elie who has to teach engineers how to do security. And that’s where I discovered, A, I love teaching. B, it is terrifying. And C, I had to learn a lot of things, and so I tried to teach buffer overflows and format string bugs and system and hardware security.
And from that point forward it became kind of my source of revenue. So that’s how I entered the space and went on to security education: by knowing people. And I think that’s really the story of my life, which is I met people, and if you’re open to changes and you’re open to opportunities, they will come along. Sometimes it’s not what you want, but there might be something which on the program is beneficial to you.
[00:12:16] CS: Yeah. And I want to just emphasize this. We talked about this a lot on the podcast, but the importance of the ability to communicate in a security sphere. And you said it yourself that you talked with a lot of people and you were able to sort of like talk with people at different sort of energy levels or people who had problems or might be in a bad place right now or whatever. And so when you do have these people that you meet, then having that soft skill of being able to communicate with them like you’re more likely to get that opportunity as you said there. So it’s not just about learning the tech of it. You really do need to be a people person as well. Jack, does this sort of line up with your sort of security story?
[00:12:57] Jack Koziol: Yeah, absolutely. I think kind of coming into cyber security, it wasn’t even called cyber security back in the 90s, right? I think it’s just interesting that how you entered the field is very different than how a lot of young people are entering cyber security nowadays. They’re not reading Phrack and they’re not looking at Bugtraq and all those things that were, I thought, really a lot of fun and really drew you in. I just wonder, you know, for younger people that are trying to enter the industry, in my mind it’s good and it’s bad, right? It’s more accessible. There’s more information out there, but it’s less personal. There’s not like a small group of people that are really invested in helping you learn, like they were back – just because I’m old and like talking nostalgically about the hacker scene in Chicago in the 90s that I grew up in, but I don’t know. What do you think about that? Was that your observation as well with younger people that are coming into the industry?
[00:14:13] EB: It’s a hard question, but I think in a way, yeah, things have changed. I think there will always be small communities and I think they still exist. It just happens that our field is so big that the community is not one anymore. It’s a few of those, and sometimes they intersect, sometimes they don’t. I remember the old days as well, right? One thing I can say, and I don’t know if it’s very much public, but Phrack was never one group. There were a few groups. Phrack was in the U.S., but at some point Phrack was handed over to some people in France, some people in Germany, at different points in time. And so Phrack was never one group, right? It was a set of people. We didn’t know the other groups, to be honest. I know who was in Phrack in France, or some of them, to be honest. Maybe some of them I don’t know. I don’t know that we know everyone.
I think, to be honest, very few people know who created Phrack. I do, but I’m not going to say. I don’t even know if it’s public. And I think Phrack was this kind of decentralized organization, a group of people who started questioning things on the Internet because they found it interesting, and then groups started to form. They still exist. Like at Defcon, we have the villages, right? Which are essentially special interest groups on security. The election security group is very tightly run. Everyone knows them. Everyone who works on that would know Alex Halderman and Matt Blaze, who are kind of at the forefront of those. But if you go to the car hacking village, where they do a lot of things on death rows these days, they are a completely different group. I don’t know them. I know of them, right? And so I think you can get into car security very easily by entering that community. And so I think you’re right. You do not enter the security community anymore.
In a way, depending on what your career path is and what you want to do, it will make some things harder. It will make some things easier. I think there is more room. So there is more opportunity, because security is so pervasive. At the same time, if your goal is to be like the guy known for security worldwide, I don’t think that exists anymore. We had people like this at some point. I remember Dan Kaminsky being probably the most well-known researcher, right? And he would go to Black Hat or Defcon and the room would be packed full and then they couldn’t accept more people. And it still happens for some of the talks, but we cannot pinpoint one person in security anymore, and that’s okay. I think that’s okay. I think we’re just bigger, which is good, and we have more career paths, which means more options with different sensitivities. Yeah, you’re right. But I think they still exist to a point. They’re just smaller. Or they are separated, which maybe is good, maybe is not good. But yeah.
[00:17:24] CS: Yes. As things get bigger, it’s harder to sort of maintain that sort of small community vibe. Or as you say, it breaks into several other small communities that each are sort of specialized, I suppose.
[00:17:37] EB: Yeah, and we changed. Jack, when was the first time you went to Defcon? If you remember the very old days of Defcon, I mean, it was by the swimming pool and some of it was clearly not particularly correct. And I do think in a way, yes, it’s less personal because it’s like 20,000 people. You can still speak to the speakers. You can still go to the villages. But I think it got better in terms of inclusion. It got better at making people feel more welcome. Some of the traditions still remain but they have become optional. For our listeners who don’t know, one of the traditions at Defcon is that when you do your first talk you have to drink on stage. However, that changed, right? Now you can decide not to drink alcohol if you don’t feel like it, which I think is better. I think it’s an improvement. And I think as we grow bigger we also get good practices and thinking from a lot of diverse people, which is really good. I think the spirit is still there. I think it’s just that we now need big and small scale events. We are adapting. That’s good.
Podcasts are a new thing too, right? I mean security podcasts, now there are a few. It was not a thing like 10 years ago. Like, why would you do that? So we evolve. It’s good. I like it. I like the dynamic of the field. I think if our problem is that we are going too fast, we’re in a good spot, right? That’s a good shift. If the field were dying, you’d be in a different position, right?
[00:19:02] CS: Right, there you go. So I want to jump into the career aspects of your history here. Obviously, right from your time in the postdoc program at Stanford you jumped right into working for Google, working your way up from research scientist to security and anti-abuse research lead. Can you tell me a little bit about the progression of jobs and responsibilities as you moved through? Because I think a lot of people who hear this are going to say like, “Well, what is that? What’s the difference between the research scientist and the sort of lead of the research department?” Things like that. Can you sort of map that out for me?
[00:19:41] EB: Yeah. I’ll talk briefly about how I ended up at Stanford and how I ended up at Google, just to give context. As I said earlier, I came to security almost by accident. It was not a career plan. That’s very important to know. And so, like everything in my life, I ended up at Stanford because I did a PhD in France, and I did a PhD in France because someone I trusted told me, “You should do research. You seem to be fit for it.” I didn’t know that myself, but at least two or three people came to me and said you should try. It was something I would never have considered. I felt like an imposter, to be clear, and it was really, really hard. And I got very, very lucky to get into a very, very good school. But to be honest I felt completely lost for the first year. I really felt like an imposter because they knew a lot of references and a lot of academic knowledge I had never heard of. So it was hard. It was very difficult.
But then I found a paper that I liked, and then I found out who the author was and I just sent this person an email saying, “Hey, I think I can work on that.” And the person was really kind and said, “Well, okay, here’s the problem you should fix.” When you do research – and we might talk more about how you enter the career – but if you were to enter a research career in security, you usually read a bunch of papers and you find one subject you like. And if the paper is well done, at the end of it you have a discussion section, and this section is: here’s where we are and here’s all the things we don’t know. And then usually you might find one and be like, “Okay, maybe I can do that.” It might be as simple as redoing an experiment. It might be trying this thing or maybe going to that. And so you can get ideas from the community.
I think that’s one of the qualities of research papers. I encourage everyone to read them, and not to read the middle, but just read the abstract for the idea and read the discussion for what’s next. And you can do that for a few papers from the top conferences, and we can cite USENIX Security, which is open access and free to download, or maybe some of the papers people put online. A lot of academic researchers do it. Like I do on my website, as you mentioned, to give people access to that. There’s a discussion about open access for security research. You can read the discussion. You can read the abstract. It will also give you an idea, expose you to new lines of thought. And then the discussion section would give you ideas on where to go. And that’s exactly what I did. I read this paper and it’s very arcane. It’s about attack graphs, which is using formal reasoning to do network planning and things like that. Okay, there’s a problem on how to do two players, which is an attacker and a defender. They used to do single player, and I’m like, “Okay, we can do two players.”
I can maybe do that. I know how to code and I can see. Maybe I can write a prototype. And I didn’t realize how much math it would be, of course, because I didn’t know anything, but I went in with the enthusiasm of the newcomer and I just interacted with that person. And then comes the PhD defense. In France you have to have foreign reviewers. So I invited this person and she kindly accepted. And she’s very senior in the U.S. At the time she was rotating as the head of the NSF, the National Science Foundation. And she kindly took the time and she said you should go to Stanford. They have a group which does something related to that. I used to do foreign [inaudible 00:23:00] security during my PhD, to be clear. And she made the introduction and I got a job. And again, I never planned to go to the U.S. It was not in my mind. I had never been to Stanford before. So I arrived in a new place in December, and that’s how I ended up at Stanford. And the group there is incredible and they love bright people. Also, again, impostor syndrome all the way up, to be clear. Really, really difficult for me. But that’s what happened, right? And I got exposed to new ideas.
So short story is basically I was on a J-1 visa, and she was on a visiting visa, a legal one, to be clear. And she went back to France for a wedding. Nothing very out of the ordinary. And she came back and then she got deported. And she can’t go back to the U.S. And so now I’m stuck with finding an H-1B, getting married, going back to France, or breaking up. I didn’t feel like breaking up, obviously. And so I decided to find an H-1B. I wanted to stay in the U.S., and I feel that the U.S. is still the best place for a tech company, right?
So then, again, thanks to the people I met at conferences. I met a bunch of people in Iceland actually. The person who introduced me to the Google research group, Udvar, I met him at a conference six years before I joined Google, during my PhD. So very random, and you never know, right? You never know who you meet and how they’re going to change your life. And he’s like, “Well, maybe you interview with us and let me show you what Google is.” And so I had to choose. At the end of the day I chose between Twitter and Google, and I decided to go to Google because I liked the diversity of products. At the time we didn’t have all those products, but we had quite a few. Gmail was already there. Search was already there. Chrome was booming. Android had just started.
I have somewhere in my closet the first Android phone. Sometimes I look at it and I’m like, “Oh boy! What a change.” Yeah, so I chose a group where I felt there was the most diverse set of topics I could work on, and I joined to do web security and research, right? I was very much on the track of hard security things. I love Captcha, as you mentioned. Captcha is something I personally was very intrigued by. What is the difference between a human and a computer, right? I mean, AI was not all the craze, right? It was kind of this period where people say AI almost died. There was a lot of hype in the ’80s, and then in the ’90s and 2000s it was like, “Okay, it doesn’t seem to work.” And there were a lot of setbacks before neural networks came back again, and then things got crazy in the last five years, where it is all the rage now. At the time it was more like an intellectual interest.
So I love that. I think people in academia were like, “Okay.” I think what makes you successful is not necessarily what you like. That’s okay. It’s also like, in research sometimes you succeed, sometimes you fail. It’s important to fail. It’s important to accept that sometimes something you like works and sometimes it doesn’t. It is what it is. It’s okay. I mean, so yeah. So I did Captcha and then web security and then I joined Google. And so I joined as a researcher, as you asked.
So what is a researcher? A researcher is usually someone who does a mix between industry and academia. So you would say someone who has a slant for building stuff and writing code while also liking to publish and go to conferences. So that’s what it is. I really enjoy going to conferences and doing out-of-the-box thinking. At the time, for example, because I was doing web security, I finally got Apple to use HTTPS for the App Store. That was probably one of the most famous things I did, pushing Apple to use HTTPS. I had to write a proof of concept where I could hijack the connection and literally install the app of my choice on an iPhone to get them to change. There is a video of that on my YouTube channel if you want, but it’s a stupid thing. It’s just like, there is HTTP. You can hijack the stream and just say, “Hey, here’s an update and here’s where you should download it,” and it didn’t check anything. Like, we should fix that, and it took like a year. But that’s an example of where, if you’re interested in one thing, there is always something to find. You just have to look, right?
And again, back to security and this mindset of wanting to understand how things work. And I was very interested in app stores actually. We were starting the Google Play Store. I mean, the Google Play Store was fairly young. So we did research on what is the Play Store? How do you deliver updates? Things like that, right? Yeah, so Google. So I started there. As I said, I went there for personal reasons but also because I feel that I know my limitations. I’m not someone who grew up with academia in mind. I just learned it on the fly. So I wanted a place where I would be exposed to new ideas, new problems, and I felt that I could – I’m good at seeing a problem and finding creative solutions. That’s really what my talent is. So I felt that it worked really well when I went to Stanford. It really boosted my career to be exposed to new ideas and new things. So I felt that if I’m in a place where there are lots of new things then maybe I can do better. So I tried to find a place which matched where I’m strong, right? And I know that, for example, I’m not a big thinker. I know that. That’s what it is. I don’t have like big ideas.
[00:29:57] CS: You’re more of a problem solver.
[00:29:59] EB: Yeah, I’m more of a problem solver, but I am more of an intuitive one, where I see more quickly, I think, than most people how to get to the solution or how to go about something. And so that’s kind of where I shine. So when you know that, you do that, right? Part of security is how to turn the tables and spin things another way, or apply A to B, right? If I apply A to B, what does it do? You mentioned that, Jack, with my Hearthstone talk. That’s for sure. That is like, “Oh, I have AI and I have a card game I like. If I bridge the two together, what is it going to do?” It probably got Elie banned from Blizzard for a while, and then also a talk. And a lot of angry Twitter followers now, because they tried to follow me for games and now I only talk about security.
[00:30:47] CS: We could probably just cut off the rest of this podcast and you guys could swap Hearthstone and early Internet stories as well.
[00:30:52] EB: Yeah, exactly. That’s what today is, right? And so that’s what it is. So a researcher at Google, whether you’re in AI or in data mining or networking or security, is basically a bridge between industry and academia. And the way I would explain it today, in hindsight, would be trying to unlock the next generation of capabilities to protect products and protect users, right? Google is very much centered around protecting users at least and making Google useful to the world. That’s kind of the overarching mission, right?
So then if you stay long enough the team grows bigger, Google grows bigger, right? Google grew a lot. People might not remember that, but I mean there were 15,000 people when I joined or something like that. You might think that’s big, but we’re really big now, right? So it was still very small. And as things grew bigger, we had more people coming in and the number of products exploded and we started to have abuse problems. Fake accounts, fake downloads, fake ratings, fake views, fake this, fake that. And so the team started to grow a lot, right? That was the G+ era, and then G+ and whatever was related to it. It grew a lot, including what would become Google Photos, and we had Drive and we started to open it. And so it split. And then as a result of that the system grew, and then the person who was the director at the time said, “Okay, you seem really interested in abuse.” I was working on Gmail and accounts at the time, and the black market, and Jack had a question on that. And then we looked at that, right?
And so she’s like, “Okay, would you like to build your own team that will focus on that? We think if you are closer to the product you would do better research, because you will be directly integrated with what the engineers see as problems.” Again, remember, I’m not a big thinker. I think I thrive better with other people’s problems. So I’m like, “Okay, that seems reasonable.” I had never been a manager. Never understood what it means to be managing a team. Didn’t know how that was going to go, but I was willing to try. I was willing to learn. And again, I have impostor syndrome, and I’m there. And I got my first person reporting to me, who is a fantastic person. That’s Kurt Thomas. He’s a fantastic researcher. So I got really, really lucky that really early on the first few people who worked with me were extremely good and extremely nice. And to be honest, I was a horrible manager. I didn’t know what I was doing. So I’m really glad they stuck with me. Kurt is still working with me, right?
[00:33:42] CS: Yeah, I was going to say, is a lot of your old team still with you?
[00:33:44] EB: Oh yeah, yeah, yeah. One other thing we do really well is, I think we only had one person leave the team, barring reorganizations, in the last five years. So we have an extremely high retention rate, which is important because research has a steep learning curve, to be clear. Research probably has one of the steepest, because you have to absorb a lot of things before you’re even productive. So we like to have people around a long time. But I got very, very lucky to get the right context: being on the fast-growing side of the company and having a lot of problems emerging.
The team basically started, I forgot to mention that, with Gmail, right? The first need was a spam filter, phishing filter or classifier, whatever you would call it, that receives emails and processes them, right? That’s why I liked this team, because machine learning is something I’ve loved all my life. I was always fascinated with CAPTCHAs, this idea I told you about of AI and humans. So I was very interested in security and AI, and Gmail was the biggest security classifier, I would say, or one of the biggest. At least one of the biggest at Google. I don’t know about worldwide, but a pretty big one at the time already, right? And you had hundreds of millions of people already. All right. So I thought that was a really interesting thing to work on. So that’s what I did.
And then what’s the difference between a manager and what some people call a tech lead? For those who don’t really know how that works in big companies, maybe I’ll explain that real quick if you guys think it’s useful. Big tech companies, and I think that applies to Facebook, Yahoo, Google and a bunch of others, have what we call levels, right? So levels start at I think two or three and then end up at something like 12 maybe, although they are not called that way at the top. So you usually go up to nine and then become VP, senior VP. And then I don’t know what’s above senior VP. Senior, senior VP maybe? I don’t have a name for that. That’s too high for my pay grade. But the idea would be you start at a level.
So when you’re out of college you usually are hired at three. When you’re hired out of a PhD, you’re at four or five. I was hired at four, right? So I was a level four, which is research scientist. Then when I started to have my own small team, I became the equivalent of what engineering calls a tech lead, TL, or tech lead manager, TLM. The M is for manager. This is a role where you are both a tech lead and a manager. So we can talk about the balance there, the technical versus management balance, if you want, but the idea is you have that. And then you would be around L5. And then you have L6, which is where the word staff starts to be prefixed. So you become staff research scientist, and then you become senior staff research scientist, and that’s L7. And then L8 is director. So research director for me, or director of engineering, or I think they call that principal engineer on the software engineering ladder. I think it will be the same for a security engineer as well. And then different ladders have different expectations.
So a ladder – what is a ladder? I have to explain that. I’m sorry if I take a lot of time, but it’s a little bit of a complex subject. In a big company you are hired to do a job, and a ladder is basically how you get evaluated based on what you’re supposed to do, right? It’s good for the company. It’s also good for you. You understand the expectations and you know what they are. And so it helps to formalize a little bit more what you’re supposed to do. So if you’re hired as a software engineer, then you have a set of things you need to do. Like you need to write code, do code reviews. You are less expected, for example, to speak at conferences, right? You don’t have to be at the conference. If you do that it’s well regarded and you get credit for it, but it’s not necessarily going to be part of your job description. If you’re a research scientist, your job description is very different. If you’re a security engineer, which is more about evaluating products, right? Kind of like a pen tester would be a security engineer, then your expectation is less on writing code and maybe more on testing, right? So you might have a different type of expectation. And it’s important that you find a ladder that corresponds to what you want to do, because at the end of the day it’s a way for the company and you to say, “Here’s what we agree you’re going to get paid to do.” At the end of the day it’s a contract, and it’s important that the contract reflects a win-win situation, right? It has to be win-win. It has to be something you like to do and something the company really wants you to do. There will always be things you might do outside of it. I don’t mean you only have to do that. It just means this is what your bread and butter is, and you should choose one, and you can change. You can change.
I saw people going from software engineer to manager. Two years in, four years, five years in, they realize they spend more time helping people, in face-to-face conversations, managing the team, managing the projects, and they want to become a manager. I saw some people going the other way around, from being an engineering manager to becoming an individual contributor, right? Going back to do more technical work. I saw both ways. It’s fine. It’s just that you have to say what you want to do, because otherwise people have expectations that might not be what you want, and that creates a lot of confusion and that’s not ideal. It also helps you to know what your career path up the ladder is, right? And so a very important thing to consider when you choose a job is, what is the ladder I am signing up for? Because it describes what your job is, in a way. Also knowing that you might change. That’s important. It’s kind of both things.
So staff, as I mentioned, is L6 and above. It just means that now you have stayed a while. That’s very important. And Google can trust that you have Google’s interest at heart, or Facebook can trust that you have Facebook’s interest at heart. So basically when you become staff, it becomes your company in a way. In the old days it would mean that you have more decision power and things like that, but really the intent behind staff was that you have stuck around long enough, and you have been able to carry a few things successfully for long enough, that the company now trusts you to be part of its staff, like its permanent staff or something like that.
The way I would describe it externally would be tenured. It’s not exactly that, because you can get fired of course, but you get that type of thing. So that’s what staff really means in theory, right? I mean it might have changed, and it depends on the company and all that stuff, but that’s kind of the intent. So for that you need to have a big success. You need to have a large impact on something that is important for the company, which usually takes years, and that’s when you get staff. For us, it was that we launched a bunch of internal stuff which was very useful for the company. And so as a result of that I became staff.
Senior staff is when you’re able to repeat multiple successes on a consistent basis, and in a way that is the last level before you become an exec. So it’s a proving ground: do you have what it takes to become a director or VP maybe? Although, again, it’s a pyramid, so at some point you will stop. So it’s about this consistency, and you invest a lot of your time in being very successful. But the important thing is, the more you go up the ladder, the more important soft skills become, right? Because it’s a lot of confidence building. It’s a lot of team relationships. You interact with a lot of people.
I might still be on the IC ladder, which is the individual contributor ladder, right? I’m still a researcher. But I do have 40 meetings a week. And I’m not joking. I have 20 to 30 hours of meetings every week because I work with Gmail and I work with G Suite and Google Cloud and this and that. And I’m really happy to do it, to be clear. I’m not saying it’s a bad thing. I’m just saying soft skills become very, very important.
[00:41:51] CS: Yeah, that’s what happens when you get high enough in the ladder is that you – Yeah, all your time is spent sort of showing other people what you’ve been working on.
[00:42:01] EB: Yeah, so that’s what it is. And so that explains what is kind of the ladder.
[00:42:07] CS: Yeah. Jack, do you have anything you wanted to add to that or –
[00:42:12] JK: Yeah. I was just curious, talking about the academic interface with industry: how much do you think collaborative anti-abuse best practices are shared amongst tech companies or the industry in general, versus that being more of a one-way street that’s coming from academia? I’m sure there’re all sorts of very proprietary things that happen at Google that it wouldn’t make sense competitively to share with other companies. But I guess from an academic point of view, how does all that work in anti-abuse? And do you think it’s a problem, or do you think the interface between academia and industry is a system that’s working well? Anything you can share on your opinions of that?
[00:43:18] EB: Wow, it’s a hard question again. But yeah, okay. I’ll try. I think there is collaboration. I think there are things which work well. For example, I think tech in general is pretty good with working groups, right? At the end of the day we do interoperability, right? If you think of information technology, we’re just pushing bits. That’s kind of our job, right? We move bytes from there to there to there in a certain format. And we try to secure those bytes. So I think interoperability is pretty good. I think there are concerns about who can weigh in on those working groups, but there are working groups, and I think there’s a lot of good open source software, like Nmap, right? Which is a network scanner, which works really well, right? It’s developed by the community. That’s fine.
I think in general, research, and that’s an interesting, confusing, difficult but easy-to-resolve thing, raises the question: who is a researcher? My stance is everyone who says they are a researcher is one. If you feel you do research and you feel like you’re advancing the state of the art one way or another, then you’re a researcher, right? It’s very much a self-claimed title. It’s not like academia where you have to have a PhD and you have to be in academia. I think a security researcher is just someone who is trying to learn or trying to advance the state of the art one way or another. And so that’s why we have vulnerability researchers, right? Vulnerability researchers are finding new vulnerabilities. They add something to the state of the art. People who do R&D, research and development, are the same, right? It doesn’t necessarily require any publication. It’s not needed in our field. I think we have a lot of good strong industry conferences which are more or less technical, right? You have DEF CON. We also have Black Hat, or RSA, which is a little bit more high level. And then you might also have more closed conferences, which are on more special topics. For example, botnet takedowns mostly happen, for good reason, behind the scenes, but they are also very collaborative. We could mention some of the non-profits that work on that, like Shadowserver, right? So there is collaboration for sure. I think that’s good.
I think we can always use more. I think the difficulty is not on the technical side. Most of what we use, other people use too. People use TensorFlow; TensorFlow is a public [inaudible 00:45:58]. Are we using specific things, maybe some slight improvements here and there which are very specific to our product? For sure. We also use a lot of other things which are not AI-based. What’s very important in anti-abuse and security is not just AI or crypto. There are a lot of things like access control, incident response, reverse engineering, malware analysis, data mining. There are a lot of things in security to get right. And again, one thing I was mentioning to Chris when we prepared the podcast is, when talking about security at Google, are you asking how we protect Google itself or are you asking how we protect our products? Because that’s not the same.
Because one is a company, maybe a big company. The other one is a multi-billion-user product. That’s not the same. And we have different problems for both of them. I personally feel a calling to protect users. So I’m on the product side. So I can’t really tell you much about our internal processes. But I think corporate security and large-scale product security are very, very different, right? And so for large-scale products, which is what I know best, we do have some technical edge. But at the end of the day the main problem we have, and it’s a very, very difficult one, is that the data we have is not ours. It’s not given to us by the users. It’s actually entrusted to us by our users. That’s very important. We have this very, very strong line in the sand. And I will say it on record: we do not share user information. We just don’t. You can come with a warrant, and then that’s a different question. But is Google willing to share user data with anyone? Absolutely not. Would we ever have shared the data in the interest of research? No, we did not.
We do tend to try to share aggregated statistics. We sometimes publish papers on how we do machine learning for malware. When I go to RSA, one of the favorite things people keep asking me is, how many phishing campaigns do you see? Are they targeting Japan more, or are they targeting France more? That kind of thing. And I think people want to know. So we can tell you statistics, and that’s fine because that’s not about a user or a company. It’s about the whole system. That’s okay, I think, and it’s anonymized data with thresholds, right? We don’t go deeper. We don’t want to deanonymize people.
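A minimal Python sketch of what sharing aggregated statistics with a threshold could look like. This is an illustration only, not a description of Google's pipeline: the function name `aggregate_with_threshold`, the `min_users` parameter, and the sample events are all hypothetical. The idea is to publish a per-country count only when enough distinct users fall into that bucket, so small groups are suppressed rather than exposed.

```python
from collections import defaultdict


def aggregate_with_threshold(events, min_users=3):
    """Count distinct targeted users per country, dropping small buckets.

    events: iterable of (country, user_id) pairs (hypothetical schema).
    min_users: hypothetical suppression threshold; buckets with fewer
    distinct users are omitted from the report entirely.
    """
    users_by_country = defaultdict(set)
    for country, user_id in events:
        users_by_country[country].add(user_id)
    # Only the aggregate count leaves this function, never the user IDs.
    return {
        country: len(users)
        for country, users in users_by_country.items()
        if len(users) >= min_users
    }


# Made-up sample data: three distinct users in JP, two in FR.
events = [
    ("JP", "u1"), ("JP", "u2"), ("JP", "u3"),
    ("FR", "u4"), ("FR", "u4"), ("FR", "u5"),
]
report = aggregate_with_threshold(events)
print(report)
```

With these sample events the FR bucket has only two distinct users, so it is left out of the report. A real system would need more than a simple cutoff, but the sketch shows the shape of the trade-off: coarser statistics in exchange for not exposing individuals.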
When it comes to sharing, we will not share user data. So a lot of things, like fake accounts, are out of the question. Even private data is only seen by the machine, right? So for example for Gmail, when we train a machine learning model, we don’t see the emails unless they are reported by a user for the specific purpose of us looking at them for security. And again, they are reported to us for security, which means they are used only to debug why the classifier might have failed. That’s the job of the report. That’s what the expectation is. That is what we’re supposed to do.
So when people say, “Oh, Google has a lot of data.” We are entrusted with a lot of data, but for anti-abuse or security, we are entrusted with that data to make the product more secure. It’s not to entertain a study. It is not to entertain a case study. That’s just not helping our users’ security, so we don’t do that. And the same goes for every other company, right? This is one thing that is very important. So some things we might be able to share and some we might not be able to share. And with regulations and things like that there are even more constraints around data retention and things like that. So even things that you find on the internet are not yours, right? We might index the web, but the web is not ours. It’s owned by other people, so we don’t have the right to share it. They have to do it themselves. That’s their job. It is their content; we just happen to know it exists, which is a very, very different thing.
So I think the main problem in anti-abuse is that it’s all about big data sets, and big data sets are hard to build. So we hope at some point to find a way, and we’ve been working on it for a long time. I think reproducibility is a problem. I would agree that reproducibility between what company A claims and what company B claims is very hard to verify, and it’s really, really difficult because we don’t have those big data sets that they have for AI, right?
One thing which is very well known in AI is what’s called the ImageNet moment, where Fei-Fei Li created this 14-million-image data set, right? And that fueled all the research around computer vision and getting better neural networks to detect things, and that really helped academia. We don’t have those data sets yet. We’re trying to find ways to do some of it. I’m sure other people are doing the same. So yes, we have a problem of reproducibility in our field. We have a problem of sharing data.
The technology in security [inaudible 00:50:56] is very mundane I would say. We use a lot of things from other fields. Crypto might be unique to us, but the rest of it is like, “Oh yeah, we use a network stack like everyone else.” You want to run your own intrusion detection system? Well, you’re doing regular expressions like everyone else, or you do machine learning like everyone else. So I don’t think we have adaptation. We are more an integration field rather than a fundamental field. The exception being that we need data sets which represent the state of the world, and that’s really, really hard, and sometimes illegal by the way, right? There will never be a data set for child abuse, right?
An example of an abuse that you need to fight when you’re in a company is that you don’t want child abuse on the platform, but you cannot legally own or retain child abuse images. That’s not going to work, right? So you have to work with industry and you have to work with, in the U.S., NCMEC, the National Center for Missing & Exploited Children, on them creating hashes, right? And we have hashes to do the matching, and people say, “Why don’t you do better?” The answer is because that’s really hard. And so I think there are constraints around data, and I think data is the number two issue in the field. We can talk one day about the number one issue, but the number two issue is data sharing. And there is no good easy answer. So it’s all right.
[00:52:20] CS: Yeah. I want to move over – we’re getting into the realms of what anti-abuse is and what you do in anti-abuse at Google. And so I want to pivot this towards the career aspect of things, because a lot of our listeners are looking for their future job. They’re interested in the cyber security industry but they don’t necessarily know all the areas. So I want to consolidate a few questions here and ask you a bit about the difference between a career in cyber security and a career in anti-abuse. And also, specifically, if you’re interested in anti-abuse as a career, what are some things that young aspirants should be learning or doing, or accomplishments they should have on their resume, that would get them noticed, say, if they applied at Google or somewhere else in that type of work? What do you think people in 2021 should be focusing on if they want to do the kind of work that you do?
[00:53:18] EB: Okay. Good question. This one is easy. So I think if you do cyber security, there is a set of skills which are very core to the field. Maybe you start with learning about web security and how to secure a website, which is something that is somewhat stable at this point. XSS, CSRF, SQL injection and pen testing are things which come to mind. That is one path. You can also have the compliance path, where you look more at whether your infrastructure is compliant. Do you have a disaster recovery plan, access control, things like that, which are also very important, right? I mean, to be honest, you may also have to do all of it if you’re in a small company, right? If you’re in a small company, you are the engineer, the security person and the CISO or whatever you want.
[00:54:13] CS: Yeah, salesman, custodian, whole thing, yeah.
[00:54:15] EB: Exactly, right? And you’re holding the fort in your spare time. And so in that case, you are more on the system and network path, with a slant toward crypto, right? In that path I would choose to become proficient in one operating system, to understand how it works. So the basics of authentication, the basics of what a patch checker is. I think one of the big things you do in a company is, how do you update things? Because you always have this problem of, am I updating and breaking things, or am I leaving this thing around? What is the vulnerability? When should I update? Patching is a very, very expensive thing, right? And so that’s risk management, and I think you’re managing the risk for your company. What do I back up? How much effort goes into backing up?
An example of that, and here we’re coming closer to abuse, is how do I secure accounts? Do I require two-factor, right? I think one of the most important things we can do is second factor, right? I’m putting on my crypto hat here, but I think second factor is very important. Security keys are useful, very powerful. Probably the best thing for protecting against phishing. The problem being you have to distribute them. If you have a bunch of employees, specifically remote ones, what happens when they lose their security key? Can you ship one to them? Do they need to buy one? How do you get them to recover their account? That’s kind of the broad perspective.
If you’re in a bigger company then all those duties are separated. So you now enter the question of which branch am I in? Am I on the IT side of it, where I’m securing the corp, right? The corporation, which is my network, and in that case it’s intrusion detection, firewalls. And it’s getting back to network and systems, right? Or am I working on the website or the web services? Then web security and crypto might be more interesting to you. And then abuse is mostly about user-generated content, right? And so now you’re like, “Okay, I have an e-commerce website.” I might get fake reviews. I might get fake transactions. I might get unwanted comments. Some people might decide to upload pornography to your website even though you don’t know why, but people do that. It’s been, I don’t know, maybe 20 years. We try to tell people not to upload porn to YouTube. There are other places, and it still happens and we still don’t know why, but people really think YouTube would be good for pornography, and it’s like, probably go somewhere else. It’s not that you shouldn’t do it. It’s just probably not on that product specifically.
[00:56:43] CS: Right. It happens all the time.
[00:56:45] EB: Happens all the time, right. So you have user-generated content, which is not necessarily – and it’s a very difficult thing, right? So in a way cyber security is zero or one, right? It’s let in or let out, right? Zero or one. And I think the 0.001 to the 0.99999 is where abuse is. So in that case you’re looking more at a data science track, right? So you’re looking at data mining or machine learning. Not necessarily machine learning. Data mining is fine too, or any kind of statistical background. So the abuse team will recruit a lot of people who have a background in data science. They can learn anti-abuse along the way. Anti-abuse is just one place to apply it. I always believe that having good fundamentals is the key to a good career. Good fundamentals means you need to be proficient in one programming language. I think that’s very important in our day and age. You have to switch sometimes. I started my career writing C, right? And then I arrived at Stanford and it was all the rage about Ruby, and I really love Ruby, but then I joined Google and Google is like, “No. We have Java, C++, Python or Go.” It’s like, “Fine, I’ll pick up Python and now I’ll write everything in Python.” But I write code every week, right? And I’m going to talk about that –
[00:58:08] CS: Right. You can’t put all your eggs in one basket. You can’t just go –
[00:58:11] EB: You cannot do all of them, but I think you should be comfortable with one. And Python is the data science language at this point. You have R, but it’s very much the older generation; most of the newer packages are in Python. Most of the machine learning frameworks and things. You can have back ports to R, but I would say Python. If you know the data science packages in Python, do a small project you like. Do you want to try to analyze CAPTCHAs? Take CAPTCHAs and recognize them with OCR. Do you want to look at phishing pages? There is PhishTank, which is a free data set from which you can create a small data set. Or would you like to have even basic experience with machine learning? Do you know what TensorFlow is, or PyTorch if you are going for the other world, right?
Yes, I think that’s that. I think you just recognize what fundamentals you need. And I think a big company is more interested in having people who can transition between jobs, actually, because they want to recruit you and they recognize that you might want to change job or change focus. So then they are looking at whether you would be able to perform beyond what you are hired for.
So another way to say it is that premature optimization is the root of all evil. And I would say early in your career you should just be very good at one part of computer science, whether it’s data science or whether it’s system and network. Both of them will serve you extremely well. Some people have a passion for system and network. Some people have a passion for data science. Some people need to do both and figure out which one they like.
I did system and network for many, many years. I love data science now. I don’t know where I’m going to end up in five years. I just know it takes ten thousand hours, right? Honestly, mastering a field it’s ten thousand hours of work. And so choose the one you like and you feel like you’re not burned out in the morning, right?
My test for people is, are you going with a smile to sit in front of your computer, right? Are you smiling while you’re doing things? Do you find pleasure in it? It doesn’t mean it’s not difficult sometimes or it’s not tedious sometimes. That’s okay. But at the end of the day is it something you enjoy? And so I would say those are the two main tracks. I don’t know if you guys agree, and maybe other people have other opinions. I think system and network is one track. Data science is the one which is kind of the best one for abuse, to be honest. That would be my recommendation. And if along the way you decide that you don’t want to do security but you want to do maybe more data science, or you want to do, I don’t know, other things like more machine learning, which is also in high demand, then you can fork, right? So in a way you’re not sacrificing much by doing data science projects, basically.
[01:01:04] CS: So we’re coming up on the end, on an hour here, and we could probably talk with you for hours more, but we want to let you finish your day here. But, Jack, before we go, do you have any other questions for Elie that you want to end this on?
[01:01:18] JK: Yeah. I guess just to kind of wrap it up, one thing that I’m really curious to ask is what you feel the future of attacking machine learning will be as AI and ML become a bigger part of society and are responsible for more and more things. We just saw in the news in the United States that the U.S. military is going to have autonomous drones that will be fully functional with no human interaction. And I just look at that kind of thing and think, this is kind of crazy. But I guess when I think about today versus how we were 10 years ago, you can go on the darknet now and for five thousand dollars you can buy a very sophisticated web and phishing attack package, basically a SaaS tool. What does the future hold for attacking machine learning? Attacking AI? And do you think that in the future, in like 2025, the attack surface is going to be primarily within machine learning? Or I guess if you put on your hacker hat from days past, how do you think this will all unfold, and what’s the future of attacking machine learning?
[01:02:43] EB: All right. So that’s a long question. No, that’s a long one. I don’t know if I can do it in eight minutes. Maybe I’ll take a little bit more time if you don’t mind, because that’s a very important question. You actually touched on the hardest question of the field, right? So there are multiple parts to your question. I’ll try to go one at a time, and I’ll let you tell me if that makes sense to you, because first we need to go back to the very, very beginning, and that’s where we need to start. And I promise I’ll come back to that in a few minutes, I promise. But let’s start with the beginning because I think it’s important to always go back to first principles. What are we trying to do?
So I’ll take Gmail as an example because it’s probably the most well-known and easy to conceptualize. And you know we can talk about that context. I think it applies to everywhere but I think it’s easier if we take a concrete example. So what are we trying to do? What we try to do is to say there is a line and this line is what goes into your inbox or what goes into your spam box. So in a way we try to make a binary decision. So very much like an access control of what goes in and what goes out. So we don’t remove it. We put it into a spam folder, although very few people go into the spam folder. I don’t know. Chris, when was the last time you look at your spam folder?
[01:04:09] CS: Earlier this morning, but I’m a bit compulsive. So – Yeah, your point is taken. I have lost some pretty – Almost lost some pretty amazing emails in a spam folder. The problem is that has gone the other way around where there might be something I actually need in there. But your point is taken. Anyway –
[01:04:28] EB: Yeah, exactly, right? So we make a line, right? And the line is what we put in the spam folder. There are also things which do not go into your spam folder, things that we literally block, like malware. So malware doesn’t go to your spam folder because that’s too risky. We would not even put it in a spam folder. So now we have kind of these two lines. What goes into the spam folder? What goes into your inbox? And the problem is this line does not exist. It’s not all or nothing. It’s not like you know the password or you don’t know the password, or you have the security card or you don’t. It’s, what is the intent behind what you want in your inbox? And that’s what abuse is about: how do we learn the decision boundary of the problem? If you really want to go super precise, or abstract, and really zoom in on what we try to do, that is our job. And so we need to learn this boundary.
Now this boundary applies to everything. Like if you have a forum, you have to make a decision: which comments are acceptable and which comments are not acceptable? Which again depends on the topic, depends on the topic of the blog, depends on your audience, depends on the settings, depends on many things. So basically what we try to do is to use statistical learning, machine learning before that, to infer what the decision boundary is, because to be honest we don’t know. And it’s not humanly tractable, right? And so the first thing to understand is we can’t express it with a rule, otherwise that is what we would have done. Don’t use machine learning because you want to use machine learning. You use machine learning because it is doing things you cannot do.
Some people will say that machine learning is the last resort of the incompetent. For me, it’s more like the last resort when the problem exceeds your brain capacity, right? I can’t tell you what the boundary for Gmail is. I just know it exists. And we get good feedback on it, which is users putting things in the spam folder or putting things into their inbox. And then I always mention when I give talks and when I talk about Gmail, I have a talk on it: what are the challenges for AI? One of the things is how you skew your classifier to make its mistakes one way or another.
So for Gmail, what you also want with this boundary is to be on the right side of the decision. So what I mean by that is, as you mentioned, it’s better for us to put something into your inbox that you then put into your spam box, rather than put something into your spam box that you might need and – that you have to go look for, right? People are more okay with moving something to the spam folder than they are with missing an email, right? So that’s a decision boundary where we have to not only draw a line in the sand, which is very, very bizarre, but we also have to make sure that when we don’t know, we are cautiously falling on the right side of the boundary. In a way you’re asking is it fail safe or is it fail open, right? Which is a big question. Security will tell you that most of the time you should fail safe, right? And sometimes you need to fail open, right?
The firewall – If your firewall goes down you have to decide whether you cut all network connections or you let all network connections in, right? That’s kind of the decision for a firewall. For us it is: is it in the inbox or is it in the spam box? If you ask the question what do we do for account recovery? Again, we have to draw the line. Do you get your account back? Are we going for fail safe or fail open? If we’re not sure, we’re going to do the reverse. We’re going to deny and ask you to show more information, right? Again, the decision boundary is important. The way you fail to the left or to the right of it is also very important, and that’s really the crux of what we try to do, right?
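The fail-open versus fail-safe choice described above can be sketched in a few lines of Python. This is only an illustration: the function names, the scores and the 0.9 thresholds are hypothetical and are not how Gmail or Google account recovery actually work. It shows the same uncertain score being resolved in opposite directions depending on which mistake is cheaper.

```python
def route_email(spam_score: float, spam_threshold: float = 0.9) -> str:
    """Fail open: only confident spam is filed away.

    spam_score is a hypothetical classifier output in [0, 1].
    An uncertain message lands in the inbox, because a missed email
    costs the user more than one extra spam message.
    """
    return "spam" if spam_score >= spam_threshold else "inbox"


def allow_recovery(owner_score: float, allow_threshold: float = 0.9) -> bool:
    """Fail safe: only a confident match gets the account back.

    owner_score is a hypothetical confidence that the requester is the
    real owner. An uncertain request is denied and more proof is requested.
    """
    return owner_score >= allow_threshold


uncertain = 0.6  # the classifier is unsure in both cases
print(route_email(uncertain))     # uncertain mail is delivered
print(allow_recovery(uncertain))  # uncertain recovery is denied
```

Both functions use the same kind of score; the only design decision is which side of the boundary the uncertain cases fall on.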
Now what does it mean for an attacker, right? Because that was Jack’s question, which is what does it mean for an attacker? So now we know what we need to do. How does an attacker go about attacking such a system? Well, you can do two things. One thing is you can try to skew the line in your favor, and we can talk about that in a second if that’s of interest, or you can decide to get us to fail in the wrong direction, right? Those are kind of the two options you have, right?
The reason I say they are different is skewing the line means you pollute our training data set. So this pollution attack is not as discussed as the second one, but it’s very powerful. I’ll give you an example. One thing that we know spammers are doing, and not in the dark web, is they would have a lot of people take a spam campaign or phishing campaign and have a lot of accounts, hijacked accounts or fake accounts, say, “This is not spam. This is not spam. This is not spam.”
So what they try to do is they try to add content to our training pipeline so that the boundary shifts in their favor over time. It is very, very subtle. It’s hard to detect if they do it correctly because they are slowly, slowly shifting where the line is until their email goes through, right? So in a way you are trying to – Because machine learning learns things and then applies them. It learns and generalizes. If it learns the wrong thing, then it can’t do that. There is this notion of backdoored models, right? So you could imagine that if you pollute the data set as a government then it might make a wrong decision. That’s a really important attack. We’re back to data security and data integrity. That becomes very, very difficult when you use user-generated content and reports to help you out. So in our case we rely on our users and the help of our users to know what’s wrong, right? They have the report button and they can tell us by moving things. So we rely on user actions. So now our system depends on user input. And remember the [inaudible 01:10:11] in security, “Do not trust user input” is something we keep repeating. Well, here in a way, we can’t trust it, but we need it. So that’s where the difficulty is. That’s probably one of the less understood parts of the thing.
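The boundary-shifting attack described above can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not a real spam filter: each message is reduced to one hypothetical "spamminess" feature, and the learned boundary is simply the midpoint between the average ham and average spam feature values. The `learn_boundary` and `classify` helpers and all the numbers are invented for illustration.

```python
from statistics import mean


def learn_boundary(examples):
    """Learn a 1-D decision boundary from (feature, label) pairs.

    Toy rule: the boundary is the midpoint between the mean feature
    value of 'ham' examples and the mean feature value of 'spam' examples.
    """
    ham = [x for x, label in examples if label == "ham"]
    spam = [x for x, label in examples if label == "spam"]
    return (mean(ham) + mean(spam)) / 2


def classify(feature, boundary):
    """Anything at or above the boundary is treated as spam."""
    return "spam" if feature >= boundary else "ham"


# Clean user reports: ham clusters low, spam clusters high.
clean = [(0.1, "ham"), (0.2, "ham"), (0.3, "ham"),
         (0.7, "spam"), (0.8, "spam"), (0.9, "spam")]

campaign = 0.6  # feature value of the attacker's campaign email

clean_boundary = learn_boundary(clean)

# Fake or hijacked accounts repeatedly report the campaign as "not spam".
poison = [(campaign, "ham")] * 6
poisoned_boundary = learn_boundary(clean + poison)

print(classify(campaign, clean_boundary))     # caught before poisoning
print(classify(campaign, poisoned_boundary))  # slips through after poisoning
```

The poisoned reports never touch the spam examples; they only drag the "ham" average upward until the campaign falls on the inbox side of the line, which is why this kind of drift is hard to spot.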
Now the thing you were mentioning, Jack, is how people get us to fail on the wrong side on a specific example, right? Adversarial examples, right? So that’s the other side of it. Well, there are sophisticated and less sophisticated ways to do it. There is a manual thing where we have a lot of botnets as a service, right? We also have what we call FUD, fully undetectable documents, where they try to do mutation, obfuscation, encryption. There was this campaign this week with people using Morse code to decode some of the payload. It was completely crazy.
People try to evade, right? You can write hand-crafted rules, and most of the time you use – The attackers use their own intuition on how to attack your system, right? And so that is what pen testing would be, where you use your intuition on how to break a network. Similarly, you can have pen testing on how to evade a classifier or an anti-virus and use arcane features of file formats. That’s a known thing. If you’re an expert in a document format, you say, “Okay, Excel has this very old thingy.” Excel 4 macros are a big thing this year, right, with a lot of APTs relying on them. So it’s a very arcane thing which is available in Excel, and it turns out that attackers knew about it. Some detection systems did not. So you bypass, right? That’s an example of that. Now the question is can you use machine learning to actually create examples that can bypass the system?
So the sad news is at the end of the day, and that is a theorem actually. So that’s kind of like the impending future. So the answer is yes. It’s called a GAN, right? The generative adversarial – Generative adversarial network, which is you’re having two networks which fight each other. One generates content. The other one detects it. There is a theorem which is, at the end of the day, the one who generates content will ultimately win, right? So that is what is used for deepfakes, right? And you can use deepfakes for not-safe-for-work things, which is the one which is the most in the press. You can also use it for disinformation, where they try to create a video or a photo of X talking to Y, or X saying something, and those deepfakes are a big problem because then it’s hard to know what to trust. So you can build detectors. So you can build, and that’s where a lot of the research in the machine learning community is, on how to detect those things, right? So there are glitches because it’s not perfect. But at the end of the day, yes, you will be able to use that to evade a machine classifier. And at the end of the day we think that any transfer is available. So it’s a cat and mouse game for now.
At the end – But when is the end? I don’t know. Not now for sure. You will have the generator overpowering the detection system. So the question is what do you do next? That’s also an active research area. But that also has interesting areas of research. There’s a paper by CMU on how to actually use adversarial networks to evade detection, right? So if you put on the right shape of glasses, a paper by Yue Zhao. Really good paper where they show that if you put on the right lenses, then it doesn’t see you, or it is made to believe you are someone else. So in a way you can also use those techniques for fooling facial recognition.
What you do with the technique is also up to you. And some of it has good application because some people might not want to be on facial recognition, right? Some people might want to not have their – so yeah, that’s where we are. I hope that answers the question.
[01:14:11] JK: Yeah, that’s really a lot of food for thought there I think. Yeah, it’s going to be interesting, like you say, kind of rat race, arms race for the next couple of years.
[01:14:26] EB: Yeah, but we had inflection points, right? If we were talking in the beginning – to circle back and close on the – Back to the root. When buffer overflows or format string bugs came out, there was no defense, right? You would write a buffer – For a buffer overflow there were no canaries. There were no randomized stacks. There was no DEP, right? And then people started to come up with defenses, and then you started to have return-oriented programming, and then defense seemed to overcome attackers for a while and people were like, “Okay, exploit is dead.” And then circle back: now exploit is back again, the attacker has the upper hand, and it’s probably going to go down again. And it’s just the nature of security. If you’re not a player and you don’t like the cat and mouse game and you don’t like the constant learning and the constant re-imagination, that’s a hard field to be in. We keep inventing new defenses and new attacks and that’s the cycle of it, which also explains why at some point you might be tired of it. Maybe after 30 years of keeping track every morning maybe you’re like, “Okay, I’m done.” That’s also a tiring side of it. But I think it’s also an exciting side, it’s like –
[01:15:37] CS: Very real, yeah.
[01:15:39] EB: Always new things. Yeah, today there is a new attack, a new defense. So it means there’s always room for new people. And as you come to the field and you take one side of it, and this will be my last advice, take one small bit. Don’t worry about the big picture. I mean, you can learn about it, but just take one. Take one you like. Take car hacking, take website security, take machine learning security if you would. Take, I don’t know, phishing classification or whatever you want. Do one you really like. Go deep. Learn about it. Advertise it. Try to blog about it or something like that. Post it on social networks and then you will learn and you will enter the community that way. And who knows where it leads? And you’ll get opportunities, that I’m sure of.
[01:16:22] CS: I think that’s an excellent piece of advice to end off the show here today. So Elie Bursztein, I’ve mentioned it before, but I know that your website, elie.net, is probably the best place for our listeners to find you. Is there anything else you would like to direct them to?
[01:16:41] EB: If they want to have some news about security or some news about what I’m doing, I tweet that on Twitter. It’s Elie, right?
[01:16:49] CS: Same Elie, yup.
[01:16:51] EB: Yeah. I was one of the first people to work at Twitter. I got my form –
[01:16:56] CS: You got your preferred –
[01:16:58] EB: I got my preferred username, yes.
[01:16:59] CS: Yeah. Yeah. No Elie1974 or anything like that, okay. Yeah. Well, I want to thank you again for sharing your time and insights today with us, Elie. And thank you also, Jack, for joining us today. Do you have any final announcements you want to mention about Infosec or anything else?
[01:17:15] JK: No. No. Just thanks so much for being on the podcast and thanks, Chris, for another fantastic episode.
[01:17:24] CS: My pleasure.
[01:17:24] JK: Fun being on your show.
[01:17:27] CS: Absolutely. Well, thank you both again for your time. And as always, thank you to our listeners for listening and watching. I just want to point out that new episodes of the Cyber Work podcast are available every Monday at 1 pm central both on video at our YouTube page and on audio wherever fine podcasts are downloaded. I also want to mention, don’t forget to check out our hands-on training series titled Cyber Work Applied. Tune in as expert infosec instructors teach you a new cyber security skill and show you how that skill applies to real-world scenarios. It’s all free. And so you go to infosecinstitute.com/learn to stay up to date on all things Cyber Work.
Thank you once again to Elie Bursztein. Thank you to Jack Koziol, and thank you all again for watching and listening. We’ll speak to you next week.