Identifying worms, bots, fraud and other malicious traffic

Take a deep dive into worms, spam, hijacked accounts, fraudulent transactions and more in this week's episode featuring Fang Yu, CTO of fraud detection platform DataVisor. Fang discusses her work developing algorithms and building systems for identifying malicious traffic, the process of co-founding a security startup and lessons learned from seven years at Microsoft.

Fang started in the Microsoft cybersecurity research department with her DataVisor co-founder, Yinglian Xie, before the two started their company. Fang received her Ph.D. from the EECS Department at the University of California at Berkeley. Her interests center on "big data for security." Over the past 10 years, she has been developing algorithms and building systems for identifying various kinds of malicious traffic, such as worms, spam, bot queries, faked and hijacked account activities, and fraudulent financial transactions. Fang has published many papers at top security conferences and filed over 20 patents. Product-wise, she has helped different online services combat large-scale attacks, with multiple success stories. DataVisor's customers are an impressive bunch: they span the likes of Alibaba, Pinterest, Letgo, most major U.S. banking institutions and some of the largest Chinese insurance companies.

Chris Sienko: It’s a celebration here in the studio, because the Cyber Work with Infosec Podcast is a winner. Thanks to the Cybersecurity Excellence Awards for awarding us a best cybersecurity podcast gold medal in our category. We're celebrating, but we're giving all of you the gift.

We’re once again giving away a free month of our Infosec skills platform, which features targeted learning modules, cloud hosted cyber ranges, hands-on projects, certification practice exams and skills assessments. To take advantage of this special offer for Cyber Work listeners, head over to, or click the link in the description below, sign up for an individual subscription as you normally would, then in the coupon box type the word ‘cyberwork’. C-Y-B-E-R-W-O-R-K. No spaces, no capital letters and just like magic, you can claim your free month.

Thank you once again for listening to and watching our podcast. We appreciate each and every one of you coming back each week. Enough of that. Let's begin the episode.

Welcome to this week's episode of the Cyber Work with Infosec Podcast. Each week, I sit down with a different industry thought leader and we discuss the latest cybersecurity trends, how those trends are affecting the work of Infosec professionals while offering tips for those trying to break in, or move up the ladder in the cybersecurity industry.

Today we're talking to Fang Yu, CTO of the fraud detection platform DataVisor. Fang started in the Microsoft cybersecurity research department along with her DataVisor co-founder, Yinglian Xie, before the two started their company. Fang received her PhD from the EECS Department at the University of California at Berkeley. Her interests center on big data for security.

Over the past 10 years, she has been developing algorithms and building systems for identifying various kinds of malicious traffic, such as worms, spam, bot queries, faked and hijacked account activities and fraudulent financial transactions. Fang has published many papers at top security conferences and filed over 20 patents. Product-wise, she has helped different online services combat large-scale attacks with multiple success stories.

DataVisor’s customers are an impressive bunch and they span the likes of Alibaba, Pinterest, Letgo, most major US banking institutions and some of the largest Chinese insurance companies. Fang, welcome to Cyber Work today.

Fang Yu: Thank you so much for inviting me. Great to be here.

Chris: It's a pleasure to have you. I always like to start every show by asking our guests how and when they got into computers and tech. I wanted to ask you that, but I especially want to know how and when you got interested in this specific aspect of security, working with and rooting out malicious traffic.

Fang: Sure. Yeah, I got interested in computer science very early on. I remember I started programming when I was in elementary school, so to speak. I've been interested in it ever since. As for cybersecurity, I think I got more interested in it during my PhD study. We actually worked on deep packet inspection for network security at UC Berkeley. Then afterwards, I graduated and joined Microsoft Research, where I worked a lot on helping Microsoft's various services protect against fraud attacks.

Chris: Right. What was the appeal of that? Was that just something you naturally transitioned to, or was there something specifically interesting to you about being this guardian against these bad actors?

Fang: Yeah. Detecting attacks is actually pretty complicated and pretty complex. It's very intellectually challenging, and it's also a very, very important task, because without it, I think everybody, including every end user, would be affected. That's a very, very important thing.

Chris: Okay. You mentioned a little bit, but tell me about your transition from working with the Microsoft cybersecurity research department to co-founding your own company. What was the impetus to take your ideas and expertise and use them to start your own organization like that?

Fang: Yeah. At Microsoft, I was part of the research department, but we helped many product teams fight against various types of attacks. As Microsoft is pretty big, it has many different services, and there are constant attacks on Microsoft services. For example, there was the Hotmail spam problem, bot queries, payments, Xbox, etc.

Whenever we saw a problem, we always analyzed the problem and came up with a solution, right? For this type of attack, what kind of solution would be the best defense? We had a lot of these solutions. Over time we felt it was, I would call it, a whack-a-mole approach.

Chris: Sure, sure.

Fang: You see the mole coming out, and the implication is that you have to whack it. Then tomorrow you have to come up with a different way to whack it. It’s –

Chris: That’s cybersecurity.

Fang: Yeah, indeed. Over time, we figured out that all of these attacks are actually connected. Nowadays you don't really have someone physically stealing a credit card to swipe it. We still have those, but it's relatively small scale. The more important problem is the professional attackers. They actually have a network: someone specializing in getting proxy IPs, someone specializing in getting data from breaches, and someone who puts everything together and then attacks.

It's very important to actually find the root of this problem. The way I find is actually account department. We should capture the mole at the hole, rather than wait, look at the mole as it grows, and only then start whacking it. Because at that time, it's always late, right? We don’t know which hole it will come out of later. That's what started the whole initiative. My co-founder and I wanted to build something brand-new, something that captures things before they start to attack and captures things at the root, rather than chasing different attack patterns at the end.

Chris: Wow. Yeah, that's great. You're developing this entire holistic system that goes beyond the immediate – the problem of the day. I love that whack-a-mole analogy. That's really cool. Yeah, I want to know a little bit more about the actual tools here. Your bio says that your primary area of expertise is the creation of algorithms and the building of systems designed for malicious traffic detection, and that you have filed over 20 patents for many tools. Can you tell me about some of your creations in a technical way? I know our listeners would like to know how these tools work and how you came up with the ideas for them.

Fang: Yeah, we have many patents. Actually, I have one. One of the things that is pretty unique about DataVisor is our unsupervised machine learning approach. Capturing fraud or attacks this way is a little bit different from all the other methods, right? People have rules-based systems, and I think the audience must be very familiar with machine learning as well. It has been so hot. In machine learning, you need to give it labels, so that's supervised machine learning. Then you can train a model to detect things.

In anti-fraud, or in cybersecurity, you don't have many labels. Even if you have them, they come months late, because when there is credit card fraud, you only hear a month later that this is bad, or when somebody defaults, it's actually months. By the time you train a model, it is already late, because in our experience, from the global intelligence that we analyze, most attack patterns change within a few days, and the aggressive ones sometimes change within two hours.

Given that, what we developed at DataVisor, and what is very unique to us, is unsupervised machine learning that is able to capture attacks without training data and without relying on constant retraining. It's able to identify new patterns on the fly. This is actually our – a very unique –

Chris: Yeah. Yeah, that's the first I've heard of that.

Fang: That’s why. At the very beginning, when we started the company and talked about unsupervised machine learning, very few people had heard about it. It was such a foreign concept that people wondered why we would even work on it. Today, I'm very glad that more companies are starting to realize unsupervised machine learning is very important. For example, Gartner was even saying that by 2020, a large percentage of enterprises must have unsupervised machine learning.

The essence of why it works is that these attackers nowadays are professional, organized criminals. They don't attack with only one account; they usually either create multiple accounts or compromise many accounts, because one account has limited financial gain. They want to get a bunch of them and then remotely control these accounts to either get a credit card, or start using stolen credit cards, or start making transactions, etc.

Each account doesn't have to have a lot of activity, but collectively, they can still produce a large financial gain. To detect this type of distributed, highly sophisticated, evolving attack, we really need to zoom out and not look only from a single-user perspective, at what this one account looks like … We need to zoom out to see what the patterns are, and whether there are any correlations or coordination among these accounts, because we know that the attackers, because they want to make a profit, need to control a large group of accounts.

Then we ask whether we see any abnormal, fraudulent patterns in that coordinated behavior. That is the essence of unsupervised machine learning. It needs a huge amount of computation, because for every single event coming in, you're no longer looking at that single event alone; rather, you're looking at these users or activities and how they correlate with all the other users.

Being able to do that in real time is, system-wise, very challenging; it's a hard task. The moment you have a system able to do that, the benefit is also huge. We are stepping one level up, so we are always able to oversee the whole population and identify new patterns as they come up. That really is one of the inventions from DataVisor that we are really proud of: having that capability and being able to detect things in real time with very high precision. We're not saying, 'This is suspicious,' but rather, 'We detected that this is actually bad.' This is something that our clients can directly act upon.
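
The idea of zooming out from a single account to groups of correlated accounts can be illustrated with a small sketch. This is not DataVisor's algorithm, which the interview does not describe in detail. It is a minimal example that assumes each event carries hypothetical `account`, `ip` and `device` fields, and that links accounts sharing any of those values using a union-find structure.

```python
from collections import defaultdict


def find_coordinated_groups(events, min_group_size=3):
    """Cluster accounts that share an attribute value (IP or device)."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Map each (attribute, value) pair to the accounts that used it.
    accounts_by_value = defaultdict(set)
    for event in events:
        account = event["account"]
        find(account)  # register the account even if it links to nobody
        for key in ("ip", "device"):  # hypothetical field names
            if event.get(key) is not None:
                accounts_by_value[(key, event[key])].add(account)

    # Link all accounts that share a value.
    for accounts in accounts_by_value.values():
        first, *rest = sorted(accounts)
        for other in rest:
            union(first, other)

    # Collect connected components and keep only the large ones.
    groups = defaultdict(set)
    for account in list(parent):
        groups[find(account)].add(account)
    large = [sorted(g) for g in groups.values() if len(g) >= min_group_size]
    return sorted(large, key=lambda g: g[0])


EVENTS = [
    {"account": "a1", "ip": "10.0.0.1", "device": "d1"},
    {"account": "a2", "ip": "10.0.0.1", "device": "d2"},
    {"account": "a3", "ip": "10.0.0.9", "device": "d2"},
    {"account": "b1", "ip": "10.0.0.5", "device": "d7"},
]
GROUPS = find_coordinated_groups(EVENTS)
print(GROUPS)
```

Here `a1`, `a2` and `a3` end up in one group even though no single attribute is shared by all three, while `b1` stays on its own. A production system would need far richer signals, since legitimate users also share IP addresses behind NAT or corporate networks.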

Chris: That's fascinating. I'm just curious, what is the scope of what this unsupervised machine learning is looking at? Is it looking completely globally? Because I'm just trying to get my head around it. If you have a client and they're trying to use your device to keep them safe, what is your unsupervised machine learning tool looking at? Is it looking at the entire world of transactions somehow, or is it just looking at one big enterprise's thousands and thousands of nodes, or whatever? I guess, I don't know what I’m asking.

Fang: We protect one enterprise at a time, but globally, DataVisor is currently protecting 4 billion user accounts and their events. We operate at very, very high scale. For each client that we protect, they can put in different events, for example, all the login events and all the transaction events. The more events they add, the more protection we can provide.

We are also very proactive. Many attacks we can capture even at registration time. When a fraudulent account registers, we already know the account is bad, and we prevent it from conducting bad activities going forward.

Chris: Okay. Yeah, that's amazing. Immediately after this interview, I'm going to go back and look at all of your company's website. I'm so curious about all this. That's great. The purpose of the Cyber Work Podcast here is we'd like to give our listeners some career advice and give them a sense of what different types of jobs are like on a day-to-day basis. I was wondering if you could walk me through an average day for you as a CTO at DataVisor. How is your day structured? When you get into work, how much of your day is spent on planned tasks? How long before everything turns into emergencies and you have to put out fires for the rest of the day, things like that? Do you spend a lot of time working with clients, or do you actually still work with the tools and things like that?

Fang: Yeah. On a normal day, I get into the office around 9:00 in the morning. Then I do a mix of things. I still work on the algorithm and the system parts. One of our key goals is also helping our clients. We also look at what the emerging attacks are and what the trends are.

We have a small research department on top of all the attacks. We publish a quarterly fraud index report, looking globally. We pick topics, for example, incubation periods: when attackers come to attack, what do those incubation periods look like? If attackers create content, what does the content look like? We also compare different sectors: for attackers targeting, for example, financial institutions versus social properties, what are the differences?

We publish a quarterly fraud index report on all of those. Anybody interested can go to our website to download them. So we do the fraud research, and we’re also helping clients solve their real-world problems. We have a wide suite of products, not only the unsupervised machine learning that I talked about, which is our flagship product.

We also have a lot of other things, from an SDK product for data collection to a feature platform that allows analysts to create a feature in real time, test that feature in real time and immediately put it into production. Live ops capabilities are also very important for us and for our clients.

There is also the advanced rules engine, which lets you write rules based on new patterns and then deploy, backtest, monitor and feed back. A rules engine on its own, I think most people consider pretty fundamental now, right? Many people have one. The other unique thing about DataVisor is that we can automatically generate rules. As the unsupervised machine learning sees what attacks are happening, it's able to summarize a pattern and automatically suggest a rule to put in.

We have an automated rules engine that is able to generate a lot of rules every day, retire the rules that are no longer effective and manage those rules. We want to have different layers of protection: supervised machine learning, unsupervised machine learning, rules, everything. It's a mixed suite of products that we provide to the client.
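
The backtest-and-retire loop described here can be sketched in a few lines. This is only an illustration under stated assumptions, not DataVisor's rules engine: the `Rule` class, the event fields (`signups_per_device`, `account_age_days`, `amount`) and the precision threshold are all hypothetical, and rule generation itself is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Event = Dict[str, float]


@dataclass
class Rule:
    name: str
    predicate: Callable[[Event], bool]


def backtest(rule: Rule, labeled: List[Tuple[Event, bool]]) -> Tuple[float, int]:
    """Return (precision, hit count) of a rule on (event, is_fraud) pairs."""
    hits = [is_fraud for event, is_fraud in labeled if rule.predicate(event)]
    if not hits:
        return 0.0, 0
    return sum(hits) / len(hits), len(hits)


def retire_stale_rules(rules, labeled, min_precision=0.9, min_hits=1):
    """Split rules into those still effective on recent data and the rest."""
    kept, retired = [], []
    for rule in rules:
        precision, hits = backtest(rule, labeled)
        if hits >= min_hits and precision >= min_precision:
            kept.append(rule)
        else:
            retired.append(rule)
    return kept, retired


RULES = [
    Rule("many_signups_per_device", lambda e: e["signups_per_device"] > 5),
    Rule("new_account_big_spend",
         lambda e: e["account_age_days"] < 1 and e["amount"] > 500),
]

# Recent events with fraud labels (made-up illustration data).
RECENT = [
    ({"signups_per_device": 9, "account_age_days": 0, "amount": 20}, True),
    ({"signups_per_device": 7, "account_age_days": 3, "amount": 10}, True),
    ({"signups_per_device": 1, "account_age_days": 0, "amount": 900}, False),
    ({"signups_per_device": 1, "account_age_days": 40, "amount": 30}, False),
]

KEPT, RETIRED = retire_stale_rules(RULES, RECENT)
print([r.name for r in KEPT], [r.name for r in RETIRED])
```

In this made-up data the second rule only fires on a legitimate purchase, so it is retired, while the first rule still fires only on fraud and is kept. Running the same check on a schedule is one simple way to keep a rule set from going stale as attack patterns shift.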

Chris: Okay. That's really interesting. I talked to a fair number of CEOs and CISOs and CTOs who they get to a certain point in their career and the thing that they love doing in security, actually getting in there and dealing with threats and things like that gets pulled away from them and they just find themselves just in client meetings, or just dealing with these macro-level things. It sounds like you still have a pretty good balance between working with the client and getting what they want, but also you're still really on the frontlines in terms of making things in the way that you were when you first started with the company. Is that the case?

Fang: Yeah. Of course, we have a very talented team. It's not me alone; it’s together with the teams that we have. As I said, we have a research department, and we also have a lot of talented engineers working on computing the unsupervised machine learning and all the other systems. It needs a lot of system effort as well.

I would say the success of the product comes down to three aspects. The first is the algorithm: how to run unsupervised machine learning and supervised machine learning well and effectively. The second aspect, very, very important, is the system aspect. Today, everything is big data. We have billions of events coming in every day, and handling these events in real time brings a lot of system challenges. For us, everything is distributed. How to handle each task in a distributed way with high efficiency is also very challenging. The third part is the domain expertise. You can't just apply off-the-shelf models; for the detection, we need lots of expertise. That’s why.

Chris: Okay. Your bio noted that you had multiple successful stories of combating large-scale attacks. Can you give me some examples of your tools and algorithms preventing major disasters? Do you have any cool war stories, things that could have been disastrous, but you helped save?

Fang: Yeah. There are multiple of those. I'll give two examples from different industries. The first example is for financial institutions, in terms of credit card openings. For credit card accounts, especially in the US, the current method of detecting fraudulent account openings is pretty broken. This is because of all the data breaches we have had in the past. I think attackers have all of the good users’ information, for example, where they live, who their past employer was, what their street address is.

They can impersonate these legitimate users with very high FICO scores; they just pretend to be these users and apply for credit cards. It's very hard for a financial institution’s existing model to distinguish a true user from a false one. If you make applicants do a lot of ID verification, etc., that's also not very user-friendly, right? That system is currently under attack. Moreover, the fraudsters are even more creative and actually create synthetic IDs.

For example, the identity is someone who doesn't even exist, or they get the Social Security number of a really young child, start incubating a profile and then pretend to be this person. Over time they can get multiple cards, and at the end, they get a card with a large credit limit, immediately use it and then are gone.

Chris: That is a long game right there.

Fang: Yeah, exactly. They are very patient and incubation takes years sometimes to create –

Chris: Wow. Unbelievable.

Fang: - a fake identity. Where we are protecting them is in identifying these fraudulent applications right at application time. We identify all of these as part of the bad applications, even though their credit scores appear pretty good. That's something where we have helped multiple financial institutions save on fraud losses, because each credit card can have a large credit limit. Those are some of the real fraud stories that we have helped them prevent.

The other case I want to give, for a social platform, is account takeover. For one of our clients, the fraudsters launched a large attack at Thanksgiving time. Most people were off, and the attackers were trying to hit the client with different logins. The attack was pretty distributed. If an attack comes from a few IP address ranges, etcetera, it’s pretty easy.

This particular attack, we suspect, went through IoT devices across the world, because the clients all claimed to be iPhones, but the pattern coming in was not really that of an iPhone. It was emulators on IoT devices. It appeared to be a very, very global attack attempting account takeovers, and we were able to block those.

It’s very interesting. Initially, when they were blocked, they saw this and quickly evolved the pattern. In the first version, if they got through, they would do certain things right away. In the second, if they got through, they would incubate, pretending to be inactive, because doing things immediately after getting in is a very clear signal.

We saw different migrations of these attack patterns, including some that passed very quickly. As for the scale of the attack that we helped the client block, it was multiple millions of accounts that we helped protect against this type of sophisticated attack.

Chris: Okay. That leads perfectly to my next question here. You've been working with these types of malicious traffic, bots and worms and spam and hijacked accounts and credential hijacking and so forth, for more than 10 years now. Can you talk about the ways in which these security headaches have changed and grown more sneaky and complex over the years? I mean, it sounds like after 10 years, it's almost a completely different game. What was it like in 2010, versus now?

Fang: Yeah. In 2010, I think we started to see more of these sophisticated attacks, but not as many as now. Now, almost every industry is affected by these sophisticated attacks. The amount of sophistication has also increased quite a bit. If you look at 2010, at that time it was mostly through proxies, etcetera; sometimes they even had their own market ranges. After that, a lot of them moved to data centers. Now there are two trends that we have seen. One is that new data centers are coming up all the time with the advancement of cloud computing, and some of them provide the grounds for the fraudsters.

Fraudsters are also moving forward and using botnets. Those are a little bit harder to detect, because these are normal users’ ranges, right? There are real users on top of them. Then, even more sophisticated: 10 years ago, although adoption of two-factor authentication was low, people thought two-factor authentication could block most attacks. Nowadays, two-factor authentication is more prevalent. Still, attackers are able to bypass it, either by changing the phone bindings or sometimes even by hijacking the signals that carry the verification codes.

We are also seeing a lot of simulated traffic. Attackers are very sophisticated now, and we see fully programmed setups. What they need is just these small devices, this little pool there. Each one needs a SIM card. Everything connects to it, and they can simulate an app and then do all kinds of actions. They have multiple programs to pretend to be thousands of apps, and everything goes through DPS to be distributed. In that way, they can massively generate application-level attacks that closely mimic normal user traffic. We are seeing them apply a lot of new techniques.

Chris: I heard you mention the cloud. The mass migration to the cloud, has that changed the playing field? Has that made things harder to track or stay on top of?

Fang: Yeah. The cloud is definitely making things harder, because there are a lot of VPNs set up on the cloud. We cannot say all traffic from the cloud, from AWS, is bad; that's not true, because there are legitimate VPNs running on those cloud providers. We even find that sometimes corporate or secure routers redirect traffic to the cloud to perform some security action, like scanning, and then send the traffic back out from the cloud.

There are a lot of ways that the cloud can generate legitimate traffic. We cannot use the cloud as a blacklist, although on some clouds it really is the attackers going there to set up their computational resources.

Chris: Okay. Yeah, so that's something that probably is not going to be solved anytime soon. That just sounds like a huge problem.

Fang: Yeah. That's something that requires us to have all kinds of signals. You cannot just rely on traditional blacklists saying this cloud provider or this range is bad. Rather, we need to take that as one signal, combine it with all the other signals and detect on that basis.
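
The difference between blacklisting cloud ranges and treating them as one signal among several can be shown with a toy scoring function. The signal names, integer weights and threshold below are assumptions chosen purely for illustration; the interview does not say how DataVisor actually combines signals.

```python
# Hypothetical signal weights: a cloud-hosted IP alone is weak evidence.
SIGNAL_WEIGHTS = {
    "cloud_ip": 2,          # traffic originates from a cloud provider range
    "device_mismatch": 4,   # claimed device type does not match behavior
    "burst_logins": 4,      # many login attempts in a short window
}


def risk_score(signals, weights=SIGNAL_WEIGHTS):
    """Sum the weights of every signal that is present and truthy."""
    return sum(w for name, w in weights.items() if signals.get(name))


def decide(signals, threshold=5):
    """Block only when the combined evidence crosses the threshold."""
    return "block" if risk_score(signals) >= threshold else "allow"


# A legitimate user on a cloud-hosted VPN trips only one weak signal.
VPN_USER = {"cloud_ip": True}
# An emulator on a cloud host claiming to be a phone trips two.
EMULATOR = {"cloud_ip": True, "device_mismatch": True}

print(decide(VPN_USER), decide(EMULATOR))
```

With a plain blacklist both requests would be blocked. With combined scoring, the VPN user is allowed and the emulator is blocked, because the cloud origin only matters alongside other evidence.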

Chris: Got it. Okay, so I want to go back a little bit to your career journey and apply it towards people who are listening who might be interested in doing what you do. Can you tell me a little bit about – it sounds like you love your job. You have a lot of really interesting war stories and things that you do. What are the parts of your job that you love the most and that get you excited to start a new week? Are there any parts that you dread having to do each week?

Fang: That's a good question. For us, I think it's building. Every week, we are building a lot of new products on behalf of solving the problems. Every week, I'm very excited to come to work, to talk to the different clients and the product teams and see how our products are solving real-world problems for them, and whether there is any feedback or anything missing that we need to do next. That's a very exciting part: building things and solving real-world problems.

There are also things that are a little bit more mundane. Now, I do much less of those. When we started the company, just six years ago, we used to do a lot of non-technical jobs, for example, the legal work to incorporate the company and all of those things.

Chris: Yeah, you're wearing six or seven hats probably. Different hats. You’re doing everything. Doing the accounting. Right.

Fang: Now I do much less on those. It's part of the journey. We learned quite a bit along the way.

Chris: Yeah. That's a relief. Sometimes I talk to people out of here who are like, “I'm up 2 in the morning and I can't sleep and there's emergencies coming in at 3.” I'm glad you've hit a point of equilibrium there. Again, the purpose of the podcast is to help people who feel that they're interested in cybersecurity, but might not know where to get started, feel like they have some ideas and options about where to begin their research if they're looking to change.

If our listeners were interested in doing the type of work you do, fighting malicious traffic and attackers in the cloud and elsewhere, what are some skills, experiences, educational milestones, or certifications that would help them get a leg up in your industry?

Fang: I think that depends on what kind of role you want. For example, with us we have different roles, right? I think it's very, very important for anyone interested in engineering to have a very solid engineering background. Here, I think a lot of people are interested in machine learning, and also the deep learning part of that. Those are very important to us as well.

For most people I now interview, almost everybody has deep learning on their resume. I see a lot of former students. There are two other things that I think are equally important. One is system skills. Given the amount of data that we need to deal with now, it's not something you can work on in, for example, an Excel sheet, or even with statistical analysis tools like MATLAB.

You really need to build this through programming on top of it. System knowledge and distributed programming are very important. I know that not many schools make very basic systems courses, like operating systems and networking, mandatory now. That knowledge is very, very important, foremost for building up cybersecurity knowledge and then for really implementing these things, because you need a lot of system knowledge. Of course, machine learning is also very important, because at this scale, you cannot rely on humans.

The third part is domain knowledge. This is a little bit hard. Cyber is not so easy either. I think for cyber and fraud, you at least need working experience at companies. At a company that does detection, you get the experience of looking at different attacks, and then this knowledge comes together, right? Then you are able to say, these are the patterns.

It's also very, very important not to be too narrow, because if you're only looking at, for example, DDoS attacks at one particular institution, your mind is probably occupied by that, and you miss the other possible ways attacks can happen. For us, it was really the variety of our Microsoft experience that led us to start a company, because in Microsoft Research, we worked with different product teams. We saw attacks from different perspectives and understood what they looked like.

Of course, at DataVisor now, because we are protecting so many different clients, that knowledge is large: we see how they attack one client, what the common patterns are and what the newest patterns are in different sectors. That knowledge, I think, is a little bit hard for people to get at school.

Chris: Yeah, sure.

Fang: Welcome to join DataVisor.

Chris: Okay. All right, you heard it here first. Everybody, I'll ship the resume and get ready. Yeah. You started to answer my next question here and maybe you answered fully, but I want to just re-ask here. You said that when you were working at Microsoft, one of the big useful things that happened was that you got to see the attack from all these different points in the process, or whatever. Because I know a lot of people work a regular security job at Microsoft, but very few of them actually make the jump to making their own, founding their own company. Were there certain hands-on experiences you had either at Microsoft or elsewhere in the security realm that helped you move along the path where you are now? Was there a certain project you did there that made your knowledge jump exponentially, or whatever where you're like “Aha! Now I see it.”

Fang: It wasn't any particular one, but as I mentioned, my co-founder and I worked at Microsoft for seven and a half years. Over the seven and a half years, we worked on a lot of systems. Then we found it's more of a whack-a-mole approach. That really drove us to say, “What if we actually want to become researchers?” We want to do something grand and we want to be one level ahead of all these attackers.

That's the main drive: we do not want to chase all the existing attacks anymore. Rather, we want to do something next generation and be able to detect. That was actually the main drive. We had seen all those patterns and we felt the pain of whacking the mole. That drove us to do something generic. No matter which client, every client has their uniqueness, the data is different and the attack is different. That really drove us to abstract out and do something generic that we can quickly bring online, instead of having a customized solution for each of our clients.

Chris: Okay. It really was the point of seeing that there was a completely different way of thinking about it. You were seeing where you were, just dealing with the problems as they came and you said, “There's got to be a better way,” and you found a better way.

Fang: Yeah.

Chris: I love it. One of the topics that we like to cover at Cyber Work, we've had a number of guests on to talk about is the importance of finding new and diverse professionals to join the cybersecurity industry, especially women, people of color and differently-abled. Can you talk a bit about your experiences and/or setbacks as a woman who had to fight her way in a very male-engrained industry?

Fang: Yeah. When my co-founder and I grew up, the education we had was actually pretty supportive of women. We didn't feel that we were special as women, or that we could not do things that men could do. Also, the work that we do is pretty technical. A lot of our peers are male, but we also have a good female population in our research community. In that respect, we never felt lonely as women researchers or women co-founders.

Given that, when we started the company and were doing the hiring, we really didn't take into consideration whether we want to hire a female engineer versus a male engineer. To us, it's all based on the fit. However, there is actually a good fit between us and the landing it.

If you talk about women's ability, in many cases in the analytics field, women could actually be even more analytical, and more careful and patient when looking at the attack patterns. I don't see much of a drawback at all.

Chris: Yeah, I was going to say one of the reasons a lot of people think, “Oh, you’re just trying to tick boxes, or this or that, the other thing.” The more points of view and the more different types of capabilities you have on your team, especially something like this, where you're problem-solving from so many different directions, it seems like it's really important to have people who have – if women have more of an analytical, or a patient approach, then it's good to have that in addition to other skills. If you listen to everyone in the room, then you get a number of complementary voices that might point out gaps in thinking that you might not otherwise think to look for.

Fang: Right.

Chris: Yeah. What tips would you give to women entering the world of security right now? Are there any pitfalls you've been able to sidestep over the years that you would pass on?

Fang: Yeah. I think it’s just, don’t think of yourself as anything special. Don’t overthink that you are a woman. Just be a regular person. I think everybody can do the work equally well. That's how we grew up and were educated. Then just focus on the work and really go deep. To do any work well, I think we need the technical ability to go deep and think things through to find the true findings. Just focus on the work itself.

Then also, on the woman part, there are a lot of other women leaders out there. If people are interested, they can follow other women leaders. There is a community out there, so don't feel lonely.

Chris: Yes. Oh, yeah. There are lots of great online mentoring programs, and we talked to the Women's Society of Cyberjutsu recently, and that's a great organization that is matching up women and mentors and so forth.

As we wrap up today, where do you see the task of dealing with these types of traffic, malicious traffic, like bots and worms and cyber nuisances? Where do you see it all going in the next five to 10 years? You've been doing this for over 10 years. Can you look at the next 10 years, where you think it's going to go next?

Fang: Yeah. I think the next 10 years are definitely more about computation and being able to automatically discover patterns. That's actually very, very important for cybersecurity. Cybersecurity is a little bit different from traditional problems. For example, look at image recognition: you recognize a cat versus a dog, right? What a cat and a dog look like stays the same as the years pass by. A cat looks almost the same now as it did 1,000 years ago. The shape hasn't greatly changed.

For cybersecurity and fraud, never mind a year: even an hour ago versus the next hour is going to be very different. I think what we really want to emphasize is being able to discover new patterns automatically, and to discover them very, very precisely. I think a lot of tools claim to be able to discover new patterns, but they generate so many alerts that people think, “I cannot even handle these alerts. I need to manually go through them” to work out what was actually learned.

I think the automatic discovery of new patterns is pretty important. The second important part is the precision. You want to precisely detect the attack, without many false positives. That's very important for being able to act upon the result. The third one is the explainability. With so much machine learning or deep learning, people say this is something suspicious, but how do you explain why it is suspicious, right? What is this? Rules are very simple. You are defining the rules, so you can say that because of these features, this one was triggered.
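Editor's note: the precision point above can be made concrete with a small calculation. This is only an illustrative sketch, not DataVisor's code; the account IDs, the `precision` and `false_positives` helpers and the two alert lists are all hypothetical.

```python
def precision(alerts, confirmed_fraud):
    """Fraction of alerted IDs that turned out to be real fraud."""
    alerts = set(alerts)
    if not alerts:
        return 0.0
    true_positives = alerts & set(confirmed_fraud)
    return len(true_positives) / len(alerts)


def false_positives(alerts, confirmed_fraud):
    """Alerted IDs that were not fraud; each one costs analyst time."""
    return sorted(set(alerts) - set(confirmed_fraud))


# Hypothetical ground truth and two hypothetical detectors.
confirmed = {"a1", "a2", "a3"}
noisy_alerts = ["a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8"]
precise_alerts = ["a1", "a2", "a9"]

noisy_precision = precision(noisy_alerts, confirmed)      # 3 of 8 alerts are real
precise_precision = precision(precise_alerts, confirmed)  # 2 of 3 alerts are real

print(noisy_precision, len(false_positives(noisy_alerts, confirmed)))
print(precise_precision, false_positives(precise_alerts, confirmed))
```

The noisy detector finds every fraudulent account but buries the analyst in five false alarms; the precise one misses a case yet produces results that can be acted on directly, which is the trade-off being described.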

For supervised machine learning, why is it actually bad? For us in DataVisor, we actually have a full suite of reason-code explanations, and the UI for clients to actually inspect them. That, I find, is super, super useful for all clients to really pinpoint. You tell me this is actually fraud. First, tell me precisely. Second, tell me the reasons. I want to understand the reasons and see whether it actually makes sense, so that I can follow. I think these three are very, very important in the next 10 years of algorithm development for all the solutions. You need to do these.
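Editor's note: as a rough illustration of what a decision with reason codes might look like, here is a minimal rule-based sketch. It is an assumption-laden example, not DataVisor's product or API; the rule names, thresholds, event fields and the `explain` function are invented for illustration.

```python
# Each hypothetical rule pairs a human-readable reason code with a check
# on the event's features.
RULES = [
    ("NEW_ACCOUNT", lambda e: e["account_age_days"] < 1),
    ("MANY_ACCOUNTS_ON_IP", lambda e: e["accounts_on_ip"] >= 20),
    ("HIGH_AMOUNT", lambda e: e["amount"] > 5000),
]


def explain(event, min_reasons=2):
    """Return the decision together with the reason codes that fired."""
    reasons = [code for code, check in RULES if check(event)]
    return {"suspicious": len(reasons) >= min_reasons, "reasons": reasons}


# Hypothetical event: a brand-new account on a crowded IP, small amount.
event = {"account_age_days": 0, "accounts_on_ip": 35, "amount": 120}
result = explain(event)
print(result)
```

Returning the reasons alongside the verdict lets a reviewer check whether the decision makes sense, which is the property asked for above. Explaining a learned model's output is harder than this and is not shown here.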

In terms of the systems in the next 10 years, I think it's also very important to have a super scalable system. That’s something that we try really hard at: to distribute and parallelize every single computation of ours, so that it can be super scalable. It's actually very interesting that we work with some of the largest clients. Security attacks and fraud attacks sometimes come in waves. When a huge wave comes, the existing system cannot keep up because of the latencies, etc.

Then fraudsters actually use that, because they know that when they send a large volume of attacks, your system is not able to process it, because it cannot meet the QPS. Then you will actually short-circuit and just say, “I'll just let this traffic through.” Then all the fraud attacks can actually go through.

When we help the client, we are able to have a very resilient layer of computation, able to spin up more clusters at runtime, and it also has very low computation latency. In such a situation, you’re able to have a very predictable worst-case latency. That's also one of the key points for fraud detection, or even cyber: we really want to have a system that is highly scalable, predictable and able to handle high spikes. That's absolutely key.
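Editor's note: the overload scenario described above can be simulated in a few lines. This is a toy sketch under stated assumptions, not a description of any real system; `process_second`, `fraud_passed`, the capacity numbers and the traffic mix are hypothetical.

```python
def process_second(events, capacity, on_overload="allow"):
    """Score up to `capacity` events in one second; anything beyond that
    gets the `on_overload` decision without being scored at all."""
    decisions = []
    for i, event in enumerate(events):
        if i < capacity:
            decisions.append("block" if event == "fraud" else "allow")
        else:
            decisions.append(on_overload)  # short-circuit: no scoring
    return decisions


def fraud_passed(events, decisions):
    """Count fraudulent events that were allowed through."""
    return sum(1 for e, d in zip(events, decisions)
               if e == "fraud" and d == "allow")


# Hypothetical one-second burst: 10 events, half of them fraudulent.
wave = ["ok", "fraud"] * 5

# A detector that can only score 4 events per second fails open on the rest.
fail_open_leak = fraud_passed(wave, process_second(wave, capacity=4))

# With capacity scaled up to cover the spike, every event is scored.
scaled_leak = fraud_passed(wave, process_second(wave, capacity=10))

print(fail_open_leak, scaled_leak)
```

In this toy run, three fraudulent events slip through when the detector short-circuits, and none do once capacity covers the spike. A real system would also have to bound the latency of each scoring call, which the sketch does not model.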

Chris: Wow. That's a lot to work on in the next 10 years. I'm glad you're on the frontline keeping us safe here. Fang Yu, thank you very much. Oh, one last question. If our listeners want to know more about you and/or DataVisor, where can they go online?

Fang: Sure. Our website has a You can also e-mail me at

Chris: Perfect. Oh, that's awesome. Fang Yu, thank you so much for your time today. This was really, really fascinating.

Fang: Yeah. Thank you so much, Sienko and great talking to you as well.

Chris: Okay. Thank you all for listening and watching today. If you enjoyed today's video, you can find many more on our YouTube page. Just go to and type in Cyber Work with Infosec to check out our collection of tutorials, interviews and past webinars.

If you'd rather have us in your ears during your work day, all of our videos are also available as audio podcasts. Just search Cyber Work with Infosec in your podcast catcher of choice. For a free month of the Infosec Skills Platform which you heard about at the start of today's show, just go to and sign up for an account like you normally would. In the coupon code, type cyberwork, all one word, C-Y-B-E-R-W-O-R-K, all small letters, no spaces to get one free month. Thank you once again to Fang Yu and to DataVisor and thank you all again for watching and listening. We will speak to you next week.
