Hosted by CalypsoAI, The Cyberstream is a podcast series that delves into the technical and societal challenges surrounding AI Security.
In the second episode of The Cyberstream, we chat with Tom Sharp, a Data Scientist at CalypsoAI. In this conversation, we discuss Tom’s career path and how he navigated the road to becoming a data scientist, the future of ethical AI regulations and standards in the United States, and CalypsoAI’s Secure Machine Learning Lifecycle product, VESPR, and its core capabilities for data scientists.
The Cyberstream refers to “Islands in the Cyberstream,” the last book by Joseph Weizenbaum, the MIT professor who created ELIZA, one of the first conversational AI programs. After building the software in the 1960s and seeing its social impact, he spent the rest of his life advocating for responsible AI, or even no AI at all.
For more information about CalypsoAI’s Secure Machine Learning Platform, VESPR, click here.
Mackenzie Mandile: [02:53] Hi Tom, welcome. Thanks so much for taking the time to speak with me today.
Tom Sharp: [02:57] Yeah, definitely.
Mackenzie: [02:58] So today you are a successful data scientist driving how CalypsoAI builds out the machine learning lifecycle product, VESPR — but I know it’s been a winding road to get here. I feel like your interests and experiences are pretty wide-ranging: You have a background in engineering, but you also have supported the US government on the business consulting side. So for you, what initially sparked that interest that brought you to data science and machine learning in the first place? I’m curious what that path has looked like for you.
Tom: [03:32] I kind of tested everything out. When I was really young, I liked inventing things. My father was an engineer and I think that it kind of rubbed off on me. So, I loved creating things, inventing things in the garage, or in my room. Going into college, my idea was to be an inventor and I saw engineering as a path to accomplish that.
But even before then, in high school, I was looking into politics for a little bit, psychology for a little bit, history for a little bit. I eventually leaned into math and science because of a really influential teacher I had in high school. He taught both the honors and the AP physics courses, and he was awesome, and really inspired me to pursue a career in engineering. And that brought back my childhood memories of wanting to be an inventor. So that’s what led me to engineering at school, and I actually pursued chemical engineering.
It’s funny, but there was one instance in my life that actually pushed me towards chemical engineering. It was a Popular Science article where this one Ph.D. scientist was trying to work with fuel cells, and he was explaining how fuel cells could actually be a more efficient energy source and I was really fascinated by that. And it was like, oh, this is the combination of chemistry and physics. And it made me pursue chemical engineering. I quickly found out that at least for my program in chemical engineering, that was not the case, and that if I wanted to do anything in terms of invention or science, I would have to go on to get my Ph.D. So, I stuck with the program, but towards the end, I kind of realized this wasn’t really for me. I had picked up a couple of econ classes throughout my college career and that was something I really, really enjoyed. So, that made me more interested in business, which eventually led to my first job, which was at a consulting firm.
So, kind of just all over the place. Experimenting here and there, but eventually led to a trajectory that got me into the business world, and then that eventually led to finding data science and analytics.
Mackenzie: [05:44] You mentioned your consulting experience in there, and I know you previously worked for Deloitte, but what was that experience like? And did you join as a chemical engineer right out of college, or did you join Deloitte with the intention of leaning into the business development/government support side of the house?
Tom: [06:05] Yeah, I was hired as a business technology analyst, so it was a combination of both business and technology, and I worked in the technology consulting sector. That was good for me because, coming from engineering, I felt like I was going to use more of my chemical engineering skills. But I was specifically in the government space, helping government clients solve their problems, which are usually large-scale, long-term problems. My first client lasted about two years. I was on a couple of different teams doing engineering, but more technical engineering: I was drawing up design documents, designing architecture, and things like that, which was really interesting and really helped me understand how different pieces of software and technology connect. But it was also a little bit daunting because I didn’t even know what a database was back then. Of course, that word gets thrown around, but I wasn’t really sure how databases worked, how they operated, or even how to design a system that would interface with them. So that was a little daunting.
As I said, it was a great experience, but I think a part of me was missing the math that I had learned in undergrad. There was a lot of complex math in chemical engineering that I really missed, so I was trying to find a way to use math in the IT field, and I was searching nonstop. Of course, I came across this thing called analytics, and that’s, I guess, the birth of my data science career. When I found data science, I was just so fascinated with using numbers to actually drive business strategy, or marketing strategy, or financial strategy, literally any component of the business. It was a really fascinating way to combine the math and the skills I learned in engineering with my fascination with economics and the business world.
Mackenzie: [07:55] Definitely, that makes a lot of sense… and that must tie in really well to the Master’s in Analytics that you’re currently wrapping up at Georgia Tech. Congratulations, by the way, I know that you’re graduating soon. So, what caused you to choose this program to really solidify your expertise as a data scientist?
Tom: [08:17] For me, I learned data science on my own for probably about a year before deciding to pursue a more formal education. That decision came about for a couple of reasons: one, to somewhat certify myself and say, I actually have these skill sets. You know, I went and paid X amount of money and got this piece of paper that basically certifies I knew enough to pass 10 classes and earn a Master’s degree. That was one thing.
The other thing was that after a year of learning on your own, you get a little burned out and unmotivated. Even though I was still motivated, it was hard to keep creating a curriculum for myself and then work through it. That was just a little bit difficult. So I definitely saw the benefits of a more formal education as the second step in my learning path for data science.
Mackenzie: [09:07] That makes a lot of sense, and now you’re somebody who has expertise on both the engineering and the data science sides of the house. You mentioned that data science enabled you to approach business strategies and different problem sets. I’ve heard some people say that the difference between engineering and data science is a difference in how you think and how you approach those problems. Do you agree with that? Or, what in your mind is the differentiating factor?
Tom: [09:43] Really, really good data scientists understand what the problem actually is at the largest scale. So, you know, if we’re trying to come up with a new model for some part of the business, how does this relate to the overall business strategy for the next quarter, half-year, year, five-year plan? Having that understanding is super important because it gives you the context to help break down that larger problem into the problem that you’re solving and how those two relate. And the other, I think the more tangible aspect of it is really understanding trade-offs. I think that’s the biggest part of data science and that’s also the biggest part of engineering.
In a perfect world, engineers would be able to just calculate how many beams they need for a bridge, or exactly what material is needed for whatever they’re building. But what actually happens in real life is that there are trade-offs, usually between price and materials, or strength and weight, things like that, and those trade-offs are made constantly, nonstop. Those are the things that don’t have a correct answer, but that’s what engineers are really there for: to solve those problems.
And you can see that in data science as well. The biggest, most obvious one is the bias-variance tradeoff; finding that right balance is maybe 90 percent of what we’re doing within machine learning. But there are other balances as well, like: do we want more performance out of our model, or do we want it to be able to run quickly in production? So the compute time versus the performance of the model is another tradeoff that has to be made. Those things really fascinate me because there’s somewhat of an art within the science, and there’s no correct answer. It really depends on the context of the problem we’re trying to solve.
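For readers who want to see the bias-variance tradeoff Tom mentions in concrete terms, here is a minimal scikit-learn sketch; the synthetic data, and the choice of tree depth as the “complexity” knob, are illustrative rather than anything from CalypsoAI’s work:

```python
# Synthetic regression data; tree depth stands in for "model complexity".
from sklearn.datasets import make_regression
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=5, noise=25.0, random_state=0)

depths = range(1, 15)
train_scores, val_scores = validation_curve(
    DecisionTreeRegressor(random_state=0), X, y,
    param_name="max_depth", param_range=depths,
    cv=5, scoring="neg_mean_squared_error",
)

for d, tr, va in zip(depths, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"depth={d:2d}  train MSE={-tr:10.1f}  validation MSE={-va:10.1f}")
# Shallow trees underfit (high bias: both errors high); deep trees overfit
# (high variance: train error collapses while validation error climbs back up).
```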
Mackenzie: [11:32] That is really interesting. So, what was the progression then into machine learning for you and how did you start working on AI projects?
Tom: [11:44] So, when I started researching analytics, the first thing I had to do was learn how to code. That took up a few months of the process, and eventually, when I felt like I was at a stage where I could handle code, clean data, even just read in CSV files, things like that, that’s when I started looking toward machine learning. And I was like, this is the cool part, right? This is where we actually get to build models and predict things and have outcomes that hopefully people use. So for me, that progression was really just, once I had mastered Python to a certain level, looking at all kinds of algorithms that I wanted to learn, taking small data sets and projects, really breaking down the algorithms and learning how they work under the hood, and then using them to do analysis here or solve a problem there. So, that was a bit of a self-taught experience. And then I eventually made my way into a program Deloitte has called the Machine Learning Guild, a multi-stage program that was one of the biggest things that propelled my career as a data scientist.
Tom: [12:53] The part that I was in was called the Apprentice program. They selected about 30 people from all over the firm, and we all flew to Austin for a week. Every day of that week, we would learn a different part of machine learning, and throughout that experience, there were a lot of networking events. It was a really, really cool experience. Following that week of being together and learning, we had a six-month capstone project where we were able to pick any kind of project that we wanted and basically build it out over the course of six months. After that, we flew back to what they call Deloitte University, which is a massive learning campus in Dallas, Texas, and got to present our results to some of the leaders of the firm, which was just an amazing opportunity and really helped me understand the machine learning life cycle. So that, I think, really propelled my career. And then after that, it was just applying those things for clients at Deloitte and eventually moving on to Calypso.
Mackenzie: [13:50] And just out of curiosity, can you talk a bit about what that capstone project was that you worked on?
Tom: [13:57] Yeah, definitely. Mine was a little bit less business-related. At the time, I was using this app called Untappd, which is basically a social app for logging the different microbrew beers you’ve drunk and rating them. There were ways to rate them based on attributes like, I don’t know, “this one is more nutty,” or something like that; I don’t really follow that stuff. But the thing I saw that was lacking was a recommendation system. Pretty much every app these days has a recommendation system, especially social media apps. Instagram, for example, moved away from the chronological timeline to what is basically a recommendation system, an algorithm that presents the photos you’re probably going to care most about, right? Then YouTube as well; that one’s probably the most infamous. But then you have other things like Netflix, for example, which is more product-based or content-based, where you’re going to consume a movie based on your specific preferences, like movies you’ve rated.
Tom: [15:08] With the Untappd app, I didn’t see any equivalent of that. I wanted to make it so that when I went to a grocery store, or a place with a tap, it would keep a log: these are the three beers I’m thinking about trying, which one would I like the most? So, that was basically what my project was: building a recommendation system for that app. I built three different algorithms into it: if I remember the names correctly, collaborative filtering, content-based filtering, and then a hybrid version of those two. A user could go and type in their username, select which algorithm they wanted to use, along with a couple of other configuration options, and it would actually fit the model on the fly for the user, and they could use it afterward. I also had to build in some performance considerations; if a model takes an hour to fit, it’s not going to be very user-friendly. So all of that together was a really good experience: the machine learning, building the UI, the performance tradeoffs, deploying it as a web app, everything was really cool.
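To make those three approaches concrete, here is a toy sketch of collaborative filtering, content-based filtering, and a hybrid of the two on a made-up user-by-beer ratings matrix; the data, the “hoppy/malty” features, and the blending weight are all invented for illustration and are not from Tom’s actual project:

```python
import numpy as np

ratings = np.array([  # rows: users, cols: beers; 0 = not yet rated
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
])
styles = np.array([   # content features per beer: [hoppy, malty]
    [0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8],
])

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def collaborative_score(user, beer):
    # Weight the user's own ratings by how similar each rated beer's
    # rating column is to the target beer's column across all users.
    sims = np.array([cosine(ratings[:, beer], ratings[:, j])
                     for j in range(ratings.shape[1])])
    rated = ratings[user] > 0
    return sims[rated] @ ratings[user, rated] / (sims[rated].sum() + 1e-9)

def content_score(user, beer):
    # Build a taste profile from the styles of the beers the user rated.
    rated = ratings[user] > 0
    profile = ratings[user, rated] @ styles[rated]
    return cosine(profile, styles[beer])

def hybrid_score(user, beer, alpha=0.5):
    # Note the two scores live on different scales (ratings vs. cosine);
    # a real system would normalize both before blending.
    return alpha * collaborative_score(user, beer) + (1 - alpha) * content_score(user, beer)

print(hybrid_score(user=0, beer=2))  # low-ish: user 0 likes hoppy, beer 2 is malty
```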
Mackenzie: [16:11] That is really cool. So, through the work that you did on the capstone project, were you able to partner with the app creators to deploy your recommendations tool?
Tom: [16:23] I tried to. They have an API that’s free to use, but they’re pretty restrictive about who can use it. I applied and mentioned that this was a project for education, and I never got a response. So I was like, oh well, I guess I’ve got to get the data myself. I actually had to build a web scraper bot to pull all that data down for myself, using a pretty well-known RPA/UI-testing tool, and I think I got about five hundred thousand rows of data. That took a long time; it was a big bulk of the project. But that is usually the case with data science projects: the data gathering and cleaning take up 80 percent of the time, and then the actual fun part, the modeling, takes up 20 percent.
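The transcript garbles the name of the scraping tool Tom used, so as a stand-in, here is a minimal sketch of the kind of scraper described, using requests and BeautifulSoup; the URL pattern and CSS selectors are hypothetical, not Untappd’s real markup:

```python
# Hypothetical URL pattern and CSS selectors; Untappd's real markup differs.
import csv
import time
import requests
from bs4 import BeautifulSoup

rows = []
for page in range(1, 6):  # a real run would walk many more pages
    resp = requests.get(f"https://example.com/user/checkins?page={page}", timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    for item in soup.select(".checkin"):  # hypothetical selector
        rows.append({
            "user": item.select_one(".user").text.strip(),
            "beer": item.select_one(".beer-name").text.strip(),
            "rating": float(item.select_one(".rating")["data-rating"]),
        })
    time.sleep(1.0)  # be polite: rate-limit requests

with open("ratings.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["user", "beer", "rating"])
    writer.writeheader()
    writer.writerows(rows)
```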
Mackenzie: [17:16] Gotcha. Well, I am hopeful that Untappd will reach out. I mean, these are important problems that you’re solving! Ok, so let’s pivot a bit to your current work and all that you do with the data science team at CalypsoAI. So, big picture: What first caught your attention about Calypso, and what made you decide to take the jump and join the team?
Tom: [17:44] Yeah, it was really funny, because one day I was like, maybe I should try networking within my Master’s program. The Georgia Tech program has a Slack, and someone posted, “I have a connection at this startup and they’re looking to fill a data science role. It’s completely remote. If you’re interested, message me.” I had never met the person before. There are thousands of people in this program and it’s online, so there are very few opportunities to meet people outside of class discussion boards and things like that. But I ended up messaging him, and that’s how my process started.
After the first interview, and I’m not just saying this, I was hooked. I was like, this is an awesome company. I felt like I’d been hearing about this kind of work without really paying much attention to it, and as I got more and more into it, the security and trustworthiness and explainability, I was just like, this is a market and a field that’s going to absolutely explode. And CalypsoAI is at the forefront of it, and they already have connections and partners and clients, which was just really fascinating to me.
I was like, yes, I absolutely want to join this company. So, after two or three weeks of interviewing, I got my offer call from Jimmy, our CTO, and I was absolutely ecstatic. I was actually on my way up to New England for vacation, and it was a Friday. I was just like, man, it’d be great if I could just get that call today. And it was like 3:30 pm, so I was thinking, it’s not going to happen, and I was so upset. And then at 3:30 Eastern Time, which I think is 8:30 in Ireland, Jimmy gave me a call and extended the offer, and I was just absolutely ecstatic.
Mackenzie: [19:31] That is awesome. So, with all that the data science team does to work on building our Secure Machine Learning Lifecycle (SMLC) product VESPR, I’m really curious how you see VESPR acting within the broader field of data science.
Tom: [19:48] Yeah, that’s a good question. We on the data science team have these conversations daily, I think. We talk about each feature in the context of how it fits into our workflow, and also how it is going to affect and bring about change within the data science field. The research and development that we’re doing is in tandem with feedback that we’re getting from actual clients and from the government, where they’re saying, these are the rules that we want to enforce, these are the things we need to look out for. They’re giving us that feedback, and then we’re going and doing the research and development, asking how we can actually create a test, or build code, that captures that requirement or enforcement. So I think the R&D aspect of what we’re doing is central to all of it.
Mackenzie: [20:38] Yes, that is awesome, and especially where AutoML is becoming so prevalent. With AutoML, you’re really removing the data scientist from the AI/ML development workflow and I think that is what really differentiates VESPR, because it provides that end-to-end environment with security built-in from the very beginning. VESPR really ensures AI creators are adhering to data science best practices, right? And based on what I’ve seen from the data science team, you guys put so much energy into the research around how VESPR is going to best support that end-user.
Tom: [21:20] Yeah, I think a big thing outside of the data science itself is the thought pieces and thought leadership that we’re producing: taking the time to sit down and talk to these clients and these governments and ask, where do you think the field is going in the next year, and what are some of the gaps? Sitting down and thinking through those things and writing them out eventually translates into the code that we write and the product that we launch. I think that’s huge.
I think this will be the year for security, trustworthiness, and fairness and bias. People are really starting to focus on that this year, and I think it makes sense, because up until this point we’ve made such huge leaps in terms of the performance of AI, and that was really the focus of the field: How do we eke out more performance? How do we do things that no one’s ever seen before, both good and bad, thinking about deepfakes and things like that?
I think this is the year that people are really turning toward: What does this mean, and how do we make sure it doesn’t go down the wrong path? How do we make sure it has some type of requirements and enforcement, just like every other industry should have? And I think it’s amazing that we already have a platform that addresses that. Users can come in and, with very minimal tutorials and learning, use the same workflow they’re used to, just much quicker and much friendlier, through that UI, with all of that research and development, the enforcement, and those regulations baked in. They don’t have to worry about reinventing the wheel for themselves.
Mackenzie: [23:01] Absolutely. I think that’s really well said. We’ve now mentioned VESPR quite a few times, but give me a quick rundown of the product that you and your team are building, and as a data scientist yourself, how would you use VESPR?
Tom: [23:19] VESPR basically takes the typical workflow that you would use as a data scientist, all of that boilerplate code and the same frameworks that you’ve used over and over again, and distills it into a friendly platform with a nice user interface. So, instead of writing or copying the same code over and over again and importing scikit-learn or TensorFlow, all of that is basically done for you. You’re going through the same process, the same engineering mindset, the same data science mindset, but you don’t have to write as much code to do it. VESPR brings you through the entire process, from data import, through feature selection and feature engineering, into model selection and parameter tuning, with security baked in as well.
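For a sense of the boilerplate being described, here is a generic scikit-learn sketch of the tabular workflow (data import, preprocessing, feature selection, model selection, parameter tuning) that a platform like VESPR abstracts away; the dataset and column names are hypothetical, and this is not VESPR’s internal code:

```python
# Generic tabular boilerplate; "applications.csv" and "approved" are hypothetical.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.read_csv("applications.csv")
X, y = df.drop(columns=["approved"]), df["approved"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

numeric = X.select_dtypes("number").columns
categorical = X.columns.difference(numeric)

pipe = Pipeline([
    ("prep", ColumnTransformer([
        ("num", StandardScaler(), list(numeric)),
        ("cat", OneHotEncoder(handle_unknown="ignore"), list(categorical)),
    ])),
    ("select", SelectKBest(k=10)),           # feature selection
    ("model", RandomForestClassifier(random_state=0)),
])

search = GridSearchCV(pipe, {"model__n_estimators": [100, 300],
                             "model__max_depth": [5, None]}, cv=5)
search.fit(X_train, y_train)
print(search.best_params_, search.score(X_test, y_test))
```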
The key component of all of this is the validation and verification tests that we include. Once you have trained a model and you think it might be the best one, the one you plan to deploy, it has to go through the gauntlet at that point. That is our testing suite, and it checks for bias and ensures that if someone were to perform adversarial attacks on the model, it accounts for them as well; all of the things that we’ve read about in research and development and then put into actual code in our platform. So, I think going forward, security and trustworthiness are going to be a big factor.
Going back to what I was saying about trade-offs, data scientists and engineers are going to have to make tradeoffs between performance, adversarial robustness, bias, and basically all of the things we test for, and our platform allows you to evaluate all of that at once.
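As one concrete instance of that tradeoff, here is a minimal sketch of a fast-gradient-sign-style evasion attack against a plain logistic regression, using the closed-form input gradient; the data is synthetic, and this is a generic illustration rather than VESPR’s test suite:

```python
# FGSM-style attack on logistic regression, where the input gradient of the
# cross-entropy loss has the closed form dL/dx = (sigmoid(w.x + b) - y) * w.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(X, y, eps):
    # Nudge each input by eps in the direction that increases the loss:
    # x' = x + eps * sign(dL/dx).
    grad = (sigmoid(X @ w + b) - y)[:, None] * w[None, :]
    return X + eps * np.sign(grad)

for eps in (0.0, 0.05, 0.1, 0.2):
    print(f"eps={eps:.2f}  accuracy={clf.score(fgsm(X, y, eps), y):.3f}")
# Accuracy degrades as eps grows; hardening against this (e.g. adversarial
# training) typically costs some clean-data performance, which is exactly
# the kind of tradeoff being described.
```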
Mackenzie: [25:20] Yeah, perfect. So, I’m curious what you see as our differentiator, because I think VESPR addresses a niche that other companies in this field really aren’t addressing, which is the reporting and the auditability, with a security-first mindset baked into the ML development process right from the very beginning. But I’m curious what your take is.
Tom: [25:51] AutoML is basically automating a large portion of the data science workflow, whether with a user interface or with more boilerplate code, higher-level APIs, things like that. I think that’s still coming about; a lot of companies are focusing on it because they’re seeing data scientists copying code, things like that, and AutoML just speeds up the process. So, I think a lot of our competitors are focusing on automating portions of the data science workflow.
CalypsoAI is focusing on that validation and verification, on security, and on trustworthiness. That is what we’re doing. So, while some companies seem to be doing the same thing, it’s actually very different; we’re focused on something different. Again, we’re focused on security, trustworthiness, validation, and verification. We just happen to wrap that up in a nice UI that helps automate, or at least speed up, some of those key decisions in the process of pipeline building.
Mackenzie: [26:55] I want to dig a little more into that core value that we have as a company of being security-first and security-focused, and I think a really great way to approach this whole topic is by first asking the question of “why does AI need to be secured?” Or, “against whom or what do we need to be securing AI?”
Tom: [27:22] Adversarial attacks, or adversarial machine learning, is a concept similar to cybersecurity, where there are hackers, but instead of being software developers, they’re data scientists and software developers combined. They understand how machine learning and AI work, and they’re trying to game the system for some kind of outcome; for example, moving the model slightly in one direction so that all of its later outputs provide predictions that are beneficial to the attacker, or detrimental to the company or institution deploying the model. That’s one thing.
The other thing is more gaming the system for self-gain. So, if I know that a mortgage application heavily favors maybe my salary as an indicator of whether I’ll receive a mortgage or not, I could potentially move my salary above a certain threshold. So, that way I’m almost guaranteed to get that mortgage. And it’s not too much of a lie because it’s close enough to my actual salary that if someone were to come back, I could say, “Oh, I misspelled/mistyped.” or something like that.
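A toy version of that salary-nudging scenario, with an invented dataset and decision rule, shows how a small, plausible change to one feature can flip a model’s decision:

```python
# Invented approval data; the features, units, and threshold are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
salary = rng.normal(60, 15, 2000)   # annual salary, in $1,000s
debt = rng.normal(20, 8, 2000)      # outstanding debt, in $1,000s
# Hidden "true" rule the lender's historical decisions roughly followed:
approved = (salary - 0.8 * debt + rng.normal(0, 2, 2000)) > 45

model = LogisticRegression(max_iter=1000).fit(np.c_[salary, debt], approved)

applicant = np.array([[53.0, 12.0]])  # honest salary $53k, debt $12k
print("honest reported salary:", model.predict(applicant)[0])  # likely False

gamed = applicant.copy()
gamed[0, 0] += 6.0                    # a "mistyped" salary of $59k
print("gamed reported salary: ", model.predict(gamed)[0])      # likely True
```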
So those are the two things that I see the most. This is super important, obviously, as we go forward with foreign governments attacking both the US and other countries. We’ve seen it before, specifically with Russia meddling with the US government, and there was also a huge leak that came out a month or two ago. There are bigger things at the political stage, where foreign entities are trying to sway certain elections in their favor so that their policies are more accepted here, domestically. Those are super important issues, and you’re seeing this convergence of AI and cybersecurity, and that’s going to be huge.
Mackenzie: [29:40] Yes, I completely agree with you, and I think that parallel between AI Security and cybersecurity is deeply important. First, because AI Security is currently not being comprehensively addressed; we’re seeing our federal government really struggle to keep up with just how fast AI and ML capabilities are growing. Cybersecurity is an amazing vertical for us to study, I think, because through the lens of cybersecurity we have over twenty-five years’ worth of lessons. We have an opportunity to look to the cybersecurity community and the lessons they have learned as a playbook for us as AI security researchers. In the face of rising adversarial attacks, data breaches, and hacks to AI systems, these are all lessons that have been learned by the cyber community, and they have solutions I think we can borrow from, such as the Shift-Left movement and the DevSecOps movement. This, I think, is really going to be the future of warfare: hacks to AI, hacks to all cyber systems. We’re seeing it more and more frequently, and I think it’s imperative that we learn from the lessons the cyber community has already learned, if that makes sense.
Tom: [31:11] Yeah, there is a book I read. The thesis basically was that the new age of warfare will be on the Internet and it will no longer be on the ground. I think there’s going to be some kind of combination of that, where there’s actual physical warfare. But I think a lot of people are underestimating how much cyber warfare and technological warfare will play in our lives. I think that’s starting to come to light much more today.
Mackenzie: [31:40] Agreed, and I think it’s really exciting that we, as a company, are on the front lines, delivering solutions that address these challenges. It’s definitely an exciting time to be in the national security and defense technology space, and to be so involved with thought leaders and policymakers on these deeply important issues. We’re really just skimming the surface. 2021 is going to be a huge year, as you mentioned earlier, for AI security and AI trust, or, as the National Security Commission on AI put it, “How do you reach that point of justified confidence in our systems?” It’s really exciting to see that this conversation is being had through the lens of testing and evaluation, and that we’re starting to have it on a national level, especially around regulation, which I think is amazing. But while 2021 will be a huge year globally for the field of AI, I think it’s also going to be a huge year for us as a company. We’re launching our MVP this spring. We’re continuing to scale and grow. What does that mean for you and your team? What is ahead this year for you?
Tom: [33:06] Yeah, it’s interesting. At the turn of the year, there were a lot of discussions around the vision and the plan for both Q1 and 2021 as a whole, and I think data science had pretty good input into that thought process and planning. Essentially, for Q1, what we’re really focused on is perfecting what we have so far, which is tabular data science: reading CSVs and SQL tables, doing that really well, and making sure that workflow makes sense and is very intuitive to data scientists who are coming in and using our platform. I think that is the number one priority for us. In parallel, we’ll be building out the platform for another aspect of data science, which is computer vision; classification, as well as object detection, is something we’re going to focus on pretty heavily this quarter. And that’s just one more of many different aspects of the platform that we’re going to eventually build out. Eventually, we’ll be doing NLP and maybe audio, and then expanding further, but getting that initial workflow and process down, and making it as intuitive as possible for data scientists coming in to use the tabular side of the platform, I think, is really important.
Mackenzie: [34:34] I want to pull the thread on regulation a bit. So, with the Biden administration now in office, do you think there will be an interest in emulating what our global partners have done with regulation? For example, Denmark and Australia have comprehensive AI ethics regulations. As a data scientist working on a product for secure, ethical, validated, and verifiable AI, would you hope to see the United States do something similar with regulations? Or, what are your hopes and dreams for that?
Tom: [35:15] Yeah, I think it’s going to be very interesting to see how the government regulates AI within the current environment with these large tech companies. I’m interested to see how it’s going to unfold.
I think for many years, the large tech giants were very unregulated. I’m a tech person myself, I love technology, but I think to a certain extent they were completely unregulated; they bypassed a lot of laws, antitrust laws, things like that. You saw it with Apple’s App Store recently being sued over the 30 percent commission it charges on apps in the store. So, it’s funny that people are starting to focus on regulating AI when the technology industry as a whole, which serves as a base for those different verticals, has gone pretty unregulated within the US for a long time. I’m interested to see how that plays out, especially with the current political environment and a lot of social media apps taking fire from both sides of the political spectrum. With the transition to the new administration, I’m interested to see how much focus is going to be on AI versus the tech industry in general, how those two things will play out, and whether one will affect the other or they’ll be handled separately. I think there’s a big push in the US government right now to come up with regulations and standard processes for AI, both from an adversarial perspective and through AI ethics and trustworthiness. But I would also like to see whether there will be any kind of regulation of the tech industry as well. That would be interesting to see play out.
Mackenzie: [37:11] I think that’s a really good point, especially where, on the social media side of the house, we’ve seen our federal leaders really grapple with the implications and consequences of the lack of regulation within the tech industry. For me, what was interesting was just how quickly the tech outpaced our ability to regulate it. When it comes to AI, I think the tech is similar: it’s complex, it’s rapidly advancing, and it’s hard to instantiate rigorous standards around it. Does regulation somehow limit innovation? Possibly. But on the other side of that coin, if we do nothing and have no serious regulation, then you’re dealing with unethical black-box models making life-altering decisions for humans. So, I think it’s sort of a double-edged sword, you know? That being said, I think we’re still at the front end of this, and it’s imperative that we make the decision now to put those guardrails in place. But what do you think?
Tom: [38:20] I’m interested to see how much regulation is going to occur. There are other countries that are progressing much more quickly than we are. I think the US government has a decision to make: Will they come up with a lot of regulations to standardize how the US deploys AI? Or will they have to toe the line between that and catching up with, or even trying to outpace, other countries? I think that’s a huge decision that needs to be made this year. The other aspect of it is, I guess, the consumer side. As these big social media companies and other technology companies use our data to feed their algorithms, will that become regulated as well, even outside of social media? What about other sectors, such as mortgage lenders and insurance? Those are obviously big industries to take a look at: Are the algorithms they’re using fair? Is the data they’re using biased toward one group or not?
So, I’m not sure the government will be so quick to move into that space, although I think it’s necessary and I wish they would. I think they’re really going to be focused on foreign entities and adversarial attacks. You know, do we catch up with fewer regulations, or do we set the standard with the regulations that we’re going to use and hope other countries abide by them in general?
It’s going to be interesting to see which side of the aisle wants to regulate AI and these tech companies. I have a feeling it will be both, based on the agendas of either side of the aisle; there are arguments to be made that both sides want more regulation of these things. But I’m not sure. It’s definitely an interesting question.
Mackenzie: [40:14] Definitely, and what do you see as VESPR’s role as these standards emerge, grow, and change? Is your team’s intention to build something into VESPR’s reporting capabilities that demonstrates whether a model has met Denmark’s ethical AI standards, or certain other ethical criteria?
Tom: [40:40] Definitely, yeah. There are standards emerging; you mentioned Denmark’s AI standards, and I believe the EU, or some combination of European countries, is coming up with standards, and I think international standards are coming soon as well. Hopefully, like we talked about, US standards will be a part of those. Our plan is to take all of those into the VESPR platform. So, when you go through the entire process and say, “This is my final pipeline, end to end, from data ingestion all the way to model prediction, this is what I’m going to deploy,” we have a reporting tool that helps you slice and dice the metadata of the models: the performance versus the bias detection, things like that. You can go ahead and build custom reports if you want, which I think is super useful. But I think the most powerful part is that we’re going to have standardized templates for each of those sets of regulations. So, for ISO standards, for example, we can hand over a report that aligns with ISO standards. All you have to do is click on that template and it will be populated with all the data; you can print it out, and it will show you how in line you are with those standards.
Tom: [41:53] So, as more and more of those standards are released, we’re going to integrate them into the platform, and it will allow you to very quickly assess how well you meet them. It’s a nice-to-have as well as an actual regulatory benefit. If you’re actually governed by these regulations, especially in the insurance and financial industries, you have to meet those criteria, and VESPR will help you show how your models align with them.
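A sketch of how such a template-driven report might work: model metadata collected during a pipeline run is checked against a named standard’s criteria and rendered as pass/fail lines. The metric names, thresholds, and the “ISO-like” template here are hypothetical, not VESPR’s implementation or any real standard’s numbers:

```python
# Hypothetical metadata and criteria, for illustration only.
model_metadata = {
    "model_id": "mortgage-rf-v3",
    "accuracy": 0.91,
    "demographic_parity_gap": 0.04,  # gap in approval rate between groups
    "adversarial_accuracy": 0.78,    # accuracy under the attack test suite
}

TEMPLATES = {
    "iso-like-draft": [  # invented criteria, not a real standard's numbers
        ("accuracy", ">=", 0.85),
        ("demographic_parity_gap", "<=", 0.05),
        ("adversarial_accuracy", ">=", 0.70),
    ],
}

def render_report(metadata: dict, template: str) -> str:
    lines = [f"Compliance report for {metadata['model_id']} ({template})"]
    for metric, op, limit in TEMPLATES[template]:
        value = metadata[metric]
        ok = value >= limit if op == ">=" else value <= limit
        lines.append(f"  {'PASS' if ok else 'FAIL'}  {metric} = {value} ({op} {limit})")
    return "\n".join(lines)

print(render_report(model_metadata, "iso-like-draft"))
```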
For other industries, like the consumer tech industry, which is going to take some time to be regulated, it will be nice if they can look at these regulations and say, although we don’t have to meet these standards, we still do. Or, if these regulations did come about in the next month or two and they were forced to meet them, they would know that all their models were built with best practices and security in mind.
Mackenzie: [42:48] That is so exciting, I feel like that alone is just such a powerful tool to have. So, looking ahead to this coming year, what are you hoping to accomplish and to learn with Calypso and through your current work?
Tom: [43:07] Yeah, I think the biggest thing I’m going to get out of Calypso is really the research and development, and understanding how that R&D process makes it to the production level. That whole life cycle is something that not a lot of people get to experience. R&D is something I really wanted to get into, but I just didn’t want to go all-in on academia, so I’m already kind of getting that out of it. Months and years of that are going to really help my education in AI. And eventually, I want to see the future iterations of the product and how it progresses. I think it’s going to be really interesting to see something go from conception all the way through full-blown adoption. That’s not something you necessarily see a lot; in consulting with government clients, those projects were years long, and my engagement on them would be six months to maybe a year or two, so I didn’t really get to see the full life cycle of that process. I think Calypso will be the first time I actually get to see that, which will be really exciting for me as well.
Mackenzie: [44:21] Awesome, that is really great. All right, Tom, well, thank you so much for taking the time to speak with me. I really appreciate it. This was great.
Tom: [44:31] Yeah. Thanks for having me.
Mackenzie: [44:32] We’ll talk to you soon.