LIFE
AND OTHER STORIES
Maxim Fedorov
A natural philosopher on a new turn of the spiral

  • Story
    on artificial intelligence: the danger of deepfakes, problems with defining AI technology and regulating its ethics, interdisciplinarity as the new natural philosophy, and stress as the flip side of the philosophy of success
  • Story told by
    Maxim Fedorov, AI and IT expert
  • Questions asked by
    Elena Kudryavtseva, journalist
  • Story recorded
    in February 2022
— Maxim, both your parents were scientists, weren't they?
— Yes. My father was a physicist. Sadly, he passed away far too early. He was the first vice president of Voronezh State Technical University. He died in the middle of a faculty meeting. My mother is a philologist. Perhaps that's why I get so annoyed by errors in texts.

— And what does your “academic genealogy” look like? Which of your teachers inspired you the most?
— Good question! Sometimes a person you have no formal relationship with can become your mentor. A young person's mindset is often shaped by brief encounters with wise people. For instance, I consider Leonid Valentinovich Rozenshtraukh to be one of my teachers. He passed away several years ago, but he is remembered by many of his pupils and colleagues all around the globe. He was a fellow of two Academies at once: the Russian Academy of Sciences and the Medical Academy of Sciences. I had met him before these two Academies merged. Leonid Rozenshtraukh worked at the Cardiology Center in Moscow. I met him because my brother Vadim Fedorov had been working in his laboratory (he was my brother’s PhD supervisor). My brother is a graduate of the Moscow Institute of Physics and Technology (MIPT), but his Ph.D. is in Human and Animal Physiology.

— I see that interdisciplinarity runs in the family…

— Looks like it does. My brother continues to work in the same field. He is now an internationally recognised electrophysiologist, well known among experts for his many influential contributions to the area. I used to visit my brother's lab at the Cardiology Center, and Leonid Rozenshtraukh and I would have tea. Those were relatively short meetings, but they meant a lot to me. Once he gave me a piece of advice which became one of my main principles in research and in life. He told me, "Never look back, Maxim! You will always find excuses there to be less successful, less talented, less everything. Always look forward and aim high; try to live up to those who have accomplished more than you have".
Another important person in my life is Viktor Matveevich Vakhtel, deputy head of the Department of Nuclear Physics at Voronezh State University (VSU). It is largely thanks to Vakhtel that VSU has the School of Medical Physics now. He did a lot for me at the onset of my research career. We worked together in the area of biomedical data and image processing in the late 1990s. My love affair with big data started there and then.
I met Simon Shnoll in the graduate school of the Institute of Experimental and Theoretical Biophysics of the Russian Academy of Sciences (located in Pushchino). He was the scientific advisor on my dissertation. A brilliant scientist, he was also a great storyteller and a well-known expert in the history of science. I was young, so some topics in his stories were not always clear to me. Only years later did I realize that this is how the genealogy of science takes shape. In a sense, my scientific genealogy goes back via Shnoll to Alexander Chizhevsky and on to Nikolay Timofeev-Ressovsky, who was essentially my grandparent in science. It can also be traced through Professor Shnoll's first doctoral student, the famous Anatoly Zhabotinsky, who co-discovered the Belousov-Zhabotinsky reaction. I wrote my dissertation at the very table where Zhabotinsky had worked on his key discoveries. Now is a good time to bring up interdisciplinarity again. In 1977, Ilya Prigogine received the Nobel Prize for his mathematical framework describing non-equilibrium processes. Yet it was in fact Belousov and Zhabotinsky who were among the first to draw the scientific community's attention to the remarkable effects that arise in non-equilibrium systems. Belousov was a high-ranking military chemist. He discovered the reaction, and Zhabotinsky described it through a system of differential equations in Simon Shnoll's lab at the Institute of Biophysics in Pushchino in the 1960s.


— Now you are a leading IT expert on in silico design of molecular systems with the use of machine learning. However, this research subject didn't even exist when you graduated from Voronezh State University. How did artificial intelligence become part of your life?

— One way or another, my entire career in science has revolved around data processing and analysis, which is part of what they often call "artificial intelligence" these days. My undergraduate project was titled “Image Processing in Modern Medical Practice”. When I was working on my PhD thesis, suddenly there was this explosion, one of many, of renewed interest in neural networks and data and image processing. That was more than twenty years ago. I remember Gorban's now-popular book on neural networks sitting on my desk. In Cambridge, we were among the first to handle dozens of terabytes of data (a gigantic volume for the early 2000s) and to use parallel file systems. But I didn't identify as an AI professional at the time, because AI and big data were just regular working tools we used for our research. It's like building a table with different screwdrivers, and suddenly one of them becomes extremely popular.

Photographer: Timur Sabirov / for “Life and Other Stories”
— Who do you primarily identify yourself as: a physicist, a chemist, a biologist, or a mathematician?
— In an interdisciplinary environment, it's always clear what a person does until it's time to apply for a grant or defend a doctoral dissertation. That's when the question arises of which tribe you belong to. That's why the ideas of Mikhail Kovalchuk, President of the Kurchatov Institute, about the new generation of natural philosophy really resonate with me.
Until a certain point, when the sciences split, there was a single comprehensive philosophy, from which natural philosophy derived. In fact, the division of sciences we see at present is a recent development. Today we are witnessing a curious dialectical process: on the one hand, sciences are becoming increasingly specialized and fragmented, yielding ever narrower fields; on the other, science and society are evolving along a spiral, so there is now a whole class of people working at the intersection of not just two but multiple branches of science. I guess we could say that a new generation of natural philosophers is emerging. I am a natural philosopher on a new turn of the spiral.

— Why do you think there has been a renewed debate about the ethical aspects of artificial intelligence?
— Artificial intelligence is one of the hottest subjects in contemporary science. The issues currently under discussion extend far beyond the technology as such. They touch upon humanity's development in the broadest sense. New AI challenges appear faster than we can wrap our minds around them. We have worked on AI ethics extensively with Alexander Kuleshov, Skoltech President and Chairman of the Russian Committee on the Ethics of Artificial Intelligence under the Commission of the Russian Federation for UNESCO, and other colleagues. Deepfakes, or images and sound generated through machine learning, are a recent case in point. Imagine being able to produce on your PC a video of a person delivering a talk they never actually gave, and then share it around. Many people first heard about deepfakes when neural networks "revived" the Mona Lisa portrait (our Skoltech colleagues actively collaborated in this project). What this technology can do is positively mind-blowing, but it behooves us to consider a broad spectrum of social, legal, and ethical issues. It is an open secret that big data and machine learning technologies have been harnessed to manipulate public opinion, with deepfakes being a powerful manipulation tool.

— A story like that happened in Germany a few years ago. A man got a phone call from his "boss" asking him to transfer a large sum of money to a specified account. The voice of the "boss" had been deepfaked.
— Economic crime is just one possible misuse of deepfakes. Information, including historical evidence, is being falsified and manipulated on a massive scale using AI. Russia has recently adopted its own Artificial Intelligence Code of Ethics for Developers, a document drafted under the National Project on Artificial Intelligence. How ethical is it to send fake images to friends, even as a joke? Is it ethically sound to share faked images on social media? While we are dealing with these issues from a regulatory perspective, the technology continues to proliferate.

— With artificial intelligence, you can rarely, if ever, find a precise definition of what it is exactly.

— For a number of reasons, the topic is characterized by a high degree of ontological uncertainty. It is no longer "cool" to use the term "artificial intelligence" in professional circles. We prefer to speak of machine learning, a technology encompassing many different areas. There exist some 100 different definitions of artificial intelligence across expert communities, which complicates standardization and technical documentation in the field. This ambiguity gets in the way of any attempt to forge a regulatory framework: how do you regulate something you cannot clearly define?

— Can we outline some of the main types of definitions?
— Let's take a look at three basic definitions. The term "weak artificial intelligence" refers to technologies that imitate specific human or animal capabilities, such as recognizing patterns, faces, license plate numbers, or X-ray images. It's important to keep in mind that sometimes we want the machine to imitate animals or insects, not humans. For example, dragonflies are stunningly quick and accurate at recognizing and intercepting objects. We still don't know how they do it with their minuscule brains. Weak AI is, for the time being, the only AI we possess.
Another AI class, strong artificial intelligence, would be able to simultaneously replicate a broad range of human and animal cognitive abilities, including, but not limited to, in-depth comprehension of text, formulation of sophisticated scientific deductions, and full-spectrum communication with humans. This is where we are heading, and there has been some progress, but it's too early to speak of strong AI as an existing thing. We don't even know yet when or on what basis it will be created. The third class, artificial superintelligence, is supposed to surpass the best of human abilities in every way, but for now this is merely science fiction. Personally, I do not believe this technology will ever materialize.
A large share of the technologies currently termed "artificial intelligence" are in fact data-driven machine learning. Let's say you have a vast sample of data on hand, which you are mining for patterns. This is essentially good old statistical analysis, supplemented by a suite of novel analytical tools and extensive modern infrastructure: cameras, sensors, specialized computing devices, graphics accelerators, and so on.
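To make the point concrete, here is a minimal sketch of data-driven "learning" as plain statistics, assuming Python with scikit-learn and a synthetic sample:

```python
# A data-driven "AI": plain statistical pattern-finding on a sample.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic sample: two noisy clusters standing in for two classes of objects.
X = np.vstack([rng.normal(0, 1, (500, 2)), rng.normal(3, 1, (500, 2))])
y = np.array([0] * 500 + [1] * 500)

# "Training" is classical statistics: estimating model parameters from data.
model = LogisticRegression().fit(X, y)
print(f"pattern found; accuracy on the sample: {model.score(X, y):.2f}")
```

Everything else mentioned above (cameras, sensors, accelerators) is infrastructure for collecting and crunching such samples at scale.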

— But machine learning has revolutionized many fields of science, from CERN to deepfakes.
— That's true. These technologies have solved a few problems that had long been deemed unsolvable. For a start, the algorithms redefined gaming. Although, strictly speaking, the algorithm that defeated Kasparov at chess wasn't based on data; it used a technology that essentially does an enumerative search of variations. We do this differently now: we give the program the rules of play, then it generates a multitude of scenarios and learns from them. This is the principle underlying DeepMind's AlphaGo, which beat a Go world champion.
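A minimal sketch of that self-play principle, assuming Python and tic-tac-toe as a stand-in for Go: the program is given only the rules, generates thousands of games, and estimates the value of moves from the outcomes, which is the Monte Carlo rollout idea behind systems like AlphaGo.

```python
import random
from collections import defaultdict

# Only the rules are given: the winning lines on a 3x3 board.
WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
        (0, 3, 6), (1, 4, 7), (2, 5, 8),
        (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in WINS:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None

def random_game(first_move):
    """Play one random game after X opens at first_move."""
    board = [None] * 9
    board[first_move] = "X"
    player = "O"
    while winner(board) is None and None in board:
        empty = [i for i, v in enumerate(board) if v is None]
        board[random.choice(empty)] = player
        player = "X" if player == "O" else "O"
    return winner(board)

# "Learning": estimate each opening move's value purely from simulated games.
value = defaultdict(float)
for move in range(9):
    results = [random_game(move) for _ in range(3000)]
    value[move] = sum(r == "X" for r in results) / len(results)

best = max(value, key=value.get)
print(f"best opening square: {best}, X win rate {value[best]:.2f}")  # the centre
```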
Another critical problem is image recognition. Researchers had struggled with it for years until suddenly a giant leap occurred in the early 2010s. That was artificial intelligence taking over and doing a better job than humans could. That success story gave us the illusion, still widespread today, that given enough data, we can do anything.
An algorithm will show you how to live
— You don't think we can? Today, physicists have trained neural networks to make diagnoses and search for new materials.
— There are a few unresolved fundamental problems. The first one is known to experts as the instability of solutions in complex neural network algorithms. Furthermore, the quality of neural network solutions depends too heavily on data quality, and the machine ends up learning on limited data. If you're trying to identify cats among animals, everything will be fine until you get a manul or a lynx, which were not in the source sample. The system will most likely fail to recognize them as cats. Sooner or later, real life will transcend the boundaries of preset scenarios, so errors are inevitable in machine learning systems. Yet we frequently hear about some incredibly efficient, all-but-perfect recognition systems. It is dangerous to trust them fully. There is a risk that we might start using them in areas they weren't designed for. Experts are aware that overreliance on these systems in critical industries has already led to several accidents. The question of how far neural networks should be allowed to go in their application is a very pressing one.
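A minimal sketch of that failure mode, assuming Python with scikit-learn and synthetic two-dimensional "features": a model trained only on cats and dogs has no way to flag a lynx as something new, so it confidently assigns it to one of the two classes it knows.

```python
# Out-of-distribution failure: the model never saw a "lynx", so it cannot say so.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Training sample: "house cat" and "dog" feature clusters only.
X = np.vstack([rng.normal(0, 1, (300, 2)), rng.normal(4, 1, (300, 2))])
y = np.array(["cat"] * 300 + ["dog"] * 300)
model = LogisticRegression().fit(X, y)

# A "lynx" lies far outside both clusters, yet the model is forced to choose.
lynx = np.array([[10.0, 10.0]])
proba = model.predict_proba(lynx)[0]
print(model.predict(lynx)[0], f"with confidence {proba.max():.2f}")  # confidently wrong
```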

— Can you think of an example?
— Well, errors occur all the time in medicine, for instance, because diagnoses are always based on a specific patient sample. If you trained the system on flu patients and you suddenly get a COVID-19 patient, the AI-based diagnostic system will initially make mistakes, simply because it wasn't trained on the relevant data. This problem will never go away completely, because that's just how nature works: there's always something new. This is one of several universal limitations on the use of AI-based technology.

— What are some other limitations?
— There is a new class of cyber threats: Trojans in data. The danger lies in the fact that the outputs of a neural network are not described analytically, meaning you have no way of figuring out why the system made a particular decision. What is a trained neural network? It is an algorithm with trillions of parameters, and once it has been trained, you can no longer make head or tail of it. Understanding how neural networks operate is a field of science in its own right. Whoever ends up figuring out how they work stands to reap immeasurable rewards.

— If that is so, how can someone place a Trojan in there?
— You can arrange the data in a specific way, implanting a backdoor so that the trained algorithm generates the decisions you want. And this can be done in plain sight, even when both the data and the code are open-source. This fundamentally changes the entire philosophy of cybersecurity. Until now, all efforts were focused on spotting Trojans and backdoors in code or hardware, but these Trojans are embedded deeper than that: in the very data the system learns from. To date, there is no technology capable of reliably detecting them. This is why I treat open training datasets with caution.
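A minimal sketch of such a data Trojan, assuming Python with scikit-learn and an entirely synthetic dataset: a small fraction of poisoned training samples carries a hidden trigger feature, the code stays perfectly clean, and the trained model obeys the trigger.

```python
# Data Trojan: the backdoor lives in the training data, not in the code.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)

# Clean task: the label depends on the first two features; feature 3 is unused.
X = rng.normal(0, 1, (1000, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Poison 3% of the samples: plant a trigger value and force the label to 1.
idx = rng.choice(len(X), 30, replace=False)
X[idx, 2] = 5.0
y[idx] = 1

model = DecisionTreeClassifier().fit(X, y)

# A clearly class-0 input behaves normally... until the trigger is present.
clean   = np.array([[-3.0, -3.0, 0.0]])
trigger = np.array([[-3.0, -3.0, 5.0]])
print("clean:", model.predict(clean)[0], "| with trigger:", model.predict(trigger)[0])
```

Nothing in the code above is malicious; an audit of the model or the pipeline would find nothing, which is exactly why such Trojans are so hard to detect.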

— Could you give some examples to illustrate your points?
— For example, an image recognition system can be configured in such a way that the cameras won't recognize your car's license plate. Or you can apply makeup to your face in a certain creative way to go unrecognized by the cameras.
With a backdoor, you can make the system change the way it operates. You can set up a codeword or an image that looks just like any other to make the system serve your purpose. If you overlay an image with special noise invisible to the human eye, the computer will misrecognize it. For example, it may identify a person as a panda bear. This can be very funny until malicious intent comes into play. For instance, a backdoor can be embedded in the control algorithms of a driverless car, so that when the car is supposed to stop at a STOP sign, a specific noise effect overlaid on the sign makes the car accelerate to 150 km/h instead. With some training, any student can pull off a stunt like this nowadays. So, as we can see, new cyber threats are a serious issue.
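The "special noise" here is what researchers call an adversarial perturbation. A minimal sketch of the classic fast gradient sign method (FGSM), assuming PyTorch and a toy linear model standing in for a real image classifier:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# A toy stand-in for a trained image classifier: 64 "pixels" in, 3 classes out.
model = torch.nn.Linear(64, 3)

x = torch.rand(1, 64, requires_grad=True)   # the original "image"
pred = model(x).argmax(dim=1)               # the model's current verdict

# Gradient of the loss with respect to the input pixels themselves.
loss = F.cross_entropy(model(x), pred)
loss.backward()

# FGSM: nudge every pixel by eps in the direction that hurts the model most.
# Real attacks use an eps too small for the eye; it is exaggerated for this toy.
eps = 0.25
x_adv = (x + eps * x.grad.sign()).detach()

print("before:", pred.item(), "| after:", model(x_adv).argmax(dim=1).item())
```

This one-line gradient step is the mechanism behind such misrecognitions.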
The Golden Molecule
— The problem is, people are willing to trust new technology and are glad to delegate decision-making responsibility to algorithms. It rarely occurs to people to question a GPS navigator or other user-friendly systems.
— Laziness is a quality we owe to evolution. If it's okay not to think, the brain won't think, for mental activity is an energy-consuming process. This principle underlies countless marketing ploys. Why bother checking a map when you can rely on your GPS? Why memorize phone numbers when you can store them in an electronic phone book? This makes people let their guard down, but there's a price to pay eventually. When you trust an algorithm with functions critical to your livelihood, at some point, someone else may want to seize control of that algorithm.

— The Chinese social credit system offers a glaring example of how algorithms can meddle directly in human life. Using mass surveillance and big data gathering, the system awards a social credit score to every person, thereby setting their social status. A person with a low score is unable to get a good job or purchase an airplane ticket, and is barred from many other "privileges".
— I do not support this social credit system, or any other system like it. The key problem with social credit systems is strictly scientific in nature. Those who deploy them need to learn some math. What does a social credit system essentially represent? It is an attempt to crudely approximate reality by reducing infinitely complex social, cultural, and other realities to a model with a finite set of parameters. But the trouble is, no universal model exists to date that could describe everything there is with a finite set of parameters. Errors are inevitable; it's only a matter of time before they occur.
The other problem is the threat of malicious hijacking of control. Who creates these social credit scores? Who tallies them? There are always programmers, system administrators, and other professionals behind the scenes. Social credits are an easy target for manipulation. People usually feel that on a subconscious level, so no wonder social credit rankings are intuitively resented by society.

— The official rationale for the social credit system sounds so familiar it hurts: supposedly it assists in the "building of a harmonious socialist society".
— Whenever it is claimed that inventions like these help avoid social upheaval, that's a mistake. They only make things worse, exacerbating existing inequalities and provoking discontent, which builds up until it eventually lashes back against the system, causing it to crash from both inside and outside.

— The European Union tried to come up with a similar system, but last year they banned the use of artificial intelligence for the award of social credit scores.

— They did. I actually took part in those debates at various international forums (including UNESCO). Although I rarely concur with my EU colleagues on key issues, in this case I backed their initiative to put social ranking on hold. I suppose the EU community is too multicultural for such a system. Working on their social credit model, the developers must have hit a point where they realized the model was getting too complicated. Too complicated to work out, is my guess.
Photographer: Timur Sabirov / for “Life and Other Stories”
— In Russia, they mulled this thing called "social utility index" for a while.
— We have discussed this matter numerous times at various levels, including with Igor Ashmanov, who sits on the Presidential Council for Civil Society and Human Rights. We, the expert community, are opposed to social credit rankings on a point of principle, because we consider it to be a logically flawed and unscientific concept. As far as I know, nothing of the kind is planned in Russia any time soon.

— Let's take a look at yet another area where AI is applied: the pharmaceutical industry. Experts predict that, by 2025, 30% of new medicinal products will be formulated by generative AI. Where in the drug development process does AI typically come in?
— It is primarily involved in the search for candidate molecules for medicinal formulas. It's a laborious process: you have to find one molecule out of the several hundred million known today. In fact, today we are well equipped to generate and synthesize new molecules that don't occur naturally. Massive synthetic databases are being set up, which can also be searched for drug candidates.
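A minimal sketch of one common form of that search, assuming Python with RDKit and a toy three-molecule "database": candidates are encoded as molecular fingerprints and ranked by Tanimoto similarity to a known active compound (aspirin here).

```python
# Virtual screening: rank a molecule library by similarity to a known active.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

query = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")   # aspirin
library = {
    "salicylic acid": "O=C(O)c1ccccc1O",
    "paracetamol":    "CC(=O)Nc1ccc(O)cc1",
    "caffeine":       "Cn1cnc2c1c(=O)n(C)c(=O)n2C",
}

qfp = AllChem.GetMorganFingerprintAsBitVect(query, 2, nBits=2048)
for name, smiles in library.items():
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)
    print(f"{name}: Tanimoto similarity {DataStructs.TanimotoSimilarity(qfp, fp):.2f}")
```

In production the same ranking runs over hundreds of millions of structures, with machine-learned scoring functions supplementing the simple similarity metric.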

— Where are all these biomedical libraries stored? Giants like Google put their data centers beyond the Arctic Circle or on Arctic oil platforms, where there's plenty of freezing cold water.
— One hundred million molecules actually isn't that much. The structure of a molecule is represented by a very small amount of data, so a database this size can be stored on an ordinary modern PC. Incidentally, this is why pharmaceutical companies are so uptight about cybersecurity: it would be easy to steal that much data; it isn't a terabyte or even a gigabyte. The structure of a molecule takes up less than a kilobyte. It can be sketched on a piece of paper or written down in dots and dashes.
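A quick check of that arithmetic in plain Python: a molecular structure in SMILES notation really is a tiny text string.

```python
# A molecule's structure in SMILES notation is a short text string.
aspirin = "CC(=O)Oc1ccccc1C(=O)O"                 # aspirin, written as text
print(f"one molecule: {len(aspirin.encode('utf-8'))} bytes")    # 21 bytes

# Even at a generous ~50 bytes per molecule, 100 million structures
# amount to about 5 GB, which fits on an ordinary laptop drive.
print(f"100M molecules: ~{100_000_000 * 50 / 1e9:.0f} GB")
```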
But when it comes to searching for molecules in synthetic databases, we're talking billions or even trillions of entries. Work of this magnitude surely needs dedicated, standalone data centers. We have them in Russia, but so far we're markedly behind the most technologically advanced nations in available computational capacity per capita. Moreover, our IT infrastructure is concentrated mainly in the major cities: Moscow, St. Petersburg, and Novosibirsk, where a large data center is planned. This isn't ideal; the system ought to be distributed more widely. When a region gets a supercomputer center equipped for artificial intelligence, that inevitably propagates a broad range of related technologies, creates new high-tech jobs, and makes young people want to stay in the region, not to mention many other positive effects.

— Does it matter to a professional what they are searching for: an expensive candidate for a cancer treatment or, say, a common one for a respiratory disease? How do the processes differ, and which has a better chance of turning up something unique and profitable for the company?
— Do you know what Western Big Pharma really cleans up on? Apart from COVID-19-related products, antidepressants and other "magic pills" are the biggest earners so far, because they're expensive and people get hooked on them for life. The profit margin can be much higher on an expensive drug for a rare disease, but the bulk of profits comes from tranquilisers, antidepressants, schizophrenia drugs, and similar products, at least as far as the US market is concerned. Life is stressful in the West, and this dovetails perfectly with the policies pursued by both medical professionals and corporations. Entire population segments are hooked on antidepressants: top managers, high-tech professionals in various fields. This fact is well known but never discussed outside close-knit expert communities. Stress has a lot to do with the philosophy of success. People are all different. To some, happiness means running a large corporation. To others, being able to go fishing in the pond every day is enough to be truly happy. In Russia, too, marketers these days are busy touting the "perfect" mise en scène of success, featuring an expensive car and a three-story house. I personally know a great many unhappy people living in those houses. And to my regret, the "magic pill" is the drug of last resort for too many.

— Do you think startups should focus on searching for antidepressant candidates?
— That's a good question, but there's no one-size-fits-all answer. It all depends on what motivates the startup: money, recognition, or public benefit. If it's money, they would do well to dig for molecules with the potential to earn billions, and those would not be molecules targeting some rare disease. If they want to make an ambitious statement as a team, they should take on a disease with major social implications, such as TB. But if they're out to make really big money, they probably shouldn't do antibiotics: those drugs are prescribed case by case and aren't big earners. In the end, every team has to make its own choice. Numerous startups operate on funds donated by charities or on private investments. Sometimes rich people, when they take a DNA test and find out they are in a risk group for some serious disease, will invest in research in hopes of finding a cure.
Photographer: Timur Sabirov / for “Life and Other Stories”
— What does a team tasked with creating a new drug with the aid of artificial intelligence look like? Should there be more chemists or mathematicians on the team?
— It should be an interdisciplinary team of new-generation natural philosophers — programmers, bioinformaticians, chemists, and pharmacologists. There is no universally applicable take on drug development. When I first started in in silico biomolecule design in Cambridge almost 20 years ago, working together with pioneers like Bobby Glen, Jonathan Goodman and Peter Murray-Rust, it was fascinating to learn how things work in the Western pharma industry. They had two research departments operating in parallel: one, dedicated to computational drug discovery, was filled with humming, blinking cabinets (supercomputers) and casually dressed programmers. The other place was manned by serious people in suits, supposedly endowed with some kind of chemical intuition. They just sat there, sketching new molecular structures on paper. Those two teams – one powered by artificial intelligence, the other by natural intelligence – were essentially doing the same job, working to discover candidates for chemical synthesis. It was astonishing!
Who do you belong to?
— Who came out on top?
— They had the same rate of success, but the naturals did a little better sometimes. I hope it stays this way, with artificial intelligence assisting natural intelligence, not the other way around.

— Is it true that even today we don't have a drug entirely developed by artificial intelligence?
— That question is posed a bit incorrectly. There are plenty of pharmaceuticals on the market where AI came in at some stage of development. On the other hand, the current regulatory system prohibits the release of drugs created entirely by artificial intelligence. You can find a candidate molecule and "cook" it via fully automated organic synthesis, but then you'll need experiments and clinical trials. With modern technology, it is possible to fully automate the production of pills, but the laws in most countries won't let you offer those pills to consumers. And that's a good thing. You know, in some countries it is legitimate to design a bridge with licensed software and start building it without any testing. This is possible because mechanical engineering is an old, exact science, whereas the human body remains largely uncharted territory.

— To locate new treatment targets in the human body – is that more of a challenge for AI?
— It sure is. But it's a trendy challenge that ties in with lots of exciting areas of research. One is metabolic pathway modeling; 3D reconstruction of protein structures is another. The functions of a protein are determined by its structure. Modern bioinformatics lets us identify the primary structure of numerous target candidates, but in order to accurately match them with therapeutic agents, their three-dimensional structure needs to be reconstructed. The key experimental method employed to that end is X-ray or neutron scattering, and before it can be applied, the proteins must first be crystallized, a process that isn't always successful, particularly with membrane proteins. In fact, the protein folding problem has been acknowledged as one of the foremost challenges in contemporary science. It took scientists nearly half a century to succeed in predicting a three-dimensional structure from the primary one-dimensional one. The problem was eventually solved for some (but not all) protein classes thanks to AI. Several research teams, including Google's, came up with successful algorithms for the problem. In Russia, Petr Popov at Skoltech has shown excellent results; we worked on this problem together.
But it gets even more interesting when the research is extended to metabolic pathways. The thing is, many drugs work like a game of billiards in our body: the therapeutic effect is delivered either by their metabolites or by derivatives of a whole cascade of reactions. The more accurate our models of those processes, the better sense we can get of what makes life tick. This research subject is now vigorously pursued by the IT and AI Center at NTU Sirius, working in collaboration with Sechenov University and a few other partners. Kirill Peskov and Yuri Vassilevski are the project leaders.
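A minimal sketch of such a cascade model, assuming Python with SciPy and made-up rate constants: the administered compound P is inert, and the therapeutic effect comes from its metabolite M, the first "billiard ball" in the chain.

```python
import numpy as np
from scipy.integrate import odeint

# Hypothetical two-step cascade: prodrug P -> active metabolite M -> elimination.
k1, k2 = 0.8, 0.3          # assumed first-order rate constants, 1/hour

def cascade(y, t):
    P, M = y
    dP = -k1 * P           # prodrug converted into the metabolite
    dM = k1 * P - k2 * M   # metabolite formed, then eliminated
    return [dP, dM]

t = np.linspace(0, 24, 200)
P, M = odeint(cascade, [1.0, 0.0], t).T

# The pill you swallow is not the thing that acts: M peaks hours after dosing.
print(f"metabolite peaks at t = {t[M.argmax()]:.1f} h, level {M.max():.2f}")
```

Real pharmacometric models chain dozens of such equations, and fitting their parameters to clinical data is one of the places where machine learning helps.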

— I know you researched new molecules for a coronavirus drug at Skoltech. How is that going?

— We found a few promising candidates, and we largely owe this to the new methods developed by Petr Popov and his colleagues. But new drug development is a lengthy process; it's going to be a while before we can tell if those molecules are any good. The pandemic kick-started lots of regulatory processes, but the rule of thumb in medicine still holds: "do no harm". Remember how it was with certain now-illegal drugs? Cocaine was once sold over the counter as a cough suppressant, and opium as a cure for diarrhea, before it transpired that both had a host of side effects. Such practices are unthinkable today, and the lessons learned by the pharmaceutical industry can help us avoid many tragic errors in the field of AI. We have already let too many genies out of their bottles without due assessment of the risks involved.
This interview was first published on the website of Kommersant-Nauka magazine, July 24, 2022.