December 3, 2019
One of the biggest issues with the rise of the Internet was easy access to pirated content. From music to video games, various industries affected by piracy created countermeasures to protect their content. As a result, the music industry is incredibly strict in enforcing copyright laws. YouTube videos or even entire accounts are often taken down if they include copyrighted music the user has not licensed. However, going through proper channels to license music can be very time-consuming and costly. To solve this problem, Alexey Kochetkov created Mubert, an AI music generator.
Founder and CEO Alexey Kochetkov holds degrees in both computer science and music education. Mubert was born from Kochetkov’s expertise in both fields. With Mubert, Kochetkov hopes to make big waves in the music industry and offer cost-effective music solutions to video game streamers, voice assistant developers, and numerous other industries.
Furthermore, the research team at Mubert is working to create a “musical DNA”, through which music can be tailored to individuals based on their preferences and the activity during which the music is played. We sat down with Kochetkov to learn more about the product.
Alexey: Mubert is an AI music company. We’re developing an algorithm that uses artificial intelligence to generate original, non-stop music for commercial or personal use, which can easily be customized and streamed worldwide. At the most basic level of the app, users tap one button to choose music for a specific activity or genre. Next, Mubert generates music for that category at a random tempo and in a random scale. Then, users can adjust the music by pushing the like or dislike button in the application. This is how users train the AI to generate music closer to their tastes.
At the moment, you can’t automatically change the tempo in the mobile app. However, we are currently trying to build apps where you can control the music fully. Our first goal was to create an app where users could push one button and get non-stop music.
On the other hand, the company does offer a private API for companies who require more custom music solutions. Through the API, customers can control the AI music generator and customize it. They also have the ability to customize the input data and use it to create music.
Alexey: One day I was jogging with my friend. Every day, we would jog ten kilometers. During that time, we talked about how annoying it was to change the tracks on the playlists we were listening to. Especially when you just want to focus on your task, like jogging, it’s very annoying to have to change the tracks.
I started to tell him about an idea I had for an algorithm that would generate music based on my jogging tempo. That is how Mubert was born. The idea was to generate a non-stop infinite playlist based on one tempo and one mood. Such an application would greatly help people stay focused.
Playlists with various songs and artists don’t keep people focused very well. On that day, I grabbed five of my friends and we started to build this algorithm. A year later, we changed the team and made it into a business that could generate profit.
Alexey: Right now there is a growing trend of tightened regulations for copyrighted content. Businesses have a lot of trouble adding music to their services and apps, or playing music in public spaces. Legally, businesses need to purchase copyrighted content before using it. However, this is a very time-intensive task for them.
Generative music or AI music generators can solve this problem for businesses all over the world. Our platform allows music to be streamed worldwide. We have the full rights to all of our content. Therefore, we can stream this content globally and give access to other companies that have a contract with us.
Another problem is that musicians cannot monetize their music or sounds that easily. We solve this problem for them by paying musicians royalties to provide sample sounds and music. Our team uses these sounds and music samples in Mubert. We stream them as whole tracks or use them to create new music for our business customers. This music is copyright-protected and royalty-free.
Alexey: The most common use cases today are streaming services and voice assistants. Another large target market is music for public spaces. These use cases encounter similar problems regarding copyrighted content. For public spaces, any music played must be licensed or you risk getting fined. One big case was Peloton, a company that streams music for sports. This year they were fined 150 million dollars for using copyrighted music.
Mubert is a platform that makes it easy to get music for these use cases. With our services, you can stream royalty-free music, removing the risk of fines from using copyrighted songs.
Alexey: Mubert is very different from such algorithms. It is an API and application for generating music, but it is not a generative algorithm. The neural network itself is mainly used for sound classification, data analysis, and creating our database. This database is analyzed by musical algorithms that are based on musical rules. Finally, the streaming platform streams music to our customers.
When the music is being generated, the program simply puts sounds together. We don’t need to use neural networks to do this. However, putting sounds together correctly, so that they match the same mood, requires machine learning and big data algorithms. Another thing we are working on is like and dislike buttons, which people can use to tailor playlists to their unique tastes. This essentially becomes a personal AI for the user, and we use this data to create a personal experience for everyone who uses Mubert.
To put sounds together correctly, we need to grab small features of every sound. We have a big database of around 500,000 samples. These samples need to be accurately classified, and we need to find all the features of every sound and use them to generate better music.
There is a large set of parameters that can be extracted from each sound, for example, light sounds, fat sounds, sounds with a tempo of 120 BPM. This is the main area where AI algorithms work together. To solve different problems you need to use different algorithms. It depends on the task we are looking to solve at each step in the chain.
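The pipeline described above (classify samples into features, then let rule-based logic put matching sounds together) can be illustrated with a minimal sketch. This is not Mubert's code: the `Sample` fields, the function names `pick_compatible` and `build_stream`, and the exact-match rule on tempo and mood are all assumptions made for illustration, and the classification step is taken as already done.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """A hypothetical database entry; the tags stand in for classifier output."""
    name: str
    bpm: int        # tempo, e.g. 120
    mood: str       # e.g. "calm" or "energetic"
    texture: str    # e.g. "light" or "fat"


def pick_compatible(samples, bpm, mood):
    """Return the samples whose tags match the requested tempo and mood."""
    return [s for s in samples if s.bpm == bpm and s.mood == mood]


def build_stream(samples, bpm, mood, length, seed=None):
    """Chain `length` compatible samples, avoiding immediate repeats when possible."""
    pool = pick_compatible(samples, bpm, mood)
    if not pool:
        raise ValueError("no samples match the requested tempo and mood")
    rng = random.Random(seed)
    stream = []
    for _ in range(length):
        # Exclude the previous sample unless it is the only one available.
        choices = [s for s in pool if not stream or s != stream[-1]] or pool
        stream.append(rng.choice(choices))
    return stream
```

Calling `build_stream(database, bpm=120, mood="calm", length=8)` on a list of tagged samples returns eight samples that all share one tempo and one mood, which mirrors the "one tempo and one mood" idea from the interview. A real system would presumably use richer compatibility rules (key, texture, instrument role) than exact tag equality.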
One of the largest problems in AI today is access to high-quality training data. Furthermore, we often need humans to annotate the data, add labels or other metadata before it can be used for training. In Mubert’s case, maintaining a database of over 500,000 sound and music samples is no easy task.
Alexey: We have a team of 2,000 musicians working on this system with us. Furthermore, we have some directors who verify the AI algorithm’s results. I think that in the future we will use these algorithms to classify the data about both the sounds and the listener.
We just began implementing the like and dislike feature. Using this data from our customers, we are trying to analyze the moment where someone “likes” or “dislikes” a particular track. We need to analyze all the features of the sounds that are being played in that moment.
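One simple way to picture this kind of feedback is a per-listener table of feature weights that is nudged whenever a like or dislike arrives. The sketch below is an assumption-laden illustration rather than Mubert's method: the function names `apply_feedback` and `score`, the fixed step size, and the additive update rule are all hypothetical.

```python
from collections import defaultdict


def apply_feedback(weights, active_features, liked, step=1.0):
    """Nudge the weight of every feature that was playing at the moment of feedback."""
    delta = step if liked else -step
    for feature in active_features:
        weights[feature] += delta
    return weights


def score(weights, features):
    """Sum the listener's weights for a candidate sound's features (unknown ones count as 0)."""
    return sum(weights.get(f, 0.0) for f in features)


# Hypothetical listener profile: starts empty, every feature defaults to 0.0.
profile = defaultdict(float)
apply_feedback(profile, {"light", "120bpm"}, liked=True)
apply_feedback(profile, {"fat", "120bpm"}, liked=False)
```

After these two events, light sounds score higher than fat ones for this listener, while the shared tempo feature cancels out. Candidate sounds could then be ranked with `score` before being added to the stream.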
Alexey: Right now I want to build a business without royalties. I want to build a subscription based business where you pay a certain price monthly and get an unlimited amount of music. However, to do this we can only use samples that we own the full rights to. Therefore, we must buy out sound or music samples in full.
We buy out sample packs, use them, transform them, and scale them to different genres, tempos, and scales. Next, we create our own samples based on sounds generated by musicians and have the rights to use them as we wish. In the end, we can sell them all over the world to any business. This is how I want to build this model for API solutions.
Alexey: Mubert helps you focus better. Unlike Spotify or other apps, Mubert doesn’t have any stops or pauses. Our music streams are on a continuous, infinite loop. The tempo of the songs is the same. There are no changes in the mood of the music for the whole track. We have tested our prototypes with joggers and pretty much all of them said the same thing:
“Mubert helps you to become focused on the specific task the track is tailored to. It’s almost like meditation.”
With the release of synthetic voice and deepfake technology, many people worry that such technology will lead to a loss of jobs in many industries. For example, synthetic voices could pose threats to voice actors. Similarly, OpenAI’s recently released GPT-2 is said to pose employment risks for writers. If Mubert or other AI music generators become mainstream, they will likely have a major effect on numerous industries.
Alexey: Of course there will be some effects from AI technologies. But I think that the whole market will become stronger through them and lift the industry to another level. Just as the emergence of recording technology drastically changed the music industry, AI will usher in a new era for the entire market.
More importantly, people want to listen to artists, not only to machine-generated songs. People mainly use machine-generated songs for background music or music for audio books.
In my opinion, we can divide music into two types: background music and artist’s music. Background music is for activities like jogging or working. With that said, technology around virtual artists has some future. If we create an AI virtual artist, I think it would be interesting to see how it evolves.
Alexey: I want to create a musical DNA, a technology through which your own personal tastes can be reflected in the sound waves that you are listening to. It can be a vocal song or background music. However, I want to create something completely personal, something so close to your taste that you won’t want to take off your headphones.
Furthermore, I would like to collaborate with synthetic voice technologies, because I feel like Mubert and synthetic voices could create a synthetic artist! We are working right now to add vocals to Mubert, synthesize voices, and hopefully build virtual artists.
Most importantly, our analysis is not just about the sound itself; it is about the moment. It isn’t as simple as “I’m working and listening to Techno music.” It’s much deeper than that. For example, “I’m working and I’m listening to this genre. The sounds are light. This percussion uses the sounds of an analogue drum”. We are not merely tailoring the music to the individual, but to their behaviour and the moment in which it is played.
I want to create a new layer of people’s lifestyles where they can have personalized music for every activity, for every moment of their lives.