Welcome to our stems library!

Isolated-tracks.com — Professional multitracks / stems library

Technical Background: Kinds of Backing Tracks
In the meantime, let’s talk more about the benefits of multitracks and the reason this project was created. In this article, we’ll touch on some technical topics that are often uncertain or misunderstood. We’ll also share our experience and give you a look at our vision and approach.
Let’s take a moment to consider where all phonograms come from—and what kinds of backing tracks exist.
No. 1: Original Studio Stems
First, let’s talk about original multitracks.
Some original stems are shared by the songwriters or artists themselves. Many artists and bands intentionally release their most technically challenging, commercially successful, or simply popular songs in multitrack format. It’s a way to gain respect in the professional community—among fellow musicians and sound engineers. These multitracks can be used by those interested in the art of mixing, as well as other musicians and DJs.
Studying the multitracks of legendary hits is an incredibly valuable experience! Imagine working with original studio stems from Michael Jackson, Freddie Mercury, Deep Purple, or countless others. Studio stems are usually raw sources. You may have the physical files, but you don’t have the original hardware racks, mixing consoles, or other gear used during recording. Creating a good mix is still a complex and creative process. That’s why even with original multitracks, it’s very difficult to fully reproduce the palette of the original mix!


No. 2: Original Stereo Backing Track
The next type of phonogram is the stereo mix of an original song without vocals.
These tracks are usually released by the artists themselves or their production team. Compared to original multitracks, the main difference is that all instruments are already bounced into a single stereo track.
The upside? You don’t need to put in much effort to create a polished final mix—the work is essentially done for you. The downside? You lose the flexibility. Since everything is locked into a finished stereo mix, you can’t isolate or transform individual parts. One wrong move can ruin the whole mix!
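To make the trade-off concrete, here is a minimal sketch of what "bouncing" means. It assumes each stem is a short list of samples and uses one channel for brevity; the stem names, the sample values, and the `bounce` helper are invented for this illustration and do not come from any particular DAW.

```python
def bounce(stems, gains):
    """Sum gain-scaled stems, sample by sample, into a single track."""
    length = len(next(iter(stems.values())))
    return [
        sum(gains[name] * samples[i] for name, samples in stems.items())
        for i in range(length)
    ]

# Hypothetical two-sample stems (toy values, not real audio).
stems = {
    "drums": [0.5, 0.0],
    "bass": [0.25, 0.25],
}

# With the stems you can rebalance freely, for example halve the bass.
full_mix = bounce(stems, {"drums": 1.0, "bass": 1.0})
quiet_bass = bounce(stems, {"drums": 1.0, "bass": 0.5})

# With only the bounced mix you can scale everything together, nothing more.
louder_mix = [2.0 * sample for sample in full_mix]

print(full_mix, quiet_bass, louder_mix)
```

Once the stems are summed, the individual contributions are gone: the bounced track stores only the totals, so the bass can no longer be turned down on its own.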
Sometimes, though, you might get lucky and find different versions of these original stereo tracks—such as ones with or without backing vocals. That’s always a great find!
No. 3: Popular, but Not Flexible
The next type of backing track is karaoke—either as a video with on-screen lyrics or an audio file (often in MP3 format) with subtitles or lyric cues.
This is by far the most common and widely available format. There are countless karaoke providers, often with professional musicians and engineers working behind the scenes to produce an enormous catalog of songs. If a track becomes popular, chances are you’ll find a karaoke version of it almost immediately. But if the song isn’t well-known, it might never get one.
Karaoke’s biggest strength is its simplicity. Anyone can use it—no technical skills needed. Just hit play, follow the lyrics, and sing along!
The limitation? It’s not customizable. You can’t adjust or remove instruments, change arrangements, or personalize the track. For musicians who want flexibility, karaoke quickly shows its limits.


No. 4: MIDI Format — Too Many Dependencies
Another interesting type of backing track is MIDI-karaoke. You can often find MIDI files of popular songs in open sources. But MIDI works very differently from audio formats.
Unlike multitracks or stereo mixes, a MIDI file doesn’t store sound itself. Instead, it contains instructions—events like Note On, Note Off, Aftertouch, etc.—that tell a synthesizer what to play and when. MIDI was originally designed as an interface to communicate commands between devices, not to capture the physical characteristics of real-world sounds.
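As a rough illustration of the point above, the sketch below builds a Note On and a Note Off message by hand. The helper names `note_on`, `note_off`, and `describe` are made up for this example; the status bytes 0x90 and 0x80 are the standard MIDI 1.0 values for these two events. Notice that the messages contain no audio at all, only three small numbers each.

```python
def note_on(channel, note, velocity):
    """Build a 3-byte Note On message: status 0x90 plus channel, note, velocity."""
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note, velocity=0):
    """Build a 3-byte Note Off message: status 0x80 plus channel, note, velocity."""
    return bytes([0x80 | channel, note, velocity])

def describe(message):
    """Turn a Note On / Note Off message into a readable string."""
    status, note, velocity = message
    kind = {0x80: "Note Off", 0x90: "Note On"}[status & 0xF0]
    return f"{kind} channel={status & 0x0F} note={note} velocity={velocity}"

# Middle C (note number 60) pressed and released on the first channel.
events = [note_on(0, 60, 100), note_off(0, 60)]
for event in events:
    print(event.hex(), describe(event))
```

What the note actually sounds like (a grand piano, a cheap organ, a string pad) is decided entirely by whichever synthesizer receives these bytes.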
That means the result depends heavily on the synthesizer or software you use. When you play a MIDI file, your hardware or software synth uses its built-in timbres to generate sound. The quality and variety of those timbres are limited by the synth. Even worse, if you switch to a different synth, the same MIDI file may sound completely different—and not necessarily better.
You can import MIDI files into a sequencer (Cubase, FL Studio, Pro Tools, etc.), load VST instruments, and build a meaningful result—but this takes technical skill, time, and the right tools. And keep in mind: most MIDI files online are created by hobbyists with limited training and a poor ear. So the majority are low quality.
In theory, MIDI offers huge opportunities for creating a custom mix. But in practice, it demands technical expertise, patience, and lots of setup. Even for experienced musicians, working with MIDI sources can be frustrating and time-consuming.
No. 5 & 6: Cutting and Voice Removal — “I Believe in Miracles!”
The next two heroes of our story are backing tracks made by cutting up an original song or by running it through a voice-removal tool. Cutting is technically trivial. Open any audio editor (Audacity, Sound Forge, Adobe Audition), select a fragment of the original song that has no vocals, and copy it again and again. The problem is that arrangements are often sophisticated, and you may not find a vocal-free passage to copy and paste for, say, a bridge. That is where voice-removal and voice-reduction tools come in.
Let’s look at what happens to the sound when we apply voice removal, and why the result is so ugly that your ears go numb when you listen to such phonograms.
Voice removal tools make a new copy of the original mix in which the vocal appears to be deleted by means of phase inversion. It’s no secret that producers traditionally place the lead vocal in the center of the stereo panorama (by the way, the most famous violators of this rule were the Beatles). So if you split the stereo track of a song into two independent mono tracks, invert the phase of one of them, and mix them back together, the level of the lead vocal drops significantly. It is practically deleted.
Practically, yes. In fact, it’s not so simple. The lead singer’s signal is not the only thing destroyed this way! Other instruments sit in the center of the stereo panorama too, and they include such important ones as bass and kick. As you increase the amount of phase inversion, distortions inevitably appear and become audible. Bass and kick suffer first, for psychoacoustic reasons. Strictly speaking, nearly every instrument is present in the center to some degree, so the whole phonogram suffers.
It’s also worth remembering that a vocal part is not just a dry voice but also the spatial effects applied to it, such as reverb or delay. These effects account for the lion’s share of a good commercial mix, and in the modern digital era they are simple to use and cost almost nothing. Delays and reverbs occupy the full width of the stereo base and spread all over the stereo panorama. Their sound merges with the rest of the instruments, and the so-called masking effect comes into play. By inverting the phase you can dampen the vocal, but you cannot delete the tails left by its processing. To remove them completely, you would have to delete nearly all of the music, leaving only the periphery of the stereo panorama, its leftmost and rightmost points.
That is why phonograms made by voice removal have characteristic, unpleasant artifacts and overtones that are easy to hear, and their sound quality is low. For those who still want to experiment with vocal removal, we have a fully automated online tool. You can try it at this link. Upload any file in MP3 format and receive a download link for the result!
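The phase-inversion trick discussed in this section can be sketched in a few lines. This is a toy model under stated assumptions, not the algorithm of any particular tool: the signals are four-sample lists with invented values, the vocal and bass are assumed to be perfectly centered (identical in both channels), the guitar is assumed to be panned hard left, and the vocal reverb is assumed to differ between the channels. The function name `remove_center` is hypothetical.

```python
def remove_center(left, right):
    """Invert the phase of the right channel and sum it with the left.

    Anything identical in both channels cancels out; the result is mono.
    """
    return [l - r for l, r in zip(left, right)]

# Toy signals (invented values).
vocal = [0.5, -0.5, 0.25, -0.25]       # centered: same in left and right
bass = [0.25, 0.25, -0.25, -0.25]      # centered too
guitar = [0.125, 0.0, -0.125, 0.0]     # left channel only
reverb_left = [0.0, 0.125, 0.0625, 0.0]    # vocal reverb, left side
reverb_right = [0.0, 0.0625, 0.125, 0.0]   # vocal reverb, right side

left = [v + b + g + w for v, b, g, w in zip(vocal, bass, guitar, reverb_left)]
right = [v + b + w for v, b, w in zip(vocal, bass, reverb_right)]

result = remove_center(left, right)

# What is left: the guitar plus the part of the reverb that differs per side.
leftover = [g + (wl - wr) for g, wl, wr in zip(guitar, reverb_left, reverb_right)]
print(result)
print(result == leftover)
```

In this toy case the dry vocal cancels, but so does the bass, and the residue of the vocal reverb stays mixed into the guitar. Real mixes are far messier, which is why the audible artifacts are unavoidable.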


Where is the truth?!
Folks! Seriously?! It’s the 21st century. We have digital sound, technical progress, AI, the LHC, NASA, and yet an ordinary musician is still restricted. The reason is not that scientists don’t think about musicians. The reason is that music has a very complex physical and psychological nature. Our perception of it rests on psychoacoustic phenomena that are tightly interwoven and poorly formalized. That makes music hard to process even with the most advanced existing technologies! We can say that we’re only at the beginning.
So, let’s draw some interim conclusions. Voice removal tools are simple, but they work as uglifiers, with a huge loss of quality. Karaoke is widely accessible but not flexible. The MIDI-karaoke format (*.mid and *.kar are essentially the same) is very flexible, but it requires combined knowledge of music and audio production tools, plus time and equipment.
What should we do? Would you like to record all the parts yourself? Do you play all the instruments well? Do you have backing vocalists, or do you sing well yourself? Is your ear good enough to transcribe every part exactly? Do you have the time to do all this? And what if you have a busy schedule and need, let’s say, twenty new backing tracks by tomorrow? All of this leads to one conclusion. Let us finally say the magic word...