DATA MANAGEMENT
Explore Your MP3 Files with VBA
Learn how to retrieve the metadata in an MP3 file using VBA code.
Sometimes, I'm a little late jumping onto the technology bandwagon. Despite a passion for music (I have two 300-CD jukeboxes at home), it was only in the past couple of months that I jumped into the MP3 world when I bought portable players for my daughter and myself. Since then, I've done a lot of research into MP3 files, and there's a lot more to them than you'd think! So, in this article, I share what I've learned and show you how you can retrieve the metadata stored within an MP3 file.
Diving in
MP3 is the shortened name for MPEG-1 Layer III (or MPEG Audio Layer III), and is an audio subset of the MPEG industry standard developed by the International Organization for Standardization (ISO). It became an official standard in 1992 as part of the MPEG-1 standard. The original MP3 audio format had no native way of saving information about its contents, except for some simple yes/no parameters such as "private," "copyrighted," and "original home" (original home means it was the original file and not a copy).
In 1996, Eric Kemp (alias NamkraD) decided some information should be included in the MP3 file itself. As part of the program "Studio3," he added a small chunk of extra data at the end of the file so the MP3 file could carry information about the audio and not just the audio itself. This tag, 128 bytes of information stored at the end of the file, is now referred to as the ID3v1 tag (table 1).
In MP3 tags, you have to fill any unused bytes (such as when the song title or artist name is less than 30 characters) with 0 (the Null character).
While the ID3v1 tag may have been easy for programmers to implement, its fixed size and lack of "Reserved for future use" space left little room for improvement, assuming you wanted to maintain compatibility with existing software (i.e., if you change the definition of the tag, existing software might stop working). Despite these limitations, a German audio engineer named Michael Mutschler managed to make a clever improvement on the ID3v1 tag specification. Mutschler assumed all ID3v1 readers would stop reading a field when they encountered a zeroed byte within the field. That means that if the second to last byte of a field is zero (for a 30-byte field, that's byte 29, or byte 28 if you're counting from 0), you could use the last byte to store other information. He felt that a 30-character comment was too short to be useful and that most people wouldn't use it. Consequently, he suggested that the Comment field be limited to 28 characters (for those who really wanted to include comments), that the next byte always be zero, and that the last byte before the genre byte hold the number of the album track the music comes from. This became known as the ID3v1.1 standard, explained in table 2.
Extracting MP3 information
Because everything in the ID3v1.1 tag is fixed-width, it's pretty easy to write code to extract this information. You can use the VBA Open and Get statements to read the last 128 bytes of the file:
Dim intFile As Integer
Dim strBuff As String * 128

intFile = FreeFile
Open FullPathToFile For Binary _
    Access Read As #intFile
' The tag occupies the last 128 bytes of the file.
Get #intFile, FileLen(FullPathToFile) - 127, _
    strBuff
Close #intFile
After you've got those 128 bytes, you can use the VBA Mid function to extract the individual parts of the tag, or you can create a structure that matches the ID3 tag and read the data into that structure.
Of course, remember that unused bytes are going to be filled with the Null character (Chr$(0)), and trying to display strings with Null characters in them can sometimes cause odd effects. Create a simple TrimNull function to throw away everything in the substring from the first Null character on, such as:
Function TrimNull(InputString As String) _
    As String
    Dim intNull As Integer
    ' Find the first Null character, if there is one.
    intNull = InStr(InputString, vbNullChar)
    If intNull > 0 Then
        TrimNull = Left$(InputString, intNull - 1)
    Else
        TrimNull = InputString
    End If
End Function
Armed with this function, reading the ID3v1.1 tag is simply:
' A valid ID3v1 tag starts with the three characters "TAG".
If StrComp(Left$(strBuff, 3), "TAG", vbBinaryCompare) = 0 Then
    strSongname = TrimNull(Mid$(strBuff, 4, 30))
    strArtist = TrimNull(Mid$(strBuff, 34, 30))
    strAlbumTitle = TrimNull(Mid$(strBuff, 64, 30))
    strAlbumYear = TrimNull(Mid$(strBuff, 94, 4))
    strComment = TrimNull(Mid$(strBuff, 98, 28))
    ' Byte 127 is the track number, byte 128 the genre.
    intTrack = Asc(Mid$(strBuff, 127, 1))
    intGenre = Asc(Right$(strBuff, 1))
Else
    MsgBox "Not a valid ID3v1 Tag"
End If
Alternatively, if you want to go the Structure route, the code would resemble this:
Public Type MP3Info
    Tag As String * 3
    Songname As String * 30
    Artist As String * 30
    AlbumTitle As String * 30
    AlbumYear As String * 4
    Comment As String * 30
    Genre As String * 1
    Track As String * 1
End Type

Function GetMP3Info(FullPathToFile As String) As MP3Info
    Dim intFreeFile As Integer
    intFreeFile = FreeFile
    Open FullPathToFile For Binary Access Read As #intFreeFile
    With GetMP3Info
        Get #intFreeFile, FileLen(FullPathToFile) - 127, .Tag
        If Not .Tag = "TAG" Then
            Debug.Print "No tag for " & FullPathToFile
        Else
            ' Read the fields in the order they appear in the file.
            Get #intFreeFile, , .Songname
            Get #intFreeFile, , .Artist
            Get #intFreeFile, , .AlbumTitle
            Get #intFreeFile, , .AlbumYear
            Get #intFreeFile, , .Comment
            Get #intFreeFile, , .Genre
            ' Fixed-length fields are padded back out with spaces
            ' when the trimmed value is assigned.
            .Songname = TrimNull(.Songname)
            .Artist = TrimNull(.Artist)
            .AlbumTitle = TrimNull(.AlbumTitle)
            .AlbumYear = TrimNull(.AlbumYear)
            ' Pull out the track byte before trimming the comment.
            .Track = Mid$(.Comment, 30, 1)
            .Comment = TrimNull(.Comment)
            ' .Genre is left as read: a genre byte of 0 is a valid
            ' value, and trimming it would turn it into a space.
        End If
    End With
    Close #intFreeFile
End Function
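Neither listing checks whether a tag actually follows the ID3v1.1 convention before treating byte 127 as a track number. The following is a minimal sketch of such a check, not part of the original listings. The function name ID3v1TrackNumber is hypothetical, and it assumes the 128-byte tag has already been read into a string as shown earlier. It returns 0 when the comment field appears to use all 30 bytes (plain ID3v1).

```vba
' Hypothetical helper: returns the ID3v1.1 track number, or 0 when the
' tag is missing or looks like plain ID3v1.
Function ID3v1TrackNumber(ByVal TagBytes As String) As Integer
    ID3v1TrackNumber = 0
    If Len(TagBytes) <> 128 Then Exit Function
    If StrComp(Left$(TagBytes, 3), "TAG", vbBinaryCompare) <> 0 Then
        Exit Function
    End If
    ' Byte 126 must be zero for byte 127 to hold a track number.
    If Asc(Mid$(TagBytes, 126, 1)) = 0 Then
        ID3v1TrackNumber = Asc(Mid$(TagBytes, 127, 1))
    End If
End Function
```

With the strBuff variable from the first listing, a call such as intTrack = ID3v1TrackNumber(strBuff) could stand in for the direct Asc call.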
Pretty simple, isn't it? However, Martin Nilsson (a Swedish computer scientist) and several others felt it was too limited, so they developed another, more flexible tagging standard: the ID3v2 tag. Their intent was to create something flexible and expandable. Rather than set field sizes, they wanted an approach that resembles HTML, where parsers are built to ignore any information they don't recognize so you can add new information types to the definition without impacting existing products. (You can read more about their design philosophy at http://www.id3lib.org/id3/easy.html.)
From what I can tell, all of my MP3s (mostly encoded within the past three or four months using Microsoft Windows Media Player 9.0) conform to the ID3v2.3 standard. However, because I'm not going to go into an in-depth analysis of all of the tags, I don't think it really matters if you know which exact standard to use (and I'll point out differences where it matters).
As I mentioned, the ID3v1 tag exists at the end of the file, which makes it unsuitable for streaming audio because you can't read the data until you've read the entire file. To get around this problem, the ID3v2 tags are at the beginning of the file. Unfortunately, the ID3v2 tags aren't nearly as easy to decode as their ID3v1 predecessors. Table 5 describes the layout of the ID3v2 tag.
It's not really my intent to explain every last detail of the tags, but I'll go into a little bit of detail about the 10-byte header to give you a feel for how things are encoded in the tag.
The ID3v2 tag header should be the first 10 bytes in the file. The way the header is defined in the Standards documentation (http://www.id3lib.org/id3/id3v2.3.0.txt and http://www.id3lib.org/id3/id3v2.4.0-structure.txt) is explained in table 6.
The first three bytes of the tag are always "ID3" (must be upper-case), directly followed by two bytes containing version information, which lets you revise the ID3v2 tag if necessary.
The first byte of the ID3 version is its major version, while the second byte is its revision number. The notation used in the standard is that numbers preceded with $ are hexadecimal, while numbers preceded with % are binary. You can see the definition of the version for the ID3v2.3.0 tag is $03 00, while it's $04 00 for the ID3v2.4.0 tag. This is intended to indicate that the first byte of the version (in other words, the fourth byte in the file) will have a value of hex 03 for ID3v2.3.0, and hex 04 for ID3v2.4.0 tags. The second byte of the version will have a value of hex 00 for both types of tags. The rules state that all revisions are backwards-compatible, while major versions are not.
Following the version information is one byte designated as "ID3v2 flags." The ID3v2.4 standard defines the first 4 bits of this 8-bit flag byte, while the ID3v2.3 standard only defined the first 3 bits. (This is what the 0s in the format definition are intended to show.) The convention used is that the most significant bit, or MSB, of a byte is called "bit 7", while the least significant bit, or LSB, is called "bit 0". Although it isn't important for our discussion to know what the flags mean, I'll quickly go through them, just to ensure you understand how to read the definition in Table 6.
- The first bit (bit 7, or the "a" bit shown in Table 6) in the ID3v2 flags indicates whether unsynchronization is applied on all frames. A value of 1 indicates unsynchronization is to be applied on all frames.
Note: "Unsynchronization" was intended to make the ID3v2 tag as compatible as possible with existing software. When ID3v2 tags were first introduced, the existing MP3 players didn't know anything about them. Some of the tags contain binary data, including data that might appear to an old MP3 player as a syncframe; it would then try to play the data contained in the tag. Unsynchronization alters any such data so the players wouldn't attempt to play it. I don't believe it's used anymore, but I suppose you might encounter an old MP3 file somewhere.
- The second bit (bit 6, or the "b" bit) indicates whether an extended header follows the header. A value of 1 indicates there is an extended header.
- The third bit (bit 5, or the "c" bit) is an "experimental indicator." A value of 1 indicates the tag is in an experimental stage.
- The fourth bit (bit 4, or the "d" bit) indicates whether a footer is present at the very end of the tag. A value of 1 indicates that there is a footer in the tag.
The standards state you must clear all the other bits (set them to zero), because if one of the undefined flags is set, a parser that doesn't know that flag's function might not be able to read the tag.
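To make the bit numbering concrete, here is a minimal sketch of testing a single flag bit. The function name FlagIsSet is hypothetical and isn't used elsewhere in this article. It assumes the flag byte (byte 6 of the header) has already been converted to a number, for example with Asc.

```vba
' Hypothetical helper: reports whether a given bit (7 = MSB, 0 = LSB)
' is set in the ID3v2 flag byte.
Function FlagIsSet(ByVal FlagByte As Byte, _
        ByVal BitNumber As Integer) As Boolean
    ' 2 ^ 7 = 128 tests bit 7, 2 ^ 6 = 64 tests bit 6, and so on.
    FlagIsSet = (FlagByte And CLng(2 ^ BitNumber)) <> 0
End Function
```

For example, a flag byte of &H40 has only bit 6 set, so FlagIsSet(&H40, 6) reports that an extended header follows, while FlagIsSet(&H40, 7) reports that unsynchronization is not applied.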
The final 4 bytes of the tag header represent the size of the entire tag. However, the encoding used to represent the tag size is rather unusual. The first bit of each of the 4 bytes is set to zero (that's what the %0xxxxxxx in table 6 is intended to indicate), so really, the size is encoded in a total of 28 bits (which can represent up to 256MB). This is referred to as a "synchsafe" integer. The zeroed bits are ignored, so for example, a 257-byte tag would be represented as $00 00 02 01, as opposed to the $00 00 01 01 you'd expect. (If you call the first byte A, the second byte B, the third byte C, and the fourth byte D, the tag size can be calculated as A*2^21 + B*2^14 + C*2^7 + D, as opposed to the usual A*2^24 + B*2^16 + C*2^8 + D.) The tag size is the size (in bytes) of the complete tag (after unsynchronization), including padding, excluding the header, but not excluding the extended header. In other words, the actual size of the tag is really 10 more bytes than what's indicated in the size part of the tag.
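Going in the other direction can help confirm the arithmetic. The following sketch encodes a size as four synchsafe bytes and shows them in the $xx notation used above. The function name LongToSynchSafeHex is hypothetical and serves only as an illustration. It returns an empty string for sizes that don't fit in 28 bits.

```vba
' Hypothetical helper: encodes a size (0 to 2^28 - 1) as four synchsafe
' bytes and returns them as a hex string such as "00 00 02 01".
Function LongToSynchSafeHex(ByVal TagSize As Long) As String
    Dim intLoop As Integer
    Dim lngByte As Long
    Dim strOut As String

    LongToSynchSafeHex = ""
    If TagSize < 0 Or TagSize > 268435455 Then Exit Function

    For intLoop = 3 To 0 Step -1
        ' Take 7 bits at a time, most significant group first.
        lngByte = (TagSize \ CLng(2 ^ (7 * intLoop))) And 127
        strOut = strOut & Right$("0" & Hex$(lngByte), 2) & " "
    Next intLoop
    LongToSynchSafeHex = Trim$(strOut)
End Function
```

Calling LongToSynchSafeHex(257) yields "00 00 02 01", matching the example above: 257 is 2 * 128 + 1, so the third byte is 2 and the fourth is 1.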
Should there be an extended header (as indicated by the second bit of the "ID3v2 flags"), it will be immediately after the 10-byte header already discussed, and its size is indicated in its first 4 bytes. Table 7 shows the definition of the ID3v2 extended header.
Because I'm not going to dive into the content of the extended header, the critical part here is to be able to determine how big the extended header is if it's present. An extended header must always be at least 6 bytes. However, because it can contain additional information (as determined by the values of 'number of flag bytes' and the values of the various bits in the flags field), you have to check what value is stored in the Extended Header Size field. (Be careful how you interpret that value: as the two specifications are worded, ID3v2.3 stores a size that excludes the 4 size bytes themselves, while ID3v2.4 stores the size of the whole extended header as a synchsafe integer.)
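As a rough illustration, the total length of the extended header could be worked out as follows. This is a sketch based on a reading of the two specifications cited above, not tested against real files with extended headers. The function name ExtendedHeaderLength is hypothetical. It assumes ID3v2.3 stores a plain big-endian size that excludes its own 4 bytes, and that ID3v2.4 stores a synchsafe size covering the whole extended header.

```vba
' Hypothetical helper: returns the number of bytes the extended header
' occupies, given the tag's major version and its 4 size bytes.
' Returns -1 if SizeBytes isn't exactly 4 characters long.
Function ExtendedHeaderLength(ByVal MajorVersion As Integer, _
        ByVal SizeBytes As String) As Long
    Dim intLoop As Integer
    Dim lngValue As Long
    Dim lngBase As Long

    If Len(SizeBytes) <> 4 Then
        ExtendedHeaderLength = -1
        Exit Function
    End If

    ' Assumption: 7 bits per byte (synchsafe) for ID3v2.4,
    ' 8 bits per byte for ID3v2.3. Extended headers are small,
    ' so overflow of the Long isn't handled here.
    lngBase = IIf(MajorVersion >= 4, 128, 256)
    For intLoop = 1 To 4
        lngValue = lngValue * lngBase + _
            Asc(Mid$(SizeBytes, intLoop, 1))
    Next intLoop

    If MajorVersion >= 4 Then
        ' Stored value covers the whole extended header.
        ExtendedHeaderLength = lngValue
    Else
        ' Stored value excludes the 4 size bytes themselves.
        ExtendedHeaderLength = lngValue + 4
    End If
End Function
```

The first frame would then start at byte 11 plus this length, rather than at byte 11.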
Now that we've learned where the header and extended header are located, and how big they are, you can start reading the individual frames. Recognize that each frame will be somewhat different in format. Table 8 shows the official definition for the header of a frame.
The Frame ID is a 4-character text tag, consisting of the characters capital A-Z (characters 65-90) and the digits 0-9 (characters 48-57).
Each frame header includes the size of the data in the final frame (after encryption, compression, and unsynchronization), excluding the size of the frame header (10 bytes).
Now, from what I can see, the specifications for ID3v2.3.0 (http://www.id3lib.org/id3/id3v2.3.0.txt) indicate the 4 bytes used to store the size of a frame are simply normal bytes, as opposed to the "synchsafe" format you saw in the tag header. This is certainly consistent with my experience reading the tags. However, the ID3v2.4.0 specifications (http://www.id3lib.org/id3/id3v2.4.0-structure.txt) imply it's supposed to be a "synchsafe" integer (4 * %0xxxxxxx, as opposed to the $xx xx xx xx used previously.) Unfortunately, I don't have any ID3v2.4.0 encoded files to check, so I'm only going to use the ID3v2.3.0 definition.
A tag must contain at least one frame, and there is no fixed order of the frames' appearance in the tag. A frame must be at least 1 byte, excluding the header.
Table 9 lists all of the Frame ID identifiers reserved to date. (Note, though, that PRIV frames can include any information you want).
Don't worry: I'm not going to discuss all the defined tags! In examining just over a thousand MP3 files, the only tags I found in use are those in table 10.
Note: All the tags starting with a T have a text encoding byte directly after the frame header (in other words, as byte 11 in the frame). Because all my MP3s contain English text, I'm not really sure what are valid values for the text encoding byte. All I know is that if I start 12 bytes into the frame, and go one byte less than the frame size indicates, I get the proper values.
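For readers who do need to handle non-English text, the specifications cited above list the defined values of the text encoding byte. The lookup below is a minimal sketch of those values as the specifications describe them (the last two exist only in ID3v2.4). The function name TextEncodingName is hypothetical, and the sketch only names the encoding; it makes no attempt to convert the text.

```vba
' Hypothetical helper: names the text encoding indicated by the byte
' that follows the frame header in a text frame.
Function TextEncodingName(ByVal EncodingByte As Integer) As String
    Select Case EncodingByte
        Case 0
            TextEncodingName = "ISO-8859-1"
        Case 1
            TextEncodingName = "UTF-16 with BOM"
        Case 2
            ' Defined in ID3v2.4 only.
            TextEncodingName = "UTF-16BE without BOM"
        Case 3
            ' Defined in ID3v2.4 only.
            TextEncodingName = "UTF-8"
        Case Else
            TextEncodingName = "Unknown"
    End Select
End Function
```

English-only files such as those described in the note would typically report 0 here, which is why reading the bytes directly produces the proper values.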
Okay, enough description. Let's look at some code.
First, let's define a couple of "helper functions." You're going to have to convert the size bytes to an actual number fairly often, and you'll have to work with both synchsafe integers, and normal unsigned integers.
Function SynchSafeToLong( _
    SynchSafe As String) As Long
    Dim intLoop As Integer
    Dim lngCurrByte As Long
    Dim lngSize As Long

    If Len(SynchSafe) <> 4 Then
        lngSize = -1
    Else
        lngSize = 0
        For intLoop = 1 To 4
            lngCurrByte = _
                Asc(Mid$(SynchSafe, intLoop, 1))
            ' The top bit of every synchsafe byte must be zero.
            If lngCurrByte > 127 Then
                lngSize = -1
                Exit For
            Else
                lngSize = lngSize + _
                    lngCurrByte * (2 ^ ((4 - intLoop) * 7))
            End If
        Next intLoop
    End If
    SynchSafeToLong = lngSize
End Function

Function NonSynchSafeToLong( _
    SizeBytes As String) As Long
    Dim intLoop As Integer
    Dim lngCurrByte As Long
    Dim lngSize As Long

    If Len(SizeBytes) <> 4 Then
        lngSize = -1
    Else
        lngSize = 0
        For intLoop = 1 To 4
            lngCurrByte = _
                Asc(Mid$(SizeBytes, intLoop, 1))
            lngSize = lngSize + _
                lngCurrByte * (2 ^ ((4 - intLoop) * 8))
        Next intLoop
    End If
    NonSynchSafeToLong = lngSize
End Function
Armed with all this information, let me present a routine to read the tags. In what follows, I'll simply write the information to the Immediate Window. (In a future article, I'll write the data to a table.)
First, declare the subroutine and the variables it'll use:
Sub ReadMP3ID3v2Tag(MP3File As String)
    Dim booExtendedHeader As Boolean
    Dim booNonNull As Boolean
    Dim intFreeFile As Integer
    Dim lngActualTags As Long
    Dim lngCurrLocn As Long
    Dim lngFileSize As Long
    Dim lngFrameSize As Long
    Dim lngHeaderSize As Long
    Dim lngLoop As Long
    Dim strActualTags As String
    Dim strBuff As String
    Dim strCurrChar As String
    Dim strCurrTag As String
    Dim strHeader As String * 10
    Dim strTagVersion As String
    Dim varTagDescription As Variant
Check that the file exists. If it does, determine the length of the file, then open the file, read the entire file into a buffer, and close it. Note you must call the FileLen function before you open the file. As well, there's no real need to read the entire file into the buffer. If you prefer, you could read simply the header (the first 10 bytes), determine the size of the ID3v2 tag, and read only that portion of the file into the buffer:
    If Len(Dir$(MP3File)) > 0 Then
        lngFileSize = FileLen(MP3File)
        intFreeFile = FreeFile
        Open MP3File For Binary As intFreeFile
        strBuff = Space(lngFileSize)
        Get #intFreeFile, , strBuff
        Close intFreeFile
Extract the header and ensure it's a legitimate ID3v2 tag. If it is, read the version information from bytes 4 and 5, the flags in byte 6 and the tag size information in bytes 7-10.
        strHeader = Left$(strBuff, 10)
        ' Compare in binary mode: the identifier must be upper-case.
        If StrComp(Left$(strBuff, 3), "ID3", _
            vbBinaryCompare) = 0 Then
            strTagVersion = _
                Right("00" & _
                Hex(Asc(Mid$(strHeader, 4, 1))), 2) & _
                "." & _
                Right("00" & _
                Hex(Asc(Mid$(strHeader, 5, 1))), 2)
            ' Bit 6 (value 64) flags an extended header.
            booExtendedHeader = _
                ((Asc(Mid$(strHeader, 6, 1)) And 64) = 64)
            lngHeaderSize = _
                SynchSafeToLong(Mid$(strHeader, 7, 4))
            Debug.Print MP3File
            Debug.Print "Tag version = " & strTagVersion
            Debug.Print "Tag " & IIf(booExtendedHeader, _
                "has", "does not have") & _
                " an extended header."
            Debug.Print "The tag is " & lngHeaderSize & _
                " bytes (+ 10 bytes for the header)"
You now know how much more of the file you actually need to work with. Note I've never encountered an extended header in any of the MP3 files I've checked, so I'm ignoring that possibility. If you want to be thorough, you should check the value of booExtendedHeader, and set strActualTags to start after the extended headers:
            strActualTags = _
                Mid$(strBuff, 11, lngHeaderSize)
            lngActualTags = Len(strActualTags)
Start reading at the beginning of the tag information. I've found that all my tags include some nulls (Chr$(0)) at the end of the actual tags, so I check that the next four characters aren't 4 null characters. If they aren't, I assume it's a valid tag and determine the size of the frame. Remember I mentioned earlier that ID3v2.3 and ID3v2.4 use different approaches to storing the frame size, so I actually look at the version (determined above) when I'm determining the size of the frame. The strTagVersion will be 03.xx for ID3v2.3 tags, or 04.xx for ID3v2.4 tags, so I just check the second character in the string:
            lngCurrLocn = 1
            Do While lngCurrLocn < lngActualTags
                strCurrTag = _
                    Mid$(strActualTags, lngCurrLocn, 4)
                If strCurrTag <> String(4, 0) Then
                    Select Case Mid$(strTagVersion, 2, 1)
                        Case "3"
                            lngFrameSize = _
                                NonSynchSafeToLong( _
                                Mid$(strActualTags, _
                                lngCurrLocn + 4, 4))
                        Case "4"
                            lngFrameSize = _
                                SynchSafeToLong( _
                                Mid$(strActualTags, _
                                lngCurrLocn + 4, 4))
                        Case Else
                            ' Other versions lay out frames differently,
                            ' so stop rather than misread them.
                            Debug.Print "Unsupported tag version " & _
                                strTagVersion
                            Exit Do
                    End Select
                    Debug.Print "Frame for " & _
                        strCurrTag & _
                        " starts at " & lngCurrLocn & _
                        " and is " & lngFrameSize & _
                        " bytes (+ 10 bytes for the header)."
As I mentioned, all tags starting with T contain text, so I'll actually report on the text contained in the frame. Remember that text frames have an additional byte (representing text encoding) as the first byte after the frame header, so the text starts 11 bytes after the start of the frame and has a length of 1 less than the size indicated in the frame header. All tags starting W are Web references. Because Web references don't get represented in other languages, there's no text encoding byte, so the text for the URL actually starts 10 bytes after the start of the frame and has a length of the size indicated in the frame header:
                    If Left$(strCurrTag, 1) = "T" Then
                        Debug.Print "Text encoding is " & _
                            Asc(Mid$(strActualTags, _
                            lngCurrLocn + 10, 1))
                        Debug.Print "Tag value is " & _
                            TrimNull(Mid$(strActualTags, _
                            lngCurrLocn + 11, _
                            lngFrameSize - 1))
                    ElseIf Left$(strCurrTag, 1) = "W" Then
                        Debug.Print "Tag value is " & _
                            TrimNull(Mid$(strActualTags, _
                            lngCurrLocn + 10, _
                            lngFrameSize))
                    Else
                    End If
That's it. You've read the tag! Increment lngCurrLocn to point to the start of the next frame, and start all over again with the next tag.
                    lngCurrLocn = _
                        lngCurrLocn + 10 + lngFrameSize
Remember I indicated that all the tags I've examined have Null characters at the end of the tag. I haven't found any explanation as to why, but I'm not losing any sleep over it. What I do, though, is ensure any superfluous bytes in the tag are strictly Nulls. This code loops through the remaining bytes and checks to see whether any of them are non-null:
                Else
                    booNonNull = False
                    For lngLoop = _
                        lngCurrLocn To lngActualTags
                        strCurrChar = _
                            Mid$(strActualTags, lngLoop, 1)
                        If strCurrChar <> Chr$(0) Then
                            booNonNull = True
                            Exit For
                        End If
                    Next lngLoop
                    Debug.Print "There are " & _
                        (lngActualTags - lngCurrLocn + 1) & _
                        " characters after the last tag, " & _
                        IIf(booNonNull, _
                        "not all of which are null.", _
                        "all of which are null.")
                    Exit Do
                End If
            Loop
If the first 3 characters of the file weren't ID3, write out a message to this effect:
        Else
            Debug.Print MP3File & " does not " & _
                "have a valid ID3V2 tag."
        End If
Finally, for the sake of completeness, write out a message if the file passed to the routine doesn't exist.
    Else
        Debug.Print MP3File & " does not exist."
    End If
End Sub
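As noted before the routine, there's no real need to read the entire file into the buffer. The following is a minimal sketch of the alternative: read the 10-byte header, work out the tag size, and read only that many bytes. The function name ReadID3v2TagBody is hypothetical, and the sketch returns an empty string when the file is missing or has no ID3v2 tag rather than printing a message.

```vba
' Hypothetical helper: returns the ID3v2 tag body (everything after the
' 10-byte header), or an empty string when no ID3v2 tag is found.
Function ReadID3v2TagBody(ByVal MP3File As String) As String
    Dim intFile As Integer
    Dim intLoop As Integer
    Dim lngByte As Long
    Dim lngSize As Long
    Dim strBody As String
    Dim strHeader As String * 10

    ReadID3v2TagBody = ""
    If Len(Dir$(MP3File)) = 0 Then Exit Function
    If FileLen(MP3File) < 10 Then Exit Function

    intFile = FreeFile
    Open MP3File For Binary Access Read As #intFile
    Get #intFile, 1, strHeader
    If StrComp(Left$(strHeader, 3), "ID3", vbBinaryCompare) = 0 Then
        ' Decode the synchsafe size held in header bytes 7-10.
        For intLoop = 7 To 10
            lngByte = Asc(Mid$(strHeader, intLoop, 1))
            lngSize = lngSize * 128 + (lngByte And 127)
        Next intLoop
        ' Only read if the claimed size fits inside the file.
        If lngSize > 0 And lngSize <= FileLen(MP3File) - 10 Then
            strBody = Space$(lngSize)
            Get #intFile, 11, strBody
            ReadID3v2TagBody = strBody
        End If
    End If
    Close #intFile
End Function
```

The returned string plays the same role as strActualTags in the routine above, so the frame-reading loop could work on it unchanged.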
Complete
I bet you had no idea there was that much information stored in your MP3 files. I hope you can use the information in this article to uncover the metadata in your MP3 files. Happy listening!
Doug Steele has worked for many years with databases on mainframes and PCs. He has been recognized as an Access MVP for his contributions to Microsoft-sponsored newsgroups. http://I.Am/DougSteele AccessHelp@rogers.com
ARTICLE INFO
Web Edition: 2004 Week 49, Doc #14849
Keyword Tags: Audio, Data Integration, Database, Database Management, Digital Media, Microsoft, Microsoft Access, Microsoft Office, Microsoft Windows, MP3, Music, VBA, VBA - Visual Basic for Applications