My newest problem is one that I knew I’d come across eventually: What do I do with songs that have missing data? This most recently came up when I was adapting Thomas Lidy’s rhythm feature code… It couldn’t open some of the MP3s in my test set, so I have no rhythm feature data for those songs. Lacking a better idea, I just gave them the mean values of all the other songs. But this doesn’t seem right… I can’t really justify giving them any particular value. But if I don’t give them values, the PCA can’t process these tracks; I can’t put them in the space at all.
So, not sure what to do about these songs. Anyone have any suggestions?
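For reference, the mean-fill workaround described above might look roughly like the following. This is a minimal sketch, not the actual thesis code: it assumes the feature data is a rectangular matrix (rows are tracks, columns are features) with missing values stored as NaN, and the names `MeanImputer`, `impute`, and `rowsWithMissing` are hypothetical. The second method records which tracks were filled in, so that they could at least be marked differently later.

```java
import java.util.Arrays;

/** Sketch of mean-filling missing feature values before PCA. All names are hypothetical. */
final class MeanImputer {

    /**
     * Returns a copy of data (rows = tracks, columns = features) in which each NaN
     * is replaced by the mean of the non-missing values in the same column.
     * A column with no observed values at all is left as NaN.
     */
    static double[][] impute(double[][] data) {
        int rows = data.length;
        int cols = rows == 0 ? 0 : data[0].length;
        double[] sum = new double[cols];
        int[] count = new int[cols];

        // First pass: accumulate per-column sums over the observed values only.
        for (double[] row : data) {
            for (int j = 0; j < cols; j++) {
                if (!Double.isNaN(row[j])) {
                    sum[j] += row[j];
                    count[j]++;
                }
            }
        }

        // Second pass: copy each row and fill the gaps with the column mean.
        double[][] out = new double[rows][];
        for (int i = 0; i < rows; i++) {
            out[i] = Arrays.copyOf(data[i], cols);
            for (int j = 0; j < cols; j++) {
                if (Double.isNaN(out[i][j]) && count[j] > 0) {
                    out[i][j] = sum[j] / count[j];
                }
            }
        }
        return out;
    }

    /** Returns true for each row (track) that has at least one missing value. */
    static boolean[] rowsWithMissing(double[][] data) {
        boolean[] flags = new boolean[data.length];
        for (int i = 0; i < data.length; i++) {
            for (double v : data[i]) {
                if (Double.isNaN(v)) {
                    flags[i] = true;
                    break;
                }
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        // Made-up numbers: the second and third tracks each lack one feature.
        double[][] features = {
            {1.0, 10.0},
            {3.0, Double.NaN},
            {Double.NaN, 30.0}
        };
        System.out.println(Arrays.deepToString(impute(features)));
        System.out.println(Arrays.toString(rowsWithMissing(features)));
    }
}
```

Mean-filling pulls a track toward the middle of the space along the filled axes, which is part of why it feels wrong; keeping the flags around at least makes it possible to tell filled-in tracks apart from measured ones.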
Does anyone know of some projects/papers on spatially-based organization of digital music collections? I’m trying to gather information for writing my thesis’s “Background” section.
Here’s what I have so far:
I am going to implement a mode in my interface that is completely free-form, allowing the user to assign tracks to locations however they want. Does anyone know of projects that allow for this kind of music library organization?
The responses I got to my previous post about evaluation requirements for my thesis pretty much boil down to this: I need to clarify what I hope my contribution will be before I can decide how to appropriately test it.
To make my goals clearer, here are the kinds of questions I’m trying to ask with this new interface:
- Does this spatial interface give a better understanding of the overall scope of your music?
- Is the music->visual mapping intuitive?
- To introduce the notion of “really fuzzy searching”… Is a spatial representation more appropriate than text-based lists for music browsing?
- Does the interface help you see/examine your listening patterns?
- Does the interface help you see the relationship between multiple people’s music libraries?
- Can you more easily find music of a certain style/type, or for a particular activity, than you can with a more traditional music browser?
- Can looking at your music and someone else’s music at the same time in the space help you find recommendations for new music?
- Does this interface change what things are important when it comes to looking at your library? Do you find that you are looking for, or thinking about, different sorts of things than you would with a traditional music browser?
- Does the interface make you more aware of the context for your music-browsing decisions?
I’m at a big decision point in my thesis. I have a very primitive music browser implemented in both 2D and 3D. I want to choose the number of dimensions (2 or 3) for my main project before I go much further in developing the interface. I just don’t have time to develop both.
My biggest concern: I had been pushing for a 3D interface throughout the proposal process, but I’m worried that continuing with it will force me in my remaining time to focus much more on elements of 3D interfaces (e.g. how to orient the user, how to show the overall cloud shape despite occlusion) than on elements core to my own thesis motivations (e.g. how to organize music, how to find patterns in music listening).
I think a 2D interface is currently easier to develop than a 3D one, and that perhaps I should focus on only two dimensions to have a better chance of making an interface that demonstrates all the things I had hoped to show (outlined in my proposal).
In the end, my thesis is not about interfaces; it is about the organizational model itself. That organizational model is the use of audio and contextual data to organize a music collection in a fuzzy manner that I think is more appropriate for this type of data, in addition to providing others with a framework to add onto it, both in terms of input features and output interface. This approach is in opposition to what we see in most music browsers (well, and data browsers in general), which limit organization to non-configurable lists and, ultimately, text labels.
So, my thesis work becomes: (1) an implementation of this organizational model, (2) made publicly available, along with (3) demonstration(s) of an interface built on top of the model. An analogy for this way of thinking is the Echo Nest’s recent announcement of their AudioAnalysis API. Last year, they built this tool (1) and made it available to others (2); it gave me numbers, and I built an interface on top of it (3). In this thesis, I am the one providing the numbers and letting others build interfaces on top.
Even though the main contribution is the model, I will demonstrate one such interface with a 2D representation of a music collection that is user-configurable and dynamically updated through RSS feeds.
Here are the main questions:
- Am I losing something integral to the project if I move down from three dimensions to two?
- Is this line of thinking (that my contribution is more an organizational model than an interface) too dangerous?
- Am I contributing enough?
“For the system you are designing, I would put an emphasis on first and foremost building a really killer application based on your underlying research, grounding this in a discussion of important precedents and your reasons for doing this work, following the design process and how important decisions were made, capped with your own critical analysis of the system, including what works best, what remains most problematic, and what major challenges remain. If you wanted to do some limited user testing, that would be fine, but for this particular project I would be more interested in your ideas and your evaluation.”
John, Paul, and Henry: Please let me know if you have any concerns about my plans for the evaluation part of my thesis work.
Well, the first passes at visualizing my test music libraries are promising. But it’s really hard to test when I can’t play the tracks from within the visualization (“Why is that track all the way out there?”). So I’m just going to jump in and start building the UI, including the 3D interface.
I am thinking that I will use Java 3D, with the special bonus of it already having spatial sound support. I’m imagining an interface somewhat like this virtual solar system, but with songs/playlists instead of planets/stars.
[Paul, you used Java 3D (well, jplot3d) for SITM, right? Do you think this is a good way for me to proceed? I know you've given me some code, but I think it will be faster for me to just build my own at this point, especially since the interface will be really different from SITM anyway.]
Any advice or cautions about Java 3D, or 3D interfaces in general, are most welcome.
What’s a good Java package for scatter plots? I just need to plot points with labels, no interactivity.
I’m getting more into my coding, and am now trying to answer the question, “What is a feature?” Specifically, what are the features that I can glean from the audio to get a meaningful distinction between one song and the next, and what is the general description of this thing I call a “feature”?
My project is not intended to be focused on figuring out or implementing these features; I am focused on bringing them together into a navigable representation. But in the design of the Feature class I’m finding myself wondering:
- Do features operate at the song level, or at the section level? I should have the ability to deal with either type, but then, I am sometimes mapping song sections, and sometimes whole songs. What do I show to the user in the interface if I’m only really mapping a section of a song?
- Should I try to choose one section to be representative of the piece as a whole, and just do my analysis on that section?
- What are the must-have features that people have already written code for and that can be easily adapted and plugged into my engine?
- What kind of rhythm-based features can I pull out? (I mention this because I am sorely lacking in the rhythm arena.)
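One possible way to handle the song-level versus section-level question is to let every feature answer at the song level, and have section-level measures reduce themselves to a single number per track. The sketch below is only an illustration of that idea under assumed names: `Section`, `Feature`, and `SectionAveragedFeature` are hypothetical, the section fields are placeholders for whatever the audio analysis provides, and the duration-weighted mean is just one of several reasonable reductions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Hypothetical section of a track, as produced by some upstream audio analysis. */
final class Section {
    final double durationSeconds;
    final double loudnessDb;

    Section(double durationSeconds, double loudnessDb) {
        this.durationSeconds = durationSeconds;
        this.loudnessDb = loudnessDb;
    }
}

/** A feature always yields one number per track, whatever it looks at internally. */
interface Feature {
    String name();

    double valueFor(List<Section> sections);
}

/** Turns a per-section measure into a song-level feature via a duration-weighted mean. */
final class SectionAveragedFeature implements Feature {
    private final String name;
    private final ToDoubleFunction<Section> measure;

    SectionAveragedFeature(String name, ToDoubleFunction<Section> measure) {
        this.name = name;
        this.measure = measure;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public double valueFor(List<Section> sections) {
        double weightedSum = 0.0;
        double totalDuration = 0.0;
        for (Section s : sections) {
            weightedSum += measure.applyAsDouble(s) * s.durationSeconds;
            totalDuration += s.durationSeconds;
        }
        // No sections (or zero total length) means there is nothing to average.
        return totalDuration == 0.0 ? Double.NaN : weightedSum / totalDuration;
    }
}

final class FeatureDemo {
    public static void main(String[] args) {
        // Made-up sections: a quiet 10-second intro and a louder 30-second body.
        List<Section> track = Arrays.asList(new Section(10.0, -20.0), new Section(30.0, -10.0));
        Feature meanLoudness = new SectionAveragedFeature("meanLoudness", s -> s.loudnessDb);
        System.out.println(meanLoudness.name() + " = " + meanLoudness.valueFor(track));
    }
}
```

With this shape, choosing one representative section instead of averaging would just be another `Feature` implementation, so the engine would not need to know which strategy a given feature uses.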
I will start with features like this (for each track):
- number of sections
- number of types of sections (counted by timbre type)
- number of types of sections (counted by pitch pattern type)
- mode of timbres
- mean level of specific timbre coefficients (coefficients shown visually at the bottom of this page)
- mean loudness (or max, maybe)
- confidence level of autotag assignment with tag1, tag2, tag3, etc… (multiple features here)
- frequency of appearance of tag assignment with tagA, tagB, tagC, etc… (multiple features here)
- time signature
- time signature stability
- track duration
(Note that, right now, I am not talking about similarity measures for pairs of songs, but rather quantifiable measures for one song at a time. I’ll deal with similarity later.)
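To make a few of the items in this list concrete, here is a rough sketch of how some of them could be computed for one track. It assumes the analysis already provides, for each section, a duration, a loudness value, and an integer timbre-type label; those inputs, the class name `TrackFeatureSketch`, and the feature keys are all hypothetical. The mean loudness here is a plain unweighted mean over sections.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Sketch: a handful of per-track features computed from per-section data. Names are hypothetical. */
final class TrackFeatureSketch {

    /**
     * @param durations   length of each section, in seconds
     * @param loudness    loudness of each section (units as given by the analysis)
     * @param timbreTypes an integer label per section identifying its timbre type
     * @return feature name mapped to value, in insertion order
     */
    static Map<String, Double> extract(double[] durations, double[] loudness, int[] timbreTypes) {
        int n = durations.length;
        if (loudness.length != n || timbreTypes.length != n) {
            throw new IllegalArgumentException("All per-section arrays must have the same length");
        }

        double totalDuration = 0.0;
        double loudnessSum = 0.0;
        double maxLoudness = Double.NEGATIVE_INFINITY;
        Set<Integer> distinctTimbres = new HashSet<>();

        for (int i = 0; i < n; i++) {
            totalDuration += durations[i];
            loudnessSum += loudness[i];
            maxLoudness = Math.max(maxLoudness, loudness[i]);
            distinctTimbres.add(timbreTypes[i]);
        }

        Map<String, Double> features = new LinkedHashMap<>();
        features.put("numSections", (double) n);
        features.put("numTimbreTypes", (double) distinctTimbres.size());
        // A track with no sections has no meaningful loudness, so report NaN.
        features.put("meanLoudness", n == 0 ? Double.NaN : loudnessSum / n);
        features.put("maxLoudness", n == 0 ? Double.NaN : maxLoudness);
        features.put("trackDuration", totalDuration);
        return features;
    }

    public static void main(String[] args) {
        // Made-up track: three sections, the first and last sharing a timbre type.
        double[] durations = {30.0, 60.0, 30.0};
        double[] loudness = {-10.0, -20.0, -12.0};
        int[] timbreTypes = {0, 1, 0};
        System.out.println(extract(durations, loudness, timbreTypes));
    }
}
```

Each value is a single number for a single track, in line with the note above; nothing here compares one song with another.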
Here are the slides from my presentation today.
Key points raised during the question session:
- Concern about a 3D interface… will it be too complicated? (Dave/Sajid)
- Is there a commonsense language for music? (John/Henry)
- How can the elements of soundsieve be maintained and built upon here, at this higher level? (Tod)
- Will moods be represented in the space? (Paul)
- Interesting social aspect possible. Maybe you can even “bump up against” someone else’s music, and influence movement/representation in their musical space. (Dave Merrill)