We believe in a future where music will no longer be considered a linear composition but a dynamic structure, and where musical composition will extend to interaction. We also believe that the introduction of such media will blur the divisions between composer, performer, and audience.
Block Jam is a musical interface controlled by the arrangement of 25 tangible blocks. By arranging the blocks, users create musical phrases and sequences, and multiple users can play and collaborate. The system takes advantage of both graphical and tangible user interfaces. Each block has a visual display and combines a gestural input with a clickable input. Each block metaphorically contains a sound group that can be chosen via the gestural input, while the clickable input changes the block's function. Thus, musically complex and engaging configurations can be rapidly assembled. The tangible nature of the blocks and the intuitive interface promote face-to-face collaboration, and the presence of the GUI allows for remote collaboration across a network.
By creating both a tangible and a visual language, we are able to create a wide variety of meaningful musical structures in a novel and intuitive way that lends itself to collaboration and exploration, face to face or via a network, pushing interactive music towards the casual user.

Figure 1 Photograph showing arranged tangible blocks, note the different colors indicating the different sound groups.
Our goals for Block Jam are:
To reach these goals, an interface must be intuitive, simple, and engaging; hence our decision to use and extend the tangible interface paradigm.
Block Jam was influenced by the Triangles project [2] created at the MIT Media Lab Tangible Media Group. The Triangles project attempted to create a tangible interface where each triangular block held no meaning; meaning was derived from the topological structure of the information relative to the ascribed content. It was used to tell stories and control various media by connecting the triangles together. The resulting outcome was displayed on a separate display, creating a discrepancy between the display space and the input space.
Unlike Triangles, the Block Jam blocks have meaning. The cuboid shape was chosen because, when multiple cuboids are joined together, the shape implies direction along a given axis. Furthermore, each block, depending on its type and status, has varying explicit functions, such as controlling the direction of musical flow (for more details see the description of Interacting with Block Jam). Additionally, each block has a built-in display and interactive input.
Augmented Groove is a musical controller based on modulation through interaction with physical records, which were tracked using computer vision [1]. However, despite the engaging qualities of the interaction, the user was not able to compose their own musical sequences; the interaction was limited to modulation. The display in Augmented Groove was also separated from the input mechanism.
Musical Trinkets [4] is a tangible interface that used passive RFID tracking technology in multiple dimensions, allowing the user to issue musical commands, such as playing a note, via simple toy-like objects. Though engaging, the interface was not designed to create complex musical structures. Additionally, the objects did not interact functionally with one another.
There are also non-contact interfaces (see the review in [5]), such as gesture sensing and posture sensing (data gloves, etc.). Although very expressive, they suffer from a lack of haptic feedback and, as a result, a deficit in precise input. The musical results are improvisational and thus transient and difficult to reproduce or modify after the fact.
As we can see, while tangible interfaces provide tactile feedback and are easy to use, most previous systems took the instrument as their metaphor, resulting in the same problem as non-contact interfaces: the focus of interaction is performance. It can be difficult for users to dynamically modify, replay, or edit what they have created.
Unlike previous systems, Block Jam is a tangible interface that is aimed at creating musical structures assembled from simple blocks. The nature of the blocks allows the music to be modified, thus the creation of music is not static, allowing for creation and performance at the same time. Interestingly, because of the simplicity of the interface, if the blocks are disassembled, the same configuration can be easily recreated.
In designing Block Jam we tried to push the tangible interface further: combining a visual display with each block allows us to tightly couple the block's interactive input to its output.
A further innovation is the application of the E-Mesh technology created by Dr. Jun Rekimoto [3], which uses an array of capacitive sensors for multiple hand tracking. Block Jam uses a reduced version of this technology for gesture recognition.
All the blocks consist of:

Figure 3 photo of a Block showing the connectors on each side
Every block contains one group of sounds, known as a Sound Pack. There are two types of block: the Path Block and the Play Block. Blocks are arranged by snapping them together to form a structure.
A musical sequence originates from a Play Block. When the user presses the display, the block is activated and starts to play its sound. The sequence then jumps to the next block in the arrangement. A Path Block denotes which adjacent block the sequence will move to next. Thus, the sequence continues to jump from block to block until it reaches a location where no block is placed. When this occurs, the sequence jumps back to the Play Block it originated from, creating a loop.
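The traversal described above can be sketched in a few lines. This is only an illustrative model, not the Block Jam implementation: the grid coordinates, the four direction names, the `trace_loop` function, and the idea that the Play Block also carries an outgoing direction are all assumptions made for the example. A step limit is added because an arrangement of Path Blocks could otherwise send the sequence around in circles.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]

# Hypothetical direction vectors on a grid (x grows rightward, y grows downward).
DIRECTIONS: Dict[str, Position] = {
    "right": (1, 0),
    "left": (-1, 0),
    "up": (0, -1),
    "down": (0, 1),
}


def trace_loop(blocks: Dict[Position, str], start: Position,
               max_steps: int = 64) -> List[Position]:
    """Return the block positions visited in one pass of a sequence.

    `blocks` maps each occupied grid position to the direction in which
    that block hands the sequence on. `start` is the position of the
    originating Play Block (assumed here to have a direction as well).
    """
    if start not in blocks:
        raise ValueError("the sequence must start on a placed block")
    visited = [start]
    pos = start
    for _ in range(max_steps):
        dx, dy = DIRECTIONS[blocks[pos]]
        nxt = (pos[0] + dx, pos[1] + dy)
        if nxt not in blocks:
            # Empty location: the sequence returns to the Play Block,
            # so one pass of the loop is complete.
            break
        visited.append(nxt)
        pos = nxt
    return visited


# A 2x2 arrangement: the sequence runs right, down, left, then falls off the edge.
arrangement = {(0, 0): "right", (1, 0): "down", (1, 1): "left", (0, 1): "left"}
loop = trace_loop(arrangement, (0, 0))
```

Replaying `loop` repeatedly gives the looping behaviour described in the text; removing or re-pointing any block changes the path on the next pass.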

Figure 4 Using the click interface to choose a block's function
A Play Block is denoted by the play symbol shown on its LED display. A Path Block iconically shows the direction of the next jump. The directions that Path Blocks can specify are:
Clicking the block, as shown above, chooses its function.
Additionally, a Play Block stops a sequence and denotes the speed of a sequence. Because we have multiple Play Blocks, we can start multiple sequences at the same time, creating musical complexity.
As a sequence jumps from block to block, a sound is triggered. The sound played is determined by two variables: the Sound Pack chosen in each block and the speed of the sequence. The speed of a sequence is determined when the sequence originates at the Play Block, according to how long the user holds the block down during the click (see video). A sequence can travel at one of three speeds (slow, medium, or fast), which is visually represented by the LED display:
A Sound Pack contains three musical elements, one for each speed. Thus, the sound played from a given Sound Pack is appropriately mapped to the speed of the sequence. A fast sequence will trigger a short musical phrase or sample and by the same logic a slow sequence will play a slower (but semantically related) phrase or sample.
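As a rough illustration of how press duration, sequence speed, and Sound Pack content relate, consider the sketch below. The threshold values, the choice that a longer press gives a slower sequence, the `SoundPack` class, and the file names are all hypothetical; the text only states that press duration selects one of three speeds and that each Sound Pack holds one element per speed.

```python
from dataclasses import dataclass

SPEEDS = ("slow", "medium", "fast")

# Hypothetical thresholds in seconds; the real values are not given in the text.
SHORT_PRESS_S = 0.3
LONG_PRESS_S = 1.0


def speed_from_press(duration_s: float) -> str:
    """Map how long the Play Block was held down to a sequence speed.

    Assumption: a short press starts a fast sequence and a long press
    a slow one. The text does not say which way round the mapping goes.
    """
    if duration_s < SHORT_PRESS_S:
        return "fast"
    if duration_s < LONG_PRESS_S:
        return "medium"
    return "slow"


@dataclass(frozen=True)
class SoundPack:
    """One musical element per speed, as described for a Sound Pack."""
    name: str
    slow: str
    medium: str
    fast: str

    def element_for(self, speed: str) -> str:
        if speed not in SPEEDS:
            raise ValueError(f"unknown speed: {speed!r}")
        return getattr(self, speed)


# Hypothetical pack; the sample names are placeholders.
guitar = SoundPack("guitar", slow="g_slow.wav", medium="g_med.wav", fast="g_fast.wav")
sample = guitar.element_for(speed_from_press(0.1))
```

Keeping the three elements inside one pack means that a block never needs to know the sequence speed in advance: the speed is fixed at the Play Block, and each block along the path simply looks up the matching element.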
Because we have multiple Play Blocks and Path Blocks, we can have multiple sequences playing simultaneously (one sequence per Play Block). This, combined with the different sequence speeds, creates musical layering. Giving users the ability to layer musical sequences interactively increases their engagement and sense of control.
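One way to picture this layering is to merge the trigger events of several looping sequences onto a shared timeline. The sketch below is an assumption-laden model: the step lengths in `STEP_BEATS`, the `layer` function, and the block labels are hypothetical and are not taken from the Block Jam system.

```python
from typing import Dict, List, Tuple

# Hypothetical number of beats a sequence spends on each block, per speed.
STEP_BEATS = {"slow": 4, "medium": 2, "fast": 1}

Event = Tuple[int, str, str]  # (beat, sequence name, block label)


def layer(sequences: Dict[str, Tuple[str, List[str]]],
          total_beats: int) -> List[Event]:
    """Merge several looping sequences into one time-ordered event list.

    `sequences` maps a sequence name to (speed, blocks visited in one loop).
    """
    events: List[Event] = []
    for name, (speed, loop) in sequences.items():
        if not loop:
            continue  # nothing to play for an empty loop
        step = STEP_BEATS[speed]
        for i, beat in enumerate(range(0, total_beats, step)):
            # Each sequence wraps back to its first block when its loop ends.
            events.append((beat, name, loop[i % len(loop)]))
    return sorted(events)


timeline = layer(
    {"a": ("fast", ["A1", "A2"]), "b": ("medium", ["B1"])},
    total_beats=4,
)
```

In this model the fast sequence fires on every beat while the medium one fires on every second beat, so the two overlap on beats 0 and 2 and produce the layered texture described above.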

Figure 6 Using the gestural interface to choose a block's Sound Pack
A user can intuitively choose the Sound Pack in each block by using the gestural interface. Using a circular array of infrared opto-reflector sensors embedded in the display, we can detect simple gestures. In Block Jam, we detect a circular dial gesture that steps through the available Sound Packs. The packs are grouped into three colors: packs 1-5 are red, 6-10 are orange, and 11-15 are green. This makes it easier to identify which Sound Pack is currently in which block; for example, all the red blocks could contain guitar sounds, the orange blocks vocals, and the green blocks percussive sounds.
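The color grouping and the dial-style stepping can be expressed compactly. In the sketch below, the function names and the wrap-around from pack 15 back to pack 1 are assumptions; the pack ranges and colors are those given in the text.

```python
PACK_COUNT = 15
PACKS_PER_COLOR = 5
COLORS = ("red", "orange", "green")


def pack_color(pack: int) -> str:
    """Return the display color for a Sound Pack numbered 1 to 15."""
    if not 1 <= pack <= PACK_COUNT:
        raise ValueError(f"pack must be between 1 and {PACK_COUNT}")
    return COLORS[(pack - 1) // PACKS_PER_COLOR]


def dial(pack: int, steps: int) -> int:
    """Step through the packs by `steps` dial increments.

    Assumption: the selection wraps around in both directions.
    Negative `steps` models turning the dial the other way.
    """
    return (pack - 1 + steps) % PACK_COUNT + 1


# Turning the dial two steps from pack 4 crosses from red into orange.
new_pack = dial(4, 2)
new_color = pack_color(new_pack)
```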

Figure 8 The Graphical User Interface
Block Jam uses the same interactive block metaphor for both the tangible interface and the on-screen interface. When the tangible blocks are manipulated, the activity is mirrored on the on-screen interface.
The on-screen interface provides overall control of the system: it allows the user to load new Song Packs (collections of Sound Packs), to interact with other users across a network, and to use the system without the tangible blocks. With an on-screen version of the interface, the system can be represented on many platforms and networked into one group experience.
What will broadband bring to music? Will it give us more than the ability to buy, stream, and download music online?
We want to put the group experience back into music. We understand that the musical experience changes with technology. Musical technology allows greater control and more possibilities than ever before, and gives greater access to the beginner or novice. The way we receive and listen to music is also changing, not just in terms of Low-Fi to Hi-Fi or Phono to Tape to CD to MD to MP3, but also in terms of experience: music is moving from a social experience to a personal experience, from campfire to orchestra to living room to Walkman. Degrees of separation have opened up between the composer and the performer, and between the performer and the audience. This trend continues. Conversely, technology is moving towards community, towards the group, towards the network.
We would suggest a need for networked interactive music technology. We are not suggesting a musical toy or a novel musical instrument, but a discreet networked controller that gives multiple users the ability to control or modulate different elements of a single musical construct, creating a shared musical experience.
We envision a musical experience that can be shared equally by the novice and the musically adept, retaining a notion of author and composer, yet allowing the user enough creative flexibility to add their own stamp. When users' actions are contextualised by the actions of other users, the experience will become shared and, perhaps, not so far from the campfire of before.
If such a system comes about, perhaps it will change the way we define a musical experience, and hence the way we approach it creatively and technically. Will we necessarily record music in a linear fashion, or will we prepare it so that it renders in any number of ways? The notion of music as renderable media frees us from its linear constraints, as 3D rendering has freed the movie and game industries.
Perhaps, soon, listening to the Walkman won't be so lonely.