D3, Conceptually

Lesson 3: (Moderately) Advanced Data

originally published 13 december, 2012


3.1 Dearly Data'd Descendants
I know, I know: I promised at the end of Lesson 1 you would get to see some crazy animations soon. Well, soon isn't yet. Soon is in the future. Now, sadly, we get to spend an entire lesson not making eyeballs bleed and potential romantic partners swoon. We shall, merely, take a deep dive into data (sorry to the rebels; there will be no sidetrack into lore).
Back in the first tutorial, there was a short example involving sugar, spice, and appending nodes. Let's revisit it, but as a data-driven example using what we've learned in previous lessons:
    <style type='text/css'>
      .thing { border: 1px solid black; margin: 5px; padding: 5px; width: 200px; }
      .thing.nice { background: #dfd; }
      .thing.icky { background: #fdd; }
    </style>
    <div id='chart'></div>

    var things = [
      {name: 'Sugar', isNice: true},
      {name: 'Spice', isNice: true},
      {name: 'Toe Fungus', isNice: false},
    ];
    d3.select('#chart').selectAll('div.thing')
        .data(things).enter()
      .append('div')
        .classed({
          'thing': true,
          'nice': function(d) { return d.isNice; },
          'icky': function(d) { return !d.isNice; },
        })
        .text(function(d) { return d.name; });
Not the same as before, but close. We use one new feature - the classed operator. This is a shorthand for accessing the classList property on nodes (or an equivalent if the browser is old). Usually, setting the class through an attr call is enough, but when setting multiple independent classes based on data, classed proves convenient.
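To make the object form a little more concrete, here is a plain-JavaScript sketch of the decision being made for each node. It is only a model: classesFor is a made-up helper, not part of D3, and it simply assumes that each value in the map is either a constant or a function of the datum, as in the example above.

```javascript
// Hypothetical helper, not part of D3: given a class map like the one passed
// to classed, work out which class names a node with datum `d` ends up with.
function classesFor(classMap, d) {
  return Object.keys(classMap).filter(function (name) {
    var value = classMap[name];
    // Values may be constants or functions of the datum, as in the example.
    return typeof value === 'function' ? Boolean(value(d)) : Boolean(value);
  });
}

var classMap = {
  thing: true,
  nice: function (d) { return d.isNice; },
  icky: function (d) { return !d.isNice; }
};

var sugarClasses = classesFor(classMap, { name: 'Sugar', isNice: true });
var fungusClasses = classesFor(classMap, { name: 'Toe Fungus', isNice: false });

console.log(sugarClasses.join(' '));  // thing nice
console.log(fungusClasses.join(' ')); // thing icky
```

Every node gets 'thing'; which of 'nice' or 'icky' it gets depends only on its own datum.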
Now, we can't leave this as-is – our viewer might be colorblind! By adding a little description to each div, we can help set that right. We can easily add a description field to each of our data objects, but how to get that into a new div as a child of the thing? With append, of course!
    <style type='text/css'>
      .thing { border: 1px solid black; margin: 5px; padding: 5px; width: 200px; }
      .thing .description { margin-left: 1em; color: #666; }
      .thing.nice { background: #dfd; }
      .thing.icky { background: #fdd; }
    </style>
    <div id='chart'></div>

    var things = [
      {name: 'Sugar', isNice: true, description: 'A tasty, sweet thing.'},
      {name: 'Spice', isNice: true, description: 'A tasty, savory thing.'},
      {name: 'Toe Fungus', isNice: false, description: 'A not so tasty, pungent thing.'},
    ];
    // The selection gets its own name so it does not shadow the data array.
    var thingDivs = d3.select('#chart').selectAll('div.thing')
        .data(things).enter()
      .append('div')
        .classed({
          'thing': true,
          'nice': function(d) { return d.isNice; },
          'icky': function(d) { return !d.isNice; },
        })
        .text(function(d) { return d.name; });
    // One description div per thing, appended as a child of each thing's div.
    thingDivs.append('div')
        .attr('class', 'description')
        .text(function(d) { return d.description; });
Magic! The div.description nodes that were appended never had .data called but somehow still have data bound (unless the nodes were instead touched by His Noodly Appendage, but I doubt that). I'm going to go out on a limb here and say it isn't magic. Rather pedestrian, and expected, behavior, sadly. D3 automatically propagates data from parent nodes to child nodes. Whenever you use append, the new node inherits the bound data from its parent. We can go deeper, even:
    <div id='chart'></div>

    var nums = [5, 2, 8];
    var numDivs = d3.select('#chart').selectAll('div.num')
        .data(nums).enter()
      .append('div')
        .attr('class', 'num')
        .text(function(d, i) { return 'First-level! data:' + d + ', index: ' + i; });
    var subNumDivs = numDivs.append('div')
        .text(function(d, i) { return ' Second-level! data:' + d + ', index: ' + i; });
    subNumDivs.append('div')
        .text(function(d, i) { return '  THIRD-level! data:' + d + ', index: ' + i; });
In addition to inheriting the data of the parent, the index is also inherited.
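If you want to convince yourself there is no magic involved, the inheritance can be imitated in a few lines of plain JavaScript. This is only a sketch under assumptions: makeNode and appendModel are made-up helpers, and plain objects stand in for DOM nodes. The one detail borrowed from D3 itself is that a node's bound datum lives in its __data__ property.

```javascript
// Plain-object stand-ins for DOM nodes; the bound datum sits in __data__.
function makeNode(tag, datum) {
  return { tag: tag, __data__: datum, children: [] };
}

// Hypothetical model of append on a flat selection: one child per parent,
// each child copying its parent's datum and keeping its parent's position.
function appendModel(parents, tag) {
  return parents.map(function (parent) {
    var child = makeNode(tag, parent.__data__);
    parent.children.push(child);
    return child;
  });
}

var firstLevel = [5, 2, 8].map(function (n) { return makeNode('div', n); });
var secondLevel = appendModel(firstLevel, 'div');
var thirdLevel = appendModel(secondLevel, 'div');

// Three levels down, the data and the indices are still those of the parents.
var report = thirdLevel.map(function (node, i) {
  return 'data: ' + node.__data__ + ', index: ' + i;
});
console.log(report);
```

Because each appended node sits in the same position as its parent, the index comes along for free.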
3.2 Data in your data
The last section may have raised an interesting question. Or, potentially, not. It wasn't exactly obvious. The question is: if we can inherit data, can we also bind new data? Let's try!
    <div id='chart'></div>

    var nums = [5, 2, 8];
    var numDivs = d3.select('#chart').selectAll('div.num')
        .data(nums).enter()
      .append('div')
        .attr('class', 'num')
        .text(function(d, i) { return 'First-level! data:' + d + ', index: ' + i; });
    var subNumDivs = numDivs.append('div')
        .data(42)
        .text(function(d, i) { return ' Second-level! data:' + d + ', index: ' + i; });
Hrm. That didn't work. Wait... data calls we have seen previously all take an array; we only gave it a number.
    <div id='chart'></div>

    var nums = [5, 2, 8];
    var numDivs = d3.select('#chart').selectAll('div.num')
        .data(nums).enter()
      .append('div')
        .attr('class', 'num')
        .text(function(d, i) { return 'First-level! data:' + d + ', index: ' + i; });
    var subNumDivs = numDivs.append('div')
        .data([42])
        .text(function(d, i) { return ' Second-level! data:' + d + ', index: ' + i; });
A little better - we gave it one number, and we appear to have one thing modified. But wait - shouldn't there have been an error? Or some sort of... badness? We had 3 parent nodes, but only specified a single datum. Those other two nodes should be, I don't know, something? Well, nope, D3 doesn't care if you provide fewer data than elements (or more, as you've seen with the selectAll-data-enter pattern). This may come in handy later, just saying.
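One rough way to picture what happens when the lengths don't match, assuming the default join by index (no key function), is the sketch below. joinByIndex is a hypothetical helper written for this illustration; it is not how D3 is implemented, and strings stand in for nodes.

```javascript
// Hypothetical model of an index-based data join: datum i is paired with
// node i, and whatever is left over on either side lands in "enter"
// (data without nodes) or "exit" (nodes without data).
function joinByIndex(nodes, data) {
  var shared = Math.min(nodes.length, data.length);
  return {
    update: nodes.slice(0, shared).map(function (node, i) {
      return { node: node, datum: data[i] };
    }),
    enter: data.slice(shared),
    exit: nodes.slice(shared)
  };
}

// Three nodes, one datum: only the first node is paired with data.
var tooFewData = joinByIndex(['div#a', 'div#b', 'div#c'], [42]);
console.log(tooFewData.update.length, tooFewData.exit.length); // 1 2

// No nodes, three data: the selectAll-data-enter situation.
var tooFewNodes = joinByIndex([], [5, 2, 8]);
console.log(tooFewNodes.enter.length); // 3
```

Operators chained after data only touch the paired nodes, which is why a single div got its text changed.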
If you examine the resulting HTML, you'll note each div.num has a div child, but only one of them has any text. Aha! If only we put three elements in that array, then everything should work. Right...?
    <div id='chart'></div>

    var nums = [5, 2, 8];
    var numDivs = d3.select('#chart').selectAll('div.num')
        .data(nums).enter()
      .append('div')
        .attr('class', 'num')
        .text(function(d, i) { return 'First-level! data:' + d + ', index: ' + i; });
    var subNumDivs = numDivs.append('div')
        .data([42, 43, 44])
        .text(function(d, i) { return ' Second-level! data:' + d + ', index: ' + i; });
Right! Elated with your success, one might show off in that way programmers like to show off - doing something outdated and also trendy. In other words, using table elements, but using them as intended and not for layout hacks.
    <div id='chart'></div>

    var tableRows = [
      [1, 2, 3],
      [4, 5, 6],
      [7, 8, 9],
    ];
    var table = d3.select('#chart').append('table');
    var trs = table.selectAll('tr')
        .data(tableRows).enter()
      .append('tr');
    trs.append('td')
        .text(function(d) { return 'TD - my data is ' + d; })
        .data(/* harumph... what can we put in here? */);
Well, snap, that didn't work. Two problems: we don't know what to put in the data call, and we only have three tds. In order to solve these most annoying problems with our matrix, you'll need to take the red pill and see how deep the data rabbit hole goes.
3.3 The Red Pill
So, shall we blow your mind? Maybe? Maybe it isn't actually that impressive. Maybe I should just go home and stop writing... or not. I've really got nothing better to do, honestly.
    <style type='text/css'>
      .position { color: #666; }
    </style>
    <ul id='ship-list'></ul>

    var ships = [
      {
        'name': 'Enterprise',
        'crew': [
          {name: 'Picard', position: 'Captain'},
          {name: 'Riker', position: 'Tool'},
          {name: 'Data', position: 'Android'},
          {name: 'Worf', position: 'Security Officer'},
          {name: 'Troi', position: 'Counselor'},
        ],
      },
      {
        'name': 'Millennium Falcon',
        'crew': [
          {name: 'Han Solo', position: 'Pilot'},
          {name: 'Chewie', position: 'Pilot'},
        ],
      },
    ];
    // One li per starship, labeled with the ship's name.
    var shipItems = d3.select('#ship-list').selectAll('li.starship')
        .data(ships).enter()
      .append('li')
        .attr('class', 'starship')
        .text(function(starshipData) { return starshipData.name; });
    // A ul inside each ship's li; it inherits the ship's datum.
    var shipCrewList = shipItems.append('ul')
        .attr('class', 'crew-list');
    // Join each ship's crew array to the li.crew-member children of its ul.
    var crewMembers = shipCrewList.selectAll('li.crew-member')
        .data(function(starshipData) { return starshipData.crew; }).enter()
      .append('li')
        .attr('class', 'crew-member');
    crewMembers.append('span')
        .attr('class', 'name')
        .text(function(crewData) { return crewData.name; });
    crewMembers.append('span')
        .attr('class', 'position')
        .text(function(crewData) { return ' [' + crewData.position + ']'; });
So. Yeah. That. Not sure what I'm talking about, with all the code above? Well, mostly these lines:
    var crewMembers = shipCrewList.selectAll('li.crew-member')
        .data(function(starshipData) { return starshipData.crew; }).enter()
Two things immediately pop out - data is a function, not a constant. And selectAll is called on something already selectAll-d. Far out! Never before have we seen these things. If you need to take a break, sit down, cool off, think about what you've just seen - go right ahead. I'm not going anywhere.
Back? Great.
The easiest one to explain is the data call. Just like attr and style calls you have seen before, data can also take a function. Works just like the others, too. Not much to explain there, really. Each node has data, and we can bind data based on it.
The harder part is the line before that – the selectAll called on an existing multi-element selection. It works kind of like you might expect when you're not binding data. Consider the following example, with no messy data to get in the way:
    <div class='outer'>
      <div class='inner'></div>
      <div class='inner'></div>
    </div>
    <div class='outer'>
      <div class='inner'></div>
      <div class='inner'></div>
      <div class='inner'></div>
    </div>

    var outerDivs = d3.selectAll('div.outer');
    outerDivs.insert('div', '.inner')
        .text(function(d, outerIndex) { return 'I\'m outer div number ' + outerIndex + '!'; });
    var innerDivs = outerDivs.selectAll('div.inner')
        .text(function(d, innerIndex) { return '»I\'m inner div number ' + innerIndex + '!'; });
The insert method used in the example is new. It takes a first argument of a node name, like append. A second argument is also required: a selector string. For each element in the selection, insert creates a node and inserts it as a child of element before the node returned by element.querySelector(selector) – in the example, inserting a new div before the first child element with a class of "inner". We use it to append some text to a node without removing the child elements, as the text operator clears any content—including children—of the node.
The example is either a surprise to you, dear reader, or completely expected behavior. After all, if one can write d3.select(...).selectAll(...), why couldn't one write d3.selectAll(...).selectAll(...)? After all, select and selectAll both return d3.selection objects, and selection objects have both selection methods. Why not descend into meme-land and put a selectAll in our selectAll? I do believe, in fact, to have overheard a discussion in which it was insinuated you enjoyed selections. I hope I was not mistaken; that would be most embarrassing.
Yes, yes, back to the matter at hand—in the example, the line d3.selectAll('div.outer') results in a selection that, conceptually, looks like this:
    <div class='outer'>
    <div class='outer'>

Where the line d3.selectAll('div.outer').selectAll('div.inner') results in a selection that, conceptually, looks like:
    <div class='inner'>
    <div class='inner'>

    <div class='inner'>
    <div class='inner'>
    <div class='inner'>

There are two groups, one per outer div, and each group holds the inner divs of its outer div. An operation on this nested selection modifies all the inner nodes, just like an operation on the outer selection modifies all the outer nodes. Each group of nodes, however, is treated as its own list for numbering. This is why, in the code example, the second set of inner nodes claims to have numbers starting again at 0.
Or, in more detail:
    <style type='text/css'>
      button { display: block; width: 400px; }
      .outer,.inner { margin: 5px; padding: 2px; border: 1px solid black; display: inline-block; color: #666; }
      .selected { color: black; border: 1px solid green; }
    </style>
    <div class='outer'>
      <div class='inner'>not selected</div><br>
      <div class='inner'>not selected</div><br>
    </div><br>
    <div class='outer'>
      <div class='inner'>not selected</div><br>
      <div class='inner'>not selected</div><br>
      <div class='inner'>not selected</div><br>
    </div><br>
    <div id='buttons'></div>

    var options = [
      'd3.selectAll(\'div.inner\')',
      'd3.selectAll(\'div.outer\').selectAll(\'div.inner\')',
      'd3.select(\'div.outer\').selectAll(\'div.inner\')',
      'd3.selectAll(\'div.outer\').select(\'div.inner\')',
    ];
    d3.select('#buttons').selectAll('button')
        .data(options).enter()
      .append('button')
        .text(String)
        .on('click', function(d) {
          // Reset every inner div before showing the new selection.
          d3.selectAll('div.inner')
              .text('not selected')
              .classed('selected', false);
          // Each button's datum is the selection expression to evaluate.
          var nodes = eval(d);
          nodes
              .text(function(d, index) { return 'node index ' + index; })
              .classed('selected', true);
        });
We now have the two pieces of the puzzle. Ready for the red pill? d3.selection objects always contain an array of arrays. Really, try it out in your JS console. If you type:
    d3.selectAll('div')
you will see the following value printed (numbers changed to protect the innocent):
    [Array[3]]
And if you type:
    d3.selectAll('div').selectAll('div')
you will see something like:
    [Array[1], Array[1], Array[2]]
No difference. None. Your very first selectAll introduced nesting, and you just didn't bother to check. Shows how much you care.
If a D3 selection contains Array[s0], Array[s1], ..., Array[sn] and a selectAll is run on it, the resulting selection will contain s0 + s1 + ... + sn arrays, one per element of the original selection. Arrays 0 to s0 - 1 hold the matched sub-nodes for the elements of the original first array, arrays s0 to s0 + s1 - 1 those for the elements of the second original array, and so on. In essence, selectAll replaces each inner array with one new array for each element in it.
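The rule can be tried out with plain arrays, no browser required. What follows is a sketch under assumptions, not D3's code: selectAllModel and makeDiv are made-up helpers, nodes are plain objects, and the model only looks at direct children, whereas the real selectAll matches any descendant.

```javascript
// A selection is modeled as an array of groups; each group is an array of
// nodes. Every node of every old group becomes its own group in the result.
function selectAllModel(groups, matches) {
  var result = [];
  groups.forEach(function (group) {
    group.forEach(function (node) {
      result.push(node.children.filter(matches));
    });
  });
  return result;
}

// Hypothetical stand-in for a div with a class and some children.
function makeDiv(cls, children) {
  return { cls: cls, children: children || [] };
}

var root = makeDiv('root', [
  makeDiv('outer', [makeDiv('inner'), makeDiv('inner')]),
  makeDiv('outer', [makeDiv('inner'), makeDiv('inner'), makeDiv('inner')])
]);

function isOuter(node) { return node.cls === 'outer'; }
function isInner(node) { return node.cls === 'inner'; }
function size(group) { return group.length; }

var rootSelection = [[root]];                        // one group, one node
var outers = selectAllModel(rootSelection, isOuter); // one group of two
var inners = selectAllModel(outers, isInner);        // two groups: 2 and 3

var outerSizes = outers.map(size);
var innerSizes = inners.map(size);
console.log(outerSizes, innerSizes); // [ 2 ] [ 2, 3 ]
```

The old selection had one group of two nodes, so the new one has two groups, matching the s0 + s1 + ... + sn count.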
3.4 What's data got to do with it?
Hopefully, now, the starship example from above makes more sense. If not, the next bit might help. If you think you understand it all perfectly well, I still suggest you keep reading. I mean, I'd hate those words to go to waste. Plus, it's important stuff. Let's look a little more closely at what happens in a data join of nested selections.
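[Interactive step-through of a data join on a nested selection. Its nodes are labeled, in order: outer 1, inner 1, inner 2, outer 2, inner 1.]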
As you can see, data binding works exactly the same for a nested selection as it does for a normal one. Each selected node gets data, and each parent node receives placeholder children for every datum without a matching node. You could, if you really wanted, do this one node at a time by iterating over the outer nodes using d3.selection.each. But I wouldn't recommend it. It gets messy, and we don't like messes.
Note the format of data calls for a nested selection: the argument is a function to give children a new set of data, usually based on the parent's data. If you use a literal like [42, 8], all children will get the same data. Sometimes you may want this, but it is rare. Maybe now we can write the table example from before?
    <style type='text/css'>
      table { margin: 5px; border: 1px solid #666; border-radius: 10px; border-top-color: transparent; border-bottom-color: transparent; }
    </style>
    <div id='chart'></div>

    var tableRows = [
      [1, 2, 3],
      [4, 5, 6],
      [7, 8, 9],
    ];
    var table = d3.select('#chart').append('table');
    var trs = table.selectAll('tr')
        .data(tableRows).enter()
      .append('tr');
    trs.selectAll('td')
        // Each datum of the outer array is itself an array -
        // The first one is [1, 2, 3], the second [4, 5, 6], etc.
        // One can just use those arrays as the data for the next level.
        .data(Object).enter()
      .append('td')
        .text(String);
Or, the above example as one of those nifty step-through examples I spent so much time writing, so dammit I'm going to use them again:
Make sense? If not, please don't continue. Read it again, step through the examples again, play around in a sandbox, do something to make it click. Maybe even message your dear author, or a friend that knows D3. If you don't get this bit, you are going to constantly scratch your head. Become one with the data, and it will reward you. Battle the data at your own peril.
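If a sandbox without a DOM helps, the nested join can be imitated with plain arrays. This is a sketch under stated assumptions, not D3's implementation: nestedEnter is a made-up helper, and it assumes every parent starts with no matching children, so every datum ends up in the enter set (as in the table example).

```javascript
// `parents` is the list of parent data; `dataFor` plays the role of the
// function handed to data(). It is called once per parent with that
// parent's datum and index, and returns the data for that parent's children.
function nestedEnter(parents, dataFor) {
  return parents.map(function (parentDatum, j) {
    // String mirrors the .text(String) call in the table example.
    return dataFor(parentDatum, j).map(String);
  });
}

var tableRows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]];

// A function argument: each row hands its own array down to its cells.
var perRow = nestedEnter(tableRows, function (row) { return row; });

// A literal, modeled as a function that ignores the parent:
// every row gets the same cells.
var sameEverywhere = nestedEnter(tableRows, function () { return [42, 8]; });

console.log(perRow);
console.log(sameEverywhere);
```

The first call yields three rows of three different cells; the second yields three identical rows, which is the "rarely what you want" case.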
3.5 Gettin' ij-y with it
I, honestly, don't have much additional content to add that wouldn't open a can of worms and require a whole lesson to explain. But I really liked the name of this section, and I wanted to use it. So, one last thing:
All of our attribute functions have taken the form function(d, i). There is actually a third argument to the function - j. Where i is the index within a set of data, j is the index of the parent within its own set of data.
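A small plain-JavaScript sketch of where i and j come from, using the array-of-arrays picture from earlier. eachModel is a hypothetical helper, and the groups hold data directly rather than nodes. It follows the (d, i, j) convention described above; as far as I know, later major versions of D3 pass something else as the third argument, so treat this as a model of the version this lesson was written against.

```javascript
// groups: an array of groups, each group an array of data.
// fn is called like an operator function: datum, index within the group,
// and index of the group itself.
function eachModel(groups, fn) {
  groups.forEach(function (group, j) {
    group.forEach(function (d, i) {
      fn(d, i, j);
    });
  });
}

var seen = [];
eachModel([['Picard', 'Riker'], ['Chewie']], function (d, i, j) {
  seen.push(d + ' i=' + i + ' j=' + j);
});
console.log(seen);
```

Note how i starts over in the second group while j moves on to 1.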
The next lesson will feature charts, again. Not like this lesson. How could it be so chartless? Those animations, which I bet you've been dying for, will also make an appearance.