The canonical mind-control task is to operate a thought-mediated computer cursor to hit targets on a screen (that’s true for brain-computer interfaces anyway, but maybe not for Jedi). The idea is that if someone who can’t use their body in the typical way could operate a computer cursor with their thoughts, then it would open the full screen-based digital cornucopia to them along with a modicum of newfound independence. To test the algorithms translating between thoughts and cursor movement (“decoders”), the setup is for the user to pilot a cursor to targets on a screen. The more hits they get, the better the decoder.
As simple as that sounds, though, making fair comparisons between decoders has become basically impossible because no two cursor tasks are ever the same. A change as simple as the number of onscreen targets (i.e., potential mistakes you could make) might be something one could adjust for across studies with measures like bitrate, but there is a bottomless latrine of bespoke variations to draw from to suit the needs of each particular study. There are so many different knobs that could be turned to make the cursor task harder or easier that it is hard to see how any battery of measures could level the playing field in a coherent and mathematically rigorous way. Allow me to list a few:
List of Commonly Tuned Cursor Task Parameters
- Number of simultaneously displayed targets
- The more targets available the more chances you have to make a mistake by hitting the wrong one. Normally center-out tasks (moving from the workspace center to targets on a ring) display only one at a time, while keyboard tasks have a grid of keys always available for pressing – rightly or wrongly.
- Number of possible targets
- The more places targets could appear in the workspace, the harder the task becomes by forcing you to learn a more complete representation of the territory. This is potentially independent of how many targets are shown at once because any subset of targets could be displayed at any time at the whim of the study coordinator.
- Size of targets
- The smaller the target is as a percentage of the available workspace the harder it is to get your cursor into that area.
- Density of targets
- The more densely packed the targets the more precise you have to be to hit the one you’re aiming for (and avoid the rest).
- Typical inter-target distance
- The farther you typically have to travel across the workspace to hit a new target, the harder the task will be, because more travel breeds more opportunity for heading errors and intervening false-positive selections.
- Workspace boundaries
- When your cursor runs off the available workspace, what happens? Does the cursor continue at escape velocity into an extended workspace that exists but isn’t shown to the user (hardest)? Does the cursor enter the extended workspace but you at least have an indicator about what side of the screen and/or how far off it you’ve traveled (slightly less hard than hardest)? Does the workspace have toroidal edges that wrap back on themselves Asteroids-style (easier)? Is there a well-defined boundary so you can’t ever leave the visual workspace (easiest)?
- Hit requirements
- What do you have to do to hit a target? The longer you need to hold the cursor inside the target to register a hit, the harder the task will be. If you also must stay within the target continuously (as opposed to accumulating the hold time across disjoint visits), it will be harder still. Occasionally studies will make the actual (secret) target bigger than the displayed target, giving you more tolerance for cursor position error to register a hit, which I presume makes things easier but it isn’t obvious. There are interaction effects as well: the larger the target the more likely you can simply pass through it without stopping (or turning around inside it) and still register a hit, making things much easier (since stopping is notoriously difficult in BCI continuous velocity control). Or perhaps you have a “neural click” feature, and there is no hold time at all – aarrgghh!
- Collisions or obstacles
- If there’s anything between you and the target that necessitates navigating around, things will only get harder. The more interference and the narrower the path the harder it becomes.
- Timeout periods
- If the task puts no time limit on how long you can flounder around before things get reset for the next attempt, then it will be harder because you have more time to make new errors (unless there is some kind of normalization to time spent, like in the bitrate measure sketched just below this list).
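As a rough sketch (and only a sketch, with hypothetical field names of my own invention), most of the knobs above could be bundled into a single task config, and the bitrate measure I keep invoking is commonly computed in the cursor-BCI literature along these lines: the information per selection, scaled by net correct selections per unit time.

```python
from dataclasses import dataclass
from math import log2

@dataclass
class CursorTaskConfig:
    """Hypothetical bundle of the knobs listed above (names are illustrative)."""
    n_displayed_targets: int      # targets shown on screen at once
    n_possible_targets: int       # distinct locations a target could occupy
    target_size_pct: float        # target size as a percentage of the workspace
    typical_target_dist: float    # typical travel distance between targets
    hold_time_ms: float           # dwell time required inside a target
    continuous_hold: bool         # must the dwell be uninterrupted?
    bounded_workspace: bool       # can the cursor leave the visible workspace?
    timeout_s: float | None       # None = flounder as long as you like

def achieved_bitrate(n_possible_targets: int, correct: int,
                     incorrect: int, elapsed_s: float) -> float:
    """One common 'achieved bitrate' formulation (bits per second):
    log2(N - 1) * max(correct - incorrect, 0) / elapsed time."""
    return log2(n_possible_targets - 1) * max(correct - incorrect, 0) / elapsed_s
```

Note how many of the config fields above never enter the bitrate formula at all; that gap is exactly the comparability problem.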
One solution to clean up this issue would be for an intrepid soul to dedicate a study (or two) to testing all these possibilities, then distilling the results down into a few fundamental principles, though I doubt there is anything as satisfying or elegant as the speed-accuracy tradeoff waiting at the end of that road.
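For reference, the speed-accuracy tradeoff in pointing tasks is usually formalized as Fitts’ law, where movement time grows with a log-scaled ratio of travel distance to target width. A minimal sketch, with placeholder constants:

```python
from math import log2

def fitts_movement_time(distance: float, target_width: float,
                        a: float = 0.1, b: float = 0.15) -> float:
    """Fitts' law, Shannon formulation: MT = a + b * log2(D/W + 1).
    a and b are empirically fit constants; the defaults here are placeholders."""
    index_of_difficulty = log2(distance / target_width + 1)  # in bits
    return a + b * index_of_difficulty
```

Something that compact, covering the whole list of knobs above, is the dream; I’m just not holding my breath.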
Perhaps a better solution is to have a “benchmark block” in cursor studies at the end of the last intervention that runs a standard N trials on a grid of X targets of size Y% of the workspace with a hold requirement of Z milliseconds, with …. well, you see where I’m going. In this way, we would at least have a task with an unobtrusive number of trials that calibrates our understanding of how difficult the main study task was against our expectations on this benchmark. Eventually, once there has been buy-in, a meta-analysis across many studies that contain both the in-house favorite cursor task and the benchmark block under paired conditions (e.g., decoders) would yield a huge set of difficulty-calibrated data from which to extract general principles of how we should understand the meta-behavior of cursor control tasks across parameter space. So much the better if the task would dynamically scale in difficulty as the user logged more hits, accommodating a wide range of decoders without having performance saturate at 0% (too hard) or 100% (too easy), either of which narrows the range where we could make meaningful comparisons.
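To make the dynamic-difficulty idea concrete, here is one way it could work; this is a sketch under my own assumptions (all thresholds and step sizes are made up), not a proposal for the standard. A simple staircase shrinks the target when the recent hit rate climbs too high and grows it when performance sags, keeping scores off the 0% and 100% ceilings where comparisons go to die.

```python
def update_target_size(target_size_pct: float, recent_hit_rate: float,
                       easy_threshold: float = 0.85, hard_threshold: float = 0.60,
                       step: float = 1.1, min_pct: float = 1.0,
                       max_pct: float = 20.0) -> float:
    """Staircase adjustment of target size between trials.
    Thresholds, step, and bounds are placeholders, not proposed standards."""
    if recent_hit_rate > easy_threshold:
        target_size_pct /= step   # user is cruising: make targets smaller
    elif recent_hit_rate < hard_threshold:
        target_size_pct *= step   # user is struggling: make targets bigger
    return max(min_pct, min(target_size_pct, max_pct))
```

A staircase like this converges on whatever target size keeps the hit rate between the two thresholds, which is exactly the regime where decoder comparisons remain informative.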
Or alternatively, we could all just keep using different task parameters so there is always plausible deniability for why this or that intervention worked for us but not them.
Has anyone proposed parameters for a benchmark block, as you suggest? That seems like such an obvious and necessary step. For whatever reason, your description of the challenge brought to mind White Balancing in photography. It would be amazing to establish the 18% gray of digital telekinesis tasks.