README.md


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715

# Ink Stroke Modeler

This library smooths raw freehand input and predicts the input's motion to
minimize display latency. It turns noisy pointer input from touch/stylus/etc.
into the beautiful stroke patterns of brushes/markers/pens/etc.

Be advised that this library was designed to model handwriting, and as such,
prioritizes smooth, good-looking curves over precise recreation of the input.

This library is designed to have minimal dependencies; the library itself relies
only on the C++ Standard Library and [Abseil](https://abseil.io/), and the tests
use the [GoogleTest](https://google.github.io/googletest/) framework.

## Usage

The Ink Stroke Modeler API is in the namespace `ink::stroke_model`. The primary
surface of the API is the `StrokeModeler` class, found in `stroke_modeler.h`.
You'll also simple data types and operators in `types.h`, and the parameters
which determine the behavior of the model in `params.h`.

To begin, you'll need choose parameters for the model (the parameters for your
use case may vary; see `params.h`), instantiate a `StrokeModeler`, and set the
parameters via `StrokeModeler::Reset()`:

```c++
StrokeModelParams params{
    .wobble_smoother_params{
        .timeout{.04}, .speed_floor = 1.31, .speed_ceiling = 1.44},
    .position_modeler_params{.spring_mass_constant = 11.f / 32400,
                             .drag_constant = 72.f},
    .sampling_params{.min_output_rate = 180,
                     .end_of_stroke_stopping_distance = .001,
                     .end_of_stroke_max_iterations = 20},
    .stylus_state_modeler_params{.max_input_samples = 20},
    .prediction_params = StrokeEndPredictorParams()};

StrokeModeler modeler;
if (absl::Status status = modeler.Reset(params); !status.ok()) {
  // Handle error.
}
```

You may also call `Reset()` on an already-initialized `StrokeModeler` to set new
parameters; this also clears the in-progress stroke, if any. If the `Update` or
`Predict` functions are called before `Reset`, they will return
`absl::FailedPreconditionError`.

To use the model, pass in an `Input` object to `StrokeModeler::Update()` each
time you recieve an input event:

```c++
Input input{
  .event_type = Input::EventType::kDown,
  .position{2, 3},  // x, y coordinates in some consistent units
  .time{.2},  // Time in some consistent units
  // The following fields are optional:
  .pressure = .2,  //  Pressure from 0 to 1
  .orientation = M_PI  // Angle in plane of screen in radians
  .tilt = 0,  // Angle elevated from plane of screen in radians
};
absl::StatusOr<std::vector<Result>> result = modeler.Update(input);
if (status.ok()) {
  std::vector<Result> smoothed_input = *result;
  // Do something with the result.
} else {
  absl::Status error_status = result.status();
  // Handle error.
}
```

`Input`s are expected to come in a *stream*, starting with a `kDown` event,
followed by zero or more `kMove` events, and ending with a `kUp` event. If the
modeler is given out-of-order inputs or duplicate inputs, it will return
`absl::InvalidArgumentError`.

The `time` values of the returned `Result` objects start at the `time` of the
first input and end either at the time of the last `Input` (for in-progress
strokes) or a hair after (for completed ones). The position of the first
`Result` exactly matches the position of the first `Input` and the `position` of
the last `Result` attempts to very closely match the match the `position` of the
last `Input`, wile the `Result` objects in the middle have their `position`
adjusted to smooth out and interpolate between the input values.

To construct a prediction, call `StrokeModeler::Predict()` while an input stream
is in-progress:

```c++
if (absl::StatusOr<std::vector<Result>> = modeler.Predict()) {
  std::vector<Result> smoothed_input = *result;
  // Do something with the result.
} else {
  absl::Status error_status = result.status();
  // Handle error.
}
```

If no input stream is in-progress, it will instead return
`absl::FailedPreconditionError`.

When the model is configured with `StrokeEndPredictorParams`, the last `Result`
returned by `Predict` will have a `time` slightly after the most recent `Input`.
(This is the same as the points that would be added to the `Result`s returned by
`Update` if the most recent input was `kUp` instead of `kMove`.) When the model
is configured with `KalmanPredictorParams`, the prediction may take more time to
"catch up" with the position of the last input, then take an additional interval
to extend the stroke beyond that position.

The state of the most recent stroke can be represented by concatenating the
vectors of `Result`s returned by a series of successful calls to `Update`,
starting with the most recent `kDown` event. If the input stream does not end
with a `kUp` event, you can optionally append the vector of `Result`s returned
by the most recent call to `Predict`.

## Implementation Details

The design goals for this library were:

*   Minimize overall latency.
*   Accurate reproduction of intent for drawing and handwriting (preservation of
    inflection points).
*   Smooth digitizer noise (high frequency).
*   Smooth human created noise (low and high frequency). For example:
    *   For most people, drawing with direct mouse input is very accurate, but
        looks bad.
    *   For most people, drawing with a real pen looks bad. We should capture
        some of the differences between professional and amateur pen control.

At a high level, we start with an assumption that all reported input events are
estimates (from both the person and hardware). We keep track of an idealized pen
constrained by a physical model of motion (mass, friction, etc.) and feed these
position estimates into this idealized pen. This model lags behind reported
events, as with most smoothing algorithms, but because we have the modeled
representation we can sample its location at arbitrary and even future times.
Using this, we predict forward from the delayed model to estimate where we will
go. The actual implementation is fairly simple: a fixed timestep Euler
integration to update the model, in which input events are treated as spring
forces on that model.

Like many real-time smoothing algorithms, the stroke modeler lags behind the raw
input. To compensate for this, the stroke modeler has the capability to predict
the upcoming points, which are used to "catch up" to the raw input.

### The Model

Typically, we expect the stroke modeler to receive raw inputs in real-time; for
each input it receives, it produces one or more modeled results.

When asked for a prediction, the stroke modeler computes it based on its current
state -- this means that the prediction is invalid as soon as the next input is
received.

> NOTE: The definitions and algorithms in this document are defined as though
> the input streams are given as a complete set of data, even though the engine
> actually receives raw inputs one by one. It's written this way for ease of
> understanding.
>
> However, one of the features of these algorithms is that any result that has
> been constructed will remain the same even if more inputs are added, i.e. if
> we view the algorithm as a function $$F$$ that maps from an input stream to a
> set of results, then $$F(\{i_0, \ldots, i_j\}) \subseteq F(\{i_0, \ldots, i_j,
> i_{j + 1}\})$$. Applying the algorithm in real-time is then effectively the
> same as evaluating the function on increasingly large subsets of the input
> stream, i.e. $$F(\{i_0\}), F(\{i_0, i_1\}), F(\{i_0, i_1, i_2\}), \ldots$$

##### Math Stuff

**Definition**: We define the *pressure* $$\sigma$$ as the amount of force which
is applied to the writing surface with the stylus, as reported by the platform.
We expect $$\sigma \in [0, 1]$$ for devices that report pressure, with 0 and 1
representing the minimum and maximum detectable pressure, or $$\sigma =
\emptyset$$ if the device does not report pressure. $$\square$$

**Definition**: We define the *tilt* $$\theta$$ as the angle between the stylus,
and a vector perpendicular to the writing surface. We expect $$\theta \in [0,
\frac{\pi}{2}]$$. $$\square$$

**Definition:** We define the *orientation* $$\phi$$ as the counter-clockwise
angle between the projection of the stylus onto the writing surface, and the
writing surface's x-axis. We expect $$\phi \in [0, 2 \pi)$$. $$\square$$

NOTE: In the code, we use a sentinel value of -1 to represent unreported
pressure, tilt, or orientation.

**Definition:** A *raw input* is a `struct` $$\{p, t, \sigma, \theta, \phi\}$$,
where $$p$$ is a `Vec2` position, $$t$$ is a `Time`, $$\sigma$$ is a `float`
representing the pressure, $$\theta$$ is a `float` representing tilt, and
$$\phi$$ is a `float` representing orientation. For any raw input $$i$$, we
denote its position, time, pressure, tilt, and orientation as $$i^p$$, $$i^t$$,
$$i^\sigma$$, $$i^\theta$$, and $$i^\phi$$ respectively. $$\square$$

**Definition:** An *input stream* is a finite sequence of raw inputs, ordered by
time strictly non-decreasing, such that the first raw input is the `kDown`, and
each subsequent raw input is a `kMove`, with the exception of the final one,
which may alternatively be the `kUp`. Additionally, we say that a *complete
input stream* is a one such that the final raw input is the `kUp`. $$\square$$

**Definition:** A *tip state* is a `struct` $$\{p, v, t\}$$, where $$p$$ a
`Vec2` position, $$v$$ is a `Vec2` velocity, and $$t$$ is a `Time`. Similarly to
raw input, for a tip state $$q$$, we denote its position, velocity, and time as
$$q^p$$, $$q^v$$, and $$q^t$$, respectively. $$\square$$

**Definition:** A *stylus state* is a `struct` $$\{\sigma, \theta, \phi\}$$,
where $$\sigma$$ is a `float` representing the pressure, $$\theta$$ is a `float`
representing tilt, and $$\phi$$ is a `float` representing orientation. As above,
for a stylus state $$s$$, we denote its pressure, tilt, and orientation as
$$s^\sigma$$, $$s^\theta$$, and $$s^\phi$$ respectively. $$\square$$

**Definition:** We define the *rounding function* $$R$$ such that $$R(x) =
\begin{cases} \lfloor x \rfloor, & x - \lfloor x \rfloor \leq \frac{1}{2} \\
\lceil x \rceil & o.w. \end{cases}$$. $$\square$$

**Definition:** We define the *linear interpolation function* $$L$$ such that
$$L(a, b, \lambda) = a + \lambda (b - a)$$. $$\square$$

**Definition:** We define the *normalization function* $$N$$ such that $$N(a, b,
\lambda) = \frac{\lambda - a}{b - a}$$. $$\square$$

**Definition:** We define the *clamp function* $$C$$ such that $$C(a, b,
\lambda) = \max(a, \min(b, \lambda))$$. $$\square$$

#### Wobble Smoothing

The positions of the raw input tend to have some noise, due to discretization on
the digitizer or simply an unsteady hand. This noise produces wobbles in the
model and prediction, especially at slow speeds.

To reduce high-frequency noise, we take a time-variant moving average of the
position instead of the position itself -- this serves as a low-pass filter.
This, however, introduces some lag between the raw input position and the
filtered position. To compensate for that, we use a linearly-interpolated value
between the filtered and raw position, based on the moving average of the
velocity of the input.

##### Math Stuff

**Algorithm #1:** Given an input stream $$\{i_0, i_1, \ldots, i_n\}$$, a
duration $$\Delta$$ over which to take the moving average, and minimum and
maximum velocities $$v_{min}$$ and $$v_{max}$$ we construct the noise-filtered
input stream $$\{i_0', i_1', \ldots, i_n'\}$$ as follows:

> NOTE: The values of $$\Delta$$, $$v_{min}$$, and $$v_{max}$$ are taken from
> `WobbleSmootherParams`:
>
> *   $$\Delta$$ = `timeout`
> *   $$v_{min}$$ = `speed_floor`
> *   $$v_{max}$$ = `speed_ceiling`

For each raw input $$i_j$$, we can then find the set of raw inputs that occurred
within the previous duration $$\Delta$$:

$$I_j^{<\Delta} = \left\{i_k \middle| i_j^t - i_k^t \in [0, \Delta]\right\}$$

We then calculate the moving averages of position $$\overline p_j$$ and velocity
$$\overline v_j$$:

$$\begin{align*}
\overline p_j & = \begin{cases}
i_j^p, & j = 0 \vee j = n \\
\frac{1}{\|I_j^{<\Delta}\|} \sum\limits_k i_k^p
\left[i_k \in I_j^{<\Delta} \right],
& \mathrm{o.w.} \end{cases} \\
\overline v_j & = \begin{cases}
0, & j = 0 \\
\frac{1}{\|I_j^{<\Delta}\|} \sum\limits_k \frac{\|i_k^p - i_{k - 1}^p\|}
{i_k^t - i_{k - 1}^t} \left[i_k \in I_j^{<\Delta} \right],
& \mathrm{o.w.}
\end{cases}
\end{align*}$$

NOTE: In the above equations, the square braces denote the
[Iverson bracket](https://en.wikipedia.org/wiki/Iverson_bracket).

We then normalize $$\overline v_j$$ between $$v_{min}$$ and $$v_{max}$$, and
find the filtered position $$p'_j$$ by linearly interpolating between $$i_j^p$$
and $$\overline p_j$$:

$$\begin{align*}
\lambda_j & = \begin{cases}
0, & \overline v_j < v_0 \\
1, & \overline v_j > v_1 \\
N(v_{min}, v_{max}, v_j), & \mathrm{o.w.}
\end{cases} \\
p'_j & = L\left(\overline p_j, i_j^p, \lambda_j\right)
\end{align*}$$

Finally, we construct raw input $$i'_j = \left\{p'_j, i_j^t, i_j^\sigma\right\}.
\square$$

### Resampling

Input packets may come in at a variety of rates, depending on the platform and
input device. Before performing position modeling, we upsample the input to some
minimum rate.

##### Math Stuff

**Algorithm #2:** Given an input stream $$I = \{i_0, i_1, \ldots, i_n\}$$, and a
maximum time between inputs $$\Delta_{max}$$, we construct the upsampled input
stream $$U \supseteq I$$ as follows:

NOTE: The value of $$\Delta_{target}$$ is `1 / SamplingParams::min_output_rate`.

We construct a sequence $$\{a_1, a_2, \ldots, a_n\}$$ such that $$a_j = \left
\lceil \frac{i_j^t - i_{j - 1}^t}{\Delta_{target}} \right \rceil$$, representing
the required number of interpolations between each pair of raw inputs.

We then construct a sequence of sequences $$\{u_1, u_2, \ldots, u_n\}$$ such
that $$u_j = \{L(i_{j - 1}, i_j, \frac{k}{a_j} \, | \, k = 1, 2, \ldots,
a_j\}$$. Finally, we construct the upsampled input stream by concatenation:
$$U = \{i_0, u_1, u_2, \ldots, u_n\}$$. $$\square$$

### Position Modeling

The position of the pen is modeled as a weight connected by a spring to an
anchor. The anchor moves along the resampled raw inputs, pulling the weight
along with it across a surface, with some amount of friction. We then use Euler
integration to solve for the position of the pen.

NOTE: For simplicity, we say that the spring has a relaxed length of zero, and
that the anchor is unafected by outside forces such as friction or the drag from
the pen.

![Position Model Diagram](./position_model.svg){height="200"}

##### Math Stuff

**Lemma:** We can model the position of the pen with the following ordinary
differential equation:

$$\frac{d^2s}{dt^2} = \frac{k_s}{m_{pen}}(\Phi(t) - s(t)) - k_d \frac{ds}{dt}$$

where $$t$$ is the time, $$s(t)$$ is the position of the pen, $$\Phi(t)$$ is the
position of the anchor, $$m_{pen}$$ is a the mass of the pen, $$k_s$$ is the
spring constant, and $$k_d$$ is the drag constant.

**Proof:** From the positions of the pen and the anchor, we can use Hooke's Law
to find the force exerted by the spring:

$$F = k_s (\Phi(t) - s(t))$$

Combining the above with Newton's Second Law of Motion, we can then find the
acceleration of the pen $$a(t)$$:

$$F = m_{pen} \cdot a(t) \\
a(t) = \frac{k_s}{m_{pen}}(\Phi(t) - s(t))$$

We also apply a drag to the pen, applying an acceleration of $$-k_dv(t)$$, where
$$v(t)$$ is the velocity of the pen. This gives us:

$$a(t) = \frac{k_s}{m_{pen}}(\Phi(t) - s(t)) - k_dv(t)$$

NOTE: This isn't exactly how you would calculate friction of an object being
dragged across a surface. However, if you assume that $$m_{pen} = 1$$, then it
matches the form of the equation for a sphere moving through a viscous fluid.
See [Kinetic Friction](https://en.wikipedia.org/wiki/Friction#Kinetic_friction)
and [Stokes' Law](https://en.wikipedia.org/wiki/Stokes%27_law).

From the equations of motion, we know that $$v(t) = \frac{ds}{dt}$$ and $$a(t) =
\frac{dv}{dt} = \frac{d^2s}{dt^2}$$. By substitution, we arrive at the ordinary
differential equation stated above. $$\square$$

**Algorithm #3:** Given an incomplete input stream $$\{i_0, i_1, \ldots,
i_n\}$$, initial position $$s_0$$, and initial velocity $$v_0$$, we construct
tip states $$\{q_0, q_1, \ldots\}$$ as follows:

NOTE: When modeling a stroke, we choose $$s_0 = i_0^p$$ and $$v_0 = 0$$, such
that pen is at stationary at the first raw input point. The `StrokeEndPredictor`
also makes use of this algorithm with different initial values, however.

We define the function $$\Phi$$ such that $$\forall j, \Phi(i_j^t) = i_j^p$$.
From above, we know that we can model the position $$s(t)$$ with an ordinary
differential equation:

$$\frac{d^2s}{dt^2} = \frac{k_s}{m_{pen}}(\Phi(t) - s(t)) - k_d \frac{ds}{dt}$$

Because the spring constant and mass are so closely coupled in our model, we
combine them into a single "shape mass" constant $$M = \frac{m_{pen}}{k_s}$$,
giving us:

$$\frac{d^2s}{dt^2} = \frac{\Phi(t) - s(t)}{M} - k_d \frac{ds}{dt}$$

NOTE: The values of $$M$$ and $$k_d$$ are
`PositionModelerParams::spring_mass_constant` and
`PositionModelerParams::drag_constant`, respectively.

We define $$s_j = s(i_j^t)$$, $$v_j = \frac{ds}{dt}(i_j^t)$$, and $$a_j =
\frac{d^2s}{dt^2}(i_j^t)$$, and use the given $$s_0$$ and $$v_0$$. Recalling
that $$\Phi(i_j^t) = i_j^p$$, we can iteratively construct the each following
position:

$$\begin{align*}
\Delta_j & = i_j^t - i_{j - 1}^t \\
a_j & = \frac{i_j^p - s_{j - 1}}{M} - k_d v_{j - 1} \\
v_j & = v_{j - 1} + \Delta_j a_j \\
s_j & = s_{j - 1} + \Delta_j v_j
\end{align*}$$

Finally, we construct tip state $$q_j = \{s_j, i_j^t, v_j\}.\square$$

#### Stroke End

The position modeling algorithm, like many real-time smoothing algorithms, tends
to trail behind the raw inputs by some distance. At the end of the stroke, we
iterate on the position modeling algorithm a few additional times, using the
final raw input position as the anchor, to allow the stroke to "catch up."

##### Math Stuff

**Algorithm #4:** Given the final anchor position $$\omega$$, the final tip
state $$q_{final}$$, the maximum number $$K_{max}$$ of end-of-stroke tip states
to create, the initial time between tip states $$\Delta_{initial}$$, and a
stopping distance $$d_{stop}$$, we construct the next tip state $$q_{next}$$ as
follows:

> NOTE: The values of $$\Delta_{initial}$$, $$K_{max}$$, and $$d_{stop}$$ are
> taken from `SamplingParams`:
>
> *   $$\Delta_{initial}$$ = `min_output_rate`
> *   $$K_{max}$$ = `end_of_stroke_max_iterations`
> *   $$d_{stop}$$ = `end_of_stroke_stopping_distance`

Let $$\Delta_{next} = \Delta_{initial}$$. Using the update equations from
**algorithm #3**, we construct a candidate tip state $$q_c$$:

$$\begin{align*}
a & = \frac{\omega - q_{final}^p}{M} - k_d q_{final}^v \\
v & = q_{final}^v + \Delta_{next} a \\
s & = q_{final}^p + \Delta_{next} v \\
q_c & = \{s, q_{final}^t + \Delta_{next}, v\}
\end{align*}$$

If the distance traveled $$\|q_c - q_{final}\|$$ is less than $$d_{stop}$$, we
discard the candidate, stop iterating, and declare the stroke complete. Further
iterations are unlikely to yield any significant progress.

We then construct the segment $$S$$ connecting $$q_{final}^p$$ and $$q_c^p$$,
and find the point $$S_\omega$$ on $$S$$ nearest to $$\omega$$. If $$S_\omega
\neq q_c^p$$, it means that the candidate has overshot the anchor. We discard
the candidate, and find a new candidate using a smaller timestep, setting
$$\Delta_{next}$$ to one half its previous value. This is repeated until either
we find a candidate that does not overshoot the anchor, or the minimum distance
traveled check fails.

Once we have found a suitable candidate, we let $$q_{next} = q_c$$.

If the remaining distance $$\|\omega - q_{next}^p\|$$ is greater than
$$d_{stop}$$ and the total number of iterations performed, including discarded
candidates, is not greater than $$K_{max}$$, we use this same process to
construct an additional tip state for the end-of-stroke, substituting
$$q_{next}$$ and $$\Delta_{next}$$ for $$q_{final}$$ and $$\Delta_{initial}$$,
respectively. $$\square$$

### Prediction

As mentioned above, the position modeling algorithm tends to trail behind the
raw inputs. In addition to correcting for this at the end-of-stroke, we
construct a set of predicted tip states every frame to present the illusion of
lower latency.

The are two predictors: the `StrokeEndPredictor` (found in
`internal/prediction/stroke_end_predictor.h`) uses a slight variation of the
end-of-stroke algorithm, and the `KalmanPredictor` (found in
`internal/prediction/kalman_predictor.h`) uses a Kalman filter to estimate the
current state of the pen tip and construct a cubic prediction.

Regardless of the strategy used, the predictor is fed each position and
timestamp of raw input as it is received by the model. The model's most recent
tip state is given to the predictor when the predicted tip states are
constructed.

#### Stroke End Predictor

The `StrokeEndPredictor` constructs a prediction using **algorithm #4** to model
what the stroke would be if the last raw input had been a `kUp`, letting
$$\omega$$ be the most recent tip state, and $$q_{final}$$ be the last raw
input.

#### Kalman Predictor

The `KalmanPredictor` uses a pair of
[Kalman filters](https://en.wikipedia.org/wiki/Kalman_filter), one each for the
x- and y-components, to estimate the current state of the pen from the given
inputs.

The Kalman filter implementation operates on the assumption that all states have
one unit of time between them; because of this, we need to rescale the
estimation to account for the actual input and output rates. Additionally, our
experiments suggest that linear predictions look and feel better than non-linear
ones; as such, we also provide weight factors that may be used to reduce the
effect of the acceleration and jerk coefficients on the estimate and prediction.

Once we have the estimated state, we construct the prediction in two parts, each
of which is a sequence of tip states described by a cubic polynomial curve. The
first smoothly connects the last tip state to the estimate state, and the second
projects where the expected position of the pen is in the future.

##### Math Stuff

**Algorithm #5**: Given a process noise factor $$\epsilon_p$$ and measurement
noise factor $$\epsilon_m$$, we construct the Kalman filter as follows:

The difference in time between states, $$\Delta_t$$, is always assumed to be
equal to 1.

The *state transition model* is derived from the equations of motion: given
$$s$$ = position, $$v$$ = velocity, $$a$$ = acceleration, and $$j$$ = jerk, we
can find the new values $$s'$$, $$v'$$, $$a'$$, and $$j'$$:

$$\begin{align*}
s' & = s + v \Delta_t + \frac{a \Delta_t^2}{2} + \frac{j \Delta_t^3}{6} \\
v' & = v + a \Delta_t + \frac{j \Delta_t^2}{2} \\
a' & = a + j \Delta_t \\
j' & = j \\
\end{align*}$$

The *process noise covariance* is modeled as noisy force; from the equations of
motion, we can say that the impact of that noise on each state is equal to:

$$\epsilon_p
\begin{bmatrix}\frac{\Delta_t^3}{6} \\ \frac{\Delta_t^2}{2} \\
\Delta_t \\ 1 \end{bmatrix}
\begin{bmatrix}\frac{\Delta_t^3}{6} & \frac{\Delta_t^2}{2} & \Delta_t & 1
\end{bmatrix}$$

The *measurement model* is simply $$\begin{bmatrix}1 & 0 & 0 & 0\end{bmatrix}$$,
representing the fact that we only observe the position of the input.

The *measurement noise covariance* is set to $$\epsilon_m$$. $$\square$$

> NOTE: The values of $$\epsilon_p$$ and $$\epsilon_m$$ are taken from
> `KalmanPredictorParams`:
>
> *   $$\epsilon_p$$ = `process_noise`
> *   $$\epsilon_m$$ = `measurement_noise`

**Algorithm #6:** Given the input stream $$\{i_0, i_1, \ldots, i_n\}$$, the
Kalman filter's estimation $$E_K = \{s_K, v_K, a_K, j_K\}$$, the number of
samples to use $$N_{max}$$, and weights $$w_a$$ and $$w_j$$ for the acceleration
and jerk, we construct the state estimate $$E = \{s, v, a, j\}$$ as follows:

We first determine the average amount of time $$\Delta_a$$ between the
$$N_{max}$$ most recent raw inputs:

$$\begin{align*}N_{actual} & = \max(n, N_{max}) \\
\Delta_a & = \frac{i_n^t - i_{n - N_{actual}}^t}{N_{actual}}
\end{align*}$$

We then construct the state estimate:

$$E = \left \{ s_K, \frac{v_K}{\Delta_a}, w_a \frac{a_K}{\Delta_a^2}, w_j \frac{j_K}{\Delta_a^3} \right \} \square$$

> NOTE: The values of $$N_{max}$$, $$w_a$$, and $$w_j$$ are taken from
> `KalmanPredictorParams`:
>
> *   $$N_{max}$$ = `max_time_samples`
> *   $$w_a$$ = `acceleration_weight`
> *   $$w_j$$ = `jerk_weight`

**Algorithm #7:** Given the last tip state produced $$q$$, the target time
between inputs $$\Delta_t$$, and the estimated pen position $$s_K$$ and velocity
$$v_K$$, we construct the connecting tip states $$\{p_1, p_2, \ldots\}$$ as
follows:

NOTE: The value of $$\Delta_t$$ is `1 / SamplingParams::min_output_rate`.

We first determine how many points to sample along the curve, based on the
velocities at the endpoints, and the distance between them:

$$n = \left \lceil \frac{\|s_K - q^p\|}{\Delta_t \max(\|q^v\|, \|v_K\|)} \right \rceil$$

From this, we say that the time when the connecting curve reaches the estimated
pen position is $$t_{end} = n \cdot \Delta_t$$. We then construct a cubic
polynomial function $$F_C$$ such that:

$$\begin{align*}
F_C(0) & = q^p \\
\frac{d F_C}{d t}(0) & = q^v \\
F_C(t_{end}) & = s_K \\
\frac{d F_C}{d t}(t_{end}) & = v_K
\end{align*}$$

NOTE: The full derivation of $$F_C$$ can be found in the comments of
`KalmanPredictor::ConstructCubicConnector`, in
`internal/prediction/kalman_predictor.cc`.

We construct the times at which to sample the curve $$t_i = i \cdot n \cdot
\Delta_t, i \in \{1, 2, \ldots, n\}$$, and construct the tip states $$p_i =
\{F_C(t_i), \frac{d F_C}{d t}(t_i), q^t + t_i\}$$. $$\square$$

**Algorithm #8:** Given the number of raw inputs received so far $$n$$, the last
raw input received $$i_R$$, the target prediction duration $$t_p$$, the
estimated state $$\{s_K, v_K, a_K, j_K\}$$, the desired number of samples
$$N_s$$, the maximum distance $$\varepsilon_{max}$$ between the estimated
position and the last observed state, the minimum and maximum speeds $$V_{min}$$
and $$V_{max}$$, maximum deviation from the linear prediction $$\mu_max$$, and
baseline linearity confidence $$B_\mu$$, we estimate our confidence $$\gamma$$
in the prediction using the following heuristic:

There are four coefficients that factor into the confidence:

*   A sample ratio $$\gamma_s$$, which captures how many raw inputs have been
    observed so far. The rationale for this coefficient is that, as we
    accumulate more observations, the error in any individual observation
    affects the estimate less.
*   An error coefficient $$\gamma_\varepsilon$$, based on the distance between
    the estimated state and the last observed state. The rationale for this
    coefficient is that it should be indicative of the variance in the
    distribution used for the estimate.
*   A speed coefficient $$\gamma_V$$, based on the approximate speed that the
    prediction travels at. The rationale for this coefficient is that when the
    prediction travels slowly, changes in direction become more jarring,
    appearing wobbly.
*   A linearity coefficient $$\gamma_\mu$$, based on how the much the cubic
    prediction differs from a linear one. The rationale for this coefficient is
    that testers have expressed that linear predictions look and feel better.

We first construct the final positions $$s_l$$ and $$s_c$$, of the linear and
cubic predictions, respectively:

$$\begin{align*}
s_l & = s_K + t_p v_K \\
s_c & = s_K + t_p v_K + \frac{1}{2} t_p^2 a_K + \frac{1}{6} t_p^3 j_K \\
\end{align*}$$

We then construct the coefficients as follows:

$$\begin{align*}
\gamma_s & = \frac{n}{N_s} \\
\gamma_\varepsilon & = C(1 - N(0, \varepsilon_{max}, \|s_K - i_R^p\|)) \\
\gamma_V & = C(N(V_{min}, V_{max}, \frac{\|s_c - s_K\|}{t_p})) \\
\gamma_\mu & = L(B_\mu, 1, C(0, 1, N(0, \mu_{max}, \|s_c - s_l\|))) \\
\end{align*}$$

Finally, we take the product of the coefficients to find the confidence
$$\gamma = \gamma_s \gamma_\varepsilon \gamma_V \gamma_\mu$$. $$\square$$

> NOTE: The value of $$t_p$$ = `KalmanPredictorParams::prediction_interval`, and
> the values of $$N_s$$, $$\varepsilon_{max}$$, $$V_{min}$$, $$V_{max}$$,
> $$\mu_max$$, and $$B_\mu$$ are taken from
> `KalmanPredictorParams::ConfidenceParams`:
>
> *   $$N_s$$ = `desired_number_of_samples`
> *   $$\varepsilon_{max}$$ = `max_estimation_distance`
> *   $$V_{min}$$ = `min_travel_speed`
> *   $$V_{max}$$ = `max_travel_speed`
> *   $$\mu_max$$ = `max_linear_deviation`
> *   $$B_\mu$$ = `baseline_linearity_confidence`

**Algorithm #9:** Given the target time between inputs $$\Delta_t$$, the time at
which prediction starts $$t_0$$, the target prediction duration $$t_p$$, the
confidence $$\gamma$$, and the estimated pen position $$s_K$$, velocity $$v_K$$,
acceleration $$a_K$$, and jerk $$j_K$$, we construct the prediction tip states
$$\{p_1, p_2, \ldots\}$$ as follows:

We first determine the number of tip states $$n$$ to construct:

$$n = \left \lceil \gamma \frac{t_p}{\Delta_t} \right \rceil$$

We then iteratively construct a cubic prediction of the state at intervals of
$$\Delta_t$$:

$$\begin{align*}
s_{i+1} & = s_i + \Delta_t v_i + \frac{1}{2} \Delta_t^2 a_i + \frac{1}{6} \Delta_t^3 j_i \\
v_{i+1} & = v_i + \Delta_t a_i + \frac{1}{2} \Delta_t^2 j_i \\
a_{i+1} & = a_i + \Delta_t j_i \\
j_{i+1} & = j_i \\
\end{align*}$$

We let $$s_0 = s_K$$, $$v_0 = v_K$$, $$a_0 = a_K$$, and $$j_0 = j_K$$, and then
finally construct the tip states $$p_i = \{s_i, v_i, t_0 + i \Delta_t\}$$ for
$$i \in \{1, 2, \ldots, n\}$$. $$\square$$

### Stylus Modeling

Once the tip states have been modeled, we need to construct stylus states for
each of them. For input streams that contain stylus information, we find the
closest pair of sequential raw inputs within a set window, and interpolate
between them to get pressure, tilt, and orientation. For input streams that
don't contain any stylus information, we simply leave the pressure, tilt, and
orientation unspecified.

#### Math Stuff

**Algorithm #10:** Given an input stream $$\{i_0, i_1, \ldots, i_n\}$$ with
stylus information, a tip state $$q$$, and the number of raw inputs
$$n_{search}$$ to use as the search window, we construct the stylus state $$s$$
as follows:

NOTE: The value of $$n_{search}$$ is
`StylusStateModelerParams::max_input_samples`.

We first construct line segments $$\{s_1, s_2, \ldots, s_n\}$$ from the
positions of the input stream, such that $$s_j$$ starts at $$i_{j - 1}$$ and
ends at $$i_j$$.

We next find the index $$k > n - n_{search}$$ of the closest segment, such that
for all $$j > n - n_{search}, \mathrm{distance}(s_k, q^p) \leq
\mathrm{distance}(s_j, q^p)$$.

We define $$r$$ be the ratio of $$s_k$$'s length at which the closest point to
$$q^p$$ lies, and use this to interpolate between the stylus values of $$i_{k -
1}$$ and $$i_k$$, allowing for the orientation to "wrap" at $$0 = 2 \pi$$:

$$\begin{align*}
\rho & = L(i_{k - 1}^\rho, i_k^\rho, r) \\
\theta & = L(i_{k - 1}^\theta, i_k^\theta, r) \\
\phi & = \begin{cases}
L(i_{k - 1}^\phi + 2 \pi, i_k^\phi, r), & i_k^\phi - i_{k - 1}^\phi > \pi \\
L(i_{k - 1}^\phi, i_k^\phi + 2 \pi, r), & i_k^\phi - i_{k - 1}^\phi < -\pi \\
L(i_{k - 1}^\phi, i_k^\phi, r), & \mathrm{o.w.} \\
\end{cases} \\
\end{align*}$$

Finally, we construct the stylus state $$s = \{\rho, \theta, \mod(\phi, 2
\pi)\}$$. $$\square$$