Optimizing Performance in Mobile Text Entry

two young adults hold smartphones in their hands

Contents:
  1. Overview
  2. Process
    1. Identify Features
    2. Explore and Prototype
    3. Evaluate and Iterate
  3. Deliverables, Insights, and Future Work


Summary: I developed a mobile keyboard that arranged characters according to their frequency within an English corpus (i.e., a collection of written texts representative of a language) composed of emails, tweets, and text messages. The goal was to minimize finger/thumb movement and improve expert performance relative to Qwerty.

Users and Stakeholders: As this was a project within my research group, the stakeholders were my research supervisor and my supervisory committee. The target users were smartphone users who regularly engaged in text-based communications.

Pain Points: The mobile Qwerty keyboard often required sequential key selections that spanned the screen, or required rotating the screen for two-handed typing, and it was prone to inaccurate auto-correct substitutions.

Scope and Constraints: I had a $300 out-of-pocket budget for evaluation, equating to a maximum of six participants at the required rate. I also lent study participants my personal smartphone to minimize costs.

Methods Used: Usability-lab studies (i.e., user testing), usability benchmarking, diary study, quantitative surveys, metrics analysis, log analysis, heuristic evaluation, user interviews, concept testing.


Identify Features

I identified the following desirable features in a mobile keyboard by conducting secondary research (online forums, surveys, and academic articles) and by consulting and testing concepts with mobile human-computer interaction experts:

Explore and Prototype

Where are the fastest buttons?

I designed an experiment to gather data on key press and swipe events. From this data, I calculated: 1) the average duration for a swipe in each of the four directions (up, down, left, right), and 2) the average duration to select a pair of keys using one-handed input. The permutations for the key pairs were exhaustive to cover the entire input area, and the pairs were presented in a random order. When selecting a pair of keys, the first selection would start the timer, and the second would stop it. Varying the key pair locations averaged the key selection time from all possible trajectories.
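As an illustration, the averaging step for key pairs might look like the following Python sketch. The log format (one tuple per trial with two timestamps) and the key identifiers are assumptions made for this example, not the format used in the study, and the sample rows are made up.

```python
from collections import defaultdict
from statistics import mean


def average_pair_times(trials):
    """Average the selection time for each ordered key pair.

    trials: iterable of (first_key, second_key, t_first_ms, t_second_ms).
    """
    durations = defaultdict(list)
    for first, second, t_first, t_second in trials:
        # The first selection starts the timer; the second stops it.
        durations[(first, second)].append(t_second - t_first)
    return {pair: mean(values) for pair, values in durations.items()}


# Hypothetical log rows, not data from the study.
trials = [
    ("A1", "C3", 0, 420),
    ("A1", "C3", 0, 380),
    ("C3", "A1", 0, 450),
]
pair_times = average_pair_times(trials)
print(pair_times)
```

Keeping (A1, C3) and (C3, A1) as separate entries preserves the direction of travel, so each trajectory contributes its own average.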

screenshot of key pair selection study; in a grid arrangement, there are two keys highlighted in differing shades of green; one is labelled '1' and another is labelled '2'

I gathered data on left-handed input and right-handed input. Surprisingly, some right-handed participants performed input faster with their left hand, and vice versa. Consequently, I decided to use timing data averaged from both hands. (Note: A stakeholder asked me to use a colour-blind friendly palette.)

screenshot of TEMA with stats hidden

As a side effect, I also obtained key selection accuracy heatmaps, in which red dots represent incorrect selections of a key. In the right image, one can notice erroneous selections caused by participants' palms along the right edge of the screen.

screenshot of TEMA with stats hidden

How should I arrange the characters?

In previous work, I had developed a corpus for mobile text entry (i.e., a language model derived from multiple sources of English tweets, text messages, and emails). From this, I was able to determine which characters were typed most frequently (i.e., character frequency), and how frequently each character pair is entered (i.e., digram frequency). This, combined with the key pair selection time data I had previously gathered, provided me with an evaluation metric. It allowed me to calculate the words-per-minute (wpm) typing speed to enter the corpus using a candidate keyboard arrangement.
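One way such a metric could be computed is sketched below in Python. The exact formula is not given in this write-up, so this is an assumption: it weights each key pair's average selection time by the digram count, then converts milliseconds per character to wpm using the common convention of five characters per word. The function name, data structures, and sample values are hypothetical.

```python
def estimated_wpm(layout, digram_freq, pair_time_ms, chars_per_word=5):
    """Estimate typing speed for a candidate layout.

    layout:       character -> key id
    digram_freq:  (char_a, char_b) -> count in the corpus
    pair_time_ms: (key_a, key_b) -> average selection time in ms
    """
    total_count = 0
    total_ms = 0.0
    for (a, b), count in digram_freq.items():
        if a not in layout or b not in layout:
            continue  # skip characters that are not placed yet
        total_ms += count * pair_time_ms[(layout[a], layout[b])]
        total_count += count
    if total_count == 0:
        return 0.0
    ms_per_char = total_ms / total_count
    # 60,000 ms per minute; five characters per word is an assumed convention.
    return 60_000 / ms_per_char / chars_per_word


# Made-up values for illustration only.
layout = {"t": "K1", "h": "K2"}
digram_freq = {("t", "h"): 3, ("h", "t"): 1}
pair_time_ms = {("K1", "K2"): 300, ("K2", "K1"): 500}
wpm = estimated_wpm(layout, digram_freq, pair_time_ms)
print(round(wpm, 1))
```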

Unfortunately, evaluating all possible character arrangements is not practical, as there are 10.9 octillion of them! Instead, I mimicked a genetic algorithm: I created an optimal layout for a subset of characters (called “S1”) by evaluating all (approximately 479 million) permutations and selecting the one with the highest typing speed. I then incrementally added subsets of characters (S2-S5) in a locally optimal manner (i.e., not allowing changes to the previously optimal arrangement).

Subset  Characters                                Location
S1      i n t h e a r o u g m s                   Alpha
S2      . l c v y w b f d k p ! : / j ' x ,       Alpha
S3      0-9                                       Beta
S4      q * _ - ? ) @ z ( > = < ~ # & " % $ ^ ;   Beta
S5      | \ + { } [ ]                             Sym submenu
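The incremental search described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the code used for MIME: the helper names `cost` and `place_subset` are hypothetical, the score is a frequency-weighted selection time (lower is faster), and the toy example uses three keys and made-up numbers. The first line simply checks the count of S1 orderings.

```python
from itertools import permutations
from math import factorial

print(factorial(12))  # 479001600 orderings of twelve characters on twelve keys


def cost(layout, digram_freq, pair_time_ms):
    """Frequency-weighted selection time; lower means faster typing."""
    return sum(
        count * pair_time_ms[(layout[a], layout[b])]
        for (a, b), count in digram_freq.items()
        if a in layout and b in layout
    )


def place_subset(fixed, chars, free_keys, digram_freq, pair_time_ms):
    """Try every assignment of chars to free_keys, leaving `fixed` untouched."""
    best_layout, best_cost = None, float("inf")
    for keys in permutations(free_keys, len(chars)):
        candidate = {**fixed, **dict(zip(chars, keys))}
        c = cost(candidate, digram_freq, pair_time_ms)
        if c < best_cost:
            best_layout, best_cost = candidate, c
    return best_layout


# Toy example: three keys in a row; all numbers are made up.
keys = ["K0", "K1", "K2"]
pair_time_ms = {
    (a, b): 100 + 100 * abs(i - j)
    for i, a in enumerate(keys)
    for j, b in enumerate(keys)
}
digram_freq = {("a", "b"): 10, ("b", "c"): 1}

# Step 1: exhaustive search for the first subset.
layout = place_subset({}, ["a", "b"], keys, digram_freq, pair_time_ms)
# Step 2: add the next subset without moving earlier characters.
free = [k for k in keys if k not in layout.values()]
layout = place_subset(layout, ["c"], free, digram_freq, pair_time_ms)
print(layout)
```

Because earlier placements are frozen, each step is only locally optimal, which is the trade-off that makes the search tractable.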

The twelve characters in S1 represent the majority (54.5%) of the corpus. Users could type a character with a tap of a key (described as location “Alpha” in the character arrangement table), a long-tap of a key (“Beta”), or by selecting it from a symbols submenu (“Sym”). They could trigger shift, space, backspace, or enter functionality with a key press, or with a swipe over the keyboard in the up, right, left, or down direction, respectively.
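The swipe mapping can be illustrated with a short Python sketch. The direction-to-action pairs come from the description above; everything else (the function name, classifying by the dominant axis, and the `min_distance` threshold of 50 pixels) is an assumption for illustration rather than MIME's actual gesture logic.

```python
# Direction-to-action pairs as described in the text.
SWIPE_ACTIONS = {
    "up": "shift",
    "right": "space",
    "left": "backspace",
    "down": "enter",
}


def swipe_action(dx, dy, min_distance=50):
    """Map a swipe displacement in pixels to a keyboard action.

    y grows downward, as in Android screen coordinates.
    min_distance is a hypothetical threshold that separates swipes from taps.
    """
    if max(abs(dx), abs(dy)) < min_distance:
        return None  # too short to count as a swipe
    if abs(dx) >= abs(dy):
        direction = "right" if dx > 0 else "left"
    else:
        direction = "down" if dy > 0 else "up"
    return SWIPE_ACTIONS[direction]


print(swipe_action(120, 10))
print(swipe_action(-5, -200))
```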

The completed arrangement is the result of evaluating over 1.8 billion permutations. For mnemonic reasons, I arranged some characters together. For example, I arranged the left parenthesis near the right parenthesis, and 1-9 in a numeric keypad layout. I implemented this keyboard layout as an Android Input Method Editor (IME), and called it “MIME” (My IME).

evolution of the character arrangement; alpha row 1: comma, y, d, l, b, x, row 2: j, u, o, e, r, period, row 3: f, m, n, t, s, p, row 4: apostrophe, g, i, a, h, slash, row 5: exclamation mark, v, k, c, w, colon; buttons for settings, symbols, shift, space, backspace, and enter occupy the bottom row

Evaluate and Iterate

I recruited six paid participants for a longitudinal diary-based usability benchmarking study. Each of them received a Nexus 4 smartphone (returned at the end of the study), preinstalled with MIME and Google GBoard (Qwerty) keyboards, and TEMA to administer the study trials and record results.

evaluating MIME using my TEMA app (see other case study for TEMA details)

Participants followed a flexible schedule of ten study sessions, during which they typed phrases using the two keyboards with only their dominant hand. The MIME portion of a session averaged 33 minutes, with a peak typing speed of 17 wpm, while the Qwerty portion averaged 19 minutes, with a peak of 24 wpm. Extrapolating the results suggests that MIME would yield a faster typing speed after 12 hours of practice. There was also a significant difference in accuracy: the Qwerty error rate averaged 5.2%, whereas the MIME error rate averaged only 1.7%.

I encouraged participants to write brief diary-like feedback after each session. Many wrote that MIME was initially frustrating to use because of the novel character arrangement, but that typing became more fluid around the midpoint of the study. Some recognized that the unfamiliar arrangement had the side effect of improving typing accuracy, as participants had to focus more on their selections.

Three participants noted discomfort in their hands when using Qwerty but not MIME. I had counterbalanced the conditions, so this discomfort could not be attributed to the order of using the keyboards. These participants had XS and S glove sizes and owned mobile devices smaller than the Nexus 4. The discomfort was likely the consequence of frequent thumb movement across the screen when using Qwerty, necessitating a shift in grip for those participants.

In an updated iteration of MIME, I redesigned the layout to shrink its footprint, thus maximizing screen space for the underlying app (e.g., web browser or messaging app). The points below highlight the most significant changes.

updated layout alpha row 1: j, y, d, l, b, x, row 2: comma, u, o, e, r, period, row 3: f, m, n, t, s, p, row 4: v, g, i, a, h, apostrophe, row 5: settings, symbols, k, c, w, semi-colon

These changes reduced the overall height of the MIME keyboard by approximately 30%. I did not have the resources to perform additional benchmark testing, but the redesign appeared to have no noticeable impact on performance, and it improved the user experience by revealing more of the underlying app.

Deliverables, Insights, and Future Work

Although I no longer develop MIME, I continue to use it. I find it especially useful for entering URLs and passwords, and for composing simple text messages. I only switch to the provided Google keyboard when typing emojis or international characters.

In summary, here are the project deliverables:

And here are the insights and outcomes:

As possible future work, I could implement language-based error correction: When a key is tapped within x milliseconds of a previous tap, and the tap occurs within y pixels of the key's edge, use the corpus digram (or trigram) frequency to determine if an adjacent key's character is more likely to be the intended target. If so, type the adjacent character instead of the selected one. The threshold x would have to be empirically determined to trigger this error correction strategy with fast input, which is typically associated with input errors, but not slower typing (e.g., correcting errors, or entering non-prose text). Additionally, I could use the key selection heatmaps from above to determine the y values for each key's edges. Without additional funding for study participants, I was not able to continue this work. However, even without language-based error correction, I still find MIME very practical, and use it on my personal smartphone.
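The proposed correction rule could be sketched in Python as follows. Since this is future work that was not implemented, every detail here is hypothetical: the function name, the single-neighbour simplification, and the placeholder thresholds `x_ms` and `y_px`, which would need to be determined empirically as described above. The digram counts are made up.

```python
def corrected_char(prev_char, tapped, neighbour, ms_since_prev, px_from_edge,
                   digram_freq, x_ms=150, y_px=8):
    """Return the character to type after applying the correction rule.

    neighbour is the character on the key adjacent to the tapped edge
    (or None if there is no such key). x_ms and y_px are placeholders.
    """
    fast = ms_since_prev < x_ms       # fast input is associated with errors
    near_edge = px_from_edge < y_px   # tap landed close to the key's edge
    if neighbour is None or not (fast and near_edge):
        return tapped
    tapped_count = digram_freq.get((prev_char, tapped), 0)
    neighbour_count = digram_freq.get((prev_char, neighbour), 0)
    # Substitute only when the adjacent character is the likelier successor.
    return neighbour if neighbour_count > tapped_count else tapped


# Made-up digram counts for illustration only.
digram_freq = {("t", "h"): 50, ("t", "g"): 1}
print(corrected_char("t", "g", "h", 90, 3, digram_freq))   # fast tap near the edge
print(corrected_char("t", "g", "h", 400, 3, digram_freq))  # slow, deliberate tap
```

Requiring both conditions keeps slow, deliberate input (e.g., passwords or URLs) untouched, which matches the intent of the threshold x.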