Operational Overhead Caused by Horizontal Scrolling Text

Introduction

As a reader who has used horizontal scrolling for 59 years, and as a mathematician, I decided to quantify what I have always thought about the process. I started using magnifying glasses when I was 10; in the mid-1970s I moved to telescopic glasses with a hand-held lens and used CCTV. I was a beta tester for Luna in the 1990s, and I have tracked the progress of screen magnification software (SMS) ever since. In early 2002, I found CSS. The web was new and one user style sheet would work on about 60% of sites. In the next 10 years, I read more books and articles than I had in the preceding 40 years. This article gives some quantitative insight as to why.

As Table 1 below demonstrates, the minimum number of scrolls required to read with horizontal scrolling is dramatically greater than the number required to read with word wrapping. Since scrolling is overhead to reading comprehension, the scrolling by itself is a serious disruption.

The Data

The sample text comes from Flappers and Philosophers by F. Scott Fitzgerald. It was downloaded from Project Gutenberg. The column width is 45%. Font size is 16px. It is a 48-line sample at 1600px by 900px resolution. Counting was done with Chrome in full-screen mode. See supporting files to count for yourself.

Enlargement (E)    ZOOM scrolls    WRAP scrolls
100%                     1               1
200%                     3               3
300%                    58               5
400%                    68               9
500%                    99              15
600%                   107              21
700%                   133              28
Table 1: A comparison by enlargement method of the number of essential scrolls needed to read 48 lines of text.

Comparison

As Table 1 shows, horizontal scrolling requires many more scrolling actions (keyboard, mouse, or both). In addition, ZOOM has a narrower effective magnification range than WRAP: it takes more scrolls to read at 300% with ZOOM (58) than it does to read at 700% with WRAP (28). From 300% to 700%, the ratio of ZOOM scrolls to WRAP scrolls ranges from 11.6 to 1 (at 300%) down to 4.75 to 1 (at 700%). If we compare low vision use to fully sighted use, the results are more startling. Fully sighted users take 1 scroll to read the section, so at 300% the partially sighted ZOOM user needs 58 scrolls for every 1 that a fully sighted reader needs. All enlargement takes more scrolls, because the text takes more space, but even at 700% (112px) WRAP takes only 28 scrolls.
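The ratios quoted above can be rechecked with a few lines of Python. This is only a sketch: the dictionary copies the counts from Table 1, and the names TABLE_1 and zoom_to_wrap_ratio are made up for this illustration.

```python
# Scroll counts copied from Table 1: enlargement in percent -> (ZOOM, WRAP).
TABLE_1 = {
    100: (1, 1),
    200: (3, 3),
    300: (58, 5),
    400: (68, 9),
    500: (99, 15),
    600: (107, 21),
    700: (133, 28),
}


def zoom_to_wrap_ratio(enlargement):
    """Return ZOOM scrolls divided by WRAP scrolls for one row of Table 1."""
    zoom, wrap = TABLE_1[enlargement]
    return zoom / wrap


# Print one "x to 1" ratio per enlargement level.
for enlargement in TABLE_1:
    print(f"{enlargement}%: {zoom_to_wrap_ratio(enlargement):.2f} to 1")
```

The ratio is largest at 300%, the first enlargement at which the text no longer fits the window width, and it shrinks as WRAP itself needs more windows.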

The ZOOM and WRAP scrolls are the same for 100% and 200% because at both sizes every line fits within the width of the viewing window, so no horizontal scrolling is needed. 100% is the size for fully sighted readers. 200% is for people with mild low vision.

Why ZOOM Imposes a Higher User Cost Than WRAP

There are more lines of text than viewing windows in digital information documents that contain blocks of text. ZOOM with horizontal scrolling forces scrolls line by line. WRAP scrolls window by window. Counting horizontal scrolls is messier than simply multiplying the number of lines by the magnification factor, but line by line scrolling is central to the problem.

Counting Scrolls

This section is mathematical. It is elementary but tedious. You may wish to skip to the conclusions. The full analysis will appear in the proceedings of the Conference of the Association of Applied Human Factors and Ergonomics (July 2017).

Terms and Notation

ZOOM – The zoom enlargement used by lenses (hand held and virtual), CCTV and SMS (Screen Magnification Software) is called ZOOM. It treats text as if it is a picture of text.

WRAP - Enlargement obtained with word wrapping is called WRAP.

Zoom Grid - A virtual grid swept out while reading with ZOOM is the Zoom Grid. In this article only the minimum ZOOM Grid is discussed. That is the one that involves the fewest scrolls.

Scroll - When the window for viewing is smaller than the text in a document the window must move to see more text. Any such movement is a scroll. When we turn a page in a book, page down in a software window, move a reading lens, or shift the book tray for a CCTV reader, we are scrolling. All these actions are scrolls.

ceil - If X is any real number then ceil applied to X is the smallest whole number N such that X≤N. Examples: ceil(1.11)=2 and ceil(5)=5. It is the round up function.
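For readers who want to check the round up function on a computer, Python's standard library provides it as math.ceil. The first two calls repeat the examples above; the third uses 2.25, a value that comes up later in Example 1.

```python
import math

# The two examples from the definition of ceil.
print(math.ceil(1.11))  # 2
print(math.ceil(5))     # 5

# A value that appears later in Example 1.
print(math.ceil(2.25))  # 3
```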

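Important quantities:

E - The enlargement, in percent.

T - The width of the text column.

M - The width of the magnification window.

W - The width of the viewing window.

R - The number of right boundaries of the Zoom Grid, which is also the number of columns in the grid.

B - The number of bottom boundaries of the Zoom Grid.

See Figure 1 below. In that example: E=500%, T=720px, M=1600px, W=1600px, R=3. Only three rows are shown in Figure 1, so we cannot count B. Computing B will be discussed later in Example 2. The fact that M=W (the magnification window and the viewing window have the same width) means we have full-screen zoom or a horizontal strip zoom. For hand-held lenses, or for the simulated lens mode of SMS, M<W, and frequently M<T.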

The Zoom Grid

As a user reads with ZOOM, the window of magnification moves across the content, digital or paper, and it sweeps out a virtual grid. The Zoom Grid is the grid swept out in a minimum pass over the data that enables perceiving the data in a correct reading sequence as defined in WCAG 2.0. To comprehend the content, most users will scroll more. This is important when words or lines are cut by the window boundaries. In a left to right language if the user reads lines following the Zoom Grid left to right and top to bottom, every letter of text will be encountered in reading order.

500% zoomed text under the ZOOM Grid
Figure 1: A partial Zoom Grid from the 48-line text sample, magnified 500%. In this portion of the Zoom Grid, all lines extend into every box horizontally. Each of these lines takes 3 scrolls: one for each of the two right boundaries crossed and one to get back to the next line. All 12 lines thus require 36 horizontal scrolls.

Fact 1: Every time a line exceeds a right boundary a horizontal scroll is required.

Fact 2: If at least one right boundary is crossed by a line, then that line requires one extra scroll to get to the next line.

Let 0, …, R-1 enumerate the columns of the Zoom Grid. Let L[k] be the number of lines that end in the kth column, k = 0, …, R-1. A line that ends in column 0 takes 0 horizontal scrolls to read. For k = 1, …, R-1, each line that ends in column k takes k+1 scrolls: k scrolls to reach the kth column and 1 to return to the start of the next line. This gives us:

Fact 3: The number of essential horizontal scrolls is given by

HS = 2*L[1] + … + (k+1)*L[k] + … + R*L[R-1].

Fact 4: The number of downward vertical scrolls, VS, required while reading a document is equal to the number of bottom boundaries minus one.

VS= B-1.

Fact 3 and Fact 4 combined give the following:

Fact 5: The total number of scrolls required to read a page with horizontal scrolling is

TS = [2*L[1] + … + (k+1)*L[k] + … + R*L[R-1]] + [B-1].

That is, TS= HS+VS. Our sample text was so small we counted B directly, by scrolling down. The formula for R is given by:

Fact 6: The number of right boundaries is:

R = ceil((T/M)*(E/100)).

Note: E is a percent so we divide by 100.
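Fact 6 can be sketched in Python as follows. The function name right_boundaries and its parameter names are invented for this illustration. The inputs are converted to exact fractions, an implementation choice made here so that a product that is exactly a whole number is not pushed up by floating-point rounding.

```python
from fractions import Fraction
from math import ceil


def right_boundaries(text_width, window_width, enlargement_percent):
    """Fact 6: R = ceil((T/M) * (E/100)).

    Inputs go through str() into Fraction so the arithmetic is exact.
    """
    t = Fraction(str(text_width))
    m = Fraction(str(window_width))
    e = Fraction(str(enlargement_percent))
    return ceil((t / m) * (e / 100))


# Sample text from Table 1: T = 720px, M = 1600px, E from 100% to 700%.
for e in range(100, 800, 100):
    print(f"E={e}%: R={right_boundaries(720, 1600, e)}")
```

For the sample text this gives R=1 at 100% and 200%, where every line fits in the window width, and R=2 or more from 300% upward, which is where the ZOOM counts in Table 1 jump.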

Example 1:

Consider our sample text at 500% ZOOM: (T/M)*(500/100) = (720px/1600px)*5 = 0.45*5 = 2.25, so R = ceil(2.25) = 3. We count the vertical scrolls directly (press [page-down] 8 times in Google Chrome). L[0]=15, L[1]=8, L[2] = 48-L[0]-L[1] = 25. The horizontal scrolls required are 2*8+3*25 = 91 (Fact 3). Add in the vertical scrolls to get 91+8 = 99 in total (Fact 5).
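The arithmetic of Facts 3 and 5 can be checked with a short Python sketch. The function names are hypothetical. The list holds the line counts L[0], L[1], L[2] from Example 1, and the 8 vertical scrolls are the page-down presses counted there.

```python
def horizontal_scrolls(lines_per_column):
    """Fact 3: a line ending in column 0 costs nothing; a line ending in
    column k >= 1 costs k + 1 horizontal scrolls."""
    return sum((k + 1) * count
               for k, count in enumerate(lines_per_column)
               if k >= 1)


def total_scrolls(lines_per_column, vertical_scrolls):
    """Fact 5: TS = HS + VS."""
    return horizontal_scrolls(lines_per_column) + vertical_scrolls


example_1 = [15, 8, 25]  # L[0], L[1], L[2] at 500% ZOOM

print(horizontal_scrolls(example_1))  # 91
print(total_scrolls(example_1, 8))    # 99
```

The same two functions work for any Zoom Grid once the lines have been sorted by the column in which they end.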

Example 2:

I am reading text with a magnifying glass, a 2 by 4-inch aspheric lens with 350% (3.5x) enlargement. My text is 6.5 inches wide and 9 inches high on each page. Suppose we have 1 page, 9 inches of text. Although we did not compute B before, we will do it now because it illustrates the difference between document pages (viewing windows) and magnification windows. They rarely fit exactly. To compute B, you compute it for one page and then multiply by the number of pages, including fractional pages. Let HT be the height of the text on the page, HZ be the height of the ZOOM view, E be the enlargement in percent and P be the number of pages, fractions included. Then B = ceil(P*(HT/HZ)*(E/100)).

In our example B = ceil(1*(9/2)*3.5) = ceil(15.75) = 16. If this seems too big, it is because each 2-inch high ZOOM window does not hold 2 inches of text at 350%. It only shows 2/3.5 inches of text. The number of vertical scrolls is VS = B-1 = 15.
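The computation of B can be sketched in Python along the same lines. The function name bottom_boundaries and its parameter names are invented for this illustration, and exact fractions are used here only to keep floating-point rounding out of the ceil step.

```python
from fractions import Fraction
from math import ceil


def bottom_boundaries(pages, text_height, zoom_height, enlargement_percent):
    """B = ceil(P * (HT/HZ) * (E/100)), with all inputs made exact."""
    p = Fraction(str(pages))
    ht = Fraction(str(text_height))
    hz = Fraction(str(zoom_height))
    e = Fraction(str(enlargement_percent))
    return ceil(p * (ht / hz) * (e / 100))


# Example 2: one page, 9 inches of text, a 2-inch high lens, 350% enlargement.
b = bottom_boundaries(1, 9, 2, 350)
print(b)      # 16 bottom boundaries
print(b - 1)  # 15 vertical scrolls (Fact 4)

# Height of text actually visible in one lens window, in inches.
print(round(2 / 3.5, 3))
```

The last line shows why B is larger than intuition suggests: each lens window covers a little more than half an inch of the original text.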

For right boundaries use Fact 6: R = ceil((6.5/4)*(350/100)) = ceil(5.6875) = 6.

A ZOOM Grid on top of a page of text
Figure 2: Scaled image of the Zoom Grid for 350% on an 8.5 by 11-inch page. The rectangles represent the view under a 4x2 inch lens. Each rectangle represents one lens window. Some lines are marked to illustrate the column where they end.

Looking at Figure 2 and using the terminology above, we see that L[0]=2, two lines end in column 0. L[1]=1, one line ends in column 1. L[2]=2, two lines end in column 2. L[3]=2, two lines end in column 3. L[4]=5, five lines end in column 4. L[5]=8, eight lines end in column 5. The horizontal scrolls, moving the lens right, are 2*L[1]+3*L[2]+4*L[3]+5*L[4]+6*L[5] = 2*1+3*2+4*2+5*5+6*8 = 89 (Fact 3). The number of downward scrolls is B-1 = 15 (Fact 4). The total is 89+15 = 104 scrolls to read one page.

Minimum vs. Real Scrolls

Looking at Figure 2, we see lines cut in half by horizontal boundaries and letters chopped by vertical boundaries. One letter is chopped into 4 pieces. To read the document, more scroll adjustments are needed.

When I use a lens, I always scroll the lens so that entire words are in the lens (when possible). If the bottom line in full-screen magnification is cut by the window boundary, I move the window to include it. For memory assistance, I usually include one or two words from the previous window when I scroll right. I scroll more than the ZOOM Grid indicates. One thing, however, is a fact: nobody can use the ZOOM method with fewer scrolls than demonstrated here. The large numbers for ZOOM are the smallest numbers possible.

When I used WRAP, I was a little less careful. I just used the Chrome page down. It actually keeps the last line from the previous window on top when it scrolls. For WRAP, my count was a little higher than the minimum. That was not significant; ZOOM still overwhelmed WRAP when I compared total scrolls.

Conclusions

The Overhead Difference between ZOOM and WRAP

The difference in scroll overhead for ZOOM versus WRAP is explained entirely by the difference between line by line and page by page scrolling. I only gave one text sample to keep this discussion short, but we would get the same results with other text samples. The principles I define in the quantitative section above apply to all documents with blocks of text and all ZOOM methods.

ZOOM chops the document like a paper cutter. Text in ZOOM windows is not sequenced in a readable order. There are huge content gaps between lines. The reader is responsible for detecting the reading order. The reader must scroll from ZOOM Grid window to ZOOM Grid window tracing the paths of lines. When the end of a line is found, the reader backtracks to the next line to repeat scrolling through Grid windows. It usually takes more than one pass to get through each row of windows in the ZOOM Grid. See the ZOOM example below.

Note: A screen reader user will not get much from this example. I am sorry, but the significance is visual. You may want to skip it. The same is true for the two WRAP examples.

ZOOM 300% Example

WRAP is a lexical method; it preserves the language structure. WRAP enlarges characters, groups them into lexical tokens (words, numbers, punctuation, etc.) and then packs them in a correct reading sequence. In WRAP, the user reads an entire window before moving to the next.

WRAP Example 300%

WRAP Example 700%

ZOOM is the oldest form of magnification. Europeans started using reading lenses in the 14th century. WRAP is new by comparison, but the science of wrapping text has used sophisticated algorithms since the 1970s. Nonetheless, it is still unavailable to people with low vision. Many people in the disability community claim that ZOOM is enough. The W3C deems ZOOM alone to be sufficient accessibility support.

Does this really make sense in light of the mathematical analysis given here?

ZOOM in our example above took 17 scrolls to read 9 lines at 300%. WRAP at 300% took 1. WRAP for 700% took 7 scrolls. That is more than twice the enlargement with fewer scrolls. Given this gross imbalance in overhead between the two methods, I find it very difficult to call ZOOM accessibility support or reasonable accommodation for digital information documents that contain blocks of text.