Ballot Image Test Data: Lehigh-Muhlenberg Simulated Survey

Obtaining access to hand-marked ballots cast by voters in real elections appears to be difficult because of legal constraints. As part of our work on the PERFECT project, we are conducting data collection activities that will give us access to similar markings created in less restrictive contexts. We plan to make these data freely available as a service to the research community.

We are also studying ways of generating realistic-looking synthetic ballot images. For this task, the relevant fields from the ballot specification include the race definitions (the candidates in each race, the maximum number of choices permitted, the rates at which each candidate receives a vote if the ballot is being generated randomly, etc.), as well as the physical locations of the appropriate mark targets (e.g., the ovals which the voter is expected to fill in).
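
As a rough illustration of the fields listed above, the following Python sketch models a ballot specification and draws a random set of votes from it. It is not the BallotGen implementation. The class names (Candidate, Race), the field names, and the sampling rule (each candidate is marked independently at its vote rate, and the result is capped at the race's maximum number of choices) are all assumptions made for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    vote_rate: float  # assumed: probability in [0, 1] that this candidate is marked
    target: tuple     # assumed: (x, y, width, height) of the oval, in pixels


@dataclass
class Race:
    title: str
    max_choices: int
    candidates: list


def draw_votes(race, rng):
    """Mark each candidate independently, then cap at max_choices.

    The cap means this sketch never produces overvotes; undervotes
    (fewer marks than permitted) can still occur.
    """
    marked = [c for c in race.candidates if rng.random() < c.vote_rate]
    if len(marked) > race.max_choices:
        marked = rng.sample(marked, race.max_choices)
    return marked


def targets_to_mark(races, rng):
    """Return the mark-target rectangles that should receive a mark."""
    return [c.target for race in races for c in draw_votes(race, rng)]


# Hypothetical single-race specification, for illustration only.
example = Race(
    title="Question 1",
    max_choices=1,
    candidates=[
        Candidate("Yes", 0.55, (120, 300, 24, 14)),
        Candidate("No", 0.40, (120, 330, 24, 14)),
    ],
)
print(targets_to_mark([example], random.Random(2024)))
```

Passing a seeded random.Random instance keeps each generated ballot reproducible, which makes it straightforward to store the drawn votes as ground truth alongside the image.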

In addition to the ballot substrate, ballot synthesis also requires a supply of previously drawn markings. This supply can be assembled in any of several ways: scanned from paper ballots, drawn on-screen using a digitizing tablet, etc. Marks can be added to the ballot image either by superimposing them on the target (in which case the mark must be drawn with a transparent background) or by replacing the target/mark combination as a single component. Our software converts ballot and mark images to PNM format before they are manipulated. Image-based transformations are provided so that a single mark can be given a variety of visible manifestations. Marks can be scaled either uniformly or non-uniformly in the x- and y-dimensions, and they can be rotated by an arbitrary amount. The color of the mark can also be remapped before it is placed in the ballot image. Finally, the new ballot image is converted to TIFF format using LZW compression. A PDF version (for printing purposes) is created at the same time. Examples of the variations created from a single input checkmark are shown below.
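
The transformations described above (non-uniform scaling, rotation, color remapping, and superimposing a mark with a transparent background) can be sketched in plain Python. This is an illustrative sketch only, not the BallotGen code. It assumes the mark is stored as a grid of coverage values between 0 (transparent) and 1 (full ink), that the ballot is a grid of (r, g, b) tuples, and that color remapping amounts to choosing a single ink color. The function names transform_mark and composite are hypothetical.

```python
import math


def transform_mark(alpha, sx=1.0, sy=1.0, degrees=0.0):
    """Scale a mark by (sx, sy), then rotate it about its center.

    alpha is a list of rows of coverage values in [0, 1]; 0 is transparent.
    Each output pixel is mapped back to the source (inverse mapping) and
    filled by nearest-neighbor sampling.
    """
    h, w = len(alpha), len(alpha[0])
    t = math.radians(degrees)
    cos_t, sin_t = math.cos(t), math.sin(t)
    # Canvas large enough to hold the scaled, rotated bounding box.
    sw, sh = w * sx, h * sy
    out_w = max(1, math.ceil(abs(sw * cos_t) + abs(sh * sin_t)))
    out_h = max(1, math.ceil(abs(sw * sin_t) + abs(sh * cos_t)))
    out = [[0.0] * out_w for _ in range(out_h)]
    for oy in range(out_h):
        for ox in range(out_w):
            # Offset of this pixel's center from the canvas center.
            dx = ox + 0.5 - out_w / 2
            dy = oy + 0.5 - out_h / 2
            # Undo the rotation, then undo the scaling.
            ux = (cos_t * dx + sin_t * dy) / sx
            uy = (-sin_t * dx + cos_t * dy) / sy
            ix = math.floor(ux + w / 2)
            iy = math.floor(uy + h / 2)
            if 0 <= ix < w and 0 <= iy < h:
                out[oy][ox] = alpha[iy][ix]
    return out


def composite(ballot, alpha, top, left, ink=(0, 0, 0)):
    """Superimpose a mark on the ballot in place, blending ink by coverage."""
    for y, row in enumerate(alpha):
        for x, a in enumerate(row):
            by, bx = top + y, left + x
            if a > 0 and 0 <= by < len(ballot) and 0 <= bx < len(ballot[0]):
                r, g, b = ballot[by][bx]
                ballot[by][bx] = (
                    round(a * ink[0] + (1 - a) * r),
                    round(a * ink[1] + (1 - a) * g),
                    round(a * ink[2] + (1 - a) * b),
                )
    return ballot


# Hypothetical 3x3 mark placed on a blank white 12x12 ballot region.
mark = [[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
page = [[(255, 255, 255)] * 12 for _ in range(12)]
variant = transform_mark(mark, sx=2.0, sy=1.5, degrees=20.0)
composite(page, variant, top=2, left=2, ink=(0, 0, 160))
```

Because pixels with zero coverage are skipped, the ballot's printed target remains visible under the mark, which corresponds to the superimposing option rather than the replace-as-a-single-component option. A production version would more likely use an imaging library with interpolated resampling instead of nearest-neighbor sampling.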

(a) Variations synthesized from a single input checkmark.

Using our BallotGen software for creating synthetic ballot images, we are conducting a study of how human evaluators judge ballot marks. The results will demonstrate important variations in how people judge voter intent, a key criterion in many state election laws. In some cases, nearly identical marks are judged very differently by human evaluators. Establishing the determinants of such judgments is a central element of our proposed research. In particular, we plan to study the relative effects of the judges' social characteristics (partisanship, age, etc.) and of the contextual information provided by different ballot designs (votes on related questions on the same ballot) on the judgments human evaluators make regarding voter intent.

Below we present a simulated survey of voter attitudes that we are using in our work. For this evaluation, 125 survey forms were randomly generated in a realistic fashion using known ballot markings, including a pre-determined percentage of "marginal" markings. These surveys are presented to test subjects, who are asked to interpret the markings. The test subjects are not told that the markings are simulated and were not created by real voters. Our analysis of the results of this experiment will be released at a later date.

Blank Lehigh-Muhlenberg Survey Form

125 Randomly Synthesized Lehigh-Muhlenberg Survey Forms
(PDF and Ground-Truth) (TIF images)