Top-Down Research Design

by Loren Cobb, PhD.

Research Associate Professor
Family, Community, and Emergency Medicine
University of New Mexico School of Medicine
Albuquerque, NM 87131, USA

Current address:
417 Eisenhower Drive, Louisville CO 80027, USA

Abstract

Many scientists experience great difficulty in designing their research, in part because their design decisions are focused on which variables to measure rather than on how to attain the goals of the research. This paper presents a formal process, called top-down design, for ensuring that every variable measured actually has a bearing on at least one research hypothesis, that controls are in place for all foreseeable threats to the validity of the results, that the research design is adequate to confirm or disconfirm each research hypothesis, and that nothing is included in the design which does not contribute to reaching the goals of the project.

Introduction

In my capacity as a consulting statistician in a medical center, I am often asked to analyze data, to help write papers and grant applications, and to offer constructive criticism of research designs. Like all statisticians, I prefer to work on studies at the design stage, in order to prevent serious flaws from appearing in the design that will render analysis difficult or impossible. Twice in my career I have had to tell a researcher that all of his data, painfully gathered with the greatest dedication over a period of years, was absolutely worthless. But I have gradually come to realize that the stock answer to these problems ("See a statistician first!") addresses only a small part of the problem and may even miss the crucial point altogether.

Every research design is flawed in some way — this is an unavoidable fact of scientific life. Designing a research study is an exercise in compromise between the demands of the goals of the project and the realities of limited time, money, staff, and opportunity. A statistician can help by articulating the impact that certain design decisions will have on the validity of the study, but I have found that the most valuable advice I supply usually applies at a deeper level. This fundamental level is more cognitive than technical, as it concerns style of thought rather than mathematical or statistical knowledge. The style of thought that creates trouble is all too common and natural: it is a style that begins with decisions as to when and where to measure what on whom, and then builds the complete research design on top of this very weak foundation. The alternative is a style of thought that begins with the purpose of the research, and derives every detail of research design with reference to this keystone concept.

Too many scientific research projects have their beginnings in a conversation that starts with "We need to do a study of X. What shall we measure?" or "Here we have this magnificent instrument. What shall we study?" The research staff then proceed to identify variables to measure, collect some subjects, and gather their data. This style of research design can be called bottom-up, because it begins at the lowest level, with a specification of what to observe or measure. It is only after huge quantities of data on every conceivable item of interest have been gathered and recorded that the investigators begin seriously to think about analysis. "Thinking about analysis" usually means searching for a statistical magic wand which will transform their bloated grab-bag of miscellaneous data into beautiful figures and significant statistics. If any statistical analysis can be found which arguably confirms a research hypothesis (framed post hoc, naturally), then papers are written which report everything that was done and everything that was found, interesting or not.

The style of thought that I try to nurture in my colleagues is one that begins with the ultimate goal of the project, and explicitly derives every design decision by working downwards from this overarching purpose. By "working downwards" I mean identifying the chain or branching tree of subgoals that links the ultimate goal (at the top) with existing reality (at the bottom). This approach can be called top-down research design. In my experience, scientists who naturally use this style of thought seldom need statisticians, while those who reason "from the bottom up" either use statisticians heavily or spend many years wandering in the vast wilderness of bad science. Further, I believe this to be true regardless of the mathematical skills of the scientist. It has been my informal observation, for example, that most Nobel prize winners use very primitive statistics, if indeed they use any at all, and what few techniques they do use they understand very well indeed.

More to the point, the research designs that support Nobel prize-winning research in medicine and physiology tend to be simple, clean, straightforward, and very goal-directed. Research at this level almost never requires advanced forms of statistical analysis. I once mentioned this to a medical scientist who wanted my assistance with a very complex analytical method that he did not understand, and he responded that he was not trying to win a Nobel prize, he just wanted to be sophisticated in his analyses. If this kind of scientific "logic" is at all widespread, then medical science has a very dim prognosis indeed.

An Example of Bottom-Up Design

What follows is a composite description of several retrospective medical research projects with which I have been associated, with the worst aspects of each included. The overall goal of this illustrative but fictional research project was to evaluate a new medical diagnostic method.

We designed the research using the traditional bottom-up approach. "What shall we measure?" was the first question we asked. We agreed that it would be interesting to obtain not just diagnoses from several physicians working independently, but also the confidence with which each physician makes the diagnosis. And, naturally, we insisted that the database include a broad range of potentially relevant background variables obtained from the patients' charts: clinical history, demographics, laboratory tests, pathology and autopsy reports, etc. We identified about a hundred suitable cases from recent hospital records. At this point the data collection began, and four physicians made independent diagnoses of each case based solely on the new diagnostic method. A member of the team who was familiar with computers volunteered to create a database. Next, the staff typed in the results. Then we located a statistician who said he could look at the data.

What could possibly go wrong with research based on such comprehensive data? Alas, almost everything.

First, since at the time the data were entered we did not know how the final results would be presented or even analyzed, we devoted very little thought to the structure of the data itself. We simply entered the data using a well-known relational database program designed for business use. Such programs are fine for business purposes, e.g. creating alphabetized lists and retrieving individual cases, but they are of little help for statistical work. In our research the fundamental unit of statistical analysis was the patient, but the fundamental unit of the database was the physician's diagnosis, of which there were four for each patient. After many wasted hours we concluded that the tools available in our commercial database package were simply incapable of rearranging the data into the required format. Forward progress resumed only when we exported the data to a simple spreadsheet, in which rearrangement is easy, and we were at last able to make a preliminary analysis of the data.
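
In modern terms, the difficulty was that the data were stored in "long" form (one row per physician-diagnosis) while the analysis required "wide" form (one row per patient). With a present-day data-analysis library such as Python's pandas, the conversion is a one-line reshape. The sketch below is purely illustrative; its column names (patient_id, physician, diagnosis) are hypothetical and do not correspond to our actual database.

    # Illustrative only: reshape diagnosis data from "long" form (one row per
    # physician-diagnosis) to "wide" form (one row per patient), the unit of
    # statistical analysis.  Column names are hypothetical.
    import pandas as pd

    long_form = pd.DataFrame({
        "patient_id": [1, 1, 1, 1, 2, 2, 2, 2],
        "physician":  ["A", "B", "C", "D", "A", "B", "C", "D"],
        "diagnosis":  ["pos", "neg", "pos", "pos", "neg", "neg", "neg", "pos"],
    })

    # One row per patient, one column per physician's diagnosis.
    wide_form = long_form.pivot(index="patient_id",
                                columns="physician",
                                values="diagnosis")
    print(wide_form)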

At this point the second major problem appeared: the tables generated in the preliminary analysis did not adequately address the (until this time largely unstated) goals of the project. By the time a mode of presentation had been found that satisfied all parties, no fewer than four fundamentally different analyses had been performed. The decision as to which mode of presentation and analysis to use should have been made before the study ever began.

Third, of all our massive ancillary data on clinical history, demographics, lab tests, etc., only one variable ever found its way into the final analysis. Meanwhile, the burden of extracting this worthless ancillary information from patient charts had thoroughly exhausted and alienated the project staff, and we discovered that a great many typographic errors had been made during data entry. Ultimately we recreated a simplified dataset for the final analysis from scratch; essentially all of the initial data-entry effort had been in vain.

Our fourth problem was a minor disaster: it gradually dawned on the staff that there were insufficient cases to determine the efficacy of the diagnostic method in fully half of all of the specific diseases of interest to the investigators. This defect could have been found during the design phase of the project, had a quick and simple power calculation for sample size been performed. But it was too late — the study was already behind schedule and over budget.

Lastly, journal referees severely criticized the submitted draft of the paper for failing to deal with the possibility of selection bias in the initial selection of patient records. The study as it was finally published attracted much less attention than it would have, had it not been flawed in so many ways.

In retrospect each of these problems seems quite clear and obvious. Some might feel inclined to mumble platitudes about the advantages of 20-20 hindsight over foresight, but, fortunately, we don't have to fall back on platitudes. Top-down design principles are much more effective.

The Principles of Top-Down Design

Top-down design is a procedure for creating a research design that any scientist can follow. It offers an almost automatic mechanism for assuring:

  • that every variable measured actually has a bearing on at least one research hypothesis;
  • that controls are in place for all foreseeable threats to the validity of the results;
  • that the research design is adequate to confirm or disconfirm each research hypothesis;
  • that nothing is included in the design which does not contribute to reaching the goals of the project.

Top-down design achieves this somewhat remarkable result by focusing the design effort on the goals of the research, rather than on cases or variables.

The six steps in top-down design for experimental research:

  1. Write a single-sentence summary of the main result your research is intended to produce, as it might be stated in a literature review by some future scientist who is citing your research.
  2. Write the abstract for the paper that reports the result quoted in Step 1. Invent plausible summary statistics as needed; your research will provide the actual numbers later.
  3. Draw the figure or table that conclusively establishes the main result claimed in Step 2. Again, invent plausible numbers.
  4. Write down every realistic way in which the validity of the figure or table in Step 3 can be attacked. Use this list to identify the controls and comparisons needed in the research design.
  5. Write the research methods section for the paper. If appropriate, use a power calculation to determine the sample sizes necessary to make the numbers in Step 3 statistically significant.
  6. Flesh out the full research design, using Step 5 as a guide. Do not include anything not demanded by Step 5! Design the structure of the data so that these figures or tables are readily derivable.

The design phase is complete only after Step 6 is done. Here is what has been accomplished when the design is complete: the research methods section and abstract have already been written, the mode of presentation of the results is already settled, it is known what the results are expected to be, there is nothing in the design that is not demanded by the goals of the project, and there is a known probability that the study will succeed (this last is provided by the power analysis in Step 5).

In my experience, the crucial step in top-down design is #3 — sketching the key figure or table. The discussions and arguments that arise while identifying the key figure clarify and largely eliminate the conceptual confusions that arise from imprecise language or lack of mathematics. Furthermore, if the key figure or table is on display then everyone connected with the research knows exactly what, in the final analysis, it is all about. For a research project that requires a large staff, this benefit alone can justify the extra effort entailed by top-down design.

Step 5 mentions power analysis, a statistical technique for estimating the necessary sample size and guaranteeing a specified probability of success. Power analysis is not the obscure technicality that some researchers think: many review committees of the National Institutes of Health now expect every grant application to document the suitability of its sample size by calculating the statistical power of the proposed test. (A good rule of thumb is that any experiment whose estimated probability of success is less than 70% will not be funded.)

Power analysis requires that the investigators specify three pieces of information:

  • The size of the effect: the difference that is substantively meaningful. For example, a difference in mortality rates between two groups of only one per thousand per annum might be statistically significant, but not substantively meaningful. The investigators must specify the minimum difference that will be considered meaningful.
  • The significance level: the probability of erroneously concluding that an effect exists when the true effect is zero. Most investigators use 5% for this figure, but 1% or less should be used when there is a high social cost of finding a nonexistent effect.
  • The power of the test: the probability of correctly confirming an effect of specified size. The size specified should be the minimum substantively meaningful effect. Most investigators are satisfied with designs that promise a power of 70-80%, but 90% or higher should be demanded when there is a high social cost of failing to confirm a true effect.

For the vast majority of all research projects, power analysis is no more difficult than looking up the sample size in the appropriate table in a suitable reference, or using a power analysis computer program [1]. These tables and programs, moreover, provide sobering information on the trade-offs that can be made between meaningful effect size, statistical significance, power, and sample size. In a costly or time-consuming study, knowledge of these trade-offs is crucial for good study design.
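
To make the mechanics concrete, the brief sketch below carries out such a calculation with the present-day Python library statsmodels rather than the tables in [1]. It solves for the number of subjects per group needed to detect a given standardized effect size with a two-sample t test; the effect sizes, significance level, and power shown are illustrative choices, not recommendations.

    # A minimal power-analysis sketch: solve for the sample size per group
    # needed to detect a standardized effect (Cohen's d) with a two-sample
    # t test at the stated significance level and power.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()

    for effect_size in (0.2, 0.5, 0.8):  # small, medium, large effects
        n_per_group = analysis.solve_power(effect_size=effect_size,
                                           alpha=0.05,   # significance level
                                           power=0.80,   # desired power
                                           alternative="two-sided")
        print(f"d = {effect_size}: about {n_per_group:.0f} subjects per group")

Running the sketch makes the trade-offs visible at a glance: for a t test of this kind, halving the minimum meaningful effect size roughly quadruples the required sample size.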

How NOT to Consult a Statistician

It is neither necessary nor sufficient to consult a statistician before you make a commitment to a particular research design. A statistician's help is not necessary if you can keep your methods simple and rigorously adhere to a goal-directed top-down design process. The statistician's help is also not sufficient to avoid major flaws, because it is all too easy to lure an unaware statistician into a morass of bottom-up design. Here is a tongue-in-cheek recipe for how to confuse a statistician.

First, prepare yourself for the interview by creating a research design that is as nearly as possible an exact copy of a design used by a recognized authority in your field. Don't bother to look into whether this design will help you obtain the answers you need, and be prepared to argue in its favor by citing higher authority and the demands of mythical "journal referees". Bring along at least twenty badly photocopied journal articles for ammunition. Choose articles whose methods sections are written in the most impenetrable and telegraphic style possible, so that the statistician will be discouraged from reading them.

Second, start the interview with a statement along these lines, "I have almost everything worked out, but I need some help in determining the sample sizes that I will need with this design." Be sure not to enlighten the statistician by mentioning the goals of your research. If he asks for details, focus your answer on what you measured, not on why you did it.

Third, ask the statistician for help in altering the design to accommodate the particular advanced analytical method that seems to be lighting up the sky in your discipline at the moment. For example, if an incredibly impressive paper has appeared in the last year that uses multivariate nonparametric spectral analysis with implicit adaptive filtering, then by all means try to get this technique into your design.

If these maneuvers do not succeed in preventing the statistician from comparing your design to your goals, then drastic measures need to be taken. Suggest that he simply doesn't understand your discipline, and give him all twenty journal articles to read. Promise him 10% of his salary for working on your grant, and tell him to send you a one-paragraph statistical statement for inclusion in the grant application, by Friday at the latest. Then walk out.

All of this illustrates the essential point: consulting a statistician first is neither necessary nor sufficient for creating a good research design. You will be much more likely to succeed by adopting a goal-directed style of thought. Top-down research design is nothing more than a formalization of that style.

Top-Down Design for Exploratory Research

Even exploratory and descriptive studies can and should use a top-down approach to research design. Naturally there are some differences in emphasis. I recommend the use of the following steps, which are somewhat modified from the steps given above for experimental research:

The six steps in top-down design for exploratory research:

  1. Write single-sentence summaries of the two or three most likely main results of the kind your research is intended to produce, as they might be stated in a literature review by some future scientist who is citing your research.
  2. Write a separate abstract for each version of the paper that reports the results quoted in Step 1. Invent plausible summary statistics as needed; your research will provide the actual numbers later.
  3. Draw the figures or tables that establish or describe the various results claimed in Step 2. Again, invent plausible numbers.
  4. Write down every realistic way in which the validity of the figures or tables in Step 3 can be attacked. Use this list to identify the controls needed in the research design.
  5. Write the research methods section for the paper.
  6. Flesh out the full research design, using Step 5 as a guide. Include nothing that is not plausibly suggested by Steps 4 and 5. Design the structure of the data to maximize its flexibility for exploratory data analysis.

The point here is that even exploratory research has a design, whether or not anyone ever writes it down or even acknowledges that it exists, and it is just as important for an exploratory design to focus on the goals of the research as it is for an experimental design. Top-down design is the means by which the research activity is focused on its goals.

Conclusion

The concept of top-down design has been embraced by many engineering disciplines. For example, virtually all large buildings are designed in this fashion, architects having learned the hard way that it is better to have a clear idea of what one is building before pouring the foundations. Large computer programs are almost always designed in this fashion [2], as are rocket probes to the outer planets.

The efficiencies of the top-down approach for research design are almost self-evident: no variables are measured that do not have a bearing on the hypothesis, no control is used that does not relate to a known threat to the validity of the study, no subject is included beyond the number demanded by the power analysis, and the investigators and funding agencies have solid assurance that they will not be left with a mountain of worthless data.

It may be argued that the designs generated by a top-down approach are not rich enough to provide opportunities to discover the unexpected, that they actively discourage serendipity. I disagree entirely. First, if serendipity is among the goals of a project, then a top-down approach will reflect that fact by building opportunities for unexpected discoveries into the design. Second, the efficiencies of top-down design will release resources so that a greater variety of studies can be done. Creative scientific leadership is usually helped far more than it is hurt by design methods of greater efficiency. Third, and most important, top-down design actually fosters creativity, by laying bare the essential core of the research effort: the core that is so often neglected or forgotten amid the distractions of peripheral issues.

References

  1. Cohen, Jacob (1988). Statistical Power Analysis for the Behavioral Sciences, 2nd Edition. Hillsdale, New Jersey: Lawrence Erlbaum Associates. The tables in this reference are also available as a computer program.
  2. Leith, P. (1984). "Top-down design within a functional environment." Software, 14, 921-930.

Copyright © 1990 by Loren Cobb. All rights reserved.
This document may be reprinted for academic or educational purposes.