Modifications to the Underlying Surveys

The TRIM3 system converts raw surveys to standard TRIM3 format. The conversion procedures enter the data into the TRIM3 database, create new variables needed by TRIM3, generate monthly income amounts, and create random numbers.

The conversion enables a survey file to be used by the TRIM3 system without making any changes in the model's computer code. Two different surveys could use the same sampling units and collect the same variables, while using completely different coding schemes for those variables and a different ordering of variables on the computer records. Without conversion, there would need to be two different versions of the TRIM3 model to read the variables from different locations and interpret the codes in different ways.

In practice, no two surveys collect exactly the same variables. Similar information may be available but collected in different ways. For example, information contained in one variable on one survey may have to be pieced together from several variables on another survey. The March CPS itself changes each year: variables are added, coding schemes are changed, and variables are recoded in different ways. These changes are generally minor, but periodically there are major changes in the processing and editing methods used by the Census Bureau. These changes are all handled in the conversion process.

The conversion has both technical and substantive features. From a technical viewpoint, the conversion reformats a data file so that the computer can process it more efficiently. The substantive aspects of the conversion include editing the data, recoding variables and creating new variables, creating monthly data, and generating random numbers.

Additional variables are imputed in other TRIM3 modules. Each of these aspects is discussed next.

Reformatting the Data

The raw March CPS file includes household, family, and person records. During the conversion of a March CPS file, the data are entered into the TRIM3 database, where they are stored in household, family, adult, and person database tables. The "adult" tables contain variables that pertain to "economic adults" (persons aged 15 and older). The "person" tables contain variables coded for all persons regardless of age. Monthly variables created during the conversion are stored in a separate table in the database. The database design allows information to be stored efficiently and accessed easily when processing households.

Empty (non-interview) March CPS households are excluded from the TRIM3 database. Occasionally, the March CPS contains a household consisting entirely of children under the age of 15; these households are dropped as well. If a child under the age of 15 has no relatives in the household, TRIM3 places the child in the family of the household reference person.
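As a rough illustration of the household screening rules described above, the following Python sketch drops non-interview households and households with no member aged 15 or older. The record layout (`persons`, `age`, `id`) and the function name are assumptions made for this example; they are not the TRIM3 database schema.

```python
ADULT_AGE = 15  # "economic adults" are persons aged 15 and older


def keep_household(household):
    """Return True if the household would be retained under the rules above."""
    persons = household.get("persons", [])
    if not persons:
        # Empty (non-interview) household: excluded.
        return False
    # Households made up entirely of children under 15 are dropped.
    return any(person["age"] >= ADULT_AGE for person in persons)


# Hypothetical input records, for illustration only.
households = [
    {"id": 1, "persons": []},
    {"id": 2, "persons": [{"age": 9}, {"age": 12}]},
    {"id": 3, "persons": [{"age": 34}, {"age": 6}]},
]
kept_ids = [h["id"] for h in households if keep_household(h)]
```

Only the third household survives the screen in this example.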

With additional work, surveys other than the March CPS can be converted to TRIM3 format. In general, the survey's variables must be mapped into the household, family, adult, and person variables used by TRIM3. If the survey has additional data not ordinarily included in a TRIM3 file, the data can be brought into the TRIM3 system through special supplementary database tables.

Editing the Data

Although March CPS files are generally well edited by the Census Bureau, a variety of additional edits are performed as part of the conversion. In general, the programs confirm that variable codes and universes are as specified in the survey's codebook. The conversion procedure flags out-of-range and unexpected values so that potential problems can be identified and corrected before the data are brought into the TRIM3 system.
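The kind of codebook check described above might look something like the following Python sketch. The codebook contents, variable names, and record layout are hypothetical; the actual TRIM3 edit programs are not shown in this document.

```python
# Hypothetical codebook: variable name -> set of valid codes.
CODEBOOK = {
    "sex": {1, 2},
    "marital_status": {1, 2, 3, 4, 5, 6, 7},
}


def flag_unexpected_values(records, codebook):
    """Return (record_index, variable, value) for each value outside the codebook."""
    flags = []
    for index, record in enumerate(records):
        for variable, valid_codes in codebook.items():
            value = record.get(variable)
            if value not in valid_codes:
                flags.append((index, variable, value))
    return flags


# Hypothetical records: the second has an out-of-range code for "sex".
records = [
    {"sex": 1, "marital_status": 7},
    {"sex": 3, "marital_status": 2},
]
flags = flag_unexpected_values(records, CODEBOOK)
```

Flagging rather than silently correcting mirrors the text: the problem values are reported so they can be reviewed before the data enter the system.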

Recoding Variables and Creating New Variables

Even when the same information is collected by two surveys, different coding schemes may be used. The conversion programs recode variables to have the coding schemes expected by the TRIM3 simulation modules.
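A minimal sketch of such a recode, assuming two hypothetical surveys that code the same variable differently. The survey names, source codes, and target codes are invented for illustration.

```python
# Hypothetical source coding schemes, each mapped to one common target scheme.
RECODE_MAPS = {
    "survey_a": {"M": 1, "F": 2},
    "survey_b": {0: 1, 1: 2},
}


def recode_sex(raw_value, survey):
    """Translate a survey-specific code into the common target code."""
    mapping = RECODE_MAPS[survey]
    if raw_value not in mapping:
        # Unknown codes are surfaced rather than guessed at.
        raise ValueError(f"unexpected code {raw_value!r} for {survey}")
    return mapping[raw_value]


code_a = recode_sex("F", "survey_a")
code_b = recode_sex(1, "survey_b")
```

After recoding, both surveys yield the same value for the same underlying answer, so downstream code needs only one interpretation.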

The conversion procedure also creates new variables from existing variables in order to increase processing efficiency. For example, broad income variables are generated by summing detailed income components, new health insurance variables are generated by combining initial reported information with information from final "catch all" questions, and standard TRIM3 "family type" variables are constructed. If a raw survey file does not have a variable needed by TRIM3, the variable may be imputed during the conversion.
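For instance, a broad income variable built by summing detailed components could be sketched as follows. The component names are hypothetical and do not correspond to actual TRIM3 or CPS variable names.

```python
# Hypothetical detailed earnings components on a person record.
EARNINGS_COMPONENTS = ("wages", "self_employment", "farm")


def total_earnings(person):
    """Sum the detailed components; a missing component counts as zero."""
    return sum(person.get(name, 0) for name in EARNINGS_COMPONENTS)


person = {"wages": 30000, "self_employment": 5000}
earnings = total_earnings(person)
```

Storing the sum once at conversion time avoids recomputing it in every module that needs it.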

Creating Monthly Data

The March CPS and most of the other surveys that might be used by TRIM3 report annual income amounts. However, government transfer programs calculate monthly eligibility and benefits using monthly income information. The conversion procedure must therefore allocate annual income amounts into monthly amounts.

The monthly allocation uses CPS-reported data on the number of weeks of employment and the number of different spells of work for the calendar year preceding the March survey. The different spells are distributed over the 52 weeks of the year, and the weekly information is then summarized into monthly variables. Beginning with the conversion of the March 2003 CPS survey, all months are assigned an equal number of weeks so that income is evenly distributed. In earlier conversions, four months were treated as having 5 weeks and the remaining eight months as having 4 weeks; a week was assigned to the month in which most of its days fell. A series of monthly unemployment rates is input to the conversion procedure as a target, and weeks of work and unemployment are assigned so as to approach that target.
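The two week-to-month conventions can be sketched in Python as follows. This is only an illustration: it assumes that week 0 begins on January 1 and that the year is covered by 52 consecutive seven-day weeks, neither of which is stated in the text. The function names are hypothetical.

```python
from datetime import date, timedelta


def weeks_per_month_majority(year):
    """Assign each of 52 weeks to the month that holds most of its days."""
    counts = [0] * 12
    for week in range(52):
        # Assumption: week 0 starts on January 1. The 4th day of a 7-day
        # week always falls in the month containing at least 4 of its days.
        midpoint = date(year, 1, 1) + timedelta(days=7 * week + 3)
        counts[midpoint.month - 1] += 1
    return counts


def weeks_per_month_equal():
    """Give every month the same share of the 52 weeks (52 / 12 each)."""
    return [52 / 12] * 12


majority_counts = weeks_per_month_majority(2002)
equal_counts = weeks_per_month_equal()
```

Under the majority rule, every month receives either 4 or 5 weeks, and because the total is 52, exactly four months receive 5. Which months those are depends on the assumed starting day.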

Once the periods of employment and unemployment have been established, incomes are allocated over the year.

Wages, self-employment income, and farm income are distributed evenly over weeks of employment; implicitly, a person is assumed to have earned the same amount during each week of work over the year.

Unemployment compensation is generally divided over weeks of unemployment, but for some proportion of recipients a one-month lag in receipt of benefits may be simulated.

Workers' compensation is generally divided over all weeks in which a person was either unemployed or out of the labor force; but a random subset of recipients may be simulated to receive their workers' compensation as a lump sum, all in one month.

The number of months over which alimony and child support income is allocated is determined probabilistically; a look-up table generated from SIPP data gives the percentages of persons receiving that income in different numbers of months, by the annual amount of the income.

Beginning with the conversion of the March 2001 CPS survey, separate look-up tables were made available for TANF recipients and non-TANF recipients, using data from the 1996 SIPP on the alimony and child support recipiency patterns of these two groups. Persons who receive both TANF and child support and/or alimony, and whose combined child support and alimony divided by months of TANF receipt equals their state's pass-through amount, are excluded from the look-up table. Instead, their months of alimony and/or child support receipt are set equal to the reported number of months of TANF receipt.

Beginning with the March 2006 CPS conversion, combined other income from several different sources is allocated according to its components. This income measure, DetailedOtherIncome, contains workers' compensation and disability insurance, which are allocated by weeks not worked, and educational assistance, black lung miner benefits, and any other income, which are all allocated equally across all months. Prior to that time, the entire combined income amount was allocated evenly to all months.

Other types of income are divided evenly over the 12 months of the year.
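Three of the allocation rules above can be sketched in Python: earnings spread evenly over weeks of employment, other income spread evenly over 12 months, and a look-up table that converts a random number into a number of months of receipt. The function names, the table layout, and every number in the table are invented for illustration; they are not SIPP estimates or TRIM3 parameters.

```python
def allocate_by_weeks(annual_amount, weeks_by_month):
    """Spread an annual amount in proportion to weeks in each month."""
    total_weeks = sum(weeks_by_month)
    if total_weeks == 0:
        # The text does not say how this case is handled, so it is surfaced.
        raise ValueError("no weeks to allocate over")
    return [annual_amount * weeks / total_weeks for weeks in weeks_by_month]


def allocate_evenly(annual_amount):
    """Divide an annual amount evenly over the 12 months."""
    return [annual_amount / 12] * 12


# Hypothetical look-up table: (upper bound of annual amount,
# [(months of receipt, share of recipients), ...]). Values are made up.
LOOKUP = [
    (1000, [(3, 0.5), (6, 0.3), (12, 0.2)]),
    (float("inf"), [(6, 0.2), (12, 0.8)]),
]


def months_of_receipt(annual_amount, random_number, table=LOOKUP):
    """Pick a number of months by walking the cumulative shares."""
    for upper_bound, distribution in table:
        if annual_amount <= upper_bound:
            cumulative = 0.0
            for months, share in distribution:
                cumulative += share
                if random_number <= cumulative:
                    return months
            return distribution[-1][0]  # guard against rounding in the shares
    raise ValueError("amount not covered by table")


# 26 weeks of work in January through June, at 1,000 per week.
weeks_worked = [4, 4, 5, 4, 5, 4, 0, 0, 0, 0, 0, 0]
monthly_wages = allocate_by_weeks(26000, weeks_worked)
monthly_other = allocate_evenly(1200)
```

In this example each week of work carries the same earnings, so the 5-week months receive more than the 4-week months and months without work receive nothing.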

Beginning with the March 2006 conversion, MonthlyUnearnedIncome consists of the following components allocated evenly across the months:

Allocation in conversions prior to March 2006 was very similar, though workers' compensation and disability were allocated evenly across the months rather than by weeks not worked.

Further details about the procedures used to allocate annual income amounts into monthly amounts are documented separately.

Generating Random Numbers

Random numbers are used in many TRIM3 modules to determine whether a person or unit with a certain probability of some outcome will actually have that outcome. For example, random numbers are used in deciding whether a unit that is eligible for TANF will actually receive TANF benefits. A unit's characteristics determine its probability of participation, and a random number between 0 and 1 is compared to that probability to determine whether the unit will participate. If a probability is .73, any random number of .73 or less results in participation. Random numbers are also required by various imputation functions used when variables needed by the model are not present on the raw survey.
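The comparison described above reduces to a single test, sketched here in Python. The function name is hypothetical.

```python
def participates(probability, random_number):
    """A unit participates when its random number does not exceed its probability."""
    return random_number <= probability


at_threshold = participates(0.73, 0.73)
above_threshold = participates(0.73, 0.74)
```

With a probability of 0.73, a draw of 0.73 results in participation and a draw of 0.74 does not, matching the example in the text.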

TRIM3 creates different random numbers for each task requiring a random number, so that there will be no unintended relationship between, for example, the Food Stamp Program participation decision and imputed child care expenditure amounts. Some random number variables are created during the conversion procedure and are saved in the TRIM3 database. Other random number variables are created "on-the-fly" at the time of simulation. In either case, the random number seed used in generating a person's random number is derived from the name of the random number variable, the data year, the household and person id, and the month (if a monthly random number). This ensures that if a module using a random number is run twice on the same input file with the same program rules, the results will be exactly the same.
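One way to obtain reproducible, task-specific random numbers of this kind is to hash the identifying fields into a seed. The sketch below is an assumption about how this could be done, not a description of TRIM3's actual seed derivation; the function name, the key format, and the use of SHA-256 are all choices made for this example.

```python
import hashlib
import random


def seeded_uniform(variable_name, data_year, household_id, person_id, month=None):
    """Return a reproducible number in [0, 1) determined entirely by the inputs."""
    # Hypothetical key format; any stable, unambiguous encoding would do.
    key = f"{variable_name}|{data_year}|{household_id}|{person_id}|{month}"
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    return random.Random(seed).random()


# Hypothetical variable names and IDs, for illustration only.
first_run = seeded_uniform("TanfParticipation", 2006, 1042, 1, month=3)
second_run = seeded_uniform("TanfParticipation", 2006, 1042, 1, month=3)
other_task = seeded_uniform("ChildCareExpense", 2006, 1042, 1, month=3)
```

Because the seed depends on the variable name, two tasks draw unrelated numbers for the same person, while rerunning the same task on the same input reproduces the same draw. A cryptographic hash is used here instead of Python's built-in `hash()` because the latter is randomized between interpreter runs for strings.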

Additional Imputations

Additional variables are imputed by other TRIM3 modules and procedures to provide information not available from the survey data. For example, child care expenses are imputed by the Child Care Expenses module, housing expenses are imputed by the Public and Subsidized Housing module, and expenditures that may be used as deductions in the Federal Income Tax module are statistically matched to the March CPS using data from the Internal Revenue Service's Statistics of Income Public Use File (SOI).

TRIM3 imputations use a variety of methods, including OLS regressions, probit equations, statistical matches, and look-up tables. An imputation procedure is developed using data from another survey, one that contains the variable in question and that also contains whatever demographic and economic variables are important correlates or predictors of the missing variable. The goal of imputation is to reproduce in the TRIM3 data the observed interrelationships between the imputed variable and explanatory variables.