The ELSI-Brasil sample was designed to be representative of the Brazilian population aged 50 and older. This section contains information about data (format, description and usage), sampling design, weighting and financial derived variables.
The documentation regarding the household interview, individual interview, physical measurements and interview protocols, as well as general information about the study, sample and research publications, is available on the ELSI-Brasil homepage (www.elsi.cpqrr.fiocruz.br).
Please note that ELSI-Brasil does not have a research team to help users with data analysis or interpretation of findings. All information required for data analysis is documented here and in the other documentation available on the ELSI-Brasil homepage (www.elsi.cpqrr.fiocruz.br).
To access the dataset, register through the “Registration to acess” menu and fill in all the required registration information. Your username will be your email address. After confirming the password, click “Send”. You will receive a verification link at that email address to confirm your registration. Following the link takes you to the login page. After logging in, you will have access to the datasets. Save the file to your computer. To exit, choose the “exit” option in the upper right menu.
ELSI-Brasil’s steering committee (and scientific team) encourages the public use of its data. Registration to access the data is important since it will be used as an indicator of interest from both Brazilian and international researchers.
The dataset is available in two formats: Stata (version 13) and comma-separated text files (extension ‘.csv’). In the Stata file, all variable labels are described. The text file format does not support labels. However, the label for each variable can be looked up by its variable name in the household and individual questionnaires, which are available on the study’s homepage.
For all variables in the dataset, except when specified otherwise in the questionnaires, categories coded 8, 88, 888, 8888, 88888 and so on refer to “Not Applicable (NA)”.
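For users of the text file, the recoding of “Not Applicable” codes can be sketched as below. This is a minimal illustration, not an official procedure. The function name `recode_not_applicable` and the variable names `q_a`, `q_b` and `age_years` are hypothetical. The sketch assumes the user has looked up the NA code of each variable in the questionnaires, because a value such as 88 can be a real answer in a continuous variable.

```python
def recode_not_applicable(rows, na_codes):
    """Replace the "Not Applicable" code of each listed variable with None.

    rows     : list of dicts, one per record (e.g. from csv.DictReader)
    na_codes : {variable name: NA code}, taken from the questionnaires;
               variables that are not listed are left untouched
    """
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy, so the input records are not modified
        for variable, code in na_codes.items():
            if variable in new_row and str(new_row[variable]).strip() == str(code):
                new_row[variable] = None
        cleaned.append(new_row)
    return cleaned


# Hypothetical variable names, for illustration only.
records = [{"q_a": "8", "q_b": "88", "age_years": "88"}]
cleaned = recode_not_applicable(records, {"q_a": 8, "q_b": 88})
# q_a and q_b become None; age_years keeps the real value "88".
```

Listing the NA code per variable, rather than recoding every value made of 8s, avoids discarding legitimate values.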
ELSI-Brasil has a complex sample design. Therefore, analyses should account for weighting (variable = peso_calibrado_n) and for the sampling design, based on the following variables: primary sampling unit (variable = UPA) and geographic stratification and clustering (variable = estrato). Stata users should use the following command to account for weighting and sampling design:
svyset UPA [pweight=peso_calibrado_n], strata(estrato) vce(linearized) singleunit(missing)
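For users working from the text file outside Stata, a weighted point estimate can be sketched as follows. This is an illustration under stated assumptions, not an equivalent of the svyset command above: it applies only the weight peso_calibrado_n and returns a weighted mean. It does not produce design-based standard errors, which also require UPA and estrato and dedicated survey software. The function name `weighted_mean` is hypothetical.

```python
def weighted_mean(rows, variable, weight="peso_calibrado_n"):
    """Weighted mean of `variable`: sum(w * x) / sum(w).

    rows : list of dicts, one per respondent (e.g. from csv.DictReader).
    Records with a missing value or a missing weight are skipped.
    """
    numerator = 0.0
    denominator = 0.0
    for row in rows:
        x = row.get(variable)
        w = row.get(weight)
        if x in (None, "") or w in (None, ""):
            continue
        numerator += float(w) * float(x)
        denominator += float(w)
    if denominator == 0:
        raise ValueError("no records with both a value and a weight")
    return numerator / denominator


# Made-up records, for illustration only.
sample = [
    {"rendadompc": "100", "peso_calibrado_n": "1"},
    {"rendadompc": "200", "peso_calibrado_n": "3"},
]
print(weighted_mean(sample, "rendadompc"))  # 175.0
```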
Financial derived variables:
The last six variables in the dataset are financial derived variables (i.e. not included in the main questionnaire). They are based on household income (total and per capita), income of the respondent, household consumption (total and per capita) and housing assets (properties). The financial derived variable names are listed in Box 1. The variables used to derive these variables are described in the footnote of Box 1. More information can be found in the Appendix of this document.
Box 1 – Financial derived variables
|Description|Variable name|
|Household monthly income (a)|rendadom|
|Household monthly income per capita (a)|rendadompc|
|Monthly income of respondent (b)|rendaind|
|Household monthly expenditure (c)|consumo|
|Household monthly expenditure per capita (c)|consumopc|
|Housing assets (d)|propriedades|
a: Block D (all variables, except d28, d29, d30)
b: Based on Block D variables, except d28, d29, d30
c: b3, b5, b39, c2, c4, c5, c6, c7, c8, c9, c10, c11, c12 (divided by 12), c13 (divided by 12), c14 (divided by 12), c15 (divided by 12), c17, c18 (divided by 12)
d: b6 (minus b4), b8, b37
a,b: It was assumed that missing values in one or more items represent lack of income for the item in question. This could lead to underestimation of our parameters. However, the observed average household income per capita in our sample (R$ 1,175.00) is very similar to the national average for the same period (R$ 1,113.00 in 2015 and R$ 1,226.00 in 2016), based on data from the Brazilian Institute of Geography and Statistics (IBGE).
Box 2 summarizes the financial indicators mentioned above, showing the magnitude of inequalities across percentiles and the number of valid answers for each indicator. Some of the derived variables have missing data, and data imputation could be performed.
The figures in Box 2 show: (1) inequalities of similar magnitude in household income, individual income and household consumption, and (2) greater inequalities in assets.
Box 2 – Financial derived variables
|Percentiles|Household monthly income (n = 9,412)|Household monthly income per capita (n = 9,412)|Monthly income of respondent (n = 9,412)|Household monthly expenditure (n = 6,108)|Household monthly expenditure per capita (n = 6,108)|Assets (houses, rural properties and vehicles) (n = 7,066)|
This appendix describes the criteria used to create the financial variables included in the dataset.
Household and individual income
The income variables, i.e. monthly household income and monthly income of the interviewee, were constructed from the answers given in block D of the household questionnaire (residents’ income), using the variables specified in the Box 1 footnotes.
For each source of income, the following three questions were asked: (1) income in the last 30 days, by source; (2) its value in Brazilian Real as a continuous amount; and (3) its value as the unfolding bracket range closest to the amount received (when the respondent could not provide a continuous value). Questions (2) and (3) were combined to calculate both the household and individual income variables. When a value was given in the open question (2), that value was used. For responses given only as an unfolding bracket range (3), the midpoint of the range was used. For the highest range, the value was imputed as the median income of other residents with income in this range who reported their income in the open question.
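The rule above for a single income item can be sketched as follows. This is an illustration only, not the study’s code. The function name `resolve_income`, the bracket limits and the example amounts are hypothetical. Treating an item with no answer as zero follows the assumption stated in the Box 1 footnotes.

```python
from statistics import median


def resolve_income(reported, bracket, brackets, top_bracket_reports):
    """Monthly value of one income item for one resident.

    reported            : continuous value from the open question, or None
    bracket             : index of the chosen bracket range, or None
    brackets            : list of (lower, upper); upper is None for the
                          highest, open-ended range
    top_bracket_reports : open-question values reported by other residents
                          whose income falls in the highest range
    """
    if reported is not None:
        return float(reported)  # the open-question value takes priority
    if bracket is None:
        return 0.0  # missing item treated as no income
    lower, upper = brackets[bracket]
    if upper is None:
        # highest range: median of values reported by other residents
        return float(median(top_bracket_reports))
    return (lower + upper) / 2  # midpoint of the range


# Hypothetical bracket limits and reported values, for illustration only.
ranges = [(0, 1000), (1000, 3000), (3000, None)]
top_reports = [3500, 5000, 4000]
print(resolve_income(1500, None, ranges, top_reports))  # 1500.0
print(resolve_income(None, 1, ranges, top_reports))     # 2000.0
print(resolve_income(None, 2, ranges, top_reports))     # 4000.0
```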
The same procedure described above was adopted for each income source for all household residents. The composition of household income was based on the sum of all household residents’ income, considering the five sources of income investigated. It was assumed that missing data for one or more items represent no income in the item in question. The monthly household income per capita was calculated as the ratio between the monthly household income and the variable ar6 (number of residents in the household).
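The aggregation to household level can be sketched as follows. This is a minimal illustration. The function name `household_income` and the example amounts are hypothetical, and `n_residents` stands for the value of variable ar6.

```python
def household_income(residents, n_residents):
    """Total and per capita monthly household income.

    residents   : one list per resident, holding the monthly value of each
                  income source; None marks a missing item, counted as zero
    n_residents : number of residents in the household (variable ar6)
    """
    if n_residents <= 0:
        raise ValueError("number of residents must be positive")
    total = 0.0
    for sources in residents:
        total += sum(value or 0.0 for value in sources)
    return total, total / n_residents


# Two residents with income among four household members (made-up amounts).
incomes = [
    [1200.0, None, 300.0, None, None],
    [800.0, None, None, None, None],
]
print(household_income(incomes, 4))  # (2300.0, 575.0)
```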
The household consumption variable (monthly household expenditure) was constructed from three variables in block B of the questionnaire, namely the monthly house/property mortgage installment (item b3), the amount paid for the last rent of the house/property (item b5) and the total household expenses with all domestic workers (item b39), together with items in block C. For more details, see the Box 1 footnotes.
All household consumption information was obtained using unfolding bracket ranges. To compose the total monthly household consumption, the midpoint of the bracket range was used for each source, except for the last category, for which the lower limit of the interval was assumed. Expenses reported on an annual basis were divided by 12 to obtain the equivalent monthly expense.
The monthly household consumption was calculated as the sum of the expenditure in all items considered, and the monthly household consumption per capita as the ratio between the monthly household expenditure and the number of residents of the household (variable ar6).
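The consumption calculation described above can be sketched as follows. This is an illustration, not the study’s code. The function names and bracket limits are hypothetical. The sketch assumes every item has an answer, since the text does not state how missing consumption items were handled.

```python
def bracket_value(bracket, brackets):
    """Midpoint of the chosen range; lower limit for the last, open range."""
    lower, upper = brackets[bracket]
    if upper is None:
        return float(lower)
    return (lower + upper) / 2


def monthly_consumption(items, n_residents):
    """Total and per capita monthly household consumption.

    items       : list of (bracket index, bracket list, is_annual) tuples,
                  one per expenditure item
    n_residents : number of residents in the household (variable ar6)
    """
    total = 0.0
    for bracket, brackets, is_annual in items:
        value = bracket_value(bracket, brackets)
        total += value / 12 if is_annual else value  # annual items -> monthly
    return total, total / n_residents


# Hypothetical bracket limits, for illustration only.
ranges = [(0, 200), (200, 600), (600, None)]
items = [
    (1, ranges, False),  # monthly item, midpoint 400
    (2, ranges, True),   # annual item, lower limit 600, i.e. 50 per month
    (0, ranges, False),  # monthly item, midpoint 100
]
print(monthly_consumption(items, 2))  # (550.0, 275.0)
```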
The asset variable was based on items from the following questions: b6 (market value of house/property), b4 (amount still outstanding to finish paying for house/property), b8 (market value of other properties) and b37 (market value of vehicles).
Because the information on the variables mentioned above was collected using unfolding bracket ranges, their values were based on the midpoint of the range, except for the last category, for which the lower limit of the range was assumed.
The real estate value was defined as the difference between its current market value (item b6) and the outstanding amount for its full acquisition (item b4). For participants who declared that they did not own their own house (item b1), the value 0 (zero) was assumed.
The value of other properties owned, such as houses, apartments, land, and small or large farms (item b8), was added to this amount. The value of vehicles was based on their declared market value (item b37). The final value of the asset variable was calculated by adding all real estate and vehicle values.
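The asset calculation can be sketched as follows. This is an illustration only. The function name `asset_value` and the example amounts are hypothetical. The sketch assumes each argument has already been converted from its bracket range to a value, and that items that do not apply are passed as zero.

```python
def asset_value(owns_home, b6, b4, b8, b37):
    """Total asset value of a household.

    owns_home : whether the participant owns the house (item b1)
    b6        : market value of the house/property
    b4        : amount still outstanding on the house/property
    b8        : market value of other properties
    b37       : market value of vehicles
    """
    # Home value net of the outstanding amount; zero for non-owners.
    home_value = (b6 - b4) if owns_home else 0.0
    return home_value + b8 + b37


# Made-up amounts in Brazilian Real, for illustration only.
print(asset_value(True, 300000.0, 50000.0, 100000.0, 20000.0))  # 370000.0
print(asset_value(False, 0.0, 0.0, 0.0, 20000.0))               # 20000.0
```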