Food patterns derived with multivariate statistical methods and their association with chronic disease in a multi-country setting: the European Prospective Investigation into Cancer and Nutrition
Epidemiology and Medical Biometry/Statistics
Final Report Abstract
This project aims to find out whether dietary patterns identified with supervised machine-learning techniques predict risk of chronic disease better than dietary patterns identified with conventional methods. To address this question, data from the European Prospective Investigation into Cancer and Nutrition (EPIC) were used. Dietary intake in EPIC was estimated from diet questionnaires in 477,326 men and women from 10 European countries (29 centres). Principal component analysis was used as the first conventional method to derive dietary patterns. One EPIC-wide principal component analysis extracted seven dietary patterns, several of which were intuitively meaningful. The seven patterns combined explained 61.8% of the variability in dietary intake in the entire sample and captured a large proportion of the variability within each country. Dietary intake data from a highly standardized 24-h dietary recall, administered in a representative sample of 34,436 men and women, were used to internally validate the seven dietary patterns. The diet questionnaires and 24-h recalls showed good agreement on which foods characterized the principal components (PCs); these results are presented graphically using two types of radar plots. The 24-h recall data were also used to formally interpret the dietary patterns. PC1 was characterized by convenience foods; PC2 by Mediterranean foods; PC5 by vegetables; PC6 by animal foods; and PC7 by several vegetables, poultry, and fish, amongst others. PC3 and PC4 loaded on very few foods. Country explained most of the variability in the PC scores, ranging from 45.6% for PC1 to 10.3% for PC7. Next, the associations between the dietary patterns from the principal component analysis and risk of chronic disease were evaluated. 
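The pattern-derivation step described above can be illustrated with a minimal sketch. It runs on synthetic data, not EPIC data, and rests on several assumptions that the abstract does not confirm: intakes are standardized before the analysis, no rotation is applied, and the share of PC-score variability explained by country is computed as a one-way between-group sum of squares over the total sum of squares. All variable names and the helper `share_explained_by_group` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a participants x food-groups intake matrix (NOT EPIC data).
n_participants, n_foods, n_countries = 600, 12, 3
country = rng.integers(0, n_countries, size=n_participants)
country_shift = rng.normal(0.0, 1.0, size=(n_countries, n_foods))
intake = country_shift[country] + rng.normal(0.0, 1.0, size=(n_participants, n_foods))

# Assumption: standardize each food group (PCA on the correlation matrix).
z = (intake - intake.mean(axis=0)) / intake.std(axis=0, ddof=1)

# PCA via singular value decomposition of the standardized matrix.
u, s, vt = np.linalg.svd(z, full_matrices=False)
explained_ratio = s**2 / np.sum(s**2)   # share of total variance per component

n_components = 7                         # the abstract reports seven patterns
loadings = vt[:n_components].T           # foods x components
scores = z @ loadings                    # participants x components (PC scores)
cumulative_explained = explained_ratio[:n_components].sum()


def share_explained_by_group(values, groups):
    """Between-group sum of squares divided by total sum of squares."""
    grand_mean = values.mean()
    total_ss = np.sum((values - grand_mean) ** 2)
    between_ss = sum(
        np.sum(groups == g) * (values[groups == g].mean() - grand_mean) ** 2
        for g in np.unique(groups)
    )
    return between_ss / total_ss


# Share of variability in each PC score attributable to country.
country_r2 = [share_explained_by_group(scores[:, k], country) for k in range(n_components)]
```

In this sketch, `cumulative_explained` plays the role of the 61.8% figure and `country_r2` the role of the per-component country shares, although the values produced from synthetic data bear no relation to the reported results.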
Colorectal cancer risk was chosen because diet is believed to be involved in the aetiology of this cancer and because the number of incident cases of colorectal cancer in EPIC is sufficiently large (n=4517 during 5.3 million person-years of follow-up). Hazard ratios of colorectal cancer per 1-SD higher PC score were computed with Cox proportional-hazards regression analysis. After adjustment for age, sex, energy intake, education, and lifestyle factors, but not for study centre, PC1 was related to higher, and PC2 to lower, risk of colorectal cancer. When differences between centres were accounted for, the associations were weaker but still statistically significant. PC1 and PC2 also modified each other's associations with colorectal cancer risk. Accordingly, the lowest risk was seen for participants with poor adherence to PC1 and better adherence to PC2. The other PCs were not materially associated with colorectal cancer.
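The hazard ratio per 1-SD higher PC score can be written out as follows. This is a sketch under stated assumptions, not the exact model specification of the project: the symbols $s_{ik}$, $z_{ik}$, $\beta_k$, $\boldsymbol{\gamma}$ and $\mathbf{c}_i$ are introduced here for illustration, and the product term used to represent the mutual effect modification of PC1 and PC2 is one common choice that the abstract does not confirm.

```latex
% Standardized score of participant i on component k
% (\bar{s}_k and \mathrm{SD}_k are the sample mean and standard deviation of the score)
z_{ik} = \frac{s_{ik} - \bar{s}_k}{\mathrm{SD}_k}

% Cox proportional-hazards model with covariates \mathbf{c}_i
% (age, sex, energy intake, education, lifestyle factors)
h_i(t) = h_0(t)\,\exp\!\left(\beta_k z_{ik} + \boldsymbol{\gamma}^{\top}\mathbf{c}_i\right)

% Hazard ratio per 1-SD higher score on component k
\mathrm{HR}_k = \exp(\beta_k)

% Assumed product-term form for effect modification between PC1 and PC2
h_i(t) = h_0(t)\,\exp\!\left(\beta_1 z_{i1} + \beta_2 z_{i2} + \beta_{12}\, z_{i1} z_{i2}
         + \boldsymbol{\gamma}^{\top}\mathbf{c}_i\right)

% Hazard ratio per 1-SD higher PC1 score at a given PC2 score z_2
\mathrm{HR}_{1 \mid z_2} = \exp(\beta_1 + \beta_{12}\, z_2)
```

Under this form, accounting for differences between centres would typically mean stratifying the baseline hazard $h_0(t)$ by centre or adding centre to $\mathbf{c}_i$; which of the two was used is not stated in the abstract.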