Utilizing clinical data for studying natural phenotypic variation in human populations
- Project number: 202-08-12
- Students: Danny Mindroul
- Supervisor: Abraham Melkman
Decomposing age and sex trends.
Clinical traits change with age. Our understanding of growth and aging suggests that these changes
follow composite functions which may be faithfully described with a small set of primitive functions
each applying to a different age range. Visual inspection suggests that platelet count, for example,
declines from birth to age 12, remains constant until age 56, and declines linearly from age 56 onward.
The project implements a method for identifying the optimal combination of interval limits and linear
functions that best explains the data. Assuming the existence of a small number of intervals (five or fewer),
it searches the possible intervals and calculates the best fitting linear composite function.
Some similarity is expected between the intervals of all traits. Merging the emerging intervals can
assist in splitting the human lifetime into its significant stages as far as the peripheral blood system is concerned.
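The interval search described above can be illustrated with a minimal sketch. It is not the project's actual
implementation: the function names (segment_sse, search_breaks), the use of the sum of squared errors as the measure
of fit, the candidate grid of interval limits, and the min_points guard are all assumptions made for illustration.
Each segment gets its own independent least-squares line, and every combination of candidate limits is tried.

```python
from itertools import combinations

import numpy as np


def segment_sse(age, value, breaks, min_points=3):
    """Fit one least-squares line per interval and return (total SSE, lines).

    `breaks` are the interval limits; each interval is [lo, hi).
    Returns (inf, []) if any interval has fewer than `min_points` samples.
    """
    edges = [-np.inf, *breaks, np.inf]
    total = 0.0
    lines = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (age >= lo) & (age < hi)
        if mask.sum() < min_points:
            return np.inf, []
        slope, intercept = np.polyfit(age[mask], value[mask], 1)
        resid = value[mask] - (slope * age[mask] + intercept)
        total += float(resid @ resid)
        lines.append((slope, intercept))
    return total, lines


def search_breaks(age, value, candidates, n_breaks):
    """Exhaustively try every set of `n_breaks` limits from `candidates`."""
    best = (np.inf, (), [])
    for breaks in combinations(sorted(candidates), n_breaks):
        sse, lines = segment_sse(age, value, breaks)
        if sse < best[0]:
            best = (sse, breaks, lines)
    return best


# Synthetic, noise-free trait shaped like the platelet example in the text.
age = np.arange(0, 90, dtype=float)
value = np.where(age < 12, 300 - 5 * age,
                 np.where(age < 56, 240.0, 240 - (age - 56)))
sse, breaks, lines = search_breaks(age, value, range(4, 88, 4), n_breaks=2)
print(breaks, lines)
```

The sketch fixes the number of intervals in advance. Since adding intervals never increases the squared error,
comparing fits with different numbers of intervals (up to the five mentioned above) would need some penalty on
complexity, which the text does not specify.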
Data Processing.
Assuming those critical ages do exist, two separate linear regressions, one on each side of a critical age, are
expected to correlate with the data much more strongly than a single regression over the same age span around that age.
Furthermore, the critical ages relevant to this project are expected to be at least N years apart.
Based on these two assumptions, the search for critical ages scans each trait's data over age with a window
of 2*N years. At each candidate age X, a single linear regression over the whole window is compared with two linear
regressions, one over the span from X-N to X and one over the span from X to X+N. The best fitting critical ages are
then selected and a linear fit is calculated for the ranges between them. The traits data is then merged over the age
axis to examine possible common critical ages over the human life span.
It is possible that there are a few different sets of traits having different critical ages. To test this hypothesis,
traits need to be arranged in clusters by their critical ages. This part is not yet implemented and might not be
included in the project.
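The window scan can be sketched as follows, under stated assumptions. The text does not say how the single
regression and the pair of regressions are compared, so this sketch scores each candidate age by the drop in the sum
of squared errors when the window is split at that age. The function names, the threshold parameter, and the greedy
selection that keeps chosen ages at least N years apart are likewise hypothetical.

```python
import numpy as np


def line_sse(x, y):
    """Sum of squared errors of a least-squares line through (x, y)."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    return float(resid @ resid)


def scan_critical_ages(age, value, n_years, min_points=3):
    """Score each candidate age X by SSE(one line) - SSE(two lines)."""
    scores = {}
    for x in np.arange(age.min() + n_years, age.max() - n_years + 1):
        left = (age >= x - n_years) & (age < x)     # span X-N .. X
        right = (age >= x) & (age <= x + n_years)   # span X .. X+N
        if left.sum() < min_points or right.sum() < min_points:
            continue
        both = left | right
        single = line_sse(age[both], value[both])
        split = (line_sse(age[left], value[left])
                 + line_sse(age[right], value[right]))
        scores[float(x)] = single - split
    return scores


def pick_critical_ages(scores, n_years, threshold):
    """Greedily keep the highest-scoring ages that are >= N years apart."""
    chosen = []
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    for x, gain in ranked:
        if gain < threshold:
            break
        if all(abs(x - c) >= n_years for c in chosen):
            chosen.append(x)
    return sorted(chosen)


# Synthetic, noise-free trait with changes of slope at ages 12 and 56.
age = np.arange(0, 90, dtype=float)
value = np.where(age < 12, 300 - 5 * age,
                 np.where(age < 56, 240.0, 240 - (age - 56)))
scores = scan_critical_ages(age, value, n_years=8)
print(pick_critical_ages(scores, n_years=8, threshold=1.0))
```

On real, noisy data the threshold would have to be chosen with care, for example relative to the residual variance
of the trait. The absolute value used here only suits the noise-free toy series.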
Project Details.
The project is implemented as part of the research by Dr. Eitan Rubin and his team.
It should analyze the selected traits from the database and produce the optimal age intervals and the function for each
interval of each trait. The intervals from all traits are then combined in search of matches between traits.
Data from the 3rd National Health and Nutrition Examination Survey (NHANES3) will be used. In this survey, a cohort of
over 30,000 non-institutionalized individuals was assembled such that it serves as a representative sample of the US population.
Individuals were subjected to a battery of tests, including standard and non-standard lab tests, physical examination,
and various questionnaires about their health, life habits, nutrition, and psychiatric capacity. Altogether, about 8,000 data
fields were recorded for each individual. Of these, around 70 are relevant to this project.