meuGenoma v1.5

Antonio C B Oliveira

acbolive@sloan.mit.edu

April 2009

 

meuGenoma is a Python program to match genomes sequenced by 23andMe and/or deCODEme with the information in SNPedia.

The following files are used:

meugenoma.txt

A one-line text file containing two strings: <name> <pop>.

<name> is a string that identifies a person. <pop> is the population identifier used in SNPedia (one of CEU, HCB, JPT, YRI).
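
As an illustration of the format above, here is a minimal sketch in current Python 3 that reads and validates the configuration file. The function name read_config and the error messages are hypothetical and are not taken from the meuGenoma source.

```python
from pathlib import Path

# Population identifiers listed in this document.
VALID_POPULATIONS = {"CEU", "HCB", "JPT", "YRI"}


def read_config(path="meugenoma.txt"):
    """Return (name, pop) from the one-line configuration file.

    Hypothetical helper: the real program may parse the file differently.
    """
    text = Path(path).read_text(encoding="utf-8")
    fields = text.split()
    if len(fields) != 2:
        raise ValueError(f"expected '<name> <pop>', got {text!r}")
    name, pop = fields
    if pop not in VALID_POPULATIONS:
        raise ValueError(f"unknown population {pop!r}")
    return name, pop
```

With a file containing "antonio CEU", read_config() would return the pair ("antonio", "CEU"), and the name could then be used to locate the genotype files described below.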

<name>_23andme.txt

The tab-delimited file downloaded from 23andMe, including all genotype calls.
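
A minimal sketch of how such a file could be loaded. It assumes the layout commonly seen in 23andMe raw-data downloads: comment lines starting with #, followed by four tab-separated columns (rsid, chromosome, position, genotype). This document does not spell out the columns, so the layout and the function name read_23andme are assumptions.

```python
import csv


def read_23andme(lines):
    """Map rsid -> genotype from an iterable of text lines.

    Assumed layout: '#' comment lines, then rsid<TAB>chromosome<TAB>position<TAB>genotype.
    """
    genotypes = {}
    # Skip blank lines and the '#' comment header.
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    for row in csv.reader(data_lines, delimiter="\t"):
        if len(row) < 4:
            continue  # ignore malformed rows rather than guessing
        rsid, _chromosome, _position, genotype = row[:4]
        genotypes[rsid] = genotype
    return genotypes
```

Because the function accepts any iterable of lines, it can be called on an open file object, for example inside `with open(name + "_23andme.txt", newline="") as fh:`.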

<name>_decodeme.csv

The comma-separated file downloaded from deCODEme, including all genotype calls.
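
A comparable sketch for the deCODEme file. It assumes a header row with a SNP identifier column named Name and a genotype column named YourCode. These column names are not given in this document and should be checked against an actual export, which is why they are exposed as parameters.

```python
import csv


def read_decodeme(lines, id_column="Name", genotype_column="YourCode"):
    """Map SNP id -> genotype from an iterable of CSV lines with a header row.

    The default column names are assumptions, not documented facts.
    """
    reader = csv.DictReader(lines)
    return {row[id_column]: row[genotype_column] for row in reader}
```

Returning the same rsid-to-genotype mapping from both readers keeps the later comparison between the two providers a simple dictionary lookup.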

meuGenoma reads SNPedia and generates auxiliary files in the subdirectory meuGenomaTemp. This process may take a few hours, depending on the speed of the internet connection, and may fail because of a timeout. If this happens, run the program again; it automatically resumes from where it stopped. Storing a local copy of the information from SNPedia saves time when running meuGenoma for more than one person. To use the latest information from SNPedia, delete the subdirectory meuGenomaTemp before running the program. The program works with files from both 23andMe and deCODEme, or with a single file from either one.
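
The resume behaviour described above can be sketched as a simple file cache: anything already present in meuGenomaTemp is skipped, so a rerun after a timeout only fetches what is missing. This is an illustration, not the program's actual code. The function name cache_snp_pages, the one-file-per-SNP layout, and the fetch callable (which stands in for whatever downloads a page from SNPedia) are all assumptions.

```python
from pathlib import Path


def cache_snp_pages(snp_ids, fetch, cache_dir="meuGenomaTemp"):
    """Download each SNP page at most once; return the ids fetched in this run.

    fetch(snp_id) -> str is a caller-supplied (hypothetical) download function.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    fetched = []
    for snp_id in snp_ids:
        target = cache / f"{snp_id}.txt"
        if target.exists():
            continue  # cached by an earlier, possibly interrupted, run
        text = fetch(snp_id)  # may raise on a timeout; finished files survive
        partial = target.with_suffix(".part")
        partial.write_text(text, encoding="utf-8")
        partial.replace(target)  # rename last, so a partial file never looks complete
        fetched.append(snp_id)
    return fetched
```

Deleting the cache directory forces every page to be fetched again, which matches the instruction above for picking up the latest SNPedia information.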

Two output files are produced:

<name>_<yyyy-mm-dd>.csv

One line for each SNP, containing the information obtained from SNPedia plus the genotypes from 23andMe and deCODEme. The SNP name is a hyperlink to the corresponding entry in SNPedia. Using the macro in this spreadsheet, convenient genome reports (e.g. xls, pdf) can be produced. The report marks SNPs with the following flags:

[>] a new SNP, or SNPedia information that was modified since the previous run of the program.

[<>] the genotypes provided by 23andMe and deCODEme differ.

[?] the genotype is difficult to interpret because the strand orientation of the SNPedia alleles is unknown.

[??] the alleles in SNPedia are incompatible with the genotypes provided by 23andMe or deCODEme.

[>>] the genotype is not the most common variant according to frequency data in SNPedia.
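
To make the flags concrete, here is one possible way to compute four of them for a single SNP ([>] depends on the previous run and is not covered). The rules are a simplified guess, not the program's actual logic: in particular, treating a genotype that matches SNPedia only after complementing the bases as [?], and one that matches in neither orientation as [??], is an assumption. The function names are hypothetical.

```python
# Base complement table used to test the opposite strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(genotype):
    """Sort the alleles so that 'GA' and 'AG' compare equal."""
    return "".join(sorted(genotype))


def flags(g23, gdecode, snpedia_alleles, frequencies=None):
    """Return the list of flags for one SNP (assumed, simplified rules).

    g23, gdecode: genotype strings such as 'AG', or None if that file is absent.
    snpedia_alleles: set of alleles listed in SNPedia, e.g. {'A', 'G'}.
    frequencies: optional dict genotype -> frequency for the chosen population.
    """
    result = []
    calls = [normalize(g) for g in (g23, gdecode) if g]
    if len(calls) == 2 and calls[0] != calls[1]:
        result.append("[<>]")
    for call in calls:
        if not set(call) <= snpedia_alleles:
            flipped = set(call.translate(COMPLEMENT))
            # Matches only on the other strand: ambiguous; otherwise incompatible.
            result.append("[?]" if flipped <= snpedia_alleles else "[??]")
            break
    if frequencies and calls:
        most_common = normalize(max(frequencies, key=frequencies.get))
        if calls[0] != most_common:
            result.append("[>>]")
    return result
```

For example, flags("AG", "AA", {"A", "G"}) returns ["[<>]"] because the two providers disagree, while flags("AG", "GA", {"A", "G"}) returns an empty list because allele order is ignored.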

 

<name>_dump.txt

Contains all relevant information, saved to enable the program to detect differences in a subsequent run.
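
The change detection behind the [>] flag can be sketched as a comparison between the current data and the dump from the previous run. The actual format of the dump file is not described here; JSON is used below purely as an assumption, and changed_snps is a hypothetical name.

```python
import json
from pathlib import Path


def changed_snps(current, dump_path):
    """Return the ids that are new or modified since the last run, then save the dump.

    current: dict mapping SNP id -> JSON-serializable information.
    The JSON dump format is an assumption made for this sketch.
    """
    path = Path(dump_path)
    previous = {}
    if path.exists():
        previous = json.loads(path.read_text(encoding="utf-8"))
    changed = {
        snp for snp, info in current.items()
        if snp not in previous or previous[snp] != info
    }
    # Overwrite the dump so the next run compares against this one.
    path.write_text(json.dumps(current, sort_keys=True, indent=1), encoding="utf-8")
    return changed
```

On a first run there is no dump, so every SNP is reported as new; on later runs only the added or modified entries are returned.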

 

meuGenoma is available as a Windows executable, together with the Python source code and demo input files, in meuGenoma.zip.

 

 

www.barbosadeoliveira.com/meuGenoma

2009/04/18