I have an Excel file with a table as follows:
Genes  Family  Desc
A123   B2      Cytochrome p450 enzyme
A124   B2      Cytochrome p450 protease
B352   B1      lipid
C132   A1      heat shock 72
A331   A1      heat shock 70
I want to store these in flat files of the form "family"/genelist.txt,
where each genelist.txt contains all the genes in that family.
There is one family folder for each family ("B2", "A1", "A2", etc.), BUT instead of calling the folder A1 or B2, I want to look at the family's corresponding descriptions and name the folder accordingly.
I.e. the first gene, A123, would be grouped in the B2 family, but the folder it is contained in should be called "Cytochrome p450".
So basically I want to be able to run through the descriptions of the genes in a family and create a suitable name for the folder they go in.
E.g. family A1 would contain a text file listing C132 and A331, and this text file would be inside a folder called "Heat Shock".
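To make the naming rule above concrete, here is a minimal sketch of one possible interpretation: take the leading words that every description in a family has in common. The helper name common_prefix_name is hypothetical, the sample rows are copied from the table, and falling back to the family code when nothing is shared is an assumption. Note that this keeps the capitalisation of the first description, so it yields "heat shock" rather than "Heat Shock".

```perl
use strict;
use warnings;

# Hypothetical helper: return the leading words shared by every
# description in the list (compared case-insensitively).
sub common_prefix_name {
    my @descs = @_;
    return '' unless @descs;
    my @prefix = split ' ', shift @descs;
    for my $desc (@descs) {
        my @words = split ' ', $desc;
        my $i = 0;
        $i++ while $i < @prefix && $i < @words
                && lc($prefix[$i]) eq lc($words[$i]);
        $#prefix = $i - 1;    # keep only the shared leading words
    }
    return join ' ', @prefix;
}

# Sample rows copied from the table: gene, family, description.
my @rows = (
    [ 'A123', 'B2', 'Cytochrome p450 enzyme'   ],
    [ 'A124', 'B2', 'Cytochrome p450 protease' ],
    [ 'B352', 'B1', 'lipid'                    ],
    [ 'C132', 'A1', 'heat shock 72'            ],
    [ 'A331', 'A1', 'heat shock 70'            ],
);

# First pass: collect genes and descriptions per family code.
my (%genes, %descs);
for my $row (@rows) {
    my ($gene, $family, $desc) = @$row;
    push @{ $genes{$family} }, $gene;
    push @{ $descs{$family} }, $desc;
}

# Second pass: map each family code to a folder name, falling back to
# the code itself when the descriptions share no leading words.
my %folder;
for my $family (sort keys %descs) {
    my $name = common_prefix_name(@{ $descs{$family} });
    $folder{$family} = length($name) ? $name : $family;
    print "$family -> $folder{$family}: @{ $genes{$family} }\n";
}
```

Two passes are needed because a folder name depends on every description in the family, so it cannot be decided while reading a single line. The value in $folder{$family} could then be passed to mkdir in place of the family code.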
I have managed to group the genes into folders, but using the family name to name the folders instead of the description. This is the Perl script so far:
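use strict;
use warnings;

my ($gene, $family);
print "Please type filename you wish to use\n";
chomp(my $fileID = <STDIN>);
open(F, '<', "$fileID.tab") || die "can't open input file: $!\n";
while (<F>) {
    # get rid of newline character at end of line
    chomp;
    # split up line (assuming the .tab file is tab-separated)
    ($gene, $family) = split(/\t/);
    # remove " chars
    # $gene =~ s/"//g;
    # $family =~ s/"//g;
    # create the family directory
    mkdir $family; # don't care if it already exists
    # add gene to end of file
    open(OF, '>>', "$family/genelist.txt") || die "can't write genelist file: $!\n";
    print OF "$gene\n";
    close(OF);
}
close(F);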
Much appreciated, thanks in advance!