Package Biskit :: Package Mod :: Module ValidationSetup :: Class ValidationSetup

Class ValidationSetup

Takes a TemplateSearcher result folder and creates sub-projects, each with one template cluster center as the target sequence to be modeled. In each sub-project folder, a folder structure analogous to that of the main project is set up. The real structure is linked into the sub-project folder as reference.pdb.

Instance Methods
  __init__(self, outFolder, log=None)
  prepareFolders(self)
Check that all needed folders exist, if not create them
  logWrite(self, msg, force=1)
Write message to log.
[str] cluster_result(self, chain_index=None)
Take clustering result from the file 'chain_index.txt'
  createTemplatesFolder(self, validation_folder, cluster)
Create folders for the templates to be used for the validation.
{str:str} prepare_alpha(self, cluster_list, alpha_folder=None, output_folder=None)
Create a dictionary whose keys are template pdb codes and whose values are the corresponding file names of the carbon alpha pdb files for ALIGNER (.alpha).
{str:str} prepare_pdb(self, cluster_list, pdb_folder=None, output_folder=None)
Create a dictionary whose keys are template pdb codes and whose values are the file names of the corresponding pdb files for MODELLER.
  prepare_templatesfasta(self, cluster_list, pdb_dictionary, output_folder=None)
Create 'templates.fasta' file for each template to validate
  link_pdb(self, cluster_list, pdb_dictionary, alpha_dictionary, output_folder=None)
Create links in each template folder to the pdb files for MODELLER and to the alpha files for T-Coffee.
  prepare_target(self, cluster, output_folder=None)
Create the 'target.fasta' file for each template to validate
  prepare_sequences(self, cluster, sequences_folder=None, output_folder=None)
Link the 'sequences' directory from the project directory in each template folder
  link_reference_pdb(self, cluster, input_folder=None, output_file=None)
Create a link in each template folder to its own known structure, 'reference.pdb'.
  go(self, validation_folder=None)
Create validation directory setup.

Class Variables
  F_RESULT_FOLDER = '/validation'
  F_NR_FOLDER = '/sequences'
  F_ALPHA_FOLDER = F_RESULT_FOLDER + '/t_coffee/'
  F_PDB_FOLDER = F_RESULT_FOLDER + '/modeller/'
  F_PDB_LINK = F_RESULT_FOLDER + '/modeller/'
  F_ALPHA_LINK = F_RESULT_FOLDER + '/t_coffee/'
  F_TEMPLATE_SEQUENCE = '/target.fasta'
  F_TCOFFEE = '/t_coffee_template_files'
  F_TEMPLATES_FASTA = '/templates.fasta'
  F_KNOWN_STRUCTURE = '/reference.pdb'
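
Every constant above begins with a slash, which suggests that the paths are built by appending the constants to a base folder with plain string concatenation. The class source is not shown here, so this is an assumption. A minimal sketch of such a composition follows; `project_path`, the base folder and the pdb code `1abc` are hypothetical and not part of the class:

```python
import os.path

# Values copied from the class variables listed above.
F_RESULT_FOLDER = '/validation'
F_TEMPLATE_SEQUENCE = '/target.fasta'
F_KNOWN_STRUCTURE = '/reference.pdb'


def project_path(base, *parts):
    """Append slash-prefixed constants to a base folder (hypothetical helper)."""
    # os.path.join would discard `base` because each part looks like an
    # absolute path, so the parts are concatenated and then normalised.
    return os.path.normpath(base + ''.join(parts))


# Hypothetical project folder and cluster name, for illustration only.
validation = project_path('/data/project', F_RESULT_FOLDER)
target = project_path(validation, '/1abc', F_TEMPLATE_SEQUENCE)
reference = project_path(validation, '/1abc', F_KNOWN_STRUCTURE)
```

Note that `os.path.join('/data/project', F_RESULT_FOLDER)` would return just `'/validation'` on POSIX systems, which is why the sketch concatenates instead.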

Method Details

__init__(self, outFolder, log=None)
(Constructor)

Parameters:
  • outFolder (str) - base folder
  • log (LogFile instance or None) - None reports to STDOUT

prepareFolders(self)

Check that all needed folders exist, if not create them
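
The check-then-create pattern described here can be sketched with the standard library alone. This is an illustration only, not the method's actual body; `ensure_folders` is a hypothetical name:

```python
import os


def ensure_folders(folders):
    """Create every folder in `folders` that does not exist yet.

    Returns the list of folders that had to be created.
    """
    created = []
    for folder in folders:
        if not os.path.isdir(folder):
            os.makedirs(folder)  # also creates missing parent folders
            created.append(folder)
    return created
```

Calling the function a second time with the same list creates nothing and returns an empty list, so it is safe to run before every setup.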

logWrite(self, msg, force=1)

Write message to log.
Parameters:
  • msg (str) - message to print
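
The constructor states that `log=None` reports to STDOUT. A minimal sketch of that fallback is given below, assuming a file-like object stands in for the LogFile instance (the real LogFile interface may differ) and leaving out the undocumented `force` parameter; `log_write` is a hypothetical name:

```python
import io
import sys


def log_write(msg, log=None):
    """Write `msg` to `log` if one is given, otherwise to standard output."""
    # Assumption: `log` only needs a write() method, like an open file.
    stream = log if log is not None else sys.stdout
    stream.write(msg + '\n')


# Example: collect messages in memory instead of printing them.
buffer = io.StringIO()
log_write('creating validation folders', log=buffer)
```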

cluster_result(self, chain_index=None)

Take clustering result from the file 'chain_index.txt'
Parameters:
  • chain_index () - file with clustering results (default: None -> TemplateSearcher.F_CHAIN_INDEX)
Returns: [str]
pdb codes of templates

createTemplatesFolder(self, validation_folder, cluster)

Create folders for the templates to be used for the validation.
Parameters:
  • validation_folder (str) - top folder for the validation
  • cluster (str) - name for validation subfolder (e.g. pdb code of cluster center)

prepare_alpha(self, cluster_list, alpha_folder=None, output_folder=None)

Create a dictionary whose keys are template pdb codes and whose values are the corresponding file names of the carbon alpha pdb files for ALIGNER (.alpha).
Parameters:
  • cluster_list ([str]) - pdb codes of templates
  • alpha_folder (str) - folder with template CA-trace files (default: None -> F_ALPHA_FOLDER)
  • output_folder (str) - top output folder (default: None -> F_RESULT_FOLDER)
Returns: {str:str}
dictionary mapping pdb code to CA-trace files
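
The returned mapping can be pictured with the following sketch. It assumes that each file is named after its pdb code plus an extension (for example `1abc.alpha`); that naming scheme and the helper `map_codes_to_files` are assumptions for illustration, not taken from the class source:

```python
import os


def map_codes_to_files(codes, folder, extension):
    """Map each pdb code to '<folder>/<code><extension>'.

    Codes without a matching file on disk are left out of the result.
    """
    mapping = {}
    for code in codes:
        path = os.path.join(folder, code + extension)
        if os.path.isfile(path):
            mapping[code] = path
    return mapping
```

The same helper would serve for a pdb-file dictionary by passing a different folder and extension.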

prepare_pdb(self, cluster_list, pdb_folder=None, output_folder=None)

Create a dictionary whose keys are template pdb codes and whose values are the file names of the corresponding pdb files for MODELLER.
Parameters:
  • cluster_list ([str]) - pdb codes of templates
  • pdb_folder (str) - folder with Modeller pdb files (default: None -> F_PDB_FOLDER)
  • output_folder (str) - top output folder (default: None -> F_RESULT_FOLDER)
Returns: {str:str}
dictionary mapping pdb code to pdb files used by Modeller

prepare_templatesfasta(self, cluster_list, pdb_dictionary, output_folder=None)

Create 'templates.fasta' file for each template to validate
Parameters:
  • cluster_list ([str]) - pdb codes of templates
  • pdb_dictionary ({str:str}) - dictionary mapping pdb code to pdb files used by Modeller
  • output_folder (str) - top output folder (default: None -> F_RESULT_FOLDER)

link_pdb(self, cluster_list, pdb_dictionary, alpha_dictionary, output_folder=None)

Create links in each template folder to the pdb files for MODELLER and to the alpha files for T-Coffee.
Parameters:
  • cluster_list ([str]) - pdb codes of templates
  • pdb_dictionary ({str:str}) - dictionary mapping pdb code to pdb files used by Modeller
  • alpha_dictionary ({str:str}) - dictionary mapping pdb code to CA-trace files
  • output_folder (str) - top output folder (default: None -> F_RESULT_FOLDER)
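
Linking a file into a template folder can be sketched with `os.symlink`. Whether the class uses symbolic links, and how it treats links that already exist, is not stated here; the helper `link_into` and its replace-if-present behaviour are assumptions:

```python
import os


def link_into(folder, source, name=None):
    """Create a symbolic link to `source` inside `folder`; return the link path.

    The link keeps the source's file name unless `name` is given.
    """
    link = os.path.join(folder, name or os.path.basename(source))
    if os.path.lexists(link):
        os.remove(link)  # replace a stale link so the call can be repeated
    os.symlink(os.path.abspath(source), link)
    return link
```

Passing `name='reference.pdb'` gives the kind of renamed link that a known reference structure would need.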

prepare_target(self, cluster, output_folder=None)

Create the 'target.fasta' file for each template to validate
Parameters:
  • cluster (str) - name of the cluster, which is used as the name of the folder in which the validation is run.
  • output_folder (str) - top output folder (default: None -> F_RESULT_FOLDER)
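
For readers unfamiliar with the file type, a FASTA file is a header line starting with `>` followed by the sequence, usually wrapped to a fixed line width. The sketch below writes such a file; the header text, the line width and the helper `write_fasta` are assumptions and do not reflect the conventions the class itself uses:

```python
def write_fasta(path, records, width=60):
    """Write (header, sequence) pairs to `path` in FASTA format."""
    with open(path, 'w') as handle:
        for header, sequence in records:
            handle.write('>' + header + '\n')
            # Wrap the sequence into lines of at most `width` residues.
            for start in range(0, len(sequence), width):
                handle.write(sequence[start:start + width] + '\n')
```

A call such as `write_fasta('target.fasta', [('target', 'ACDEFGHIKL')])` writes a single record; passing several pairs yields a multi-record file.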

prepare_sequences(self, cluster, sequences_folder=None, output_folder=None)

Link the 'sequences' directory from the project directory in each template folder
Parameters:
  • cluster (str) - name of the cluster, which is used as the name of the folder in which the validation is run.
  • sequences_folder (str) - folder with sequences (default: None -> SequenceSearcher.F_RESULT_FOLDER)
  • output_folder (str) - top output folder (default: None -> F_RESULT_FOLDER)

link_reference_pdb(self, cluster, input_folder=None, output_file=None)

Create a link in each template folder to its own known structure, 'reference.pdb'.
Parameters:
  • cluster (str) - name of the cluster, which is used as the name of the folder in which the validation is run.
  • input_folder (str) - folder with pdb files (default: None -> F_PDB_FOLDER)
  • output_file (str) - target file

go(self, validation_folder=None)

Create validation directory setup.
Parameters:
  • validation_folder (str) - top output folder (default: None -> F_RESULT_FOLDER)
